
OK, right, that's the clarity of thought I was missing.

But in this case the compiler still misses the optimization with '(double)0.02f'. https://godbolt.org/z/az7819nKM

I think this is because the optimization isn't safe. I wrote a program to find a counterexample to your claim that "the optimization should in theory still apply". It found one. Here's the code:

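    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* All operands are float. */
    float mul_as_float(float t) {
        t += 0.02f * (float)17;
        return t;
    }

    /* One operand is double, so the product and the sum are done in double. */
    float mul_as_double(float t) {
        t += (double)0.02f * (float)17;
        return t;
    }

    /* Bit pattern of a float; memcpy avoids the aliasing problems of pointer casts. */
    unsigned float_bits(float f) {
        unsigned u;
        memcpy(&u, &f, sizeof u);
        return u;
    }

    int main(void) {
        while (1) {
            /* Reinterpret random bits as a float. */
            unsigned r = rand();
            float t;
            memcpy(&t, &r, sizeof t);

            float result1 = mul_as_float(t);
            float result2 = mul_as_double(t);
            if (result1 != result2) {
                printf("Counter example when t is %f (0x%x)\n", t, float_bits(t));
                printf("result1 is %f (0x%x)\n", result1, float_bits(result1));
                printf("result2 is %f (0x%x)\n", result2, float_bits(result2));
                return 0;
            }
        }
    }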
It outputs:

    Counter example when t is 0.000000 (0x3477d43f)
    result1 is 0.340000 (0x3eae1483)
    result2 is 0.340000 (0x3eae1482)
What do you think?


On my machine, the compiler constant-folds the multiplication, producing a single-precision add for `mul_as_float` and a convert-t-to-double, double-precision-add, convert-sum-to-single for `mul_as_double`. I missed the `+=` in your original comment, but adding a float to a double does implicitly promote the float like that, so you'd actually need:

  t += (float)((double)0.02f * (float)17);
to achieve the "and then converting the result [of the multiplication] to float" (rather than keeping it a double for the addition) from your original comment. (With the above line in mul_as_double, your test code no longer finds a counterexample, at least when I ran it.)

If you ask for higher-precision intermediates, even implicitly, compilers will typically give them to you, hoped-for efficiency of single-precision be damned.


Ah right, yep.



