> I believe this helps because 0.02 is a double and [...] can produce a different answer
In principle, not quite. The real problem, which the compiler cannot avoid, is that 0.02 is not a dyadic rational (it is not representable exactly as some integer over a power of two). So its representation as a double (rounded to fit 52 fraction bits) is a different real number than its representation as a float (rounded to fit 23 fraction bits). (This is the same problem as rounding pi or e to a double/float, but people tend to forget that it applies to all non-dyadic numbers, not just irrationals.)
If you replaced `0.02` with `(double)0.02f` or `0.015625` instead of with `0.02f`, the optimization should in theory still apply (although missed-optimization compiler bugs are of course possible).
I think this is because the optimization isn't safe. I wrote a program to find a counterexample to your claim that "the optimization should in theory still apply". It found one. Here's the code:
    #include <stdio.h>
    #include <stdlib.h>

    float mul_as_float(float t) {
        t += 0.02f * (float)17;
        return t;
    }

    float mul_as_double(float t) {
        t += (double)0.02f * (float)17;
        return t;
    }

    int main() {
        while (1) {
            unsigned r = rand();
            float t = *((float*)&r);
            float result1 = mul_as_float(t);
            float result2 = mul_as_double(t);
            if (result1 != result2) {
                printf("Counter example when t is %f (0x%x)\n", t, *((unsigned*)&t));
                printf("result1 is %f (0x%x)\n", result1, *((unsigned*)&result1));
                printf("result2 is %f (0x%x)\n", result2, *((unsigned*)&result2));
                return 0;
            }
        }
    }
It outputs:
    Counter example when t is 0.000000 (0x3477d43f)
    result1 is 0.340000 (0x3eae1483)
    result2 is 0.340000 (0x3eae1482)
On my machine, the compiler constant-folds the multiplication, producing a single-precision add for `mul_as_float` and, for `mul_as_double`, a conversion of `t` to double, a double-precision add, and a conversion of the sum back to single. I missed the `+=` in your original comment, but adding a float to a double does implicitly promote the float like that, so you'd actually need:
    t += (float)((double)0.02f * (float)17);
to achieve the "and then converting the result [of the multiplication] to float" (rather than keeping it a double for the addition) from your original comment. (With the above line in mul_as_double, your test code no longer finds a counterexample, at least when I ran it.)
If you ask for higher-precision intermediates, even implicitly, compilers will typically give them to you, hoped-for efficiency of single precision be damned.