You can run into problems relying on the averages of floating point numbers. This is something I think about often after reading Sinan Ünür’s How you average numbers matters. I thought about it again when I read Honza Brabec’s Mean of two floating point numbers can be dangerous. Despite the hyperbole of “can be dangerous” (see “Considered Harmful” Essays Considered Harmful), it certainly can do something that you don’t intend.
Honza wanted to average two floating point numbers that differed by the smallest possible amount. If you try to average them, what do you get? There’s no representable number between the two. For languages that rely on the iron to determine number ranges (unlike Perl 6 Rats, by the way), you will always have this problem. You’ve sacrificed unlimited precision for speed.
The hexadecimal floating point format Perl added in v5.22 makes it easy to try this in Perl. Last summer I showcased using Inline::C to look at the hexadecimal format.
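The value 0x1p-52 in the program below is the smallest step we can take away from 1 in a 64-bit double: it is the gap between 1 and the next representable number, not the smallest number overall. One bit is the sign, the exponent takes 11 bits, and the remaining 52 are left for the mantissa. Adding 0x1p-52 to 1 sets exactly one of those 52 mantissa bits, and it’s the least bit.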
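To see that bit layout directly, here is a minimal sketch that dumps the sign, exponent, and mantissa fields of a number. It assumes your perl stores numbers as 64-bit IEEE 754 doubles (the usual case), and `bits` is a helper name invented for this illustration:

```perl
use v5.22;

# Assumption: perl's NV is a 64-bit IEEE 754 double.
# bits() is a hypothetical helper for illustration only.
sub bits {
    my( $number ) = @_;

    # pack as a big-endian double, then read the 64 bits back as a string
    my $bits = unpack 'B64', pack 'd>', $number;

    # split into sign (1 bit), exponent (11 bits), and mantissa (52 bits)
    return join ' ', unpack 'A1 A11 A52', $bits;
}

say bits( 0x1p0 );            # mantissa is all zeros
say bits( 0x1p0 + 0x1p-52 );  # only the last mantissa bit is set
```

The two lines of output should differ only in the final bit of the third field.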
    use v5.22;

    # spacing between 1.0 and the next representable double (IEEE 754 binary64)
    my $quantum = 0x1p-52;

    my $d1 = 0x1p0;
    my $d2 = $d1 + $quantum;
    my $d3 = ( $d2 + $d1 ) / 2;

    say "q: ",  sprintf "%a", $quantum;
    say "d1: ", sprintf "%a", $d1;
    say "d2: ", sprintf "%a", $d2;
    say "d3: ", sprintf "%a", $d3;

    if( $d3 == $d1 ) {
        say "d1 and d3 are the same!";
    }
    elsif( $d3 == $d2 ) {
        say "d2 and d3 are the same!";
    }

When I run that, I see that the average is the same as $d1:

    q: 0x1p-52
    d1: 0x1p+0
    d2: 0x1.0000000000001p+0
    d3: 0x1p+0
    d1 and d3 are the same!
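There’s no representable number between $d1 and $d2. At this magnitude, the value of $quantum is indivisible (hence the name). The result has to be something, though. The sum $d2 + $d1 needs one more bit than the mantissa can hold, so it rounds to 0x1p+1 before the division even happens, and halving that gives the lower number again.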
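You can watch where the bit disappears by printing the intermediate values. This is a small sketch, assuming 64-bit doubles and the default round-to-nearest-even mode. It also tries the other common midpoint formula, which does no better here:

```perl
use v5.22;

my $d1 = 0x1p0;
my $d2 = $d1 + 0x1p-52;

# The exact sum needs one more bit than a double has, so it is rounded
my $sum = $d1 + $d2;
say sprintf "sum:  %a", $sum;      # 0x1p+1, the low bit is already gone

# Halving only changes the exponent, so that step is exact
say sprintf "half: %a", $sum / 2;  # 0x1p+0

# The "lower bound plus half the difference" form rounds away the
# half-quantum when it is added back to $d1
my $alt = $d1 + ( $d2 - $d1 ) / 2;
say sprintf "alt:  %a", $alt;      # 0x1p+0
```

Either way the result lands on $d1, because the true midpoint simply has no representation.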
In Honza’s case, the fact that the number didn’t change put him into an infinite loop. Dangerous? He doesn’t say what he was doing. Maybe he’s working on machines that deliver radiation, where a software error can kill people. If you are working on something that is important, you should explore the bounds of your functions (perhaps with tests, as we describe in the book).
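Since he doesn’t say what the loop was, here is a purely hypothetical one: a bisection search that keeps halving an interval. The `bisect` name, its arguments, and the example function are all assumptions made for illustration, not his code. The guard compares the midpoint to both bounds and stops when it has collapsed onto one of them:

```perl
use v5.22;

# Hypothetical bisection, assuming $f->($lo) < 0 and $f->($hi) >= 0.
# This is an invented scenario, not the code from Honza's post.
sub bisect {
    my( $f, $lo, $hi ) = @_;

    while( 1 ) {
        my $mid = ( $lo + $hi ) / 2;

        # the midpoint collapsed onto a bound, so the interval
        # cannot shrink any further; stop instead of looping forever
        return $lo if $mid == $lo or $mid == $hi;

        if( $f->( $mid ) < 0 ) { $lo = $mid }
        else                   { $hi = $mid }
    }
}

# approximate the square root of 2 between 1 and 2
my $root = bisect( sub { $_[0] ** 2 - 2 }, 1, 2 );
say sprintf "%a", $root;
```

Without that `return`, a loop that waits for the interval to reach some tolerance smaller than the quantum would never finish. A test that feeds in adjacent numbers such as 0x1p0 and 0x1.0000000000001p+0 is one way to explore that boundary.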