Floating-point tolerances revisited

For several years now I’ve participated in the physics tutorial session at GDC (along with Jim Van Verth, Gino van den Bergen, Erin Catto, Brian ‘Squirrel’ Eiserloh, and Marq Singer). One section I’ve covered in some detail is robustness and within this area I’ve stressed (amongst other things) the importance of distinguishing between absolute and relative tolerances when performing comparisons of floating-point numbers.

I won’t go into details about why you would want to care about how your floating-point numbers are compared. (Detailed information can be found in part in my GDC presentation slides and in much more depth in my book.) Here I will assume you already know why you care, and I will just note that absolute and relative tolerances are tested as

// Absolute tolerance comparison of x and y
if (Abs(x – y) <= EPSILON) …

and

// Relative tolerance comparison of x and y
if (Abs(x – y) <= EPSILON * Max(Abs(x), Abs(y)) …

The absolute tolerance test fails when x and y become large, and the relative tolerance test fails when they become small. It is therefore desired to combine these two tests together in a single test. Over the years at GDC, as well as in my book, I've suggested the following combined tolerance test:

if (Abs(x – y) <= EPSILON * Max(1.0f, Abs(x), Abs(y)) …

This typically works fine, but Erin Catto pointed out to me that it may sometimes be hard to control the desired behavior with just a single EPSILON that controls both the absolute and the relative tolerance at the same time. When better control is desired we can instead look at the combining of tolerances in the following way.

What we are really looking for in terms of equality of x and y is the following combined test:

if ((Abs(x - y) <= absTol) ||
    (Abs(x - y) <= relTol * Max(Abs(x), Abs(y)))) …

These two expressions can be captured in a single formula as

if (Abs(x - y) <= Max(absTol, relTol * Max(Abs(x), Abs(y)))) …

or equivalently

if (Abs(x - y) <= absTol * Max(1.0f, relTol/absTol * Max(Abs(x), Abs(y)))) …

In my book and in my GDC presentations I have simplified this expression by assuming relTol = absTol, which gives my original formula:

if (Abs(x - y) <= absTol * Max(1.0f, Abs(x), Abs(y))) …

In the cases where the assumption relTol = absTol is not desirable it makes sense to stay with the formula

if (Abs(x - y) <= Max(absTol, relTol * Max(Abs(x), Abs(y)))) …

because here one can tweak absTol and relTol independently of each other.

I've avoided going into this much detail in my GDC presentations because they tend to be cramped for time and the important message to take home is really "absolute tolerance often bad, relative tolerance mostly good." However, I felt it was important to document this additional information about floating-point comparisons somewhere, thus this short article!

Note: I moved this article from its previous home at my main domain to the blog as having it on the blog allows for proper tagging, searching, etc. My apologies if you've read this before. (Though, this article was only ever linked from the very first post I made on this blog so, realistically, odds are you never saw it before!)

5 thoughts on “Floating-point tolerances revisited”

  1. You can compare two floats in terms of their integer representation, but (a) not like that (because that’s illegal C/C++ code that runs afoul of strict aliasing rules), and (b) even if you handle the aliasing properly (by using unions or casting through char*) there are still lots of other gotchas left (such as properly handling NaNs, INF, -0, denormals, etc).

    Bruce Dawsons’s article Comparing floating point numbers nicely illustrates the complexity of performing the comparison in terms of ulps (as integers).

    Me, I’m an Occam’s razor kind of guy.

  2. As noted in your post, the relative comparison

    |a - b| <= epsilon * max(|a|, |b|)
    

    can return an incorrect result for small a and b. This happens because the right hand side of the inequality can underflow to zero. For example:

    float a = -FLT_MIN;
    float b = FLT_MIN;
    float epsilon = 2 * FLT_MIN;
     
    abs(a - b) <= epsilon * max(abs(a), abs(b))
    

    returns false due to the underflow. You can, however, avoid this by checking for a multiply underflow and restructuring the inequality with a division:

    float close_rel(float a, float b, float epsilon)
    {
        // assume small positive epsilon
        assert(epsilon >= 0.0f && epsilon <= 1.0f);
    
        float diff = abs(a - b);
        float maxab = max(abs(a), abs(b));
    
        // if the multiply won't underflow then use a multiply
        if (maxab >= 1.0f)
        {
            return diff <= (epsilon * maxab);
        }
        // multiply could underflow so use a divide if nonzero denominator
        else if (maxab > 0.0f)
        {
            // correctly returns false on divide overflow
            // (inf <= epsilon is false), since overflow means the
            // relative difference is large and they are therefore not close
            return diff / maxab <= epsilon;
        }
        else
        {
            // both a and b are zero
            return false;
        }
    }
    

    This correctly traps the problem and allows a relative comparison in all situations. This is, of course, not always desirable and I agree with you that a combined absolute and relative tolerance test is the best way to go. In almost any game situation you want two very small values to be considered equal, even if their relative difference is large. Still, there may be situations where a purely relative comparison is useful and this is one way to go about it.

  3. One way to avoid this problem is to cheat profoundly and rescale everything into the [-1,1] unit cube. You are then working in (effectively) 25-bit fixed-point (although there is more accuracy near zero). Now you can just use absolute epsilons.

    Doesn’t work in all cases of course, but a client is using it for kd-tree building and raytracing with some success.

  4. Apologies before hand if this comes across as a rather trivial question.

    In your rather wonderfully (honestly) written book Realtime Collision Detection, you present an approach to test polygon planarity. I am trying to port this algorithm into my code.

    On Page 495, you specify a distance tolerance as follows:

    // Test each vertex to see if it is farther from plane than allowed max distance
    for (int i = 0; i PLANARITY_EPSILON) return 0;
    }

    I was wondering what is a good value for the constant PLANARITY_EPSILON and whether you have had an experience of playing around with various tolerance.

    Would this depend upon the size of polygon at all i.e. for very large area polygons, the *kinks* in polygon outline may unnoticeable but the polygon may be theoretically non-planar?

Leave a Reply