Monday, April 18, 2016

Overflow and comparators

When writing a Comparator for integer-valued types in Java, the following implementation seems straightforward and clever:

private Comparator<Integer> c = (i, j) -> i - j;

When I used this implementation when writing unit tests for a Data Structures assignment (implementing a heap), much to my surprise the results were rather buggy. I was generating random integers and throwing them in the heap, followed by checking the heap property. When I replaced this implementation with the following alternative, everything worked perfectly:

private Comparator<Integer> c = (i, j) -> i < j ? -1 : i > j ? 1 : 0;

So what's the difference? Overflow. Imagine i = 2135909750 and j = -21314406. The subtraction yields -2137743140, which, being negative, is not exactly the desired result. Of course, with roles reversed underflow is also a hazard.

The lesson for me today is that sometimes, when trying to be clever and succinct, don't be too clever!

2 comments:

  1. Funny, I think I have written exactly that code in Java last year --- though I never discovered the bug!

    Another meta-lesson is that languages ought to offer arbitrary-precision integers as a sensible default, and allow you to use fixed-precision machine ints explicitly when you want them. Having to always think about over- and underflow is just extra unnecessary cognitive load. Machine-sized integers as the default is optimizing for the wrong thing (i.e. speed/memory use instead of programmer productivity).

    ReplyDelete
  2. I agree completely! Primitive data really is an optimization, most especially with large array computations.

    ReplyDelete