The Java Explorer

Tips and insights on Java

  • Subscribe

  • If you find this blog useful, please enter your email address to subscribe and receive notifications of new posts by email.

    Join 39 other followers

Comparators – part 3

Posted by Eyal Schneider on April 30, 2012

In this post and in this one, I illustrate how a bad implementation of comparators can cause annoying bugs.
Lately I saw a strangely looking Comparator implementation, which looks like this:

class WeirdComparator implements Comparator<Long> {
    public int compare(Long a, Long b) {
        if (a == null || b == null)
            return 0;
        if (a > b)
            return 1;
        if (b > a)
            return -1;
        return 0;

The implementor decided not to rely on the natural ordering of Long, since it doesn’t handle null values, which are acceptable in his use-case.

The first two lines caught my eye: they are obviously wrong in terms of the Comparator contract, but it is not immediately clear what kind of damage they can cause. The contract of Comparator/Comparable states that the implementation must define a total order on all values of the type it handles. That means, among other things, the transitivity property:
If compare(b,a)>=0 and compare(c,b)>=0, then compare(c,a)>=0. With the current implementation, compare(null, 2L)>=0 and compare(1L, null)>=0, but compare (1L, 2L)<0.

What happens if we sort using this comparator?

    Long[] arr = {100L, 101L, 102L, 103L,
                  50L , null,  51L,  52L};
    Arrays.sort(arr,new WeirdComparator());

The output in this case is a mess: “[50, 100, 101, 102, 103, null, 51, 52]”.

Java uses the Mergesort algorithm for sorting arrays of objects, and a variation of the Quicksort algorithm for sorting arrays of primitive types. Java’s Mergesort implementation has an optimization that defines a threshold of 7, such that any array range with a smaller size is not sorted recursively with Mergesort, but rather with Insertion sort.
The array size in the example is 8, so the two halves are sorted with Insertion Sort (leaving them unchanged), and then they are merged. In the merge phase, the value 50 is chosen first, and then the new head of the second half is null, which is defined to be equal to any value. This causes all the contents of the first half to be copied to the output.

The faulty comparator is also a disaster for sorted sets:

    SortedSet<Long> set = new TreeSet<Long>(new WeirdComparator());

Although we add 3 values, only the null value remains. As explained in my other post, TreeSet respects the Set invariants which prohibit duplicate values, and it uses the comparator to determine equality. Therefore, the values 10L and 20L are considered duplicates of null and aren’t added to the tree.

One Response to “Comparators – part 3”

  1. Seva said

    Very nice, thanks!

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: