The Java Explorer

Tips and insights on Java

  • Subscribe

  • If you find this blog useful, please enter your email address to subscribe and receive notifications of new posts by email.

    Join 36 other followers

Archive for April, 2012

Comparators – part 3

Posted by Eyal Schneider on April 30, 2012

In this post and in this one, I illustrate how a bad implementation of comparators can cause annoying bugs.
Lately I saw a strangely looking Comparator implementation, which looks like this:

class WeirdComparator implements Comparator<Long> {
    @Override
    public int compare(Long a, Long b) {
        if (a == null || b == null)
            return 0;
        if (a > b)
            return 1;
        if (b > a)
            return -1;
        return 0;
    }
}

The implementor decided not to rely on the natural ordering of Long, since it doesn’t handle null values, which are acceptable in his use-case.

The first two lines caught my eye: they are obviously wrong in terms of the Comparator contract, but it is not immediately clear what kind of damage they can cause. The contract of Comparator/Comparable states that the implementation must define a total order on all values of the type it handles. That means, among other things, the transitivity property:
If compare(b,a)>=0 and compare(c,b)>=0, then compare(c,a)>=0. With the current implementation, compare(null, 2L)>=0 and compare(1L, null)>=0, but compare (1L, 2L)<0.

What happens if we sort using this comparator?

    Long[] arr = {100L, 101L, 102L, 103L,
                  50L , null,  51L,  52L};
    Arrays.sort(arr,new WeirdComparator());
    System.out.println(Arrays.toString(arr));

The output in this case is a mess: “[50, 100, 101, 102, 103, null, 51, 52]”.

Java uses the Mergesort algorithm for sorting arrays of objects, and a variation of the Quicksort algorithm for sorting arrays of primitive types. Java’s Mergesort implementation has an optimization that defines a threshold of 7, such that any array range with a smaller size is not sorted recursively with Mergesort, but rather with Insertion sort.
The array size in the example is 8, so the two halves are sorted with Insertion Sort (leaving them unchanged), and then they are merged. In the merge phase, the value 50 is chosen first, and then the new head of the second half is null, which is defined to be equal to any value. This causes all the contents of the first half to be copied to the output.

The faulty comparator is also a disaster for sorted sets:

    SortedSet<Long> set = new TreeSet<Long>(new WeirdComparator());
    set.add(null);
    set.add(10L);
    set.add(20L);
    System.out.println(set);

Although we add 3 values, only the null value remains. As explained in my other post, TreeSet respects the Set invariants which prohibit duplicate values, and it uses the comparator to determine equality. Therefore, the values 10L and 20L are considered duplicates of null and aren’t added to the tree.

Posted in java | Tagged: , , , , , , , | 1 Comment »