Dream java: HashSet from Sun, but likely BitSet from GNU

Posted 22 Mar 2008 at 21:11 UTC (updated 23 Mar 2008 at 21:09 UTC) by audriusa Share This

As the majority knows we now have two FOSS implementations of java runtime library: OpenJDK from Sun Microsystems and the parallel GNU Classpath project. There are various opinions on how this situation will be resolved in the future. Hence there is a natural interest to compare these two implementations.

GNU Classpath usually runs with the different java virtual machine than the Sun's code and is used as a whole, without trying to separate any part apart. In that way, no honest comparison is possible between any units that are smaller than all jre + all rtl together. However many packages in GNU Classpath are written entirely in java and are relatively weakly dependent from each other. This opens opportunity to test (and, if wanted, to use) them separately from the main project.

For Sun, we used the classes and jre of the 1.6.0_04 release (not to give opportunity to argue later that results of this research do not apply to the “original” java). GNU Classpath java.util. package was taked from CVS repository and modified to run on the same 1.6.0_04 Sun's jre (simply moving classes into another package). That way we have got the two java.util.* implementations that were capable to coexist on the same virtual machine and to be directly compared with each other.

Due JIT any comparison is only possible after the virtual machine "warms up", loading and compiling the needed classes. Hence the test sequence was executed 20 times before starting the time measurement and then measuring the total time of the 20 subsequent runs.

In many cases it was rather difficult to notice any significant difference. For instance, GNU Classpath LinkedList seems about 3 % faster or Sun's LinkedHashSet is about 5 % faster. However in two cases we observe much larger differences, probably showing that the ideal java runtime library would benefit from having code from both packages.

Sun's HashSet may be about 24 % faster

The performance of HashSet was checked against sequence of random operations, including add(..), remove(..), contains(..) and iterator().next(). Sun's HashSet implementation has executed the sequence of 500000 random operations about 24 % faster:

Run 1: HashSet: Sun 2399 ms Gnu 3081 ms
Run 2: HashSet: Sun 2556 ms Gnu 3247 ms
Run 3: HashSet: Sun 2399 ms Gnu 3081 ms

Just saying that some Sun's code runs better would likely not surprise anybody. After all, the IceTea project is fully composed assuming that the code from OpenJDK should always have priority unless it is non-Free. However ...

GNU Classpath BitSet may be about 23 % faster.

BitSet was checked against the similar sequence of random operations, including set(..), get(..) and cardinality(..). It seems that there is something in GNU BitSet that helps it to do this task faster:

Run 1: BitSet: Sun 1923 ms Gnu 1523 ms
Run 2: BitSet: Sun 1931 ms Gnu 1524 ms
Run 3: BitSet: Sun 1924 ms Gnu 1533 ms

The reasons of this difference are not immediately obvious. Comparing GNU Classpath and OpenJDK code shows that both implementations use the same basic data structure (array of 64 bit integers). However the overall code is too different to say where is the exact reason of the better performance.

Of course, the actual performance of BitSet and HashSet depends on how these classes are actually used in the customer application. However it is likely not possible to get reliable world - wide statistics on this usage. From the other side, this testing included the most typical actions that the application usually does with these classes.

The main conclusion of this small research would likely be that none of the two tested implementations is absolutely better and that the ideal java runtime library would likely contain the code from both projects.

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!

Share this page