7 May 2009 twisti   » (Master)

More bit-twiddling intrinsics

Today I pushed the changes for 6823354, which add intrinsics for the {Integer,Long}.{numberOfLeadingZeros,numberOfTrailingZeros}() methods. The speedups are quite good:


                                  Integer                                      Long
CPU              Mode             numberOfLeadingZeros  numberOfTrailingZeros  numberOfLeadingZeros  numberOfTrailingZeros
Intel Nehalem    32-bit           3.18                  3.96                   1.36                  1.90
                 64-bit           3.83                  3.74                   2.02                  2.17
AMD Shanghai     32-bit           1.94                  3.55                   0.98                  2.44
                 32-bit w/ lzcnt  4.90                  -                      1.46                  -
                 64-bit           2.52                  3.09                   1.86                  3.26
                 64-bit w/ lzcnt  6.77                  -                      3.71                  -
UltraSparc T2    32/64-bit        2.01                  2.22                   1.55                  1.91

"w/ lzcnt" in the table means the numbers were measured using AMD's LZCNT (count leading zeros) instruction, which was introduced alongside SSE4a as part of AMD's ABM (Advanced Bit Manipulation) extension.

The SPARC intrinsics need a hardware implementation of the POPC instruction.
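
As a rough illustration of why a population-count instruction is enough to compute both zero counts, here is a minimal Java sketch of a well-known bit trick. The class and method names (PopcountZeros, nlz, ntz) are made up for this example, Integer.bitCount() stands in for POPC, and this is not necessarily the exact instruction sequence HotSpot emits on SPARC.

```java
public class PopcountZeros {
    // Leading zeros: smear the highest set bit into every lower position,
    // then count the bits that are still zero (the ones of the complement).
    static int nlz(int x) {
        x |= x >>> 1;
        x |= x >>> 2;
        x |= x >>> 4;
        x |= x >>> 8;
        x |= x >>> 16;
        return Integer.bitCount(~x);
    }

    // Trailing zeros: (x - 1) & ~x has ones exactly in the positions
    // below the lowest set bit of x, so counting them gives the answer.
    static int ntz(int x) {
        return Integer.bitCount((x - 1) & ~x);
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 8, 0x00F0, -1, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int x : samples) {
            // Compare the sketch against the library methods.
            boolean ok = nlz(x) == Integer.numberOfLeadingZeros(x)
                      && ntz(x) == Integer.numberOfTrailingZeros(x);
            System.out.println(Integer.toHexString(x) + " nlz=" + nlz(x)
                    + " ntz=" + ntz(x) + (ok ? " ok" : " MISMATCH"));
        }
    }
}
```

Both helpers also handle a zero input without a special case: the complement (or mask) is all ones, so the population count is 32, which matches what the library methods return. The Long variants would follow the same pattern with one extra shift by 32 and Long.bitCount().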

I have yet to find a real-world application that uses these methods (or bitCount) extensively, so if anyone knows of one, please let me know.

Syndicated 2009-05-07 11:03:50 (Updated 2009-05-07 18:10:47) from twisti's weblog
