**7 Oct 2006** (updated 7 Oct 2006 at 20:04 UTC)

»
StevenRainwater: you
rule. Thank you!

Where can we find the updated mod_virgule code?

**SPOJ**

Phew. This
took a lot of work. I exchanged the first place with the
guy that's currently second several times until finally
changing algorithms and getting to the first place with a
decent margin.

Changelog, *ex posto facto*:

- 0.1: trial division followed by Fermat primality test.
- 0.2: Rabin-Miller.
- 0.3: accelerated a bit multiplication followed by
division operations in the modular exponentiation code (from
now on referred to as
`expmod`) with x86-specific
inline assembly (it has 32-bit -> 64-bit multiplication and
a 64-bit -> 32-bit division and remainder instructions).
- 0.4: found a "cheat" that allowed me to go back to
Fermat test, at a big performance win.
- 0.5: lots of micro-optimizations. Used a sliding bit
window in
`expmod`. Got to the first place for the
first time.
- 0.6: realized that the only way to get faster would be
to eliminate the division operation in
`expmod`.
Replaced the division with multiplication by the inverse
(some call this "Barrett reduction"). Was a teeny weeny bit
faster on my machine, but slower at spoj. The reduction
involved a 64-bit multiplication (three multiplication
instructions on 32-bit x86). :/
- 0.7: Montgomery exponentiation!

For future reference, here's a neat trick to
find the
multiplicative inverse of a number modulo 2^k, lightning
fast (this was
needed in the pre-calculation phase of Montgomery). First,
two facts:

- a^-1 = y*(2 - a*y) (mod 2^k), where y = a^-1 (mod
2^{k-1}). We can use this fact alone to derive a moderately
fast algorithm to get a^-1 mod 2^k.
- if q = 2^{k/2}*a + b (with b odd), then q*(2*b - q) =
b^2 (mod 2^k), so q^-1 = (2*b - q)*((b^2)^-1) (mod 2^k).

To get q^-1 (mod 2^k), with k even, first get
(b^2)^-1 from a table (for e.g. k=32 it will be a small
table) and then use fact #2. With k odd, do that for k-1 and
then use fact #1.