Advogato: Blog for shlomif

“Out of the Strong, Something Sweet” - How a Bug Led to a Useful Optimisation

The book Fortune or Failure: Missed Opportunities and Chance Discoveries (which my family used to own, but which I did not read) gives the case to the important role of luck and chance in scientific discoveries. Recently, when working on Project Euler Problem No. 146 I came up with a case of an accidental bug, that in turn led to an idea for a significant optimisation.

The C code with the bug (which was in turn translated from some Perl code) looked something like that:

#define DIV 9699690
#define NUM_MODS 24024
#define NUM_PRIMES 8497392

int primes[NUM_PRIMES];
int mods[NUM_MODS];

typedef long long LL;

static inline bool is_prime(LL n)
{
    LL lim = (LL)(sqrt(n));

    for (int p_idx=0; p_idx < NUM_MODS ; p_idx++)
    {
        typeof (primes[p_idx]) p = primes[p_idx];
        if (p > lim)
        {
            return true;
        }
        if (n % p == 0)
        {
            return false;
        }
    }
    return true;
}

.
.
.
            for (int y_idx=0;y_idx<sizeof(y_off)/sizeof(y_off[0]);y_idx++)
            {
                if (! is_prime(sq + y_off[y_idx]))
                {
                    goto fail;
                }
            }
            for (int n_idx=0;n_idx<sizeof(n_off)/sizeof(n_off[0]);n_idx++)
            {
                if (is_prime(sq + n_off[n_idx]))
                {
                    goto fail;
                }
            }

As you can notice eventually, the problem was that in the p_idx loop, NUM_MODS should have been the larger NUM_PRIMES. This caused the test for primality to finish faster, but to sometimes return true instead of false. As a result, I noticed that some numbers were erroneously reported as suitable, but the program finished much faster.

I corrected it and reran the program which was now much slower, but this led me to think that maybe the lower limit to the count of primes can be a pre-filter for primality for the “y_idx”/“y_off” numbers, that will run quicker and eliminate some numbers. As a result, I did this:

#define NUM_PRIMES__PRE_FILTER 24024

static inline bool is_prime__pre_filter(LL n)
{
    LL lim = (LL)(sqrt(n));

    for (int p_idx=0; p_idx < NUM_PRIMES__PRE_FILTER ; p_idx++)
    {
        typeof (primes[p_idx]) p = primes[p_idx];
        if (p > lim)
        {
            return true;
        }
        if (n % p == 0)
        {
            return false;
        }
    }
    return true;
}

.
.
.
            for (int y_idx=0;y_idx<sizeof(y_off)/sizeof(y_off[0]);y_idx++)
            {
                if (! is_prime__pre_filter(sq + y_off[y_idx]))
                {
                    goto fail;
                }
            }
            for (int y_idx=0;y_idx<sizeof(y_off)/sizeof(y_off[0]);y_idx++)
            {
                if (! is_prime(sq + y_off[y_idx]))
                {
                    goto fail;
                }
            }
            for (int n_idx=0;n_idx<sizeof(n_off)/sizeof(n_off[0]);n_idx++)
            {
                if (is_prime(sq + n_off[n_idx]))
                {
                    goto fail;
                }
            }

This made the program finish in under a minute, while yielding the correct solution. The original program, with the bug fix, was still running after several minutes.

So the bug proved to be useful and insightful. One possible future direction is to merge the two “y_idx” loops into a single function that will accept an array of numbers, and will check them all for primality using the same divisors simultaneously, so as soon as one of them is found to be non-prime, a verdict will be reached.

Licence

You can reuse this entry under the Creative Commons Attribution Noncommercial 3.0 Unported licence, or at your option any later version. See the instructions of how to comply with it.

Syndicated 2015-02-26 18:23:14 from shlomif

28 Feb 2015 shlomif » (Master)

Licence