Nearest Prime Number Algorithm December 18, 2008

So prime numbers are useful. I found that I needed to generate a prime number for a hash table (details of which should be posted soon. If all goes well that is…). And I decided to come up with an algorithm to generate that prime number. Clearly stating the problem:

“Find the smallest prime greater than n”

The simplest way of generating primes is of course using Eurothenes’ sieve¹. However, this naive method runs in order of \(\frac{n}{2} + \frac{n}{3} + \frac{n}{5} \ldots + \frac{n}{p}\) time, which MathWorld tells me is about \(T(n) = \omega(n \ln(\ln(n)))\), and uses \(\Theta(n)\) memory. This sucks for big numbers (or atleast I think I can do better). Further more, this step has to be repeated (or intelligently modified) till a prime number has been found. (I will show later that the best this can go would be \(T(\lceil(\sqrt{n})+1^2\rceil)\)), which is of the same order, \(\omega(n \ln(\ln(n+1)))\) (mind you this is a lower bound). The algorithm I came up with, which I thought might be better is: generate all primes till \(\sqrt{n}\), and check if any of them divide the number in question. If they don’t well, we have a prime, ladies and gentlemen. If they do, then we’ll just have to move along won’t we…

This method takes \(\Theta(T(\sqrt{n}) + \pi(\sqrt{n}))\) time for each step. \(\pi(n)\) is a very very special number. It denotes the number of primes less than a number ‘n’. The prime number theorem² states that this number converges to \(Li(x)\), i.e. \(\int \frac{x}{\ln(x)} dx\).

Another tremendously useful aspect of this theorem is that it can be used to trivially prove that there will always be a prime between two perfect squares (thus proving the statement in the previous approach). We can use this aspect by only checking division with primes less than \(\sqrt{n}+1\). This gives us a worst case scenario of \(\Theta((2n+1)\pi(\sqrt{n}+1))\) (This comes from \((n+1)^2 - n^2 = 2n+1\)). This simplifies to \(\Theta(n^{1.5}/\ln(n))\). It also requires \(\Theta(\sqrt{n})\) memory (to compute the sieve). Much to my embarrassment, it turns out that the former is in fact better, then mine (though it does trade off for memory).

Finally, I have tried to improve this algorithm by maintaining the remainder of all the prime divisions, and then incrementing by the least non-conflicting number. This is because if \(n \equiv -1 \pmod p)\), then \(n+1\) will be divisible by that prime. I don’t know how to analyse the run time of such an algorithm, but I’m sure it’s better than the latter.

Code for this experiment can be found here.

Finally, thanks to:

Pranesh Srinivasan: Who convinced me that the sieve is the best method for prime number generation, for all numbers \(< 10^7\).
Anirudh Sivaraman: For pointing out errors in my running-order calculation. (It doesn’t change the result though).