Sieve of Atkin explanation

Edit: the only thing that I still do not understand is what the x and y variables are referring to in the pseudo code. Could someone please shed some light on this for me?

For a basic explanation of the common use of the 'x' and 'y' variables in the Wikipedia page pseudo-code (or other better implementations of the Sieve of Atkin), you might find my answer to another related question helpful.


Wikipedia article has an explanation:

  • "The algorithm completely ignores any numbers divisible by two, three, or five. All numbers with an even modulo-sixty remainder are divisible by two and not prime. All numbers with modulo-sixty remainder divisible by three are also divisible by three and not prime. All numbers with modulo-sixty remainder divisible by five are divisible by five and not prime. All these remainders are ignored."

Let's start with the famous

primes = sieve [2..] = sieve (2:[3..])   
  -- primes are sieve of list of 2,3,4... , i.e. 2 prepended to 3,4,5...
sieve (x:xs) = x : sieve [y | y <- xs, rem y x /= 0]   -- set notation
  -- sieve of list of (x prepended to xs) is x prepended to the sieve of 
  --                  list of `y`s where y is drawn from xs and y % x /= 0

Let's see how it proceeds for a few first steps:

primes = sieve [2..] = sieve (2:[3..]) 
                     = 2 : sieve p2     -- list starting w/ 2, the rest is (sieve p2)
p2 = [y | y <- [3..], rem y 2 /= 0]     -- for y from 3 step 1: if y%2 /= 0: yield y

p2 is to contain no multiples of 2. IOW it'll only contain numbers coprime with 2 — no number in p2 has 2 as its factor. To find p2 we don't actually need to test divide by 2 each number in [3..] (that's 3 and up, 3,4,5,6,7,...), because we can enumerate all the multiples of 2 in advance:

rem y 2 /= 0  ===  not (ordElem y [2,4..])     -- "y is not one of 2,4,6,8,10,..."

ordElem is like elem (i.e. is-element test), it just assumes that its list argument is an ordered, increasing list of numbers, so it can detect the non-presence safely as well as presence:

ordElem y xs = take 1 (dropWhile (< y) xs) == [y]   -- = elem y (takeWhile (<= y) xs) 

The ordinary elem doesn't assume anything, so must inspect each element of its list argument, so can't handle infinite lists. ordElem can. So, then,

p2 = [y | y <- [3..], not (ordElem y [2,4..])]  -- abstract this as a function, diff a b =
   = diff      [3..]                 [2,4..]    --       = [y | y <- a, not (ordElem y b)]
   -- 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
   -- . 4 . 6 . 8 . 10  . 12  . 14  . 16  . 18  . 20  . 22  .
   = diff [3..] (map (2*)            [2..] )  --  y > 2, so [4,6..] is enough
   = diff [2*k+x | k <- [0..],  x <- [3,4]]   -- "for k from 0 step 1: for x in [3,4]:
          [2*k+x | k <- [0..],  x <- [  4]]   --                            yield (2*k+x)"
   = [     2*k+x | k <- [0..],  x <- [3  ]]   -- 2 = 1*2 = 2*1
   = [3,5..]                                  -- that's 3,5,7,9,11,...

p2 is just a list of all odds above 2. Well, we knew that. What about the next step?

sieve p2 = sieve [3,5..] = sieve (3:[5,7..]) 
                         = 3 : sieve p3
p3 = [y | y <- [5,7..], rem y 3 /= 0]
   = [y | y <- [5,7..], not (ordElem y [3,6..])]           -- 3,6,9,12,...
   = diff [5,7..] [6,9..]         -- but, we've already removed the multiples of 2, (!)
   -- 5 . 7 . 9 . 11 . 13 . 15 . 17 . 19 . 21 . 23 . 25 . 27 .
   -- . 6 . . 9 . . 12  . . 15 . . 18  . . 21 . . 24  . . 27 .
   = diff [5,7..] (map (3*) [3,5..])                       -- so, [9,15..] is enough
   = diff [2*k+x | k <- [0..], x <- [5]] (map (3*)
          [2*k+x | k <- [0..], x <- [    3]] )
   = diff [6*k+x | k <- [0..], x <- [5,7,9]]               -- 6 = 2*3 = 3*2
          [6*k+x | k <- [0..], x <- [    9]] 
   = [     6*k+x | k <- [0..], x <- [5,7  ]]               -- 5,7,11,13,17,19,...

And the next,

sieve p3 = sieve (5 : [6*k+x | k <- [0..], x <- [7,11]])
         = 5 : sieve p5
p5 = [y | y <-        [6*k+x | k <- [0..], x <- [7,11]], rem y 5 /= 0]
   = diff [ 6*k+x | k <- [0..], x <- [7,11]]   (map   (5*)
          [ 6*k+x | k <- [0..], x <- [                  5,       7]]) -- no mults of 2 or 3!
   = diff [30*k+x | k <- [0..], x <- [7,11,13,17,19,23,25,29,31,35]]  -- 30 = 6*5 = 5*6
          [30*k+x | k <- [0..], x <- [                 25,      35]]
   = [     30*k+x | k <- [0..], x <- [7,11,13,17,19,23,   29,31   ]]

This is the sequence with which the sieve of Atkin is working. No multiples of 2, 3 or 5 are present in it. Atkin and Bernstein work modulo 60, i.e. they double the range:

p60 = [ 60*k+x | k <- [0..], x <- [1, 7,11,13,17,19,23,29,31, 37,41,43,47,49,53,59]]

Next, they use some sophisticated theorems to know some properties of each of those numbers and deal with each accordingly. The whole thing is repeated (a-la the "wheel") with a period of 60.

  • "All numbers n with modulo-sixty remainder 1, 13, 17, 29, 37, 41, 49, or 53 (...) are prime if and only if the number of solutions to 4x^2 + y^2 = n is odd and the number is squarefree."

What does that mean? If we know that the number of solutions to that equation is odd for such a number, then it is prime if it is squarefree. That means it has no repeated factors (like 49, 121, etc).

Atkin and Bernstein use this to reduce the number of operations overall: for each prime (from 7 and up), we enumerate (and mark for removal) the multiples of its square, so they are much further apart than in the sieve of Eratosthenes, so there are fewer of them in a given range.

The rest of the rules are:

  • "All numbers n with modulo-sixty remainder 7, 19, 31, or 43 (...) are prime if and only if the number of solutions to 3x^2 + y^2 = n is odd and the number is squarefree."

  • "All numbers n with modulo-sixty remainder 11, 23, 47, or 59 (...) are prime if and only if the number of solutions to 3x^2 − y^2 = n is odd and the number is squarefree."

This takes care of all the 16 core numbers in the definition of p60.

see also: Which is the fastest algorithm to find prime numbers?


The time complexity of the sieve of Eratosthenes in producing primes up to N is O(N log log N), while that of the sieve of Atkin is O(N) (putting aside the additional log log N optimizations that don't scale well). The accepted wisdom though is that the constant factors in the sieve of Atkin are much higher and so it might only pay off high above the 32-bit numbers (N > 232), if at all.


Here's a c++ implementation of sieve of atkins that might help you...

#include <iostream>
#include <cmath>
#include <fstream>
#include<stdio.h>
#include<conio.h>
using namespace std;

#define  limit  1000000

int root = (int)ceil(sqrt(limit));
bool sieve[limit];
int primes[(limit/2)+1];

int main (int argc, char* argv[])
{
   //Create the various different variables required
   FILE *fp=fopen("primes.txt","w");
   int insert = 2;
   primes[0] = 2;
   primes[1] = 3;
   for (int z = 0; z < limit; z++) sieve[z] = false; //Not all compilers have false as the       default boolean value
   for (int x = 1; x <= root; x++)
   {
        for (int y = 1; y <= root; y++)
        {
             //Main part of Sieve of Atkin
             int n = (4*x*x)+(y*y);
             if (n <= limit && (n % 12 == 1 || n % 12 == 5)) sieve[n] ^= true;
             n = (3*x*x)+(y*y);
             if (n <= limit && n % 12 == 7) sieve[n] ^= true;
             n = (3*x*x)-(y*y);
             if (x > y && n <= limit && n % 12 == 11) sieve[n] ^= true;
        }
   }
        //Mark all multiples of squares as non-prime
   for (int r = 5; r <= root; r++) if (sieve[r]) for (int i = r*r; i < limit; i += r*r) sieve[i] = false;
   //Add into prime array
   for (int a = 5; a < limit; a++)
   {
            if (sieve[a])
            {
                  primes[insert] = a;
                  insert++;
            }
   }
   //The following code just writes the array to a file
   for(int i=0;i<1000;i++){
             fprintf(fp,"%d",primes[i]);
             fprintf(fp,", ");
   }

   return 0;
 }

The wiki page is always a good place to start, since it explains the algorithm in full and provides commented pseudocode. (N.B. There's a lot of detail, and since the wiki website is reliably up, I won't quote it here.)

For references in the specific languages you mentioned:

  • C implementation (optimised)
  • Python implementation

Hope that helps.