I'VE GOT THE BYTE ON MY SIDE

57005 or alive

The Bailey-Borwein-Plouffe algorithm in C# and F#

Nov 3, 2012 C# F# math

Earlier this year I wrote about calculating the first \(n\) digits of \(\pi\) using a Machin-style formula.  This type of formula is very efficient for generating all of the first \(n\) digits, but what if we really only want the \(n\) th digit by itself?  The most naive approach would be to calculate all of the digits, then simply pick the last one.  But is there is a faster way?  What does the code look like?

It turns out that there is indeed a faster way to get the \(n\) th digit, as long as you don’t mind getting the digit in base 16.  The method is based on the Bailey-Borwein-Plouffe formula, a very interesting identity which represents  \(\pi\)  as an infinite sum of terms, each with a coefficient of \(1/16^k\) .

This means that each term of the sum roughly corresponds to a base-16 digit.  One has to keep track of carrying and whatnot, but the general idea is that the terms of the sum slot nicely into the hexadecimal columns.  So we can grab the \(n\) th hexadecimal digit individually by simply taking the first \(n\) terms of this sum and discarding the data we don’t care about (i.e. data which contribute only to digits in position \(< n\) ).  For the technical details see this section of the Wikipedia article.

In C#

Enough talk, let’s see some code!  A reference implementation of the algorithm, in C, can be found here.  A more-or-less direct translation to C# looks something like this:

using System;
using System.Text;

public static class Program
{
    private const int NumHexDigits = 16;
    private const double Epsilon = 1e-17;
    private const int NumTwoPowers = 25;

    private static double[] twoPowers = new double[NumTwoPowers];

    public static void Main()
    {
        double pid, s1, s2, s3, s4;
        int digitPosition = 1000000;
        string hexDigits;

        InitializeTwoPowers();

        //  Digits generated follow immediately after digitPosition.
        s1 = Series(1, digitPosition);
        s2 = Series(4, digitPosition);
        s3 = Series(5, digitPosition);
        s4 = Series(6, digitPosition);

        pid = 4d * s1 - 2d * s2 - s3 - s4;
        pid = pid - (int)pid + 1d;
        hexDigits = HexString(pid, NumHexDigits);

        Console.WriteLine("Position = {0}", digitPosition);
        Console.WriteLine("Fraction = {0}", pid);
        Console.WriteLine("Hex digits =  {0}", hexDigits.Substring(0, 10));
    }

    // Returns the first NumHexDigits hex digits of the fraction of x.
    private static string HexString(double x, int numDigits)    
    {
        string hexChars = "0123456789ABCDEF";
        StringBuilder sb = new StringBuilder(numDigits);
        double y = Math.Abs(x);

        for (int i = 0; i < numDigits; i++)
        {
            y = 16d * (y - Math.Floor(y));
            sb.Append(hexChars[(int)y]);
        }

        return sb.ToString();
    }

    // This routine evaluates the series  sum_k 16^(n-k)/(8*k+m) 
    //    using the modular exponentiation technique.
    private static double Series(int m, int n)
    {
        double denom, pow, sum = 0d, term;

        //  Sum the series up to n.
        for (int k = 0; k < n; k++)
        {
            denom = 8 * k + m;
            pow = n - k;
            term = ModPow16(pow, denom);
            sum = sum + term / denom;
            sum = sum - (int)sum;
        }

        //  Compute a few terms where k >= n.
        for (int k = n; k <= n + 100; k++)
        {
            denom = 8 * k + m;
            term = Math.Pow(16d, (double)(n - k)) / denom;

            if (term < Epsilon)
                break;

            sum = sum + term;
            sum = sum - (int)sum;
        }

        return sum;
    }

    // Fill the power of two table
    private static void InitializeTwoPowers()
    {
        twoPowers[0] = 1d;

        for (int i = 1; i < NumTwoPowers; i++)
        {
            twoPowers[i] = 2d * twoPowers[i - 1];
        }
    }

    // ModPow16 = 16^p mod m.  This routine uses the left-to-right binary 
    //  exponentiation scheme.
    private static double ModPow16(double p, double m)
    {
        int i;
        double pow1, pow2, result;

        if (m == 1d)
            return 0d;

        // Find the greatest power of two less than or equal to p.
        for (i = 0; i < NumTwoPowers; i++)
        {
            if (twoPowers[i] > p)
            {
                break;
            }
        }

        pow2 = twoPowers[i - 1];
        pow1 = p;
        result = 1d;

        // Perform binary exponentiation algorithm modulo m.
        for (int j = 1; j <= i; j++)
        {
            if (pow1 >= pow2)
            {
                result = 16d * result;
                result = result - (int)(result / m) * m;
                pow1 = pow1 - pow2;
            }

            pow2 = 0.5 * pow2;

            if (pow2 >= 1d)
            {
                result = result * result;
                result = result - (int)(result / m) * m;
            }
        }

        return result;
    }
}

Running this code, we see that the 10 hex digits of \(\pi\) starting at position 1,000,000 are 6C65E52CB4.  Cool!

In F#

Below is my translation to F#.  The algorithm is exactly the same as above, it’s just been written in a more functional style.  In particular, I removed all of the “for” loops and replaced them with tail-recursive loops.  This algorithm is easy to parallelize, so I went ahead and did that, too.

let epsilon = 1e-17
let twoPowers = [| for i = 0. to 25. do yield int (2. ** i) |]

// returns n^exp mod m, using binary exponentiation
let modPow n exp m =
    match exp with
    | x when x < 1 ->
        n ** float x
    | _ ->
        let rec step i powTwo powOne acc = 
            match i with
            | 0 -> acc
            | _ ->
                if powOne >= powTwo then
                    // multiply accumulator by n
                    step i powTwo (powOne - powTwo) ((n * acc) % m)
                elif powTwo >= 2 then
                    // square the accumulator
                    step (i - 1) (powTwo / 2) powOne ((acc * acc) % m)
                else
                    // just iterate
                    step (i - 1) (powTwo / 2) powOne acc

        let i = Array.findIndex (fun twoPow -> twoPow > exp) twoPowers
        step i twoPowers.[i - 1] exp 1.

// calculates the series sum_k 16^(n-k)/(8*k+m) using modular exponentiation
let series m n =
    let rec sum i acc =
        match n - i with
        | -100 -> acc  // do some extra terms to account for slop
        | p ->
            let denom = float (8 * i + m)
            let term = (modPow 16. p denom) / denom
            if i >= n && term < epsilon then
                acc   // bail out if extra terms are not significant
            else
                sum (i + 1) ((acc + term) % 1.)
    sum 0 0.

// calculates the hex digits representing the fractional part of the given number
let digits numDigits (x : float) =
    let rec step i x acc =
        match i with
        | 0 -> acc
        | _ ->
            let newX = 16. * (x - floor x)
            step (i - 1) newX (sprintf "%s%X" acc (int newX))
    step numDigits x ""

// returns "numDigits" hex digits of pi, starting at digit "index"
// async version
let piDigitsAsync index numDigits =
    [(1, 4.); (4, -2.); (5, -1.); (6, -1.)]
    |> List.map (fun (m, c) -> async { return c * series m index })
    |> Async.Parallel
    |> Async.RunSynchronously
    |> Array.sum
    |> digits numDigits

// synchronous version
let piDigitsSync index numDigits =
    [(1, 4.); (4, -2.); (5, -1.); (6, -1.)]
    |> List.map (fun (m, c) -> c * series m index)
    |> List.sum
    |> digits numDigits

let time f =
    let sw = System.Diagnostics.Stopwatch.StartNew()
    f ()
    sw.Stop()
    printfn "Elapsed: %A ms" sw.Elapsed.TotalMilliseconds

time (fun () -> printfn "%s" (piDigitsSync 1000000 10) )
time (fun () -> printfn "%s" (piDigitsAsync 1000000 10) )

I got interesting results when comparing the performance of the C, C#, and F# code.  My test machine is Win8 x64, 2.8 GHz 4-core Intel i7, and I tested by running the code as shown above - calculating the 10 hex digits starting at position 1,000,000.

I have it on some expert authority that the floating-point modulo operation is known to be quite slow on amd64 compared to x86, which could explain the F# results.  But then why wouldn’t we see the same pattern in C and C#?  I’m not sure, but if I find out I’ll update this post with the answer.

[Update 11/15/2012]

If you want to see the F# code in action, or tinker with it yourself, check it out at the preview site for the new Try F#.


Comments