The Bailey-Borwein-Plouffe algorithm in C# and F#
Nov 3, 2012Earlier this year I wrote about calculating the first \(n\) digits of \(\pi\) using a Machin-style formula. This type of formula is very efficient for generating all of the first \(n\) digits, but what if we really only want the \(n\) th digit by itself? The most naive approach would be to calculate all of the digits, then simply pick the last one. But is there is a faster way? What does the code look like?
It turns out that there is indeed a faster way to get the \(n\) th digit, as long as you don’t mind getting the digit in base 16. The method is based on the Bailey-Borwein-Plouffe formula, a very interesting identity which represents \(\pi\) as an infinite sum of terms, each with a coefficient of \(1/16^k\) .
This means that each term of the sum roughly corresponds to a base-16 digit. One has to keep track of carrying and whatnot, but the general idea is that the terms of the sum slot nicely into the hexadecimal columns. So we can grab the \(n\) th hexadecimal digit individually by simply taking the first \(n\) terms of this sum and discarding the data we don’t care about (i.e. data which contribute only to digits in position \(< n\) ). For the technical details see this section of the Wikipedia article.
In C#
Enough talk, let’s see some code! A reference implementation of the algorithm, in C, can be found here. A more-or-less direct translation to C# looks something like this:
using System;
using System.Text;
public static class Program
{
private const int NumHexDigits = 16;
private const double Epsilon = 1e-17;
private const int NumTwoPowers = 25;
private static double[] twoPowers = new double[NumTwoPowers];
public static void Main()
{
double pid, s1, s2, s3, s4;
int digitPosition = 1000000;
string hexDigits;
InitializeTwoPowers();
// Digits generated follow immediately after digitPosition.
s1 = Series(1, digitPosition);
s2 = Series(4, digitPosition);
s3 = Series(5, digitPosition);
s4 = Series(6, digitPosition);
pid = 4d * s1 - 2d * s2 - s3 - s4;
pid = pid - (int)pid + 1d;
hexDigits = HexString(pid, NumHexDigits);
Console.WriteLine("Position = {0}", digitPosition);
Console.WriteLine("Fraction = {0}", pid);
Console.WriteLine("Hex digits = {0}", hexDigits.Substring(0, 10));
}
// Returns the first NumHexDigits hex digits of the fraction of x.
private static string HexString(double x, int numDigits)
{
string hexChars = "0123456789ABCDEF";
StringBuilder sb = new StringBuilder(numDigits);
double y = Math.Abs(x);
for (int i = 0; i < numDigits; i++)
{
y = 16d * (y - Math.Floor(y));
sb.Append(hexChars[(int)y]);
}
return sb.ToString();
}
// This routine evaluates the series sum_k 16^(n-k)/(8*k+m)
// using the modular exponentiation technique.
private static double Series(int m, int n)
{
double denom, pow, sum = 0d, term;
// Sum the series up to n.
for (int k = 0; k < n; k++)
{
denom = 8 * k + m;
pow = n - k;
term = ModPow16(pow, denom);
sum = sum + term / denom;
sum = sum - (int)sum;
}
// Compute a few terms where k >= n.
for (int k = n; k <= n + 100; k++)
{
denom = 8 * k + m;
term = Math.Pow(16d, (double)(n - k)) / denom;
if (term < Epsilon)
break;
sum = sum + term;
sum = sum - (int)sum;
}
return sum;
}
// Fill the power of two table
private static void InitializeTwoPowers()
{
twoPowers[0] = 1d;
for (int i = 1; i < NumTwoPowers; i++)
{
twoPowers[i] = 2d * twoPowers[i - 1];
}
}
// ModPow16 = 16^p mod m. This routine uses the left-to-right binary
// exponentiation scheme.
private static double ModPow16(double p, double m)
{
int i;
double pow1, pow2, result;
if (m == 1d)
return 0d;
// Find the greatest power of two less than or equal to p.
for (i = 0; i < NumTwoPowers; i++)
{
if (twoPowers[i] > p)
{
break;
}
}
pow2 = twoPowers[i - 1];
pow1 = p;
result = 1d;
// Perform binary exponentiation algorithm modulo m.
for (int j = 1; j <= i; j++)
{
if (pow1 >= pow2)
{
result = 16d * result;
result = result - (int)(result / m) * m;
pow1 = pow1 - pow2;
}
pow2 = 0.5 * pow2;
if (pow2 >= 1d)
{
result = result * result;
result = result - (int)(result / m) * m;
}
}
return result;
}
}
Running this code, we see that the 10 hex digits of \(\pi\) starting at position 1,000,000 are 6C65E52CB4. Cool!
In F#
Below is my translation to F#. The algorithm is exactly the same as above, it’s just been written in a more functional style. In particular, I removed all of the “for” loops and replaced them with tail-recursive loops. This algorithm is easy to parallelize, so I went ahead and did that, too.
let epsilon = 1e-17
let twoPowers = [| for i = 0. to 25. do yield int (2. ** i) |]
// returns n^exp mod m, using binary exponentiation
let modPow n exp m =
match exp with
| x when x < 1 ->
n ** float x
| _ ->
let rec step i powTwo powOne acc =
match i with
| 0 -> acc
| _ ->
if powOne >= powTwo then
// multiply accumulator by n
step i powTwo (powOne - powTwo) ((n * acc) % m)
elif powTwo >= 2 then
// square the accumulator
step (i - 1) (powTwo / 2) powOne ((acc * acc) % m)
else
// just iterate
step (i - 1) (powTwo / 2) powOne acc
let i = Array.findIndex (fun twoPow -> twoPow > exp) twoPowers
step i twoPowers.[i - 1] exp 1.
// calculates the series sum_k 16^(n-k)/(8*k+m) using modular exponentiation
let series m n =
let rec sum i acc =
match n - i with
| -100 -> acc // do some extra terms to account for slop
| p ->
let denom = float (8 * i + m)
let term = (modPow 16. p denom) / denom
if i >= n && term < epsilon then
acc // bail out if extra terms are not significant
else
sum (i + 1) ((acc + term) % 1.)
sum 0 0.
// calculates the hex digits representing the fractional part of the given number
let digits numDigits (x : float) =
let rec step i x acc =
match i with
| 0 -> acc
| _ ->
let newX = 16. * (x - floor x)
step (i - 1) newX (sprintf "%s%X" acc (int newX))
step numDigits x ""
// returns "numDigits" hex digits of pi, starting at digit "index"
// async version
let piDigitsAsync index numDigits =
[(1, 4.); (4, -2.); (5, -1.); (6, -1.)]
|> List.map (fun (m, c) -> async { return c * series m index })
|> Async.Parallel
|> Async.RunSynchronously
|> Array.sum
|> digits numDigits
// synchronous version
let piDigitsSync index numDigits =
[(1, 4.); (4, -2.); (5, -1.); (6, -1.)]
|> List.map (fun (m, c) -> c * series m index)
|> List.sum
|> digits numDigits
let time f =
let sw = System.Diagnostics.Stopwatch.StartNew()
f ()
sw.Stop()
printfn "Elapsed: %A ms" sw.Elapsed.TotalMilliseconds
time (fun () -> printfn "%s" (piDigitsSync 1000000 10) )
time (fun () -> printfn "%s" (piDigitsAsync 1000000 10) )
I got interesting results when comparing the performance of the C, C#, and F# code. My test machine is Win8 x64, 2.8 GHz 4-core Intel i7, and I tested by running the code as shown above - calculating the 10 hex digits starting at position 1,000,000.
- The C code performed about the same when compiled for x86 or amd64 (~1.6 sec)
- The C# code performed better on amd64 (~1.6 sec) than x86 (~2 sec)
- The F# code performed much better on x86 (~1.5 sec sync, ~520 ms async) than amd64 (~3.7 sec sync, ~1.2 sec async)
I have it on some expert authority that the floating-point modulo operation is known to be quite slow on amd64 compared to x86, which could explain the F# results. But then why wouldn’t we see the same pattern in C and C#? I’m not sure, but if I find out I’ll update this post with the answer.
[Update 11/15/2012]
If you want to see the F# code in action, or tinker with it yourself, check it out at the preview site for the new Try F#.