Runs of Digits in Pi

Mathematica, 85 bytes

FromDigits/@DeleteDuplicatesBy[Join@@Subsets/@Split@RealDigits[Pi,10,#][[1]],Length]&

Anonymous function. Takes n as input, and returns the elements of the sequence in the first n digits of π. Output is in the form of {0, 3, 33, 111, ...}.


Python 2, 110 bytes

n=input()
x=p=7*n|1
while~-p:x=p/2*x/p+2*10**n;p-=2
l=m=0
for c in`x`:
 l=l*(p==c)+1;p=c
 if l>m:m=l;print p*l

The maximum number of digits to check is taken as from stdin. 10,000 digits finishes in about 2s with PyPy 5.3.

Sample Usage

$ echo 10000 | pypy pi-runs.py
3
33
111
9999
99999
999999

Something Useful

from sys import argv
from gmpy2 import mpz

def pibs(a, b):
  if a == b:
    if a == 0:
      return (1, 1, 1123)
    p = a*(a*(32*a-48)+22)-3
    q = a*a*a*24893568
    t = 21460*a+1123
    return (p, -q, p*t)
  m = (a+b) >> 1
  p1, q1, t1 = pibs(a, m)
  p2, q2, t2 = pibs(m+1, b)
  return (p1*p2, q1*q2, q2*t1 + p1*t2)

if __name__ == '__main__':
  from sys import argv
  digits = int(argv[1])

  pi_terms = mpz(digits*0.16975227728583067)
  p, q, t = pibs(0, pi_terms)

  z = mpz(10)**digits
  pi = 3528*q*z/t

  l=m=0
  x=0
  for c in str(pi):
   l=l*(p==c)+1;p=c
   if l>m:m=l;print x,p*l
   x+=1

I've switched from Chudnovsky to Ramanujan 39 for this. Chudnovsky ran out of memory on my system shortly after 100 million digits, but Ramanujan made it all the way to 400 million, in only about 38 minutes. I think this is another case were slower growth rate of terms wins out in the end, at least on a system with limited resources.

Sample Usage

$ python pi-ramanujan39-runs.py 400000000
0 3
25 33
155 111
765 9999
766 99999
767 999999
710106 3333333
22931752 44444444
24658609 777777777
386980421 6666666666

Faster Unbounded Generators

The reference implementation given in the problem description is interesting. It uses an unbounded generator, taken directly from the paper Unbounded Spigot Algorithms for the Digits of Pi. According to the author, the implementations provided are "deliberately obscure", so I decided to make fresh implementations of all three algorithms listed by the author, without deliberate obfuscation. I've also added a fourth, based on Ramanujan #39.

try:
  from gmpy2 import mpz
except:
  mpz = long

def g1_ref():
  # Leibniz/Euler, reference
  q, r, t = mpz(1), mpz(0), mpz(1)
  i, j = 1, 3
  while True:
    n = (q+r)/t
    if n*t > 4*q+r-t:
      yield n
      q, r = 10*q, 10*(r-n*t)
    q, r, t = q*i, (2*q+r)*j, t*j
    i += 1; j += 2

def g1_md():
  # Leibniz/Euler, multi-digit
  q, r, t = mpz(1), mpz(0), mpz(1)
  i, j = 1, 3
  z = mpz(10)**10
  while True:
    n = (q+r)/t
    if n*t > 4*q+r-t:
      for d in digits(n, i>34 and 10 or 1): yield d
      q, r = z*q, z*(r-n*t)
    u, v, x = 1, 0, 1
    for k in range(33):
      u, v, x = u*i, (2*u+v)*j, x*j
      i += 1; j += 2
    q, r, t = q*u, q*v+r*x, t*x

def g2_md():
  # Lambert, multi-digit
  q, r, s, t = mpz(0), mpz(4), mpz(1), mpz(0)
  i, j, k = 1, 1, 1
  z = mpz(10)**49
  while True:
    n = (q+r)/(s+t)
    if n == q/s:
      for d in digits(n, i>65 and 49 or 1): yield d
      q, r = z*(q-n*s), z*(r-n*t)
    u, v, w, x = 1, 0, 0, 1
    for l in range(64):
      u, v, w, x = u*j+v, u*k, w*j+x, w*k
      i += 1; j += 2; k += j
    q, r, s, t = q*u+r*w, q*v+r*x, s*u+t*w, s*v+t*x

def g3_ref():
  # Gosper, reference
  q, r, t = mpz(1), mpz(180), mpz(60)
  i = 2
  while True:
    u, y = i*(i*27+27)+6, (q+r)/t
    yield y
    q, r, t, i = 10*q*i*(2*i-1), 10*u*(q*(5*i-2)+r-y*t), t*u, i+1

def g3_md():
  # Gosper, multi-digit
  q, r, t = mpz(1), mpz(0), mpz(1)
  i, j = 1, 60
  z = mpz(10)**50
  while True:
    n = (q+r)/t
    if n*t > 6*i*q+r-t:
      for d in digits(n, i>38 and 50 or 1): yield d
      q, r = z*q, z*(r-n*t)
    u, v, x = 1, 0, 1
    for k in range(37):
      u, v, x = u*i*(2*i-1), j*(u*(5*i-2)+v), x*j
      i += 1; j += 54*i
    q, r, t = q*u, q*v+r*x, t*x

def g4_md():
  # Ramanujan 39, multi-digit
  q, r, s ,t = mpz(0), mpz(3528), mpz(1), mpz(0)
  i = 1
  z = mpz(10)**3511
  while True:
    n = (q+r)/(s+t)
    if n == (22583*i*q+r)/(22583*i*s+t):
      for d in digits(n, i>597 and 3511 or 1): yield d
      q, r = z*(q-n*s), z*(r-n*t)
    u, v, x = mpz(1), mpz(0), mpz(1)
    for k in range(596):
      c, d, f = i*(i*(i*32-48)+22)-3, 21460*i-20337, -i*i*i*24893568
      u, v, x = u*c, (u*d+v)*f, x*f
      i += 1
    q, r, s, t = q*u, q*v+r*x, s*u, s*v+t*x

def digits(x, n):
  o = []
  for k in range(n):
    x, r = divmod(x, 10)
    o.append(r)
  return reversed(o)

Notes

Above are 6 implementations: the two reference implementations provided by the author (denoted _ref), and four that compute terms in batches, generating multiple digits at once (_md). All implementations have been confirmed to 100,000 digits. When choosing batch sizes, I chose values that slowly lose precision over time. For example, g1_md generates 10 digits per batch, with 33 iterations. However, this will only produce ~9.93 correct digits. When precision runs out the check condition will fail, triggering an extra batch to be run. This seems to be more performant than slowly gainly extra, unneeded precision over time.

  • g1 (Leibniz/Euler)
    An extra variable j is kept, representing 2*i+1. The author does the same in the reference implementation. Calculating n separately is far simpler (and less obscure), because it uses the current values of q, r and t, rather than the next.
  • g2 (Lambert)
    The check n == q/s is admittedly quite lax. That should read n == (q*(k+2*j+4)+r)/(s*(k+2*j+4)+t), where j is 2*i-1 and k is i*i. At higher iterations, the r and t terms become increasingly less significant. As is, it's good for the first 100,000 digits, so it's probably good for all. The author provides no reference implementation.
  • g3 (Gosper)
    The author conjectures that it is unnecessary to check that n will not change in subsequent iterations, and that it only serves to slow the algorithm. While probably true, the generator is holding onto ~13% more correct digits than it has currently generated, which seems somewhat wasteful. I've added the check back in, and wait until 50 digits are correct, generating them all at once, with a noticable gain in performance.
  • g4 (Ramanujan 39)
    Calculated as

    Unfortunately, s doesn't zero out, due to the initial (3528 ÷) composition, but it's still significantly faster than g3. Convergence is ~5.89 digits per term, 3511 digits are generated at a time. If that's a bit much, generating 271 digits per 46 iterations is also a decent choice.

Timings

Taken on my system, for purposes of comparison only. Times are listed in seconds. If a timing took longer than 10 minutes, I didn't run any further tests.

            |  g1_ref |  g1_md  |  g2_md  |  g3_ref |  g3_md  |  g4_md 
------------+---------+---------+---------+---------+---------+--------
    10,000  |  1.645  |  0.229  |  0.093  |  0.312  |  0.062  |  0.062 
    20,000  |  6.859  |  0.937  |  0.234  |  1.140  |  0.250  |  0.109 
    50,000  |  55.62  |  5.546  |  1.437  |  9.703  |  1.468  |  0.234 
   100,000  |  247.9  |  24.42  |  5.812  |  39.32  |  5.765  |  0.593 
   200,000  |  2,158  |  158.7  |  25.73  |  174.5  |  33.62  |  2.156 
   500,000  |    -    |  1,270  |  215.5  |  3,173  |  874.8  |  13.51 
 1,000,000  |    -    |    -    |  1,019  |    -    |    -    |  58.02 

It's interesting that g2 eventually overtakes g3, despite a slower rate of convergence. I suspect this is because the operands grow at a significantly slower rate, winning out in the long run. The fastest implmentation g4_md is approximately 235x faster than the g3_ref implmentation on 500,000 digits. That said, there's still a significant overhead to streaming digits in this way. Calculating all digits directly using Ramanujan 39 (python source) is approximately 10x as fast.

Why not Chudnovsky?

The Chudnovsky algorithm requires a full precision square root, which I'm honestly not sure how to work in - assuming it could be at all. Ramanujan 39 is somewhat special in this regard. However, the method does seem like it might be conducive to Machin-like formulas, such as those used by y-cruncher, so that might be an avenue worth exploring.


Python 2, 298 bytes

Note, the code for generating pi is taken from the OP's implementation.

def p():
 q,r,t,j=1,180,60,2
 while 1:
  u,y=3*(3*j+1)*(3*j+2),(q*(27*j-12)+5*r)//(5*t)
  yield y
  q,r,t,j=10*q*j*(2*j-1),10*u*(q*(5*j-2)+r-y*t),t*u,j+1
p=p()
c=r=0
d=[0]
while 1:
 t=p.next()
 if t==d[len(d)-1]:d.append(t)
 else:d=[t]
 if len(d)>r:r=len(d);print"".join([`int(x)`for x in d])
 c+=1

My first attempt at golfing in Python. Outputs the sequence forever.