Google Code Jam Online Round 1A: Numbers

by Nathan

Recently I’ve taken some breaks from coding for profit to code for glory. Google Code Jam Online Round 1 began on Friday with sub-round A. To compete in Online Round 1 contestants had to successfully solve a “small” and “large” input set in the Qualification Round on July 16th. The top 840 contestants from Online Round 1A will compete in Online Round 2 on August 2nd, along with the top 840 finishers from Online Round 1B and 1C. I managed to place in the top 840 and advance to Round 2, but this post is about the problem that stumped me.

The last problem, “Numbers”, was pretty tricky with only about 4% of contestants (93 out of 2394) solving it. The question is (deceptively) simple:

In this problem, you have to find the last three digits before the decimal point for the number (3 + √5)n.

For example, when n = 5, (3 + √5)5 = 3935.73982… The answer is 935.

For n = 2, (3 + √5)2 = 27.4164079… The answer is 027.

The possible values of n for the “large input” case are

2 <= n <= 2000000000

meaning you’ll need more than high precision floating point math to solve the problem.1

I wasn’t able to solve it myself, but I enjoyed making sense of the top solutions after the contest and will explain some of them here. All solutions submitted during the contest, including those I reference here, can be downloaded from the scoreboard. If you didn’t compete in Round 1A it might be fun to stop at this point and try to solve “Numbers” yourself.

A key observation

All the solutions I’ve deciphered come down to one important fact: it suffices to calculate

(*) (3 + sqrt(5))^n + (3 - sqrt(5))^n - 1 mod 1000

Before making sense of (*) let’s establish some notation:

rp = 3 + sqrt(5)
rm = 3 - sqrt(5)

So, with this notation, the problem is to find

floor(rp^n) % 1000

and (*) can be rewritten as

(*) rp^n + rm^n - 1 mod 1000

To see (*), note that

(**) rp^n + rm^n is integral for all n

by expanding both terms via the binomial theorem and canceling the odd powers of sqrt(5), and so (*) follows from (**) and the fact that

0 < (3 - sqrt(5))^n < 1 for all n

OK, so how do we calculate rp^n + rm^n? There are at least three types of solutions in the top 10, but they’re mostly related to linear recurrences.

My favorite solution

Before talking about recurrences though, I’ll explain my favorite solution, by the 7th place finisher Reid (who might be Reid Barton, a hell of a problem solver apparently). It’s very light on theory, and easy to code. We’ll get it as a special case of a general formula.

If

p = a + sqrt(b)
m = a - sqrt(b)
R(n) = p^n + m^n

then for e = 0 or 1 we get

R(n) R(n+e) = p^{2n+e} + m^{2n+e} + (p^e + m^e)    (pm)^n
            = R(2n+e)             + 2 (1 + e(a-1)) (a^2 - b)^n

If we think of R(n) as an exponentiation, this formula gives a generalization of the recursion

x^{2n+e} = x^n x^{n+e}

used for efficient exponentiation which comes up later. In our case

a = 3
b = 5

and so

R(2n+e) = R(n) R(n+e) - 2 (1 + 2e) 4^n

Reid’s Haskell implementation is already quite readable, but a simple Python implementation is

@memoize
def R(n):
    if n == 0:
        return 2
    elif n == 1:
        return 6
    else:
        q,e = divmod(n,2)
        return (R(q) * R(q+e) - pow(4,q,1000) * 2 * (1 + 2*e)) % 1000

def solve():
    N = input()
    return '%03i' % ((R(N) - 1) % 1000)

Linear recurrences

Recall that given a linear recurrence

a_{n+2} = c_1 a_{n-1} + c_0 a_n

you can express a_n as a linear combination of roots of the
recurrence’s characteristic polynomial

c(x) = x^2 - (c_1 x + c_0)

if all the roots are distinct. This fact usually comes up in relation to the Fibonacci numbers in an algorithms or linear algebra class.

Since

(x - rp)(x - rm) = x^2 - 6x + 4

we see that the recurrence

a_{n+2} = 6 a_{n+1} - 4 a_n

has a solution

a_n = p rp^n + m rm^n

for some numbers p and m. Since we really want to calculate

rp^n + rm^n

we take p = m = 1 and get base cases

a_0 = p + m = 2
a_1 = p rp + m rm = 6

So, we’re in business if we can calculate a_n for n as large as 2 billion.

A naive solution

A naive looping implementation of the recurrence in C calculates the 2 billionth term in about 45 seconds on my 1.66 GHz Core Duo. So, it’s easy to calculate any particular term in the sequence in the allotted 8 minutes, but calculating all 100 terms in the “large input” set separately is about 10 times to slow. However, there are some obvious workarounds:

  1. store all 2 billion terms on disk (about 4GB) and have an intelligent way to index into them
  2. sort the input and then iterate through the input in parallel with calculating the sequence, storing only the terms you need (or more generally, have some way to quickly check if the current term in the sequence is one of the input terms: it’s much too slow to simply iterate over the whole list of 100 inputs each time you calculate a new sequence term)

A decent C programmer (not me) implementing (2.) would sort the input in C, but here’s a C version where I’ve used Python to pre-sort the input

#include <iostream>

int length = 100;
/* {input, case #, place to store output} */
int I[100][3] = {{103,14,0},{104,70,0},{105,85,0},{203,41,0},{204,78,0},{205,40,0},{1023,55,0},{32769,74,0},{75225234,23,0},{209877672,26,0},{276560520,27,0},{308200077,12,0},{330356541,45,0},{365921634,60,0},{368818964,36,0},{410638565,73,0},{444322222,4,0},{526886908,100,0},{546776723,39,0},{587767441,69,0},{590464207,77,0},{736875403,91,0},{800561938,13,0},{861924427,61,0},{903179180,88,0},{910062006,1,0},{936105644,59,0},{955677912,62,0},{1000875506,18,0},{1011977351,33,0},{1016518317,8,0},{1051461600,64,0},{1064277706,29,0},{1065829044,49,0},{1073741823,65,0},{1073741824,46,0},{1095316023,76,0},{1109957771,31,0},{1123049112,48,0},{1155744626,21,0},{1177572005,11,0},{1184485437,20,0},{1191967198,34,0},{1193741318,7,0},{1225400731,30,0},{1230910728,63,0},{1260829455,52,0},{1284145748,51,0},{1317056443,50,0},{1329765563,53,0},{1365380858,35,0},{1376932921,16,0},{1449613541,2,0},{1477014138,44,0},{1480436753,43,0},{1514881215,66,0},{1526692805,96,0},{1535334186,17,0},{1540548799,38,0},{1543660610,93,0},{1544963801,58,0},{1562234411,25,0},{1570752061,9,0},{1598846451,75,0},{1605814285,5,0},{1611433491,37,0},{1661296617,54,0},{1731301533,80,0},{1736284618,90,0},{1737629019,84,0},{1746961562,98,0},{1791276241,92,0},{1795024971,56,0},{1809415844,22,0},{1813833154,89,0},{1865363233,42,0},{1880802833,94,0},{1894509466,32,0},{1918517182,71,0},{1995422805,24,0},{1999999981,83,0},{1999999982,86,0},{1999999983,81,0},{1999999984,57,0},{1999999985,3,0},{1999999986,67,0},{1999999987,97,0},{1999999988,28,0},{1999999989,15,0},{1999999990,68,0},{1999999991,19,0},{1999999992,6,0},{1999999993,79,0},{1999999994,47,0},{1999999995,10,0},{1999999996,82,0},{1999999997,95,0},{1999999998,99,0},{1999999999,72,0},{2000000000,87,0}};

int main(int argc, char ** argv) {
  int a = 2, b = 6, c = 0;
  for (int i = 1; i <= 2000000000; ++i) {
    int b_ = (6*b - 4*a + 4000) % 1000;
    a = b; /* a is the ith term in the sequence */
    b = b_;
    while (I[c][0] == i && c < length)
      I[c++][2] = (a + 999) % 1000;
  }
  for (c = 0; c < length; ++c)
    printf(”Case #%d: %03d\n”, I[c][1], I[c][2]);
}

and the unix sort to post-sort the output

$ time ./a.out | sort -t # -k2,2g > C-large.out
./a.out  49.76s user 0.00s system 99% cpu 49.784 total
sort -t # -k2,2g > C-large.out  0.00s user 0.00s system 0% cpu 49.784 total

So, with a little bookkeeping, we calculate all 100 terms needed in the time it takes to calculate the last of them.

Periodicity

I’ve see two approaches used to speed up the calculation employed by the top 10 finishers and they’re both more sophisticated than my suggestions above. The 5th place finisher, Plagapong, has a simple solution: since we’re modding out by 1000 there are only 1000^2 possible values for (a_{n-1}, a_{n-2}) in

a_n  = 6 a_{n-1} - 4 a_{n-2}

and so the mod 1000 sequence must be periodic with some period at most 1000000. Plagapong wrote a helper program to calculate the period (it’s 100) and where the sequence starts being periodic (at n = 3).

Fast matrix exponentiation

The 1st and 2nd place finishers, Bohua and yuhch123, took a more heavy weight approach. Anyone that’s familiar with the relationship between exponentiation and linear recurrences probably also knows that exponentiation comes from viewing the recurrence as powering a matrix. In this case, the matrix equation is

(a_n a_{n-1}) = m^n (6 2)

for

m = 6 -4
    1  0

(yuhch123 uses (6 2), which corresponds to the base cases (a_1 a_0). Bohua, OTOH, uses (28 6), which corresponds to (a_2 a_1)). Doing exponentiation “efficiently”, namely finding x^n with O(lg n) multiplications, is a common trick, and both Bohua and yuhch123 employ it.

Here’s a Python version of their solutions:

def multmod(m1,m2):
    m3 = [[0,0],[0,0]]
    for r in range(len(m1)):
        for c in range(len(m2[0])):
            m3[r][c] = sum(m1[r][k]*m2[k][c]
                           for k in range(len(m1[0]))) % 1000
    return m3

@memoize
def powmod(n):
    if n == 1:
        return [[6,-4],[1,0]]
    else:
        q,r = divmod(n,2)
        m = multmod(powmod(q),powmod(q+r))
        return m

def solve():
    N = input()
    return ‘%03i’ % ((multmod(powmod(N),[[6],[2]])[1][0] - 1) % 1000)

where memoize is a memoization decorator defined by

def memoize(f):
    cache = {}
    def g(*args):
        if args not in cache:
            cache[args] = f(*args)
        return cache[args]
    return g

Fibonacci numbers

The next type of solution, employed by the 3rd and 4th place finishers neal.wu and newman, uses the Fibonacci numbers. Let

pp = (1 + sqrt(5))/2
pm = (1 - sqrt(5))/2

be the golden ratios and let F_n denote the nth Fibonacci number. Then

pp^2 = 1 + pp = rp/2
pm^2 = 1 + pm = rm/2
pp pm = -1

and so by the closed form expression for F_n

       pp^n - pm^n
F_n =  -----------
         sqrt(5)

we have

          (rp^n + rm^n) - 2^{n+1} (-1)^n
(F_n)^2 = ------------------------------
                      5 2^n

and so

rp^n + rm^n = 5 2^n (F_n)^2 - (-2)^{n+1}

With that you’re mostly done, but you still need to calculate F_n. Both of them use fast matrix exponentiation of the Fibonacci matrix

1 1
1 0

There’s still more to learn

I’m still working on the 6th place solution by Ahyangyi, which is another matrix power solution. Ahyangyi’s matrix isn’t a recurrence matrix, but it is similar to the recurrence matrix we saw above, meaning powering it gives linear combinations of rp^n and rm^n. I can understand that Ahyangyi’s solution works using matrix similarity together with the matrix identity

a k*b   a' k*b'   a'' k*b''
      *         =
b   a   b'   a'   b''   a''

(where a, a’, b, b’, and k are arbitrary numbers, and a” and b” depend on a, a’, b, and b’), but I have no idea how to derive it in a plausible way. Can anybody derive it?


Footnotes

1. A floating point library won’t solve the “large input” set, but it’s enough for the “small input” set, where

2 <= n <= 30

There are some Python solutions using the decimal library, and
yuhch123, the 2nd place finisher whose solution to the “large input” set is analyzed above, used the unix calculator bc to solve the “small input” set. yuhch123 used the bc script

scale=50
for (i = 2; i <= 30; i ++)
   (3 + sqrt(5))^i
quit

and some C code for IO.

Mystery Spices

by Nathan

The first thing Rigo and I noticed when we moved into our Pgh sublet on the 1st was how trashed it was. The previous tenants hadn’t cleaned anything or thrown away any of the stuff they decided not to take with them. It’s not all bad though: in addition to furniture they left about 30 mostly unlabeled tubs and bags of Indian spices, and some basmati rice and toor dal. I have no idea what most of the spices are, but I was able to identify a few. I had some spinach and an onion that desperately needed eating, so I tried a google search for “palak” and “dal”. I ended making the first result: Lasooni Dal Palak. It turned out pretty well, but we’ve got a lot of work ahead of us if we intend to eat all these mystery spices.

|