Overflow-Proofing Golly

Contents

Introduction

The very first thing I wanted to do with Golly, back when I first tried it years ago, is compare its Hashlife performance to Gosper's original benchmarks (from his 1984 paper [1]).

Here is Gosper's pattern. You can cut and paste this text into Golly, which will recognize it as a Life pattern:

# This is the pattern Gosper used as an example in his paper describing # the Hashlife algorithm # # R Wm. Gosper. Exploiting Regularities in Large Cellular Spaces. # Physica D 10, pp. 75-80 (1984) # # "An early model Symbolics 3600 ran [this pattern] several million # steps, at a rate which doubled every two or three minutes, once # the initial explosion settled into oscillation." x = 18, y = 5, rule = B3/S23 3o11b3o$o2bo10bo2bo$o6b3o4bo$o5bo2bo4bo$bo8bo4bo!

Golly easily runs this pattern to 10¹⁰⁰⁰ generations in a couple minutes. In fact, it's so fast that my computer is using less than 100% of a CPU core, because the couple minutes is pretty much all spent waiting for auto-repeat on the '+' key to repeatedly invoke the Control > Faster comand.

Unfortunately, Golly's generation counter overflows after mere seconds of holding the '+' key (or minutes if you just turn on Control > Hyperspeed and wait) because Golly converts its internal bigints to floating-point before printing on the screen, and floating-point overflows around 10³⁰⁸.

Yes, there's View > Show Exact Numbers, but that's fairly useless when there are too many digits to fit on the screen. I'm trying to benchmark Golly and I need scientific notation for that!

The Overflow-Proof bigint Formatter

The first thing to do is fix the output overflow. We need a conversion function that turns a bigint into a float, but preserves the entire exponent and a suitable number of mantissa digits. We also need to alter the user interface function that turns the floating-point number into a string.

In the Golly source code, in bigint.cpp, add this function:

/** * Turn this bigint into a double in a way that preserves huge exponents. * Here are some examples: * * To represent * the quantity We return the value * 27 1.27 * -6.02e23 -23.602 * 6.02e23 23.602 * 9.99e299 299.999 * 1.0e300 300.1 * 1.0e1000 1000.1 This would normally overflow * 6.7e12345 12345.67 And this wouldn't stand a chance even with * long double on the 8087 */ double bigint::toscinot() const { double mant, exponent; double k_1_10 = 0.1; double k_1_10000 = 0.0001; double k_base = 65536.0 * 32768.0; exponent = 0; if (v.i & 1) { /* small integer */ mant = (double)(v.i >> 1) ; } else { /* big integer: a string of 31-bit chunks */ mant = 0 ; double m = 1 ; for (int i=1; i<=v.p[0]; i++) { mant = mant + m * v.p[i] ; m *= k_base ; while (m >= 100000.0) { m *= k_1_10000; mant *= k_1_10000; exponent += 4.0; } } } /* Add the last few powers of 10 back into the mantissa */ while(((mant < 0.5) && (mant > -0.5)) && (exponent > 0.0)) { mant *= 10.0; exponent -= 1.0; } /* Mantissa might be greater than 1 at this point */ while((mant >= 1.0) || (mant <= -1.0)) { mant *= k_1_10; exponent += 1.0; } if (mant >= 0) { // Normal case: 6.02e23 -> 23.602 return(exponent + mant); } // Negative case: -6.02e23 -> -23.602 // In this example mant will be -0.602 and exponent will be 23.0 return(mant - exponent); }

The header file bigint.h needs the prototype for this new function:

double toscinot() const ;

Then we need to make the StatusBar::Stringify function smarter. This is in wxstatus.cpp, and I'm showing the whole function just to be clear (much of it is unchanged from the standard version):

wxString StatusBar::Stringify(const bigint& b) { static char buf[32]; char* p = buf; double d = b.toscinot(); if ( fabs(d) < 10.0 ) { // show exact value with commas inserted for readability d = b.todouble(); if ( d < 0 ) { d = - d; *p++ = '-'; } sprintf(p, "%.f", d); int len = strlen(p); int commas = ((len + 2) / 3) - 1; int dest = len + commas; int src = len; p[dest] = 0; while (commas > 0) { p[--dest] = p[--src]; p[--dest] = p[--src]; p[--dest] = p[--src]; p[--dest] = ','; commas--; } if ( p[-1] == '-' ) p--; } else { // use e notation for abs value > 10^9 (agrees with min & max_coord) const char * sign = ""; if (d < 0) { sign = "-"; d = 0.0 - d; } double exp = floor(d); d = (d - exp) * 10.0; exp = exp - 1.0; // UI has been set up to accomodate "9.999999e+999" // which is the same width as "9.9999999e+99", etc. if (exp < 100.0) { // two-digit exponent sprintf(p, "%s%9.7fe+%.0f", sign, d, exp); // 9.9999999e+99 } else if (exp < 1000.0) { sprintf(p, "%s%8.6fe+%.0f", sign, d, exp); // 9.999999e+999 } else if (exp < 10000.0) { sprintf(p, "%s%7.5fe+%.0f", sign, d, exp); // 9.99999e+9999 } else if (exp < 100000.0) { sprintf(p, "%s%6.4fe+%.0f", sign, d, exp); // 9.9999e+99999 } else { // for 6-digit exponent or larger we'll just always show a "d.ddd" // mantissa. 7-digit exponent appears unattainable at the present // time (late 2011) sprintf(p, "%s%5.3fe+%.0f", sign, d, exp); } } return wxString(p, wxConvLocal); }

A Usable goto.py

Once Golly is able to display meaningful generation counts, I was still trying to benchmark the Hashlife to see how fast it compares to Gosper's version.

As I mentioned above, simply holding down the '+' key (invoking the Control > Faster command) isn't good enough. The CPU load monitor shows that Golly is clearly mostly idle. What I need is a way to say "Please go right to generation 10¹⁰⁰⁰". But unfortnately, again, the goto.py and goto.pl scripts do not accept scientific notation.

I haven't finished this part yet, but I have a partial solution.

Here is a Python script that accepts arbitrary expressions, such as "3*2^(2+5*173)+17*2^100+1041", performs the arithmetic using bigints, and prints the result. It also accepts scientific notation, like "3e1000", as long as there is no decimal point and the exponent has no sign (because I want to be sure it's always an integer).

You still need to manually copy and paste the result from the Python script into the the "Go to geneation" prompt given by Golly's goto.py script.

#!/usr/bin/python # # expr.py - my first Python script # # This implements expression evaluation using a method similar to Hypercalc. # # REVISION HISTORY # 20111103 First version # 20111105 Add expr_2 using re.sub (a cleaner implementation) # 20120624 Remove whitespace before evaluating # -------------------------------------------------------------------- import re expr_addsub = re.compile('^(.*[^0-9])([0-9]+)([\+\-])([0-9]+)([^0-9].*)$') expr_mul = re.compile('^(.*[^0-9])([0-9]+)\*([0-9]+)([^0-9].*)$') expr_exp = re.compile('^(.*[^0-9])([0-9]+)\^([0-9]+)([^0-9].*)$') expr_ee = re.compile('^(.*[^0-9])([0-9]+)e([0-9]+)([^0-9].*)$') expr_remove_spaces = re.compile('^ *(.*[^ ]) *$') expr_match_white = re.compile('[ \t]+') # -------------------------------------------------------------------- p_paren = re.compile('$(\d+)$') p_ee = re.compile('(\d+)e(\d+)') p_exp = re.compile('(\d+)\^(\d+)') p_mul = re.compile('(\d+)([*/])(\d+)') p_add = re.compile('(\d+)([-+])(\d+)') # erf = expression replace function def erf_paren(match): "Remove parentheses" a = match.group(1) return a def erf_ee(match): "Scientific notation" a = int(match.group(1)) b = int(match.group(2)) return str(a*10**b) def erf_exp(match): "Exponentiation" a = int(match.group(1)) b = int(match.group(2)) return str(a**b) def erf_mul(match): "Multiplication and (integer) division" a = int(match.group(1)) # match.group(2) is operator b = int(match.group(3)) if(match.group(2) == '*'): return str(a*b) else: return str(a//b) def erf_add(match): "Addition and subtraction" a = int(match.group(1)) # match.group(2) is operator b = int(match.group(3)) if(match.group(2) == '+'): return str(a+b) else: return str(a-b) def expr_2(p): p = expr_match_white.sub('', p) gg = 1 while gg: if p_ee.search(p): p = p_ee.sub(erf_ee, p, count=1) # print 'got:', p elif p_exp.search(p): p = p_exp.sub(erf_exp, p, count=1) # print 'got:', p elif p_mul.search(p): p = p_mul.sub(erf_mul, p, count=1) # print 'got:', p elif p_add.search(p): p = p_add.sub(erf_add, p, count=1) # print 'got:', p elif p_paren.search(p): p = p_paren.sub(erf_paren, p, count=1) # print 'got:', p else: # print 'no more matches' gg = 0 return p # -------------------------------------------------------------------- gg2 = 1 while gg2: s = raw_input('expression: ') if len(s) == 0: gg2 = 0 else: s = expr_2(s) print 'result:', s

(Finally) some Automatic Benchmarks

Using the above script I was able to calculate 2²⁰⁰⁰ and paste the answer into goto.py's prompt to make Golly go to generation 2²⁰⁰⁰, doing the whole calculation automatically (i.e. without me holding the '+' key or doing any other interaction with Golly until the calculation is done).

I had Control > Hyperspeed turned on, and started with the Gosper puffer start pattern shown above. It takes a while because Golly's Hyperspeed waits a while before deciding to go to each bigger step, and it increases the step size by a factor of 4 each time. Nevertheless the speed is impressive: it gets to generation 2²⁰⁰⁰ = 1.148130695281×10⁶⁰² in about 31 minutes. On average, this is a little better than one doubling every second.

This benchmark shows that, with its default settings, Golly's hashlife can double the simulation speed and distance at a rate more than 100 times as fast as Gosper's 1984 implementation. However, it's clearly way slower than the speed I describe in the introduction, when I was holding down the '+' key to make Golly accelerate more quickly.

Even Faster

The first thing that can be done is to boost the acceleration rate of Hyperspeed with Control > Set Base Step... command. Normally the "base step" is 4, and Hyperspeed increases the step size by a factor of 4 each time. But you can change this "4" to something higher, preferably a power of 2. Repeating the above experiment with the base step set to 256, I got to generation 2²⁰⁰⁰ in just about 10 minutes.

This is not as fair a comparison to Conway's results, because Conway's implementation was set to accelerate in 2× steps. But it is a meaningful benchmark for anyone more interested in how fast Golly can go.

Fastest of All

Golly's Hyperspeed algorithm is designed to be cautious about speeding up, in order to make sure there is visible activity at all times. So, it won't speed up until Golly has been running at the current speed for a little while, achieving a reasonable frame rate. If the frame rate is very low, it won't accelerate.

I can bypass Hyperspeed entirely by using a different technique: setting the step size to the desired generation, in this case 2²⁰⁰⁰, and going there in just one step!

Load the starting pattern (so it's at generation 0 with no history in the hlife data structures)
Use Control > Set Base Step... and set it to 256
Press and hold the key for Control > Faster (normally +) until it says "Step=256^250"
Invoke Control > Next Step (normally the Tab key) just once.

Results: Starting with a freshly-launched Golly and the Conway puffer pattern, I reached generation 2²⁰⁰⁰ = 1.148e602 in just 2.8 seconds.

Footnotes

[1] R Wm. Gosper. Exploiting Regularities in Large Cellular Spaces. Physica D 10 75-80 (1984)

This page was written in the "embarrassingly readable" markup language RHTF, and was last updated on 2012 Jun 27.

s.27