Large Numbers

First page . . . Back to page 6 . . . Forward to page 8 . . . Last page (page 11)

Bowers' Array Notation (4-element Subset)

In order to generalise his operators while also making it easy to extend the system further, Bowers created his array notation. The 3-element version of Bowers array notation was already covered above. It is easy to convert from the extended operator notation to an array notation version of the same number — sacrificing a bit by making the rules for "expanding" a number from the array notation a little complex.

All of the operators defined thus far can be expressed as an array with up to four elements, as follows:

[a] = a
[a,b] = a+b
[a,b,1] = [a,b] = a+b = a^①b
[a,b,2] = a×b = a^②b
[a,b,3] = a^b = a^③b
[a,b,c] = a <c> b = hy(a,c,b)
[a,b,c,1] = a <c> b      (combining a and b with the c^th operator from the added, multiplied, exponentiated, ... sequence)
[a,b,c,2] = a <<c>> b      (combining a and b with the c^th operator from the expanded, multiexpanded, powerexpanded, ... sequence)
[a,b,c,3] = a <<<c>>> b      (combining a and b with the c^th operator from the exploded, multiexploded, powerexploded, ... sequence)

Here are the rules:

1. For one- and two-element arrays, just add the elements. [a] = a and [a,b] = a+b
2. If rule 1 does not apply, and if there are any trailing 1's, remove them: [a,b,1,1] = [a,b]; [a,b,c,1] = [a,b,c], etc.
3. If neither previous rule applies, and the 2nd entry is a 1, remove all but the first element: [a,1,b,c] = [a] = a.
4. If none of the previous rules applies, and the 3rd entry is a 1: [a,b,1,c] becomes [a,a,[a,b-1,1,c],c-1]
5. Otherwise all four elements are greater than 1: [a,b,c,d] becomes [a,[a,b-1,c,d],c-1,d]

At this point, it is interesting to note how similar the Bowers array rules are to the definition of a recursive function like the Ackermann function. The Ackermann function was originally developed with the restriction that functions must be defined entirely in terms of calls to themselves and each other, and in terms of the "successor function" s(x) = x+1. The 5 rules above can be restated:

12. f(s(s(a)),1,1,1) = a
1b. f(s(s(a)),s(s(s(b))),1,1) = f(s(s(s(a))),s(s(b)),1,1)
3. f(s(s(a)),1,s(s(b)),s(s(c))) = a
4. f(s(s(a)),s(s(s(b))),1,s(s(s(c)))) = f(s(s(a)),s(s(a)),f(s(s(a)),s(s(b)),1,s(s(s(c)))),s(s(c)))
5. f(s(s(a)),s(s(s(b))),s(s(s(c))),s(s(d))) = f(1,f(1,s(s(b)),s(s(s(c))),s(s(d))),s(s(c)),s(s(d)))

Here is an example of applying the rules to the simplest non-trivial 4-element array:

[2,2,2,2] = [2,[2,1,2,2],1,2]      (by rule 5)
[2,1,2,2] = 2 (by rule 3)
so we have [2,2,1,2]
[2,2,1,2] = [2,2,[2,1,1,2],1]      (by rule 4)
[2,1,1,2] = 2      (by rule 3)
so we have [2,2,2,1]
[2,2,2,1] = [2,2,2]      (by rule 2}
[2,2,2] = [2,[2,1,2],1]      (by rule 5)
[2,1,2] = 2      (by rule 3)
so we have [2,2,1]
[2,2,1] = [2,2]      (by rule 2)
[2,2] = 2+2 = 4      (by rule 1)

With a little effort you can see that anything starting with [2,2 is equal to 4. To get anything bigger than 4, you have to have at least one 3. Here is the simplest example:

[3,2,1,2] = [3,3,[3,1,1,2],1]      (by rule 4)
= [3,3,3,1]      (because [3,1,1,2] = 3 by rule 3)
= [3,3,3]      (by rule 2)

Once it is reduced to a 3-element array, we can convert to hyper operator notation as established earlier. So [3,2,1,2] is 3^③3 = 3³ = 27. Now using the extended operator <<1>>, 3 <<1>> 2 = 3 {3} 3. This is the same as [3,2,1,2}.

Here is another example:

[3,2,2,2] = [3,[3,1,2,2],1,2]      (by rule 5)
= [3,3,1,2]      (because [3,1,2,2] = 3 by rule 3)
= [3,3,[3,2,1,2],1]      (by rule 4)
= [3,3,[3,3,[3,1,1,2],1],1]      (by rule 4 again)
= [3,3,[3,3,3,1],1]      (because [3,1,1,2] = 3 by rule 3)
= [3,3,[3,3,3]]      (by rule 2)
= [3,3,[3,[3,2,3],2]]      (by rule 5)
= [3,3,[3,[3,[3,1,3],2],2]]      (by rule 5)
= [3,3,[3,[3,3,2],2]]      ([3,1,3] = 3 by rule 3)
= [3,3,[3,[3,[3,2,2],1],2]]      (by rule 5)
= [3,3,[3,[3,[3,[3,1,2],1],1],2]]      (by rule 5)
= [3,3,[3,[3,[3,3]],2]]      (by rules 2 and 3)
= [3,3,[3,[3,6],2]]      (by rule 1)
= [3,3,[3,9,2]]      (by rule 1)
... (about 8 repeats of rules 5 and 1 to turn [3,9,2] into 27)
= [3,3,27] = 3 {27} 3 = 3^㉗3 = 3 {26} (3 {26} (3...))

This is equivalent to 3 <<2>> 2, which expands as follows:

3 <<2>> 2 = 3 <<1>> 3 = 3 <3 <3> 3> 3 = 3 <27> 3 = 3^㉗3

3 <<2>> 2 is the same as [3,2,2,2]. In fact, the rules for the 4-element array notation are equivalent to definitions of the extended operators. The array [a,b,c,2] is equal to a <<c>> b; [a,b,c,3] is a <<<c>>> b; and in general [a,b,c,d] is a <<<<<c>>>>> b with d sets of brackets around the c.

in 2006 Chris Bird proved[56] that a 4-element array [a,b,c,d] is larger than a→a→a→...→a→(b-1)→(c-1), using Conway's chained arrow notation, and with d occurrences of a in the chain (provided that a>2, b>1, c>0 and d>1). (Bird exhibited far more patience than I, who was merely satisfied with Bowers' own assessment that a 5-element array [n,n,n,n,n] is at least as large as an n-element chain n→n→n→...→n.)

Bowers Arrays with 5 or More Elements

Of course, Bowers wanted to extend the system, so the rules were designed to work with arrays of arbitrary length. This is done by changing rules 4 and 5 to the following:

4. If none of the previous rules applies, and the 3rd entry is a 1: Define the variables a,b,S,d and R so that the array is [a,b,S,1,d,R] where a,b are the first two elements, [S,1] is the string of 1 or more 1's; d is the first element bigger than 1 and [R] is the remaining part of the array. Replace the array with [a,a,S',[a,b-1,S,1,d,R],d-1,R] where [S'] is a string of a's of equal length as string [S].

5. If none of the previous rules applies, replace the second element: [a,b,c,R] becomes [a,[a,b-1,c,R],c-1,R]

I am fairly well convinced that Bowers is right in stating that the value represented by the 5-element array [n,n,n,n,n] is at least as large as n→n→n→...→n in the Conway chained-arrow notation, where there are n items in the chain.

For more on Bowers' notation, including updated definitions and a great many more steps in defining new recursive functions, read here: Array Notation, or check the newer, longer version here: Exploding Array Function. There are some other Bowers links below.

Generalised Invention of Recursive Functions

At this point it is best to just describe the general process. Larger numbers are described by defining various types of recursive functions, always defined in terms of themselves and other previously defined recursive functions. Each new definition adds a little more complexity to the system. In order to understand any one function, you have to understand all the functions it is defined in terms of. Once you have defined a new function, you can invoke it with larger and larger arguments: f(2), f(10), f(f(1000)), etc. until the amount of digits and notation symbols becomes inconvenient, then you define a new function g(x).

It is important to note that you keep adding information: plugging in larger numbers like 2, 10, 1000 increases the information, and defining functions greatly increases the information. In general, larger numbers require more information.

But defining functions is just an operation in itself. If you define a standardised way to define new functions, then you can abstractify the process of defining the functions, and define a new function based in the idea of iterating the process of defining functions. This requires modeling the process of recursive definition and computation, something that can be done with, say, a computer program that emulates another simpler computer.

This is a jump into a second-higher level of abstraction. Just as arithmetic is an algorithmic abstractification of counting, and defining functions is an algorithmic abstractifcation of the mechanics of arithmetic, this new process of automatically repeatedly defining functions is an abstractification of that.

All of these ideas were formalised and the process of algorithmic abstractification was studied in the theory of computation by Turing, Church and Kleene, among others. They showed that all algorithmic processes within a certain limited definition of algorithmic process could be reproduced by a certain, minimal definition of computation, and used that model to show that there were certain limits to what types of things could be computed. (We'll dive into that fairly deeply later, when we get to the Lin-Rado Busy Beaver Function).

Formal Grammars

If the foregoing makes little sense, consider this concrete (but somewhat non-rigorous) example. Select any well defined, "sufficiently powerful" grammar G, consisting of a symbol-set of S symbols, finite in number, and well-defined rules of what constitutes a syntactically valid string of symbols specifying an integer. An example grammar that should be fairly familiar uses the symbols:

0 1 2 3 4 5 6 7 8 9 + * ^ ( )

and the rules that these symbols are to be strung together to make a legal set of additions, multiplications and exponentiations yielding an integer result; in this example S = 15 because we listed fifteen symbols. Just to be unambiguous, we'll require parentheses whenever two or more operators appear in a string.

Given this grammar G, for every integer N there is a set of integers E_N consisting of all the integers that can be specified as a combination of N symbols in G using G's defined grammar. This set is finite, (it has, at most, S^N elements, since there are that many combinations of N symbols from a set of S). Since E_N has a finite number of elements, it therefore has a maximum element. Define m(N) to be a new function (not a part of the grammar G) giving the value of this maximum expressible integer in the grammar G for each N. Now we have a function which is guaranteed to grow at least as fast as any function defined within G, or faster. (Technically, it is only guaranteed to grow faster above a certain minimum value of N — this is part of what we vaguely called "sufficiently powerful"). In any case, this function, or any larger function definition from f(x) = m(x) + 1 to f(x) = m(m(m(x))) or beyond, can be defined as part of a new, larger grammar G' incorporating all of the definitions of G plus the new definition of f().

So, in the specific example given here, we find in particular that for N = 3, N = 7, N = 11, the largest expressible integers in G are:

9^9, therefore m(3) = 9 ^ 9
9^(9^9), therefore m(7) = 9^④3
9^(9^(9^9)), therefore m(11) = 9^④4

and in general for N = 4 X + 3 for any natural number X,

m(N) = m(4X+3) = 9^④(X+2)

Since N is always larger than X + 2 we can define our new grammar G' just by adding the symbol:

(which represents the ^④ or hyper4 operator) and the new syntax:

a h b

where a and b are valid strings, and interpreted as a^④b. This function grows faster than G's m(x) function. In this new grammar, which we now call G':

m'(3) is 9^④9
m'(7) is 9^④(9^④9)
m'(11) is 9^④(9^④(9^④9))

Now the process could continue to grammar G'' and so on. If you continue the same idea indefinitely you just get higher hyper operators, but you could also define new symbols using the ideas given above — the Ackermann function, the Conway chained-arrow notation, etc. At each stage you have a grammar G^x with its maximal function m^x(n) to which the same idea can be applied to generate another bigger function.

TREE[3]

Based on Kruskal's tree theorem, the TREE[] function is a finite-valued integer function that gives the length of the longest possible sequence of trees, obeying certain rules (see the link). It is a computable function because a program with finite (though large) memory can check all possible ways to build a sequence of trees under the given rules, performing a depth-first search until all possibilities are exhausted. Since the TREE[] function is finite-valued, any such search is guaranteed to terminate in finite time, so such a program must terminate.

This means that TREE[n], being computable, grows more slowly than the Lin-Rado busy beaver function, discussed on the next page.

Friedman's SSCG()

Friedman's SSCG function is a finite-valued integer function that gives the length of the longest possible sequence of "simple subcubic graphs", obeying certain rules (see the link). Due to the Robertson-Seymour theorem (a stronger version of the Kruskal theorem mentioned above), such a sequence must terminate; this means that a computer program with finite (though large) memory can explore all such sequences of graphs, and compute the value of SSCG(n) for any n.

Thus, SSCG() also grows more slowly than the Lin-Rado busy beaver function. However, it also grows much faster than TREE[]. In fact, SSCG(3) is larger than TREE[3]^TREE[3](3), where in this case the superscript "TREE[3]" represents successive application of the TREE[] function.

The related SCG() function is similar, applying to subcubic graphs in general (i.e. removing the qualifier "simple" which allows multigraphs). Adam Goucher demonstrates⁴¹ that SSCG(3) > TREE(3) > SSCG(2), where ">" indicates 'mind-bogglingly less than'. He also comments⁴², There's no qualitative difference between the asymptotic growth rates of SSCG and SCG. It's clear that SCG(n) >= SSCG(n), but I can also prove SSCG(4n + 3) >= SCG(n). Thus, the SCG() and SSCG() functions have effectively the same rate of growth in any classification like that of fast-growing hierarchies; but the relationship with TREE[] and Busy Beaver remains:

BB(n) > SSCG(n) > TREE[n]

First page . . . Back to page 6 . . . Forward to page 8 . . . Last page (page 11)

Japanese readers should see: 巨大数論 (from @kyodaisuu on Twitter)

If you like this you might also enjoy my numbers page.

s.30