Large Numbers


First page . . . Back to page 2 . . . Forward to page 4 . . . Last page (page 7)


Uncomputably Larger and Uncomparable

At this point it is useful to define the concept of Uncomputably Larger.

Uncomputably Larger: A is Uncomputably Larger than B if A is larger than B, but it the difference does not show up when the numbers are expressed in the same system of representation, or if it is not possible to figure out the magnitude of the difference using ordinary numerical techniques. A system of representation is any system of digits and/or symbols used to express numbers in a standard form that lends itself well to seeing which is bigger. Standard scientific notation is an example; mathematical formulas in general are not.

If A is a class 3 number number and K is class 2 or smaller, it is easy to distinguish A × K from A, but hard to distinguish A + K from A. So A + K is uncomputably larger than A.

For an example of this, imagine A has trillions of digits. If you add some small number to it, only the last few digits will change — and all of the digits would have to be stored and examined to tell the difference. On the other hand, multiplying A by a small number N will change all the digits, and you can distinguish the difference by comparing the logarithm of A to that of A×N

If A is a class 4 number and K is class 3 or smaller, it is easy to distinguish AK from A, but hard to distinguish A × K from A. A × K is uncomputably larger than A.

If A is a class 5 number and K is class 4 or smaller, it is easy to distinguish AK from A, but hard to distinguish AK from A. ( is the hyper4 operator). AK is uncomputably larger than A.

This pattern does not continue with higher operators, because the "class" system is based on exponents. For example, if A is a class 10 number and K is class 9 or smaller, it is still easy to distinguish AK from A, and hard to distinguish AK from A.

Uncomparable: A and B are said to be Uncomparable if it is unknown how to distinguish which is larger.

Notice that this definition depends not only on A and B but also on your level of knowledge. As you go to higher and higher operators and functions it becomes quite difficult to determine which values are larger than others. It is easy to see that Skewes' Number is bigger than googolplex, but not nearly so easy to figure out which of Graham's Number and the Moser is bigger. Graham's Number and the Moser are defined with different systems of representation, and the two systems cannot be readily converted into each other. They are called uncomparable until the two systems are studied and a method is developed to show which number is larger. They have been presented in order here only after much careful study.

Power Towers

A tower of exponents like eee79 mentioned above in the discussion of Skewes' numbers is often called a power tower. Notice how in the examples of class-3 numbers there are three numbers in the power tower, and in the class-4 examples there are 4 numbers in the power tower. But this isn't necessarily always the case, for example 2222} is a power tower with 4 numbers but its value is 65536, a class-1 number.

Problem: Start with the 3-level power tower 2210. Consider two different ways to make it bigger: Increase the bottom-most number, making it H210 where H is something really huge like 1000000, or make the power tower higher by making it S2210, where S is something really small like 1.001. Determine which is biggest: the original power tower X = 2210, or the two altered versions, A = 1000000210 or B = 1.0012210 ?

First we show that A and B are both bigger than X. A>X is obvious. For B it's less obvious. We're comparing:

B = 1.0012210 ⋛ 2210 = X

Convert both to a power of 2, by taking the log base 2 of 1.001, and repeat. We get:

B = (20.001442)2210 ⋛ 2210 = X

B = 2(0.001442 × 2210) ⋛ 2210 = X

log2B = 0.001442 × 2210 ⋛ 210 = log2X

log2B = 2(-9.4377 + 210) ⋛ 210 = log2X

log2(log2B) = -9.4377 + 210 ⋛ 10 = log2(log2X)

log2(log2B) is about 1014.56, much bigger than log2(log2X) which is 10. So B>X.

Now we need to compare A to B. Let's rewrite A as a power of 1.001:

A = (1.001(log1.0011000000))210 = 1.001(log1.0011000000) × (210)

Then it is a question of which of these is larger: A' = log1.0011000000 × 210 or B' = 2210. Substituting 210 = 1024, we're comparing A' = (log1.0011000000)×1024 to B' = 2^1024, so A'/B' = (log1.0011000000)×1024 / 21024. Cancelling powers of 2, we remove 1024 from the numerator and reduce the denominator to 21014: A'/B' = log1.0011000000 / 2^1014. log1.0011000000 is pretty large (it's a little over 13822) but 2^1014 is much much larger! So B' is larger, and therefore B is the biggest of our three power towers.

We could have used any really big number H in place of 1000000 and any small number S in place of 1.001 and B would still be the biggest, as long as logSH is less than 21014. 21014 is about 1.7556×10305, a class 2 number. To show how extreme this is, let H be a googol and S be 1+1/googol. logSH would be ln(10)×100×googol = 2.3026×10102, still much less than 21014. So even with this really huge H and really small S, the power tower S2210 is still bigger than H210.

In general, if you have a tower of exponents and you want to make it larger, you'll make it much larger by adding another exponent to the bottom than by increasing the size of the bottom exponent, as long as the tower of exponents has something fairly big at the top and the numbers involved are all class 1 or on the lower end of class 2. This leads to the (somewhat nonintuitive) result that if you're comparing two towers of exponents, you can look at how many exponents are in the tower and know right away which is larger. For example,

1.11.11.11000

is much larger than

100010001000

and

1.11.11.11.11.11.11.11000

(which has 7 levels of exponents with 1000 at the top) is much larger than

1000100010001000100010001000

(which has 6 levels of exponents).

If you're interested in trying some of these out, my Perl-based calculator program can handle everything discussed so far, up to power towers thousands of numbers high.

Inventing New Operators and Functions

The concept of the "classes" described so far does quite well at handling everything that can be done with exponents, which are the most powerful operator known to most people. To proceed further we begin to invent new operators. This practice of inventing new operators continues over and over again as you go to higher and higher large numbers. The new operators overcome the limits of the old operators, limits that are reached as the old notation becomes unwieldy.

For example, class-1 numbers are written in traditional place-value notation, which is essentially abbreviated addition and multiplication. For example:

3158 = ((3 × 10 + 1) × 10 + 5) × 10 + 8

Although we don't normally think of it that way, the place-value notation avoids the unwieldy use of lots of symbols.

When expressing larger numbers, like Avogadro's number and googol, one usually uses exponents and power towers, as discussed above:

6.02 × 1023, 10100, 1010100, 27256312546656, etc.

but after a while that becomes unwieldy too. Eventually there are so many exponents that it cannot be written on a page. Then it becomes a good idea to invent a new new shorthand, which amounts to defining a new operator.

Beyond Exponents: the hyper4 (Tetration) Operator

The first new operators used by those seeking large numbers are usually higher dyadic operators. A dyadic operator is one that has two arguments — two numbers that it acts on. Usually in notation the operator is placed between the two numbers.

The most common higher dyadic operators follow the pattern set by the well-known three (addition, multiplication and exponentiation). These operators come up a lot in the definitions of large numbers that are to follow.

Following an obvious pattern in the three common operators, the new operator can be defined as shown here:

operation representation absolute definition inductive definition
addition a + b  or  ab - successor (a + (b-1)) or successor ((a-1) + b)
multiplication a×b  or  a*b  or  a.b  or  ab a + a + ... + a a+(a×(b-1)) or (a×(b-1))+a
exponentiation ab  or  a^b  or  a↑b  or  ab a × a × ... × a a×(a(b-1)) or (a(b-1))×a
hyper4 a^^b  or  a↑↑b  or  ab a^(a^(...^a)) a^(a(b-1))
hyper4 ab ((a^a)^...)^a (a(b-1))^a

Note that for the last operator, there are two ways to interpret the absolute and inductive definitions, producing different hyper4 operators. In common practice, the first one is used because the other one can be reduced to a combination of two exponent operators: ab=aa(b-1), and thus it does not really count as a new operator.

The names tetration, superpower, superdegree and powerlog have also been used to refer to either the hyper4 or hyper4 operators.

Now, suppose you want to calculate 22.5 or pie. The above definition isn't too useful because the number after the has a fractional part. What we would need is a way to "extend" the hyper4 operator to real numbers. Unfortunately, this is tough to do in a way that meets the types of standards mathematicians generally want such things to have. I also know of no proof that such extension is impossible. A lot of people have worked on this over the years, and if you're interested, I suggest you check my notes here, and the Tetration FAQ.

Another common question about hyper4 is how to perform the inverse operations — the equivalent of the "hyper4 logarithm" and "hyper4 root". There is no good answer for either one, until the problem of extending hyper4 to the reals is solved. The "hyper4 root" can be evaluated for fixed integer "root" values using Newton's method. For example, to take the "2nd hyper4 root", use this algorithm:

(given: number X, we want to find R such that R2 = X. Note that R2 = RR.)
step   

action notes
1. R = ln(X) first approximation of answer
2. Y = RR calculate the function
3. dY = Y (1 + ln R) derivative with respect to R
4. new R = old R + (X-Y)/dY new approximation by Newton's method
5. go back to step 2 repeat until accurate enough

The hyperlogarithm is intuitively similar to the "class number" (see my description of classes above) along with a fraction indicating how far through the class we are. It is very similar to the level-index representation and to the internal format used by my hypercalc program. Here are some hyperlogarithm values (to base 10) using a definition from Trappman's Tetration FAQ15:

hyperlog(2) ≅ 0.39
hyperlog(100) ≅ 1.39
hyperlog(10100) ≅ 2.39
hyperlog(1010100) ≅ 3.39
...

Some people have also developed a hyper0 function. If you think about it, addition is a shortcut for counting, in much the same way multiplication is shortcut for addition. The following definition for a hyper0 function was developed by Constantin Rubtsov:

ab = a      (if b = -∞)
ab = b      (if a = -∞)
ab = a+2 = b+2      (if a = b)
ab = a+1      (if a > b)
ab = b+1      (if b > a)

This function, appropriately enough, is also the "successor" function used as the primitive computational element in algorithms defined in the Church theory of computation, which includes the original Ackermann function. For more on how this is done see my page on functional computation.

The Hyperfactorial and Superfactorial Operators

These are single-argument functions like the factorial but producing higher values.

N.J.A. Sloane and Simon Plouffe use hyperfactorial to refer to the integer values of the K-function, a function related to the Riemann Zeta function, the Gamma function, and others. It is

H(n) = nn (n-1)(n-1) ... 33 22 11

For example, H(3) = 27×4×1 = 108 and H(5) = 86400000. This function does not really grow much faster than the normal factorial function.

In 1995, Pickover defined the superfactorial n$ (think of the dollar sign as a factorial sign with an S for "super" drawn on top of it) as follows:

n$ = n!n!n!....n!

where there are n! repetitions of n! on the right hand side. Using the hyper4 operator, n$ is equivalent to:

n$ = n! n!

There are other ways to define a higher version of the factorial, such as this and this.

Bowers' Named Numbers

Jonathan Bowers, about whom we'll discuss a lot more further below, has invented many names for special numbers in this area. For example, in analogy to googol and googolplex he refers to 10100 as giggol and 10(10100) as giggolplex.

Higher hyper operators

Of course, the pattern of dyadic operators is easily continued:

operation representation absolute definition inductive definition
hyper5 a^^^b  or  ab a(a( ... a)) a(a(b-1))
hyper6 a^^^^b  or  ab a(a( ... a)) a(a(b-1))

and so on.

Bowers has several named numbers in this area, including trisept, 77; tridecal, 1010; and the aptly named boogol, 10(100)10.

The first triadic operator

Since the dyadic operators all fall into a pattern, it is logical to define a triadic operator that combines them all. A triadic operator is an operator that acts on three numbers, just as a dyadic operator acts on two numbers.

This new triadic operator is represented as a function with three arguments, and defined as follows:

hy(a,n,b) = { 1 + b for n = 0 { { a + b for n = 1 { { a * b for n = 2 { { a ^ b for n = 3 { { a ^ hy(a,4,b-1) for n = 4 { { hy(a,n-1,hy(a,n,b-1)) for n > 4 { { a for n > 1, b = 1

the following definition is equivalent:

hy(a,n,b) = { 1 + b for n = 0 { { a for n = 1, b = 0 { { a for n > 1, b = 1 { { hy(a,n-1,hy(a,n,b-1)) for n > 0

Bowers' Array Notation (3-element Subset)

At this point we return to the work of Jonathan Bowers to introduce his array notation. This notation is elegant, powerful, relatively easy to use and covers a greater range than any other discussed on these pages, within the limits of functional formal systems.

We will start by showing a very reduced version of the notation, which uses arrays of only 1, 2, or 3 elements. The rules for converting the notation into a number are:

1. For one- and two-element arrays, just add the elements. [a] = a and [a,b] = a+b
2. If rule 1 does not apply, and if there are any trailing 1's, remove them: [a,b,1] = [a,b] = a+b; [a,1,1] = [a].
3. If neither previous rule applies, and the 2nd entry is a 1, remove all but the first element: [a,1,n] = [a] = a.
4. There is no rule 4 (there will be when we get to bigger arrays).
5. Otherwise replace the array [a,b,n] with [a,[a,b-1,n],n-1], then go back and repeat the rules to expand it further.

With just a little effort you can see that these rules make [a,b,n] equivalent to hy(a,n,b) except for the special case of n=0. Compare the formula of rule 5:

[a,b,n] = [a,[a,b-1,n],n-1]

with the general case of the definition of the hyper function:

hy(a,n,b) = hy(a,n-1,hy(a,n,b-1))

They are the same except the order of the arguments is different. Bowers arranges the arguments in order of increasing "growth potential" — the operator has higher growth potential than b, so it goes last.

So, all 3-element Bowers arrays are equivalent to the normal hyper operators. [3,2,2] = 32 = 3×2 = 6; [3,2,3] = 32 = 32, [4,5,6] = 45, etc.

hyper operator variant: Knuth's Up-arrow Notation

The use of two or more up-arrows (as in "a ^^ b" and "a ^^^ b" above) resembles a notation defined by Donald Knuth17 in 1976, and is equivalent to the hyper operator. Knuth used real arrows, written like this: a↑↑b and a↑↑↑b instead of a^^b or a^^^b.

a ↑↑ b = hy(a,4,b)
a ↑↑↑ b = hy(a,5,b)
a ↑↑↑↑ b = hy(a,6,b)
(etc.)

using the hy() function allows for a more compact representation of really large numbers that would otherwise take a lot of arrows. For example, hy(10,20,256) is equivalent to

10 ↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑ 256

and hy(256,625,4096) would be very unwieldy. Bigger numbers like hy(256,4294967296,256) can't be written at all.

This up-arrow notation is used in defining the Ackermann numbers

A(n) = n ↑↑↑...↑↑↑ N (with n up-arrows) = hy(n, n+2, n)

which are related to the Ackermann function described below.

Proof Becomes Difficult

At this point we begin to encounter functions and definitions that are difficult to compare to one another, either because they are not very thoroughly worked out, or because it takes too much work to actually prove which ones grow more quickly.

Gödel Numbers

The Gödel number of G, Gödel's undecidable sentence, is probably around here somewhere (its value depends highly on what operators, functions, etc. are available to construct primitive-recursive statements in the formalized number theory system that the Gödel technique is applied to)

Goodstein sequences

Almost certainly higher than this (but who can say?) are numbers related to the Goodstein sequence.

(For more detailed descriptions, see the Wiki entry and this page by Justin Miller)

G(n) = highest value of the sequence constructed as in Goodstein's Theorem from starting value of n.

The hereditary base-b representations are analagous to ordinal infinities and subtracting one works like the successor function on the ordinal infinities — except that "w" is always finite no matter how big it gets, so every individual "w" term is guaranteed to eventually become bound to a finite value and therefore will eventually go down to zero.

The convergence of a series with no second-level exponents is easy to see:

v2 + 2 v2 + 1 v2 v2 - 1 = (a-1) × v + (a-1) where a = value of v at this step ... (a-1) × v (a-2) × v + (b-1) where b = value of v at this step ... (a-2) × v (a-3) × v + (c-1) where c = value of v at this step ... 2 v v + (d-1) where d = value of v at this step ... v e - 1 where e = value of v at this step

When higher-level exponents are involved, the series will get longer each time a higher-level exponent has to be decremented. Each time the series will become enormously longer, but will still be of finite length. Therefore, the same principle applies.

For example, let's start with 27 (in base 2, 110112). We get:

110112
110113 = 11210. 110103 = 11110
110104 = 32410. 110034 = 32310
110035 = 75310. 110025 = 75210
110026 = 151410. 110016 = 151310
110017 = 274510.
110008 = 460810. 107778 = 460710
107779 = 719810.
1077610
1077511
1077412
1077313
1077214
1077115
1077016 = 6744010. 1076F16 = 6743910
1076F17
...

lg(2n) ≅ hy(2, 2+n, 2n)

Consider the lower Goodstein sequence, and look at just one of the exponents in that sequence, and call it "c". As we have already shown, "c" will eventually get decreased to a lower number. Call that number "d" (which is c minus one). At this point the iteration continues for an even longer time with no change to "d" or any of the other "exponents", but eventually as before, the lowest exponent will have to get dinimished again. So in this way we see that each exponent will eventually get replaced with a lower one. Each step takes massively longer than the previous step, but all steps are still of a finite length (not an infinite length) so eventually even the highest exponent will get decreased.

Other Triadic Operators

A common trick that clearly generates faster-growing functions involves defining functions that take more than two arguments. We have seen how the hyper operator, our first triadic operator, easily covers everything all the dyadic operators can handle. This trend continues. Of course, all operators can be referred to as functions, and the dyadic operators are actually functions with two arguments.


First page . . . Back to page 2 . . . Forward to page 4 . . . Last page (page 7)



If you like this you might also enjoy my numbers page.

Robert Munafo's home pages on HostMDS   (c) 1996-2010 Robert P. Munafo.   about   contact

This work is licensed under a Creative Commons Attribution 2.5 License. Details here s.13