mdbtxt1
mdbtxt2
Proceed to Safety

Large Numbers    


First page . . . Back to page 2 . . . Forward to page 4 . . . Last page (page 11)


Inventing New Operators and Functions

The concept of the "classes" described so far does quite well at handling everything that can be done with exponents, which are the most powerful operator known to most people. To proceed further we begin to invent new operators. This practice of inventing new operators continues over and over again as you go to higher and higher large numbers. The new operators overcome the limits of the old operators, limits that are reached as the old notation becomes unwieldy.

For example, class-1 numbers are written in traditional place-value notation, which is essentially abbreviated addition and multiplication. For example:

3158 = ((3 × 10 + 1) × 10 + 5) × 10 + 8

Although we don't normally think of it that way, the place-value notation avoids the unwieldy use of lots of symbols.

When expressing larger numbers, like Avogadro's number and googol, one usually uses exponents and power towers, as discussed above:

6.02 × 1023, 10100, 1010100, 27256312546656, etc.

but after a while that becomes unwieldy too. Eventually there are so many exponents that it cannot be written on a page. Then it becomes a good idea to invent a new new shorthand, which amounts to defining a new operator.

Function Hierarchies

Before going on to actual invented operators and functions, it is useful to consider how functions can be meaningfully put into an order.

In some branches of mathematics and computer science one often encounters rates of growth that can be expressed or approximated by a formula — for example, the amount of time to sort a list of N items by the bubble sort method is proportional to N2. It is common to express such a rate of growth in terms of its asymptotic behaviour, using of one of the Bachmann-Landau notations such as Big O notation. Usually these are used to put functions into growth categories, such as linear O(x), log-linear O(x log(x)), quadratic O(x2), exponential O(2x), self-exponential O(xx), and so on. Each category includes functions that grow fast enough that they "eventually" exceed any function in any lower category.

The concept of "eventually dominates" is often used, particularly in the types of function hierarchies that are of interest to us here. By the most common usage, eventual dominace is defined like this:

g() eventually dominates f() if and only if there is some m such that, for any x>m, g(x) > f(x).

By this definition, g(x) = 2x is said to eventually dominate f(x) = x+10, because for large enough arguments (in this case for x>10) g(x) is larger than f(x).

However, the above definition also means that g(x) = x+1 is said to eventually dominate f(x) = x. For many purposes this is a meaningless distinction, and various solutions are used. "Sbiis Saiban"40 defines a variant called "dominance over translations" which is similar to the above, except that there is the further requirement that you if add a constant k to the argument of f(), the dominace will happen no matter the value of k. This ensures that x+1 is not considered to dominate over x.

Let's stick with the common definition of eventual dominance, but for now just ignore any variants that differ only by adding a constant, and let's put some functions into an ordering.

The first set of functions can be linear multiples of x:

f1 f2 f3 f4 f5 . . . fk
x 2 x 3 x 4 x 5 x . . . k x

Each of these grows more quickly than the one before it. Note that I have given each function a name by using "f" with a subscript. There is a function for each positive integer k.

A faster-growing set of functions is the "constant powers" in which a constant base is raised to the power of a varying value of x. These functions are commonly said to have an "exponential growth rate". When given integer arguments they include the powers of 2, powers of 3, etc.):

g1 g2 g3 g4 g5 . . . gk
1x 2x 3x 4x 5x . . . kx

We'll define more later, but let's try to see how all functions can be put into a single ordering. Immediately we encounter an issue.

Why Function Hierarchies Require a Transfinite Ordinal Index

As you can see, in the previous section we defined two sets of functions, fk() and gk() for positive integer k, and k can be as large as one wants. To prove useful results about large numbers we want to have all functions be in a single group, say fn() for some element n.

But there are two problems:

The first problem is easily addressed, if we simply leave out g1(x) = 1x = 1 and instead start with g2().

However the second problem is a bit harder.

We could simply limit the first group to a finite size, say 100 linear functions going from f1(x) = x to f100(x) = 100x. Then add the g() functions with an index starting at 101, or maybe 102 since we're also dropping g1().

However mathematicians never like to do that. Instead, they solve the problem using transfinite ordinals. This is a topic that gets covered far later in this article, but for now we'll just use the Greek letter "omega" ω to represent "infinity", and use the rule that we are allowed to use numbers like ω+1 to mean "infinity plus one".

By using transfinite ordinals, the previous two groups of functions can be combined into one:

The Munafo Function Hierarchy (first attempt)

f1 f2 f3 f4 f5 . . . fk
x 2 x 3 x 4 x 5 x . . . k x
fω+1fω+2fω+3 fω+4fω+5 . . . fω+k
   2x 3x 4x 5x . . . kx

Notice we also dealt with the g1(x) = 1 problem by not using the index ω+1.

Why There are Competing Function Hierarchies

Using Cantor ordinals for the index allows us to get quite far in our analysis of definitions of large numbers. However, there is another problem, which recurs a lot in this sort of work: the example above skips some functions.

Suppose we want to add the constant powers (commonly called "polynomial growth rates") starting with quadratic, cubic, etc.:

h1 h2 h3 h4 h5 . . . hk
x x2 x3 x4 x5 . . . xk

Most of these fit between the two rows of functions in the earlier table. As before there is a function that does not, this time it is h1(x) = x. As before we can leave it out. The rest clearly grow faster than the linear functions, in the sense that they "eventually dominate". For example, h3(x) = x3 is larger than f100(x) = 100x whenever x>10. Also, the first of the exponential functions 2x grows faster than all of these; for example 2x exceeds x3 when x is greater than about 9.94.

However, if we want to include these in our table of functions named by fn() with a subscript n, we cannot simply insert this row into the table as shown above, because there are no avaiable Cantor ordinals between the integer-indexed ones fk() and the ones with ω in the index fω+k(). In order to incorporate these polynomial functions, we would need to do something with the indices. Here is one possible solution:

The Munafo Function Hierarchy (second attempt)

f1 f2 f3 f4 f5 . . . fk
x 2 x 3 x 4 x 5 x . . . k x
fω+1fω+2fω+3 fω+4fω+5 . . . fω+k
x2 x3 x4 x5 . . . xk
fω2+1fω2+2fω2+3 fω2+4fω2+5 . . . fω2+k
   2x 3x 4x 5x . . . kx

Here we have renumbered the exponential functions starting at ω2. (This is shorthand for ω+ω, and is not written "2ω" because the finite part should be after the ω, just as with "ω+2" and "ω2"). This makes enough room to insert the infinite class of polynomial functions, for which we re-use the indices that start with ω.

But, as you can probably guess, the problem of incompleteness has not gone away. More rows of functions can be inserted:

hk(x) = k ln(x)

   ("k x" row is here)

hk(x) = k x + ln(x)

hk(x) = k x ln(x)

   ("xk" row is here)

hk(x) = xk + x

hk(x) = k x xk = k xk+1

   ("kx" row is here)

hk(x) = k2x = (kx)2

It is also possible to insert entire infinite classes of functions between two members of the same row. This one shows what you probably already realised when the notion of polynomial was first mentioned: there are infinite sets of polynomial functions between the polynomials we already considered:

hk(x) = x3 + k x + 7
      (goes between x3 and x4)

There are also (or course) entire hierarchies of polynomials between each member of the exponential functions:

h(j,k)(x) = 2x + j x3 + k x
      (all between 2x and 3x)

These functions might not be as commonly used or as interesting, but they illustrate a principle that applies to hierarchies of functions in general: In order for a hierarchy to be useful, you have to make some precise definitions regarding what functions will be included, and what indices each function will get; and (perhaps less satisfying) you have to accept the fact that any hierarchy will leave out a lot.

This last fact, that there will always be functions left out, leads to the invariant fact that in some cases, it will be difficult to definitively say which of two large number definitions is larger. Often, the best we can do is to say something like "They both lie between fω+1(63) and fω+1(64)".

These function hierarchies will be important later, (you may skip to the fast growing hierarchy section if you wish), but for now let's return to the well-known popular approaches to expressing large numbers. New symbols adding to the familiar notation can get us pretty far.

Beyond Exponents: the hyper4 Operator

(most commonly called "tetration")

(my early name for this was "powerlog")

The first new operators used by those seeking large numbers are usually higher dyadic operators. A dyadic operator is one that has two arguments — two numbers that it acts on. Usually in notation the operator is placed between the two numbers.

The most common higher dyadic operators follow the pattern set by the well-known three (addition, multiplication and exponentiation). These operators come up a lot in the definitions of large numbers that are to follow.

Following an obvious pattern in the three common operators, the new operator can be defined as shown here:

operationrepresentation absolute definition inductive definition
addition a + b   or   ab - successor (a + (b-1)) or successor ((a-1) + b)
multiplication a×b   or   a*b   or   a.b   or   ab a + a + ... + a a+(a×(b-1)) or (a×(b-1))+a
exponentiation ab   or   a^b   or   a↑b   or   ab a × a × ... × a a×(a(b-1)) or (a(b-1))×a
hyper4 a^^b   or   a↑↑b   or   ab a^(a^(...^a)) a^(a(b-1))
hyper4 ab ((a^a)^...)^a (a(b-1))^a

Note that for the last operator, there are two ways to interpret the absolute and inductive definitions, producing different hyper4 operators. In common practice, the first one is used because the other one can be reduced to a combination of two exponent operators: ab=aa(b-1), and thus it does not really count as a new operator.

The names tetration, superpower, and superdegree have also been used to refer to the hyper4 operator. (As a child I used the somewhat misleading name powerlog for hyper4, as in 2 powerlog 5 is 65536.)

Extension to reals : Now, suppose you want to calculate 22.5 or pie. The above definition isn't too useful because the number after the has a fractional part. What we would need is a way to "extend" the hyper4 operator to real numbers. Unfortunately, this is tough to do in a way that meets the types of standards mathematicians generally want such things to have. I also know of no proof that such extension is impossible. A lot of people have worked on this over the years, and if you're interested, I suggest you check my notes here, and the Tetration FAQ.

A "logarithm" for hyper4 : Another common question about hyper4 is how to perform the inverse operations — the equivalent of the "hyper4 logarithm" and "hyper4 root". There is no good answer for either one, until the problem of extending hyper4 to the reals is solved. The "hyper4 root" can be evaluated for fixed integer "root" values using Newton's method. For example, to take the "2nd hyper4 root", use this algorithm:

(given: number X, we want to find R such that R2 = X. Note that R2 = RR.)

step      action notes
1. R = ln(X) first approximation of answer
2. Y = RR calculate the function
3. dY = Y (1 + ln R) derivative with respect to R
4. new R = old R + (X-Y)/dY new approximation by Newton's method
5. go back to step 2 repeat until accurate enough

The hyperlogarithm is intuitively similar to the "class number" (see my description of classes above) along with a fraction indicating how far through the class we are. It is very similar to the level-index representation and to the internal format used by my hypercalc program. Here are some hyperlogarithm values (to base 10) using a definition from Trappman's Tetration FAQ15:

hyperlog(2) ≈ 0.39
hyperlog(100) ≈ 1.39
hyperlog(10100) ≈ 2.39
hyperlog(1010100) ≈ 3.39
...

Another function conceptually similar to this is the inverse Ackermann function:

α(*x) = greatest n such that a1(n) < x

where a1(x) is as defined in the Ackermann section below. This inverse function grows more slowly, and is not an inverse of hyper4, but rather for the generalised hyper function.

The function "below" addition : Some people have also developed a hyper0 function. If you think about it, addition is a shortcut for counting, in much the same way multiplication is shortcut for addition. The following definition for a hyper0 function was developed by Constantin Rubtsov:

ab = a          (if b = -∞)
ab = b          (if a = -∞)
ab = a+2 = b+2          (if a = b)
ab = a+1          (if a > b)
ab = b+1          (if b > a)

This function, appropriately enough, is also the "successor" function used as the primitive computational element in algorithms defined in the Church theory of computation, which includes the original Ackermann function. For more on how this is done see my page on functional computation.


First page . . . Back to page 2 . . . Forward to page 4 . . . Last page (page 11)



Japanese readers should see: 巨大数論 (from @kyodaisuu on Twitter)

If you like this you might also enjoy my numbers page.


Robert Munafo's home pages on AWS    © 1996-2024 Robert P. Munafo.    about    contact
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. Details here.
s.30