/* ries.c
RIES -- Find Algebraic Equations, Given Their Solution
Copyright (C) 2000-2012 Robert P. Munafo
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
If you got ries.c from the website www.mrob.com, the
GNU General Public License may be retrieved from the following URL:
http://www.mrob.com/ries/COPYING.txt
The remainder of this large header comment is broken up into sections
titled: HOW TO BUILD, UNFINISHED WORK, DETAILED NOTES ON ALGORITHM,
REVISION HISTORY.
HOW TO BUILD:
1. Boot your favorite UNIX-compatible computer (Step 1a: install Linux
if you don't already have it on your computer. Linux rules!)
2. Make a new directory, put this file there.
3. Compile it with the following command:
gcc -o ries ries.c -lm
Try other flags for optimization if you wish (like: -m64, -O2 or -O3,
-ffast-math, etc.)
If it reports errors like "undefined reference to `pow'", try moving the
pieces around: "gcc ries.c -lm -o ries". The command "gcc -o ries -lm ries.c"
is known to NOT work with recent versions of GCC.
4. Run ries and give it a number, for example:
ries 2.5063
INSTRUCTIONS
Instructions are in the manual, included here in nroff format (ries.1)
To use the manual, copy it to the proper place:
cp ries.1 /usr/share/man/man1
(substitute appropriate local manpage directory for your OS) then type
"man ries".
UNFINISHED WORK
Presently, "ries 0.9674026381747 --eval-expression xp/T" shows a
different answer on different computers. For example I get
1.4511039572620501 on MP16 and 1.4511039572620503 on MBP. This causes
commands, such as "ries 0.9674026381747 -l3", to produce different
results ("x/tanpi(x/pi) = 2/3" on MP16 and the bizarre "x/tanpi(x/pi)
= log_(sqrt(8))(2)" on MBP).
To fix this I probably need to bring in my own tangent function.
Add canonval, decanon, and cv_simplify routines.
canonval adds 'n', 'r', and/or '2/', '4/' to try to put an
expression's value into the range [1, 2). Call it just before calling
bt_insert. It should work on the existing expression structure and
metastack state, and it needs to stop if MAX_ELEN is reached. We also
need a decanon routine to undo its work before proceeding with the
rest of ge_2.
cv_simplify operates on two expressions (an equation), removing any
unnecessary symbols that may have been added by canonval. It takes two
expressions A and B and removes any trailing '2/', '4/', 'r' and 'n',
or possibly shifts symbols from one expression to the other, all so as
to make the equation A=B lower-complexity. Call this in check_match
right after the initial test passes and before running newton() on the
equation. This will most probably involve having two dedicated
expression data structures so it (and newton(), etc.) will see our
simplified equation rather than the two expressions that are actually
in the database.
The purpose of this is to conflate values like -2, -1, 1/2, 1, 2 into
a single tree entry, which should make more effective use of memory,
at the expense of the output being a little less well-ordered by
complexity-score.
Add -Fmax and -FMma output formats.
One or more custom defined constants.
To support this and a future defined-functions feature, each
user-defined thing needs to have a symbol auto-generated for it out of
the set of as-yet-unused byte values. In addition to mapping functions
to turn names into symbols, and additional 256-element arrays in the
symbol table, this also requires a new low-level output formatter for
-F0 and some of the debug_xxx output.
This formatter should turn non-standard symbols into '(Nm)' where
"Nm" is an abbreviated version of the user's supplied name. This
symbol can also be used in the -O, -N, and -S options, e.g. '-Op(Eg)'
would allow pi and the user-defined symbol 'Eg' to be used once each.
Constants and (eventually) functions can share a common FORTH-like
syntax:
--define : EulerGamma # seft-a (constant)
( -- The Euler-Mascheroni constant, 0.57721... )
# 50 digits for when RIES goes to higher precision
0.57721566490153286060651209008240243104215933593992
;
--define : Hypot # seft-c (two-argument function)
( a b -- sqrt(a^2+b^2) )
dup* # ( a b^2 )
swap # ( b^2 a )
dup* + sqrt
;
In the first implementation of functions, you cannot include literal
constants in the function, but instead have to --define the constant
first with its own name.
Consolidate the (current) multiple ways that roundoff, overflow, loss
of significance, and tautology errors are handled. This involves the
constants and variables: n_ovr, p_ovr, k_sin_clip, k_min_best_match,
k_vanished_dx, k_prune_deriv, and k_sig_loss. For example,
k_sig_loss*k_min_best_match should be equal to the magnitude of the
"machine epsilon" (2^-53 in the case of 64-bit double)
More importantly, we need to allow matches between an LHS and
another LHS (producing solutions with X on both sides of the
equation). The match closeness becomes |(val_r - val_l)/(deriv_l -
deriv_r)| which is the same as the current formula except for the
addition of the deriv_r term.
* Start at the bottom (exec(), eval(), newton(), etc.) and work my
way up to the top (ge_1() etc.). Rename variables and parameters (like
the "!on_rhs" passed to exec() from ge_2) so the names agree with what
they actually do ("!on_rhs" should be something like "!no_x" or
"has_x").
* Allow more user control over how much output contains x on both
sides, from old behaviour to "all eqns have x on both sides". The
default should be somewhere in the middle; this should be implemented
in the main outer loop where it tests "if (lhs_insert > rhs_insert)".
Also, the '-Ox' option should force the old behaviour.
Higher precision:
* Replace existing calculations (in newton(), exec(), etc.) with
macros that have an assembly-style syntax
* Add quad printout capability for e.g. debug_s
* Start working quad calculations into each case of exec().
Initialize the special constants (k_vanished_dx, k_prune_deriv, etc.)
differently depending on precision option
* Get eval() to to calculations in quad precision
* Get newton() to use quad
Consider options for utilizing multiple processors (threads / parallel
implementation). The following does *not* work because of the
elaborate interleaved nature of the bt_insert algorithm (read below
where it talks about "memory performance degradation"), however it
could if we return to the older implementation.
- Have 2 scans running at any one time, an RHS scan and a LHS scan.
Whenever a scan completes, initiate another. Every time a scan
completes, another thread runs through the outputs of the latest RHS
and LHS.
Another idea that is even more constrained by memory: Have N threads
running at any one time, each constraining itself to a part of the
expression search-space, distinguished by the first few symbols in the
expression. For example, after the complexity depth has gotten high
enough, we will have identified all possible combinations for the
first 3 symbols in the expression's postfix representation. The
searchspace can then be partitioned by having ge_2 perform a CRC
hashfunction on the first 3 symbols, and determine whether it should
prune or recurse depending on the low log_2(N) bits of the hash value.
This requires having each task build up its own binary tree, which
in turn requires a parallel binary tree merge:
* Add population-counts to each tree node and add a bt_index()
routine;
* When it is time to merge, each of the 4 trees becomes read-only;
* Each of 4 threads then builds up a new tree consisting of the I'th
quartiles of each of the old trees;
* Combine these 4 new trees into a single big tree by creating 2 dummy
parent nodes and 1 dummy grandfather node;
* Deallocate the old trees.
The main problem with this approach is that it uses twice as much
memory during the merge process, and 4 times as much memory bandwidth;
and (more crucially) there is no clear way to determine whether there
will be enough free memory to perform the merge.
Consider ways to perform additional levels of searching *without*
expanding the database. This would be done when the memory limit is
reached (perhaps in response to a user option):
- Walk the tree, applying all possible monadic transforms to each
node, to see if the result then produces a new match to another
existing node. For these purposes a "monadic transform" consists of
appending *either* a single seft-b symbol, *or* a seft-a followed by a
seft-c. For example, if this scan found [43L] in the tree, it would
append 's' to form the expression [43Ls], compute its value (which is
1.592289...) then search the tree for this value using a bt_find()
function. It might then find [xp/], which would be a near-match in the
case of x=5. This type of search can be efficiently parallelized by
having each of N threads traverse 1/N part of the tree. (Which in turn
suggests that we should maintain population-counts at each node and
add a bt_index() routine.)
- Greater-complexity expressions can be synthesized by appending a
short, low-complexity complete subexpression and a seft-c symbol.
For example, [43L] : [3q] : [+] => [43L3q+], which would then match
[x2-].
%%% options to add:
Higher complexity scores for trig functions unless argument
expression contains pi or x.
Warren Smith suggests, "I dont mind sin(1), but sin(1) IN AN EXPONENT
like 3^sin(1) is just ridiculous.", suggesting an idea like tables of
rules that match the end of a forth subexpression (1S^ in this
example) and if matched, either prune it outright or add to the
complexity score. (If adding to the complexity score, the possibility
of this has to be considered by the bounds-setting and short-circuit
recursion code). After further discussion he suggested that no
subexpression should be completely excluded, but an "entropy function"
should be used to weight entire expressions based on the likelihood
they would be found in actual maths. Things like 1S^ would get a
really high weight, and the expression database would be
"exponentially" more efficient at covering the types of expressions he
is interested in.
More -Fx output formats, e.g. raw symbols, calculator-keys,
HTML, RHTF, TeX, eqn, Maxima, etc. Use keywords rather than numbers
to identify the formats (e.g. "-FMma")
-W to view weights (same as -S) or define weights individually.
--include could then be used to select weight presets (e.g. a set of
weights for electrical engineering)
--log-base option to do for ln what --trig-argument-scale does for
sine and cosine.
variations on -l that allow specifying limit of search by precision,
by time, by memory usage, or by number of equations tested. The
present -l is usually correlated with all of these but is never
equal to any of them. ('-lmem=10M', '-ltime=20m', '-ldigits=8', etc.)
%%% some option to "solve for X", to the greatest extent possible.
(This will imply or require the "X only on LHS" mode). Also
requires a lot of new things: A new routine to split the
arguments of type-c operators; a special handling of sym_allowed (to
provide for the fact that, when solving, some operators become their
inverses); re-evaluate the match during and/or after solving to make
sure it is still a solution (consider -1.4142 -> "x^2 = 2" -> "x =
sqrt(2)" -> "1.4142": oops!); re-evaluate sym_allowed compliance after
solving (to support -Osss option)
%%% command-line option: if no symbol '1', then 'r' shouldn't be
allowed. Likewise for '2' + '^' and 's'; '-' and 'n'; 'e' + '^' and
'E'; others? Which should be the default, the current simpler behavior or the
"more correct" behavior? The "simple" behavior is useful, as
illustrated by the 1.4142135 example in the manpage, because it
helps people find alternate ways to express the same solution.
*/ /*
%%% macros:
- use a separate symbolset namespace (implies macros cannot call
themselves or each other, which avoids lots of problems)
- To use -O with a macro, include '_' in the -O symbolset
list; all symbols after '_' are macro names. -S assumes all macros
(why would you define a macro and not want to use it?); using -N
with a macro is a contradiction.
- Macro can use any symbols, and symbols used within macro
expansions don't count against their sym_allowed quotas (this is
probably easy; just need to make sure)
- Need to implement stack-overflow detection in the metastack
routines, both for the stack and for the undo lists.
- Test evaluator routine needs to make sure macros fit one of the
three allowed types ('a', 'b' or 'c') and that only types 'b' and 'c'
dip into the stack's current contents.
- Type 'a' can be "precompiled" since they amount to custom
constants
- Should I allow immediate operands (like 0.577...) to facilitate
defining constants that can't be computed any other way? How about
special operators, like 'exch'?
- execution can probably be accomplished by having eval() call
itself recursively. It needs to handle its own error returns.
%%% with many -i test cases the RHS expression totals are a lot bigger
than the LHS totals (while the insert totals are equal). There are
probably cases that come out the other way around, too. In cases like
this, the program could probably find more matches more quickly by
shifting the balance between LHS/RHS to favor the side that is
generating and inserting more expressions per unit time. (The present
implementation tests "lhs_insert > rhs_insert" which is usually the
best way to do it.) I could model this with formulas that take into
account the total amount of dead-ends, expressions, and inserts on
both sides, compared as a ratio to the equation total (and take the
1st derivative to figure out what side of the maximum you're on). The
idea is to maximize the number of full equations found per unit time
rather than keeping the number of LHS inserts equal to the number of
RHS inserts.
%%% Test cases to test periodically:
ries 0.08600409182 -l1 should give "3 pi x = 8/pi^2" or equivalent
ries -l1 0.434294481903252 should give "e^(1/x) = 1+9"
or "x = 1/ln(1+9)"
ries 2000.0222 '-Sn1/r+87^s' used to give bogus roundoff match
ries 27 -Sx3qL6 used to give bogus roundoff match
ries '-S14+*-/n' 27 generates a small number of equations
ries '-S4+*-/n' 27 used to generate *more* than the previous example
ries 27 '-S4+*-/' -Ox -ie Should give [x4+4*4+] = [4444+**]
which is "4(x+4)+4 = 4*4(4+4)"
ries -Nl+ 2.5063 Stephen's number: should give the three
answers "2 x = 5", "x^2 = 2 pi", "x^x = 10"
ries -3.14159 should give "-x = pi"
ries -23 ln(-x) = pi, followed by an exact match
ries -Slvsnex 1.017285 used to give bogus exact match
ries -i 14327652 -l4 used to give erroneous [xsxL7s^n] = [27s^n]
ries -i 7823310 -l4 used to give erroneous [x35s^*x/] = [35s^]
ries 2.505317647 -l1 gives [x5l-] = [6ql] (which is correct)
with error 4.8e-11. Make sure "almost-zero"
checks don't clobber this.
ries '-S4+*-/' 17 should give an answer with only 4 or 5 4's,
such as [x44/-] = [44*]
ries 17.649955654152575 log_2(x) = 1+pi
ries -l2 98.5913852852497 log_(2+phi)(x) = 3/7+pi
*/ /*
ARCHITECTURE
main -- Outer loop -- increment complexity and decide whether to add to
RHS tree or LHS tree
gen_forms -- setup first recursive level of gf_1
gf_1 -- Add a symbol of type 'a', 'b' or 'c' and recurse if complexity
allows
gf_1 -- Recursive levels until form is complete
ge_1 -- setup metastack and initial level of ge_2
ge_2 -- Generate one step of FORTH code and update metastack,
recurse if more symbols in the form
ge_2 -- Recursive levels until form is complete
canonval -- Try to put expression value in [1.0,2.0)
bt_insert -- Insert calculated result into LHS or RHS tree
report_match -- Display match without inserting (if exact
match)
check_sides -- Find closest neighbor of the opposite sidedness
and determine if there if the pair sets a new
record.
check_match -- Check a single pair to see if it's a new
solution
cv_simplify -- Simplify equation by removing e.g. "2*"
from both sides
newton -- Use Newton's Method to locate ideal value of
x for a given solution
report_match -- Display a match that converged by
Newton.
DETAILED NOTES ON ALGORITHM
To reduce the search time, the graph theory "bidirectional search"
(also called "meet-in-the-middle") technique is used. Rather than
attempting to search for solutions of the form
X = expression (1)
it searches for solutions of the form
f(X) = expression (2)
where f(X) represents an expression containing X. Each side of
equation (2) is represented as a set (in memory, a ordered list,
implemented with a binary tree) of expressions and their values. Thus
there are two lists {For efficiency, these two lists are stored in a
single binary tree, so that I can check for a match after every new
insert. With a single tree, matching items end up being close to each
other in the tree}. Each list is kept in order by numerical value.
Then, the lists can easily be scanned to locate any matches. A match
consists of an element in the left-hand list whose value is close to
that of an element of the right-hand list. Keeping the lists in order
also allows duplicate expressions, like "2 pi" and "pi * 2", to be
ignored, because they will end up having the same value and therefore
end up being in the same spot in the list..
It is important to look for "simpler" matches before going to more
complex ones. Thus, expressions have a complexity score, computed by
assigning a certain number of points to each of the symbols in the
exression. Matches are checked in order of increasing total complexity
(which is the complexity of the left-hand side plus that of the
right-hand side)
In order to avoid the sorting problems necessary to search the
expressions in a perfectly increasing complexity order, complexity
scores are lumped into discrete finite quanta, which are actually
implemented by using small integers for each component of a complexity
score, rather than real numbers.
This does not actually cause everything to be searched in order, even
modulo the quantization errors. Some equations can be represented in
an unbalanced form with lower aggregate complexity than the
least-complex balanced form: an example is the cube root of 2: The
solutions are "x^3 = 2" and "x = 3,/2", both are unbalanced (",/" is
the "Nth root" symbol). There are no good balanced solutions. In such
cases the unbalanced, low-complexity form will show up later in the
search than it "should".
*/ /*
To make generation and evaluation easy, expressions are represented in
a FORTH-like syntax with one character per symbol. Thus, "11+" is 1+1,
"2q" is sqrt(2), "ep^" is e^pi, etc. Symbols are categorized by their
stack-effect, which is abbreviated "seft" in the code.
In the actual FORTH language, the stack effect (seft) encapsulates
what a word does to the stack. It is described by a comment (some text in
parentheses) containing: the stack contents before the word is executed,
a "--" symbol, and the stack contents after the word executes.
There are three sefts in ries:
seft a: ( -- K )
Adds one thing (a constant) to the top of the stack
seft b: ( arg -- result )
Takes one value from the stack, performs an operation, and puts the
result on the stack
seft c: ( arg1 arg2 -- result )
Takes two values from the stack, performs an operation, and puts the
result on the stack
The RIES symbols of each seft are:
a: x 1 2 p 3 e 4 5 f 6 7 8 9 (x, the digits, and the constants pi, e and phi)
b: r s q l n S C T (reciprocal, squared, square root, ln, negate, sine,
cosine, tangent)
c: + - * / ^ v L (add, subtract, multiply, divide, exponent, root, logarithm)
For ease in debugging, most symbols have letters that make sense, but
not always (e.g. "f" for phi, the golden ratio; "v" for root, which
is meant to represent the v-shaped part of the standard root sign).
The only constants are the digits 1 through 9 -- no zero, and no
multidigits or decimal fractions. To get 10 you have to do "25*" or
something similar; for 1.5 you have to do "32/". A fair amount of
effort has been put into setting the symbol scores such that the score
of "25*" is only a little higher than the score for "9".
example cases for deriving the symbol weights:
(+) ~= (*) (because both occur equally often)
(-) > (+), but not by much
1, 2, 3, 4, 5, ... degrade gracefully and kind of like a slide rule
(14*) = (22*), etc. (implies (n) proportional to log(n) for 1<=n<=9)
(33*) = (9) (implies (*) = [..])
(25*) > (9), but only by a little (no problems so far)
[2q] ~= [5] :: therefore (2q) + [.] ~= (5)
[x2v] = [xq] :: (2v) + [.] = (q)
[x2^] = [xs] :: (2^) + [.] = (s)
[99+] can be >> [55*] (because smaller numbers are more likely)
%%% complexity score for a symbol can be context sensitive, e.g. 'l'
takes a higher score the second time it is used in an equation -- so
long as its range of possible scores is within the total range for its
seft. This might be a good way to eliminate some of the nonintuitive
aspects of the current system.
Once a set of symbol scores is worked out, it can be adjusted by
adding any constant to all the values (this adds a bias for long
expressions, or a bias against long expressions. In the above
"normalized" examples there is no bias, but the shorter ones will get
generated first anyway.)
The lists start out small and grow as the program searches. On the
first pass, the left-hand list consists simply of {X}, the single-
element expression for the search value, and the right-hand list
consists only of a few common constants, for example {1, 2, *e*,
*pi*}. After a few passes the left-hand list will include things like
"xv" (sqrt(x)) and "x1+" (x+1), and the right-hand side will have
similar forms such as "2v" (sqrt(2)) and "23/" (2/3). At that point,
if X is 4/9 a match will be found between "xv" and "23/", even though
the combination of "x" and "49/" has fewer symbols overall, because
the former matching is more evenly balanced. Another way of saying
this is that, since the complexity scores of "xv" and "23/" are both
lower than the score of "49/", the "xv = 23/" matching is found before
"49/" even gets a chance to be generated.
To minimize time spent generating nonsense expressions, expressions
are generated from "forms". A "form" is a symbolic expression of
expression syntax, like "aabc" for "12q+". Each letter represents a
different type of stack operation. Type "a" pushes one item, type "b"
leaves the stack with the same number of items, and "c" leaves the
stack with one less item. A form is legal if the stack depth (the "a"
count minus the "c" count) is > 0 at all times and = 1 at the end.
Thus all forms start with an "a". The number of forms for
N={1,2,3,...,8} is {1,1,2,4,9,21,51,127} (the Motzkin numbers;
Sloane's A001006). By comparison, the total number of forms without
the stack restrictions would be 3^(N-1): {1,3,9,27,81,242,729,2401}.
The savings is considerable. In the left-hand list the first symbol
will be an "x" in all generated expressions. Forms are generated on
the fly and never stored in memory. As the search proceeds, longer
forms are generated as needed.
For each form there is a "minimum" expression and a "maximum"
expression, as rated by complexity score. For example, "x1+" is the
minimum expression for form "aac" and "99L" (log base 9 of 9) is the
maximum. These minimum and maximum expressions are easy to find,
because each symbol has its own score and the expression score is just
the sum of the symbol score. Thus it is easy to determine, at the
beginning of each pass, which forms can generate expressions that are
within the current complexity range.
Expressions are generated from each valid form. The generation of
expressions uses a recursive backtracking algorithm, starting with the
first symbol and moving to the right. At each step, it checks to see
of the symbols we have so far still allow an expression which falls
within the complexity range -- if not, the latest symbol is dropped or
changed. It also avoids generating certain patterns of symbols which
would be of no computational value, or which are always equivalent to
a different (sometimes shorter) set of symbols. Examples of this
optimization are:
- "aa-" for any seft a symbol would be 0, so is not generated.
- "aa+" is the same as "a2*", so is not generated.
- "aa*" is the same as "as", so is not generated.
- "aa/" and "aal" are equivalent to 1, and are not generated.
- Almost any operator after "1" is not generated ("1x", "1/", "1r",
"1^", "1v", "1l", "1L", "1v" all do nothing useful)
- "2^" and "2v" are equivalent to "s" and "q" respectively, and are
not generated.
- "pS" is equivalent to 0
- "sq", "qs", "nn", "rr" all amount to nothing and are not generated.
The expressions are also evaluated while they are being generated. Any
subexpression that causes an error, such as divide-by-zero, can be
skipped along with all expressions that start with that subexpression.
For example, any expression starting with a constant followed by "nq",
such as "2nq" which means "sqrt(-2)", will be skipped.
As the search proceeds, each equation (that is, each combination of an
LHS with an RHS) is checked to see what value X would have to be for
the equation to work out perfectly, and the difference between this X
and the user's supplied number is called the "delta". An equation
becomes a "record-setter" if its delta is smaller than any delta seen
so far.
If a strict "non-fuzzy" comparison were made, due to roundoff errors
it would be possible for the same solution to be found twice. For
example, if X is near 4/9 = 0.44444.., two solutions are "x = 49/" and
"xq = 23/". One solution might be printed after the other because of
roundoff in the square-root operation. To avoid this, every time a new
record-setter is found, the criterion for another record is set to
0.999 times the new delta.
Determining the value of X for which the equation works out perfectly
is a hard problem. In theory one would have to solve the equation for
X. That involves lots of shifting and rearranging of symbols, and if
there is more than one X in the equation it can get into some rather
complicated algorithms.
Instead, this program compares the LHS and RHS values directly. (The
difference between the LHS and RHS is the "lhs/rhs diff", and varies
from the "delta"). This leads to several problems. Sometimes two sides
of the equation form a really good match only because both sides
involve something raised to a very small power (or a very big root --
same thing). An example of this type of problem is 1.017262042 =
[49sv] (the 81st root of 4) and 1.017313996 = [38sv] (the 64th root of
3). They differ by only 1 part in 20000, but only because both numbers
are very close to 1. All 81st and 64th roots are close to 1. This is
referred to as the "error margin problem" (or "loss of significance").
There is also the problem of zero subexpressions and constant
subexpressions involving X. An example is "x1el-^ = 42sL": The
subexpression "1el-" evaluates to 0, and the entire LHS is constant.
In order for the program to work, such "solutions" have to be detected
and eliminated. This is the "tautology problem".
Finally, there is the issue of reporting what X would perfectly solve
each equation. This is a desirable feature of the program because it
users will often want to find out what the "exact" value would be for
a given equation without actually doing the algebra and calculations
themselves. In cases like [xle+q] = [1e-s], where there is only one X
in the equation, this could be done by the computer. However, there are
lots of important cases like [xx^] = [52*] where an analytic solution
does not exist and the value of X has to be found by some other method,
such as a numeric method. This is the "root-finding problem"
Conveniently, all three problems (the error-margin problem, the
tautology problem, and the root-finding problem) are eliminated in
the same way: by calculating the *differential* of all terms and
subexpressions on the LHS. The differential, when multiplied by the
lhs/rhs diff, gives a very good approximation to the delta, which as
mentioned above is the amount that X would need to be changed to
make the equation a perfect match.
This solves the error margin problem because it equalizes the field --
all equations can get rated in terms of how much X has to change to
make LHS and RHS match, rather than how much the LHS would have to change.
It solves the tautology problem because, by definition, an LHS with
a zero differential is constant with respect to X, and therefore
constitutes a tautology solution.
It also solves the root-finding problem quite conveniently. If you have
an equation like X^X = 10, and you have an approximation for X, you can
use derivatives and Newton's method to find a better approximation for
X. When RIES takes the differential of the LHS and multiplies it by the
lhs/rhs diff, it is essentially performing one step of Newton's method.
The resulting value can be added to X to form a better approximation to
the root of the equation.
Differentials are evaluated for LHS expressions as the expressions are
generated. In the following table, the calculation of the differential
for each symbol is shown, based on the values of the operands (A and
B) and their differentials (da and db). Some of these will be familiar
to anyone who has taken calculus:
type 'a' symbols:
3 any constant 0
x target 1
type 'b' symbols:
e exponential e^A da
l natural log da / A
n negate - da
s squared 2 A da
q square root da / 2 sqrt(A)
r reciprocal - da / A^2
A arctangent da / (1+A^2)
C cosine -sin(A) da
G Gamma %%% reserved (mutex with !)
S sine cos(A) da
T tangent (1 + tan^2(A)) da
W LambertW %%% reserved
type 'c' symbols:
+ plus da + db
- minus da - db
* times A db + B da (note "product rule" below)
/ divide B da - A db / B^2 (note "quotient rule" below)
^ power A^B (ln(A) db + B da / A)
v Bth root BvA (da / A B - db ln(A) / B^2)
! factorial %%% reserved (mutex with G)
L log base B (da / A ln B) - (ln A / ln B) (db / B ln B)
product rule d/dx f(x) g(x) = df g(x) + dg f(x)
quotient rule d/dx f(x)/g(x) = (df g(x) - dg f(x)) / g(x)^2
chain rule d/dx f(g(x)) = df/dg dg/dx
An example of applying the above to derive the formula for the derivative
of A^B (where A and B are both functions of x):
A^B = exp(ln(A) B)
d/dx exp(Y) = exp(Y) dY
d/dx ln(A) = 1/A da
d/dx ln(A) B = db ln(A) + B/A da (by "product rule")
d/dx A^B = d/dx exp(ln(A) B)
= exp(ln(A) B) d/dx (ln(A) B)
= exp(ln(A) B) (db ln(A) + B/A da)
= A^B (db ln(A) + da B / A)
= A^B ln(A) db + B A^(B-1) da (standard form)
d/dx ln(A) / ln(B) = (ln(B) da / A - ln(A) db / B) / ln(B)^2
log(a+b) = log(a(1+b/a))
= log(a) + log(1+b/a)
*/ /*
It is fairly easy to show that in any equation that constitutes a
solution, if any expressions or subexpressons evaluate to 0, the
entire equation can be replaced with an equivalent form that does not
involve 0. The equivalent form is never more complex and is usually
simpler. It can also be shown that any solution involving a
subexpression that contains x and has a 0 derivative can be reduced
to a simpler solution that does not.
Why RIES Calculates Derivatives
Consider the value X = 2.5063. Each of the following solutions (or an
algebraic equivalent) will appear when you run RIES on it:
2 x = 5 for X = 2.5 *
x^2 = 2 pi for X = 2.506628274631 *
x^x = 1+9 for X = 2.5061841455888
x^2+e = 9 for X = 2.5063356063267
The exact solutions to these equations are all different values,
of course, and they all form successively closer approximations to
the supplied value 2.5063.
*/ /*
%%% the following description doesn't match the numbers; don't know
how to fix it...
Look at the last two. Notice that the supplied value, 2.5063, is
between the exact solutions of these two. Also, the equations can both
be expressed in a different way:
2 X = 5 instead of X = 5/2
X^2 = 2 pi instead of X = sqrt(2 pi)
Consider what would have happened if the last digit had been one
higher or one smaller. How much does it change the "closeness" of the
match? With the original two equations, where there is just an "X" on
the left-hand side, this is easy to figure out: if X is 0.0001 lower,
it's 0.0001 closer fit for the first equation, and 0.0001 further from
the second. But look at the alternate forms: If you subtract 0.0001
from X, that subtracts 0.0002 from 2 X, but it subtracts 0.0005 from
X^2. That's a big difference -- if we're using these forms of the
equations (as RIES does) to determine how well the two sides match,
then this altered value of X makes the match "move" further with
respect to one equation than it does with respect to the other!
The reason this happens is, of course, because we're looking at an
expresson on the left-hand side, rather than just a single "X".
Putting an expression on the left-hand side makes it harder to see how
close a match you've got.
If you still don't believe this, consider 1.047246, and exclude sine
and cosine from the function set:
ries 1.047246 -NSC
Your target value: T = 1.047246
3 x = pi for X = T - 4.84488e-05
x^5 = 3 root-of 2 for X = T + 4.81228e-05
1/ln(sqrt(x)) = 4^e for X = T + 1.77179e-05
...
We see that 1.047246 is an equally good solution for the following:
x = pi / 3 (too high by 0.00005)
x = 15th root of 2 (too low by 0.00005)
but if you express the solutions as the #ries# output does:
3 x = pi (3.14172, too high by 0.00013)
x^5 = cuberoot(2) (1.25992, too low by 0.00029)
suddenly it looks like the 3 x = pi solution is more than twice as
good. #ries# notices this and compensates for it regardless of the
form in which the equation is actually found. If you run RIES on the
value 1.047246 it will present both solutions, in the following form:
X * 3 = pi
X ^ 5 = 3 v 2
The philosophy adopted by RIES is that the "true" form for evaluating
the closeness of a match is the form where there is just one X, and
nothing else, on the left-hand-side of the equation, and just numbers
and symbols, but no X's, on the right-hand-side. (Let's call this the
"normalized form".) Put all equations into normalized form, calculate
the value on the right and look at the difference between X and this
value to determine how good the match is.
But now go back to our 2.5063 example above -- one of its solutions
was:
X^X = 10
This equation *cannot* be reduced to something with just one X on the
left-hand side -- there is no "inverse-of-X-to-the-X" function fn[1]. What
does RIES do?
fnd[1] Actually, there is, but it uses an obscure function called the
"Lambert W function" which is the inverse of *y*=*xe^{x}*. More
details are [here|+numbers:xxy_root2] if you are interested.
RIES calculates derivatives. By using the value of 2.5063 for X and
calculating the derivative of X^X for this value of X, you get (about)
19. A derivative of 19 means that any small change in X will cause a
19-times-bigger change in X^X. That's important, because it allows us
to compare the closeness of the match on "equal footing" with other,
normalized equations like X = sqrt(2 pi).
Derivatives work so well, in fact, that RIES does not even have to
bother solving its equations for x. ven if it were easy to do this
(which it is not), leaving the equations unsolved is still an
important speed improvement. Part of the reason RIES is so fast is
that it generates left-hand-sides and right-hand-sides separately
(like a wl[Bidirectional_search] in graph theory) and tries all the
combinations to find possible solutions. It can do this a lot quicker
because a half-equation is smaller than a full equation, and therefore
there are less possibilities to check out.
Furthermore, derivatives allow RIES to quickly and easily check a
possible equation to discover the value that X would have to be in
order to both sides to match exactly (it is essentially performing one
step of Newton's method).
Examples of LHS and RHS expressions for the test case 2.5063. These
are shown in groups that correspond to the pairs that would actually
generate matches in a search:
expression value deriv.
5/2 2.5 0
hyprt(10) 2.506184 0
x 2.5063 1.0
,/25 5 0
2 x 5.0126 2.0
x^2 6.281540 5.01
2 pi 6.283185 0
x^2+e 8.999822 5.01
9 9 0
9+1 10 0
x^x 10.00222 19.19
The derivative is based on the concept of an imagined error-bar in x
that is assumed to be small enough so that all reported matches are
relevant, but which is not zero. Thus, it is in units of the
infinitesimal quantity "epsilon".
Let us now imagine that there is a constant "g" equal to (pi-1) *
2.5063 ~= 5.367473. Then we would have the following grouping:
expression value deriv.
pi^2-2 7.869604 0
x+g 7.873774 1.0
pi x 7.873774 3.14
The matching "pi x = pi^2-2" constitutes a closer match than "x+g =
pi^2/2" because in the former case, x would only need to be decreased
by about 1/pi as much to make it an exact match. So, the closeness of
a match is measured as |LHS-RHS|/derivative, where smaller is better.
To enable actual error bars to be provided with data, a value of
"epsilon" must be adopted for use with notionally precise supplied
values. This can be gleaned from the number of supplied digits in the
input, or the precision of the floating point format can be used. In
the latter case we would take the data value divided by 2 to the power
of the number of digits in the mantissa.
*/ /*
Future enhancements:
- here's a way to allow printing more than one exact solution: When
inserting an LHS or RHS, if an identical item already exists in the
tree, overwrite it with the new one. This will cause at most one new,
distinct exact match per matchscan. Since this new match is of higher
aggregate complexity than the old one, it has at least a moderate
chance of being a mathematically distinct solution (-:
- spend a lot of time looking at which expressions are generated
first, and try to improve the weights so it makes more sense. Why does
2.5063 find [x2q/]=[pq] before [xs]=[p2*]?
- add constants: Feigenbaum, Euler's
- add Gamma function (might have trouble figuring out how to
calculate its derivative)
- ask the Net for practical, numerical definitions of the Riemann
Zeta function, and any others that look like they might be easy to
implement.
- explore how to expand RIES to complex numbers and complex analytic
functions. Looks easy -- after all, complex data types are native in
GCC!
*/ /*
REVISION HISTORY
20000207 Begin (it doesn't do much except parse the parameters)
20000208 Add parsing of level adjust and more design comments
20000209 Write code to generate forms. "Discover" the Motzkin numbers.
20000210 Optimize gf_1() by making it compute the stack depth as it
goes along rather than repeating the whole stack history on each test.
This improves time to generate all forms of length <=19 from 188
seconds down to to 37.9 seconds. Add tracking of min and max weights,
and start writing expression generator.
20000215 It now generates forms within min and max possible
complexity limits, to avoid generating expressions from forms that
cannot possibly fall within the current complexity limit. Add a little
optimization; speeds up a deep search from 53 seconds to 37 seconds.
20000217 It generates expressions, but doesn't prune for complexity
or evaluate. For the initial groups of 5, the numbers of expressions
evaluted are: {0, 12, 12, 72, 1368, 1368, 17928, 345672, 5875272,...}
20000217 Prune for complexity limits; each expression is now
generated exactly once. The numbers are down to: {0, 1, 3, 18, 69,
182, 1046, 5358, 27123,...} Still need to prune foolishness like "11+"
and "nn".
20000217 Prune almost all obvious trivial patterns. The numbers are
now down to {0,1,3,13,43,122,486,2186,9775,...}. Then prune [JK+] and
[JK-] for small integers, [jK*] and [jK+] for any j> 2" instead of ">> 1"
in the s[] and ds[] declarations in struct metastack.
20000220 Refine the termination condition (now it counts generated
expressions, rather than cutting off at a certain complexity score --
this is to gain independence from the specifics of the weights) and
discover another dimensioning bug in struct metastack. Benchmark
changes in PASS_GRAN: When it is set to 4, 2, and 1 the execution time
for generating all expressions up to complexity 64 (2236462
expressions) is 3.57, 4.16, and 5.24 respectively. This actually a lot
better than I thought it would be -- I thought there would be a lot
more overhead from repeating the same subexpression evaluations over
and over again.
*/ /*
20000220 Make it increase LHS and RHS complexity limits at independent
rates, such that the population remains equal between the two sides.
Add a bunch of comments documenting what's been done so far.
20000221 Write bt_insert(); it now reports exact matches! It finds,
for example, [x3+] = [4s] for X=13. However, for X=143 it reports
[xrx*] = [1], with a derivative of 1.9e-19. Add PRUNE_DERIV to try to
fix this, and it starts reporting [xxqL] = [2] with a derivative of
0.00258248 -- this turns out to be a bug in the deriv formula for 'q'.
Eventually increase PRUNE_DERIV to around 1e-14, then decide to make
it a variable and add p_ovr and n_ovr. Write bt_prev() and bt_next(),
but not using them yet.
20000221 Figure out that I can check just the nearest LHS on either
side of an RHS, and vice versa. Write check_sides() and check_match.
It now finds answers and prints them out!
20000221 Fix bug in exact-match reporting. When given 1.5065916, it
reports:
match: [xe+s1+] = [p6*] (solution is X+5.14855e-08)
Make it report delta you'd have to add to X to get each match to be
exact. Test cases that currently produce bogus results: 'ries -l1 27',
'ries -0.28676844', 'ries 403'
20000222 Adjust best_match by 0.999 each time to avoid long strings of
roundoff-error results. This fixes the './ries 403' case. Rename to
"ries" (it used to be called "misc"). Start analyzing memory usage.
Figure out how to save some memory in the node structures, and more
importantly, how to group nodes together in physical memory such that
the program degrades more gracefully when physical memory limits are
exceeded.
20000222 Implement -S and -O command-line options. This has the
side-effect of making certain bogus match bugs easier and quicker to
reproduce.
20000222 Change -S and -O to -N and -S respectively.
20000222 Convert manpage to #nroff# format. Improve implementation of
-S/-O/-N precedence. Implement -O option. Figure out why restricted
symbolsets create very small numbers of equations (the level controls
the number of expressions generated, which is always larger than the
number of expressions inserted in the tree)
20000222 Fix bug that caused negative X's to give no solutions -- it
was initializing best_match to a fraction of X, and not taking the
absolute value! Implement -i option so I can find an expression for
70458.
20000223 Ignore -i if target is non-integer. Adjust weights for seft
'b' operators; 'l' now appears much less often in expressions. Add -y
option.
20000223 Write add_rule(); convert all AM_xx rules to add_rule()
calls. Now the AM_xx rules degrade gracefully with the symbolset
options. Change -y option to -x.
20000223 Clean up formatting in report_match(); add complexity score to
output. Write perl script to benchmark and take statistics on -i option
for a wide rance of integers; leave it running overnight. Change all 'int'
to 's16'.
*/ /*
20000224 Today's date as a mathematical expression:
(((4^(4^(1/e))-pi)^2)-pi)^2 the repetition is cool.
20000224 Write initial version of infix_1, then add a few rules
(ordering of bare symbols in '+' and '*'; always reverse order for 'v'
and 's'). Add -F option.
20000224 Add memory usage statistic.
20000225 Write infix_preproc; implement parentheses precedence.
20000225 Write eval() and newton(); printed X values are now (almost
always) exact roots of their equations. Add AM_1K rules.
20000226 Add copyright and GPL notices; add URL to first printf
20000226 Add symbol definition strings
20000227 Add sin and cos operators. Eliminate several bogus exact
match errors related to roundoff and loss of accuracy, e.g.
cosine(0.0001). Fix major bug in k_prune_deriv test: it only pruned
positive small derivs, not negative small derivs. Add symbol FORTH
names and write postfix_formatter (but it isn't used yet). Don't report
matches if newton() returned an error.
20000227 Add -ie variant to -i option (only_exact)
20000228 Kill another missing-fabs bug; remove prune on "almost exact
matches" which is now adequately covered by the derivative tests.
20000228 Add 'E' operator, AM_l and AM_E attributes, and PS_REVPOW.
check_match now uses newton() to evaluate score more accurately.
Add loss-of-significance test to '-' operator.
20000228 check_match calling newton was royally slowing things down.
Now it uses the old, much quicker test and then uses newton() as a
confirmation test. This cuts time for "ries 2.5063" from 4.09 down to
1.39 on the Cyrix 180 (which is about what it has been since 0223).
Add statistics of pruned subexpressions ("dead-ends").
20000229 Fix bug in newton() that prevented success if target was
negative. Add debug printf's 'mnr'.
20000229 Add time display; add debug printf's 'opqsABCDEFG'.
20000229 Add debug printf's 'Hy' and a few notes about complex analytic
definitions
20000301 Change debug printf 's'; add another loss-of-significance test
to '+' and '-'; add debug printf's 'IJKLtuvx'. Add 'I' operator and
special-case tests for no defined symbols of each seft (this is a massive
optimization for "ries -ie 7 '-S1+*-/^v'")
20000302 Add debug printf 'w' and improve 'r'. Add arctan function (but
not using it yet) and some notes about derivatives.
200003xx Benchmarks on a 333-MHz Celeron:
-command-------------------------- -mem- time equations
ries 2.5063141592653589 960K 0.3 9.5619e7
ries -l1 2.5063141592653589 3072K 1.5 1.0306e9
ries -l2 2.5063141592653589 10.7M 6.5 1.2832e10
ries -l3 2.5063141592653589 32.7M 24.4 1.2042e11
200203xx Benchmarks on an 800-MHz PowerPC G4 (iMac, model M6498LL/A):
-command-------------------------- -mem- time equations
ries 2.5063141592653589 960K 0.1 9.5062e7
ries -l1 2.5063141592653589 3072K 0.8 1.0248e9
ries -l2 2.5063141592653589 10.6M 3.6 1.2765e10
ries -l3 2.5063141592653589 32.6M 14.3 1.1976e11
ries -l4 2.5063141592653589 114M 70.3 1.4609e12
20020610 Figures from an unknown test (I can't find a record of the
details). I think the numbers are: memory usage; total
expressions/total distinct; total equations.
-l0 1136K 32198/14576 53165000
-l1 2876K 145183/51492 666660000
-l2 7632K 530668/152904 5712000000
-l3 27100K 2311204/559419 78120000000
200310xx Benchmarks on an 800-MHz PowerPC G4 (iBook G4, model M9164LL/A):
-command-------------------------- -mem- time equations
ries 2.5063141592653589 960K 0.1 9.5062e7
ries -l1 2.5063141592653589 3072K 0.7 1.0248e9
ries -l2 2.5063141592653589 10.6M 3.2 1.2765e10
ries -l3 2.5063141592653589 32.6M 12.8 1.1976e11
ries -l4 2.5063141592653589 114M 62.3 1.4609e12
ries -l5 2.5063141592653589 398M 388 1.7825e13
20050715 Benchmarks on a 2-GHz PowerPC G5 (part of a dual system,
model M9455LL/A):
-command-------------------------- -mem- time equations
ries 2.5063141592653589 960K 0.0 9.5062e7
ries -l1 2.5063141592653589 3072K 0.2 1.0248e9
ries -l2 2.5063141592653589 10.6M 1.5 1.2765e10
ries -l3 2.5063141592653589 32.6M 6.6 1.1976e11
ries -l4 2.5063141592653589 114M 36.2 1.4609e12
ries -l5 2.5063141592653589 398M 260 1.7825e13
ries -l6 2.5063141592653589 1.41G 2183 2.2297e14
20070511 Alan Eliasen emails me to tell me that 'ries -l4 193707721'
gives the spurious result "x.(1/S(p))/x = 1/S(p) (exact match) {107}".
This is a problem I have known about for a while. I reproduce the
problem and reduce the symbol set to make it appear faster; it can be
reproduced with:
ries -F '-SxpSr/*' -l0 1002353667
Add rule ("",'S', AM_p, 0), which eliminates "sin(pi)". It now reports
the following more complex version of the same bug:
x.(1/S(p.p/p))/x = 1/S(p.p/p) (exact match) {181}
{This isn't actually a problem, it's doing what it should: When the
symbol '1' is not allowed, then "K/K" for any constant K *is* allowed.
However "ries -F '-SxpSr/*1' -l1 1002353667" does produce the bug.
See 20090513 for fix. -20090513}
20090216 Clean up the comments, and add the "ARCHITECTURE" section,
plus a few notes on how to approach a multithreaded implementation.
Here are the benchmarks from the 733 MHz Pentium 3 (from the manpage,
which I have now decided to update):
memory equations digits runtime
usage tested matched (733MHz P3)
-l0 960K 95,000,000 6+ 0.1 sec
-l1 3.1M 1,030,000,000 7+ 0.7 sec
-l2 11 M 12,800,000,000 8+ 3.3 sec
-l3 33 M 120,000,000,000 9+ 12 sec
-l4 115M 1,470,000,000,000 11+ 63 sec
and here are the benchmarks from the 2.33-GHz Core 2 Duo (MacBook
Pro):
-command-------------------------- -mem- time equations
ries 2.5063141592653589 960K 0.0 9.4018e7
ries -l1 2.5063141592653589 3008K 0.1 1.0153e9
ries -l2 2.5063141592653589 10.6M 0.6 1.2643e10
ries -l3 2.5063141592653589 32.4M 2.7 1.1865e11
ries -l4 2.5063141592653589 113M 13.6 1.4461e12
ries -l5 2.5063141592653589 396M 81.7 1.7642e13
ries -l6 2.5063141592653589 1.40G 651 2.2062e14
20090510 Add standard output format, and make it the default.
"-F" now takes a numeric argument (which formerly it did not)
and "-F" alone defaults to "-F0" which is what it did before.
20090511 In infix modes, display operator 'L' (log base A of B)
in advance of both of its arguments. No longer put parens around
single-symbol argument of negate (e.g. emit "-x" instead of "-(x)").
20090513 Add rules to put bare 'x' after non-x-containing expressions
and bare 'pi', 'phi' and 'e'. Finish postfix_formatter (which had
never been brought into spec with the other formatters) and add it
as a 4th output option.
As noted at 20070511, there has been a problem with expressions like
"pi*pi/pi". The command "ries '-SxpSr/*1' -l1 1002353667 -F0" currently
shows the problem. I fix this by adding the three-symbol pattern AM_KxK
and the rule ("", '/', AM_KxK, 0).
20090515 "ries -l6 2.5063141592653589" running uncontested on the
2.26-GHz Nehalem uses 1.40G of memory and tests 2.2154e14 equations in
442.4 seconds. Another test: 1.3525746932102463: 1.43G, 2.3011e14,
463.2. These times are 47% and 40% faster than the Core 2 Duo.
Slight improvements to -Ds output. Improve rules for emitting "*",
" ", or "" for multiplication in different situations. Add rule
("12345678n", '-', AM_jK), and def_amkey/sym_amkey[] optimization.
20090808 Remove "val" parameter of my_alloc()
20101218 Increase default level to -l2 (while preserving the effect
of all '-l' values when such an option is provided: "ries 1.234" is now
the same as "ries -l2 1.234", but "ries -l1 1.234" does the same thing
it used to do).
Print 'x' instead of 'X'; use strcmp and strncmp in a few places.
20111216 Increase MAX_ELEN from 16 to 19 since we have room (struct
size is 64 in either case). Comment out unused obt_xxx declarations.
20111218 check_sides now checks more than one neighboring LHS if the
newly-inserted node is an RHS. Oddly, this does not cause more matches
to be reported, and I am not sure why.
20111219 print_end now displays max LHS and RHS complexity values; add
--find-expression option.
Add k_vanished_dx and use it in several places to enforce a stricter
constraint on derivatives. This eliminates many of the more obscure
tautologies like "x^(4/ln(sqrt(x)))".
Add --evel-expression option, and error numbers and error strings to
support it.
20111220 Restore ERR_EXEC_TRIG_LOW_DX error for RHS expressions;
restore legend in normal infix display format; eval() now reports
ERR_EVAL_TOO_LONG.
20111221 Slight improvements to postfix_formatter; report (most)
unrecognized or badly-formatted command options.
Catch a few more tautologies by comparing magnitude of dx to
magnitude of x.
20111222 Add --version option.
20111223 -S without any symbols displays the symbol table and exits.
20111224 Enhance -Ds output by explicitly mentioning the Newton-Raphson
step.
20111226 Add loss-of-significance tests to each of the operators that
seem to need it: ln(x), e^x, sin, cos, +, -, a^b, a,/b, and log_a(b).
Significance loss errors have always been easy to find in ries
output, by giving a target value that has a simple solution (like
"ries -1.4142135", which is used throughout this note). After ries
gives the simple solution (in this case "x^2 = 2"), subsequent
solutions can incorporate the simple solution in a
loss-of-significance tautology.
The first of these solutions was "1/(x+sqrt(2)) = e^(4^2) {86}".
This incorporates the direct solution into a tautology by turning
"x^2=2" into "x+sqrt(2)=0", which isn't exactly 0 because x is not
exactly sqrt(2). Significance can be said to be lost because many of
the significant bits of the two terms "x" and "sqrt(2)" cancel each
other out in the addition.
This "x + -x" case is probably the most common source of significance
loss that I had not addressed until today. I fixed it by adding the
"fabs(rv) < (fabs(a) * k_sig_loss)" tests in exec() label
"add_common:"
"ries -1.4142135" then gave "-ln(x^2-1) = 1/7^8 {109}" until I added
the k_sig_loss test in exec() case 'l'
It then gave "-sin(pi x^2) = (1/5)^9 {111}" until I added the
k_sig_loss test in exec() case 'S'
It then gave "log_2(x^2-1) = 1/-(e^(e^e)) {117}" until I added the
k_sig_loss test in exec() case 'L'
Then it gave "1/(1/(x-1)-x) = (e^(4^2)),/2 {122}" until I added the
k_sig_loss test in exec() case 'v'
It then gave "(x-1/(x+1))^2 = e^(1/(e^7)^2) {125}" until I added the
k_sig_loss test in exec() case 'E'
After doing all of the above there was only one operator ('^') that
seemed to need a k_sig_loss test.
20111227 Change many details of formatting and printing to accomodate
the recent changes in handling of significant digits. Add constants
k_nominal_digits and k_usable_digits and associated format strings
fmt_g_xxx and use these in most places a floating-point value is
displayed. Redo the equation justify (space-padding) code to make the
best use of 80 columns while still showing centered equations and 15
significant digits when the -x option is given.
Add init_numerics(); k_phi and k_pi are now computed from scratch.
New 'z' option to -D and debug_z flag to show messages printed by
init_numerics().
20111228
Add more k_sig_loss tests in sin and cos.
Display CPU time as "%.3f" rather than the old "%d.%d" that showed
seconds and tenths.
RIES now exits when (best_match < k_0) regardless of the got_exact
flag; this fixes a bug that would cause RIES to loop forever if it had
not yet gotten an "exact match" at the point when best_match goes
negative.
init_numerics() now computes k_e and detects the size of the ULP
(Unit in Last Place or "least significant bit").
Add --min-match-distance option and g_min_matchsize to prevent
reporting of really close matches (useful in e.g. finding classical
approximation formulas for pi)
Add --significance-loss-margin and k_sig_loss to allow going back
to RIES behaviour before all the new k_sig_loss tests were added.
Share code by combining two exit tests into exact_exit() routine.
ln(a) and log_a(b) functions now no longer reject a case like
ln(1.23e10) (in which only about 1 significant digit is lost)
k_prune_deriv had been used for three different purposes, and now is
being replaced with three separate variables. The three purposes of
k_prune_deriv were:
1. Detecting convergence in newton(). Now using new k_newton_settled
2. Pruning subexpression tautologies in ge_2(). Now using k_vanished_dx
3. Setting n_ovr and p_ovr in init2(). For now this is unchanged.
20111229 The big trig change: S and C now take units of pi radians, so
"3rS" now produces the value sin(pi/3)=0.866025... To support this I
also add the --trig-argument-scale option, which if given with the
parameter "1" gives the old trig units (radians). As a bonus, users
can now get degrees or grads or any other angle units.
To avoid confusion, the trig functions are now called "sinpi" and
"cospi" in the output, unless --trig-argument-scale is set to
something other than pi; and the function definitions make this
obvious by adding a message like "For the trig functions, 360 units
are a full circle." at the end.
While I'm at it, I add the tangent function 'T'. This had been in
the manual and in the symbolset since the start, but had not yet been
written in exec(). This of course makes a lot of things solve more
quickly, such as Gosper's "0.9674026381747" (the root of "tan(x) =
3x/2", which previously needed a -l5 search, but with the unit change
and the tangent function, a -l0 search is sufficient.)
One result of the trig units change is that there are a lot more
natural identities connecting common fractions and small radicals. For
example sqrt(2) is 2 sinpi(1/4), sqrt(3) is 2 sinpi(1/3), etc. This
makes ries generate and insert fewer expressions at a given complexity
level. It's so good in fact that you get higher complexity *and* less
memory usage even after adding the tangent function. Here are some
benchmarks, showing how much memory it used up and what level of
complexity it was able to achieve:
--------old-------- --------new--------
complexity complexity
target-value LHS:RHS memory time LHS:RHS memory time
1.5063 -l2 66:61 13632 KiB .319 66:62 12928 .342
.328106566874978253 80:76 486848 KiB 39.5 81:76 461312 38.0
1.9511889024071 80:76 494016 KiB 39.7 81:76 458432 37.8
.922524879060934752 80:76 499264 KiB 40.4 80:77 464320 40.8
(All except the first is at level -l5; the old figures were measured
using the options "--trig-argument-scale 1 -NT")
20111230 Add a pruning rule for [K+K-]->[] and pattern AM_KpK (which I noticed
after updating the sin(4)/cos(4)=1.1578212823 example from the manpage
and discovering that "ries 1.1578212823" prints "x+1-1 = tanpi(4/pi)".)
Time measurement is now done by wall-clock time rather than CPU
cycle time. We do this with the gettimeofday() function. Formerly we
used getrusage(), which under-reports elapsed time on Nehalem and
later Intel microarchitectures when the core that we are running in
also has another thread running. This change also involved adding the
inittime() routine.
Make several changes to facilitate porting this file to other systems:
* #include and use LONG_MAX instead of __LONG_MAX__ (which
was a GCC-only predefined constant)
* The symbol stack effect attribute (formerly called "class") is now
called "seft" (both in the comments and more importantly in variable
names) to facilitate porting to C++ (where "class" cannot be used as a
variable name because it is a language reserved keyword)
* As mentioned above we now measure time with gettimeofday(); in
Windows we provide a gettimeofday() based on _ftime().
* #include "stdafx.h" has been added before the other includes, to
facilitate Visual C++
* #define RIES_VERSION ... has been moved to fall after the includes.
20111231 Add bt_first(), not yet used.
20120102 Increase k_min_best_match when fabs(target)>1.0, to avoid a
"search forever without exiting" condition that would always happen when
fabs(target)>16.0
Add --include/-p (load profile) option, implemented in new routines:
file_read, delimit_args, and parse_args (which includes all the
argument-parsing formerly in main()).
An included file contains options, delimited by blank space with
optional comments starting with '#'. Each non-blank non-comment
'token' is treated as if it were a string in argv[], with the strings
inserted into argv[] at the point where the settings-file option is
given, with recursion to support nested includes. Max 10MiB per file,
max recursion depth is 27 levels. This is to provide for all the fancy
new features I'd like to add someday (like user-defined symbol-weights
and functions).
Add bt_depth and bt_stats (which is run by debug_y) to see if
rebalancing the binary tree will help. I conclude that it will not
help much: in a typical run the average tree depth with rebalancing
would be 17.5, and the actual (unbalanced) average depth is 24.5.
20120103 Add check_exact_match(), which makes bt_insert consist almost
entirely of "generic binary tree code"
Add next_isparam(), clean up argument parsing and fix a bug with
parsing --significance-loss-margin argument.
20120104 Improve several of the debug_n printfs; add a bunch of
debug_q printfs to answer the question of why the check_sides change
of 20111218 did not produce more matches.
Rename the old "on_rhs" variable to "using_x".
Remove a lot of obsolete comments relating to an old idea for
maintaining two separate (LHS and RHS) lists. Update the comments on
derivative formulas and the Gamma function.
check_sides now always checks only one expression to either side of
the newly-inserted expression. This causes a very few changes in
output, mainly in the first one or two reported matches. The reasons
are explained below in a block comment "THE CHECK_SIDES PARADOX".
To support the check_sides investigation I added the -D0 option and
debug_0, which does a complete dump of the expressions database after
every gen_forms pass.
20120105 Add stack overflow and underflow checks in eval(). Increase
MAX_ELEN again to 21 (the expr struct is still 64 bytes).
Add MAX_SEFT_POP checks in add_symbol().
Add better documentation of the workings of the metastack routines.
Implement canonval() but leave it disabled by default right now.
This tries to transform expressions into forms that have a value in
[1.0,2.0), which also transforms expressions. For example, compare:
ries 1.50631415926535897932 --canon-reduction 0
x-1 = 1/2 for x = T - 0.00631416 {50}
1/ln(x) = sqrt(6) for x = T - 0.00213357 {62}
x^4 = 2+pi for x = T - 0.000489459 {68}
(x+1)^2 = 2 pi for x = T + 0.000314115 {69}
x^pi = tanpi(sqrt(2)) for x = T - 0.000224565 {70}
x^2 = tanpi(1/e) for x = T + 9.42083e-05 {60}
x^2-2 = 1/(1+e) for x = T - 1.35846e-05 {79}
1/(log_3(x)) = 3-1/pi for x = T + 8.87321e-06 {87}
1/x+sqrt(2) = 1/ln(phi) for x = T - 2.77223e-06 {86}
1/(log_7(x)) = 5-1/4 for x = T - 6.29902e-07 {94}
. . . . . . . . .
log_(2/sqrt(3))(x) = 3-1/2^e for x = T - 1.21748e-09 {128}
max complexity: 66 62 128
dead-ends: 2290352 5560466 7850818 CPU time: 0.369
expressions: 183663 420208 603871
distinct: 92770 113599 206369 Memory: 12928KiB
ries 1.50631415926535897932 --canon-reduction nr2
x = 3/2 for x = T - 0.00631416 {48}
(1/ln(x))/2 = sqrt(6)/2 for x = T - 0.00213357 {98}
x^4/2 = (2+pi)/2 for x = T - 0.000489459 {104}
(x+1)^2/2 = 2 pi/2 for x = T + 0.000314115 {105}
-(x-4)/2 = pi,/2 for x = T - 5.21371e-05 {95}
(1/(x^2-2))/2 = -(pi-5) for x = T + 2.16534e-05 {110}
(1/(x^2-2))/2 = (1+e)/2 for x = T - 1.35846e-05 {115}
e^(x-1) = (1/pi+3)/2 for x = T - 5.75694e-06 {101}
1/(x/phi)^2 = 8,/pi for x = T + 7.44845e-07 {94}
sqrt(3-x) = -tanpi(e-1) for x = T - 6.44308e-07 {101}
(1/-(log_7(1/x)))/2 = (5-1/4)/2 for x = T - 6.29902e-07 {144}
. . . . . . . . .
sinpi(1/5),/x/2 = 1/-cospi(1/ln(sqrt(7))) for x = T + 1.15156e-10 {151}
max complexity: 66 62 128
dead-ends: 2290352 5560466 7850818 CPU time: 0.369
expressions: 183663 420208 603871
distinct: 63206 67234 130440 Memory: 8192KiB
The intent is to get further with a given amount of memory by
collapsing equivalent equations like "x=-2" and "-x=2" into the same
solution. It succeeds in doing this, but also causes different matches
to be discovered and reported (despite the fact that the main loop is
still making the same choices about whether to call gen_forms on LHS
or on RHS). I need to investigate this further to discover why (one
possibility is that the new solutions come from things that would have
matched earlier, but didn't, because they lie in different places on
the number line and wouldn't meet up without needing more complexity).
20120107 -pfoo now tries opening "foo.ries" if "foo" cannot be opened.
Add real-time benchmarking in my_alloc (not yet finished).
20120108 More work on thrash_check(); add --min-memory option (which
presently does nothing, but eventually will prevent RIES from quitting
early in the event of unanticipated system slowness). Add --max-memory
option (which makes RIES definitely quit before exceeding a given
amount of memory).
20120113 Move nested functions out of parse_args()
20120115 Add --memory-abort-threshold option and fix some segfaults
that were happening if the user does not give an expected option
argument.
Add typedef sym_attr_block and use it for all the symbol table arrays.
20120122 Add CANONVAL_MUL2 operation and stub cv_simplify() routine.
THE CHECK_SIDES PARADOX
The RIES algorithm maintains a single list of RHS and LHS expressions
sorted in numerical order. These are distributed pretty much randomly,
and when a new expression is added, RIES checks the preceding and following
list items to see if it can form an equation with the new item.
The closeness of a match is the difference in values divided by the
derivative of the LHS:
closeness = |LHS(x)-RHS|/(d/dx LHS(x))
If the new node is an LHS, then the values of LHS(x) and d/dx LHS(x)
are the same for every match-comparison that is made. One RHS in each
direction (upward and downward) is all that needs to be tested,
because any further RHS's will generate a larger value of
|LHS(x)-RHS|.
However, when the new node is an RHS, the value of d/dx LHS(x) will
differ for each LHS that is found while scanning upward and downward
for candidate matches. Therefore, even after finding a new optimal
solution, the possibility exists that there might be another
even-closer optimal solution if you keep scanning further, coming from
an LHS that has a much lower value of d/dx LHS(x).
Here is a concrete example. La and Lb are two LHS's with derivatives
of 1 and 10. Ra and Rb are two RHS's. They are shown here as if laid
out on a number line to make the example clearer. The nodes are
inserted in the order: La, Lb, Ra, Rb.
value: 1 2 3 4 5 6 7 8 9
Ra Rb La Lb
d/dx: - - 1 10
When Ra is inserted, the match La=Ra is found, with closeness
(8-1)/1=7. Then Rb is inserted, and a new match La=Rb is found, with
closeness (8-7)/1=1. These are the only two matches it will report.
However, the match Lb=Ra is has a closeness of (9-1)/10=0.8, which is
closer, and Lb=Rb has a closeness of (9-7)/10=0.2 which is closer
still. Lb=Ra should have been reported instead of La=Rb. The program
should report La=Ra, Lb=Ra, and Lb=Rb (in that order).
For a real-life example, use the command:
ries .328106566874978253 -l-4 --trig-argument-scale 1 -NT -Dy0
which only gives one result, "5 x = sqrt(e)". With -D0 it dumps the
entire table of values on each complexity pass. Look through this
output for the first occurrance of [xrS] and [9r] together:
28 xrS { 35} = 0.093664890868384448 , dx = 9.2481888991224963
29 xp/ { 34} = 0.10443956395812863 , dx = 0.31830988618379069
30 xs { 24} = 0.10765391922648457 , dx = 0.65621313374995649
31 x3/ { 35} = 0.10936885562499275 , dx = 0.33333333333333331
32 9r { 26} = 0.1111111111111111
33 8r { 26} = 0.125
The output shows that [9r] is inserted a few passes later than 'xrS',
so check_sides is looking at [9r]'s neighbors [x3/] and [8r].
For a while in 201112 I made RIES look past the nearest neighbor,
and this command reported the result "sin(1/x) = 1/9". But as you can
see, sin(1/x) matches 1/9 more closely than x/pi, x^2, and x/3,
because the derivative of sin(1/x) is so much higher.
This is an example of the above-described problem, and is the reason
why for a while RIES was checking multiple neighbors on each insert.
The Paradox:
Despite the foregoing, the present algorithm (in which check_sides
only looks at the one closest neighbor on both sides of a newly-added
expression) turns out to work very well.
To understand why, look at the output of ries 1.506591651 -Dy0 and
find the first appearance of [1p6*-] :
448 x1+p^n { 56} = -17.937341258252982 , dx = -22.481451854903934
449 1p6*- { 51} = -17.849555921538759
450 xe+sn { 51} = -17.84955591743638 , dx = -8.4497469589180909
451 xTe/ { 49} = -17.762306262640966 , dx = 857.98450178681276
452 x7^n { 46} = -17.61849853924631 , dx = -81.859931782354067
When [1p6*-] is inserted, the LHS expressions [xe+sn] and [xTe/] are
already present. [xTe/] has a much bigger derivative (over 100 times
as large), so it looks like a good candidate for a match that would be
missed if check_sides only looked at the first neighbor of [1p6*-].
However, the distance in x values from [1p6*-] to [xTe/] is over 20
million times larger in magnitude than the distance from [1p6*-] to
[xe+sn], so the higher derivative of [eTe/] doesn't stand a chance.
When checking for other similar cases, the same thing always happens:
when an RHS and LHS forms a new record close match, any other LHS's
in the area don't come anywhere close to being another new match.
The reason for this is in the statistics. Going back to the
.328106566874978253 example, consider the range of values between 0
and 1. By the time there are 1000 expressions in this range, the
average distance between expressions will be about 0.001. However, the
distance between the two *closest* expressions will be much smaller,
somewhere on the order of e/10^-6.
Now consider what happens when you insert another 1000 expressions
at random places in the range (0..1). There is a reasonably good
chance that one of these new 1000 points will come closer to an
existing point than any of the other old points was. This will be a
new match. However, the odds of having *another* new match at the same
time are very very low -- about 1 in 1000. In other words, in order
to get a situation where there are two good LHS matches near an RHS,
all three have to be within 10^-6 of each other.
*/
/* stdafx.h is a precompiled header in Microsoft Visual C++. It needs to
be first (before all the code, including other includes) because the
compiler includes (as an optimization) the assumption that any user code
will be parsed only after the precompiled header is loaded. */
#ifdef _WIN32
# include "stdafx.h"
#endif
#include
#include
#include
#include
#include
/* ---------gettimeofday---------
RIES uses gettimeofday() to measure how much time was used in the
search. To port RIES to another OS, add another #ifdef case to
provide another gettimeofday() function. */
#ifdef _WIN32
/* Windows version. Note that _WIN32 is defined even if building for a
64-bit target (which in addition defines _WIN64) */
/* and are needed for _ftime()
http://msdn.microsoft.com/en-us/library/z54t9z5f(v=vs.71).aspx */
# include
# include
/* defines the UNIX-compatible "timeval" structure that we take
as a parameter, allowing us to emulate the UNIX routine that RIES
was designed to use. It's in winsock because network protocols use a lot
of data structures that were originally defined by UNIX systems. */
# include
// from www.linuxjournal.com/article/5574
void gettimeofday(struct timeval* t,void* timezone)
{
struct _timeb timebuffer;
_ftime( &timebuffer );
t->tv_sec=timebuffer.time;
t->tv_usec=1000*timebuffer.millitm;
}
#else
/* On UNIX, Linux, MacOS X, and CygWin systems the gettimeofday function
is in the standard libraries and is defined by these #includes. */
# include
# include
#endif
/* -------------- defines ------------------------------------------------- */
#define RIES_VERSION "2012 Jan 22"
/* Default search level. For backwards compatibility, the -l option adds
a number to the DEFAULT_LEV_BASE value. Without a -l option, it acts as if
-l was given with the parameter DEFAULT_LEV_ADJ. */
#define DEFAULT_LEV_BASE 2.0
#define DEFAULT_LEV_ADJ 2.0
/* Maximum length of a symbolic expression. NOTE: right now it's dimensioned
to reflect a normal symbol set. However, in a run with a very limited
symbol set the expressions grow in number a lot more slowly as the
complexity score and length increase, and therefore a much higher MAX_ELEN
would be necessary. The main problem is it affects the size of the
list nodes, and therefore the memory footprint of the program. I guess
I'll change it later, when I decide how to make the list nodes variable
in size.
%%% It might be possible to fix this by a method similar to that used in
rubik2. In the RIES case, we only add an expression to the tree when there
is a single item on its FORTH stack, and there is no way to predict if or
when a subexpression will ever have a stack depth of 1. However, we could
add nodes to an auxiliary list when their MAX_ELEN space is full, then any
expressions that come from them would have a predecessor pointer pointing
back to the aux. item.
For example, suppose ge_2 is at depth 16 and the current partial
expression is ep2+3+4+5+6+7+8+. At this point there is no room to add more,
and nothing has been put in the tree from this expression yet (except the
initial "e"). So, we allocate a node that just contains "ep2+3+4+5+6+7+8+"
and start reusing the MAX_ELEN space. Any nodes inserted into the LHS/RHS
tree will have their "predecessor" pointer pointing back to the
"ep2+3+4+5+6+7+8+" node, so that if they generate a match, the full
subexpression can be reconstructed.
*/
#define MAX_ELEN 21
/* The size of the memory blocks we use */
#define ALLOC_SIZE 65536L /* %%% should be at least 8x the VM page size */
/* This can be increased to improve speed, but it also decreases the
efficiency of memory usage by having the balance of LHS's and RHS's be
further from a 1:1 ratio. */
#define PASS_GRAN 1
/* -------------- typedefs ------------------------------------------------ */
typedef short s16;
typedef unsigned short u16;
typedef long s32;
typedef unsigned long u32;
typedef long long int s64;
/* SYMBOL_RANGE is the dimension of an array capable of storing one of
each possible value of typedef symbol. */
#define SYMBOL_RANGE 256
typedef char symbol;
/* NOTE: %%% Although it looks like I am trying to maintain
independence from having the symbol type be 'char', I actually have
not accomplished this in the code. However, it won't be a total
mess to convert it to 16-bit symbols or something like that, if
someone decides that's necessary. The worst part will designing a
new command-line syntax for specifying the symbolset for a search.
However, *BEWARE*! If you're increasing the symbol set
significantly past its original level of about 40 symbols, the
users will pay dearly in efficiency (runtime). Having a great
variety of symbols will massively slow down the search. In
particular, please resist the temptation to make one symbol for
every integer (or even every prime number) from 1 to 1000, or some
other arbitrary big number. There should not be more integer
symbols than every other symbol combined! RIES is perfectly happy
synthesizing the larger integers from the small ones on its own,
just like it does for the fractions and irrationals.
If the fact that a search for "163.0" yields "(x+2)/5 = 2^5+1" as
an answer, the solution isn't to make "163" a symbol. Instead, look
at ways to make RIES automatically combine integer subexpressions
like "2^5+1" and, when possble, simplify the equation by moving the
other 5 to the RHS. On the other hand, keep in mind that the ability to
show "163" expressed in terms of "two 2's, two 5's and a 1" is one of
the reasons RIES was created. */
/* phantom symbols */
#define PS_REVSUB 1
#define PS_REVDIV 2
#define PS_cross 3
#define PS_REVPOW 4
#define IS_PHANTOM(x) (x < 10)
/* this struct is used for expressions that have been inserted into LHS or
RHS lists. */
typedef struct expr {
double val; /* The expression's floating-point value */
double der; /* The derivative (for LHS expressions only: if
RHS, this will be 0.) */
struct expr *left; /* left child tree or 0 if none */
struct expr *up; /* parent node, or 0 if we're the head */
struct expr *right; /* right child tree or 0 if none */
s16 elen; /* number of symbols in the expression, e.g. 8 */
symbol sym[MAX_ELEN+1]; /* The expression in symbolic form, e.g. "p6*1-qe-"
this field is last to allow for a possible
future with variable length allocation */
} expr; /* 8+8+4+4+4+2+21+1=52 bytes, or 64 if using 64-bit pointers */
/*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Obfuscated Binary Trees (An idea that seemed hackish at the time)
A way to save memory on the RHS (constant subexpression) nodes. These
notes date back to a time when I considered maintaining separate trees
for LHS and RHS nodes.
typedef struct rhs_node {
double val; // The expression's floating-point value
void * P_L; // parent and left child pointers
void * P_R; // parent and right child pointers
symbol sym[MAX_ELEN+1]; // The expression in symbolic form, e.g.
// "p6*1-qe-", null-terminated. This field
// is last to allow for variable length
// allocation in the future.
} rhs_node; // size: 8+4+4+12=28
P
P^L P^R
L R
The three links (parent, left-child and right-child) are XORed
together into the two fields P^R and P^L. As the tree is traversed,
the traversal routine always keeps track of what node it came from and
what that node's numeric value was. Then the three links are
reconstructed as follows:
If the current node was reached by going down, we know P. L and R are
reconstructed by XORing P with the two fields P^R and P^L.
If this node was reached by going up, we know either L or R. To find
out which one, we compare the val field of this node with that of the
node we just came from. If it's bigger, we came from the left child
and therefore, L is known. If it's smaller we came from the
right-child and R is known.
If R is known, P is reconstructed by XORing R with P^R. Then L is
reconstructed by XORing P with P^L. If L is known, the same method is
used to get P and R.
There is another, perhaps more direct way to save one pointer per node:
maintain a stack of parent pointers. As you descend, push parent pointers
on the stack, and pop them off as you ascend. However, it does not
appear that the methods can be combined to save two pointers per node:
if you descend, the node you arrive at has unknown L and R, both of
which would have to get reconstructed.
typedef struct obt_traversal {
s16 side; // LHS or RHS
void * realnode; // points to actual LHS or RHS node (0 if we're
// at the root)
double x; // The value
double dx; // Derivative
void * up; // Reconstructed parent pointer
void * left; // Reconstructed left-child pointer
void * right; // Reconstructed right-child pointer
// If they want the expression, they can get it from *realnode
} obt_traversal;
// %%% obt (which literally stands for "obfuscated binary tree")
// routines aren't written yet.
void obt_new(obt_traversal * it);
void obt_clone(obt_traversal * from, obt_traversal * to);
void obt_up(obt_traversal * it);
void obt_left(obt_traversal * it);
void obt_right(obt_traversal * it);
______________________________________________________________________________*/
/* M E T A S T A C K !
A metastack works like a stack of stacks. Imagine a normal stack
with its push and pop operators. A metastack lets you treat the whole
stack as an object being pushed and popped. Every time you do a push or
pop, a copy of the *entire stack* gets pushed onto the metastack.
Metastacks support a third operation, called "undo", which pops the
previous version of the stack off of the metastack; this popped stack
replaces the current stack. Thus, undo enables you to go "back in time",
returning the stack to the state it was at some point in the past
without having to remember what values were popped and pushed.
This is useful in RIES's expression generator because it allows
intermediate calculations to be reused from one expression to the next
during the recursive scan, and upon backtracking the stacks from all
earlier, shorter subexpressions are restored.
To illustrate the workings of the metastack data structures and
explain why they are allocated as they are, we need to understand the
consequences of the choice of MAX_ELEN (which determines the maximum
number of symbols in an expression generated by ge_1 and ge_2).
First, realize that gen_forms is designed to generate only those
forms that leave a single item on the stack at the end, and ge_1/ge_2
will comply with this. So the expression "234+" is "incomplete"
because it leaves two items on the stack. Also, "23++" would never be
generated even as a partial expression because it causes a stack
underflow.
Odd and even differ:
max expression sp
length form expr max
MAX_ELEN = 4:
aabc 23s+ 2
aacb 23+s 2
abac 2s3+ 2
abbb 2sss 1
MAX_ELEN = 5:
aaacc 234++ 3
aabbc 23ss+ 2
aacac 23+4+ 2
aacbb 23+ss 2
ababc 2s3s+ 2
abbac 2ss3+ 2
abbbb 2ssss 1
expr. operation emulated metastack data
stack ---uv--- ---duv-- uvp -------ms------- msp --s-- -ds-- sp
[] [] (empty) [ , , , ] [ , , , ] 0 [ , , , , , , , ] 0 [ , ] [ , ] 0
[2] ms_push 2, [ , , , ] [ , , , ] 0 [v, , , , , , , ] 1 [2, ] [0, ] 1
[23] ms_push 2,3, [ , , , ] [ , , , ] 0 [v,v, , , , , , ] 2 [2,3] [0,0] 2
[23s] ms_pop 2, [3, , , ] [0, , , ] 1 [v,v,^, , , , , ] 3 [2, ] [0, ] 1
ms_push 2,9, [3, , , ] [0, , , ] 1 [v,v,^,v, , , , ] 4 [2,9] [0,0] 2
[23] ms_undo 2, [3, , , ] [0, , , ] 1 [v,v,^, , , , , ] 3 [2, ] [0, ] 1
ms_undo 2,3, [ , , , ] [ , , , ] 0 [v,v, , , , , , ] 2 [2,3] [0,0] 2
[23+] ms_pop 2, [3, , , ] [0, , , ] 1 [v,v,^, , , , , ] 3 [2, ] [0, ] 1
ms_pop [] [3,2, , ] [0,0, , ] 2 [v,v,^,^, , , , ] 4 [ , ] [ , ] 0
ms_push 5, [3,2, , ] [0,0, , ] 2 [v,v,^,^,v, , , ] 5 [5, ] [0, ] 1
[23] ms_undo [] [3,2, , ] [0,0, , ] 2 [v,v,^,^, , , , ] 4 [ , ] [ , ] 0
ms_undo 2, [3, , , ] [0, , , ] 1 [v,v,^, , , , , ] 3 [2, ] [0, ] 1
ms_undo 2,3, [ , , , ] [ , , , ] 0 [v,v, , , , , , ] 2 [2,3] [0,0] 2
*/
#define MS_UV_MAX ((MAX_ELEN)-1)
#define MS_UNDO_MAX ((MAX_ELEN) * 2)
#define MS_STK_MAX (((MAX_ELEN)+1) >> 1)
typedef struct metastack {
double uv[MS_UV_MAX]; /* undo values */
double udv[MS_UV_MAX]; /* undo values (differentials) */
s16 uvp; /* undo values pointer */
s16 ms[MS_UNDO_MAX]; /* metastack (undo opcodes) */
s16 msp; /* metastack pointer (undo opcodes index) */
double s[MS_STK_MAX]; /* current stack */
double ds[MS_STK_MAX]; /* stack of differentials for LHS */
s16 sp; /* stack pointer */
} metastack;
/* the metastack undo opcodes. If you wanted to support more than simple
push and pop you would add another opcode for each operation (example:
a modify-in-place operator, something which normally requires a pop
followed by a push) */
#define MSO_PUSH 1
#define MSO_POP 2
/* "pe" stands for "partial expression". This struct is used by the
recursive expression generator. Typically about half of its symbols
will be filled in, and the complexity of the symbols written so far is
compered to rminw and rmaxw for pruning.
Here's an example. Let's say the current weight limits are 30 to 31:
we're trying to generate expressions whose total weight is either 30
or 31. And suppose further that the partial expression currently contains
two symbols with a total weight of 18. That means that the remaining
symbols must add up to either 12 or 13 for the expression to be accepted.
When it's generating expressions, RIES already knows what form (see
below) the expression fits into, and therefore it knows how many symbols
are left to generate and what types they are. Therefore, certain statements
can be made about the weights of the symbols yet-to-be generated. Let's
say the current form calls for two more symbols, both of type 'b', and
suppose further that all 'b' symbols have weights of 8, 9 or 10. Well, that
means that at this point in our example equation, even if the two remaining
symbols are 8's, the total weight will be too high. Thus, the first two
symbols form an impossible combination (from the point of view of trying
to meet the current range of 30 to 31), and the expression generator
can backtrack immediately, without having to proceed to explore all the
combinations of two more type 'b' symbols that might be added here. That
saves a LOT of time -- this optimization alone cuts the number of
expressions by a factor of over 200 (for searches of 10,000,000 expressions)
and even more if a larger number of expressions is searched.
*/
typedef struct pe {
s16 cplx; /* complexity of this partial-expression */
s16 elen; /* number of symbols, e.g. 2 */
symbol sym[MAX_ELEN+1]; /* the symbols, e.g. "p6" */
s16 rminw[MAX_ELEN+1]; /* remaining minimum weight */
s16 rmaxw[MAX_ELEN+1]; /* remaining maximum weight */
} pe;
/* A "form" is a pseudo-expression consisting only of A's, B's and C's. The
A's, B's and C's represent the three types of symbols that make up
real expressions. Before real expressions are generated, RIES first
determines what sequences of type-A, type-B and type-C symbols will
constitute legal expressions. (It's easier to describe what an illegal
expression is: It's an expression that causes the stack to underflow
by executing an operation without enough operands, or an expression
that leaves extra stuff on the stack when it's done.) */
typedef struct form {
s16 min_weight; /* min attainable complexity with this form */
s16 max_weight; /* max attainable complexity with this form */
s16 stack; /* stack depth at end of form */
s16 flen; /* number of symbols in form */
symbol sym[MAX_ELEN+2]; /* the form, e.g. "aabc" */
} form;
typedef struct sym_attr_block {
symbol seft; /* the symbol's stack effect */
s16 weight; /* for scoring */
s16 mask; /* Attributes, masked with this, must be 0 */
s16 yes; /* nonzero if this symbol is given in -S */
s16 once; /* nonzero if this symbol is given in -O */
s16 no; /* nonzero if this symbol is given in -N */
s16 allowed; /* Number of symbols allowed in each expression */
s16 count; /* used in ge_2() to keep track of how many symbols
are in expression; part of -O option. */
char * defn; /* string which, when printed, define a symbol's meaning */
char * desc; /* Descriptions of symbols that are obvious in context */
s16 def_given;
s16 def_needed;
char * forth; /* The symbol's representation in postfix output format */
s16 amkey; /* "easy" attributes */
} sym_attr_block;
/* -------------- variables ----------------------------------------------- */
char * g_argv0; /* Set to argv[0] by main for use by sudden death errors */
double g_levadj; /* -l option or default DEFAULT_LEV_ADJ */
char *freepool; /* pointer to current alloc block */
u32 freesize; /* and bytes left in current alloc block */
expr *lhs_root; /* binary tree for LHS list */
/* expr *rhs_root; binary tree for RHS list (not currently used) */
s16 using_x; /* This is nonzero if we're currently generating expressions
for the LHS tree. */
s16 lmax; /* current complexity maximum limit for LHS */
s16 rmax; /* current complexity maximum limit for RHS */
s16 lmin; /* current complexity minimum limit for LHS */
s16 rmin; /* current complexity minimum limit for RHS */
double tlevel; /* The maximum level (depth) of the search. The search
ends when LHS + RHS > tlevel */
double target; /* The value for which we are searching */
double exec_x; /* the value exec() uses for 'x' symbol */
s16 got_exact;
sym_attr_block sym_attrs[SYMBOL_RANGE];
s16 weight_base; /* weight per symbol for expression
complexity score */
s16 x_lhs_only; /* nonzero if symbol 'x' should only be on
LHS (this is not necessarily the same as
"-Ox" because there can be multiple x's
in an LHS) */
#define LINELEFT_INIT (79-2)
int used_trig; /* Set if any trig symbol has been used in
a result */
s16 S_option;
s16 g_show_ss;
s16 int_subexpr; /* nonzero if they gave -i option */
s16 only_exact; /* nonzero to print only the exact match, if any */
s16 relative_x; /* nonzero if X values should be given relative to T */
/* attribute masks */
#define AM_KK 0x0001 /* K K - */
#define AM_1 0x0002 /* 1 - */
#define AM_2 0x0004 /* 2 - */
#define AM_n 0x0008 /* n - */
#define AM_r 0x0010 /* r - */
#define AM_55 0x0020 /* J K - where J and K are both 5 or less */
#define AM_jK 0x0040 /* J K - where J < K */
#define AM_RHS 0x0080 /* set only when filling RHS list */
#define AM_sq 0x0100 /* o - where o is 's' or 'q' */
#define AM_1K 0x0200 /* 1 K - */
#define AM_l 0x0400 /* l - */
#define AM_E 0x0800 /* E - */
#define AM_p 0x1000 /* p - */
#define AM_KxK 0x2000 /* K * K - */
#define AM_KpK 0x4000 /* K + K - */
#define MAX_SEFT_POP 20
symbol g_asym[MAX_SEFT_POP]; /* the valid seft 'a' symbols */
s16 n_asym;
s16 a_minw; /* minimum weight of seft 'a' symbols */
s16 a_maxw; /* maximum weight of seft 'a' symbols */
symbol g_bsym[MAX_SEFT_POP]; /* the valid seft 'b' symbols */
s16 n_bsym;
s16 b_minw; /* minimum weight of seft 'b' symbols */
s16 b_maxw; /* maximum weight of seft 'b' symbols */
symbol g_csym[MAX_SEFT_POP]; /* the valid seft 'c' symbols */
s16 n_csym;
s16 c_minw; /* minimum weight of seft 'c' symbols */
s16 c_maxw; /* maximum weight of seft 'c' symbols */
s16 s_minw; /* minimum weight of any symbol */
s16 max_flen; /* max length of forms generated thus far */
/* Irrational constants have lots-o-digits just in case this program ever
gets ported to a C compiler that offers quad-precision floating point. */
double k_0 = 0.0;
double k_1 = 1.0;
double k_phi = 1.61803398874989484820458683436563811772030;
double k_2 = 2.0;
double k_e = 2.71828182845904523536028747135266249775724;
double k_3 = 3.0;
double k_pi = 3.14159265358979323846264338327950288419716;
double k_4 = 4.0;
double k_5 = 5.0;
double k_6 = 6.0;
double k_7 = 7.0;
double k_8 = 8.0;
double k_9 = 9.0;
/* Constants that parametrize functions */
double k_sincos_arg_scale = 0;
/* These constants are used to set legal limits in various functions */
double k_sin_clip = 0.99999;
double k_2pi = 6.2831853071795864;
double k_eXlim = 690.0;
double k_matchstart = 0.01;
double k_precision_ulp = 0.0;
double k_min_best_match = 1.0e-15;
double k_sig_loss = 0.01;
// double k_sig_loss = 01.0e-10;
double k_vanished_dx = 1.0e-6;
double k_prune_deriv = 1.0e-10;
double k_newton_settled = 1.0e-15;
double g_min_matchsize = 0;
double p_ovr;
double n_ovr;
/* Format strings and precision constants.
The binary formats, and associated precision/significance values, of the
floating-point types that one is likely to encounter are:
significand
bin. decimal
IEEE binary64 double 53 15.95
8087 80-bit extended long double 64 19.27
double-double 107 31.21
IEEE binary128 quad 113 34.01
*/
/*
significant digits constants, pre-initialized for IEEE binary64.
The number of "nominal" digits is the significand (decimal) value above,
rounded to an integer.
The number of "usable" digits is based on the this and our choice of
k_sig_loss to limit loss of significance. k_sig_loss is 0.01 by default,
which is 10^-2 so we lose 2 digits of significance.
*/
int k_nominal_digits = 17;
int k_usable_digits = 15;
/* Formatting strings, pre-initialized for IEEE binary64. 'nominal'
and 'usable' are as defined above. Note that some binary formats
(anything that is not compiler-native, like double-double) will not
use these format strings, but instead only use the k_xxx_digits values
above, because the library xxprintf() functions only work with
compiler-native formats.
The 'fixed' strings are designed for fitting any value in a fixed
number of characters, so output lines up in neat columns. This has to
be 6 characters wider than the number of significant figures: 1 for
the sign, 1 for the decimal point, and 4 for an exponent like "e+27"
or "e-05". These strings from the output of "ries 2.5063 -DG" serve
as examples:
|123456789.123456789.1| <- 21 characters wide
+---------------------+ left-justified printf format: "%-21.15g"
|748.729680171193 | common case: 1+15 = 16 characters
|0.470252846911253 | leading zero: 1+1+15 = 17 characters
|-0.000131760338472132| sign and 4 leading zeros: 1+1+1+3+15 = 21 characters
|-5.80064296934474e-05| sign, decmial, 'e', sign and exponent: 1+1+15+1+1+2
+---------------------+
*/
char * fmt_g_nominal = "%.17g";
char * fmt_g_nom_fixed = "%-23.17g";
char * fmt_g_usable = "%.15g";
char * fmt_g_usa_fixed = "%-21.15g";
/* Variables used by the search algorithm */
double best_match;
u32 g_ne;
u32 insert_count;
u32 prune_count, lhs_prune, rhs_prune;
u32 mem_used_KiB;
int output_format;
#define OF_POSTFIX 0
#define OF_CONDENSED 1
#define OF_NORMAL 2
#define OF_FORTH 3
u32 lhs_insert, rhs_insert;
u32 lhs_gen, rhs_gen;
long gen_total;
/* Counters used in by thrash_check to estimate how long it should have taken
us to use a given chunk of memory */
long g_exec_calls;
long g_cv_calls;
/* debugging options: */
/* numbers from "ries -l2 2.5063141592653589 -DJ | wc -l" */
s16 debug_s; /* report_match: "show work" 300 */
/* (values of all subexpressions) */
s16 debug_o; /* eval: sym, x and dx at each step 788 */
s16 debug_n; /* newton: x and dx at each step 255 */
s16 debug_p; /* check_match entry 453890 */
s16 debug_q; /* check_match past 1st-stage test 109 */
s16 debug_m; /* ms_push, ms_pop, ms_peek, ms_undo 10216232 */
s16 debug_r; /* exec 1706583 */
/* ge_2: (uppercase for LHS, lowercase for RHS) */
/* exec: */
s16 debug_A; /* prune partial exec error 29949 64184 */
s16 debug_B; /* prune partial zero 196 2797 */
s16 debug_C; /* prune partial noninteger with -i 38 38 */
s16 debug_D; /* prune partial overflow 1681 4171 */
s16 debug_E; /* prune partial dx near 0 1333 */
/* full expr: */
s16 debug_F; /* prune full dx near 0 1157 */
s16 debug_G; /* insert 305319 670210 */
s16 debug_0; /* dump entire database */
/* partial expr: */
s16 debug_H; /* rules 361668 731891 */
s16 debug_I; /* symbols to try 3286347 6618747 */
s16 debug_J; /* prune complexity 2111413 4235348 */
s16 debug_K; /* prune rules 208550 451853 */
s16 debug_L; /* prune symcount 38 38 */
/* ge_1: */
s16 debug_t; /* entry 9380 */
s16 debug_u; /* rminw and rmaxw calculation 31649 */
s16 debug_v; /* expressions generated 4709 */
s16 debug_w; /* gf_1 28978 */
s16 debug_x; /* add_rule 89 */
s16 debug_y; /* main loop 353 */
s16 debug_z; /* init and miscellaneous */
s16 debug_M; /* memory allocation */
#define DBG_LHS 2
#define DBG_RHS 1
char * symbol_names[256];
/* -------------- variables used for special commands --------------------- */
s16 g_enable_output; /* Enable normal ries output */
s16 g_eval_expr;
/* For the --find-expression command */
#define MAX_FIND_EXPR 16
s16 g_num_find_expr;
char * g_find_expr[MAX_FIND_EXPR];
/* -------------- prototypes ---------------------------------------------- */
void show_version(void);
void brief_help(void);
char * file_read(const char * filename);
void delimit_args(const char *rawbuf, int * nargs, char * * * argv);
double gettime(void);
void inittime(void);
void thrash_check(long alloced);
char * my_alloc(u32 size);
void ms_push(metastack *ms, double x, double dx);
double ms_pop(metastack *ms, double *diff);
double ms_peek(metastack *ms, double *diff, s16 *sptr);
void ms_undo(metastack *ms);
char * err_string(s16 err);
s16 exec(metastack *ms, symbol op, s16 *undo_count, s16 do_dx);
s16 infix_1(symbol * expr, char * term, symbol * t_op);
void infix_preproc(symbol * expr, symbol * out);
void init_symbol_names(void);
s16 infix_expand(char * input, char * output);
s16 postfix(symbol * expr, char * term);
s16 postfix_formatter(symbol * expr, char * out, s16 maxlen);
s16 complexity(symbol * expr);
s16 eval(symbol * expr, s16 contains_x, double * val, double * dx,
s16 *sptr, s16 show_work);
s16 newton(symbol * lhs, symbol * rhs, double *root, double *diff_dx);
s16 cv_simplify(symbol * lhs, symbol * rhs, double *root, double *diff_dx);
void defsym_used(symbol * expr);
void describe_symbols(void);
void print_end(void);
void check_exit(int e);
void report_match(expr *lhs, expr *rhs, pe *exm, double score);
int check_match(expr * lhs, expr * rhs);
expr * bt_first(expr * tree);
u32 bt_depth(expr * it);
expr * bt_prev(expr *it);
expr * bt_next(expr *it);
void check_sides(expr * it);
void check_exact_match(double x, double dx, pe *ex, expr * it);
s16 bt_insert(double x, double dx, pe *ex, s16 * res1);
int canonval(pe * bpe, metastack * ms, double *p_x, double *p_dx,
s16 * muc_ptr);
void decanon(metastack * ms, s16 muc);
u32 ge_2(form *base, pe *bpe, s16 e_minw, s16 e_maxw, metastack *ms);
u32 ge_1(form *base, s16 e_minw, s16 e_maxw);
u32 gf_1(form *base, s16 minw, s16 maxw);
u32 gen_forms(s16 minw, s16 maxw);
void def_amkey(char * syms, s16 mask);
void define_amkeys(void);
void add_symbol(symbol sym, symbol type, s16 weight, char *forth,
char *infix, char * def_terse, char * def_normal, char * description);
void show_symset(void);
void add_rule(symbol * symset, symbol sym, s16 mask);
void init_numerics(void);
void init1(void);
void init2(void);
char * pa_next_peek(void);
int pa_next_isparam(void);
char * pa_get_arg(void);
char * pa_stk_pop(void);
int parse_args(int nargs, char *argv[]);
int main(int nargs, char *argv[]);
/* -------------- functions ----------------------------------------------- */
void show_version(void)
{
printf(
"ries version of %s, Copyright (C) 2000-2012 Robert P. Munafo\n",
RIES_VERSION);
printf("%s",
"This is free software; see the source for copying conditions. There is NO\n"
"warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.\n"
);
}
void brief_help(void)
{
printf("%s",
"Usage:\n"
" ries [options] target-value\n"
"\n"
"Target value is required and may be any number. Options include:\n"
" -l3 Search further (``level-3 search'')\n"
" -x Show matched values as ``x = value'' rather than ``x = T + epsilon''\n"
" -N+-/ Do not use symbols +, - or /\n"
" -F0 Show expressions in native (compact postfix) format\n"
"\n"
"There are many more options; get the full manual at www.mrob.com/ries\n"
);
}
#define FILE_READ_SIZE 1024
#define FILE_READ_MAX 10*1024*1024
char * file_read(const char * filename)
{
FILE * in;
in = fopen(filename, "rb");
if (in == NULL) {
/* Try appending ".ries" */
char * name_ext;
int nc;
nc = asprintf(&name_ext, "%s.ries", filename);
if (nc > 0) {
in = fopen(name_ext, "rb");
}
}
if (in == NULL) {
fprintf(stderr, "%s: Could not open '%s' or '%s.ries' for reading.\n"
"\n\n", g_argv0, filename, filename);
brief_help();
exit(-1);
}
int buf_sz = FILE_READ_SIZE;
char * base_ptr = (char *) malloc(buf_sz);
int t_len = 0;
while (!feof(in) && !ferror(in)) {
if (t_len + FILE_READ_SIZE > buf_sz) {
/* The next read might go beyond the allocated memory, so we need to
reallocate */
if (buf_sz > FILE_READ_MAX) break;
buf_sz = buf_sz * 2;
base_ptr = (char*)realloc(base_ptr, buf_sz);
}
{
/* Now read a little bit more */
char * p = base_ptr + t_len;
t_len += fread(p, 1, FILE_READ_SIZE, in);
}
}
fclose(in);
/* Reallocate again to free up the memory we didn't use. */
base_ptr = (char*)realloc(base_ptr, t_len + 1);
/* It needs a trailing null byte */
base_ptr[t_len] = 0;
return base_ptr;
} // End of file_read
/* "A RIES argument-file is a sequence of one or more non-blank words
separated by blanks." */
void delimit_args(const char *rawbuf, int * nargs, char * * * argv)
{
int n;
char ** av;
n = 0;
av = 0;
if (nargs) { *nargs = n; }
if (argv) { *argv = av; }
if (rawbuf) {
unsigned char * p;
n = 0;
/* Scan for and canonicize whitespace. */
p = (unsigned char *) rawbuf;
while(*p) {
/* Skip any leading space */
while(*p && ((*p <= ' ') || (*p == '\177'))) { *p = ' '; p++; }
if (*p == '#') {
/* Comment delimiter. Change everything to space until we get to
a control character, which we assume is an end of line */
while(*p && ((*p == '\t') || (*p >= ' '))) { *p = ' '; p++; }
} else if (*p && (*p > ' ')) {
/* A word that doesn't start with '#' is an arg; skip it */
while(*p && (*p > ' ')) { p++; }
}
/* We are now at a null or a delimiter; loop can now continue. */
}
/* Count up the strings */
p = (unsigned char *) rawbuf;
while(*p) {
/* Skip any leading space */
while(*p == ' ') { p++; }
/* If there is an arg, count it */
if (*p) {
n++;
/* Skip the nonspace */
while(*p && (*p != ' ')) { p++; }
}
/* We are now at a null or a blank space; loop can now continue. */
}
/* Now n is the number of args, and we can allocate the argv */
av = malloc( ((long) n) * sizeof(char *) );
if (av == 0) {
fprintf(stderr, "%s: Cannot alloate argv block.\n", g_argv0);
exit(-1);
}
if (nargs) { *nargs = n; }
if (argv) { *argv = av; }
/* Set pointers, marking with nulls as we go */
p = (unsigned char *) rawbuf;
while(*p) {
/* Skip any leading space and turn it into nulls */
while(*p == ' ') { *p = 0; p++; }
/* If there is an arg, count it */
if (*p) {
/* Save a pointer to this string, with (paranoid) check to avoid
overwriting the argv */
if (n > 0) {
*av = (char *) p;
av++;
n--;
}
/* Skip the nonspace */
while(*p && (*p != ' ')) { p++; }
}
/* We are now at a null or a blank space; loop can now continue. */
}
}
} // End of delimit_args
double tod_start; /* gettime() value at program start */
/* gettime is the only really OS-specific routine in this whole program.
It returns how long the program has been running, measured in tenths
of a second. For example, it returns 14 if the program has run for
1.4 seconds. This is actual elapsed "clock-on-the-wall" time, not
necessarily a measure of how much time the CPU has spent working on
RIES. For example, if you close your laptop while RIES is running it
will include the time the system was "asleep" in its final printout
of time used. */
double gettime(void)
{
struct timeval tod_record;
double t_now;
/* There is a block of code inside an #ifdef above (search for
"---gettimeofday---") that sets up the appropriate include files
and/or defines a gettimeofday() function based on compile-time flags
like _WIN32. See the block-comment there for more details. */
gettimeofday(&tod_record, 0);
t_now = ((double) (tod_record.tv_sec))
+ ((double) (tod_record.tv_usec)) / 1.0e6;
return(t_now - tod_start);
}
double g_min_memory;
double g_max_memory;
double g_avg_alloc_rate, g_good_alloc_rate;
double memstat_when;
u32 memstat_where, rate_where;
double my_alloc_when, g_ttl_elapsed;
long tc_alloced;
long g_rate_increase_run, g_max_rir;
double g_decay_slug;
int g_last_thrash;
void inittime(void)
{
double now;
/* Init the global so gettime() has a valid value to subtract from
its new measurement */
tod_start = 0;
/* With tod_start set to 0 gettime will return the absolute current time */
now = gettime();
/* Set the global to this value. */
tod_start = now;
/* From now on, calls to gettime() will measure time from the moment we
made the preceding gettime() call. */
g_avg_alloc_rate = g_good_alloc_rate = 0.0;
my_alloc_when = memstat_when = gettime();
tc_alloced = 0; memstat_where = rate_where = 0;
g_cv_calls = 0; g_exec_calls = 0;
g_ttl_elapsed = 0; g_rate_increase_run = g_max_rir = 0;
g_last_thrash = 0;
}
s32 bitcount(s32 x)
{
x = ((x & 0xAAAAAAAA) >> 1) + (x & 0x55555555);
x = ((x & 0xCCCCCCCC) >> 2) + (x & 0x33333333);
x = ((x & 0xF0F0F0F0) >> 4) + (x & 0x0F0F0F0F);
x = ((x & 0xFF00FF00) >> 8) + (x & 0x00FF00FF);
x = (x >> 16) + (x & 0x0000FFFF);
return(x);
}
double g_mem_bad_ratio = 2.0;
/* Try to detect if the system is "thrashing" (swapping memory pages in
and out from the hard drive) by measuring the elapsed time and comparing to
an estimate of how much time our computation should have taken. */
void thrash_check(long alloced)
{
double now, alloc_elapsed, predicted_rate;
tc_alloced += alloced;
if (tc_alloced < (1024L*1024L)) {
/* It's not time to benchmark memory yet... */
return;
}
/* Look at how much time has passed since the last time we were here */
now = gettime();
alloc_elapsed = now - my_alloc_when;
/* Reset the timer for use next time */
my_alloc_when = now;
/* Convert to units of seconds per megabyte */
double this_alloc_rate = alloc_elapsed * 1.0e6 / ((double) tc_alloced);
g_ttl_elapsed += this_alloc_rate;
/* Determine the timebase we should use for our decaying averages: Short
halflife at first, then longer. */
double decay_numer, decay_denom;
if (mem_used_KiB < 20480) {
decay_numer = 0.9;
} else {
decay_numer = 0.993;
}
decay_denom = 1.0 - decay_numer;
/* Decaying average with half-life of roughly 100 samples:
0.5 ^ 0.01 = 0.993092... */
if (g_avg_alloc_rate == 0) {
g_avg_alloc_rate = this_alloc_rate;
} else {
g_avg_alloc_rate = (decay_numer * g_avg_alloc_rate)
+ (decay_denom * this_alloc_rate);
}
if (this_alloc_rate < g_avg_alloc_rate) {
/* This sample brought down the average, so it should be counted
towards the "good performance level" statistic. */
if (g_good_alloc_rate == 0) {
g_good_alloc_rate = this_alloc_rate;
} else {
g_good_alloc_rate = (decay_numer * g_good_alloc_rate)
+ (decay_denom * this_alloc_rate);
}
}
rate_where = mem_used_KiB;
/* Find out how many times exec() was called, which is a good measure of
how much computation we've done and therefore how much time "should
have" elapsed */
predicted_rate = (((double) g_exec_calls) + 1.0) / 4.0e6
+ (((double) g_cv_calls) + 1.0) / 2.5e6;
/* From the prediction and the actual elapsed time, make a measure of
sluggishness. */
double sluggish_ratio;
sluggish_ratio = g_ttl_elapsed / predicted_rate;
if (debug_M) {
printf(
"%4ld MiB, ex%8ld, cv%7ld: %6.3g/%6.3g = %6.3g",
((mem_used_KiB >> 9) + 1) >> 1,
g_exec_calls, g_cv_calls,
g_ttl_elapsed, predicted_rate, sluggish_ratio);
}
/* Reset the raw counters for the next sample */
g_cv_calls = 0;
g_exec_calls = 0;
g_ttl_elapsed = 0;
/* Count how many times in a row the new sample is bigger than the
decaying average */
if (sluggish_ratio > g_decay_slug * g_mem_bad_ratio) {
if (debug_M) { printf(" >%6.3g", g_decay_slug); }
g_rate_increase_run++;
} else {
if (debug_M) { printf(" %6s", ""); }
g_rate_increase_run = 0;
}
/* g_max_rir keeps track of the longest "run" since the last time
g_max_rir was reset */
if (g_rate_increase_run > g_max_rir) {
g_max_rir = g_rate_increase_run;
}
/* Keep a bitmap of the history of recent extreme incidents */
if (sluggish_ratio > g_decay_slug * g_mem_bad_ratio) {
g_last_thrash |= 1;
}
g_last_thrash <<= 1;
g_last_thrash &= ((1 << 10) - 1);
/* Decay-average the slug ratio */
if (g_decay_slug == 0) {
g_decay_slug = sluggish_ratio;
} else {
g_decay_slug = (decay_numer * g_decay_slug)
+ (decay_denom * sluggish_ratio);
}
if (mem_used_KiB >= memstat_where + 10240) {
/* It's time to look at the stats and make a guess */
if (debug_M) {
// To evaluate these measurements, run RIES in different memory
// environments using a command like:
// ries 2.50631415926535897932 -l8 -DM
printf(" avg %6.3g", g_decay_slug);
printf(", run %ld", g_max_rir); g_max_rir = 0;
printf(" (%6.3g ; %6.3g sec/MB)", g_avg_alloc_rate, g_good_alloc_rate);
if (((double)mem_used_KiB) * 1024.0 > g_min_memory) {
/* Here we would check for a "getting really slow" trend and quit
if it is exceeded */
if (bitcount(g_last_thrash) >= 3) {
/* Okay, pull the plug. */
printf("\nExiting now because memory has gotten very slow.\n");
print_end();
exit(0);
}
}
}
fflush(stdout);
memstat_when = now;
memstat_where = mem_used_KiB;
}
if (debug_M) { printf("\n"); }
tc_alloced = 0;
}
/* All allocation is done in fixed-size blocks, and nothing is ever
* deallocated, so it's very easy to manage. This is your typical
* cached blocked allocate funtion */
char * my_alloc(u32 size)
{
s16 do_alloc;
char * rv;
/* assume the worst */
rv = 0;
/* first, see if there's a block with enough space */
do_alloc = 0;
if (freepool) {
if (freesize >= size) {
/* we're okay */
} else {
do_alloc = 1;
}
} else {
do_alloc = 1;
}
/* if we need to allocate, do so */
if (do_alloc) {
freesize = 0;
freepool = (char *) malloc((size_t) ALLOC_SIZE);
/* did we get any? */
if (freepool) {
freesize = ALLOC_SIZE;
mem_used_KiB += (ALLOC_SIZE / 1024L);
}
thrash_check(ALLOC_SIZE);
if (((double)mem_used_KiB) * 1024.0 > g_max_memory) {
printf("Stopping now because %g bytes of memory have been used.\n",
((double)mem_used_KiB) * 1024.0);
print_end();
exit(0);
}
}
/* if we have a block now, we can allocate from it */
if (freepool) {
rv = freepool;
freepool += size; /* %%% how do I find out the alignment
requirement for pointers to structs? */
freesize -= size;
}
return(rv);
}
/* metastack routines
The following illustrate the contents of the stack and undo
list for some sample expressions. The expressions read from top
to bottom, and there is one line for each push or pop. (There
are no undo's illustrated here; an undo would be just moving up
to a previous line). At the end of each example is the total
number of steps: This is the number of items in the undo list.
expression: 1+(2+(4+8)) expression: sqrt(sqrt((2^2)^2)^2)^2
postfix: 1248+++ postfix: 2ssqsqs
op stack --undo-stack-- op stack --undo-stack--
1 1 2 2
2 1 2 s - 2
4 1 2 4 4 2
8 1 2 4 8 s - 2 4
+ 1 2 4 8 16 2 4
1 2 8 4 q - 2 4 16
1 2 12 8 4 4 2 4 16
+ 1 2 8 4 12 s - 2 4 16 4
1 8 4 12 2 16 2 4 16 4
1 14 8 4 12 2 q - 2 4 16 4 16
+ 1 8 4 12 2 14 4 2 4 16 4 16
- 8 4 12 2 14 1 s - 2 4 16 4 16 4
15 8 4 12 2 14 1 16 2 4 16 4 16 4
total steps: 13 total steps: 13
max stack: 4 max stack: 1
max undo stack: 6 max undo stack: 6
expression: 1/(3-1/(-2))
postfix: 11nr-r
op stack --undo-stack--
3 3
2 3 2
n 3 2
3 -2 2
r 3 2 -2
3 -0.5 2 -2
- 3 2 -2 -0.5
- 2 -2 -0.5 3
3.5 2 -2 -0.5 3
r - 2 -2 -0.5 3 3.5
0.286 2 -2 -0.5 3 3.5
total steps: 11
max stack: 2
max undo stack: 5
*/
void ms_init(metastack *ms)
{
ms->uvp = 0;
ms->msp = 0;
ms->sp = 0;
}
/* ms_push does a standard PUSH operation. */
void ms_push(metastack *ms, double x, double dx)
{
s16 sp, msp;
/* push x */
sp = ms->sp;
(ms->ds)[sp] = dx;
(ms->s)[sp] = x;
/* remember */
msp = ms->msp;
ms->ms[msp] = MSO_PUSH;
if (debug_m) { printf("push %d %g (%g) '%d'\n", sp, x, dx, msp); }
/* for a push, we don't need to add any undo values */
sp++;
ms->sp = sp;
msp++;
ms->msp = msp;
}
/* ms_pop does a standard POP operation. */
double ms_pop(metastack *ms, double *diff)
{
s16 sp, msp, uvp;
double rv, drv;
/* pop a value */
sp = ms->sp;
sp--;
rv = (ms->s)[sp];
drv = (ms->ds)[sp];
if (diff) {
*diff = drv;
}
ms->sp = sp;
/* remember the action and the data */
msp = ms->msp;
ms->ms[msp] = MSO_POP;
/* save the popped values in the undo list */
uvp = ms->uvp;
if (debug_m) {
printf("pop %d %g (%g) '%d' .%d.\n", sp, rv, drv, msp, uvp);
}
ms->udv[uvp] = drv;
ms->uv[uvp] = rv;
msp++;
ms->msp = msp;
uvp++;
ms->uvp = uvp;
return rv;
}
/* ms_peek just lets you see what's on the top of the stack. */
double ms_peek(metastack *ms, double *diff, s16 *sptr)
{
s16 sp;
double rv, drv;
sp = (ms->sp)-1;
rv = ms->s[sp];
drv = ms->ds[sp];
if (diff) {
*diff = drv;
}
if (sptr) {
*sptr = sp;
}
if (debug_m) { printf("peek %d %g (%g)\n", sp, rv, drv); }
return (rv);
}
/* ms_undo performs a metastack UNDO operation -- it returns the stack to the
state it was in before the most recent PUSH or POP. */
void ms_undo(metastack *ms)
{
s16 msp, opcode;
/* pop the opcode */
msp = ms->msp;
opcode = ms->ms[--msp];
ms->msp = msp;
if (opcode == MSO_PUSH) {
/* to undo a PUSH is easy -- just pop the SP. */
ms->sp--;
if (debug_m) { printf("undo-push\n"); }
} else {
s16 uvp, sp;
double uv, udv;
/* undo a POP. This is like a PUSH except we have to retrieve the value
from the undo data. */
uvp = ms->uvp;
uvp--;
uv = ms->uv[uvp];
udv = ms->udv[uvp];
ms->uvp = uvp;
sp = ms->sp;
ms->ds[sp] = udv;
ms->s[sp] = uv;
sp++;
ms->sp = sp;
if (debug_m) { printf("undo-pop\n"); }
}
}
/* Cody, William J., Jr., and Waite, William. Software Manual for the
Elementary Functions. Prentice-Hall (Englewood Cliffs, New Jersey, 1980).
*/
/* the man page for atan2 doesn't specify its behavior when both arguments
are zero, so just to be safe I am defining it myself. */
double arctan(double a, double b)
{
double rv;
if ((a == k_0) && (b == k_0)) {
rv = k_0;
} else {
rv = atan2(a,b);
}
return rv;
}
/* sinh
sinh(x) = (e^x - e^-x) / 2
*/
/* complex natural logarithm
cln(z) = ln(cabs(z)) + phase(z) i */
/* complex exponential
cexp(z) = exp(a) (cos b + i sin b) */
/* complex sine (and sinh)
csin(z) = (cexp(i z) - cexp(-i z)) / 2i
csin(i z) == i csinh(z)
csin(z) = (cexp(i z) - cexp(-i z)) / 2i
= (cexp(ai - b) - cexp(b - ai)) / 2i
= (e^-b (cos a + i sin a) - e^b (cos a - i sin a)) / 2i
= (e^-b cos a + i e^-b sin a - e^b cos a + i e^b sin a) / 2i
= i 1/2 e^-b cos a - 1/2 e^-b sin a - i 1/2 e^b cos a - 1/2 e^b sin a
a + bi = i 1/2 e^-b cos a - 1/2 e^-b sin a - i 1/2 e^b cos a - 1/2 e^b sin a
a = - 1/2 e^-b sin a - 1/2 e^b sin a
bi = i 1/2 e^-b cos a - i 1/2 e^b cos a
a = - 1/2 e^-b sin a - 1/2 e^b sin a = - sin a cosh b
b = 1/2 e^-b cos a - 1/2 e^b cos a = - cos a sinh b
b = - cos a (e^b - e^-b)/2
b 2 / (e^-b - e^b) = - cos a
-2 b / (e^-b - e^b) = cos a
2 b / (e^b - e^-b) = cos a
a = 2 pi K + cos' (2 b / (e^b - e^-b))
a = - sin a (e^b + e^-b)/2
-1 = sin(a)/a (e^b + e^-b)/2
-2/(e^b + e^-b) = sin(a)/a
- bell(b) = sin(a)/a
b = +/- bell' (-sin(a)/a) [defined only when sin(a)/a < 0]
+/- bell'(-sin(a)/a) = - cos a sinh(+/- bell'(sin(a)/a))
2 pi K + cos' (2 b / (e^b - e^-b)) = - cosh b sqrt(1 - (2 b / (e^b - e^-b))^2)
*/
/* More elaborate functions follow. These are the ones that do not
come directly out of the closure of the simple operators + - * / ^ v
(as is the case for log, exp, sin and cos and their related functions)
but from new concepts such as the integral of a function, etc.
This includes the standard distribution, the Gamma function,
Bessel functions, etc. */
/* Gamma function and generalized factorial
The best way to provide Gamma in RIES is by the Lanczos approximation.
Source is in .../pub/ries/lanczos-gamma.rhtf
To calculate the derivative of gamma requires the digamma function.
Source is in .../pub/ries/gamma-sources.hid
When I go to quad precision, new Lanczos coefficients need to be
generated; a program to do this is in
.../proj/rily-pi/tars/viktor-toth-lanczos-gamma-function.tgz
An unresolved issue is what to do about quad-precision digamma. The de
Moivre coefficients are probably not good enough. I might be able to
get away with it due to the fact that iterated Newton works even when
the derivative functions have error. Digamma functions can be tested
with Gauss' digamma theorem, which gives an exact formula for all
"proper fractions" m/k (i.e., m and k are integers with 0=1; i--) {
sum += lct[i] / (z + ((double) i));
}
sum += lct[0];
printf("ls2p %7g l(b^e) %7g -b %7g l(s) %7g\n", ln_sqrt_2_pi,
log(base)*(z+0.5), -base, log(sum));
// Gamma[z] = Sqrt(2*Pi) * sum * base^[z + 0.5] / E^base
return ((ln_sqrt_2_pi + log(sum)) - base) + log(base)*(z+0.5);
}
double lanczos_gamma(double z)
{
return(exp(lanczos_ln_gamma(z)));
}
// The digamma function is the derivative of gammaln.
//
// Reference:
// J Bernardo,
// Psi ( Digamma ) Function,
// Algorithm AS 103,
// Applied Statistics,
// Volume 25, Number 3, pages 315-317, 1976.
//
// From http://www.psc.edu/~burkardt/src/dirichlet/dirichlet.f
// (with modifications for negative numbers and extra precision)
//
double digamma(double x)
{
double neginf = -INFINITY;
static const double c = 12,
digamma1 = -0.57721566490153286,
trigamma1 = 1.6449340668482264365, // pi^2/6
s = 1e-6,
s3 = 1./12,
s4 = 1./120,
s5 = 1./252,
s6 = 1./240,
s7 = 1./132,
s8 = 691./32760,
s9 = 1./12,
s10 = 3617./8160;
double result;
// Illegal arguments
if((x == neginf) || isnan(x)) {
return NAN;
}
// Singularities
if((x <= 0) && (floor(x) == x)) {
return neginf;
}
// Negative values
// Use the reflection formula (Jeffrey 11.1.6):
// digamma(-x) = digamma(x+1) + pi*cot(pi*x)
//
// This is related to the identity
// digamma(-x) = digamma(x+1) - digamma(z) + digamma(1-z)
// where z is the fractional part of x
// For example:
// digamma(-3.1) = 1/3.1 + 1/2.1 + 1/1.1 + 1/0.1 + digamma(1-0.1)
// = digamma(4.1) - digamma(0.1) + digamma(1-0.1)
// Then we use
// digamma(1-z) - digamma(z) = pi*cot(pi*z)
//
if(x < 0) {
return digamma(1-x) + M_PI/tan(-M_PI*x);
}
// Use Taylor series if argument <= S
if(x <= s) return digamma1 - 1/x + trigamma1*x;
// Reduce to digamma(X + N) where (X + N) >= C
result = 0;
while(x < c) {
result -= 1/x;
x++;
}
// Use de Moivre's expansion if argument >= C
// This expansion can be computed in Maple via asympt(Psi(x),x)
if(x >= c) {
double r = 1/x, t;
result += log(x) - 0.5*r;
r *= r;
// this version for lame compilers
t = (s5 - r * (s6 - r * s7));
result -= r * (s3 - r * (s4 - r * t));
}
return result;
}
For higher precision Gamma, I can use Bill Gosper's proposal:
[
From: "Bill Gosper"
To: "math-fun mailing list"
Date: Sun, 8 Jan 2012 23:24:37 -0800
Subject: Re: [math-fun] Gamma function (formulas collection)
I missed where Warren's paper probably says that, for "unrestricted"
(arbitrary precision) numerics, the divergence of the log Gamma and
polygamma series is rather irrelevant. A plausible algorithm is simply
to add some positive integer n to z so that the Stirling series z+n
swoops within the desired accuracy. Then divide by (z+1)(z+2)...(z+n).
Windschitl's and Warren's improvements make this even more attractive.
Looking to provide Gamma numerics for Robert's RIES, I accidentally
overkilled the problem by Remezing up
f[z]:=(0.011987912746518014561506488044010 +
z^2 (-0.028678142684262583619372798888834 +
z^2 (-0.10376325088888750555218325558841 +
z^2 (-0.034612435282031905677869242849747 -
0.0024691358024691358024691358024691 z^2))))/(z^7 *
(45.984169004371332413538940188341 +
z^2 (54.768197084214553421696753890699 +
z^2 (14.960893432079977923321408051895 + z^2))))
Then plotting
z! /(Sqrt[2 Pi] (E^(-f[z] - 4 z) z^(2 + 6 z) Sinh[1/z]^(2 z))^(1/4) -1
9 < z < oo shows maximum error 6*10^-26 (at only one ripple because I
neglected the weight function). So up to nine divides could be
required for this method. But RIES only needs 10^-17, for which f[z]
could be a lot simpler and n smaller than 9.
To squeeze some actual fun out of this: Why is f an odd function?
--rwg
PS, I wasn't confused in that 2001 Remez posting--just confusing. In
the next sentences I explained why truncating the Chebys doesn't
minimax. I was just using "Chebychef optimal" to mean the best you
could do by subtracting off Chebys.
]
binomial coefficients
The binomial coefficient "n over k" is:
bincoef(n,k) = n! / ((n-k)! k!)
= exp(lngamma(n) - (lngamma(n-k) + lngamma(k)))
Riemann Zeta function: see the implementation in .../zeta/ken-takusagawa
Jacobi elliptic functions:
Jacobi amplitude am(u,k) inverse of the Elliptic Integral of the First Kind
(in MMa: JacobiAmplitude[u, k^2])
sn(u, k) = sin(am(u, k))
cn(u, k) = cos(am(u, k))
dn(u, k) = = sqrt(1-(k sn(u,k))^2) = d/du am(u, k)
Two versions are in function-sources.txt; see also:
http://en.wikipedia.org/wiki/Theta_function
http://code.google.com/p/elliptic/source/browse/trunk/ellipj.m
http://sourceforge.net/projects/asymptote/forums/forum/409349/topic/4725092
*/
#define ERR_EXEC_DIV_ZERO -1
#define ERR_EXEC_ROOT_NEG -2
#define ERR_EXEC_LOG_NEG -3
#define ERR_EXEC_OVERFLOW -4
#define ERR_EXEC_TRIG_RANGE -5
#define ERR_EXEC_TRIG_LOW_DX -6
#define ERR_EXEC_SIG_LOSS -7
#define ERR_EXEC_POW_NEG_BASE -8
#define ERR_EXEC_LOG_BAD_BASE -9
#define ERR_EXEC_ILLEGAL_SYMBOL -10
#define ERR_NEWTON_ZERO_DX -11
#define ERR_NEWTON_NO_CONVERGE -12
#define ERR_EVAL_TOO_LONG -13
#define ERR_EVAL_METASTACK_OVERFLOW -14
#define ERR_EVAL_UNKNOWN_SEFT -15
#define ERR_EVAL_STACK_OVERFLOW -16
#define ERR_EVAL_STACK_UNDERFLOW -17
typedef struct err_str {
s16 val;
char * str;
} err_str;
#define NUM_ERRORS 17
err_str error_strings[NUM_ERRORS] = {
ERR_EXEC_DIV_ZERO, "Divide by zero",
ERR_EXEC_ROOT_NEG, "Root of a negative value",
ERR_EXEC_LOG_NEG, "Logarithm of a negative value",
ERR_EXEC_OVERFLOW, "Overflow",
ERR_EXEC_TRIG_RANGE, "Trigonometric argument out of range",
ERR_EXEC_TRIG_LOW_DX, "Trigonometric argument generates near-constant",
ERR_EXEC_SIG_LOSS, "Loss of significance",
ERR_EXEC_POW_NEG_BASE, "Power of a negative base",
ERR_EXEC_LOG_BAD_BASE, "Logarithm to base 1 or negative base",
ERR_EXEC_ILLEGAL_SYMBOL, "Illegal symbol",
ERR_NEWTON_ZERO_DX, "Zero derivative in Newton iteration",
ERR_NEWTON_NO_CONVERGE, "Newton iteration did not converge",
ERR_EVAL_TOO_LONG, "Expression is too long",
ERR_EVAL_METASTACK_OVERFLOW, "Metastack overflow",
ERR_EVAL_UNKNOWN_SEFT, "Symbol of unknown seft",
ERR_EVAL_STACK_OVERFLOW, "Stack overflow",
ERR_EVAL_STACK_UNDERFLOW, "Stack underflow",
};
char * err_string(s16 err)
{
s16 j;
for(j=0; j k_eXlim) {
return ERR_EXEC_OVERFLOW;
}
if (a < k_sig_loss) {
/* Loss-of-significance error, e.g. "e^0.0001" */
return ERR_EXEC_SIG_LOSS;
}
rv = exp(a);
if (do_dx) {
drv = rv * da;
}
ms_push(ms, rv, drv); *undo_count = 2;
break;
case 'S' : /* sine */
/* d/dx sin(u) = cos(u) */
a = ms_pop(ms, &da); *undo_count = 1;
a *= k_sincos_arg_scale;
if ((a >= k_2pi) || (-a >= k_2pi)) {
/* This is to eliminate nonsense "solutions" like "sin(X^9) = 1/4" */
return ERR_EXEC_TRIG_RANGE;
}
rv = sin(a);
if (fabs(rv) > k_sin_clip) {
/* This is to eliminate stuff like "sin(pi/2 + 0.00001) = 1"
%%% This is partly redundant with testing dx < k_vanished_dx.
If x=pi/2+0.00001 then the derivative will be really small and
it's a meaningless solution. On the other hand, sin(pi/2 + 0.00001)
itself is sufficiently precise (there is no loss of significant
figures) and might be of interest. */
return ERR_EXEC_TRIG_LOW_DX;
}
if (fabs(cos(a)) < k_sig_loss) {
/* Significance is lost when the value of sin or cos is near +- 1 */
return ERR_EXEC_SIG_LOSS;
}
if (fabs(rv) < (fabs(a) * k_sig_loss)) {
/* Loss-of-significance error, e.g. "sin(pi+0.0001)" */
return ERR_EXEC_SIG_LOSS;
}
if (do_dx) {
drv = cos(a) * da;
if (fabs(drv) < (fabs(da) * k_sig_loss)) {
return ERR_EXEC_SIG_LOSS;
}
}
ms_push(ms, rv, drv); *undo_count = 2;
break;
case 'C' : /* cosine */
/* d/dx cos(u) = - sin(u) */
a = ms_pop(ms, &da); *undo_count = 1;
a *= k_sincos_arg_scale;
if ((a >= k_2pi) || (-a >= k_2pi)) {
/* This is to eliminate nonsense "solutions" like "sin(X^9) = 1/4" */
return ERR_EXEC_TRIG_RANGE;
}
rv = cos(a);
if (fabs(rv) < (fabs(a) * k_sig_loss)) {
/* Loss-of-significance error, e.g. "cos(pi/2.0001)" */
return ERR_EXEC_SIG_LOSS;
}
if (fabs(sin(a)) < k_sig_loss) {
/* Significance is lost when the value of sin or cos is near +- 1 */
return ERR_EXEC_SIG_LOSS;
}
if (fabs(rv) > k_sin_clip) {
/* This is to eliminate stuff like "cos(0.00001) = 1" */
return ERR_EXEC_TRIG_LOW_DX;
}
if (do_dx) {
drv = k_0 - (sin(a) * da);
}
ms_push(ms, rv, drv); *undo_count = 2;
break;
case 'T' : /* tangent */
/* d/dx tan(u) = (1 + tan^2(u)) du */
a = ms_pop(ms, &da); *undo_count = 1;
a *= k_sincos_arg_scale;
if ((a >= k_2pi) || (-a >= k_2pi)) {
/* This is to eliminate nonsense "solutions" like "tan(X^9) = 1/4" */
return ERR_EXEC_TRIG_RANGE;
}
rv = tan(a);
if (fabs(rv) > (1 / k_sig_loss)) {
/* Loss-of-significance error, e.g. "tan(pi/2.0001)" */
return ERR_EXEC_SIG_LOSS;
}
if ((a!=0) && (fabs(rv/a) < k_sig_loss)) {
/* Significance is lost when the value of tangent is near zero
(except when the argument is also near zero) */
return ERR_EXEC_SIG_LOSS;
}
if (do_dx) {
drv = (1.0 + rv*rv) * da;
}
ms_push(ms, rv, drv); *undo_count = 2;
break;
/* seft 'c' ( arg1 arg2 -- val ) symbols */
case '-' :
b = ms_pop(ms, &db);
a = ms_pop(ms, &da); *undo_count = 2;
/* this operator shares the loss-of-significance tests with '+' */
b = - b; db = - db;
goto add_common;
case '+' :
b = ms_pop(ms, &db);
a = ms_pop(ms, &da); *undo_count = 2;
add_common: ;
rv = a + b;
if ((fabs(a) < (fabs(b) * k_sig_loss))
|| (fabs(b) < (fabs(a) * k_sig_loss)) ) {
/* loss-of-significance error, e.g. "1 + e^(5^2)"
In addition to the risk of meaningless tautologies like
"x^2 = 2" followed by "x^2+e^(e^pi) = e^(e^pi)+2",
this test is also useful for pruning: the expression
"1 + e^(5^2)" is useless to us because there is no way to
distinguish it from the simpler expression "e^(5^2)" */
return ERR_EXEC_SIG_LOSS;
}
if ((fabs(rv) < (fabs(a) * k_sig_loss))
|| (fabs(rv) < (fabs(b) * k_sig_loss)) ) {
/* another loss-of-significance error, e.g. "1e40 - (1e40+1)" */
return ERR_EXEC_SIG_LOSS;
}
if (do_dx) {
drv = da + db;
}
ms_push(ms, rv, drv); *undo_count = 3;
break;
case '*' :
/* d/dx u v = v du + u dv */
b = ms_pop(ms, &db);
a = ms_pop(ms, &da);
if (do_dx) {
drv = (b * da) + (a * db);
}
rv = a * b;
ms_push(ms, rv, drv); *undo_count = 3;
break;
case '/' :
/* d/dx u/v = (v du - u dv) / v^2 */
b = ms_pop(ms, &db); *undo_count = 1;
if (b == k_0) {
return ERR_EXEC_DIV_ZERO;
}
a = ms_pop(ms, &da);
if (do_dx) {
drv = ((b * da) - (a * db)) / (b * b);
}
rv = a / b;
ms_push(ms, rv, drv); *undo_count = 3;
break;
case '^' : /* a to the power of b */
/* d/dx u^v = v u^(v-1) du + ln(u) u^v dv */
b = ms_pop(ms, &db);
a = ms_pop(ms, &da); *undo_count = 2;
if (a <= k_0) {
/* This would give a complex answer except when b is an integer,
but in that case there will always be another expression that
gives the same value without needing to raise a negative number
to a power. */
return ERR_EXEC_POW_NEG_BASE;
}
if (fabs(b) < k_sig_loss) {
/* loss-of-significance error, e.g. "2^0.001" */
return ERR_EXEC_SIG_LOSS;
}
rv = pow(a, b);
if (do_dx) {
drv = rv * ( (b * da / a) + (log(a) * db) );
}
ms_push(ms, rv, drv); *undo_count = 3;
break;
case 'v' : /* the bth root of a, that is, a^(1/b) */
b = ms_pop(ms, &db); *undo_count = 1;
if (b == k_0) {
return ERR_EXEC_OVERFLOW;
}
if (fabs(b) > (1.0 / k_sig_loss)) {
/* loss-of-significance error, e.g. "100000,/2" */
return ERR_EXEC_SIG_LOSS;
}
a = ms_pop(ms, &da); *undo_count = 2;
if (a < k_0) {
return ERR_EXEC_ROOT_NEG;
}
rv = pow(a, k_1 / b);
if (do_dx) {
drv = rv * ( (da / (a * b)) - (log(a) * db / (b*b)) );
}
ms_push(ms, rv, drv); *undo_count = 3;
break;
case 'L' : /* log base b of a, that is, ln(a)/ln(b) */
b = ms_pop(ms, &db); *undo_count = 1;
if ((b <= k_0) || (b == k_1)) {
return ERR_EXEC_LOG_BAD_BASE;
}
if (fabs(1.0 - b) < k_sig_loss) {
/* loss-of-significance error, e.g. "log_0.999(2)" or "log_1.001(2)" */
return ERR_EXEC_SIG_LOSS;
}
a = ms_pop(ms, &da); *undo_count = 2;
if (a <= k_0) {
return ERR_EXEC_LOG_NEG;
}
if (fabs(a - 1.0) <= k_sig_loss) {
/* loss-of-significance error, e.g. "log_2(1.00001)" */
return ERR_EXEC_SIG_LOSS;
}
lb = log(b);
rv = log(a) / lb;
if (do_dx) {
drv = (da / (a * lb)) - (rv * db / (b * lb));
}
ms_push(ms, rv, drv); *undo_count = 3;
break;
#if 0
case 'A' : /* atan2 function (quadrant-correct arctangent) */
b = ms_pop(ms, &db);
a = ms_pop(ms, &da);
rv = arctan(a,b);
if (do_dx) {
/* %%% I need to verify this. It also needs to be transformed into
a form that won't lose accuracy when computed (consider b near 0)
probably need to use two forms: one for when fabs(a) > fabs(b),
and one for the other case. */
drv = (k_1 / (k_1 + (a*a / b*b))) * (b * da - a * db) / (b*b);
}
ms_push(ms, rv, drv); *undo_count = 3;
break;
#endif
default:
return ERR_EXEC_ILLEGAL_SYMBOL;
break;
}
if (debug_r) {
if (do_dx && sym_attrs[op].seft == 'c') {
printf("exec %8.5g", a);
printf(" (%8.5g)", da);
printf(" '%c'", op);
printf(" %8.5g", b);
printf(" (%8.5g)", db);
printf(" -> %10.6g", rv);
printf(" d/dx=%10.6g", drv);
} else {
if (sym_attrs[op].seft != 'a') {
printf("exec %14.10g", a);
if (do_dx) {
printf(" (%14.10g)", da);
}
}
printf(" '%c'", op);
if (sym_attrs[op].seft == 'c') {
printf(" %14.10g", b);
if (do_dx) {
printf(" (%14.10g)", db);
}
}
printf(" -> %14.10g", rv);
if (do_dx) {
printf(" d/dx=%14.10g", drv);
}
}
printf("\n");
}
return 0;
}
/* infix_1 pulls one full subexpression off the end of an expression,
converts it to infix, and writes the result into the supplied char *.
It is still in one-character-per-symbol format.
It also modifies the expression by moving its terminating 0 backwards.
It writes the term's final operator into t_op, which is to aid the caller
in determining whether precedence will require the use of parentheses.
infix_1 does not put parentheses around the term, however it
does parenthesize subterms within the term. */
s16 infix_1(symbol * expr, char * term, symbol * t_op)
{
s16 iptr, optr; /* input, output pointers */
symbol op;
char st_a[MAX_ELEN * 4]; /* subterm A */
symbol op_a; /* and its operator */
char st_b[MAX_ELEN * 4]; /* subterm B */
symbol op_b;
s16 swap;
char * term_a; /* for exchanging the two terms if we decide to do so */
char * term_b; /* for exchanging the two terms if we decide to do so */
symbol op_t; /* for swapping op_a and op_b */
char * s; /* for copying subterms to output */
s16 prec_a, prec_b; /* precedence flags for seft 'c' operators */
optr = 0;
/* go to the end */
iptr = 0;
while(expr[iptr]) {
iptr++;
}
/* check for underrun */
if (iptr == 0) {
return 1;
}
/* get the operator */
--iptr; op = expr[iptr]; expr[iptr] = 0;
*t_op = op;
/* check its type */
switch(sym_attrs[op].seft) {
case 'a':
/* oooh, this is easy */
term[optr++] = op;
/* we terminate the input string here. */
expr[iptr] = 0;
break;
case 'b':
/* we've got the op, get the term */
if(infix_1(expr, st_a, &op_a)) {
return 1;
}
swap = 0;
if (op == 's') {
swap = 1;
}
prec_a = 1;
if ((op == 'n') && (sym_attrs[op_a].seft != 'c')) {
prec_a = 0;
}
/* %%% what happens next should depend on op and op_a */
if (swap == 0) {
term[optr++] = op;
}
if (prec_a) {
term[optr++] = '(';
}
for(s = st_a; *s; s++) {
term[optr++] = *s;
}
if (prec_a) {
term[optr++] = ')';
}
if (swap) {
term[optr++] = op;
}
break;
case 'c':
/* get the terms */
if(infix_1(expr, st_b, &op_b)) {
return 1;
}
if(infix_1(expr, st_a, &op_a)) {
return 1;
}
/* default is to keep them in order */
term_a = st_a; term_b = st_b; swap = 0;
/* optional swaps for the commutative operators
if adding, try to put a constant at the end */
if ((op == '+') && (sym_attrs[op_a].seft == 'a') && (op_a != 'x')) {
swap = 1;
}
/* rules for multiplying */
if (op == '*') {
int a_has_x, b_has_x;
a_has_x = (strchr(term_a, 'x') != 0);
b_has_x = (strchr(term_b, 'x') != 0);
/* a bare seft-a symbol always precedes an expression
%%% this swap should not be done if the baresymbol is 'x' and
the expression has no 'x' */
if ((! a_has_x) && (op_b == 'x')) {
/* no swappy */
} else if ((sym_attrs[op_a].seft != 'a') && (sym_attrs[op_b].seft == 'a')) {
swap = 1;
} else if ((op_a == 'x') && (! b_has_x)) {
swap = 1;
}
/* with two bare symbols multiplied by each other, swap if the
latter is an integer. Note: the integer test of op_b implicitly
tests for the second term being seft-a. */
if ((sym_attrs[op_a].seft == 'a') && (op_b >= '0') && (op_b <= '9')) {
swap = 1;
} else if ((op_a == 'x') &&
( (op_b == 'e') || (op_b == 'f') || (op_b == 'p')) ) {
/* "x pi" -> "pi x" */
swap = 1;
}
}
/* Always swap for operators that are inherently swapped. */
if (op == 'v') {
swap = 1;
} else if (op == 'L') {
swap = 1;
} else if (op == PS_REVSUB) {
swap = 1;
op = '-';
*t_op = '-';
} else if (op == PS_REVDIV) {
swap = 1;
op = '/';
*t_op = '/';
} else if (op == PS_REVPOW) {
swap = 1;
op = '^';
*t_op = '^';
}
/* okay, do the swap */
if (swap) {
term_a = st_b; term_b = st_a;
op_t = op_a; op_a = op_b; op_b = op_t;
}
/* determine precedence */
prec_a = prec_b = 0; /* default don't use parens */
if (op == 'L') {
/* Latter part is always in parens; first part (base) is
in parens only if it's an expression */
prec_b = 1;
if (sym_attrs[op_a].seft != 'a') {
prec_a = 1;
}
}
if (sym_attrs[op_a].seft == 'c') {
/* We have (s1 op_a s2) op s3 */
prec_a = 1; /* default use parens */
if (strchr("+*/^v", op_a) && strchr("+-", op)) {
/* anything followed by + or - doesn't need parens */
prec_a = 0;
} else if (strchr("*^v", op_a) && strchr("*/", op)) {
/* * ^ or v followed by * or / doesn't need parens */
prec_a = 0;
}
}
if (sym_attrs[op_b].seft == 'c') {
/* We have s1 op (s2 op_b s3) */
prec_b = 1; /* default use parens */
if ((op == '+') && strchr("+-*/^v", op_b)) {
/* + followed by anything doesn't need parens */
prec_b = 0;
} else if ((op == '-') && strchr("*/^v", op_b)) {
/* - followed by mult or higher operators doesn't need parens */
prec_b = 0;
} else if ((op == '*') && strchr("*/^v", op_b)) {
/* * followed by mult or higher operators doesn't need parens */
prec_b = 0;
} else if ((op == '/') && strchr("^v", op_b)) {
/* / followed by ^ or v doesn't need parens */
prec_b = 0;
}
}
/* We're all set to generate output */
if (op == 'L') {
/* This operator goes in front of both arguments */
term[optr++] = op;
}
if (prec_a) {
term[optr++] = '(';
}
for(s = term_a; *s; s++) {
term[optr++] = *s;
}
if (prec_a) {
term[optr++] = ')';
}
if (op == '*') {
if (0) {
} else if ((op_a <= '9') && prec_b) {
/* digit times anything starting with parentheses */
/* Emit no symbol at all */
} else if ((op_a <= '9') && (term_b[0] <= '9')) {
/* digit times anything starting with a bare digit */
term[optr++] = op; /* emit an actual "*" */
} else if ((sym_attrs[op_a].seft != 'a') || (sym_attrs[op_b].seft != 'a')
|| (term_b[0] > '9')) {
/* printf("op_a '%c' op_b '%c' mult '.'\n", op_a, op_b); */
term[optr++] = '.'; /* This becomes a blank space " " */
} else {
term[optr++] = op;
}
} else if (op == 'L') {
/* We already emitted it */
} else {
term[optr++] = op;
}
if (prec_b) {
term[optr++] = '(';
}
for(s = term_b; *s; s++) {
term[optr++] = *s;
}
if (prec_b) {
term[optr++] = ')';
}
break;
}
/* terminate input and output strings */
term[optr] = 0;
return 0;
}
/* infix_preproc performs manipulations on expressions that make the
infix conversion easier. For example, the "squared" operator gets
turned into "^2". That gives the infix conversion less special
cases to check for when it figures out where to put the parentheses.
We can also do some simplification here, for example [xn2+] which
is "-x+2" in infix, becomes [x2_] "2-x". */
void infix_preproc(symbol * expr, symbol * out)
{
symbol * s;
symbol * o;
s = expr;
o = out;
while(*s) {
/* [-n] = "-(a-b)" seems to never occur. If it did, we could
map [-n] -> [_]
substring conversion of [-n] -> [_] where '_' is a reversed '-'
operator. Likewise situation for [/r]; use '\'. */
if (0) {
} else if ((*s == 'n') && (sym_attrs[*(s+1)].seft == 'a') && (*(s+2) == '+')) {
/* converting [xn2+] into [x2_]. Test case: 1.3204 gives
"-(x)+2 = e/4" without this conversion and "2-x = e/4" with. */
*o++ = *(s+1);
*o++ = PS_REVSUB;
s += 2;
} else if (*s == 's') {
/* [s] -> [2^] */
*o++ = '2';
*o++ = '^';
} else if (*s == 'r') {
/* [r] -> [1\] */
*o++ = '1';
*o++ = PS_REVDIV;
} else if (*s == 'E') {
/* [E] -> [e`] where ` is reverse power */
*o++ = 'e';
*o++ = PS_REVPOW;
} else {
*o++ = *s;
}
s++;
}
*o++ = 0;
}
/* infix_expand expands the output of infix_1 into something more
human-readable. It returns the length of the output. */
s16 infix_expand(char * input, char * output)
{
s16 i, j, l, l2;
char c;
l = strlen(input);
j = 0;
for(i=0; i 0) && (out[len-1] == ' ')) {
len--;
out[len] = 0;
}
return len;
}
/* complexity simply adds up the complexity scores (weights) in an
expression. */
s16 complexity(symbol * expr)
{
symbol sym;
s16 comp;
comp = 0;
while(sym = *expr++) {
comp += sym_attrs[sym].weight;
}
return comp;
}
/* eval evaluates an expression; useful for retrieving the values of both
sides after a match, or for iterating Newton's method. It returns
an error if it doesn't evaluate for some reason. */
s16 eval(symbol * expr, s16 contains_x, double * val, double * dx,
s16 *sptr, s16 show_work)
{
metastack ms;
symbol * s;
symbol dbg_scratch[MAX_ELEN + 1];
symbol ss;
s16 err, i;
s16 undo_count;
/* default return values */
*val = k_0; *dx = k_0;
ms_init(&ms);
for(s = expr, i=0; s[i]; i++) {
if (i >= MAX_ELEN) {
return ERR_EVAL_TOO_LONG;
}
/* Determine if the metastack will overflow */
{
int sp1 = 0; int sp2 = 0; int ms1 = 0;
switch (sym_attrs[s[i]].seft) {
case 'a': ; sp2 = 1; ms1 = 1; break;
case 'b': ; sp1 = 1; ms1 = 2; break;
case 'c': ; sp1 = 2; ms1 = 3; break;
default: ; return ERR_EVAL_UNKNOWN_SEFT;
}
if (ms.sp < sp1) {
return ERR_EVAL_STACK_UNDERFLOW;
}
if (ms.sp + sp2 > MS_STK_MAX) {
return ERR_EVAL_STACK_OVERFLOW;
}
if (ms.msp + ms1 > MS_UNDO_MAX) {
return ERR_EVAL_METASTACK_OVERFLOW;
}
}
err = exec(&ms, s[i], &undo_count, contains_x);
if (err) {
return err;
}
if (debug_o || show_work) {
char escratch[MAX_ELEN + 1];
char fscratch[MAX_ELEN * 4];
char gscratch[MAX_ELEN * 16];
symbol ss;
double val, dval;
s16 j;
dbg_scratch[i] = s[i];
dbg_scratch[i+1] = 0;
infix_preproc(dbg_scratch, escratch);
err = infix_1(escratch, fscratch, &ss);
infix_expand(fscratch, gscratch);
val = ms_peek(&ms, &dval, 0);
if (debug_o) { printf("eval "); }
for(j=strlen(gscratch); j<27; j++) {
printf(" ");
}
printf("%s = ", gscratch);
printf(fmt_g_nom_fixed, val);
if (dval != k_0) {
printf(" (d/dx = ");
printf(fmt_g_nominal, dval);
printf(")");
}
printf("\n");
}
}
*val = ms_peek(&ms, dx, sptr);
return 0;
}
#define NEWTON_MAX_ITER 6
/* newton performs Newton's method to determine the precise value of X
for a given LHS and RHS to match. If the method does not converge
or if there is an error the target value is returned instead. */
s16 newton(symbol * lhs, symbol * rhs, double *root, double *diff_dx)
{
double rhs_val, rhs_dx;
double lhs_val, lhs_dx;
double curr, prev;
s16 i, err, rhs_has_x;
/* Set default return values. */
*root = target;
/* If we return before we can compute a valid diff_dx, then we should
treat it as a nonconverging case. We express that by returning a
diff_dx of 0 */
*diff_dx = 0;
rhs_has_x = (strchr(rhs, 'x') != 0);
/* first get the RHS value. */
rhs_dx = 0;
err = eval(rhs, rhs_has_x, &rhs_val, &rhs_dx, 0, 0); // In newton()
if (err) {
return err;
}
if (debug_n) {
printf("newton RHS [%s] = ", rhs);
printf(fmt_g_nominal, rhs_val);
printf("\n");
printf(" iterating LHS [%s]...\n", lhs);
}
curr = target; prev = curr - k_1;
for(i=0; i clen) && (sl < lineleft) ) {
csym = sym;
candidate = sym_attrs[csym].defn;
clen = strlen(candidate);
}
}
}
}
/* did we actually find one that fits? */
if (candidate) {
printf(" %s", sym_attrs[csym].defn);
sym_attrs[csym].def_given = 1;
lineleft = lineleft - clen - 2;
} else {
printf("\n");
lineleft = LINELEFT_INIT;
}
}
}
if (lineleft < LINELEFT_INIT) {
printf("\n");
}
if (used_trig) {
if (k_sincos_arg_scale == k_pi) {
/* In this case the functions are called "sinpi", etc. and are defined
by the sym_defn[] strings */
} else {
/* Tell what units were used, special-case radians */
printf(" For the trig functions, ");
if (k_sincos_arg_scale == 1.0) {
printf("2 pi");
} else {
printf("%g", 2.0 * k_pi / k_sincos_arg_scale);
}
printf(" units are a full circle.\n");
}
}
}
/* print_end generates the summary statistics that get printed at the
end. */
void print_end()
{
s64 combos;
u32 total_insert;
double tsec = gettime();
if (debug_s) {
printf(
"\n"
" NOTE: Some values will have lost significance in the last one or\n"
" two digits due to round-off during intermediate calculations.\n"
);
}
if (g_enable_output && (output_format <= OF_NORMAL)) {
describe_symbols();
}
total_insert = lhs_insert + rhs_insert;
printf("\n");
printf(" --LHS-- --RHS-- -Total-\n");
printf(" max complexity: %11d %11d %11d\n", lmax, rmax, lmax+rmax);
if (g_enable_output) {
printf(" dead-ends: %11ld %11ld %11ld Time: %.3f\n",
lhs_prune, rhs_prune, lhs_prune + rhs_prune, tsec);
printf(" expressions: %11ld %11ld %11ld \n",
lhs_gen, rhs_gen, gen_total);
printf(" distinct: %11ld %11ld %11ld Memory: %ldKiB\n",
lhs_insert, rhs_insert, total_insert, mem_used_KiB);
/* tell them how much work we did. */
printf("\n");
combos = ((s64) lhs_insert) * ((s64) rhs_insert);
printf(" Total equations tested: %20lld (%.4g)\n",
combos, (double) combos);
}
if (0) {
char text[2];
printf("Hit return: ");
scanf("%c", text);
}
}
/* check_exit checks to see if it's time to exit after a new match.
If the only_exact flag is set we exit because we're done. If not,
then we test to see if best_match is negative. Once best_match becomes
negative there is no chance of getting any more nonexact matches,
and therefore this new match is the last useful output we'll get. */
void check_exit(int e)
{
if (e && only_exact) {
printf(" (Stopping now because -ie option was given.)\n");
print_end();
exit(0);
}
if (best_match < g_min_matchsize) {
/* Decide what number to print in the following message. If the
--min_match-distance option was not given then g_min_matchsize will
be zero; but k_min_best_match is positive in all cases.
g_min_matchsize is zero then we're exiting because of
subtracting k_min_best_match above. */
double t = (k_min_best_match > g_min_matchsize) ?
k_min_best_match : g_min_matchsize;
printf(
" (Stopping now because best match is within %7.3g of target value.)\n",
t);
print_end();
exit(0);
}
}
/* report_match does the formatting to print out a match. You can either
supply an LHS and RHS that are already in the tree, or just one tree
member and a pe (which you would do if reporting an exact match and
don't want to insert the item) */
void report_match(expr *lhs, expr *rhs, pe * exm, double delta)
{
char * le;
char * re;
s16 i;
char fval[30]; /* formatted numerical value (before manual left-justify) */
s16 posn; /* "cursor position" for column padding */
int width; /* A column width */
int justify_overflow;
char l_escratch[MAX_ELEN];
char l_fscratch[MAX_ELEN * 4];
char l_gscratch[MAX_ELEN * 16];
char r_escratch[MAX_ELEN];
char r_fscratch[MAX_ELEN * 4];
char r_gscratch[MAX_ELEN * 16];
char * l_fmt;
char * r_fmt;
symbol ss;
s16 err;
s16 lf_len, rf_len;
s16 re_has_x;
double l_val, r_val, l_dx, r_dx;
/* ignore second and subsequent exact matches. */
if ((delta == k_0) && got_exact) {
return;
}
if (lhs) {
le = (char *) (lhs->sym);
} else {
le = (char *) (exm->sym);
}
if (rhs) {
re = (char *) (rhs->sym);
} else {
re = (char *) (exm->sym);
}
re_has_x = (strchr(re, 'x') != 0);
/* Evaluate both sides, and show work if that option was given */
if (debug_s) {
printf("\nbased on:\n");
}
eval(le, 1, &l_val, &l_dx, 0, debug_s); // In report_match()
eval(re, re_has_x, &r_val, &r_dx, 0, debug_s);
/* Format the LHS expression */
if (output_format == OF_POSTFIX) {
err = postfix(le, l_escratch);
l_fmt = l_escratch; lf_len = strlen(l_fmt);
} else if (output_format == OF_CONDENSED) {
infix_preproc(le, l_escratch);
err = infix_1(l_escratch, l_fscratch, &ss);
l_fmt = l_fscratch; lf_len = strlen(l_fmt);
} else if (output_format == OF_NORMAL) {
infix_preproc(le, l_escratch);
err = infix_1(l_escratch, l_fscratch, &ss);
lf_len = infix_expand(l_fscratch, l_gscratch);
l_fmt = l_gscratch;
} else if (output_format == OF_FORTH) {
lf_len = postfix_formatter(le, l_gscratch, MAX_ELEN * 16);
err = 0; l_fmt = l_gscratch;
}
if (err) {
l_fmt[0] = 0;
}
/* Format the RHS expression */
if (output_format == OF_POSTFIX) {
err = postfix(re, r_escratch);
r_fmt = r_escratch; rf_len = strlen(r_fmt);
} else if (output_format == OF_CONDENSED) {
infix_preproc(re, r_escratch);
err = infix_1(r_escratch, r_fscratch, &ss);
r_fmt = r_fscratch; rf_len = strlen(r_fmt);
} else if (output_format == OF_NORMAL) {
infix_preproc(re, r_escratch);
err = infix_1(r_escratch, r_fscratch, &ss);
rf_len = infix_expand(r_fscratch, r_gscratch);
r_fmt = r_gscratch;
} else if (output_format == OF_FORTH) {
rf_len = postfix_formatter(re, r_gscratch, MAX_ELEN * 16);
err = 0; r_fmt = r_gscratch;
}
if (err) {
r_fmt[0] = 0;
}
if (debug_s) {
printf("then equating %s to %s,\n", l_fmt, r_fmt);
printf("and solving for x by the Newton-Raphson method, I got\n");
printf("x = ");
printf(fmt_g_usable, target + delta);
printf(" = T ");
if (delta < k_0) {
printf("- %.6lg\n", -delta);
} else {
printf("+ %.6lg\n", delta);
}
printf("therefore:\n");
}
/* Display the LHS and RHS as an equation, nicely centered. If either side
can't fit we steal space from the other side. The amount of space we can
use depends on the relative_x option (which affects the width of the
"x+..." column). The default width values (44 and 40) are chosen to fit
in 78 and 80 columns, respectively. If we go to a higher precision
this will need to change significantly. */
if (g_enable_output) {
int l_pad, r_pad;
int t_width, equ_width;
width = relative_x ? 44 : 40; /* Space allocated for LHS and RHS */
equ_width = 5; /* one space on each end, " = " in the middle */
t_width = width + equ_width; /* total width of our output */
/* compute padding (blank space) needed to make LHS and RHS fill their
allotted space */
l_pad = width/2; /* In "solve for x" mode we would give more to RHS */
r_pad = width - l_pad;
l_pad -= lf_len; if (l_pad < 0) { l_pad = 0; }
r_pad -= rf_len; if (r_pad < 0) { r_pad = 0; }
/* See how wide the result will be; if too wide, steal padding from
the left, then from the right */
width = l_pad + lf_len + equ_width + rf_len + r_pad;
while ((width > t_width) && (l_pad > 0)) {
l_pad--; width--;
}
while ((width > t_width) && (r_pad > 0)) {
r_pad--; width--;
}
/* The result is bigger than t_width only when the equation genuinely won't
fit in the space we want. Note this for use later. */
justify_overflow = (width - t_width);
if (justify_overflow < 0) {
justify_overflow = 0;
}
for(i=0; isym, rhs->sym);
}
/* start with a single step of Newton, which we can do easily because
the x and dx values are already computed */
total_deriv = lhs->der - rhs->der;
if (total_deriv == k_0) {
fprintf(stderr, "check_match got dx = 0!\n");
return 0;
}
delta = (rhs->val - lhs->val) / total_deriv;
score = fabs(delta);
if ((score < best_match) && (score > g_min_matchsize)) {
/* We have a good candidate for a new match. Now we use
Newton's method to get a more accurate score */
if (debug_q) {
printf(" [%s] ~= [%s] (score %g), calling newton\n",
lhs->sym, rhs->sym, score);
}
if (err = cv_simplify(lhs->sym, rhs->sym, &root, &total_deriv)) {
/* Eval got an error, or Newton did not converge: in either case
we don't accept, because a failed Newton converge is probably
a pathological case like sin(1/a) near a=0 */
if (debug_q) {
printf(" newton returned %d\n", err);
}
return 0;
}
if (fabs(total_deriv) < k_vanished_dx) {
if (debug_q) {
printf(" derivative = %g is too small\n", total_deriv);
}
}
delta = root - target;
if (debug_q) {
printf("newton: target %g root %g delta %g\n", target, root, delta);
}
score = fabs(delta);
if (score == k_0) {
/* Newton's method revealed that we have an exact match. We'll
report it (in case it's our first exact match) but not adjust our
report threshold. */
if (debug_q) {
printf(" exact match\n");
}
report_match(lhs, rhs, 0, delta);
return 1;
} else if (score < best_match) {
if (debug_q) {
printf(" new record %g\n", score);
}
report_match(lhs, rhs, 0, delta);
/* The minus 1.0e-15 is to avoid having lots of matches that beat
each other only because of roundoff in the score calculation.
This happens if you invoke e.g. "ries 0.434294481903252" */
best_match = (score * 0.999) - k_min_best_match;
if (debug_q) {
printf(" lowering match threshold to %g\n", best_match);
}
check_exit(0);
return 1;
} else if (debug_q) {
printf(" post-newton score %g not good enough\n", score);
}
} else if (debug_q) {
printf(" first score %g not good enough\n", score);
}
return 0;
} // End of check_match
/* bt_first gives a pointer to the first node in the tree. Pass in the
tree's root pointer. */
expr * bt_first(expr * it)
{
while (it && it->left) {
it = it->left;
}
return it;
}
/* bt_depth returns the depth (number of links down from root) of an item. */
u32 bt_depth(expr * it)
{
u32 rv;
/* Start at zero */
rv = 0;
while(it) {
rv++;
it = it->up;
}
return(rv);
}
void bt_stats(void);
void bt_stats(void)
{
long n;
long tdepth;
expr * it;
it = bt_first(lhs_root);
n = 0; tdepth = 0;
while(it) {
if (debug_0) {
printf("%8ld %10s {%3d} = ", n, it->sym, complexity(it->sym));
printf(fmt_g_nom_fixed, it->val);
if (it->der) {
printf(", dx = ");
printf(fmt_g_nominal, it->der);
}
printf("\n");
}
/* Accumulate stats */
tdepth += bt_depth(it);
it = bt_next(it);
n++;
}
printf("Current tree stats:\n");
printf(" Nodes: %ld\n", n);
printf(" Avg. Depth: %f\n", ((double) tdepth) / ((double) n));
}
/* bt_prev traverses the list "backwards" to the next-smaller expression. */
expr * bt_prev(expr *it)
{
expr *old;
/* if it has a left child it's relatively easy */
if (it->left) {
/* go down left, then down right to dead end */
it = it->left;
while (it->right) {
it = it->right;
}
} else {
/* here we have to traverse up until we get to a node from which we
were the right. We also need to worry about going all the way off
the top, which would mean we're done. */
old = 0;
while(it && (old == it->left)) {
old = it;
it = it->up;
}
}
return it;
}
/* bt_next traverses the list to the next-greater expression. */
expr * bt_next(expr *it)
{
expr *old;
/* if it has a right child it's relatively easy */
if (it->right) {
/* go down right, then down left to dead end */
it = it->right;
while (it->left) {
it = it->left;
}
} else {
/* here we have to traverse up until we get to a node from which we
were the left. We also need to worry about going all the way off
the top, which would mean we're done. */
old = 0;
while(it && (old == it->right)) {
old = it;
it = it->up;
}
}
return it;
}
/*
check_sides looks for a match for a given expression. It checks
expressions of the opposite type (LHS or RHS) on either side (lower-
and higher-valued) of the supplied expression, and for each, calls
check_match.
*/
void check_sides(expr *it)
{
expr * other;
int cm_result;
/* "it" is the expression that is just now being added to the database.
Check if it is an RHS or an LHS. */
if (it->der == k_0) {
/* New way */
/* we've got a RHS. Look on both sides for an LHS. If we find another
RHS, we can stop because that RHS, by definition, will be a
closer match to any LHS that lies beyond. */
other = bt_prev(it);
if (other && (other->der != k_0)) {
/* "other" is an LHS, test it. */
cm_result = check_match(other, it);
if (cm_result && debug_q) {
printf("(1st) LHS (left) = New RHS\n");
}
}
/* Now we do the same thing again, to the right this time. */
other = bt_next(it);
if (other && (other->der != k_0)) {
cm_result = check_match(other, it);
if (cm_result && debug_q) {
printf("(1st) LHS (left) = New RHS\n");
}
}
} else {
/* we've got an LHS */
other = bt_prev(it);
/* Check for an RHS, which always has a zero derivative term */
if (other && (other->der == k_0)) {
/* We have an RHS, check it */
cm_result = check_match(it, other);
if (cm_result && debug_q) {
printf("New LHS = first RHS (left)\n");
}
}
/* do the same thing again, to the right this time. */
other = bt_next(it);
if (other && (other->der == k_0)) {
cm_result = check_match(it, other);
if (cm_result && debug_q) {
printf("New LHS = first RHS (right)\n");
}
}
}
} /* end of check_sides */
void check_exact_match(double x, double dx, pe *ex, expr * it)
{
if (dx == k_0) {
/* new item is RHS. */
if (it->der == k_0) {
/* Tree item is RHS, too. We just discard in this case. */
} else {
/* Tree item is LHS... */
if (fabs(it->der) < k_vanished_dx) {
if (debug_q) {
printf("chk_ex_match it->dx = %g is too small\n", it->der);
}
} else {
report_match(it, 0, ex, k_0);
}
}
} else {
/* new item is LHS. */
if (it->der == k_0) {
/* tree item is RHS... */
if (fabs(dx) < k_vanished_dx) {
if (debug_q) {
printf("chk_ex_match dx = %g is too small\n", dx);
}
} else {
report_match(0, it, ex, k_0);
}
} else {
/* Tree item is LHS too, discard.
%%% In this instance we will want to do a normal check_match
and reject if the two derivatives are equal (indicating a
tautology). Even if we get a match, we would still discard the new
node afterwards because the existing node is more likely
to yield equations with a lesser combined complexity. */
}
}
}
/* bt_insert adds an expression to the tree. The dx parameter
is used to determine if it's an RHS or an LHS expression. */
s16 bt_insert(double x, double dx, pe *ex, s16 * res1)
{
expr * it;
s16 going, insert, fillin, report, i;
fillin = 0; going = 0; insert = 0; *res1 = 0;
it = lhs_root;
/* If there's a tree to descend, descend it. */
if (it) {
going = 1;
insert = 1;
} else {
/* insert and copy first node */
lhs_root = (expr *) my_alloc(sizeof(expr));
if (lhs_root == 0) {
return 1;
}
*res1 = 1;
insert_count++;
it = lhs_root;
it->up = 0;
going = 0; /* no descending to do */
insert = 0; /* and we just inserted */
fillin = 1; /* but we need to fill it in */
}
while(going) {
if (it->val == x) {
/* Exact match: there is already a node with the exact same value.
We never insert another node with the same value, because by
definition (due to the way we generate simpler expressions first)
any equation made with the newly inserted node would be more complex
than the existing equation we can get with the existing, simpler
node.
However, we do take the opportunity to report an exact match,
checking the derivative of the side that contains X to avoid
reporting a tautology. */
going = 0;
insert = 0;
check_exact_match(x, dx, ex, it); /* BT_CODE_MATCH */
} else {
/* no match yet: descend. */
if (x < it->val) { /* BT_CODE_CMP */
/* go to left child */
if (it->left) {
it = it->left;
} else {
/* no left: that means we insert here. */
insert = -1;
going = 0;
}
} else {
/* go to right child */
if (it->right) {
it = it->right;
} else {
/* no right: that means we insert here. */
insert = 1;
going = 0;
}
}
}
}
if (insert) {
expr *n;
n = (expr *) my_alloc(sizeof(expr));
if (n == 0) {
return 1;
}
*res1 = 1;
insert_count++;
if (insert > 0) {
it->right = n;
n->up = it;
} else {
it->left = n;
n->up = it;
}
it = n;
fillin = 1;
}
if (fillin) {
/* copy the expression into the new node (BT_CODE_FILLIN) */
it->val = x;
it->der = dx;
it->left = 0;
it->right = 0;
it->elen = ex->elen;
for(i=0; i<=it->elen; i++) {
it->sym[i] = ex->sym[i];
}
}
/* Last thing to do is to check for a new match. */
if (insert) {
check_sides(it); /* BT_CODE_INSERTED */
}
return 0;
}
s16 g_dbg_side;
#define CANONVAL_NEGATE 1
#define CANONVAL_RECIPROCAL 2
#define CANONVAL_DIV2 4
#define CANONVAL_MUL2 8
int g_canon_ops;
/*
canonval takes an expression which should be complete, and tries to
append additional operators to make its value fall within the range
[1.0,2.0).
There are 4 types of transformations that we try to make, and they are
partly redundant: [r] and [2*] are only used if the value is in the
range (-1.0,1.0), but if [r] is used then the value will not be in
that range anymore, and [2*] will not trigger. This redundancy is
there to maximize the effectiveness of canonval when the symbolset has
been restricted via the -S/-O/-N options.
We check sym_attrs[*].count vs. sym_attrs[*].allowed, but don't bother
to update the counts because these are the last symbols that will be
added and we don't add more than one of any symbol.
All parameters except muc_ptr are both inputs and return values.
*/
int canonval(
pe * bpe, /* The expression to operate on */
metastack * ms, /* The expression's metastack */
double *p_x, /* Value of the expression */
double *p_dx, /* Derivateive */
s16 *muc_ptr /* Metastack undo count */
)
{
int muc = 0;
s16 uc;
s16 ip = bpe->elen;
s16 sp;
int exec_err;
double x, dx;
g_cv_calls++;
x = *p_x; dx = *p_dx;
if (debug_G & g_dbg_side) {
printf("canonval: ip %d, el %d [%s] val=%g, dv=%g\n",
ip, bpe->elen, bpe->sym, x, dx);
}
if ((g_canon_ops & CANONVAL_NEGATE)
&& (sym_attrs['n'].count < sym_attrs['n'].allowed)
&& (x < 0.0) && (ip < MAX_ELEN))
{
// Negate
bpe->sym[ip++] = 'n';
exec_err = exec(ms, bpe->sym[ip-1], &uc, using_x); muc += uc;
if (exec_err) {
*muc_ptr = muc;
return exec_err;
}
x = ms_peek(ms, &dx, &sp);
if (debug_G & g_dbg_side) {
bpe->sym[ip] = 0;
printf(" neg -> ip %d, el %d [%s] val=%g, dv=%g\n",
ip, bpe->elen, bpe->sym, x, dx);
}
}
if ((g_canon_ops & CANONVAL_RECIPROCAL)
&& (sym_attrs['r'].count < sym_attrs['r'].allowed)
&& (x*x < 1.0) /* fabs(x) < 1.0 */
&& (ip < MAX_ELEN))
{
// Take the reciprocal
bpe->sym[ip++] = 'r';
exec_err = exec(ms, bpe->sym[ip-1], &uc, using_x); muc += uc;
if (exec_err) {
*muc_ptr = muc;
return exec_err;
}
x = ms_peek(ms, &dx, &sp);
if (debug_G & g_dbg_side) {
bpe->sym[ip] = 0;
printf(" recip -> ip %d, el %d [%s] val=%g, dv=%g\n",
ip, bpe->elen, bpe->sym, x, dx);
}
}
if ((g_canon_ops & CANONVAL_DIV2)
&& (sym_attrs['/'].count < sym_attrs['/'].allowed)
&& (sym_attrs['2'].count < sym_attrs['2'].allowed)
&& (x*x >= 4.0) /* fabs(x) >= 2.0 */
&& (ip+1 < MAX_ELEN))
{
// Divide by 2
bpe->sym[ip++] = '2';
exec_err = exec(ms, bpe->sym[ip-1], &uc, using_x); muc += uc;
if (exec_err) {
*muc_ptr = muc;
return exec_err;
}
bpe->sym[ip++] = '/';
exec_err = exec(ms, bpe->sym[ip-1], &uc, using_x); muc += uc;
if (exec_err) {
*muc_ptr = muc;
return exec_err;
}
x = ms_peek(ms, &dx, &sp);
if (debug_G & g_dbg_side) {
bpe->sym[ip] = 0;
printf(" 2/ -> ip %d, el %d [%s] val=%g, dv=%g\n",
ip, bpe->elen, bpe->sym, x, dx);
}
}
if ((g_canon_ops & CANONVAL_MUL2)
&& (sym_attrs['*'].count < sym_attrs['*'].allowed)
&& (sym_attrs['2'].count < sym_attrs['2'].allowed)
&& (x*x < 1.0) /* fabs(x) < 1.0 */
&& (ip+1 < MAX_ELEN))
{
// Multiply by 2
bpe->sym[ip++] = '2';
exec_err = exec(ms, bpe->sym[ip-1], &uc, using_x); muc += uc;
if (exec_err) {
*muc_ptr = muc;
return exec_err;
}
bpe->sym[ip++] = '*';
exec_err = exec(ms, bpe->sym[ip-1], &uc, using_x); muc += uc;
if (exec_err) {
*muc_ptr = muc;
return exec_err;
}
x = ms_peek(ms, &dx, &sp);
if (debug_G & g_dbg_side) {
bpe->sym[ip] = 0;
printf(" 2* -> ip %d, el %d [%s] val=%g, dv=%g\n",
ip, bpe->elen, bpe->sym, x, dx);
}
}
/* We made it through the exec()s without error, so now we'll save the
results of our calculations. */
bpe->elen = ip;
*p_x = x; *p_dx = dx;
*muc_ptr = muc;
return 0;
}
void decanon(metastack * ms, s16 muc)
{
while(muc) {
ms_undo(ms);
muc--;
}
}
/* ge_2 is the core code for generating expressions from a form.
base -----bpe-----
comp elen syms
ab 0 0 -
ab 10 1 1
ab 17 2 1l
ab 17 2 1n
ab 13 1 2
ab 20 2 2r
ab 20 2 2q
ab 20 2 2l
*/
u32 ge_2(form *base,
pe *bpe,
s16 e_minw, s16 e_maxw,
metastack *ms)
{
symbol seft, sym;
symbol *syms;
s16 ns;
s16 minw, maxw;
s16 in_cpx; /* Complexity of input partial expression */
s16 rminw, rmaxw;
s16 ip;
u32 n;
s16 recurse;
s16 atts;
s16 muc; /* metastack undo count */
double x, dx;
s16 err, sp;
n = 0; muc = 0;
ip = bpe->elen;
/* execute the opcode that just got added (if any) */
if (ip > 0) {
g_exec_calls++;
/* if there are errors like [0/] or [1-q], we'll return right away,
thereby pruning */
err = exec(ms, bpe->sym[ip-1], &muc, using_x);
if (err) {
if (debug_A & g_dbg_side) {
bpe->sym[ip] = 0;
printf("prune partial exec error [%s) got %d\n", bpe->sym, err);
}
while(muc) { ms_undo(ms); muc--; }
return 0;
}
/* Now find out what the current value on the stack is */
x = ms_peek(ms, &dx, &sp);
/* We can always prune zero subexpressions. Any solution that
contains a zero subexpression can be reduced to a simpler
solution that does not contain a zero subexpression. */
if (x == k_0) {
if (debug_B & g_dbg_side) {
bpe->sym[ip] = 0;
printf("prune partial zero [%s) = ", bpe->sym);
printf(fmt_g_usable, x);
printf("\n");
}
while(muc) { ms_undo(ms); muc--; }
prune_count++;
return 0;
}
/* Prune non-integer subexpressions if the -i option was given. */
if (int_subexpr) {
if (x != floor(x)) {
if (debug_C & g_dbg_side) {
bpe->sym[ip] = 0;
printf("prune partial noninteger [%s) = ", bpe->sym);
printf(fmt_g_nominal, x);
printf("\n");
}
while(muc) { ms_undo(ms); muc--; }
prune_count++;
return 0;
}
}
/* %%% here we should do a bt_find and check that the trailing N opcodes
of our PE are equal to that found (if any). If not, our PE is a
redundant and possibly more complex way to generate a value, and
can be pruned. This should be a significant optimization. It covers
the same ground as the AM_ rules, but catches many cases they miss
(like multiple equivalent sums: 37+, 28+, 19+, 136++, 16n3-+, and
on and on... */
/* %%% I don't know if C's '>' operator works right with infinities.
* %%% this could be done more efficiently inside exec() */
/* Any subexpression that overflows either in value or in the
differential causes pruning. */
if ((x >= p_ovr) || (x <= n_ovr)) {
if (debug_D & g_dbg_side) {
bpe->sym[ip] = 0; /* Do not display not-yet-exec'd symbols */
printf("prune partial overflow [%s) = ", bpe->sym);
printf(fmt_g_nominal, x);
printf("\n");
}
while(muc) { ms_undo(ms); muc--; }
prune_count++;
return 0;
}
/* ignore any LHS partial-expressions with nonzero but very small
derivative (tautology problem from roundoff error) */
if (using_x) {
/* formerly tested "(fabs(dx) > k_0) && (fabs(dx) < k_prune_deriv)" */
/* pp. 20111228 was "(fabs(dx)/(1.0 + fabs(x)) < k_prune_deriv)" */
/* Test the current top of stack for bogus derivative. We have to
test dx != 0 to ensure we don't prune constants, like pruning the
"2" in "x2+" before the "+" has been executed. */
if ( (dx != k_0) && (fabs(dx)/(1.0 + fabs(x)) < k_vanished_dx) )
{
bpe->sym[ip] = 0; /* Do not display not-yet-exec'd symbols */
if (debug_E & g_dbg_side) {
printf("prune partial dx~=0 [%s) = ", bpe->sym);
printf(fmt_g_usable, x);
printf(", d/dx = ");
printf(fmt_g_nominal, dx);
printf("\n");
}
while(muc) { ms_undo(ms); muc--; }
prune_count++;
return 0;
}
/* Ignore complete LHS expressions with zero derivative (tautology
problem). This catches all normal non-roundoff tautologies, like
"-x/x" and "(x-4)-(x-1)" */
if ( (sp == 0) && (fabs(dx)/(1.0 + fabs(x)) < k_vanished_dx) )
{
bpe->sym[ip] = 0; /* Do not display not-yet-exec'd symbols */
if (debug_F & g_dbg_side) {
printf("prune full.1 dx~=0 [%s] = ", bpe->sym);
printf(fmt_g_usable, x);
printf(", d/dx = ");
printf(fmt_g_nominal, dx);
printf("\n");
}
while(muc) { ms_undo(ms); muc--; }
prune_count++;
return 0;
}
}
}
/* insert or recurse? */
if (ip >= base->flen) {
s16 res1;
/* it's now as long as it can get. */
// We still have values of x, dx, and sp from calling ms_peek() above
bpe->sym[ip] = 0; // Needed by bt_insert
if (debug_G & g_dbg_side) {
printf("ge_2 insert [%s] = ", bpe->sym);
printf(fmt_g_usa_fixed, x);
if (dx != 0) {
printf(", d/dx = ");
printf(fmt_g_usa_fixed, dx);
}
printf(", comp {%d}\n", bpe->cplx);
}
/* Implement the --find-expression option */
if (g_num_find_expr) {
s16 i;
for(i=0; isym, g_find_expr[i]) == 0) {
printf(" [%s] = ", bpe->sym);
printf(fmt_g_usable, x);
if (dx != 0) {
printf(", d/dx = ");
printf(fmt_g_usable, dx);
}
printf(", complexity = {%d}\n", bpe->cplx);
}
}
}
if (using_x && (fabs(dx) < k_vanished_dx)) {
if (debug_E & g_dbg_side) {
printf("prune full.2 dx~=0 [%s) = ", bpe->sym);
printf(fmt_g_usable, x);
printf(", d/dx = ");
printf(fmt_g_nominal, dx);
printf("\n");
}
while(muc) { ms_undo(ms); muc--; }
prune_count++;
return 0;
} else {
s16 uc; /* Metastack undo count for canonval operation */
s16 oip;
oip = bpe->elen;
/* Bring value into canonical range */
err = canonval(bpe, ms, &x, &dx, &uc);
if (err) {
if (debug_G & g_dbg_side) {
bpe->sym[ip] = 0;
printf("prune canonval exec error [%s) got %d\n", bpe->sym, err);
}
while(uc) { ms_undo(ms); uc--; }
return 0; // Don't count this as a valid generated expression.
}
/* Looks good so far, now try to insert in tree */
bpe->sym[bpe->elen] = 0; // Needed by bt_insert
if (bt_insert(x, dx, bpe, &res1)) { // this is in ge_2
fprintf(stderr, "%s: Out of memory\n", g_argv0);
exit(1);
}
if (res1 == 0) {
if (debug_G & g_dbg_side) {
printf(" rejected (duplicate value)\n");
}
/* We do not undo and return 0 here, because we want to count this as a
"generated value". So we fall through to the "return 1;" below. */
}
/* Undo the canonval ops */
decanon(ms, uc);
bpe->elen = oip;
}
/* undo stack manipulation and return */
while(muc) { ms_undo(ms); muc--; }
return 1; // We generated 1 item
}
/* we reach here if it's incomplete, and we've just executed the last
symbol, and now it's time to add new symbols.
This is also the first code executed in the case where there is no
expression yet (initial call from ge_1()) */
/* get seft and comp */
seft = base->sym[ip];
in_cpx = bpe->cplx;
/* set up our variables for generation */
syms = 0;
switch(seft) {
case 'a':
syms = g_asym; ns = n_asym;
minw = a_minw; maxw = a_maxw;
break;
case 'b':
syms = g_bsym; ns = n_bsym;
minw = b_minw; maxw = b_maxw;
break;
case 'c':
syms = g_csym; ns = n_csym;
minw = c_minw; maxw = c_maxw;
break;
}
/* evaluate remaining available minw and maxw */
rminw = bpe->rminw[ip+1];
rmaxw = bpe->rmaxw[ip+1];
/* calculate the attributes for the symbols we have so far */
if (ip > 0) {
if (debug_H & g_dbg_side) {
bpe->sym[ip] = 0;
printf("attributes for [%s): ", bpe->sym);
}
}
atts = using_x ? 0 : AM_RHS;
/* no debug print for using_x because it's pretty obvious */
if (ip > 2) {
/* Three-symbol patterns */
if ( (base->sym[ip-1] == 'a')
&& (base->sym[ip-2] == 'c')
&& (base->sym[ip-3] == 'a') ) {
/* We have ...aca */
if (debug_H & g_dbg_side) {
printf(" (%c%c%c)", bpe->sym[ip-3], bpe->sym[ip-2], bpe->sym[ip-1]);
}
/* KxK rule is true if we do K * K with the same constant both times. */
if (bpe->sym[ip-1] == bpe->sym[ip-3]) {
if (bpe->sym[ip-2] == '*') {
atts |= AM_KxK;
if (debug_H & g_dbg_side) { printf(" K*K"); }
} else if (bpe->sym[ip-2] == '+') {
atts |= AM_KpK;
if (debug_H & g_dbg_side) { printf(" K+K"); }
}
}
}
}
if (ip > 1) {
/* two-symbol patterns */
/* currently, all two-symbol patterns are for (aa) forms */
if ((base->sym[ip-1] == 'a')
&& (base->sym[ip-2] == 'a')) {
if (debug_H & g_dbg_side) {
printf(" (%c%c)", bpe->sym[ip-2], bpe->sym[ip-1]);
}
/* KK rule is true if the same constant occurs twice in a row. */
if (bpe->sym[ip-1] == bpe->sym[ip-2]) {
atts |= AM_KK;
if (debug_H & g_dbg_side) { printf(" KK"); }
}
/* 55 rule is true if both constants are integers less than or equal
* to 5. */
if ((bpe->sym[ip-1] < '6')
&& (bpe->sym[ip-2] < '6')) {
atts |= AM_55;
if (debug_H & g_dbg_side) { printf(" 55"); }
}
/* 1K rule is true if first constant is 1 */
if (bpe->sym[ip-2] == '1') {
atts |= AM_1K;
if (debug_H & g_dbg_side) { printf(" 1K"); }
}
/* jK rule: true if smaller constant is followed by larger constant.
* For these purposes, the noninteger constants are considered
* larger than all the integers. (This can be changed if necessary
* by adding an indirection array defining symbol sequence, or by
* using the symbol weights as a sequencing measure -- but for now,
* it works fine this way.) */
if (bpe->sym[ip-1] > bpe->sym[ip-2]) {
atts |= AM_jK;
if (debug_H & g_dbg_side) { printf(" jK"); }
}
}
}
if (ip > 0) {
/* one-symbol patterns */
#if 0
if (debug_H & g_dbg_side) {
printf(" (%c)", bpe->sym[ip-1]);
}
if (bpe->sym[ip-1] == '1') {
atts |= AM_1;
if (debug_H & g_dbg_side) { printf(" 1"); }
} else if (bpe->sym[ip-1] == '2') {
atts |= AM_2;
if (debug_H & g_dbg_side) { printf(" 2"); }
} else if (bpe->sym[ip-1] == 'n') {
atts |= AM_n;
if (debug_H & g_dbg_side) { printf(" n"); }
} else if (bpe->sym[ip-1] == 'r') {
atts |= AM_r;
if (debug_H & g_dbg_side) { printf(" r"); }
} else if (bpe->sym[ip-1] == 'l') {
atts |= AM_l;
if (debug_H & g_dbg_side) { printf(" l"); }
} else if (bpe->sym[ip-1] == 'E') {
atts |= AM_E;
if (debug_H & g_dbg_side) { printf(" E"); }
} else if (bpe->sym[ip-1] == 'p') {
atts |= AM_p;
if (debug_H & g_dbg_side) { printf(" p"); }
} else if ((bpe->sym[ip-1] == 's')
|| (bpe->sym[ip-1] == 'q')) {
atts |= AM_sq;
if (debug_H & g_dbg_side) { printf(" sq"); }
}
if (debug_H & g_dbg_side) { printf("\n"); }
#else
atts |= sym_attrs[bpe->sym[ip-1]].amkey;
if (debug_H & g_dbg_side) {
printf(" (%c)", bpe->sym[ip-1]);
if (atts & AM_1) { printf(" 1"); }
if (atts & AM_2) { printf(" 2"); }
if (atts & AM_n) { printf(" n"); }
if (atts & AM_r) { printf(" r"); }
if (atts & AM_l) { printf(" l"); }
if (atts & AM_E) { printf(" E"); }
if (atts & AM_p) { printf(" p"); }
if (atts & AM_sq) { printf(" sq"); }
printf("\n");
}
#endif
}
bpe->elen = ip + 1;
if (debug_I & g_dbg_side) { printf("%d symbols to try.\n", ns); }
while (ns > 0) {
/* get next symbol and write it */
sym = *syms++;
if (debug_I & g_dbg_side) {
bpe->sym[ip] = 0;
printf("trying [%s) + '%c':\n", bpe->sym, sym);
}
bpe->sym[ip] = sym;
sym_attrs[sym].count += 1;
/* update complexity */
bpe->cplx = in_cpx + sym_attrs[sym].weight;
/* begin pruning */
recurse = 1;
/* Check to see if it is possible to get a full expression that is
within the global limits */
if (bpe->cplx + rminw > e_maxw) {
if (debug_J & g_dbg_side) {
printf("prune complexity {%d} + rminw[%d] {%d} > e_maxw {%d}\n",
bpe->cplx, ip+1, rminw, e_maxw);
}
recurse = 0;
} else if (bpe->cplx + rmaxw < e_minw) {
if (debug_J & g_dbg_side) {
printf("prune complexity {%d} + rmaxw[%d] {%d} < e_minw {%d}\n",
bpe->cplx, ip+1, rmaxw, e_minw);
}
recurse = 0;
}
/* does this symbol generate a stupid combination? */
else if (atts & sym_attrs[sym].mask) {
if (debug_K & g_dbg_side) {
printf("prune on symbol rules: ");
if (ip > 2) putchar(bpe->sym[ip-3]);
if (ip > 1) putchar(bpe->sym[ip-2]);
if (ip > 0) putchar(bpe->sym[ip-1]);
printf(":%c atts %x & mask %x != 0\n", sym, atts, sym_attrs[sym].mask);
}
recurse = 0;
}
/* LHS expressions always start with 'x' */
else if (using_x && (ip == 0) && (sym != 'x')) {
if (debug_K & g_dbg_side) {
printf("prune LHS must start with 'x'\n");
}
recurse = 0;
}
/* Check symbol count for this symbol */
else if (sym_attrs[sym].count > sym_attrs[sym].allowed) {
if (debug_L & g_dbg_side) {
printf("prune symcount[%c]\n", sym);
}
recurse = 0;
}
if (recurse) {
/* okay, it's all set to recurse */
n += ge_2(base, bpe, e_minw, e_maxw, ms); // this is in ge_2 (recursive call)
} else {
prune_count++;
}
/* undo the writing of this symbol. */
sym_attrs[sym].count -= 1;
ns--;
}
/* undo the damage */
bpe->elen = ip;
bpe->cplx = in_cpx;
/* we're done with the IR's of this opcode */
while(muc) { ms_undo(ms); muc--; }
return n;
}
u32 ge_1(form *base, s16 e_minw, s16 e_maxw)
{
pe bpe;
metastack ms;
s16 blen;
u32 n;
s16 i, rminw, rmaxw;
symbol sym;
u32 start_ins_count;
blen = base->flen;
/* for now, just report it */
(*base).sym[blen] = 0;
if (debug_t) {
printf("ge_1 on form {%s}\n", (char *) (&((*base).sym[0])));
printf("%3d >= %3d <= {%s} <= %3d >= %3d\n", e_maxw, base->min_weight,
(char *) (&((*base).sym[0])), base->max_weight, e_minw);
}
n = 0;
/* set up the pe struct */
bpe.cplx = 0;
bpe.elen = 0;
/* calculate and fill in the rmw fields */
rminw = rmaxw = 0;
i=blen;
bpe.rminw[i] = rminw;
bpe.rmaxw[i] = rmaxw;
while(i>0) {
i--;
sym = (*base).sym[i];
switch(sym) {
case 'a': rminw += a_minw; rmaxw += a_maxw; break;
case 'b': rminw += b_minw; rmaxw += b_maxw; break;
case 'c': rminw += c_minw; rmaxw += c_maxw; break;
}
if (debug_u) {
printf("rminw[%d] = %d, rmaxw[%d] = %d.\n", i, rminw, i,rmaxw);
}
bpe.rminw[i] = rminw;
bpe.rmaxw[i] = rmaxw;
}
/* set up the metastack */
ms_init(&ms);
/* generate! */
start_ins_count = insert_count;
n = ge_2(base, &bpe, e_minw, e_maxw, &ms); // this is in ge_1
if(debug_v) {
printf("form %s generated %ld expressions", base->sym, n);
if (insert_count > start_ins_count) {
printf(" and inserted %ld", insert_count - start_ins_count);
}
printf(".\n");
}
return n;
}
/* generate forms by simple recursive algorithm.
------base------ ------next------
flen sym-- stack flen sym-- stack Comments
0 - 0 Initial call
0 - 0 1 a 1 Setting up call to myself
1 a 1 Entering recursive invocation
1 a 1 2 aa 2 Setting up call to myself
. . . (. . .)
1 a 1 Entering recursive invocation
1 a 1 2 ab 1 Setting up another call to myself
*/
u32 gf_1(form *base, s16 e_minw, s16 e_maxw)
{
s16 blen;
form next;
s16 i, recurse_forms, gen_expr;
symbol s, slim;
u32 n;
n = 0;
/* copy base form */
blen = base->flen;
for(i=0; istack + ('b' - s); /* 1 for a, 0 for b, -1 for c */
if (s == 'a') {
next.min_weight = base->min_weight + a_minw;
next.max_weight = base->max_weight + a_maxw;
} else if (s == 'b') {
next.min_weight = base->min_weight + b_minw;
next.max_weight = base->max_weight + b_maxw;
} else {
next.min_weight = base->min_weight + c_minw;
next.max_weight = base->max_weight + c_maxw;
}
/* check form symtax */
recurse_forms = 1;
if (next.stack < 1) {
/* stack can't be zero or underflow */
if (debug_w) {
printf("gf_1 prune [%s) stack would underflow\n", next.sym);
}
recurse_forms = 0;
} else if (next.min_weight + s_minw > e_maxw) {
/* adding this symbol would make a form that can't possibly generate
any expressions within the given complexity range. */
if (debug_w) {
printf("gf_1 prune [%s) complexity\n", next.sym);
}
recurse_forms = 0;
} else if (blen >= MAX_ELEN) {
/* limits expression length to our physical allocation size. If this
actually triggers at runtime, it implies that the weights are too
spread out or that MAX_ELEN is just too darn small. */
if (debug_w) {
printf("gf_1 prune [%s) length\n", next.sym);
}
recurse_forms = 0;
} else if ((next.flen + next.stack) >= MAX_ELEN + 1) {
/* in this case it would have no way of getting the stack down to one
item before exceeding MAX_ELEN */
if (debug_w) {
printf("gf_1 prune [%s) stack too high\n", next.sym);
}
recurse_forms = 0;
}
/* check viability for expressions */
gen_expr = 0;
if (next.stack == 1) {
gen_expr = 1;
if (next.min_weight > e_maxw) {
if (debug_w) {
printf("gf_1 [%s] min weight too big for expressions\n", next.sym);
}
gen_expr = 0;
} else if (next.max_weight < e_minw) {
if (debug_w) {
printf("gf_1 [%s] max weight too small for expressions\n", next.sym);
}
gen_expr = 0;
}
}
if (gen_expr) {
if (debug_w) {
printf("gf_1 generating expressions on form [%s]\n", next.sym);
}
g_ne += ge_1(&next, e_minw, e_maxw); // this is in gf_1
}
/* recurse, if appropriate */
if (recurse_forms) {
n += gf_1(&next, e_minw, e_maxw); // this in gf_1 (recursive call)
}
/* count the leaf nodes... leaf. huh-huh. heh huh heh-heh. */
if (gen_expr) {
n++;
}
}
return(n);
}
/* generate forms on-the-fly, given a minimum and maxmum complexity score.
* It will generate all expressions with valid forms that lie within the
* complexity limits. */
u32 gen_forms(s16 e_minw, s16 e_maxw)
{
form base;
u32 n;
n = 0;
base.flen = 0;
base.stack = 0;
base.min_weight = 0;
base.max_weight = 0;
n = gf_1(&base, e_minw, e_maxw);
return(n);
}
/* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Routines for initializing the data structures and variables used for the
search.
*/
void def_amkey(char * syms, s16 mask)
{
while(*syms) {
sym_attrs[*syms].amkey |= mask;
syms++;
}
}
void define_amkeys(void)
{
def_amkey("1", AM_1);
def_amkey("2", AM_2);
def_amkey("n", AM_n);
def_amkey("r", AM_r);
def_amkey("sq", AM_sq);
def_amkey("l", AM_l);
def_amkey("E", AM_E);
def_amkey("p", AM_p);
}
void add_symbol(symbol sym, symbol seft, s16 weight, char * forth,
char * infix, char * def_terse, char * def_normal, char * description)
{
if (output_format == OF_NORMAL) {
sym_attrs[sym].defn = def_normal;
} else {
sym_attrs[sym].defn = def_terse;
}
if (sym_attrs[sym].defn) {
if(strlen(sym_attrs[sym].defn) > LINELEFT_INIT) {
fprintf(stderr, "Symbol definition for '%c' too long:\n%s\n", sym,
sym_attrs[sym].defn);
}
}
sym_attrs[sym].desc = description;
if(forth) {
sym_attrs[sym].forth = forth;
}
if (infix) {
symbol_names[sym] = infix;
}
if (IS_PHANTOM(sym)) {
sym_attrs[sym].seft = seft;
return;
}
/* All symbols have a base complexity of 10 points, plus the individual
per-symbol weight */
weight += 10;
sym_attrs[sym].seft = seft;
sym_attrs[sym].weight = weight;
sym_attrs[sym].mask = 0;
if (sym_attrs[sym].allowed == 0) {
/* this is to leave the weights unchanged */
return;
}
if (seft == 'a') {
if (n_asym >= MAX_SEFT_POP) {
fprintf(stderr, "add_symbol: Too many seft 'a' symbols.\n");
exit(-1);
}
g_asym[n_asym++] = sym;
if (n_asym == 1) {
a_minw = weight;
a_maxw = weight;
} else {
if (weight < a_minw) {
a_minw = weight;
} else if (weight > a_maxw) {
a_maxw = weight;
}
}
} else if (seft == 'b') {
if (n_bsym >= MAX_SEFT_POP) {
fprintf(stderr, "add_symbol: Too many seft 'b' symbols.\n");
exit(-1);
}
g_bsym[n_bsym++] = sym;
if (n_bsym == 1) {
b_minw = weight;
b_maxw = weight;
} else {
if (weight < b_minw) {
b_minw = weight;
} else if (weight > b_maxw) {
b_maxw = weight;
}
}
} else if (seft == 'c') {
if (n_csym >= MAX_SEFT_POP) {
fprintf(stderr, "add_symbol: Too many seft 'c' symbols.\n");
exit(-1);
}
g_csym[n_csym++] = sym;
if (n_csym == 1) {
c_minw = weight;
c_maxw = weight;
} else {
if (weight < c_minw) {
c_minw = weight;
} else if (weight > c_maxw) {
c_maxw = weight;
}
}
}
}
char * seft_names[3] = {
"Constants",
"Functions of one variable",
"Functions of two variables",
};
/* Show the set of symbols that is defined, along with seft, weight,
definition, etc. */
void show_symset(void)
{
int i, seft;
char *def;
for(seft = 'a'; seft <= 'c'; seft++) {
printf("%s:\n", seft_names[seft-'a']);
printf(" sym seft weight name description\n");
for (i=0; i 1.0) {
i -= 1.0;
/* 1.0 + 1/i * (k_e) */
k_e = 1.0 + k_e / i;
}
if (debug_z) {
printf("init_num: k_e %.17g chk %.17g diff %g\n",
k_e, check_e, check_e - k_e);
}
}
{
double check_pi = 3.14159265358979323846264338327950288419716;
double i;
/* This pi algorithm is based on the infinite sum attributed to
Isaac Newton:
pi/2 = SIGMA_(k=0..inf) [ k! / (2k+1)!! ]
= 1 + 1/3 + 2/5*3 + 3*2/7*5*3 + 4*3*2/9*7*5*3 + ...
= 1 + 1/3 * (1 + 2/5 * (1 + 3/7 * (1 + 4/9 * (1 + ...))))
which can be readily unrolled into the loop shown here. It is slow
(requiring lots of divisions by distinct primes) but surpasses
faster-converging methods like Gauss-Legendre and Borwein-Borwein
in that it produces the most precise possible answer in IEEE binary64.
*/
i = 50.0; /* For IEEE binary64 */
k_pi = 1.57;
while(i > 1.0) {
i -= 1.0;
k_pi = 1.0 + (i / (1.0 + (2.0 * i))) * k_pi;
}
k_pi = 2.0 * k_pi;
if (debug_z) {
printf("init_num: k_pi %.17g chk %.17g diff %g\n",
k_pi, check_pi, check_pi - k_pi);
}
/* If the trig scale hasn't been set, use the default value */
if (k_sincos_arg_scale <= 0) {
k_sincos_arg_scale = k_pi;
}
k_2pi = 2.0 * k_pi;
}
}
/* init1() sets defaults (anything that can be overridden or changed by
command-line arguments) */
void init1()
{
s16 i;
inittime();
g_enable_output = 1;
for(i=0; i<", 0, 0, 0);
add_symbol(PS_REVPOW, 'c', 0, 0, "!^", 0, 0, 0);
/* These are used for infix formatting */
symbol_names['.'] = " ";
symbol_names['('] = "(";
symbol_names[')'] = ")";
/* Report error and abort if there is only one type A symbol
(namely X, which is always included) */
if (n_asym < 2) {
fprintf(stderr,
"%s: You must allow at least one constant symbol.\n",
g_argv0);
exit(1);
}
/* Add "identity" operator if there are no type B symbols */
if (n_bsym == 0) {
sym_attrs['I'].allowed = MAX_ELEN;
add_symbol('I', 'b', 10, "nop", "I", "I = identity", 0, "identity");
}
/* If there are no type C symbols, gf_1 will notice n_csym and
will skip generating forms with 'c's */
define_amkeys();
/* operators have masks for pruning (optimization). These are commented
* with reasons, given as "transformation" equations. A transformation
* looks like this: [KK*] => [Ks], and represents two sequences of operators
* that have the same value or have the same effect in an expression. The
* form to the left of the '=>' is the form being eliminated by the rule,
* and the form to the right is shown to demonstrate why the form on the
* left is eliminated. When the transformation has a shorter form on the
* right than on the left, as with [1r] => [1], the reason for the rule
* is obvious. In other cases it is important to eliminate only the form
* that has a higher complexity score. If the scores are equal, such as
* [rn] => [nr], we pick one that interacts favorably with other rules.
* The form on the right is said to be "forced", meaning that any expressions
* involving this type of calculation are "forced" to do it in the right-hand
* form.
* You must not "force" a form that is also eliminated by another rule!
* For example, the following two rules are OK by themselves, but together
* cause a problem:
* [sr] => [rs]
* [rs] => [sr]
* However, you can also "force" a form that is also "forced" into another
* form by another rule. The following two rules exemplify this:
* [rn] => [nr]
* [rq] => [qr]
* These two rules, combined, will cause the combinations of 'r', 'n', and
* 'q' to be reduced from six {[rnq], [rqn], [nrq], [nqr], [qrn], [qnr]} to
* two {[nqr], [qnr]}
* Each rule has a symbolset which must be present in order for that
* rule to be used.
* symset sym mask mval */
add_rule("", 'x', AM_RHS);
add_rule("nr", 'n', AM_r); /* [rn] => [nr] */
add_rule("", 'n', AM_n); /* [nn] => [] */
add_rule("1", 'r', AM_1); /* [1r] => [1] */
add_rule("", 'r', AM_r); /* [rr] => [] */
add_rule("1", 's', AM_1); /* [1s] => [1] */
add_rule("4", 's', AM_2); /* [2s] => [4] */
add_rule("s", 's', AM_n); /* [ns] => [s] */
add_rule("rs", 's', AM_r); /* [rs] => [sr] */
add_rule("4^", 's', AM_sq); /* [qs] => []; [ss] => [4^] */
add_rule("1", 'q', AM_1); /* [1q] => [1] */
add_rule("rq", 'q', AM_r); /* [rq] => [qr] */
add_rule("4v", 'q', AM_sq); /* [sq] => []; [qq] => [4v] */
add_rule("", 'l', AM_1); /* [1l] => 0 */
add_rule("ln", 'l', AM_r); /* [rl] => [ln] */
add_rule("", 'l', AM_E); /* [El] => [] */
add_rule("", 'E', AM_l); /* [lE] => [] */
add_rule("Er", 'E', AM_n); /* [nE] => [Er] */
add_rule("Sn", 'S', AM_n); /* [nS] => [Sn] */
add_rule("C", 'C', AM_n); /* [nC] => [C] */
/* The operators are not included in their own require-symset string
unless they are also used in the target of the forced transformation.
This is important with the -O option. For example, if they specify
-O+, the '+' rules will have the effect of "saving" the '+' for
"a more important", i.e. irreducible, use. Of course, it doesn't
actually prevent solutions from being found, it just makes them get
found sooner. */
add_rule("2*", '+', AM_KK); /* [KK+] => [K2*] */
add_rule("*23456789", /* If our integers maxed out at an even number we
wouldn't need the "*" here */
'+', AM_55); /* [25+]=>[7]; [55+]=>[52*] */
add_rule("+", '+', AM_jK); /* [jK+] => [Kj+] */
add_rule("-", '+', AM_n); /* [n+] => [-] */
add_rule("", '-', AM_KK); /* [KK-] => 0 */
add_rule("1234n",'-', AM_55); /* [JK-] => [L] or [Ln] */
add_rule("12345678n",
'-', AM_jK); /* [35-] => [2n] */
add_rule("+", '-', AM_n); /* [n-] => [+] */
add_rule("s", '*', AM_KK); /* [KK*] => [Ks] */
add_rule("*", '*', AM_jK); /* [jK*] => [Kj*] */
add_rule("*", '*', AM_1); /* [1*] => [] */
add_rule("*n", '*', AM_n); /* [n*] => [*n] */
add_rule("/", '*', AM_r); /* [r*] => [/] */
add_rule("1", '/', AM_KK); /* [KK/] => [1] */
add_rule("r", '/', AM_1K); /* [1K/] => [Kr] */
add_rule("", '/', AM_1); /* [1/] => [] */
add_rule("/n", '/', AM_n); /* [n/] => [/n] */
add_rule("*", '/', AM_r); /* [r/] => [*] */
add_rule("", '^', AM_1); /* [1^] => [] */
add_rule("s", '^', AM_2); /* [2^] => [s] */
add_rule("^r", '^', AM_n); /* [n^] => [^r] */
add_rule("1", '^', AM_1K); /* [1K^] => [1] */
add_rule("v", '^', AM_r); /* [r^] => [v] */
add_rule("", 'v', AM_1); /* [1v] => [] */
add_rule("q", 'v', AM_2); /* [2v] => [q] */
add_rule("vr", 'v', AM_n); /* [nv] => [vr] */
add_rule("1", 'v', AM_1K); /* [1Kv] => [1] */
add_rule("^", 'v', AM_r); /* [rv] => [^] */
add_rule("1", 'L', AM_KK); /* [KKL] => [1] */
add_rule("", 'L', AM_1); /* [1L] => undefined */
add_rule("Ln", 'L', AM_r); /* [rL] => [Ln] */
add_rule("", 'L', AM_1K); /* [1KL] => 0 */
/* Added on 20070511 */
add_rule("", 'S', AM_p); /* [pS] => 0 */
/* Added on 20090513 */
add_rule("", '/', AM_KxK);/* [K*K/] -> [] */
add_rule("1n", 'C', AM_p); /* [pC] => [1n] */
/* Added on 20111230 */
add_rule("", '-', AM_KpK);/* [K+K-] -> [] */
s_minw = a_minw;
if (b_minw < s_minw) {
s_minw = b_minw;
}
if (c_minw < s_minw) {
s_minw = c_minw;
}
/* This init is for ge_2. */
for(i=0; i= 0) {
/* Get the next argument at the present stacklevel */
pa_get_arg(); // Sets pa_this_arg
if (0) {
} else if (pa_this_arg == 0) {
/* This happens is pa_get_arg has run out of args or if it gets an
arg that is a null string. Either case means we should pop */
pa_stk_pop();
} else if (strcmp(pa_this_arg, "--ries-arguments-end") == 0) {
/* Ignore any more arguments at this level */
pa_stk_pop();
} else if ((strncmp(pa_this_arg, "-p", 2) == 0)
|| (strcmp(pa_this_arg, "--include") == 0)) {
/* load Parameters (or "Profile") from a file */
if (strcmp(pa_this_arg, "--include") == 0) {
pa_get_arg();
if (pa_this_arg == 0) {
fprintf(stderr,
"%s: --include requires a filename, e.g. '--include trig.ries'\n"
"\n\n",
g_argv0);
brief_help();
exit(-1);
}
} else {
pa_this_arg += 2; /* skip the "-p" */
}
if (*pa_this_arg) {
char * filebuf;
int n;
char * * av;
filebuf = file_read(pa_this_arg);
delimit_args(filebuf, &n, &av);
if (n) {
if ((pa_sp+1) < MAX_FILE_DEPTH) {
/* Push new set of args onto the stack */
pa_sp++;
stk_nargs[pa_sp] = n;
stk_argv[pa_sp] = av;
} else {
fprintf(stderr,
"%s: -p parameters nested too deep (max %d levels).\n"
"\n\n", g_argv0, MAX_FILE_DEPTH);
brief_help();
exit(-1);
}
} else {
/* File had no tokens; we could complain but we let it pass. */
}
} else {
fprintf(stderr,
"%s: -p parameter requires a filename, e.g. '-ptrig.ries'\n"
"\n\n", g_argv0);
brief_help();
exit(-1);
}
/* First we check the "--foo-bar VAL" type options, in which the
"opcode" and its "arguments" are each separate elements of argv[].
These are used for special or rarely-used commands, like the
command that gives an expression's complexity score */
} else if (strcmp(pa_this_arg, "--canon-reduction") == 0) {
if (pa_next_isparam()) {
/* Set which types of canonval reduction to use. */
char * s;
s = pa_get_arg();
if (s) {
g_canon_ops = 0;
while(*s) {
switch(*s) {
case 'n': ; g_canon_ops |= CANONVAL_NEGATE; break;
case 'r': ; g_canon_ops |= CANONVAL_RECIPROCAL; break;
case '2': ; g_canon_ops |= CANONVAL_MUL2; break;
case '5': ; g_canon_ops |= CANONVAL_DIV2; break;
default: ; break;
}
s++;
}
}
} else {
fprintf(stderr,
"%s: --canon-reduction requires a set of reduction types, "
"e.g. 'nr25'\n"
"\n\n", g_argv0);
brief_help();
exit(-1);
}
} else if (strcmp(pa_this_arg, "--find-expression") == 0) {
if (pa_next_isparam()) {
/* This is used to scan for one of more expression(s) and print
out their stats, in whatever order the happen to be found. This
is an easier-to-use replacement for the -DGg option filtered
through grep. */
while(pa_next_isparam()) {
g_find_expr[g_num_find_expr] = pa_get_arg();
g_num_find_expr++;
}
g_enable_output = 0;
} else {
fprintf(stderr,
"%s: --find-expression should be followed by compact "
"postfix expression(s).\n"
"\n\n", g_argv0);
brief_help();
exit(-1);
}
} else if (strcmp(pa_this_arg, "--eval-expression") == 0) {
if (pa_next_isparam()) {
/* This is used to check FORTH expression syntax; it is a complement
to --find-expression and some of the other command options
like -F0 and -DGg */
while(pa_next_isparam()) {
g_find_expr[g_num_find_expr] = pa_get_arg();
g_num_find_expr++;
}
g_enable_output = 0;
g_eval_expr = 1;
} else {
fprintf(stderr,
"%s: --eval-expression should be followed by compact "
"postfix expression(s).\n"
"\n\n", g_argv0);
brief_help();
exit(-1);
}
} else if (strcmp(pa_this_arg, "--min-match-distance") == 0) {
double t;
pa_get_arg();
if (pa_this_arg && sscanf(pa_this_arg, "%lg", &t)) {
g_min_matchsize = fabs(t);
printf("Using minimum match distance: %g\n", g_min_matchsize);
} else {
fprintf(stderr,
"%s: --min-match-distance should be followed by a numeric "
"argument.\n"
"\n\n", g_argv0);
brief_help();
exit(-1);
}
} else if (strcmp(pa_this_arg, "--min-memory") == 0) {
double t;
pa_get_arg();
if (pa_this_arg && sscanf(pa_this_arg, "%lg", &t)) {
g_min_memory = fabs(t);
printf("Memory-hogging safeguard disabled for the first %g bytes.\n",
g_min_memory);
} else {
fprintf(stderr,
"%s: --min-memory should be followed by a numeric "
"argument.\n"
"\n\n", g_argv0);
brief_help();
exit(-1);
}
} else if (strcmp(pa_this_arg, "--max-memory") == 0) {
double t;
pa_get_arg();
if (pa_this_arg && sscanf(pa_this_arg, "%lg", &t)) {
g_max_memory = fabs(t);
printf("Will not use more than %g bytes of memory.\n",
g_max_memory);
} else {
fprintf(stderr,
"%s: --max-memory should be followed by a numeric "
"argument.\n"
"\n\n", g_argv0);
brief_help();
exit(-1);
}
} else if (strcmp(pa_this_arg, "--memory-abort-threshold") == 0) {
double t;
pa_get_arg();
if (pa_this_arg && sscanf(pa_this_arg, "%lg", &t)) {
if (t > 1.0) {
g_mem_bad_ratio = t;
printf("Memory slowness abort ratio: %g.\n",
g_mem_bad_ratio);
} else {
fprintf(stderr,
"%s: --memory-abort-threshold should be at least 1.0,"
" and values less than about 1.5 are unlikely to be of"
" much use.\n"
"\n\n", g_argv0);
brief_help();
exit(-1);
}
} else {
fprintf(stderr,
"%s: --memory-abort-threshold should be followed by a numeric\n"
"argument larger than 1.0 (larger than 1.5 is recommended).\n"
"\n\n", g_argv0);
brief_help();
exit(-1);
}
} else if (strcmp(pa_this_arg, "--significance-loss-margin") == 0) {
double t;
pa_get_arg();
if (pa_this_arg && sscanf(pa_this_arg, "%lg", &t)) {
if ((t >= 0.0) && (t < 100.0)) {
t = pow(10.0, -t);
k_sig_loss = t;
printf("Using significance loss margin: %g\n", k_sig_loss);
} else {
fprintf(stderr,
"%s: --significance-loss-margin must be between 0.0 "
"and 100.0\n"
"\n\n", g_argv0);
brief_help();
exit(-1);
}
} else {
fprintf(stderr,
"%s: --significance-loss-margin should be followed by a numeric "
"argument.\n"
"\n\n", g_argv0);
brief_help();
exit(-1);
}
} else if (strcmp(pa_this_arg, "--trig-argument-scale") == 0) {
double t;
pa_get_arg();
if (pa_this_arg && sscanf(pa_this_arg, "%lg", &t)) {
if ((t >= 0.0) && (t < 100.0)) {
k_sincos_arg_scale = t;
printf("Argument of trig functions will be scaled by ");
printf(fmt_g_usable, k_sincos_arg_scale);
printf("\n");
} else {
fprintf(stderr,
"%s: --trig-argument-scale must be between 0.0 "
"and 100.0\n"
"\n\n", g_argv0);
brief_help();
exit(-1);
}
} else {
fprintf(stderr,
"%s: --trig-argument-scale should be followed by a numeric "
"argument.\n"
"\n\n", g_argv0);
brief_help();
exit(-1);
}
} else if (strcmp(pa_this_arg, "--version") == 0) {
show_version();
exit(0);
/* Single-character arguments:
-0 to -9 target number
-D Debug
-F Format
-i integer
-l level
-p parameters (parsed above)
-N Not these symbols
-O Once-only symbols
-S Symbolset
-x x: Show X, not T+epsilon
*/
/* Next we check the "-xN" type options, in which the "opcode" is a
single letter and its "arguments" follow it without a space in
between. These are used for the options you'll commonly want to
give when invoking ries to solve a problem. */
} else if (strncmp(pa_this_arg, "-l", 2) == 0) {
/* they gave a level */
pa_this_arg += 2; /* skip the "-l" */
nv = sscanf(pa_this_arg, "%lf", &g_levadj);
if (nv) {
tlevel = DEFAULT_LEV_BASE + g_levadj;
} else {
fprintf(stderr,
"%s: -l parameter requires a number, e.g. '-l3'\n"
"\n\n", g_argv0);
brief_help();
exit(-1);
}
} else if (strncmp(pa_this_arg, "-N", 2) == 0) {
/* Not these symbols */
symbol sym;
pa_this_arg += 2; /* skip the "-N" */
while(*pa_this_arg) {
sym = (symbol) (*pa_this_arg);
pa_this_arg++;
sym_attrs[sym].no = 1;
}
} else if (strncmp(pa_this_arg, "-S", 2) == 0) {
/* Only these symbols */
pa_this_arg += 2; /* skip the "-S" */
if (*pa_this_arg == 0) {
/* Without arguments, show the symbols in use and their definitions */
g_show_ss = 1;
} else while(*pa_this_arg) {
symbol sym;
S_option = 1;
sym = (symbol) (*pa_this_arg);
pa_this_arg++;
sym_attrs[sym].yes = 1;
}
} else if (strncmp(pa_this_arg, "-O", 2) == 0) {
/* Once-only symbols */
symbol sym;
pa_this_arg += 2; /* skip the "-O" */
while(*pa_this_arg) {
sym = (symbol) (*pa_this_arg);
pa_this_arg++;
sym_attrs[sym].once = 1;
}
} else if (strncmp(pa_this_arg, "-D", 2) == 0) {
/* Debugging options */
char d;
pa_this_arg += 2; /* skip the "-D" */
while(d = *pa_this_arg) {
switch(d) {
case 'a': debug_A |= DBG_RHS; break;
case 'b': debug_B |= DBG_RHS; break;
case 'c': debug_C |= DBG_RHS; break;
case 'd': debug_D |= DBG_RHS; break;
case 'e': debug_E |= DBG_RHS; break;
case 'f': debug_F |= DBG_RHS; break;
case 'g': debug_G |= DBG_RHS; break;
case 'h': debug_H |= DBG_RHS; break;
case 'i': debug_I |= DBG_RHS; break;
case 'j': debug_J |= DBG_RHS; break;
case 'k': debug_K |= DBG_RHS; break;
case 'l': debug_L |= DBG_RHS; break;
case 'm': debug_m = 1; break;
case 'M': debug_M = 1; break;
case 'n': debug_n = 1; break;
case 'o': debug_o = 1; break;
case 'p': debug_p = 1; break;
case 'q': debug_q = 1; break;
case 'r': debug_r = 1; break;
case 's': debug_s = 1; break;
case 't': debug_t = 1; break;
case 'u': debug_u = 1; break;
case 'v': debug_v = 1; break;
case 'w': debug_w = 1; break;
case 'x': debug_x = 1; break;
case 'y': debug_y = 1; break;
case 'z': debug_z = 1; break;
case 'A': debug_A |= DBG_LHS; break;
case 'B': debug_B |= DBG_LHS; break;
case 'C': debug_C |= DBG_LHS; break;
case 'D': debug_D |= DBG_LHS; break;
case 'E': debug_E |= DBG_LHS; break;
case 'F': debug_F |= DBG_LHS; break;
case 'G': debug_G |= DBG_LHS; break;
case 'H': debug_H |= DBG_LHS; break;
case 'I': debug_I |= DBG_LHS; break;
case 'J': debug_J |= DBG_LHS; break;
case 'K': debug_K |= DBG_LHS; break;
case 'L': debug_L |= DBG_LHS; break;
case '0': debug_0 = 1; break;
}
pa_this_arg++;
}
} else if (strncmp(pa_this_arg, "-i", 2) == 0) {
/* Integer subexpressions */
int_subexpr = 1;
if (pa_this_arg[2] == 'e') {
only_exact = 1;
}
} else if (strcmp(pa_this_arg, "-x") == 0) {
/* Show values of x rather than "x = T + epsilon" */
relative_x = 0;
} else if (strncmp(pa_this_arg, "-F", 2) == 0) {
/* Select expression display format */
pa_this_arg += 2; /* skip the "-F" */
if (*pa_this_arg == 0) {
output_format = OF_FORTH; /* default */
} else {
nv = sscanf(pa_this_arg, "%d", &output_format);
if (nv == 0) {
fprintf(stderr,
"%s: -F parameter requires a number, e.g. '-F0'\n"
"\n\n", g_argv0);
brief_help();
exit(-1);
}
}
}
/* test for number must be last, because it might have a leading '-' */
else if (((pa_this_arg[0] >= '0') && (pa_this_arg[0] <= '9'))
|| (pa_this_arg[0] == '-') || (pa_this_arg[0] == '.')) {
/* This would be the target number */
nv = sscanf(pa_this_arg, "%lf", &target);
if (nv == 1) {
g_got_target = 1;
} else {
fprintf(stderr, "%s: Unknown option '%s'\n"
"\n\n", g_argv0, pa_this_arg);
brief_help();
exit(-1);
}
} else {
fprintf(stderr, "%s: Unknown option '%s'\n"
"\n\n", g_argv0, pa_this_arg);
brief_help();
exit(-1);
}
}
/* At this point pa_sp <= 0 and stk_nargs[pa_sp] == 0, so we're out of
args */
}
int main(int nargs, char *argv[])
{
s16 j;
u32 genf, gent;
u32 searchmax;
s16 timeout;
g_argv0 = argv[0]; /* Used in various sudden-death fprintf's */
init1();
g_got_target = 0;
/* parse arguments */
if (nargs > 1) {
parse_args(nargs, argv);
} else {
fprintf(stderr, "%s: Please specify a target number.\n"
"\n\n", g_argv0);
brief_help();
exit(1);
}
/* init the evaluation system */
init2();
/* Execute the '-S' command */
if (g_show_ss) {
show_symset();
/* We exit because the main purpose of a bare -S is to learn the weights.
However the user may have expected more, so we explain the necessary
-S syntax. */
printf("%s: %s", g_argv0,
"Exiting now (to do an equation search, omit '-S' or include\n"
"symbol names, for example: 'ries 3.14159 -S123456789+/')\n");
exit(0);
}
/* Execute the --eval-expression option */
if (g_eval_expr && g_num_find_expr) {
s16 i, err, sp, contains_x;
double x, dx;
symbol * expr;
exec_x = target;
for(i=0; i 0) {
/* The other clients of eval() always have complete expressions,
so eval() does not check for this error */
printf("Error: incomplete expression (%d item(s) remain on stack)\n",
sp);
} else {
printf("[%s] = ", expr);
printf(fmt_g_usable, x);
printf("; d/dx = ");
printf(fmt_g_usable, dx);
printf(", complexity = {%d}\n", complexity(expr));
}
printf("\n");
}
exit(0);
}
if (g_got_target == 0) {
fprintf(stderr, "%s: Please specify a target number.\n"
"\n\n", g_argv0);
brief_help();
exit(1);
}
/* -i option is of no use when X isn't an integer, because X is a
* subexpression of all LHS's. */
if (int_subexpr && (target != floor(target))) {
if (g_enable_output) {
printf("ries: Ignoring -i because target isn't an integer.\n");
}
int_subexpr = 0;
only_exact = 0;
}
if (g_enable_output) {
printf("\n");
printf(" Your target value: T = ");
printf(fmt_g_usa_fixed, target);
printf("%s", " www.mrob.com/ries\n\n");
}
best_match = fabs(target * k_matchstart);
if (debug_q) {
printf("Initial match threshold = %g\n", best_match);
}
/* Adjust k_min_best_match to accomodate precision and large targets.
%%% For now we only do this for large-magnitude targets, not for targets
close to 0. I want to look more closely at how I handle SIG_LOSS errors
and how that affects operation when the target is close to zero. */
if (fabs(target) > 1.0) {
k_min_best_match = fabs(target) * 8.0 * k_precision_ulp;
if (debug_z) {
printf("Large target, setting kmbm=%g\n", k_min_best_match);
}
}
exec_x = target;
/* printing symbol weight limits for debugging */
if (debug_z) {
printf("Symbol weight ranges:\n");
printf(" a_minw %d, a_maxw %d\n", a_minw, a_maxw);
printf(" b_minw %d, b_maxw %d\n", b_minw, b_maxw);
printf(" c_minw %d, c_maxw %d\n", c_minw, c_maxw);
}
/* calculate search cutoff. This number is the total number of
* expressions generated. Note that the total number of possible
* combinations between LHS and RHS is much more: If N expressions
* are generated and equally divided between LHS and RHS, the total
* number of combinations is N^2/4. Thus, we get 4 times as many
* combinations each time we double this number.
* The user interface says that each level will produce a 10-times
* greater search. This means checking 10 times as many combinations.
* however, as the number of combinations rises they also get more
* spread out, because more very large and very small numbers are
* getting generated. Because of this spreading out, LHS and RHS
* expressions become slightly less likely to be near each other as
* the number of combinations goes up. Therefore, in order to get 10
* times more "relevant" combinations, we multiply by factors of 4,
* since 4^2 is 16, somewhat more than 10. */
searchmax = 2000L;
if (tlevel >= -2.0) {
double t;
t = ((double) searchmax) * pow(4.0, tlevel);
if (t > ((double) LONG_MAX)) {
searchmax = LONG_MAX;
} else {
searchmax = (long) t;
}
}
gen_total = 0L;
if (debug_y) {
printf("Will stop after generating searchmax=%ld expressions\n",
searchmax);
}
/* generate and print solutions */
lmin = rmin = 1;
lmax = rmax = PASS_GRAN;
rhs_gen = lhs_gen = 0;
rhs_insert = lhs_insert = 0;
rhs_prune = lhs_prune = 0;
gent = 0; timeout = 0;
got_exact = 0;
while (gen_total < searchmax) {
/* We increase the complexity limit of whichever side has generated
the least number of expressions so far. This is to ensure that we
stay near the optimal point where the search time is O(sqrt(K^N)).
If we let one side outnumber the other by a lot, the search gets
to be more like O(K^N), and we don't want that (-:
The really great thing about this method (testing the counts
at runtime) is that it adapts gracefully to changes in the symbol
set, which of course can be specified on the command line. For
example, if they significantly limit the number of constants,
the LHS will grow at a faster rate than the RHS because its one
additional type-a symbol ('x') becomes a significant factor in how
many expressions there are of each complexity.
Since LHS and RHS are stored in the same struct type, there is no
memory savings from storing fewer LHS's, however there might be
a bit of speed cost from evaluating the derivatives.
storing them. Some kind of benchmarking would be in order. */
g_ne = 0; insert_count = 0; prune_count = 0;
if (lhs_insert > rhs_insert) {
rmin += PASS_GRAN; rmax += PASS_GRAN;
/* Generate RHS expressions */
using_x = 0;
g_dbg_side = DBG_RHS;
genf = gen_forms(rmin, rmax);
if (debug_y) {
printf("finished RHS from {%d} to {%d}: %ld forms, %ld expressions.\n",
rmin, rmax, genf, g_ne);
}
rhs_gen += g_ne;
rhs_insert += insert_count;
rhs_prune += prune_count;
} else {
lmin += PASS_GRAN; lmax += PASS_GRAN;
/* Generate LHS expressions */
using_x = 1;
g_dbg_side = DBG_LHS;
genf = gen_forms(lmin, lmax);
if (debug_y) {
printf("finished LHS from {%d} to {%d}: %ld forms, %ld expressions.\n",
lmin, lmax, genf, g_ne);
printf("\n");
}
lhs_gen += g_ne;
lhs_insert += insert_count;
lhs_prune += prune_count;
}
gent += genf;
gen_total += g_ne;
/* Test the number of new expressions found. Reset the timeout if we
found new expressions. If we go too many passes and find nothing, the
search will terminate.
This test prevents an infinite loop in the case where the
searchmax is so big (or the symbolset so small) that all possible
expressions of size MAX_ELEN are generated becore searchmax is
reached. (If this actually happens, it probebly means MAX_ELEN should
be increased.) */
if (g_ne > 0) {
timeout = 0;
} else {
timeout++;
}
/* %%% with -i and/or -S it's not very reliable. Instead of timing
* out we should use an explicit test: is complexity minimum
* larger than max complexity possible in length MAX_ELEN? */
if (timeout > 40) {
printf("exhaustion timeout, complexity LHS {%d}, RHS {%d} (increase MAX_ELEN?)\n",
lmax, rmax);
gen_total = searchmax;
}
if (debug_y || debug_0) {
printf("gen_total %ld (lhs %ld rhs %ld) searchmax %ld\n",
gen_total, lhs_insert, rhs_insert, searchmax);
bt_stats();
}
}
if (debug_y) {
printf("Level %.g search complete (gen_total exceeded searchmax).\n",
g_levadj);
}
if (g_enable_output) {
if (only_exact) {
printf("No exact solution was found (try using -l%d).\n",
((int) floor(g_levadj+1.0)));
} else {
printf(" (for more results, use the option '-l%d')\n",
((int) floor(g_levadj+1.0)));
}
}
print_end();
return 0;
}
/*
ries.c
RIES -- Find Algebraic Equations, Given Their Solution
Copyright (C) 2000-2012 Robert P. Munafo
See copyright notice at beginning of this file.
*/