.TH RIES 1L \" -*- nroff -*-
.SH NAME
ries \- find algebraic equations, given their solution
.SH SYNOPSIS
.B ries
[-\fBl\fIn\fR] [-\fBi[e]\fR] [-\fBx\fR] [-\fBF\fIn\fR] [-\fBS\fIsss\fR] [-\fBN\fIsss\fR] [-\fBO\fIsss\fR] [-\fBD\fIxxx\fR] \fIvalue\fP
.SH DESCRIPTION
.B ries
searches for algebraic equations in one variable that have the given
number as a solution. It avoids trivial or reducible solutions
like "\fIx\fR = \fIx\fR". If
.I value
is an integer,
.B ries
will find an exact solution expressed in terms of single-digit integers.

For example, if you supply the value 2.5063, the first part of
\fBries\fP's output will resemble the following:

.fam C
  Your target: T = 2.5063

        2 X = 5          for X = T - 0.0063         {49}
    ln(X)^2 = sin(1)     for X = T - 0.00373232     {64}
        X^2 = 2 pi       for X = T + 0.000328275    {55}
        X^X = 1 + 9      for X = T - 0.000115854    {69}
    X^2 + e = 9          for X = T + 3.56063e-05    {63}
.fam T

The output gives progressively "more complex" solutions (as described
below) that come progressively closer to matching your number. There
are four columns: the equation in symbolic form (two columns of symbolic
expressions with '=' in the middle), the solution of the equation (the
value of \fIx\fR expressed as \fIT\fR plus a small error term), and the
total complexity score (described below).

Options give complete control over which symbols, constants and
functions are used in solutions, and can limit solutions to
all-integer values.
.SH OPTIONS
Options must be given separately: `\FCries -l1 -i -Ox 27\FT',
not `\FCries -l1iOx 27\FT'.
.IP \fB-l\fIn\fR
Specifies the level of the search (default 0).
.B \-l1
will search about 10 times as many equations and take at least 4
times as long (depending on memory limits) as \fB-l0\fP. Each higher
level adds another factor of 10.
Here are typical figures, measured on a 2.33-GHz Core 2 Duo with the
sample command \fBries -l\fP\fIn\fR\fB 2.5063141592653589\fP for
different values of search level \fIn\fR:

.fam C
         memory                          digits    runtime
         usage     equations tested     matched   (2.33 GHz)
  -l0     960K           95,000,000       6+       0.02 sec
  -l1     3.1M        1,030,000,000       7+       0.1 sec
  -l2     11 M       12,800,000,000       8+       0.6 sec
  -l3     32 M      120,000,000,000       9+       2.7 sec
  -l4     113M    1,470,000,000,000      11+      13.6 sec
  -l5     396M   17,600,000,000,000      12+      82 sec
.fam T
.IP
(On a 733-MHz Pentium 3, the times are about 5 times longer and the
rest of the figures are the same.
\fBries\fP also works on much older and smaller systems, and can test
billions of equations in less than a minute on 1990s hardware.)
.IP
When free memory is exhausted, performance depends on your operating
system. Under Linux and Mac OS, \fBries\fP keeps running but the
system slows to a crawl. Other operating systems might experience
similar effects, or simply not allow \fBries\fP to get more memory (at
which point \fBries\fP terminates its search and quits immediately).
If you don't know what your OS will do, be careful when running
\fBries\fP at higher levels. In extreme cases your computer's response
might slow down so much that you are unable to save files in other
programs.
.IP
The memory limits are not reached nearly as quickly when the symbol
set is greatly limited with \fB-S\fP, \fB-O\fP and \fB-N\fP or when
\fB-i\fP is specified. \fB-i\fP in particular should allow about two
more levels in any given amount of memory.
.IP \fB-N\fIsss\fR
.B \-N
followed by one or more characters specifies symbols (constants and
operators) that
.B ries
should not use in its equations. The symbols are as follows:
.RS 0.7i
.IP 1-9
The integers 1 through 9. (\fBries\fP constructs all larger integers
from combinations of these.)
.IP p
pi = 3.14159...
.IP e
e = 2.71828...
.IP f
phi = (1+sqrt(5))/2 = 1.61803...
.IP n
Negative
.IP r
Reciprocal
.IP s
Squared
.IP q
Square root
.IP "S C"
Sine, cosine
.IP "T A"
Tangent, arctangent
.IP l
ln (natural logarithm)
.IP "+ -"
Plus and minus
.IP "* /"
Times, divide
.IP ^
Power: 2 ^ 3 = 8
.IP v
Root: 3 v 27 = cube root of 27
.IP L
Log-base-N
.RE
.IP
There are lots of potential uses for \fB\-N\fP. For example, if you
invoke
.B ries
on a small irrational number, you might get several solutions that
involve the unary and binary logarithm operators 'l' and 'L'. If you
decide you aren't interested in such solutions you can just add
.B \-NlL
to your command line, and all such solutions will be skipped.
.IP
If you are checking an unknown number that you found in the context of
some larger problem, you probably have some idea what constants and
operators may be involved, or not involved, in the phenomenon that
produced your number. Use \fB-N\fR to rule out functions you don't
think are relevant.
.IP
Note that
.B ries
will often run considerably slower when you limit it to a very small
set of symbols, mainly because it cannot use its optimization rules
(this is described below under ALGORITHM). Also, with fewer symbols
the average length of expressions is longer, and that makes the search
slower.
.IP \fB-S\fIsss\fR
Specifies a symbol set, as with \fB\-N\fP, but has the opposite effect:
.I only
these symbols will be used. If
.B \-S
and
.B \-N
are used together, the
.B \-N
is ignored.
.IP
.B \-S
can be used to solve those old problems of the sort "How can the
number 27 be expressed using only the four basic operators and the
digit 4?" The answer is given by:
\FCries '-S4+-*/' 27 -Ox\FT
(The
.B \-Ox
option is described next). To solve the same problem using the
.B \-N
option, you'd have to type:
\FCries -Npef12356789rsqlL^v 27 -Ox\FT
.IP \fB-O\fIsss\fR
Specifies symbols which should appear no more than once on each side
of the equation. This option can be combined with
.B \-S
or \fB\-N\fP, in which case they augment each other.
Any conflicts are resolved on a symbol-by-symbol basis: \fB-S\fP
overrules \fB-O\fP, \fB-O\fP overrules \fB-N\fP.
.IP
One additional symbol is available with \fB-O\fP:
.RS 0.7i
.IP x
The variable on the left-hand side
.RE
.IP
Thus, you can use \fB-Ox\fP to limit \fBries\fP's output to equations
that have only one '\fIx\fR' in them and are therefore easy to solve
for \fIx\fR using only the most basic algebra techniques. This also
makes \fBries\fP's output more like that of traditional
expression-finders, which search for expressions equal to \fIx\fR
rather than equations in \fIx\fR.
.IP
Here's an example:
\FCries -i 16\FT
gives lots of answers like "\fIx\fR + 3^\fIx\fR = 9^8" with \fIx\fR
very, very close to T, because 3^16 is exactly 9^8 and the added
\fIx\fR term is tiny by comparison.
\FCries -i 16 -Ox\FT
eliminates all these, and prints more interesting answers like
"(\fIx\fR+7)^4 = 6^7".
.IP \fB-i\fR
Require that all expressions, and all subexpressions, must have integer
values. This is primarily useful if you are searching for an exact
solution for a large integer. Note that inexact solutions will still be
given, but both sides will be integers. An example of such a solution
is "\fIx\fR * 9 = 8^2" where \fIx\fR=7. \fB-i\fP is ignored if the
supplied target value is not an integer.
.IP \fB-x\fR
Print actual values of \fIx\fR (the roots of the equations found)
rather than expressing \fIx\fR as \fIT\fR plus/minus a small number.
.IP \fB-F\fIn\fR
Select output format. If \fIn\fR is omitted it is 3; if \fB-F\fR is
not specified at all, the format will be 2. The following formats are
available; each shows the output of \FCries 1.506591651 -F\FT\fIn\fR:
.IP
Format 0: Compressed FORTH-like postfix format: each operator and
constant is just a single symbol.

.fam C
    x1- = 2r     for X = T - 0.00659165    {50}
    xrS = fr     for X = T - 0.00562976    {60}
    xCr = 4s     for X = T + 0.00166391    {60}
     xl = eS     for X = T + 0.00140386    {57}
.fam T
.IP
Format 1: Infix format, but with single-letter symbols.
If this format is specified, a table of symbols will be printed after
the main table of results. The rest of the expression syntax is the
same as the normal format. For example, "q(l(\fIx\fR)) = p-1" means
"sqrt(ln(\fIx\fR)) = pi - 1".

.fam C
       x-1 = 1/2      for X = T - 0.00659165    {50}
    S(1/x) = 1/f      for X = T - 0.00562976    {60}
    1/C(x) = 4^2      for X = T + 0.00166391    {60}
      l(x) = S(e)     for X = T + 0.00140386    {57}
.fam T
.IP
Format 2: Standard infix expression format (this is the default).

.fam C
         x-1 = 1/2        for X = T - 0.00659165    {50}
    sin(1/x) = 1/phi      for X = T - 0.00562976    {60}
    1/cos(x) = 4^2        for X = T + 0.00166391    {60}
       ln(x) = sin(e)     for X = T + 0.00140386    {57}
.fam T
.IP
Format 3: Print solutions in postfix format, similar to that used in
FORTH and on certain old pocket calculators. This is close to the
format used internally by \fBries\fP (to get the exact, condensed
format, use \fB-F0\fP). This is intended mainly for use by scripts that
use \fBries\fP as an engine to generate equations and then perform
further manipulation on them. However, this option will also help you
distinguish what symbols were actually used internally to generate an
answer. For example, the symbols 's' and '^2' both show up as '^2' in
the normal output, but in postfix they appear as "dup*" and "2 **"
respectively.

.fam C
          x 1 - = 2 recip       for X = T - 0.00659165    {50}
    x recip sin = phi recip     for X = T - 0.00562976    {60}
    x cos recip = 4 dup*        for X = T + 0.00166391    {60}
           x ln = e sin         for X = T + 0.00140386    {57}
.fam T
.IP
Most of the symbols used by \fB-F3\fP are self-explanatory. The
nonobvious ones are: \fBneg\fR for negate, \fBrecip\fR for reciprocal,
\fBdup*\fR for square, \fBsqrt\fR for square root, \fB**\fR for power
(A^B), \fBroot\fR for Bth root of A, \fBlogn\fR for logarithm (to base
B) of A. For these last three, \fIA\fR is the first operand pushed on
the stack and \fIB\fR is the second.
.IP \fB-D\fIxxx\fR
Display diagnostic messages.
A detailed understanding of the \fBries\fP algorithms (described below)
is assumed. \fB-D\fP is followed by one or more letters specifying the
messages you want to see. Uppercase options \fBA\fP through \fBL\fP
apply to the LHS, and the corresponding lowercase options \fBa\fP
through \fBl\fP apply to the RHS (there is no \fBe\fP or \fBf\fP). For
each option, the number of lines of output that you can expect from a
command like \FCries -l2 -D\FT\fIx\fR\FC 2.506314159\FT is shown in
brackets (LHS count; RHS count):
.RS 0.7i
.IP A,a
\fB[29949; 64184]\fP partial expression error (e.g. divide by zero); pruned
.IP B,b
\fB[196; 2797]\fP partial expression is zero; pruned
.IP C,c
\fB[38; 38]\fP partial expression is non-integer (and -i option given); pruned
.IP D,d
\fB[1681; 4171]\fP partial expression overflowed; pruned
.IP E
\fB[1333; --]\fP partial expression derivative nearly zero; pruned
.IP F
\fB[1157; --]\fP full expression derivative nearly zero; pruned
.IP G,g
\fB[305319; 670210]\fP expression added to database
.IP H,h
\fB[361668; 731891]\fP show attributes of each expression tested
.IP I,i
\fB[3286347; 6618747]\fP show each new symbol to be added before complexity pruning
.IP J,j
\fB[2111413; 4235348]\fP show symbols skipped by complexity pruning
.IP K,k
\fB[208550; 451853]\fP show symbols skipped by redundancy and tautology rules
.IP L,l
\fB[38; 38]\fP show symbols skipped to obey -O option
.IP m
\fB[10216232]\fP show all metastack operations
.IP n
\fB[255]\fP show Newton iteration values and errors if any
.IP o
\fB[788]\fP show work in detail: operator/symbol, x and dx at each step
.IP p
\fB[453890]\fP show match checks
.IP q
\fB[109]\fP show close matches dispatched to Newton and results of test
.IP r
\fB[1706583]\fP show results (value and derivative of operands and result) for each opcode executed
.IP s
\fB[300]\fP show your work: displays values of each subexpression for every reported answer
.IP t
\fB[9380]\fP show all abc-forms passed to expression generation
.IP u
\fB[31649]\fP show steps of min/max complexity ranging for each
abc-form
.IP v
\fB[4709]\fP show number of expressions generated by each abc-form
.IP w
\fB[28978]\fP show details of abc-form generation (pruning, weights, etc.)
.IP x
\fB[89]\fP show all rules used (varies with the -N -O and -S options)
.IP y
\fB[353]\fP statistics and decisions made in main loop
.RE
.IP
Of these, \fB-Ds\fP is probably the most useful and fun to look at.
\fB-Dy\fP gives a nice top-level view of the statistics of the search.
Most of the options that generate lots of output are useful if filtered
through \FCgrep\FT; this can tell you why a certain subexpression is or
is not appearing in results. (Note that subexpressions are reported in
the \fB-F0\fP terse postfix format.)
.SH ALGORITHM
.B ries
begins its search with small, simple equations and proceeds to longer,
more complex ones. To determine the order to search,
.B ries
uses many \fIcomplexity rules\fP, including the following:
.IP 1.
If you add a symbol to an equation, the result is more complex:

  \fIx\fR + 1 = 3 is more complex than \fIx\fR = 3
  \fIx\fR + 1 = ln(3) is more complex than \fIx\fR + 1 = 3
  \fIx\fR - 7 = 4^2 is more complex than \fIx\fR - 7 = 4
.IP 2.
If two equations are the same except for one number, the equation with
the higher number is more complex:

  \fIx\fR + 1 = 5 is more complex than \fIx\fR + 1 = 3
  \fIx\fR^3 + 1 = 3 is more complex than \fIx\fR^2 + 1 = 3
.IP 3.
If two equations are the same except for one symbol, the equation with
the "more exotic" symbol is more complex:

  \fIx\fR ^ 5 = 3 is more complex than \fIx\fR + 5 = 3
.P
As
.B ries
searches it finds solutions -- these are equations whose root \fIx\fR
is close to the value you supplied. Each time it finds a solution it
prints it out. Then
.B ries
raises its standard for the next answer: the next answer
.B ries
prints must be a closer match to your supplied value than all the
answers it has given so far.
(The only exception to this rule is an exact match:
.B ries
will print the simplest exact solution but will then continue to print
more inexact solutions. This is important if you only know a few digits
of your number but don't care about the fact that it can be expressed
as an integer divided by a power of 10.)
.P
Instead of trying complete equations,
.B ries
actually constructs half-equations, called \fIleft-hand-side
expressions\fP and \fIright-hand-side expressions\fP, abbreviated LHS's
and RHS's. It keeps a list of LHS's and a list of RHS's, and it keeps
these lists in numerical order at all times. This enables
.B ries
to find matches much faster. All LHS's contain \fIx\fR and all RHS's do
not. Thus, 1000 LHS's and 1000 RHS's make a total of 1,000,000 possible
equations, and all 1,000,000 combinations can be quickly checked just
by scanning through the two lists in numerical order. This is why
.B ries
is able to check billions of equations in such a short time.
.P
The closeness of an LHS match depends on the value of \fIx\fR, and also
on the derivative with respect to \fIx\fR of the LHS expression.
Because of this,
.B ries
calculates derivatives of LHS's as well as their values.
.P
There are dozens of optimization rules
.B ries
uses, like the following:
.IP \fBa+\fP
Don't try "K + K" for any constant \fBK\fP because "K * 2" is equivalent.
.IP \fBb+\fP
Don't try "3 + 4" (or any two unequal integers from 1 to 5) because
another single integer (in this case "7") is shorter.
.IP \fBa*\fP
Don't try "1 * K" for any constant \fBK\fP because "K" is shorter.
.IP \fBb*\fP
Don't try both "2 * 4" and "4 * 2" because they are equivalent.
.IP \fBc*\fP
Don't try "K * K" because "K ^ 2" is shorter.
.IP \fBar\fP
Don't try "1 / (1 / expr)" for any expression \fBexpr\fP because "expr"
is shorter.
.IP \fBa^\fP
Don't try "2 ^ 2" or "2 squared" because "4" is shorter.
.IP \fBb^\fP
Don't try "expr ^ 2" for any expression \fBexpr\fP because "expr
squared" is shorter.
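.P
The two-list scan and the derivative-based closeness measure described
earlier in this section can be illustrated with a small Python sketch.
This is not the \fBries\fP source: the names \fBscan_matches\fP,
\fBLHS\fP, \fBRHS\fP and \fBmin_deriv\fP are hypothetical, the
expression lists are typed in by hand rather than generated, and the
root offset is estimated with a single first-order step, which is only
an assumption about how closeness might be measured.
.P
.nf
.fam C
```python
import math

T = 2.5063  # target value from the sample run in DESCRIPTION

# Hypothetical half-equation lists. Each LHS carries its value at x = T
# and its derivative there; each RHS is a plain constant.
LHS = [
    (2 * T, 2.0, "2 x"),
    (T ** 2, 2 * T, "x^2"),
    (T ** T, T ** T * (math.log(T) + 1), "x^x"),
    (T ** 2 + math.e, 2 * T, "x^2 + e"),
]
RHS = [(5.0, "5"), (2 * math.pi, "2 pi"), (9.0, "9"), (10.0, "1 + 9")]


def scan_matches(lhs, rhs, min_deriv=1e-9):
    """Pair every LHS with its nearest RHS in one pass over two sorted lists."""
    lhs = sorted(lhs)
    rhs = sorted(rhs)
    matches = []
    j = 0  # RHS cursor; it never moves backwards
    for value, deriv, ltext in lhs:
        if abs(deriv) < min_deriv:
            continue  # a nearly flat LHS cannot pin down x
        # Advance the cursor while the next RHS is at least as close.
        while (j + 1 < len(rhs)
               and abs(rhs[j + 1][0] - value) <= abs(rhs[j][0] - value)):
            j += 1
        rvalue, rtext = rhs[j]
        # First-order estimate of how far the root lies from T.
        dx = (rvalue - value) / deriv
        matches.append((abs(dx), dx, ltext + " = " + rtext))
    return sorted(matches)


matches = scan_matches(LHS, RHS)
for _, dx, text in matches:
    print("%-14s for x = T %+.6g" % (text, dx))
```
.fam T
.fi
.P
Because both lists are sorted, the RHS cursor only moves forward, so
the whole scan costs one pass over each list instead of one comparison
per LHS/RHS pair.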
.RE
These optimization rules are important -- they make the search about
10 times faster. However, if the symbol set is limited via \fB-N\fP,
\fB-O\fP or \fB-S\fP, some of these rules cannot be used. For each
optimization rule there are one or more symbolset exclude rules like
the following:

  Don't use rule \fBa+\fP if either of the symbols '*' or '2' is disabled.

In order to maintain maximum efficiency,
.B ries
checks each rule individually against the symbolset, and uses as many
rules as it can.
.P
You can see this process in action by trying a command like
\FCries 1.4142135\FT, which gives the answer "\fIx\fR^2 = 2". If you
disable the 's' (squared) and '^' (power) symbols with
\FCries 1.4142135 -Ns^\FT, rules \fBc*\fP and \fBb^\fP go away, and the
answer becomes "\fIx\fR = sqrt(2)". If you also disable 'q' (square
root) the answer becomes "\fIx\fR \fIx\fR = 2". Disable '*' and you get
"log_2(\fIx\fR) = 1/2"; disable 'L' and it becomes "\fIx\fR = 2,/2";
disable 'v' and get "\fIx\fR/(1/\fIx\fR) = 2". Disabling '/' as well,
the command becomes
\FCries 1.4142135 '-Ns^q*Lv/'\FT
and we finally get an answer that you probably would not have been able
to guess: \fIx\fR+1/(1-\fIx\fR) = -1 (note that the '/' in this answer
is actually '1/', the reciprocal operator 'r'). Throughout this whole
process you can see the complexity score of the equation increase as
the solution becomes more and more obscure.
.SH UNEXPECTED BEHAVIOR
Restricting or changing the symbolset with the \fB-S\fP, \fB-O\fP and
\fB-N\fP options often causes unexpected changes in the output. For
example, \FCries 2.5063\FT yields the solution
\fIx\fR^2 = \fIpi\fR * 2 but \FCries 2.5063 -N+-\FT gives the same
solution as \fIx\fR / sqrt(2) = sqrt(\fIpi\fR). This seems
counterintuitive -- there was no + or - in the first answer, so why did
.B ries
decide it had to give the answer in a different form? In fact, both
solutions are generated, but
.B ries
will only print the first one it encounters during its search.
Note that the second version has a simpler RHS and a more complex LHS.
.P
The mysterious behavior results from the fact that
.B ries
always tries to keep the number of LHS and RHS expressions equal as it
performs its search. Eliminating operators with the \fB-N\fP option
means that more complex expressions must be generated to reach the
"quota". In this particular case, the symbolset restriction has a
greater effect on the LHS than on the RHS, so as the search
progresses, LHS complexity grows a little more quickly than RHS
complexity.
.P
Here, \fIpi\fR * 2 is a more complex expression than sqrt(\fIpi\fR)
because it has more symbols. Conversely, \fIx\fR^2 is \fIless\fP
complex than \fIx\fR / sqrt(2). So, one of the solutions has more
complexity on its left and the other has more complexity on its right.
\fIx\fR^2 = \fIpi\fR * 2 and \fIx\fR / sqrt(2) = sqrt(\fIpi\fR) are
equivalent solutions -- they both come equally close to the supplied
value 2.5063 -- but only one can be found first. Once the first one is
reported, the other is not, because \fBries\fP only reports solutions
that are at least 0.1% better than the previous solution.
.P
The \fB-l\fP option is meant to give control over the number of
solutions searched, but it actually controls the number of LHS and RHS
expressions generated. Because two RHS's often have the same value, and
only one (the first) gets kept, the number of solutions checked (which
is the RHS count times the LHS count) depends on how often you get two
LHS's or RHS's with the same value. This happens particularly often
when the symbol set is severely restricted. If \fBries\fP tried to
compensate for this, the result would be that severely limited
symbolsets would take a very long time to run and would generate really
long equations. The current implementation is considered to be a good
tradeoff.
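.P
The "first one found wins" behavior described in this section can be
illustrated with a small Python sketch. It is not taken from the
\fBries\fP source: the function name \fBreport\fP and the parameter
\fBimprovement\fP are hypothetical, and reading "0.1% better" as a
relative reduction in the error term is an assumption.
.P
.nf
.fam C
```python
def report(found, improvement=0.001):
    """Return the equations that would be printed, in search order.

    found is a list of (abs_error, equation_text) pairs in the order the
    search encounters them. improvement=0.001 models "at least 0.1%
    better"; treating it as a relative error reduction is an assumption.
    """
    best = float("inf")
    printed = []
    for err, text in found:
        # Print only if the error beats the best so far by the margin.
        if err < best * (1.0 - improvement):
            printed.append(text)
            best = err
    return printed


first = (0.0063, "2 x = 5")
a = (0.000328275, "x^2 = pi * 2")
b = (0.000328275, "x / sqrt(2) = sqrt(pi)")

print(report([first, a, b]))  # the search meets form a before form b
print(report([first, b, a]))  # the search meets form b before form a
```
.fam T
.fi
.P
The two equivalent forms have the same error, so whichever the search
meets second fails the improvement test and is never printed. Anything
that changes the order of the search, such as a symbolset restriction,
can therefore change which form appears.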
.SH BUGS
Performance does not degrade gracefully when the physical memory limit
is hit, because expression nodes are allocated in sequential order in
memory, without regard to where they will end up in the tree. This
could be improved in the future with percentile demographics and a sort
performed one time only, after the tree reaches a healthy (but not
excessive) size.
.P
Although it tries to avoid it, \fBries\fR will often print more than
one equivalent solution. Because of roundoff error, it fails to notice
that the solutions are equivalent.
.P
It still occasionally prints "answers" that are actually tautologies,
empty of meaning. A simple example would be \fIx\fR/\fIx\fR = 1, but
\fBries\fP handles that case and most other simple cases. When this bug
happens it's always something much more complex, like
"\fIx\fR (1/sin(pi \fIx\fR/\fIx\fR))/\fIx\fR = 1/sin(pi pi 1/pi)".
.P
Due to roundoff error,
.B ries
sometimes prints completely wrong answers. An example is
1/(1/(1/\fIx\fR)-\fIx\fR)=8^(7*2). For certain values of \fIx\fR,
1/(1/\fIx\fR) comes out slightly different from \fIx\fR (because of
roundoff error), and therefore the LHS ends up being a really big
number, in this case 8^14. Make sure to double-check \fBries\fR's
answers before using them.
.SH ACRONYM
.B ries
(pronounced "reese" or "reeze") is an acronym for "RILYBOT Inverse
Equation Solver". The expansion of
.I RILYBOT
includes two more acronyms whose combined length is greater than 11.
The full expansion of
.B ries
grows without limit and is well-defined but not primitive-recursive.
Contact the author for more information.
.SH AUTHOR
Robert P. Munafo, .