/* ries.c RIES -- Find Algebraic Equations, Given Their Solution Copyright (C) 2000-2008 Robert P. Munafo This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. If you got ries.c from the website www.mrob.com, the GNU General Public License may be retrieved from the following URL: http://www.mrob.com/ries/COPYING.txt The remainder of this large header comment is broken up into sections titled: HOW TO BUILD, UNFINISHED WORK, DETAILED NOTES ON ALGORITHM, REVISION HISTORY. */ /* HOW TO BUILD: 1. Boot your favorite UNIX-compatible computer (Step 1a: install Linux if you don't already have it on your computer. Linux rules!) 2. Make a new directory, put this file there. 3. Compile it with the following command: gcc -o ries -lm ries.c (Step 3a): on a 486 or later and with a fairly current version of GCC, you can use the command line I use: gcc -m486 -malign-double -O3 -fomit-frame-pointer \ -ffast-math -o ries -lm ries.c 4. Run ries and give it a number, for example: ries 2.5063 */ /* UNFINISHED WORK %%% Output postprocessors aren't written yet -- sqrt still appears just as "q", for example. %%% options to add: -Fx to specify other output formats, e.g. raw symbols, calculator-keys, TeX or eqn, etc. -S without symbol list to display symbolset and exit -W to view weights, define weights individually, or select weight presets (e.g. electrical engineering) some option to select colloquialisms: degrees instead of radians; log10 instead of ln variations on -l that allow specifying limit of search by precision, by time, by memory usage, or by number of equations tested %%% some option to "solve for X", to the greatest extent possible. Requires a lot of different things: A new routine to split the arguments of type-c operators; a special handling of sym_allowed (to provide for the fact that, when solving, some operators become their inverses); re-evaluate the match during and/or after solving to make sure it is still a solution (consider -1.4142 -> "x^2 = 2" -> "x = sqrt(2)" -> "1.4142": oops!); re-evaluate sym_allowed compliance after solving (to support -Osss option) %%% command-line option: if no symbol '1', then 'r' shouldn't be allowed. Likewise for '2' + '^' and 's'; '-' and 'n'; 'e' + '^' and 'E'; others? Which should be the default, the simpler behavior or the "more correct" behavior? */ /* %%% macros: - use a separate symbolset namespace (implies macros cannot call themselves or each other, which avoids lots of problems) - To use -O with a macro, include '_' in the -O symbolset list; all symbols after '_' are macro names. -S assumes all macros (why would you define a macro and not want to use it?); using -N with a macro is a contradiction. - Macro can use any symbols, and symbols used within macro expansions don't count against their sym_allowed quotas (this is probably easy; just need to make sure) - Need to implement stack-overflow detection in the metastack routines, both for the stack and for the undo lists. - Test evaluator routine needs to make sure macros fit one of the three allowed types ('a', 'b' or 'c') and that only types 'b' and 'c' dip into the stack's current contents. - Type 'a' can be "precompiled" since they amount to custom constants - Should I allow immediate operands (like 0.577...) to facilitate defining constants that can't be computed any other way? How about special operators, like 'exch'? - execution can probably be accomplished by having eval() call itself recursively. It needs to handle its own error returns. %%% The memory performance degradation can be made more graceful by putting LHS and RHS in separate trees. Since LHS's and RHS's are created in batches, they will be allocated in homogeneous blocks. That means that, if the memory blocks are allocated in sequentially increasing locations in memory, it should be possible to swap in the entire LHS tree at once while leaving almost the entire RHS tree swapped out, and vice versa. The algorithms would have to be modified to not try to do matching at insertion time, and go to separate passes instead -- to avoid descending the RHS tree on each LHS insertion. The main problem that would still exist is in the linear scan -- it will still tend to jump all around from one block to another as you do the scan, and since both trees are being scanned at once, everything becomes a candidate for swapping. If the nodes can be allocated in some way that keeps them, at least partially, physically sorted in memory, it would allow RIES to run at a memory usage level up to 2x the physical available memory size with a moderate, but not excessive, level of swapping. Perhaps we could do a statistics pass through the tree when it reaches a certain modest size, like 1000 nodes, and note at what numeric values each quartile of the distribution lies. Then my_alloc could pull node space from a different freeblock based on what quartile the node will fall into. This will have the desired effect if the freeblocks are substantially bigger than the VM page size. %%% with many -i test cases the RHS expression totals are a lot bigger than the LHS totals (while the insert totals are equal). There are probably cases that come out the other way around, too. In cases like this, the program could probably find more matches more quickly by shifting the balance between LHS/RHS to favor the side that is generating and inserting more expressions per unit time. I need to model this with formulas that take into account the total amount of dead-ends, expressions, and inserts on both sides, compared as a ratio to the equation total (and take the 1st derivative to figure out what side of the maximum you're on) %%% Test cases to test periodically: ries 0.08600409182 -l1 should give "3 pi x = 8/pi^2" or equivalent ries -l1 0.434294481903252 should give "x = 1/ln(1+9)" or equivalent ries 2000.0222 '-Sn1/r+87^s' used to give bogus roundoff match ries 27 -Sx3qL6 used to give bogus roundoff match ries '-S14+*-/n' 27 generates a small number of equations ries '-S4+*-/n' 27 used to generate *more* than the previous example ries 27 '-S4+*-/' -Ox -ie Should give [x4+4*4+] = [4444+**] ries -Nl+ 2.5063 Stephen's number: should give the three answers "2x=5", "x^2=2pi", "x^x=10" ries -3.14159 should give "-x = pi" ries -23 only one solution, an exact match ries -Slvsnex 1.017285 used to give bogus exact match ries -i 14327652 -l4 used to give erroneous [xsxL7s^n] = [27s^n] ries -i 7823310 -l4 used to give erroneous [x35s^*x/] = [35s^] ries 2.505317647 -l1 gives [x5l-] = [6ql] (which is correct) with error 4.8e-11. Make sure "almost-zero" checks don't clobber this. ries '-S4+*-/' 17 should give an answer with only 4 or 5 4's, such as [x44/-] = [44*] */ /* DETAILED NOTES ON ALGORITHM To reduce the search time, the common "meet-in-the-middle" technique is used. Rather than attempting to search for solutions of the form X = expression (1) it searches for solutions of the form f(X) = expression (2) where f(X) represents an expression containing X. Each side of equation (2) is represented as a set (in memory, a ordered list, implemented with a binary tree) of expressions and their values. Thus there are two lists. Each list is kept in order by numerical value. Then, the lists can easily be scanned to locate any matches. A match consists of an element in the left-hand list whose value is close to that of an element of the right-hand list. Keeping the lists in order also allows duplicate expressions, like "2 pi" and "pi * 2", to be ignored. It is important to look for "simpler" matches before going to more complex ones. Thus, expressions have a complexity score, computed by assigning a certain number of points to each of the symbols in the exression. Matches are checked in order of increasing total complexity (which is the complexity of the left-hand side plus that of the right-hand side) In order to avoid the sorting problems necessary to search the expressions in a perfectly increasing complexity order, complexity scores are lumped into discrete finite quanta, which are actually implemented by using small integers for each component of a complexity score, rather than real numbers. This does not actually cause everything to be searched in order, even modulo the quantization errors. Some equations can be represented in an unbalanced form with lower aggregate complexity than the least-complex balanced form: an example is the cube root of 2: The solutions are "x3^ = 2" and "x = 23v", both are unbalanced. (See below for expression syntax) There are no good balanced solutions. In such cases the unbalanced, low-complexity form will show up later in the search than it "should". */ /* To make generation and evaluation easy, expressions are represented in a FORTH-like syntax with one character per symbol. Thus, "11+" is 1+1, "2q" is sqrt(2), "ep^" is *e*^*pi*, etc. There are three classes of symbols: a: x 1 2 p 3 e 4 5 f 6 7 8 9 (*pi*, *e*, *phi*) b: r s q l n S C T (recip, squared, sqrt, ln, negate, sin, cos, tan) c: + - * / ^ v L (root, logN) For ease in debugging, most symbols have letters that make sense, but not always (e.g. "f" for *phi*, the golden ratio; "v" for root, which is meant to represent the v-shaped part of the standard root sign). The only constants are the digits 1 through 9 -- no zero, and no multidigits or decimal fractions. To get 10 you have to do "25*" or something similar; for 1.5 you have to do "32/". A fair amount of effort has been put into setting the symbol scores such that the score of "25*" is only a little higher than the score for "9". example cases for deriving the symbol weights: (+) ~= (*) (because both occur equally often) (-) > (+), but not by much 1, 2, 3, 4, 5, ... degrade gracefully and kind of like a slide rule (14*) = (22*), etc. (implies (n) proportional to log(n) for 1<=n<=9) (33*) = (9) (implies (*) = [..]) (25*) > (9), but only by a little (no problems so far) [2q] ~= [5] :: therefore (2q) + [.] ~= (5) [x2v] = [xq] :: (2v) + [.] = (q) [x2^] = [xs] :: (2^) + [.] = (s) [99+] can be >> [55*] (because smaller numbers are more likely) %%% complexity score for a symbol can be context sensitive, e.g. 'l' takes a higher score the second time it is used in an equation -- so long as its range of possible scores is within the total range for its class. This might be a good way to eliminate some of the nonintuitive aspects of the current system. Once a set of symbol scores is worked out, it can be adjusted by adding any constant to all the values (this adds a bias for long expressions, or a bias against long expressions. In the above "normalized" examples there is no bias, but the shorter ones will get generated first anyway.) */ /* The lists start out small and grow as the program searches. On the first pass, the left-hand list consists simply of {X}, the single- element expression for the search value, and the right-hand list consists only of a few common constants, for example {1, 2, *e*, *pi*}. After a few passes the left-hand list will include things like "xv" (sqrt(x)) and "x1+" (x+1), and the right-hand side will have similar forms such as "2v" (sqrt(2)) and "23/" (2/3). At that point, if X is 4/9 a match will be found between "xv" and "23/", even though the combination of "x" and "49/" has fewer symbols overall, because the former matching is more evenly balanced. Another way of saying this is that, since the complexity scores of "xv" and "23/" are both lower than the score of "49/", the "xv = 23/" matching is found before "49/" even gets a chance to be generated. To minimize time spent generating nonsense expressions, expressions are generated from "forms". A "form" is a symbolic expression of expression syntax, like "aabc" for "12q+". Each letter represents a different type of stack operation. Type "a" pushes one item, type "b" leaves the stack with the same number of items, and "c" leaves the stack with one less item. A form is legal if the stack depth (the "a" count minus the "c" count) is > 0 at all times and = 1 at the end. Thus all forms start with an "a". The number of forms for N={1,2,3,...,8} is {1,1,2,4,9,21,51,127} (the Motzkin numbers; Sloane's A001006). By comparison, the total number of forms without the stack restrictions would be 3^(N-1): {1,3,9,27,81,242,729,2401}. The savings is considerable. In the left-hand list the first symbol will be an "x" in all generated expressions. Forms are generated on the fly and never stored in memory. As the search proceeds, longer forms are generated as needed. For each form there is a "minimum" expression and a "maximum" expression, as rated by complexity score. For example, "x1+" is the minimum expression for form "aac" and "99L" (log base 9 of 9) is the maximum. These minimum and maximum expressions are easy to find, because each symbol has its own score and the expression score is just the sum of the symbol score. Thus it is easy to determine, at the beginning of each pass, which forms can generate expressions that are within the current complexity range. */ /* Expressions are generated from each valid form. The generation of expressions uses a recursive backtracking algorithm, starting with the first symbol and moving to the right. At each step, it checks to see of the symbols we have so far still allow an expression which falls within the complexity range -- if not, the latest symbol is dropped or changed. It also avoids generating certain patterns of symbols which would be of no computational value, or which are always equivalent to a different (sometimes shorter) set of symbols. Examples of this optimization are: - "aa-" for any class a symbol would be 0, so is not generated. - "aa+" is the same as "a2*", so is not generated. - "aa*" is the same as "as", so is not generated. - "aa/" and "aal" are equivalent to 1, and are not generated. - Almost any operator after "1" is not generated ("1x", "1/", "1r", "1^", "1v", "1l", "1L", "1v" all do nothing useful) - "2^" and "2v" are equivalent to "s" and "q" respectively, and are not generated. - %%% "ps" is equivalent to 0 - "sq", "qs", "nn", "rr" all amount to nothing and are not generated. The expressions are also evaluated while they are being generated. Any subexpression that causes an error, such as divide-by-zero, can be skipped along with all expressions that start with that subexpression. For example, any expression starting with a constant followed by "nq", such as "2nq" which means "sqrt(-2)", will be skipped. As the search proceeds, each equation (that is, each combination of an LHS with an RHS) is checked to see what value X would have to be for the equation to work out perfectly, and the difference between this X and the user's supplied number is called the "delta". An equation becomes a "record-setter" if its delta is smaller than any delta seen so far. If a strict "non-fuzzy" comparison were made, due to roundoff errors it would be possible for the same solution to be found twice. For example, if X is near 4/9 = 0.44444.., two solutions are "x = 49/" and "xv = 23/". One solution might be printed after the other because of roundoff in the square-root operation. To avoid this, every time a new record-setter is found, the criterion for another record is set to 0.999 times the new delta. */ /* Determining the value of X for which the equation works out perfectly is a hard problem. In theory one would have to solve the equation for X. That involves lots of shifting and rearranging of symbols, and if there is more than one X in the equation it can get into some rather complicated algorithms. Instead, this program compares the LHS and RHS values directly. (The difference between the LHS and RHS is the "lhs/rhs diff", and varies from the "delta"). This leads to several problems. Sometimes two sides of the equation form a really good match only because both sides involve something raised to a very small power (or a very big root -- same thing). An example of this type of problem is 1.017262042 = [49sv] (the 81st root of 4) and 1.017313996 = [38sv] (the 64th root of 3). They differ by only 1 part in 20000, but only because both numbers are very close to 1. All 81st and 64th roots are close to 1. This is the referred to as the "error margin problem". There is also the problem of zero subexpressions and constant subexpressions involving X. An example is "x1el-^ = 42sL": The subexpression "1el-" evaluates to 0, and the entire LHS is constant. In order for the program to work, such "solutions" have to be detected and eliminated. This is the "singularity problem". Finally, there is the issue of reporting what X would perfectly solve each equation. This is a desirable feature of the program because it users will often want to find out what the "exact" value would be for a given equation without actually doing the algebra and calculations themselves. In cases like [xle+q] = [1e-s], where there is only one X in the equation, this could be done by the computer. However, there are lots of important cases like [xx^] = [52*] where an analytic solution does not exist and the value of X has to be found by some other method, such as a numeric method. This is the "root-finding problem" Conveniently, all three problems (the error-margin problem, the singularity problem, and the root-finding problem) are eliminated in the same way: by calculating the *differential* of all terms and subexpressions on the LHS. The differential, when multiplied by the lhs/rhs diff, gives a very good approximation to the delta, which as mentioned above is the amount that X would need to be increased to make the equation a perfect match. */ /* This solves the error margin problem because it equalizes the field -- all equations can get rated in terms of how much X has to change to make LHS and RHS match, rather than how much the LHS would have to change. It solves the singularity problem because, by definition, an LHS with a zero differential is constant with respect to X, and therefore constitutes a singularity solution. It also solves the root-finding problem quite conveniently. If you have an equation like X^X = 10, and you have an approximation for X, you use derivatives and Newton's method to find a better approximation for X. When RIES takes the differential of the LHS and multiplies it by the lhs/rhs diff, it is essentially performing one step of Newton's method. The resulting value can be added to X to form a better approximation to the root of the equation. Differentials are evaluated for LHS expressions as the expressions are generated. In the following table, the calculation of the differential for each symbol is shown, based on the values of the operands (A and B) and their differentials (da and db). Some of these will be familiar to anyone who has taken calculus: type 'a' symbols: 3 any constant 0 x target 1 type 'b' symbols: n negate - da s squared 2 da A q square root da / 2 sqrt(A) r reciprocal - da / A^2 l natural log da / A e exponential e^A da type 'c' symbols: + plus da + db - minus da - db * times A db + B da / divide B da - A db / B^2 ^ power A^B (ln(A) db + B da / A) v Bth root BvA (da / A B - db ln(A) / B^2) L log base B (da / A ln B) - (ln A / ln B) (db / B ln B) product rule d/dx f(x) g(x) = df g(x) + dg f(x) quotient rule d/dx f(x)/g(x) = (df g(x) - dg f(x)) / g(x)^2 chain rule d/dx f(g(x)) = df/dg dg/dx A^B = exp(ln(A) B) d/dx exp(Y) = exp(Y) dY d/dx ln(A) = 1/A da d/dx ln(A) B = db ln(A) + B/A da d/dx A^B = exp(ln(A) B) (db ln(A) + B/A da) = A^B (db ln(A) + da B / A) = A^B ln(A) db + B A^(B-1) da (standard form) d/dx ln(A) / ln(B) = (ln(B) da / A - ln(A) db / B) / ln(B)^2 log(a+b) = log(a(1+b/a)) = log(a) + log(1+b/a) */ /* It is fairly easy to show that in any equation that constitutes a solution, if any expressions or subexpressons evaluate to 0, the entire equation can be replaced with an equivalent form that does not involve 0. The equivalent form is never more complex and is usually simpler. By a similar but longer process it can be shown that any solution involving a subexpression with a 0 derivative can be reduced to a simpler solution that does not involve a zero-derivative subexpression. Why RIES Calculates Derivatives Consider the value X = 2.5063. Each of the following solutions (or an algebraic equivalent) will appear when you run RIES on it: X = 5 / 2 (for X = 2.500000000000...) X^X = 10 (for X = 2.506184145588...) X = sqrt(9 - e) (for X = 2.506335606326...) X = sqrt(2 pi) (for X = 2.506628274631...) The exact solutions to these equations are all different values, of course, and they all form successively closer approximations to the supplied value 2.5063. */ /* %%% the following description doesn't match the numbers; don't know how to fix it... Look at the last two. Notice that the supplied value, 2.5063, is between the exact solutions of these two. Also, the equations can both be expressed in a different way: 2 X = 5 instead of X = 5/2 X^2 = 2 pi instead of X = sqrt(2 pi) Consider what would have happened if the last digit had been one higher or one smaller. How much does it change the "closeness" of the match? With the original two equations, where there is just an "X" on the left-hand side, this is easy to figure out: if X is 0.0001 lower, it's 0.0001 closer fit for the first equation, and 0.0001 further from the second. But look at the alternate forms: If you subtract 0.0001 from X, that subtracts 0.0002 from 2 X, but it subtracts 0.0005 from X^2. That's a big difference -- if we're using these forms of the equations (as RIES does) to determine how well the two sides match, then this altered value of X makes the match "move" further with respect to one equation than it does with respect to the other! The reason this happens is, of course, because we're looking at an expresson on the left-hand side, rather than just a single "X". Putting an expression on the left-hand side makes it harder to see how close a match you've got. If you still don't believe this, consider 1.047246. This is an equally good solution for the following: X = pi / 3 (too high by 0.00005) X = 15th root of 2 (too low by 0.00005) but if you express them the other way: 3 X = pi (3.14172, too high by 0.00013) x^15 = 2 (1.99730, too low by 0.0027) suddenly it looks like the 3 X = pi solution is a *way* better solution! RIES notices this and compensates for it regardless of the form in which the equation is actually found. If you run RIES on the value 1.047246 it will present both solutions, in the following form: X * 3 = pi X ^ 5 = 3 v 2 The philosophy adopted by RIES is that the "true" form for evaluating the closeness of a match is the form where there is just one X, and nothing else, on the left-hand-side of the equation, and just numbers and symbols, but no X's, on the right-hand-side. (Let's call this the "normalized form".) Put all equations into normalized form, calculate the value on the right and look at the difference between X and this value to determine how good the match is. But now go back to our 2.5063 example above -- one of its solutions was: X^X = 10 This equation *cannot* be reduced to something with just one X on the left-hand side -- there is no "inverse-of-X-to-the-X" function. What does RIES do? RIES calculates derivatives. By using the value of 2.5063 for X and calculating the derivative of X^X for this value of X, you get (about) 19. A derivative of 19 means that any small change in X will cause a 19-times-bigger change in X^X. That's important, because it allows us to compare the closeness of the match on "equal footing" with other, normalized equations like X = sqrt(2 pi). Derivatives work so well, in fact, that RIES does not even have to bother adjusting any its equations to normalized form. This is important, because part of the reason RIES is so fast is that it generates left-hand-sides and right-hand-sides separately and tries all the combinations to find possible solutions. It can do this a lot quicker because a half-equation is smaller than a full equation, and therefore there are less possibilities to check out. Furthermore, derivatives allow RIES to quickly and easily calculate and display the value that X would have to be in order to match each equation exactly (it is essentially performing one step of Newton's method). */ /* %%% Another way to optimize the memory situation: replace the memory allocation with a file. To prevent lots of wild seeking, the list would have to be maintained in sequential order. The way to do this is to cache new additions in memory and periodically merge-sort the memory list with the file to create a new file. The trick is how to detect when memory is "full". I don't want the user to have to worry about setting a memory usage preference setting. Actually, this idea could be implemented without the file too -- just allocate blocks in memory and "write" the data records into those blocks. The "list" blocks would be more efficient in memory since they don't need to contain child pointers. Future enhancements: - here's a way to allow printing more than one exact solution: When inserting an LHS or RHS, if an identical item already exists in the tree, overwrite it with the new one. This will cause at most one new, distinct exact match per matchscan. Since this new match is of higher aggregate complexity than the old one, it has at least a moderate chance of being a mathematically distinct solution (-: - spend a lot of time looking at which expressions are generated first, and try to improve the weights so it makes more sense. Why does 2.5063 find [x2q/]=[pq] before [xs]=[p2*]? - add constants: Feigenbaum, Euler's - add Gamma function (might have trouble figuring out how to calculate its derivative) - ask the Net for practical, numerical definitions of the Riemann Zeta function, and any others that look like they might be easy to implement. - explore how to expand RIES to complex numbers and complex analytic functions. Looks easy -- after all, complex data types are native in GCC! */ /* REVISION HISTORY 20000207 Begin (it doesn't do much except parse the parameters) 20000208 Add parsing of level adjust and more design comments 20000209 Write code to generate forms. "Discover" the Motzkin numbers. 20000210 Optimize gf_1() by making it compute the stack depth as it goes along rather than repeating teh whole stack history on each test. This improves time to generate all forms of length <=19 from 188 seconds down to to 37.9 seconds. Add tracking of min and max weights, and start writing expression generator. 20000215 It now generates forms within min and max possible complexity limits, to avoid generating expressions from forms that cannot possibly fall within the current complexity limit. Add a little optimization; speeds up a deep search from 53 seconds to 37 seconds. 20000217 It generates expressions, but doesn't prune for complexity or evaluate. For the initial groups of 5, the numbers of expressions evaluted are: {0,12,12,72,1368,1368,17928,345672,5875272,...} 20000217 Prune for complexity limits; each expression is now generated exactly once. The numbers are down to: {0,1,3,18,69,182,1046,5358,27123,...} Still need to prune foolishness like "11+" and "nn". 20000217 Prune almost all obvious trivial patterns. The numbers are now down to {0,1,3,13,43,122,486,2186,9775,...}. Then prune [JK+] and [JK-] for small integers, [jK*] and [jK+] for any j> 2" instead of ">> 1" in the s[] and ds[] declarations in struct metastack. 20000220 Refine the termination condition (now it counts generated expressions, rather than cutting off at a certain complexity score -- this is to gain independence from the specifics of the weights) and discover another dimensioning bug in struct metastack. Benchmark changes in PASS_GRAN: When it is set to 4, 2, and 1 the execution time for generating all expressions up to complexity 64 (2236462 expressions) is 3.57, 4.16, and 5.24 respectively. This actually a lot better than I thought it would be -- I thought there would be a lot more overhead from repeating the same subexpression evaluations over and over again. */ /* 20000220 Make it increase LHS and RHS complexity limits at independent rates, such that the population remains equal between the two sides. Add a bunch of comments documenting what's been done so far. 20000221 Write bt_insert(); it now reports exact matches! It finds, for example, [x3+] = [4s] for X=13. However, for X=143 it reports [xrx*] = [1], with a derivative of 1.9e-19. Add PRUNE_DERIV to try to fix this, and it starts reporting [xxqL] = [2] with a derivative of 0.00258248 -- this turns out to be a bug in the deriv formula for 'q'. Eventually increase PRUNE_DERIV to around 1e-14, then decide to make it a variable and add p_ovr and n_ovr. Write bt_prev() and bt_next(), but not using them yet. 20000221 Figure out that I can check just the nearest LHS on either side of an RHS, and vice versa. Write check_sides() and check_match. It now finds answers and prints them out! 20000221 Fix bug in exact-match reporting. When given 1.5065916, it reports: match: [xe+s1+] = [p6*] (solution is X+5.14855e-08) Make it report delta you'd have to add to X to get each match to be exact. Test cases that currently produce bogus results: 'ries -l1 27', 'ries -0.28676844', 'ries 403' 20000222 Adjust best_match by 0.999 each time to avoid long strings of roundoff-error results. This fixes the './ries 403' case. Rename to "ries" (it used to be called "misc"). Start analyzing memory usage. Figure out how to save some memory in the node structures, and more importantly, how to group nodes together in physical memory such that the program degrades more gracefully when physical memory limits are exceeded. 20000222 Implement -S and -O command-line options. This has the side-effect of making certain bogus match bugs easier and quicker to reproduce. 20000222 Change -S and -O to -N and -S respectively. 20000222 Convert manpage to #nroff# format. Improve implementation of -S/-O/-N precedence. Implement -O option. Figure out why restricted symbolsets create very small numbers of equations (the level controls the number of expressions generated, which is always larger than the number of expressions inserted in the tree) 20000222 Fix bug that caused negative X's to give no solutions -- it was initializing best_match to a fraction of X, and not taking the absolute value! Implement -i option so I can find an expression for 70458. 20000223 Ignore -i if target is non-integer. Adjust weights for class 'b' operators; 'l' now appears much less often in expressions. Add -y option. 20000223 Write add_rule(); convert all AM_xx rules to add_rule() calls. Now the AM_xx rules degrade gracefully with the symbolset options. Change -y option to -x. 20000223 Clean up formatting in report_match(); add complexity score to output. Write perl script to benchmark and take statistics on -i option for a wide rance of integers; leave it running overnight. Change all 'int' to 's16'. */ /* 20000224 Today's date as a mathematical expression: (((4^(4^(1/e))-pi)^2)-pi)^2 the repetition is cool. 20000224 Write initial version of infix_1, then add a few rules (ordering of bare symbols in '+' and '*'; always reverse order for 'v' and 's'). Add -F option. 20000224 Add memory usage statistic. 20000225 Write infix_preproc; implement parentheses precedence. 20000225 Write eval() and newton(); printed X values are now (almost always) exact roots of their equations. Add AM_1K rules. 20000226 Add copyright and GPL notices; add URL to first printf 20000226 Add symbol definition strings 20000227 Add sin and cos operators. Eliminate several bogus exact match errors related to roundoff and loss of accuracy, e.g. cosine(0.0001). Fix major bug in k_prune_deriv test: it only pruned positive small derivs, not negative small derivs. Add symbol FORTH names and write postfix_formatter (but it isn't used yet). Don't report matches if newton() returned an error. 20000227 Add -ie variant to -i option (only_exact) 20000228 Kill another missing-fabs bug; remove prune on "almost exact matches" which is now adequately covered by the derivative tests. 20000228 Add 'E' operator, AM_l and AM_E attributes, and PS_REVPOW. check_match now uses newton() to evaluate score more accurately. Add loss-of-precision test to '-' operator. 20000228 check_match calling newton was royally slowing things down. Now it uses the old, much quicker test and then uses newton() as a confirmation test. This cuts time for "ries 2.5063" from 4.09 down to 1.39 on the Cyrix 180 (which is about what it has been since 0223). Add statistics of pruned subexpressions ("dead-ends"). 20000229 Fix bug in newton() that prevented success if target was negative. Add debug printf's 'mnr'. 20000229 Add time display; add debug printf's 'opqsABCDEFG'. 20000229 Add debug printf's 'Hy' and a few notes about complex analytic definitions 20000301 Change debug printf 's'; add another loss-of-precision test to '+' and '-'; add debug printf's 'IJKLtuvx'. Add 'I' operator and special-case tests for no defined symbols of each class (this is a massive optimization for "ries -ie 7 '-S1+*-/^v'") 20000302 Add debug printf 'w' and improve 'r'. Add arctan function (but not using it yet) and some notes about derivatives. 200003xx Benchmarks on a 333-MHz Celeron: -command-------------------------- -mem- time equations ries 2.5063141592653589 960K 0.3 9.5619e7 ries -l1 2.5063141592653589 3072K 1.5 1.0306e9 ries -l2 2.5063141592653589 10.7M 6.5 1.2832e10 ries -l3 2.5063141592653589 32.7M 24.4 1.2042e11 200203xx Benchmarks on an 800-MHz PowerPC G4 (iMac, model M6498LL/A): -command-------------------------- -mem- time equations ries 2.5063141592653589 960K 0.1 9.5062e7 ries -l1 2.5063141592653589 3072K 0.8 1.0248e9 ries -l2 2.5063141592653589 10.6M 3.6 1.2765e10 ries -l3 2.5063141592653589 32.6M 14.3 1.1976e11 ries -l4 2.5063141592653589 114M 70.3 1.4609e12 200310xx Benchmarks on an 800-MHz PowerPC G4 (iBook G4, model M9164LL/A): -command-------------------------- -mem- time equations ries 2.5063141592653589 960K 0.1 9.5062e7 ries -l1 2.5063141592653589 3072K 0.7 1.0248e9 ries -l2 2.5063141592653589 10.6M 3.2 1.2765e10 ries -l3 2.5063141592653589 32.6M 12.8 1.1976e11 ries -l4 2.5063141592653589 114M 62.3 1.4609e12 ries -l5 2.5063141592653589 398M 388 1.7825e13 20050715 Benchmarks on a 2-GHz PowerPC G5 (part of a dual system, model M9455LL/A): -command-------------------------- -mem- time equations ries 2.5063141592653589 960K 0.0 9.5062e7 ries -l1 2.5063141592653589 3072K 0.2 1.0248e9 ries -l2 2.5063141592653589 10.6M 1.5 1.2765e10 ries -l3 2.5063141592653589 32.6M 6.6 1.1976e11 ries -l4 2.5063141592653589 114M 36.2 1.4609e12 ries -l5 2.5063141592653589 398M 260 1.7825e13 ries -l6 2.5063141592653589 1.41G 2183 2.2297e14 20070511 Alan Eliasen emails me to tell me that 'ries -l4 193707721' gives the spurious result "x.(1/S(p))/x = 1/S(p) (exact match) {107}". This is a problem I have known about for a while. I reproduce the problem and reduce the symbol set to make it appear faster; it can be reproduced with: ries -F '-SxpSr/*' -l0 1002353667 Add rule to eliminate "sin(pi)". It now reports the following more complex version of the same bug: x.(1/S(p.p/p))/x = 1/S(p.p/p) (exact match) {181} */ #include #include #include /* The following three are only needed by gettime() for measuring how much time was used in the search. To port this program to a non-UNIX OS, remove the following three #includes and re-write gettime() */ #include /* ONLY used by gettime() */ #include /* ONLY used by gettime() */ #include /* ONLY used by gettime() */ #include /* -------------- defines ------------------------------------------------- */ /* Maximum length of a symbolic expression. NOTE: right now it's dimensioned to reflect a normal symbol set. However, in a run with a very limited symbol set the expressions grow in number a lot more slowly as the complexity score and length increase, and therefore a much higher MAX_ELEN would be necessary. The main problem is it affects the size of the list nodes, and therefore the memory footprint of the program. I guess I'll change it later, when I decide how to make the list nodes variable in size. */ #define MAX_ELEN 16 /* The size of the memory blocks we use */ #define ALLOC_SIZE 65536L /* %%% should be at least 8x the VM page size */ /* This can be increased to improve speed, but it also decreases the efficiency of memory usage. */ #define PASS_GRAN 1 /* -------------- typedefs ------------------------------------------------ */ typedef short s16; typedef unsigned short u16; typedef long s32; typedef unsigned long u32; typedef long long int s64; /* SYMBOL_RANGE is the dimension of an array capable of storing one of each possible value of typedef symbol. */ #define SYMBOL_RANGE 256 typedef char symbol; /* NOTE: %%% Although it looks like I am trying to maintain independence from having the symbol type be 'char', I actually have not accomplished this in the code. However, it won't be a total mess to convert it to 16-bit symbols or something like that, if someone decides that's necessary. The worst part will designing a new command-line syntax for specifying the symbolset for a search. However, *BEWARE*! If you're increasing the symbol set significantly past its original level of about 40 symbols, the users will pay dearly in efficiency (runtime). Having a great variety of symbols will massively slow down the search. In particular, please resist the temptation to make one symbol for every integer (or even every prime number) from 1 to 1000, or some other arbitrary big number. There should not be more integer symbols than every other symbol combined! RIES is perfectly happy synthesizing the larger integers from the small ones on its own, just like it does for the fractions and irrationals. If the fact that a search for "34.0" yields "x / 2 = 4 ^ 2 + 1" as an answer, the solution isn't to make "34" a symbol. Instead, look at ways to make RIES automatically combine integer subexpressions like "4^2+1" and, when possble, simplify the equation by moving the 2 to the RHS. On the other hand, keep in mind that the ability to show "34" expressed in terms of "two 2's, a 4 and a 1" is one of the reasons RIES was created. */ /* phantom symbols */ #define PS_REVSUB 1 #define PS_REVDIV 2 #define PS_cross 3 #define PS_REVPOW 4 #define IS_PHANTOM(x) (x < 10) /* this struct is used for expressions that have been inserted into LHS or RHS lists. */ typedef struct expr { double val; /* The expression's floating-point value */ double der; /* The derivative (for LHS expressions only: if RHS, this will be 0.) */ struct expr *left; /* left child tree or 0 if none */ struct expr *up; /* parent node, or 0 if we're the head */ struct expr *right; /* right child tree or 0 if none */ s16 elen; /* number of symbols in the expression, e.g. 8 */ symbol sym[MAX_ELEN+1]; /* The expression in symbolic form, e.g. "p6*1-qe-" this field is last to allow for variable length allocation */ } expr; /* size: 8+8+4+4+4+2+12=42 */ typedef struct rhs_node { double val; /* The expression's floating-point value */ void * P_L; /* parent and left child pointers */ void * P_R; /* parent and right child pointers */ symbol sym[MAX_ELEN+1]; /* The expression in symbolic form, e.g. "p6*1-qe-", null-terminated. This field is last to allow for variable length allocation in the future. */ } rhs_node; /* size: 8+4+4+12=28 */ /* P P^L P^R L R The three links (parent, left-child and right-child) are XORed together into the two fields P^R and P^L. As the tree is traversed, the traversal routine always keeps track of what node it came from and what that node's numeric value was. Then the three links are reconstructed as follows: If the current node was reached by going down, we know P. L and R are reconstructed by XORing P with the two fields P^R and P^L. If this node was reached by going up, we know either L or R. To find out which one, we compare the val field of this node with that of the node we just came from. If it's bigger, we came from the left child and therefore, L is known. If it's smaller we came from the right-child and R is known. If R is known, P is reconstructed by XORing R with P^R. Then L is reconstructed by XORing P with P^L. If L is known, the same method is used to get P and R. There is another, perhaps more direct way to save one pointer per node: maintain a stack of parent pointers. As you descend, push parent pointers on the stack, and pop them off as you ascend. However, it does not appear that the methods can be combined to save two pointers per node: if you descend, the node you arrive at has unknown L and R, both of which would have to get reconstructed. */ typedef struct obt_traversal { s16 side; /* LHS or RHS */ void * realnode; /* points to actual LHS or RHS node (0 if we're at the root) */ double x; /* The value */ double dx; /* Derivative */ void * up; /* Reconstructed parent pointer */ void * left; /* Reconstructed left-child pointer */ void * right; /* Reconstructed right-child pointer */ /* If they want the expression, they can get it from *realnode */ } obt_traversal; /* %%% obt (which literally stands for "obfuscated binary tree") routines aren't written yet. */ void obt_new(obt_traversal * it); void obt_clone(obt_traversal * from, obt_traversal * to); void obt_up(obt_traversal * it); void obt_left(obt_traversal * it); void obt_right(obt_traversal * it); /* M E T A S T A C K ! A metastack works like a stack of stacks. Imagine a normal stack with its push and pop operators. A metastack treats the whole stack as an object. Every time you do a push or pop, a copy of the *entire stack* gets pushed onto the metastack. Metastacks support a third operation, called "undo", which pops a stack off of the metastack; this popped stack replaces the current stack. Thus, undo enables you to go "back in time", returning the stack to the state it was at some point in the past without having to remember what values were popped and pushed. This is useful in RIES's expression generator because it allows intermediate calculations to be reused from one expression to the next during the recursive scan, and upon backtracking the stack from earlier, shorter subexpressions is restored. */ typedef struct metastack { double uv[MAX_ELEN-1]; /* undo values */ double udv[MAX_ELEN-1]; /* undo values (differentials) */ s16 uvp; /* undo values pointer */ s16 ms[MAX_ELEN * 2]; /* metastack (undo opcodes) */ s16 msp; /* metastack pointer (undo opcodes index) */ double s[(MAX_ELEN+1) >> 1]; /* current stack */ double ds[(MAX_ELEN+1) >> 1]; /* stack of differentials for LHS */ s16 sp; /* stack pointer */ } metastack; /* the metastack undo opcodes. If you wanted to support more than simple push and pop you would add another opcode for each operation (example: a modify-in-place operator, something which normally requires a pop followed by a push) */ #define MSO_PUSH 1 #define MSO_POP 2 /* "pe" stands for "partial expression". This struct is used by the recursive expression generator. Typically about half of its symbols will be filled in, and the complexity of the symbols written so far is compered to rminw and rmaxw for pruning. Here's an example. Let's say the current weight limits are 30 to 31: we're trying to generate expressions whose total weight is either 30 or 31. And suppose further that the partial expression currently contains two symbols with a total weight of 18. That means that the remaining symbols must add up to either 12 or 13 for the expression to be accepted. When it's generating expressions, RIES already knows what form (see below) the expression fits into, and therefore it knows how many symbols are left to generate and what types they are. Therefore, certain statements can be made about the weights of the symbols yet-to-be generated. Let's say the current form calls for two more symbols, both of type 'b', and suppose further that all 'b' symbols have weights of 8, 9 or 10. Well, that means that at this point in our example equation, even if the two remaining symbols are 8's, the total weight will be too high. Thus, the first two symbols form an impossible combination (from the point of view of trying to meet the current range of 30 to 31), and the expression generator can backtrack immediately, without having to proceed to explore all the combinations of two more type 'b' symbols that might be added here. That saves a LOT of time -- this optimization alone cuts the number of expressions by a factor of over 200 (for searches of 10,000,000 expressions) and even more if a larger number of expressions is searched. */ typedef struct pe { s16 comp; /* complexity of this partial-expression */ s16 elen; /* number of symbols, e.g. 2 */ symbol sym[MAX_ELEN+1]; /* the symbols, e.g. "p6" */ s16 rminw[MAX_ELEN+1]; /* remaining minimum weight */ s16 rmaxw[MAX_ELEN+1]; /* remaining maximum weight */ } pe; /* A "form" is a pseudo-expression consisting only of A's, B's and C's. The A's, B's and C's represent the three types of symbols that make up real expressions. Before real expressions are generated, RIES first determines what sequences of type-A, type-B and type-C symbols will constitute legal expressions. (It's easier to describe what an illegal expression is: It's an expression that causes the stack to underflow by executing an operation without enough operands, or an expression that leaves extra stuff on the stack when it's done.) */ typedef struct form { s16 min_weight; /* min attainable complexity with this form */ s16 max_weight; /* max attainable complexity with this form */ s16 stack; /* stack depth at end of form */ s16 flen; /* number of symbols in form */ symbol sym[MAX_ELEN+2]; /* the form, e.g. "aabc" */ } form; /* -------------- variables ----------------------------------------------- */ char *freepool; /* pointer to current alloc block */ u32 freesize; /* and bytes left in current alloc block */ expr *lhs_root; /* binary tree for LHS list */ expr *rhs_root; /* binary tree for RHS list */ s16 on_rhs; /* This is nonzero if we're currently generating expressions for the RHS tree. */ s16 lmax; /* current complexity maximum limit for LHS */ s16 rmax; /* current complexity maximum limit for RHS */ s16 lmin; /* current complexity minimum limit for LHS */ s16 rmin; /* current complexity minimum limit for RHS */ s16 tlevel; /* The maximum level (depth) of the search. The search ends when LHS + RHS > tlevel */ double target; /* The value for which we are searching */ double exec_x; /* the value exec() uses for 'x' symbol */ s16 got_exact; symbol sym_class[SYMBOL_RANGE]; /* the class of each symbol */ s16 sym_weight[SYMBOL_RANGE]; /* for scoring */ s16 weight_base; /* weight per symbol for expression complexity score */ u16 sym_mask[SYMBOL_RANGE]; /* The attributes, masked with this... */ u16 sym_mval[SYMBOL_RANGE]; /* must equal this. */ s16 sym_yes[SYMBOL_RANGE]; /* nonzero if this symbol is given in -S */ s16 sym_once[SYMBOL_RANGE]; /* nonzero if this symbol is given in -O */ s16 sym_no[SYMBOL_RANGE]; /* nonzero if this symbol is given in -N */ s16 sym_allowed[SYMBOL_RANGE]; /* Number of symbols allowed in each expression */ s16 sym_count[SYMBOL_RANGE]; /* used in ge_2() to keep track of how many symbols are in expression; part of -O option. */ char * sym_def[SYMBOL_RANGE]; /* strings which, when printed, define a symbol's meaning */ s16 sym_def_given[SYMBOL_RANGE]; s16 sym_def_needed[SYMBOL_RANGE]; #define LINELEFT_INIT (79-2) char * sym_forth[SYMBOL_RANGE]; /* The symbol's representation in postfix output format */ s16 S_option; s16 int_subexpr; /* nonzero if they gave -i option */ s16 only_exact; /* nonzero to print only the exact match, if any */ s16 relative_x; /* nonzero if X values should be given relative to T */ /* attribute masks */ #define AM_KK 0x0001 /* K K - */ #define AM_1 0x0002 /* 1 - */ #define AM_2 0x0004 /* 2 - */ #define AM_n 0x0008 /* n - */ #define AM_r 0x0010 /* r - */ #define AM_55 0x0020 /* J K - where J and K are both 5 or less */ #define AM_jK 0x0040 /* J K - where J < K */ #define AM_RHS 0x0080 /* set only when filling RHS list */ #define AM_sq 0x0100 /* o - where o is 's' or 'q' */ #define AM_1K 0x0200 /* 1 K - */ #define AM_l 0x0400 /* l - */ #define AM_E 0x0800 /* E - */ #define AM_p 0x1000 /* p - */ symbol asym[20]; /* the valid class 'a' symbols */ s16 n_asym; s16 a_minw; /* minimum weight of class 'a' symbols */ s16 a_maxw; /* maximum weight of class 'a' symbols */ symbol bsym[20]; /* the valid class 'b' symbols */ s16 n_bsym; s16 b_minw; /* minimum weight of class 'b' symbols */ s16 b_maxw; /* maximum weight of class 'b' symbols */ symbol csym[20]; /* the valid class 'c' symbols */ s16 n_csym; s16 c_minw; /* minimum weight of class 'c' symbols */ s16 c_maxw; /* maximum weight of class 'c' symbols */ s16 s_minw; /* minimum weight of any symbol */ s16 max_flen; /* max length of forms generated thus far */ /* Irrational constants have lots-o-digits just in case this program ever gets ported to a C compiler that offers quad-precision floating point. */ double k_0 = 0.0; double k_1 = 1.0; double k_phi = 1.61803398874989484820458683436563811772030; double k_2 = 2.0; double k_e = 2.71828182845904523536028747135266249775724; double k_3 = 3.0; double k_pi = 3.14159265358979323846264338327950288419716; double k_4 = 4.0; double k_5 = 5.0; double k_6 = 6.0; double k_7 = 7.0; double k_8 = 8.0; double k_9 = 9.0; /* These constants are used to set legal limits in various functions */ double k_prune_deriv = 1.0e-10; double k_sin_clip = 0.99999; double k_2pi = 6.2831853071795864; double k_eXlim = 690.0; double k_matchstart = 0.01; double p_ovr; double n_ovr; double best_match; u32 g_ne; u32 insert_count; u32 prune_count, lhs_prune, rhs_prune; u32 mem_total; u16 postfix_output; u32 lhs_insert, rhs_insert; u32 lhs_gen, rhs_gen; u32 gen_total; /* debugging options: */ /* numbers from "ries -l2 2.5063141592653589 -DJ | wc -l" */ s16 debug_s; /* report_match: "show work" 300 */ /* (values of all subexpressions) */ s16 debug_o; /* eval: sym, x and dx at each step 788 */ s16 debug_n; /* newton: x and dx at each step 255 */ s16 debug_p; /* check_match entry 453890 */ s16 debug_q; /* check_match past 1st-stage test 109 */ s16 debug_m; /* ms_push, ms_pop, ms_peek, ms_undo 10216232 */ s16 debug_r; /* exec 1706583 */ /* ge_2: (uppercase for LHS, lowercase for RHS) */ /* exec: */ s16 debug_A; /* prune partial exec error 29949 64184 */ s16 debug_B; /* prune partial zero 196 2797 */ s16 debug_C; /* prune partial noninteger with -i 38 38 */ s16 debug_D; /* prune partial overflow 1681 4171 */ s16 debug_E; /* prune partial dx near 0 1333 */ /* full expr: */ s16 debug_F; /* prune full dx near 0 1157 */ s16 debug_G; /* insert 305319 670210 */ /* partial expr: */ s16 debug_H; /* rules 361668 731891 */ s16 debug_I; /* symbols to try 3286347 6618747 */ s16 debug_J; /* prune complexity 2111413 4235348 */ s16 debug_K; /* prune rules 208550 451853 */ s16 debug_L; /* prune symcount 38 38 */ /* ge_1: */ s16 debug_t; /* entry 9380 */ s16 debug_u; /* rminw and rmaxw calculation 31649 */ s16 debug_v; /* expressions generated 4709 */ s16 debug_w; /* gf_1 28978 */ s16 debug_x; /* add_rule 89 */ s16 debug_y; /* main loop 353 */ #define DBG_LHS 2 #define DBG_RHS 1 /* -------------- prototypes ---------------------------------------------- */ void ms_push(metastack *ms, double x, double dx); double ms_pop(metastack *ms, double *diff); double ms_peek(metastack *ms, double *diff); void ms_undo(metastack *ms); s16 exec(metastack *ms, symbol op, s16 *undo_count, s16 do_dx); s16 infix_1(symbol * expr, char * term, symbol * t_op); void infix_preproc(symbol * expr, symbol * out); s16 postfix(symbol * expr, char * term); s16 postfix_formatter(symbol * expr, s16 print); s16 complexity(symbol * expr); s16 eval(symbol * expr, double * val, double * dx, s16 show_work); s16 newton(symbol * lhs, symbol * rhs, double *root); void define_symbols(void); s32 gettime(void); void print_end(void); void exact_exit(void); void report_match(expr *lhs, expr *rhs, pe *exm, double score); void check_match(expr * lhs, expr * rhs); expr * bt_prev(expr *it); expr * bt_next(expr *it); void check_sides(expr * it); s16 bt_insert(double x, double dx, pe *ex, s16 * res1); u32 ge_2(form *base, pe *bpe, s16 e_minw, s16 e_maxw, metastack *ms); u32 ge_1(form *base, s16 e_minw, s16 e_maxw); u32 gf_1(form *base, s16 minw, s16 maxw); u32 gen_forms(s16 minw, s16 maxw); void add_symbol(symbol sym, symbol type, s16 weight, char *forth, char * definition); void add_rule(symbol * symset, symbol sym, s16 mask, s16 mval); void init1(void); void init2(void); char * my_alloc(u32 size, double val); int main(int nargs, char *argv[]); /* -------------- functions ----------------------------------------------- */ /* All allocation is done in fixed-size blocks, and nothing is ever * deallocated, so it's very easy to manage. This is your typical * cached blocked allocate funtion */ char * my_alloc(u32 size, double val) { s16 do_alloc; char * rv; /* assume the worst */ rv = 0; /* first, see if there's a block with enough space */ do_alloc = 0; if (freepool) { if (freesize >= size) { /* we're okay */ } else { do_alloc = 1; } } else { do_alloc = 1; } /* if we need to allocate, do so */ if (do_alloc) { freesize = 0; freepool = (char *) malloc((size_t) ALLOC_SIZE); /* did we get any? */ if (freepool) { freesize = ALLOC_SIZE; mem_total += (ALLOC_SIZE / 1024L); } } /* if we have a block now, we can allocate from it */ if (freepool) { rv = freepool; freepool += size; /* %%% how do I find out the alignment requirement for pointers to structs? */ freesize -= size; } return(rv); } /* metastack routines The following illustrate the contents of the stack and undo list for some sample expressions. The expressions read from top to bottom, and there is one line for each push or pop. (There are no undo's illustrated here; an undo would be just moving up to a previous line). At the end of each example is the total number of steps: This is the number of items in the undo list. op stack --undo-list-- op stack --undo-list-- 1 1 2 2 2 1 2 s - 2 4 1 2 4 4 2 8 1 2 4 8 s - 2 4 + 1 2 4 8 16 2 4 1 2 8 4 q - 2 4 16 1 2 12 8 4 4 2 4 16 + 1 2 8 4 12 s - 2 4 16 4 1 8 4 12 2 16 2 4 16 4 1 14 8 4 12 2 q - 2 4 16 4 16 + 1 8 4 12 2 14 4 2 4 16 4 16 - 8 4 12 2 14 1 s - 2 4 16 4 16 4 15 8 4 12 2 14 1 16 2 4 16 4 16 4 total steps: 13 total steps: 13 op stack --undo-list-- 1 1 1 1 1 n 1 1 1 -1 1 r 1 1 -1 1 -1 1 -1 - 1 1 -1 -1 - 1 -1 -1 1 2 1 -1 -1 1 r - 1 -1 -1 1 2 0.5 1 -1 -1 1 2 total steps: 11 */ void ms_init(metastack *ms) { ms->uvp = 0; ms->msp = 0; ms->sp = 0; } /* ms_push does a standard PUSH operation. */ void ms_push(metastack *ms, double x, double dx) { s16 sp, msp; /* push x */ sp = ms->sp; (ms->ds)[sp] = dx; (ms->s)[sp] = x; /* remember */ msp = ms->msp; ms->ms[msp] = MSO_PUSH; if (debug_m) { printf("push %d %lg (%lg) '%d'\n", sp, x, dx, msp); } /* for a push, we don't need to add any undo values */ sp++; ms->sp = sp; msp++; ms->msp = msp; } /* ms_pop does a standard POP operation. */ double ms_pop(metastack *ms, double *diff) { s16 sp, msp, uvp; double rv, drv; /* pop a value */ sp = ms->sp; sp--; rv = (ms->s)[sp]; drv = (ms->ds)[sp]; if (diff) { *diff = drv; } ms->sp = sp; /* remember the action and the data */ msp = ms->msp; ms->ms[msp] = MSO_POP; /* save the popped values in the undo list */ uvp = ms->uvp; if (debug_m) { printf("pop %d %lg (%lg) '%d' .%d.\n", sp, rv, drv, msp, uvp); } ms->udv[uvp] = drv; ms->uv[uvp] = rv; msp++; ms->msp = msp; uvp++; ms->uvp = uvp; return rv; } /* ms_peek just lets you see what's on the top of the stack. */ double ms_peek(metastack *ms, double *diff) { s16 sp; double rv, drv; sp = (ms->sp)-1; rv = ms->s[sp]; drv = ms->ds[sp]; if (diff) { *diff = drv; } if (debug_m) { printf("peek %d %lg (%lg)\n", sp, rv, drv); } return (rv); } /* ms_undo performs a metastack UNDO operation -- it returns the stack to the state it was in before the most recent PUSH or POP. */ void ms_undo(metastack *ms) { s16 msp, opcode; /* pop the opcode */ msp = ms->msp; opcode = ms->ms[--msp]; ms->msp = msp; if (opcode == MSO_PUSH) { /* to undo a PUSH is easy -- just pop the SP. */ ms->sp--; if (debug_m) { printf("undo-push\n"); } } else { s16 uvp, sp; double uv, udv; /* undo a POP. This is like a PUSH except we have to retrieve the value from the undo data. */ uvp = ms->uvp; uvp--; uv = ms->uv[uvp]; udv = ms->udv[uvp]; ms->uvp = uvp; sp = ms->sp; ms->ds[sp] = udv; ms->s[sp] = uv; sp++; ms->sp = sp; if (debug_m) { printf("undo-pop\n"); } } } /* Cody, William J., Jr., and Waite, William. Software Manual for the Elementary Functions. Prentice-Hall (Englewood Cliffs, New Jersey, 1980). */ /* the man page for atan2 doesn't specify its behavior when both arguments are zero, so just to be safe I am defining it myself. */ double arctan(double a, double b) { double rv; if ((a == k_0) && (b == k_0)) { rv = k_0; } else { rv = atan2(a,b); } return rv; } /* sinh sinh(x) = (e^x - e^-x) / 2 */ /* complex natural logarithm cln(z) = ln(cabs(z)) + phase(z) i */ /* complex exponential cexp(z) = exp(a) (cos b + i sin b) */ /* complex sine (and sinh) csin(z) = (cexp(i z) - cexp(-i z)) / 2i csin(i z) == i csinh(z) csin(z) = (cexp(i z) - cexp(-i z)) / 2i = (cexp(ai - b) - cexp(b - ai)) / 2i = (e^-b (cos a + i sin a) - e^b (cos a - i sin a)) / 2i = (e^-b cos a + i e^-b sin a - e^b cos a + i e^b sin a) / 2i = i 1/2 e^-b cos a - 1/2 e^-b sin a - i 1/2 e^b cos a - 1/2 e^b sin a a + bi = i 1/2 e^-b cos a - 1/2 e^-b sin a - i 1/2 e^b cos a - 1/2 e^b sin a a = - 1/2 e^-b sin a - 1/2 e^b sin a bi = i 1/2 e^-b cos a - i 1/2 e^b cos a a = - 1/2 e^-b sin a - 1/2 e^b sin a = - sin a cosh b b = 1/2 e^-b cos a - 1/2 e^b cos a = - cos a sinh b b = - cos a (e^b - e^-b)/2 b 2 / (e^-b - e^b) = - cos a -2 b / (e^-b - e^b) = cos a 2 b / (e^b - e^-b) = cos a a = 2 pi K + cos' (2 b / (e^b - e^-b)) a = - sin a (e^b + e^-b)/2 -1 = sin(a)/a (e^b + e^-b)/2 -2/(e^b + e^-b) = sin(a)/a - bell(b) = sin(a)/a b = +/- bell' (-sin(a)/a) [defined only when sin(a)/a < 0] +/- bell'(-sin(a)/a) = - cos a sinh(+/- bell'(sin(a)/a)) 2 pi K + cos' (2 b / (e^b - e^-b)) = - cosh b sqrt(1 - (2 b / (e^b - e^-b))^2) */ /* More elaborate functions follow. These are the ones that do not come directly out of the closure of the simple operators + - * / ^ v (as is the case for log, exp, sin and cos and their related functions) but from new concepts such as the integral of a function, etc. This includes the standard distribution, the Gamma function, Bessel functions, etc. */ /* Bernoulli numbers given by the iterative recursive algorithm: B(1) = 1/2 B(2) = 1/3 (3 B(1) - 1) B(3) = 1/4 (6 B(2) - 4 B(1) + 1) B(4) = 1/5 (10 B(3) - 10 B(2) + 5 B(1) - 1) (etc.) where all the coefficients are from Pascal's Triangle. B(n) is 0 for all odd n above 1. B(n) for n even can also be computed numerically by: B(2n) = zeta(2n) 2 (2n)! / [ -1^(n+1) (2 pi)^(2n) ] where zeta(2n) is the simple (positive real) Riemann Zeta function This should produce the following: 2 1.666650307245059 10-001 4 -3.333333333333231 10-002 6 2.380952380952373 10-002 8 -3.333333333333330 10-002 10 7.575757575757569 10-002 12 -2.531135531135529 10-001 14 1.166666666666665 16 -7.092156862745091 18 54.971177944862090 20 -529.124242424241500 22 6192.123188405786000 24 -86580.253113552940000 26 1425517.166666664000000 28 -2.729823106781603 10+07 30 6.015808739006410 10+08 32 -1.511631576709212 10+10 */ /* Gamma function and generalized factorial Important: the number of steps of substituting G(x) = x G(x-1), and the number of terms in the series, both depend on the precision you want. If you add more terms without increasing the number of substitution steps, the series diverges! */ /* binomial coefficients The binomial coefficient "n over k" is: bincoef(n,k) = n! / ((n-k)! k!) = exp(lngamma(n) - (lngamma(n-k) + lngamma(k))) */ /* Riemann Zeta function The full Riemann Zeta function for any complex z is computed by the double infinite sum Z(z) = 2^(z-1) SIGMA[n=0..inf] 1/2^(n+1) f(n,z) where f(n,z) = SIGMA[k=0..n] -1^k bincoef(n,k) (k+1)^(-z) for the critical line real(z) = 1/2, we have the simpler Z(s) = 1/(1-2^(1-s)) SIGMA[n=1..inf] -1^(n-1) n^(-s) */ /* exec actually executes an opcode, using a metastack. It returns a nonzero value if there was an error, e.g. divide-by-zero. It also sets undo_count to a number indicating the number of times you have to call ms_undo to put the stack back to the state it was in before calling exec on this opcode. there is plenty of error checking for zero and negative arguments, but no checking for overflow. The reason is that only a few expressions, (the simplest is 445^^) are capable of overflowing -- so it's not much of an optimization. There's no danger of the program generating an exception from overflow, so we don't bother. */ s16 exec(metastack *ms, symbol op, s16 *undo_count, s16 do_dx) { double a, b, rv; double da; double db; double drv; double lb; /* set default for differential (overridden if we compute it) */ drv = k_0; switch(op) { /* class 'a' symbols. For all constants the derivative is zero; for X the derivative is 1.0 */ case '1' : rv = k_1; ms_push(ms, rv, k_0); *undo_count = 1; break; case 'f' : rv = k_phi; ms_push(ms, rv, k_0); *undo_count = 1; break; case '2' : rv = k_2; ms_push(ms, rv, k_0); *undo_count = 1; break; case 'e' : rv = k_e; ms_push(ms, rv, k_0); *undo_count = 1; break; case '3' : rv = k_3; ms_push(ms, rv, k_0); *undo_count = 1; break; case 'p' : rv = k_pi; ms_push(ms, rv, k_0); *undo_count = 1; break; case '4' : rv = k_4; ms_push(ms, rv, k_0); *undo_count = 1; break; case '5' : rv = k_5; ms_push(ms, rv, k_0); *undo_count = 1; break; case '6' : rv = k_6; ms_push(ms, rv, k_0); *undo_count = 1; break; case '7' : rv = k_7; ms_push(ms, rv, k_0); *undo_count = 1; break; case '8' : rv = k_8; ms_push(ms, rv, k_0); *undo_count = 1; break; case '9' : rv = k_9; ms_push(ms, rv, k_0); *undo_count = 1; break; case 'x' : rv = exec_x; drv = k_1; ms_push(ms, rv, drv); *undo_count = 1; break; /* class 'b' symbols */ case 'I' : /* Identity: used only when all other class-b symbols have been excluded */ /* n(a+bi) = -a + -bi */ a = ms_pop(ms, &da); rv = a; drv = da; ms_push(ms, rv, drv); *undo_count = 2; break; case 'n' : /* negate */ /* n(a+bi) = -a + -bi */ a = ms_pop(ms, &da); if (do_dx) { drv = - da; } rv = -a; ms_push(ms, rv, drv); *undo_count = 2; break; case 'r' : /* reciprocal */ /* r(a+bi) = (a-bi) / (a^2 + b^2) */ a = ms_pop(ms, &da); *undo_count = 1; if (a == k_0) { return -1; } if (do_dx) { drv = ( - da) / (a * a); } rv = k_1 / a; ms_push(ms, rv, drv); *undo_count = 2; break; case 's' : /* squared */ a = ms_pop(ms, &da); if (do_dx) { drv = k_2 * a * da; } rv = a * a; ms_push(ms, rv, drv); *undo_count = 2; break; case 'q' : /* square root */ /* q(a+bi) is the conjugate with the same imaginary sign as bi. for example, q(i) = (1+i)/sqrt(2); q(-i) = (1-i)/sqrt(2) */ a = ms_pop(ms, &da); *undo_count = 1; if (a < k_0) { return -1; } rv = sqrt(a); if (do_dx) { drv = da / (k_2 * rv); } ms_push(ms, sqrt(a), drv); *undo_count = 2; break; case 'l' : /* ln */ a = ms_pop(ms, &da); *undo_count = 1; if (a <= k_0) { return -1; } if (do_dx) { drv = da / a; } rv = log(a); ms_push(ms, rv, drv); *undo_count = 2; break; case 'E' : /* e ^ X */ /* E(a+bi) = e^x (cos b + sin b i) */ a = ms_pop(ms, &da); *undo_count = 1; if (a > k_eXlim) { return -1; } rv = exp(a); if (do_dx) { drv = rv * da; } ms_push(ms, rv, drv); *undo_count = 2; break; case 'S' : /* sine */ /* d/dx sin(u) = cos(u) */ a = ms_pop(ms, &da); *undo_count = 1; if ((a >= k_2pi) || (-a >= k_2pi)) { /* This is to eliminate nonsense "solutions" like "sin(X^9) = 1/4" */ return -1; } rv = sin(a); if (fabs(rv) > k_sin_clip) { /* This is to eliminate stuff like "sin(pi/2 + 0.00001) = 1" */ return -2; } if (do_dx) { drv = cos(a) * da; } ms_push(ms, rv, drv); *undo_count = 2; break; case 'C' : /* cosine */ /* d/dx cos(u) = - sin(u) */ a = ms_pop(ms, &da); *undo_count = 1; if ((a >= k_2pi) || (-a >= k_2pi)) { /* This is to eliminate nonsense "solutions" like "sin(X^9) = 1/4" */ return -1; } rv = cos(a); if (fabs(rv) > k_sin_clip) { /* This is to eliminate stuff like "cos(0.00001) = 1" */ return -2; } if (do_dx) { drv = k_0 - (sin(a) * da); } ms_push(ms, rv, drv); *undo_count = 2; break; /* class 'c' symbols */ case '-' : b = ms_pop(ms, &db); a = ms_pop(ms, &da); *undo_count = 2; /* this operator shares the loss-of-precision tests with '+' */ b = - b; db = - db; goto add_common; case '+' : b = ms_pop(ms, &db); a = ms_pop(ms, &da); *undo_count = 2; add_common: ; rv = a + b; if ((fabs(a) < (fabs(b) * k_prune_deriv)) || (fabs(b) < (fabs(a) * k_prune_deriv)) ) { /* loss-of-precision error, e.g. "1 + 1e40" */ return -2; } if ((fabs(rv) < (fabs(a) * k_prune_deriv)) || (fabs(rv) < (fabs(b) * k_prune_deriv)) ) { /* another loss-of-precision error, e.g. "1e40 - (1e40+1)" */ return -2; } if (do_dx) { drv = da + db; } ms_push(ms, rv, drv); *undo_count = 3; break; case '*' : /* d/dx u v = v du + u dv */ b = ms_pop(ms, &db); a = ms_pop(ms, &da); if (do_dx) { drv = (b * da) + (a * db); } rv = a * b; ms_push(ms, rv, drv); *undo_count = 3; break; case '/' : /* d/dx u/v = (v du - u dv) / v^2 */ b = ms_pop(ms, &db); *undo_count = 1; if (b == k_0) { return -1; } a = ms_pop(ms, &da); if (do_dx) { drv = ((b * da) - (a * db)) / (b * b); } rv = a / b; ms_push(ms, rv, drv); *undo_count = 3; break; case '^' : /* a to the power of b */ /* d/dx u^v = v u^(v-1) du + ln(u) u^v dv */ b = ms_pop(ms, &db); a = ms_pop(ms, &da); *undo_count = 2; if (a <= k_0) { return -1; } rv = pow(a, b); if (do_dx) { drv = rv * ( (b * da / a) + (log(a) * db) ); } ms_push(ms, rv, drv); *undo_count = 3; break; case 'v' : /* the bth root of a, that is, a^(1/b) */ b = ms_pop(ms, &db); *undo_count = 1; if (b == k_0) { return -1; } a = ms_pop(ms, &da); *undo_count = 2; if (a < k_0) { return -1; } rv = pow(a, k_1 / b); if (do_dx) { drv = rv * ( (da / (a * b)) - (log(a) * db / (b*b)) ); } ms_push(ms, rv, drv); *undo_count = 3; break; case 'L' : /* log base b of a, that is, ln(a)/ln(b) */ b = ms_pop(ms, &db); *undo_count = 1; if ((b <= k_0) || (b == k_1)) { return -1; } a = ms_pop(ms, &da); *undo_count = 2; if (a <= k_0) { return -1; } lb = log(b); rv = log(a) / lb; if (do_dx) { drv = (da / (a * lb)) - (rv * db / (b * lb)); } ms_push(ms, rv, drv); *undo_count = 3; break; case 'A' : /* atan2 function (quadrant-correct arctangent) */ b = ms_pop(ms, &db); a = ms_pop(ms, &da); rv = arctan(a,b); if (do_dx) { /* %%% I need to verify this. It also needs to be transformed into a form that won't lose accuracy when computed (consider b near 0) probably need to use two forms: one for when fabs(a) > fabs(b), and one for the other case. */ drv = (k_1 / (k_1 + (a*a / b*b))) * (b * da - a * db) / (b*b); } ms_push(ms, rv, drv); *undo_count = 3; break; default: break; } if (debug_r) { if (do_dx && sym_class[op] == 'c') { printf("exec %8.5lg", a); printf(" (%8.5lg)", da); printf(" '%c'", op); printf(" %8.5lg", b); printf(" (%8.5lg)", db); printf(" -> %10.6lg", rv); printf(" d/dx=%10.6lg", drv); } else { if (sym_class[op] != 'a') { printf("exec %14.10lg", a); if (do_dx) { printf(" (%14.10lg)", da); } } printf(" '%c'", op); if (sym_class[op] == 'c') { printf(" %14.10lg", b); if (do_dx) { printf(" (%14.10lg)", db); } } printf(" -> %14.10lg", rv); if (do_dx) { printf(" d/dx=%14.10lg", drv); } } printf("\n"); } return 0; } /* infix_1 pulls one full subexpression off the end of an expression, converts it to infix, and writes the result into the supplied char *. It is still in one-character-per-symbol format. It also modifies the expression by moving its terminating 0 backwards. It writes the term's final operator into t_op, which is to aid the caller in determining whether precedence will require the use of parentheses. infix_1 does not put parentheses around the term, however it does parenthesize subterms within the term. */ s16 infix_1(symbol * expr, char * term, symbol * t_op) { s16 iptr, optr; /* input, output pointers */ symbol op; char st_a[MAX_ELEN * 4]; /* subterm A */ symbol op_a; /* and its operator */ char st_b[MAX_ELEN * 4]; /* subterm B */ symbol op_b; s16 swap; char * term_a; /* for exchanging the two terms if we decide to do so */ char * term_b; /* for exchanging the two terms if we decide to do so */ symbol op_t; /* for swapping op_a and op_b */ char * s; /* for copying subterms to output */ s16 prec_a, prec_b; /* precedence flags for class 'c' operators */ optr = 0; /* go to the end */ iptr = 0; while(expr[iptr]) { iptr++; } /* check for underrun */ if (iptr == 0) { return 1; } /* get the operator */ --iptr; op = expr[iptr]; expr[iptr] = 0; *t_op = op; /* check its type */ switch(sym_class[op]) { case 'a': /* oooh, this is easy */ term[optr++] = op; /* we terminate the input string here. */ expr[iptr] = 0; break; case 'b': /* we've got the op, get the term */ if(infix_1(expr, st_a, &op_a)) { return 1; } swap = 0; if (op == 's') { swap = 1; } /* %%% what happens next should depend on op and op_a */ if (swap == 0) { term[optr++] = op; } term[optr++] = '('; for(s = st_a; *s; s++) { term[optr++] = *s; } term[optr++] = ')'; if (swap) { term[optr++] = op; } break; case 'c': /* get the terms */ if(infix_1(expr, st_b, &op_b)) { return 1; } if(infix_1(expr, st_a, &op_a)) { return 1; } /* default is to keep them in order */ term_a = st_a; term_b = st_b; swap = 0; /* optional swaps for the commutative operators if adding, try to put a constant at the end */ if ((op == '+') && (sym_class[op_a] == 'a') && (op_a != 'x')) { swap = 1; } /* rules for multiplying */ if (op == '*') { /* a bare class-a symbol always precedes an expression %%% this swap should not be done if the baresymbol is 'x' and the expression has no 'x' */ if ((sym_class[op_a] != 'a') && (sym_class[op_b] == 'a')) { swap = 1; } /* with two bare symbols multiplied by each other, swap if the latter is an integer. Note: the integer test of op_b implicitly tests for the second term being class-a. */ if ((sym_class[op_a] == 'a') && (op_b >= '0') && (op_b <= '9')) { swap = 1; } } /* Always swap for operators that are inherently swapped. */ if (op == 'v') { swap = 1; } else if (op == PS_REVSUB) { swap = 1; op = '-'; *t_op = '-'; } else if (op == PS_REVDIV) { swap = 1; op = '/'; *t_op = '/'; } else if (op == PS_REVPOW) { swap = 1; op = '^'; *t_op = '^'; } /* okay, do the swap */ if (swap) { term_a = st_b; term_b = st_a; op_t = op_a; op_a = op_b; op_b = op_t; } /* determine precedence */ prec_a = prec_b = 0; /* default don't use parens */ if (sym_class[op_a] == 'c') { prec_a = 1; /* default use parens */ if (strchr("+*/^v", op_a) && strchr("+-", op)) { prec_a = 0; } else if (strchr("*^v", op_a) && strchr("*/", op)) { prec_a = 0; } } if (sym_class[op_b] == 'c') { prec_b = 1; /* default use parens */ if ((op == '+') && strchr("+-*/^v", op_b)) { prec_b = 0; } else if ((op == '-') && strchr("*/^v", op_b)) { prec_b = 0; } else if ((op == '*') && strchr("*/^v", op_b)) { prec_b = 0; } else if ((op == '/') && strchr("^v", op_b)) { prec_b = 0; } } /* We're all set to generate output */ if (prec_a) { term[optr++] = '('; } for(s = term_a; *s; s++) { term[optr++] = *s; } if (prec_a) { term[optr++] = ')'; } if ((op == '*') && ((sym_class[op_a] != 'a') || (sym_class[op_b] != 'a') || (op_b > '9'))) { term[optr++] = '.'; /* PS_cross; */ } else { term[optr++] = op; } if (prec_b) { term[optr++] = '('; } for(s = term_b; *s; s++) { term[optr++] = *s; } if (prec_b) { term[optr++] = ')'; } break; } /* terminate input and output strings */ term[optr] = 0; return 0; } /* infix_preproc performs manipulations on expressions that make the infix conversion easier. For example, the "squared" operator gets turned into "^2". That gives the infix conversion less special cases to check for when it figures out where to put the parentheses. We can also do some simplification here, for example [xn2+] which is "-x+2" in infix, becomes [x2_] "2-x". */ void infix_preproc(symbol * expr, symbol * out) { symbol * s; symbol * o; s = expr; o = out; while(*s) { /* [-n] = "-(a-b)" seems to never occur. If it did, we could map [-n] -> [_] substring conversion of [-n] -> [_] where '_' is a reversed '-' operator. Likewise situation for [/r]; use '\'. */ if ((*s == 'n') && (sym_class[*(s+1)] == 'a') && (*(s+2) == '+')) { /* converting [xn2+] into [x2_]. Test case: 1.3204, "1/(-x+2) = 4/e" without this conversion and "1/(2-x) = 4/e" with. */ *o++ = *(s+1); *o++ = PS_REVSUB; s += 2; } else if (*s == 's') { /* [s] -> [2^] */ *o++ = '2'; *o++ = '^'; } else if (*s == 'r') { /* [r] -> [1\] */ *o++ = '1'; *o++ = PS_REVDIV; } else if (*s == 'E') { /* [E] -> [e`] where ` is reverse power */ *o++ = 'e'; *o++ = PS_REVPOW; } else { *o++ = *s; } s++; } *o++ = 0; } s16 postfix(symbol * expr, char * out) { symbol * s; char * o; for(s = expr, o = out; *o++ = *s++;) { } return 0; } s16 postfix_formatter(symbol * expr, s16 print) { char cv_scratch[MAX_ELEN + 1]; s16 i, len; char c; if (postfix(expr, cv_scratch)) { /* error */ return 0; } else { len = 0; for(i=0; cv_scratch[i]; i++) { c = cv_scratch[i]; if(sym_forth[c]) { len += strlen(sym_forth[c]); if(print) { printf("%s", sym_forth[c]); } } else { len++; if(print) { printf("%c", c); } } } } } /* complexity simply adds up the complexity scores (weights) in an expression. */ s16 complexity(symbol * expr) { symbol sym; s16 comp; comp = 0; while(sym = *expr++) { comp += sym_weight[sym]; } return comp; } /* eval evaluates an expression; useful for retrieving the values of both sides after a match, or for iterating Newton's method. It returns an error if it doesn't evaluate for some reason. */ s16 eval(symbol * expr, double * val, double * dx, s16 show_work) { metastack ms; symbol * s; symbol dbg_scratch[MAX_ELEN + 1]; symbol ss; s16 err, i; s16 undo_count; /* default return values */ *val = k_0; *dx = k_0; ms_init(&ms); for(s = expr, i=0; s[i]; i++) { err = exec(&ms, s[i], &undo_count, 1); if (err) { return err; } if (debug_o || show_work) { char escratch[MAX_ELEN + 1]; char fscratch[MAX_ELEN * 4]; symbol ss; double val, dval; s16 j; dbg_scratch[i] = s[i]; dbg_scratch[i+1] = 0; infix_preproc(dbg_scratch, escratch); err = infix_1(escratch, fscratch, &ss); val = ms_peek(&ms, &dval); if (debug_o) { printf("eval "); } for(j=strlen(fscratch); j<20; j++) { printf(" "); } printf("%s = %14.10lg", fscratch, val); if (dval != k_0) { printf(" (d/dx = %14.10lg)", dval); } printf("\n"); } } *val = ms_peek(&ms, dx); return 0; } /* newton performs Newton's method to determine the precise value of X for a given LHS and RHS to match. If the method does not converge or if there is an error the target value is returned instead. */ s16 newton(symbol * lhs, symbol * rhs, double *root) { double rhs_val; double x, dx, curr, prev; s16 i, err; *root = target; /* first get the RHS value. */ if (err = eval(rhs, &rhs_val, &dx, 0)) { return err; } if (debug_n) { printf("newton RHS [%s] = %14.10lg\n", rhs, rhs_val); printf("newton iterating LHS [%s]...\n", lhs); } curr = target; prev = curr - k_1; for(i=0; i<6; i++) { exec_x = curr; if (err = eval(lhs, &x, &dx, 0)) { exec_x = target; if (debug_n) { printf("newton got error from eval.\n"); } return err; } if (dx == k_0) { exec_x = target; if (debug_n) { printf("newton got zero derivative.\n"); } return 1; } if (debug_n) { printf("newton step %d v=%14.10lg dv=%14.10lg\n", i, x, dx); } if (curr == prev) { /* we be done */ i = 7; } else { prev = curr; curr = curr + ((rhs_val - x) / dx); } } /* restore exec's target value. very important!! (-:*/ exec_x = target; /* %%% make the following an exact compare-to-zero to test the "~=" in output feature */ if (fabs(curr - prev) < fabs(target * k_prune_deriv)) { /* if (fabs(curr - prev) == k_0) { */ *root = curr; if (debug_n) { printf("newton converged on %14.10lg\n", curr); } return 0; } if (debug_n) { printf("newton did not converge (last delta = %14.10lg).\n", curr - prev); } return 1; } void defsym_used(symbol * expr) { symbol * sym; sym = expr; while(*sym) { if (sym_def[*sym]) { sym_def_needed[*sym] = 1; } sym++; } } void define_symbols() { s16 sym; s16 lineleft, sl, clen; s16 going; char * candidate; symbol csym; going = 1; lineleft = LINELEFT_INIT; while (going) { going = 0; for(sym=0; sym clen) && (sl < lineleft) ) { csym = sym; candidate = sym_def[csym]; clen = strlen(candidate); } } } } /* did we actually find one that fits? */ if (candidate) { printf(" %s", sym_def[csym]); sym_def_given[csym] = 1; lineleft = lineleft - clen - 2; } else { printf("\n"); lineleft = LINELEFT_INIT; } } } if (lineleft < LINELEFT_INIT) { printf("\n"); } } /* gettime is the only really OS-specific routine in this whole program. It returns how long the program has been running, measured in tenths of a second. For example, it returns 14 if the program has run for 1.4 seconds. */ s32 gettime () { struct rusage usage; int err; s32 rv; err = getrusage(RUSAGE_SELF, &usage); rv = usage.ru_utime.tv_usec; rv = rv / 100000L; rv = rv + (10L * (usage.ru_utime.tv_sec)); return rv; } /* print_end generates the summary statistics that get printed at the end. */ void print_end() { s64 combos; u32 total_insert; u32 tsec, t10; printf("\n"); define_symbols(); total_insert = lhs_insert + rhs_insert; printf("\n"); printf(" --LHS-- --RHS-- -Total-\n"); t10 = gettime(); tsec = t10 / 10L; t10 = t10 % 10L; printf(" dead-ends: %10ld %10ld %10ld CPU time: %ld.%ld\n", lhs_prune, rhs_prune, lhs_prune + rhs_prune, tsec, t10); printf(" expressions: %10ld %10ld %10ld \n", lhs_gen, rhs_gen, gen_total); printf(" distinct: %10ld %10ld %10ld Memory: %ldK\n", lhs_insert, rhs_insert, total_insert, mem_total); /* tell them how much work we did. */ printf("\n"); combos = ((s64) lhs_insert) * ((s64) rhs_insert); printf(" Total equations tested: %20lld \n", combos); if (0) { char text[2]; printf("Hit return: "); scanf("%c", text); } } /* exact_exit checks to see if it's time to exit after an exact match. it should only be called after an exact match is found. If the only_exact flag is set we exit because we're done. If not, then we test to see if best_match is negative. Once best_match becomes negative there is no chance of getting any more nonexact matches, and therefore this exact match is the last useful output we'll get. */ void exact_exit() { if (only_exact) { print_end(); exit(0); } if (best_match < k_0) { print_end(); exit(0); } } /* report_match does the formatting to print out a match. You can either supply an LHS and RHS that are already in the tree, or just one tree member and a pe (which you would do if reporting an exact match and don't want to insert the item) */ void report_match(expr *lhs, expr *rhs, pe * exm, double delta) { char * le; char * re; s16 i; char fval[30]; s16 padsize; char escratch[MAX_ELEN]; char fscratch[MAX_ELEN * 4]; symbol ss; s16 err; /* ignore second and subsequent exact matches. */ if ((delta == k_0) && got_exact) { return; } if (lhs) { le = (char *) (lhs->sym); } else { le = (char *) (exm->sym); } if (rhs) { re = (char *) (rhs->sym); } else { re = (char *) (exm->sym); } if (debug_s) { double x, dx; printf("\nbased on:\n"); eval(le, &x, &dx, 1); eval(re, &x, &dx, 1); printf("I got:\n"); } if (postfix_output) { err = postfix(le, fscratch); } else { infix_preproc(le, escratch); err = infix_1(escratch, fscratch, &ss); } if (err) { fscratch[0] = 0; } for(i= strlen(fscratch); i<21; i++) { printf(" "); } printf(" %s", fscratch); if (postfix_output) { err = postfix(re, fscratch); } else { infix_preproc(re, escratch); err = infix_1(escratch, fscratch, &ss); } if (err) { fscratch[0] = 0; } printf(" = %s ", fscratch); for(i= strlen(fscratch); i<21; i++) { printf(" "); } if (delta == k_0) { printf("(exact match)"); fval[0] = 0; padsize = 11; got_exact = 1; } else if (relative_x == 0) { printf("for X "); printf("= "); sprintf(fval, "%.14lg", target + delta); padsize = 16; } else if (delta < k_0) { delta = - delta; printf("for X "); printf("= T - "); sprintf(fval, "%.6lg", delta); padsize = 12; } else { printf("for X "); printf("= T + "); sprintf(fval, "%.6lg", delta); padsize = 12; } printf("%s", fval); for(i= strlen(fval); isym, rhs->sym); } /* old algorithm */ diff = rhs->val - lhs->val; dx = lhs->der; if (dx == k_0) { fprintf(stderr, "check_match got LHS dx = 0!\n"); return; } delta = diff / dx; score = fabs(delta); if (score < best_match) { /* We have a good candidate for a new match. Now we use Newton's method to get a more accurate score */ if (debug_q) { printf("check_match [%s] ~= [%s], calling newton\n", lhs->sym, rhs->sym); } if (newton(lhs->sym, rhs->sym, &root)) { /* It didn't converge: don't accept, because it's probably a pathological case like sin(1/a) near a=0 */ return; } delta = root - target; /* printf("target %lg root %lg delta %lg\n", target, root, delta); */ score = fabs(delta); if (score == k_0) { /* Newton's method revealed that we have an exact match. We'll report it (in case it's our first exact match) but not adjust our report threshold. */ if (debug_q) { printf("check_match exact match\n"); } report_match(lhs, rhs, 0, delta); } else if (score < best_match) { if (debug_q) { printf("check_match new score %lg\n", score); } report_match(lhs, rhs, 0, delta); /* The minus 1.0e-15 is to avoid having lots of matches that beat each other only because of roundoff in the score calculation. This happens if you invoke e.g. "ries 0.434294481903252" */ best_match = (score * 0.999) - 1.0e-15; if (debug_q) { printf("check_match lowering match threshold to %lg\n", best_match); } if ((best_match < k_0) && got_exact) { print_end(); exit(0); } } } } /* bt_next traverses the list "backwards" to the next-smaller expression. */ expr * bt_prev(expr *it) { expr *old; /* if it has a left child it's relatively easy */ if (it->left) { /* go down left, then down right to dead end */ it = it->left; while (it->right) { it = it->right; } } else { /* here we have to traverse up until we get to a node from which we were the right. We also need to worry about going all the way off the top, which would mean we're done. */ old = 0; while(it && (old == it->left)) { old = it; it = it->up; } } return it; } /* bt_next traverses the list to the next-greater expression. */ expr * bt_next(expr *it) { expr *old; /* if it has a right child it's relatively easy */ if (it->right) { /* go down right, then down left to dead end */ it = it->right; while (it->left) { it = it->left; } } else { /* here we have to traverse up until we get to a node from which we were the left. We also need to worry about going all the way off the top, which would mean we're done. */ old = 0; while(it && (old == it->right)) { old = it; it = it->up; } } return it; } /* check_sides looks for a match for a given expression. It finds the nearest expression of the opposite type (LHS or RHS) to either side of the supplied expression, and for each, calls check_match. NOTE: The "complete" task would be to scan for *all* RHS's in the tree that are within best_match * dx, but if there is more than one, that would imply that two RHS's have been inserted during the previous pass, both of which are new matches. In such a case we're choosing to just report the closer match. */ void check_sides(expr *it) { expr * other; if (it->der == k_0) { /* we've got a RHS */ other = it; while(other && (other->der == k_0)) { other = bt_prev(other); } /* if we got one, check it */ if (other) { check_match(other, it); } /* do the same thing again, to the right this time. */ other = it; while(other && (other->der == k_0)) { other = bt_next(other); } if (other) { check_match(other, it); } } else { /* we've got an LHS */ other = it; while(other && (other->der != k_0)) { other = bt_prev(other); } /* if we got one, check it */ if (other) { check_match(it, other); } /* do the same thing again, to the right this time. */ other = it; while(other && (other->der != k_0)) { other = bt_next(other); } if (other) { check_match(it, other); } } } /* bt_insert adds an expression to the tree. The dx parameter is used to determine if it's an RHS or an LHS expression. If it's an LHS expression, match checking is done right away. If it's an RHS expression, matching must be deferred to a separate linear pass. */ s16 bt_insert(double x, double dx, pe *ex, s16 * res1) { expr * it; s16 going, insert, fillin, report, i; fillin = 0; going = 0; insert = 0; *res1 = 0; it = lhs_root; /* If there's a tree to descend, descend it. */ if (it) { going = 1; insert = 1; } else { /* insert and copy first node */ lhs_root = (expr *) my_alloc(sizeof(expr), k_0); if (lhs_root == 0) { return 1; } *res1 = 1; it = lhs_root; it->up = 0; going = 0; /* no descending to do */ insert = 0; /* and we just inserted */ fillin = 1; /* but we need to fill it in */ } while(going) { if (it->val == x) { /* Exact match. If both are RHS's, or if both are LHS's, we discard the new one, because it is a higher-complexity expression with the same value and is therefore of no use. If one is an LHS and the other is an RHS we have found an exact solution! */ going = 0; insert = 0; if (dx == k_0) { /* new item is RHS. */ if (it->der == k_0) { /* Tree item is RHS, too. We just discard in this case. */ } else { /* Tree item is LHS... */ if (got_exact) { /* We only report one per run. */ } else { /* our first exact match! */ report_match(it, 0, ex, k_0); } } } else { /* new item is LHS. */ if (it->der == k_0) { /* tree item is RHS... */ if (got_exact) { /* We only report one per run. */ } else { /* our first exact match! */ report_match(0, it, ex, k_0); } } else { /* Tree item is LHS too, discard. */ } } } else { /* no match yet: descend. */ if (x < it->val) { /* go to left child */ if (it->left) { it = it->left; } else { /* no left: that means we insert here. */ insert = -1; going = 0; } } else { /* go to right child */ if (it->right) { it = it->right; } else { /* no right: that means we insert here. */ insert = 1; going = 0; } } } } if (insert) { expr *n; n = (expr *) my_alloc(sizeof(expr), k_0); if (n == 0) { return 1; } *res1 = 1; insert_count++; if (insert > 0) { it->right = n; n->up = it; } else { it->left = n; n->up = it; } it = n; fillin = 1; } if (fillin) { /* copy the expression into the new node */ it->val = x; it->der = dx; it->left = 0; it->right = 0; it->elen = ex->elen; for(i=0; i<=it->elen; i++) { it->sym[i] = ex->sym[i]; } } /* Last thing to do is to check for a new match. */ if (insert) { check_sides(it); } return 0; } /* ge_2 is the core code for generating expressions from a form. base -----bpe----- comp elen syms ab 0 0 - ab 10 1 1 ab 17 2 1l ab 17 2 1n ab 13 1 2 ab 20 2 2r ab 20 2 2q ab 20 2 2l */ u32 ge_2(form *base, pe *bpe, s16 e_minw, s16 e_maxw, metastack *ms) { symbol class, sym; symbol *syms; s16 ns; s16 minw, maxw, comp; s16 rminw, rmaxw; s16 ip; u32 n; s16 recurse; s16 atts; s16 muc; /* metastack undo count */ double x, dx; s16 insert; s16 dbg_side; dbg_side = on_rhs ? DBG_RHS : DBG_LHS; n = 0; muc = 0; ip = bpe->elen; /* execute the opcode that just got added (if any) */ if (ip > 0) { /* if there are errors like [0/] or [1-q], we'll return right away, thereby pruning */ if (exec(ms, bpe->sym[ip-1], &muc, ! on_rhs)) { if (debug_A & dbg_side) { bpe->sym[ip] = 0; printf("prune partial exec error [%s)\n", bpe->sym); } while(muc) { ms_undo(ms); muc--; } return 0; } /* We can always prune zero subexpressions. Any solution that contains a zero subexpression can be reduced to a simpler solution that does not contain a zero subexpression. */ x = ms_peek(ms, &dx); if (x == k_0) { if (debug_B & dbg_side) { bpe->sym[ip] = 0; printf("prune partial zero [%s) = %14.10lg\n", bpe->sym, x); } while(muc) { ms_undo(ms); muc--; } prune_count++; return 0; } /* Prune non-integer subexpressions if the -i option was given. */ if (int_subexpr) { if (x != floor(x)) { if (debug_C & dbg_side) { bpe->sym[ip] = 0; printf("prune partial noninteger [%s) = %15.10lg\n", bpe->sym, x); } while(muc) { ms_undo(ms); muc--; } prune_count++; return 0; } } /* %%% here we should do a bt_find and check that the trailing N opcodes of our PE are equal to that found (if any). If not, our PE is a redundant and possibly more complex way to generate a value, and can be pruned. This should be a significant optimization. It covers the same ground as the AM_ rules, but catches many cases they miss (like multiple equivalent sums: 37+, 28+, 19+, 136++, 16n3-+, and on and on... */ /* %%% I don't know if C's '>' operator works right with infinities. * %%% this could be done more efficiently inside exec() */ /* Any subexpression that overflows either in value or in the differential causes pruning. */ if ((x >= p_ovr) || (x <= n_ovr)) { if (debug_D & dbg_side) { bpe->sym[ip] = 0; printf("prune partial overflow [%s) = %15.10lg\n", bpe->sym, x); } while(muc) { ms_undo(ms); muc--; } prune_count++; return 0; } /* ignore nonzero LHS expressions with very small differential (singularity problem with roundoff error) */ if (! on_rhs) { if ((fabs(dx) > k_0) && (fabs(dx) < k_prune_deriv)) { (*bpe).sym[ip] = 0; if (debug_E & dbg_side) { printf("prune partial dx~=0 [%s) = %15.10lg, d/dx = %15.10lg\n", (char *) (&((*bpe).sym[0])), x, dx); } while(muc) { ms_undo(ms); muc--; } prune_count++; return 0; } } } /* insert or recurse? */ if (ip >= base->flen) { /* it's now as long as it can get. */ insert = 1; /* ignore LHS expressions with differential of zero (singularity problem) */ if (! on_rhs) { if (fabs(dx) < k_prune_deriv) { (*bpe).sym[ip] = 0; if (debug_F & dbg_side) { printf("prune full dx~=0 [%s] = %14.10lg, d/dx = %14.10lg\n", (char *) (&((*bpe).sym[0])), x, dx); } insert = 0; prune_count++; } } if (insert) { s16 res1; x = ms_peek(ms, &dx); (*bpe).sym[ip] = 0; if (debug_G & dbg_side) { printf("ge_2 insert [%s] = %14.10lg", (char *) (&((*bpe).sym[0])), x); if (! on_rhs) { printf(", d/dx = %14.10lg", dx); } printf(", comp {%d}\n", bpe->comp); } if (bt_insert(x, dx, bpe, &res1)) { fprintf(stderr, "Out of memory\n"); exit(1); } if (res1 == 0) { if (debug_G & dbg_side) { printf(" rejected (duplicate value)\n"); } } } /* undo stack manipulation and return */ while(muc) { ms_undo(ms); muc--; } return insert; } /* we reach here if it's incomplete, and we've just executed the last symbol, and now it's time to add new symbols. */ /* get class and comp */ class = base->sym[ip]; comp = bpe->comp; /* set up our variables for generation */ syms = 0; switch(class) { case 'a': syms = asym; ns = n_asym; minw = a_minw; maxw = a_maxw; break; case 'b': syms = bsym; ns = n_bsym; minw = b_minw; maxw = b_maxw; break; case 'c': syms = csym; ns = n_csym; minw = c_minw; maxw = c_maxw; break; } /* evaluate remaining available minw an maxw */ rminw = bpe->rminw[ip+1]; rmaxw = bpe->rmaxw[ip+1]; /* calculate the attributes for the symbols we have so far */ if (ip > 0) { if (debug_H & dbg_side) { bpe->sym[ip] = 0; printf("attributes for [%s): ", bpe->sym); } } atts = on_rhs; /* no debug print for on_rhs because it's pretty obvious */ if (ip > 1) { /* two-symbol patterns */ /* currently, all two-symbol patterns are for (aa) forms */ if ((base->sym[ip-1] == 'a') && (base->sym[ip-2] == 'a')) { if (debug_H & dbg_side) { printf(" (%c%c)", bpe->sym[ip-2], bpe->sym[ip-1]); } /* KK rule is true if the same constant occurs twice in a row. */ if (bpe->sym[ip-1] == bpe->sym[ip-2]) { atts |= AM_KK; if (debug_H & dbg_side) { printf(" KK"); } } /* 55 rule is true if both constants are integers less than or equal * to 5. */ if ((bpe->sym[ip-1] < '6') && (bpe->sym[ip-2] < '6')) { atts |= AM_55; if (debug_H & dbg_side) { printf(" 55"); } } /* 1K rule is true if first constant is 1 */ if (bpe->sym[ip-2] == '1') { atts |= AM_1K; if (debug_H & dbg_side) { printf(" 1K"); } } /* jK rule: true if smaller constant is followed by larger constant. * For these purposes, the noninteger constants are considered * larger than all the integers. (This can be changed if necessary * by adding an indirection array defining symbol sequence, or by * using the symbol weights as a sequencing measure -- but for now, * it works fine this way.) */ if (bpe->sym[ip-1] > bpe->sym[ip-2]) { atts |= AM_jK; if (debug_H & dbg_side) { printf(" jK"); } } } } if (ip > 0) { /* one-symbol patterns */ if (debug_H & dbg_side) { printf(" (%c)", bpe->sym[ip-1]); } if (bpe->sym[ip-1] == '1') { atts |= AM_1; if (debug_H & dbg_side) { printf(" 1"); } } else if (bpe->sym[ip-1] == '2') { atts |= AM_2; if (debug_H & dbg_side) { printf(" 2"); } } else if (bpe->sym[ip-1] == 'n') { atts |= AM_n; if (debug_H & dbg_side) { printf(" n"); } } else if (bpe->sym[ip-1] == 'r') { atts |= AM_r; if (debug_H & dbg_side) { printf(" r"); } } else if (bpe->sym[ip-1] == 'l') { atts |= AM_l; if (debug_H & dbg_side) { printf(" l"); } } else if (bpe->sym[ip-1] == 'E') { atts |= AM_E; if (debug_H & dbg_side) { printf(" E"); } } else if (bpe->sym[ip-1] == 'p') { atts |= AM_p; if (debug_H & dbg_side) { printf(" p"); } } else if ((bpe->sym[ip-1] == 's') || (bpe->sym[ip-1] == 'q')) { atts |= AM_sq; if (debug_H & dbg_side) { printf(" sq"); } } if (debug_H & dbg_side) { printf("\n"); } } bpe->elen = ip + 1; if (debug_I & dbg_side) { printf("%d symbols to try.\n", ns); } while (ns > 0) { /* get next symbol and write it */ sym = *syms++; if (debug_I & dbg_side) { bpe->sym[ip] = 0; printf("trying [%s) + '%c':\n", bpe->sym, sym); } bpe->sym[ip] = sym; sym_count[sym] += 1; /* update complexity */ bpe->comp = comp + sym_weight[sym]; /* begin pruning */ recurse = 1; /* is attainable minimum above the global max? */ if (bpe->comp + rminw > e_maxw) { if (debug_J & dbg_side) { printf("prune complexity {%d} + rminw[%d] {%d} > e_maxw {%d}\n", bpe->comp, ip+1, rminw, e_maxw); } recurse = 0; } else if (bpe->comp + rmaxw < e_minw) { if (debug_J & dbg_side) { printf("prune complexity {%d} + rmaxw[%d] {%d} < e_minw {%d}\n", bpe->comp, ip+1, rmaxw, e_minw); } recurse = 0; } /* does this symbol generate a stupid combination? */ else if ((atts & sym_mask[sym]) != sym_mval[sym]) { if (debug_K & dbg_side) { printf("prune on symbol rules: atts %x & mask %x != sym_mval %x\n", atts, sym_mask[sym], sym_mval[sym]); } recurse = 0; } /* LHS expressions always start with 'x' */ else if ((on_rhs == 0) && (ip == 0) && (sym != 'x')) { if (debug_K & dbg_side) { printf("prune LHS must start with 'x'\n"); } recurse = 0; } /* Check symbol count for this symbol */ else if (sym_count[sym] > sym_allowed[sym]) { if (debug_L & dbg_side) { printf("prune symcount[%c]\n", sym); } recurse = 0; } if (recurse) { /* okay, it's all set to recurse */ n += ge_2(base, bpe, e_minw, e_maxw, ms); } else { prune_count++; } /* undo the writing of this symbol. */ sym_count[sym] -= 1; ns--; } /* undo the damage */ bpe->elen = ip; bpe->comp = comp; /* we're done with the IR's of this opcode */ while(muc) { ms_undo(ms); muc--; } return n; } u32 ge_1(form *base, s16 e_minw, s16 e_maxw) { pe bpe; metastack ms; s16 blen; u32 n; s16 i, rminw, rmaxw; symbol sym; blen = base->flen; /* for now, just report it */ (*base).sym[blen] = 0; if (debug_t) { printf("ge_1 on form {%s}\n", (char *) (&((*base).sym[0]))); printf("%3d >= %3d <= {%s} <= %3d >= %3d\n", e_maxw, base->min_weight, (char *) (&((*base).sym[0])), base->max_weight, e_minw); } n = 0; /* set up the pe struct */ bpe.comp = 0; bpe.elen = 0; /* calculate and fill in the rmw fields */ rminw = rmaxw = 0; i=blen; bpe.rminw[i] = rminw; bpe.rmaxw[i] = rmaxw; while(i>0) { i--; sym = (*base).sym[i]; switch(sym) { case 'a': rminw += a_minw; rmaxw += a_maxw; break; case 'b': rminw += b_minw; rmaxw += b_maxw; break; case 'c': rminw += c_minw; rmaxw += c_maxw; break; } if (debug_u) { printf("rminw[%d] = %d, rmaxw[%d] = %d.\n", i, rminw, i,rmaxw); } bpe.rminw[i] = rminw; bpe.rmaxw[i] = rmaxw; } /* set up the metastack */ ms_init(&ms); /* generate! */ n = ge_2(base, &bpe, e_minw, e_maxw, &ms); if(debug_v) { printf("form %s generated %ld expressions.\n", base->sym, n); } return n; } /* generate forms by simple recursive algorithm. ------base------ ------next------ flen sym-- stack flen sym-- stack Comments 0 - 0 Initial call 0 - 0 1 a 1 Setting up call to myself 1 a 1 Entering recursive invocation 1 a 1 2 aa 2 Setting up call to myself . . . (. . .) 1 a 1 Entering recursive invocation 1 a 1 2 ab 1 Setting up another call to myself */ u32 gf_1(form *base, s16 e_minw, s16 e_maxw) { s16 blen; form next; s16 i, recurse_forms, gen_expr; symbol s, slim; u32 n; n = 0; /* copy base form */ blen = base->flen; for(i=0; istack + ('b' - s); /* 1 for a, 0 for b, -1 for c */ if (s == 'a') { next.min_weight = base->min_weight + a_minw; next.max_weight = base->max_weight + a_maxw; } else if (s == 'b') { next.min_weight = base->min_weight + b_minw; next.max_weight = base->max_weight + b_maxw; } else { next.min_weight = base->min_weight + c_minw; next.max_weight = base->max_weight + c_maxw; } /* check form symtax */ recurse_forms = 1; if (next.stack < 1) { /* stack can't be zero or underflow */ if (debug_w) { printf("gf_1 prune [%s) stack would underflow\n", next.sym); } recurse_forms = 0; } else if (next.min_weight + s_minw > e_maxw) { /* adding this symbol would make a form that can't possibly generate any expressions within the given complexity range. */ if (debug_w) { printf("gf_1 prune [%s) complexity\n", next.sym); } recurse_forms = 0; } else if (blen >= MAX_ELEN) { /* limits expression length to our physical allocation size. If this actually triggers at runtime, it implies that the weights are too spread out or that MAX_ELEN is just too darn small. */ if (debug_w) { printf("gf_1 prune [%s) length\n", next.sym); } recurse_forms = 0; } else if ((next.flen + next.stack) >= MAX_ELEN + 1) { /* in this case it would have no way of getting the stack down to one item before exceeding MAX_ELEN */ if (debug_w) { printf("gf_1 prune [%s) stack too high\n", next.sym); } recurse_forms = 0; } /* check viability for expressions */ gen_expr = 0; if (next.stack == 1) { gen_expr = 1; if (next.min_weight > e_maxw) { if (debug_w) { printf("gf_1 [%s] min weight too big for expressions\n", next.sym); } gen_expr = 0; } else if (next.max_weight < e_minw) { if (debug_w) { printf("gf_1 [%s] max weight too small for expressions\n", next.sym); } gen_expr = 0; } } if (gen_expr) { if (debug_w) { printf("gf_1 generating expressions on form [%s]\n", next.sym); } g_ne += ge_1(&next, e_minw, e_maxw); } /* recurse, if appropriate */ if (recurse_forms) { n += gf_1(&next, e_minw, e_maxw); } /* count the leaf nodes... leaf. huh-huh. heh huh heh-heh. */ if (gen_expr) { n++; } } return(n); } /* generate forms on-the-fly, given a minimum and maxmum complexity score. * It will generate all expressions with valid forms that lie within the * complexity limits. */ u32 gen_forms(s16 e_minw, s16 e_maxw) { form base; u32 n; n = 0; base.flen = 0; base.stack = 0; base.min_weight = 0; base.max_weight = 0; n = gf_1(&base, e_minw, e_maxw); return(n); } /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Routines for initializing the data structures and variables used for the search. */ void add_symbol(symbol sym, symbol type, s16 weight, char * forth, char * definition) { if(definition) { sym_def[sym] = definition; if(strlen(definition) > LINELEFT_INIT) { fprintf(stderr, "Symbol definition for '%c' too long:\n%s\n", sym, definition); } } if(forth) { sym_forth[sym] = forth; } if (IS_PHANTOM(sym)) { sym_class[sym] = type; return; } weight += 10; sym_class[sym] = type; sym_weight[sym] = weight; sym_mask[sym] = 0; sym_mval[sym] = 0; if (sym_allowed[sym] == 0) { /* this is to leave the weights unchanged */ return; } if (type == 'a') { asym[n_asym++] = sym; if (n_asym == 1) { a_minw = weight; a_maxw = weight; } else { if (weight < a_minw) { a_minw = weight; } else if (weight > a_maxw) { a_maxw = weight; } } } else if (type == 'b') { bsym[n_bsym++] = sym; if (n_bsym == 1) { b_minw = weight; b_maxw = weight; } else { if (weight < b_minw) { b_minw = weight; } else if (weight > b_maxw) { b_maxw = weight; } } } else if (type == 'c') { csym[n_csym++] = sym; if (n_csym == 1) { c_minw = weight; c_maxw = weight; } else { if (weight < c_minw) { c_minw = weight; } else if (weight > c_maxw) { c_maxw = weight; } } } } void add_rule(symbol * symset, symbol sym, s16 mask, s16 mval) { symbol sreq; s16 allowed; allowed = 1; while(sreq = *symset++) { if (sym_allowed[sreq] < MAX_ELEN) { allowed = 0; } } if (allowed) { if (debug_x) { printf("Using rule %8s %c %4x %4x\n", symset, sym, mask, mval); } sym_mask[sym] |= mask; if (mval) { sym_mval[sym] |= mask; } } } /* init1() sets defaults (anything that can be overridden or changed by command-line arguments) */ void init1() { s16 i; /* There are three command-line options for specifying the symbolset: -S, -O, and -N. -S means "use only these symbols"; -O means "use at most one of these per expression" and -N means "don't use these symbols at all". These options are stored into the three arrays sym_yes, sym_once, and sym_no respectively while scanning the command-line. However, during actual expression generation only the array sym_allowed is used. sym_allowed speficies how many of each symbol is allowed in each expression. In sym_allowed[], a value of 0 is used for -N symbols, 1 for -O symbols, and MAX_ELEN for -S symbols. If there is no -S option, the default for any symbols not mentioned in -O or -N is that they are enabled in the normal way. Also, when there are conflicts -S takes precedence over -O and -O takes precedence over -N. This is implemented in 5 steps as follows: - start with all entries of sym_allowed = MAX_ELEN - if there was a -S argument, clear all entries to 0 - set all entries specified by -N to 0 - set all entries specified by -O to 1 - set all entries specified by -S to MAX_ELEN */ for(i=0; i [Ks], and represents two sequences of operators * that have the same value or have the same effect in an expression. The * form to the left of the '=>' is the form being eliminated by the rule, * and the form to the right is shown to demonstrate why the form on the * left is eliminated. When the transformation has a shorter form on the * right than on the left, as with [1r] => [1], the reason for the rule * is obvious. In other cases it is important to eliminate only the form * that has a higher complexity score. If the scores are equal, such as * [rn] => [nr], we pick one that interacts favorably with other rules. * The form on the right is said to be "forced", meaning that any expressions * involving this type of calculation are "forced" to do it in the right-hand * form. * You must not "force" a form that is also eliminated by another rule! * For example, the following two rules are OK by themselves, but together * cause a problem: * [sr] => [rs] * [rs] => [sr] * However, you can also "force" a form that is also "forced" into another * form by another rule. The following two rules exemplify this: * [rn] => [nr] * [rq] => [qr] * These two rules, combined, will cause the combinations of 'r', 'n', and * 'q' to be reduced from six {[rnq], [rqn], [nrq], [nqr], [qrn], [qnr]} to * two {[nqr], [qnr]} * Each rule has a symbolset which must be present in order for that * rule to be used. * symset sym mask mval */ add_rule("", 'x', AM_RHS, 0); add_rule("nr", 'n', AM_r, 0); /* [rn] => [nr] */ add_rule("", 'n', AM_n, 0); /* [nn] => [] */ add_rule("1", 'r', AM_1, 0); /* [1r] => [1] */ add_rule("", 'r', AM_r, 0); /* [rr] => [] */ add_rule("1", 's', AM_1, 0); /* [1s] => [1] */ add_rule("4", 's', AM_2, 0); /* [2s] => [4] */ add_rule("s", 's', AM_n, 0); /* [ns] => [s] */ add_rule("rs", 's', AM_r, 0); /* [rs] => [sr] */ add_rule("4^", 's', AM_sq, 0); /* [qs] => []; [ss] => [4^] */ add_rule("1", 'q', AM_1, 0); /* [1q] => [1] */ add_rule("rq", 'q', AM_r, 0); /* [rq] => [qr] */ add_rule("4v", 'q', AM_sq, 0); /* [sq] => []; [qq] => [4v] */ add_rule("", 'l', AM_1, 0); /* [1l] => 0 */ add_rule("ln", 'l', AM_r, 0); /* [rl] => [ln] */ add_rule("", 'l', AM_E, 0); /* [El] => [] */ add_rule("", 'E', AM_l, 0); /* [lE] => [] */ add_rule("Er", 'E', AM_n, 0); /* [nE] => [Er] */ add_rule("Sn", 'S', AM_n, 0); /* [nS] => [Sn] */ add_rule("C", 'C', AM_n, 0); /* [nC] => [C] */ /* The operators are not included in their own require-symset string unless they are also used in the target of the forceed transformation. This is important with the -O option. For example, if they specify -O+, the '+' rules will have the effect of "saving" the '+' for "a more important", i.e. irreducible, use. Of course, it doesn't actually prevent solutions from being found, it just makes them get found sooner. */ add_rule("2*", '+', AM_KK, 0); /* [KK+] => [K2*] */ add_rule("*256789", '+', AM_55, 0); /* [JK+] => [L] or [52*] */ add_rule("+", '+', AM_jK, 0); /* [jK+] => [Kj+] */ add_rule("-", '+', AM_n, 0); /* [n+] => [-] */ add_rule("", '-', AM_KK, 0); /* [KK-] => 0 */ add_rule("1234n", '-', AM_55, 0); /* [JK-] => [L] or [Ln] */ add_rule("+", '-', AM_n, 0); /* [n-] => [+] */ add_rule("s", '*', AM_KK, 0); /* [KK*] => [Ks] */ add_rule("*", '*', AM_jK, 0); /* [jK*] => [Kj*] */ add_rule("*", '*', AM_1, 0); /* [1*] => [] */ add_rule("*n", '*', AM_n, 0); /* [n*] => [*n] */ add_rule("/", '*', AM_r, 0); /* [r*] => [/] */ add_rule("1", '/', AM_KK, 0); /* [KK/] => [1] */ add_rule("r", '/', AM_1K, 0); /* [1K/] => [Kr] */ add_rule("", '/', AM_1, 0); /* [1/] => [] */ add_rule("/n", '/', AM_n, 0); /* [n/] => [/n] */ add_rule("*", '/', AM_r, 0); /* [r/] => [*] */ add_rule("", '^', AM_1, 0); /* [1^] => [] */ add_rule("s", '^', AM_2, 0); /* [2^] => [s] */ add_rule("^r", '^', AM_n, 0); /* [n^] => [^r] */ add_rule("1", '^', AM_1K, 0); /* [1K^] => [1] */ add_rule("v", '^', AM_r, 0); /* [r^] => [v] */ add_rule("", 'v', AM_1, 0); /* [1v] => [] */ add_rule("q", 'v', AM_2, 0); /* [2v] => [q] */ add_rule("vr", 'v', AM_n, 0); /* [nv] => [vr] */ add_rule("1", 'v', AM_1K, 0); /* [1Kv] => [1] */ add_rule("^", 'v', AM_r, 0); /* [rv] => [^] */ add_rule("1", 'L', AM_KK, 0); /* [KKL] => [1] */ add_rule("", 'L', AM_1, 0); /* [1L] => undefined */ add_rule("Ln", 'L', AM_r, 0); /* [rL] => [Ln] */ add_rule("", 'L', AM_1K, 0); /* [1KL] => 0 */ /* Added on 20070511 */ add_rule("", 'S', AM_p, 0); /* [pS] => 0 */ s_minw = a_minw; if (b_minw < s_minw) { s_minw = b_minw; } if (c_minw < s_minw) { s_minw = c_minw; } /* This init is for ge_2. */ for(i=0; i 1) { av = argv; args = nargs; av++; /* skip our program name/path */ args--; while(args) { arg = *av; av++; args--; if ((arg[0] == '-') && (arg[1] == 'l')) { /* they gave a level */ arg += 2; /* skip the "-l" */ sscanf(arg, "%hd", &levadj); tlevel += levadj; } else if ((arg[0] == '-') && (arg[1] == 'N')) { /* Not these symbols */ symbol sym; arg += 2; /* skip the "-N" */ while(*arg) { sym = (symbol) (*arg); arg++; sym_no[sym] = 1; } } else if ((arg[0] == '-') && (arg[1] == 'S')) { /* Only these symbols */ symbol sym; s16 i; S_option = 1; arg += 2; /* skip the "-S" */ while(*arg) { sym = (symbol) (*arg); arg++; sym_yes[sym] = 1; } } else if ((arg[0] == '-') && (arg[1] == 'O')) { /* Once-only symbols */ symbol sym; s16 i; arg += 2; /* skip the "-O" */ while(*arg) { sym = (symbol) (*arg); arg++; sym_once[sym] = 1; } } else if ((arg[0] == '-') && (arg[1] == 'D')) { /* Debugging options */ char d; s16 i; arg += 2; /* skip the "-D" */ while(d = *arg) { switch(d) { case 'a': debug_A |= DBG_RHS; break; case 'b': debug_B |= DBG_RHS; break; case 'c': debug_C |= DBG_RHS; break; case 'd': debug_D |= DBG_RHS; break; case 'e': debug_E |= DBG_RHS; break; case 'f': debug_F |= DBG_RHS; break; case 'g': debug_G |= DBG_RHS; break; case 'h': debug_H |= DBG_RHS; break; case 'i': debug_I |= DBG_RHS; break; case 'j': debug_J |= DBG_RHS; break; case 'k': debug_K |= DBG_RHS; break; case 'l': debug_L |= DBG_RHS; break; case 'm': debug_m = 1; break; case 'n': debug_n = 1; break; case 'o': debug_o = 1; break; case 'p'