                            WORKSHEET FILE FORMAT 
                                  FROM LOTUS 

                      APPENDIX B - THE FORMULA COMPILER

               Copyright(c) 1984, Lotus Development Corporation 
                               161 First Street 
                        Cambridge, Massachusetts 02142 
                                (617) 492-7171 
                      Electronic Edition, December, 1984 
                             All Rights Reserved 

                      APPENDIX B:  The Formula Compiler 

 This appendix describes the internal workings of the formula compiler.  The
 compiler transforms an ASCII string of characters representing a formula to
 its Reverse Polish code.  The basic algorithm utilizes an SR parser (SR =
 shift and reduce).  The aim of the parser is to apply a set of reduction
 rules, which embody the syntax of formulas, to an input string.  Formula
 code is compiled to a temporary buffer. 

 Lexical Analysis 

 A lexical analyzer breaks up the input string into lexical units called
 tokens.  A token is a substring of the original input string that
 represents an operand, operator, or special symbol (such as comma,
 parentheses, etc.).  In addition, the lexical analyzer supplies two special
 tokens, "beginning of formula" (boform) and "end of formula" (eoform), to
 facilitate the compilation process.  The lexical analyzer identifies and
 processes literals (both number and string), cell and range references,
 operators, and function calls.  It assigns a unique code to each distinct
 operator, function, or type of operand. 

 A function with no arguments is treated like a number. 
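
 The lexical pass described above can be sketched in Python.  This is a
 minimal illustration, not the Lotus implementation: the token kind names,
 the regular expressions, and the @ prefix for function names are
 assumptions made for the example.  The numeric token codes used by the
 real compiler are not given in this appendix, so kind names stand in for
 them.

```python
import re

# Hypothetical token kinds; the real compiler assigns numeric codes instead.
TOKEN_RE = re.compile(r"""
    (?P<number>\d+\.?\d*|\.\d+)                          # numeric literal
  | (?P<string>"[^"]*")                                  # string literal
  | (?P<range>[A-Za-z]{1,2}\d+\.\.[A-Za-z]{1,2}\d+)      # e.g. A2..A5
  | (?P<cell>[A-Za-z]{1,2}\d+)                           # e.g. A3
  | (?P<function>@[A-Za-z]+)                             # assumed @NAME form
  | (?P<operator>\#(?:and|or|not)\#|<>|<=|>=|[-+*/^=<>])
  | (?P<special>[(),])
""", re.VERBOSE | re.IGNORECASE)


def tokenize(formula):
    """Split a formula into (kind, text) pairs framed by boform/eoform."""
    tokens = [("boform", "")]
    pos = 0
    while pos < len(formula):
        if formula[pos].isspace():
            pos += 1
            continue
        match = TOKEN_RE.match(formula, pos)
        if match is None:
            raise ValueError(f"unexpected character {formula[pos]!r} at {pos}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    tokens.append(("eoform", ""))
    return tokens
```

 Calling tokenize("3+5*6") under these assumptions yields seven tokens:
 boform, three numbers, two operators, and eoform.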

 Syntax Analysis 

 The syntactical analysis of a formula is accomplished by processing a list
 of tokens in left-to-right order.  A stack called the syntax stack is also
 used during the syntactical scan.  The basic algorithm is as follows: 

 Repeat the following steps: 

 1) Get the next token 

 2) If the token is a literal or cell reference: 
       a) Compile code to push the literal or cell reference 
       b) Push the number code on the syntax stack 

 3) If the token is a range reference: 
       a) Compile code to push the range reference 
       b) Push the range code on the syntax stack 

 4) Otherwise push the token code for the token on the syntax stack. 

 For each syntax rule, if the pattern on the top of the syntax stack matches
 the rule pattern, take the action associated with the rule and start
 scanning from the beginning for any additional rules which may apply. 

 When a token code is pushed on the syntax stack, an additional word of
 zeros is also pushed on the stack.  This is used when compiling function
 calls to hold the function's argument count. 





 Rule Matching 

 A relatively small number of rules are used to process formulas of arbitrary
 complexity.  If a rule matches the top of the syntax stack, then the
 compiler takes a specific action and rule scanning starts again with the
 first rule.  Each rule matches certain patterns on the syntax stack.  A
 typical rule might be: if the top of the stack is the token for right
 parenthesis, and the next-to-top is a number, and the second from the top
 is a left parenthesis, then pop the top three items from the syntax stack
 and push the number on the syntax stack. 

 This rule can be more succinctly represented as: 

                        Stack 

          Before                      After                 Action 
          ) 
          number 
          (                           number                none 
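
 The pattern test in this example rule can be sketched as follows.  The
 sketch assumes the syntax stack is a Python list with its top at the end,
 and that token codes are plain strings; both are choices made for the
 illustration, and the helper names match_top and reduce_parens are
 hypothetical.

```python
def match_top(stack, pattern):
    """Return True if the top of `stack` matches `pattern`, listed top first."""
    if len(stack) < len(pattern):
        return False
    top_first = stack[::-1][:len(pattern)]
    return top_first == list(pattern)


def reduce_parens(stack):
    """Apply the example rule: ) number ( on top of the stack becomes number."""
    if match_top(stack, [")", "number", "("]):
        del stack[-3:]           # pop the top three items
        stack.append("number")   # push the number back
        return True
    return False
```

 For instance, a stack holding boform, (, number, ) (bottom to top) is
 reduced to boform, number; a stack that does not end in the pattern is
 left unchanged.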



 The Rules 


 The following are the syntax rules used to process formulas.  Note that the
 order of the rules is important.  The rules for compilation of operators
 use additional tables which assign a precedence number and opcode to each
 legal unary and binary operator.  Thus, for example, there is a single
 token code for minus sign (-), but there are two opcodes, one for unary
 minus and one for binary minus.  In addition, these two operators, while
 lexically identical, also have different precedence.  In general, operators
 of higher precedence will be performed before operators of lower precedence;
 operators of equal precedence are performed left-to-right.  All special
 operators (boform, eoform, parentheses, comma, etc.) are implicitly assigned
 a precedence of zero. 

 Rule 1  Termination test 

                  Stack 

         Before           After       Action 
         eoform                       Output a return code to compile buffer 
         number                       Return, indicating successful compile 
         boform

 Rule 2  Function argument processing 

                 Stack 
         Before          After       Action 
         ,                           Error if range argument illegal for 
         number or range             function. 
         (               (           Increment argument count on stack 
         function        function 

 Rule 3  Process final function argument 

                 Stack 
         Before         After        Action 
         )                           Error if range argument illegal for 
         number or range             function. 
         (                           Increment argument count on stack 
         function       number       Compile function opcode 
                                     If list function, compile argument 
                                     count; otherwise error if wrong 
                                     argument count. 




 Rule 4  Parenthesis removal 

                Stack 
        Before         After        Action 
        )                           Compile parenthesis opcode 
        number 
        (              number 
        operator       operator 



 Rule 5  Binary operators 

               Stack 
        Before         After        Action 
        op2                         If binary op1<binary op2, rule does 
        number                      not match.  Otherwise, compile opcode 
        op1            op2          for operator op1. 
        number         number 


 Rule 6  Unary operators 

               Stack 
        Before      After           Action 
        op2                         If unary op1<binary op2, rule does 
        number      op2             not match.  Otherwise, compile opcode 
        op1         number          for operator op1. 


 Rule 7  Error detection 

              Stack 
       Before       After          Action 
       eoform                      Return indicating unsuccessful compile 
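
 Rules 2 and 3 above can be sketched as a single reduction step.  The
 sketch assumes the argument separator token is a comma, keeps the
 argument count in the extra word carried by the function's stack entry,
 and writes the compiled call as a readable string.  The dictionary layout
 and the names push, codes and reduce_function are hypothetical, and the
 range-legality and argument-count error checks are omitted.

```python
def push(stack, code, name=None):
    # Every entry carries an extra word of zeros; for a function entry it
    # accumulates the argument count.
    stack.append({"code": code, "name": name, "argc": 0})


def codes(stack, n):
    """Return the codes of the top n entries, top first."""
    return [entry["code"] for entry in stack[-n:]][::-1]


def reduce_function(stack, out):
    """Try rule 2 (comma) and rule 3 (right parenthesis); True if one fired."""
    if len(stack) < 4:
        return False
    top, arg, paren, func = codes(stack, 4)
    if arg not in ("number", "range") or paren != "(" or func != "function":
        return False
    if top == ",":                      # rule 2: one more argument seen
        del stack[-2:]                  # drop the comma and the argument
        stack[-2]["argc"] += 1          # function entry sits below "("
        return True
    if top == ")":                      # rule 3: final argument
        entry = stack[-4]
        entry["argc"] += 1
        out.append(f"{entry['name']} argc={entry['argc']}")
        del stack[-4:]                  # function ( argument )
        push(stack, "number")           # the call now acts as a number
        return True
    return False
```

 Feeding the entries for a two-argument call through this step fires rule
 2 once and rule 3 once, leaving a single number on the stack.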





 Table 9   Operator Precedence Table 

 Operator              Unary Precedence       Binary Precedence 
 +                             6                      4 
 -                             6                      4 
 *                            na                      5 
 /                            na                      5 
 ^                            na                      7 
 =                            na                      3 
 <>                           na                      3 
 <=                           na                      3 
 >=                           na                      3 
 <                            na                      3 
 >                            na                      3 
 #and#                        na                      1 
 #or#                         na                      1 
 #not#                        2                      na 
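
 A lookup over part of this table might look like the sketch below.  Only
 a subset of the operators is included, None stands for "na", and the
 function name precedence is hypothetical.  The zero returned for special
 tokens follows the statement that they are implicitly assigned a
 precedence of zero.

```python
# Subset of the table: token -> (unary precedence, binary precedence).
PRECEDENCE = {
    "+": (6, 4),
    "-": (6, 4),
    "*": (None, 5),
    "=": (None, 3),
    "#and#": (None, 1),
    "#not#": (2, None),
}

# Special tokens are implicitly assigned a precedence of zero.
SPECIAL = {"boform", "eoform", "(", ")", ","}


def precedence(token, unary=False):
    """Look up the unary or binary precedence of an operator token."""
    if token in SPECIAL:
        return 0
    unary_prec, binary_prec = PRECEDENCE[token]
    prec = unary_prec if unary else binary_prec
    if prec is None:
        kind = "unary" if unary else "binary"
        raise ValueError(f"{token} has no {kind} form")
    return prec
```

 The same minus token therefore yields 6 when used as a unary operator and
 4 when used as a binary operator.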











 Example: 

 Using the above rules, we can now see how a particular formula is
 compiled.  Let us consider the following formula: 

                  3+5*6 

 This is broken up by the lexical analyzer into seven tokens. 

                  boform 
                  3 
                  + 
                  5 
                  * 
                  6 
                  eoform 

 The syntax scan proceeds as follows until a matching rule is found: 

 Stack 

 boform           number         +            number 
                  boform         number       + 
                                 boform       number 
                                              boform 

 Compile buffer 

                  push 3         push 3       push 3 
                                              push 5 

 At this point, rule 5 is invoked, but since the precedence of boform is
 zero, no action is taken. 

 Stack 

 *                number 
 number           * 
 +                number 
 number           + 
 boform           number 
                  boform 

 Compile buffer 

 push 3           push 3 
 push 5           push 5 
                  push 6 







 At this point, since the binary precedence of + is lower than the binary
 precedence of *, rule 5 does apply, and the opcode for * is compiled.  The
 stack is reduced by replacing number * number by number, and another scan
 is made, but no further rule applies. 


 Stack 

 number          eoform 
 +               number 
 number          + 
 boform          number 
                 boform 

 Compile buffer 

 push 3          push 3 
 push 5          push 5 
 push 6          push 6 
 *               * 



 Rule 5 applies again, and the opcode for + is compiled, reducing the stack
 to boform, number, eoform.  Rescanning finds a match on rule 1 which
 compiles a return opcode and terminates.  The final compiled code is thus: 

 push 3 
 push 5 
 push 6 
 * 
 + 
 return 
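
 The walk-through above can be reproduced with a small shift-and-reduce
 sketch.  It handles only unsigned integer literals and the binary
 operators +, - and *, and implements only rules 1, 5 and 7; it is an
 illustration under those assumptions rather than the actual compiler.
 The function name compile_formula and the textual opcodes are
 hypothetical.

```python
import re

BINARY_PREC = {"+": 4, "-": 4, "*": 5}   # values from the precedence table


def prec(code):
    # boform, eoform and other special tokens have precedence zero.
    return BINARY_PREC.get(code, 0)


def compile_formula(formula):
    """Compile a formula of integers and + - * into Reverse Polish code."""
    tokens = ["boform"] + re.findall(r"\d+|[-+*]", formula) + ["eoform"]
    stack, code = [], []
    for tok in tokens:
        if tok.isdigit():
            code.append(f"push {tok}")   # compile code to push the literal
            stack.append("number")
        else:
            stack.append(tok)
        while True:                      # rescan from rule 1 after each match
            if stack[-3:] == ["boform", "number", "eoform"]:       # rule 1
                code.append("return")
                return code
            if (len(stack) >= 4 and stack[-1] != "number"
                    and stack[-2] == "number"
                    and stack[-3] in BINARY_PREC
                    and stack[-4] == "number"
                    and prec(stack[-3]) >= prec(stack[-1])):       # rule 5
                op2 = stack.pop()
                code.append(stack[-2])   # compile the opcode for op1
                stack[-3:] = ["number"]  # number op1 number -> number
                stack.append(op2)
                continue
            if stack[-1] == "eoform":                              # rule 7
                raise ValueError("unsuccessful compile")
            break
    raise ValueError("unsuccessful compile")
```

 With these assumptions, compile_formula("3+5*6") returns push 3, push 5,
 push 6, *, +, return, which matches the final compiled code listed above.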

 A Note on the Decompiler 

 The algorithm for the formula decompiler was taken verbatim from: 

 Writing Interactive Compilers and Interpreters, P.J. Brown, John Wiley and
 Sons, 1979.  See chapter 6.2.  The algorithm itself is described on pages
 216 and 217. 

 This algorithm is also described in the following article. 

 More on the Re-creation of Source Code from Reverse Polish, P.J. Brown,
 Software Practice and Experience, Vol 7, 545-551 (1977). 










 WORKSHEET COLUMN DESIGNATORS 

 Most records within the 1-2-3 Condensed Worksheet format are specified
 with column/row designators (for example, column 0, row 0 equals A1).  When
 determining the column designator, the table below will help make
 conversion easier. 


 Column   Hex   Dec        Column   Hex   Dec        Column   Hex   Dec 
   A       0     0           BA     34     52          DA     68    104 
   B       1     1           BB     35     53          DB     69    105 
   C       2     2           BC     36     54          DC     6A    106 
   D       3     3           BD     37     55          DD     6B    107 
   E       4     4           BE     38     56          DE     6C    108 
   F       5     5           BF     39     57          DF     6D    109 
   G       6     6           BG     3A     58          DG     6E    110 
   H       7     7           BH     3B     59          DH     6F    111 
   I       8     8           BI     3C     60          DI     70    112 
   J       9     9           BJ     3D     61          DJ     71    113 
   K       A    10           BK     3E     62          DK     72    114 
   L       B    11           BL     3F     63          DL     73    115 
   M       C    12           BM     40     64          DM     74    116 
   N       D    13           BN     41     65          DN     75    117 
   O       E    14           BO     42     66          DO     76    118 
   P       F    15           BP     43     67          DP     77    119 
   Q      10    16           BQ     44     68          DQ     78    120 
   R      11    17           BR     45     69          DR     79    121 
   S      12    18           BS     46     70          DS     7A    122 
   T      13    19           BT     47     71          DT     7B    123 
   U      14    20           BU     48     72          DU     7C    124 
   V      15    21           BV     49     73          DV     7D    125 
   W      16    22           BW     4A     74          DW     7E    126 
   X      17    23           BX     4B     75          DX     7F    127 
   Y      18    24           BY     4C     76          DY     80    128 
   Z      19    25           BZ     4D     77          DZ     81    129 
  AA      1A    26           CA     4E     78          EA     82    130 
  AB      1B    27           CB     4F     79          EB     83    131 
  AC      1C    28           CC     50     80          EC     84    132 
  AD      1D    29           CD     51     81          ED     85    133 
  AE      1E    30           CE     52     82          EE     86    134 
  AF      1F    31           CF     53     83          EF     87    135 
  AG      20    32           CG     54     84          EG     88    136 
  AH      21    33           CH     55     85          EH     89    137 
  AI      22    34           CI     56     86          EI     8A    138 
  AJ      23    35           CJ     57     87          EJ     8B    139 
  AK      24    36           CK     58     88          EK     8C    140 
  AL      25    37           CL     59     89          EL     8D    141 
  AM      26    38           CM     5A     90          EM     8E    142 
  AN      27    39           CN     5B     91          EN     8F    143 
  AO      28    40           CO     5C     92          EO     90    144 
  AP      29    41           CP     5D     93          EP     91    145 
  AQ      2A    42           CQ     5E     94          EQ     92    146 
  AR      2B    43           CR     5F     95          ER     93    147 
  AS      2C    44           CS     60     96          ES     94    148 
  AT      2D    45           CT     61     97          ET     95    149 
  AU      2E    46           CU     62     98          EU     96    150 
  AV      2F    47           CV     63     99          EV     97    151 
  AW      30    48           CW     64    100          EW     98    152 
  AX      31    49           CX     65    101          EX     99    153 
  AY      32    50           CY     66    102          EY     9A    154 
  AZ      33    51           CZ     67    103          EZ     9B    155 








 (CONTINUED)




               Column   Hex    Dec         Column    Hex    Dec 

                 FA     9C     156           HA      D0     208 
                 FB     9D     157           HB      D1     209 
                 FC     9E     158           HC      D2     210 
                 FD     9F     159           HD      D3     211 
                 FE     A0     160           HE      D4     212 
                 FF     A1     161           HF      D5     213 
                 FG     A2     162           HG      D6     214 
                 FH     A3     163           HH      D7     215 
                 FI     A4     164           HI      D8     216 
                 FJ     A5     165           HJ      D9     217 
                 FK     A6     166           HK      DA     218 
                 FL     A7     167           HL      DB     219 
                 FM     A8     168           HM      DC     220 
                 FN     A9     169           HN      DD     221 
                 FO     AA     170           HO      DE     222 
                 FP     AB     171           HP      DF     223 
                 FQ     AC     172           HQ      E0     224 
                 FR     AD     173           HR      E1     225 
                 FS     AE     174           HS      E2     226 
                 FT     AF     175           HT      E3     227 
                 FU     B0     176           HU      E4     228 
                 FV     B1     177           HV      E5     229 
                 FW     B2     178           HW      E6     230 
                 FX     B3     179           HX      E7     231 
                 FY     B4     180           HY      E8     232 
                 FZ     B5     181           HZ      E9     233 
                 GA     B6     182           IA      EA     234 
                 GB     B7     183           IB      EB     235 
                 GC     B8     184           IC      EC     236 
                 GD     B9     185           ID      ED     237 
                 GE     BA     186           IE      EE     238 
                 GF     BB     187           IF      EF     239 
                 GG     BC     188           IG      F0     240 
                 GH     BD     189           IH      F1     241 
                 GI     BE     190           II      F2     242 
                 GJ     BF     191           IJ      F3     243 
                 GK     C0     192           IK      F4     244 
                 GL     C1     193           IL      F5     245 
                 GM     C2     194           IM      F6     246 
                 GN     C3     195           IN      F7     247 
                 GO     C4     196           IO      F8     248 
                 GP     C5     197           IP      F9     249 
                 GQ     C6     198           IQ      FA     250 
                 GR     C7     199           IR      FB     251 
                 GS     C8     200           IS      FC     252 
                 GT     C9     201           IT      FD     253 
                 GU     CA     202           IU      FE     254 
                 GV     CB     203           IV      FF     255 
                 GW     CC     204 
                 GX     CD     205 
                 GY     CE     206 
                 GZ     CF     207 
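
 The table can also be computed rather than looked up.  The sketch below
 assumes zero-based column numbers in the range 0 to 255 (A to IV), as in
 the table; the function names column_name and column_index are
 hypothetical.

```python
def column_name(index):
    """Convert a zero-based column number (0..255) to its letter designator."""
    if not 0 <= index <= 255:
        raise ValueError("column number must be in the range 0..255")
    first, second = divmod(index, 26)
    name = chr(ord("A") + second)
    if first:
        # Two-letter designators start at column 26 (AA).
        name = chr(ord("A") + first - 1) + name
    return name


def column_index(name):
    """Convert a letter designator such as 'BA' back to its column number."""
    index = 0
    for ch in name.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1
```

 For example, column_name(0x34) gives BA and column_index("IV") gives 255.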




 ANALYSIS OF 1-2-3 WORKSHEET FILE 

 The worksheet shown below was created in 1-2-3 and saved to disk. 



                                              Key: 

                                              A2..A5 Named Range (code 11) 
          EXAMPLE                                 A2: Label (code 15) 
             100                                  A3: Integer (code 13) 
            12.5                                  A4: Number (code 14) 
            87.5                                  A5: Formula (+A3-A4) 
                                                      (code 16) 


 The example shown below is a partial hex dump of this worksheet file.  By
 reading each record header, you can determine the type of record you are
 encountering.  The record header will also tell you the length of the data
 that follows the header.  By analyzing the record header, you can read the
 records you want and skip unrelated records. 


    362B:0100                           06 00 08 00 00 00 00 00 00 00 
    362B:0110        04 00 2F 00 01 00  01 02 00 01 00 FF 03 00 01 00 
    362B:0120        00 04 00 01 00 00  05 00 01 00 FF 07 00 1F 00 00 
    362B:0130        00 01 00 71 00 09  00 08 00 14 00 00 00 00 00 00 
    362B:0140        00 00 00 00 00 00  00 04 00 04 00 48 00 00 0B 00 
    362B:0150        18 00 54 45 53 54  00 00 00 00 00 00 00 00 00 00 
    362B:0160        00 00 00 00 01 00  00 00 04 00 18 00 19 00 00 FF 
    362B:0170        FF 00 00 FF FF 00  00 FF FF 00 00 FF FF 00 00 FF 
    362B:0180 


    362B:05C0 
    362B:05D0        00 00 00 00 00 00  00 00 00 00 00 00 00 00 00 00 
    362B:05E0        00 00 00 00 00 00  00 00 00 00 00 00 00 00 00 00 
    362B:05F0        00 00 00 00 71 71  01 00 0F 00 0E 00 FF 00 00 01 
    362B:0600        00 27 45 58 41 4D  50 4C 45 00 0D 00 07 00 FF 00 
    362B:0610        00 02 00 64 00 
    362B:0620                           10 00 1B 00 FF 00 00 04 00 00 
    362B:0630        00 00 00 00 E0 55  40 0C 00 01 00 80 FE BF 01 00 
    362B:0640        80 FF BF 0A 03 
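
 Reading records by their headers can be sketched as follows, using the
 label and integer records copied from the dump above.  The layout used
 here is inferred from those bytes and should be treated as an assumption:
 a header of two little-endian 16-bit words (record type, then body
 length), and a cell body that begins with a format byte, a column word
 and a row word.  The function names iter_records and decode_cell are
 hypothetical.

```python
import struct

# Label record (code 15) followed by integer record (code 13), from the dump.
DATA = bytes.fromhex(
    "0F 00 0E 00 FF 00 00 01 00 27 45 58 41 4D 50 4C 45 00 "
    "0D 00 07 00 FF 00 00 02 00 64 00"
)


def iter_records(data):
    """Yield (record type, body) pairs by walking the record headers."""
    offset = 0
    while offset + 4 <= len(data):
        rec_type, length = struct.unpack_from("<HH", data, offset)
        body = data[offset + 4:offset + 4 + length]
        yield rec_type, body
        offset += 4 + length             # skip to the next header


def decode_cell(rec_type, body):
    """Decode the assumed cell body: format byte, column, row, then value."""
    _fmt, col, row = struct.unpack_from("<BHH", body, 0)
    payload = body[5:]
    if rec_type == 13:                   # Integer: assumed signed 16-bit
        (value,) = struct.unpack("<h", payload)
    elif rec_type == 15:                 # Label: NUL-terminated text
        value = payload.rstrip(b"\x00").decode("ascii")
    else:
        value = payload                  # left undecoded in this sketch
    return col, row, value
```

 Under these assumptions the two records decode to column 0, row 1 with
 the text 'EXAMPLE (including its leading apostrophe) and column 0, row 2
 with the value 100, which correspond to cells A2 and A3 in the key.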
