WSFF
This commit is contained in:
parent
76461e6676
commit
630ce0f116
@ -3,6 +3,7 @@
|
||||
Various spreadsheet file format notes.
|
||||
|
||||
- [Data Interchange Format (DIF)](/dif/README.md)
|
||||
- [Lotus WK# formats](/lotus/README.md)
|
||||
- [Symbolic Link (SLK/SYLK)](/sylk/README.md)
|
||||
- [XLSB Short Records](/xlsb_short_records/README.md)
|
||||
- [Number Formats](/ssf/README.md)
|
||||
|
@ -19,7 +19,7 @@ a 4-byte header consisting of a `0` byte followed by the compressed length
|
||||
(stored as a 3-byte little-endian integer)
|
||||
|
||||
Each block follows the Snappy compressed format as described in
|
||||
<https://github.com/google/snappy/blob/main/format_description.txt> . iWork
|
||||
[the format description from the snappy repo](./snappy_format.txt). iWork
|
||||
apps do not expect a particular compression level, and it is possible to create
|
||||
the equivalent of a "STORED" block.
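
As an illustration of the header described above, a reader might decode it along these lines. This is a minimal sketch: the function name `parse_block_header` and its return shape are hypothetical, and only the 4-byte layout (a `0` byte, then a 3-byte little-endian length) comes from the text.

```python
def parse_block_header(buf, pos=0):
    """Return (compressed_length, payload_offset) for the block at pos.

    Hypothetical helper: only the header layout is taken from the notes.
    """
    if len(buf) - pos < 4:
        raise ValueError("truncated block header")
    if buf[pos] != 0:
        raise ValueError("expected a 0 byte at the start of the block header")
    # The compressed length is the next three bytes, little-endian.
    length = int.from_bytes(buf[pos + 1:pos + 4], "little")
    return length, pos + 4
```

The returned offset points at the first byte of the compressed payload, so the caller can slice `buf[offset:offset + length]` and move on to the next block.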
|
||||
|
||||
|
110
iwa/snappy_format.txt
Normal file
@ -0,0 +1,110 @@
|
||||
Snappy compressed format description
|
||||
Last revised: 2011-10-05
|
||||
|
||||
|
||||
This is not a formal specification, but should suffice to explain most
|
||||
relevant parts of how the Snappy format works. It is originally based on
|
||||
text by Zeev Tarantov.
|
||||
|
||||
Snappy is a LZ77-type compressor with a fixed, byte-oriented encoding.
|
||||
There is no entropy encoder backend nor framing layer -- the latter is
|
||||
assumed to be handled by other parts of the system.
|
||||
|
||||
This document only describes the format, not how the Snappy compressor nor
|
||||
decompressor actually works. The correctness of the decompressor should not
|
||||
depend on implementation details of the compressor, and vice versa.
|
||||
|
||||
|
||||
1. Preamble
|
||||
|
||||
The stream starts with the uncompressed length (up to a maximum of 2^32 - 1),
|
||||
stored as a little-endian varint. Varints consist of a series of bytes,
|
||||
where the lower 7 bits are data and the upper bit is set iff there are
|
||||
more bytes to be read. In other words, an uncompressed length of 64 would
|
||||
be stored as 0x40, and an uncompressed length of 2097150 (0x1FFFFE)
|
||||
would be stored as 0xFE 0xFF 0x7F.
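
The varint rule above can be sketched in a few lines of Python. The helper names `read_varint` and `write_varint` are illustrative and not part of any Snappy API; the 2^32 - 1 bound is the one stated in the preamble text.

```python
def read_varint(buf, pos=0):
    """Decode a little-endian varint; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift   # lower 7 bits carry data
        if not byte & 0x80:                # upper bit clear: last byte
            break
        shift += 7
    if result > 0xFFFFFFFF:
        raise ValueError("length exceeds 2^32 - 1")
    return result, pos


def write_varint(value):
    """Encode a length in the range [0, 2^32 - 1] as a varint."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("length out of range")
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)  # more bytes follow
        value >>= 7
    out.append(value)
    return bytes(out)
```

With these helpers, the two examples from the text round-trip: 64 encodes to the single byte 0x40, and 2097150 encodes to 0xFE 0xFF 0x7F.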
|
||||
|
||||
|
||||
2. The compressed stream itself
|
||||
|
||||
There are two types of elements in a Snappy stream: Literals and
|
||||
copies (backreferences). There is no restriction on the order of elements,
|
||||
except that the stream naturally cannot start with a copy. (Having
|
||||
two literals in a row is never optimal from a compression point of
|
||||
view, but nevertheless fully permitted.) Each element starts with a tag byte,
|
||||
and the lower two bits of this tag byte signal what type of element will
|
||||
follow:
|
||||
|
||||
00: Literal
|
||||
01: Copy with 1-byte offset
|
||||
10: Copy with 2-byte offset
|
||||
11: Copy with 4-byte offset
|
||||
|
||||
The interpretation of the upper six bits is element-dependent.
|
||||
|
||||
|
||||
2.1. Literals (00)
|
||||
|
||||
Literals are uncompressed data stored directly in the byte stream.
|
||||
The literal length is stored differently depending on the length
|
||||
of the literal:
|
||||
|
||||
- For literals up to and including 60 bytes in length, the upper
|
||||
six bits of the tag byte contain (len-1). The literal follows
|
||||
immediately thereafter in the bytestream.
|
||||
- For longer literals, the (len-1) value is stored after the tag byte,
|
||||
little-endian. The upper six bits of the tag byte describe how
|
||||
many bytes are used for the length; 60, 61, 62 or 63 for
|
||||
1-4 bytes, respectively. The literal itself follows after the
|
||||
length.
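
To make the two literal encodings concrete, here is a hedged sketch of a literal writer. The function name `encode_literal` is hypothetical; the tag layout follows the two cases listed above.

```python
def encode_literal(data):
    """Encode one literal element (tag byte, optional length bytes, data)."""
    n = len(data) - 1                      # the format stores len-1
    if n < 0:
        raise ValueError("a literal must contain at least one byte")
    if n < 60:
        # Short form: len-1 sits in the upper six bits; low bits 00 = literal.
        header = bytes([n << 2])
    else:
        # Long form: 60..63 in the upper six bits selects 1..4 length bytes.
        nbytes = (n.bit_length() + 7) // 8
        if nbytes > 4:
            raise ValueError("literal too long")
        header = bytes([(59 + nbytes) << 2]) + n.to_bytes(nbytes, "little")
    return header + bytes(data)
```

A stream made of the preamble followed only by literal elements contains no copies at all, which is one way to emit data without doing any real compression.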
|
||||
|
||||
|
||||
2.2. Copies
|
||||
|
||||
Copies are references back into previous decompressed data, telling
|
||||
the decompressor to reuse data it has previously decoded.
|
||||
They encode two values: The _offset_, saying how many bytes back
|
||||
from the current position to read, and the _length_, how many bytes
|
||||
to copy. Offsets of zero can be encoded, but are not legal;
|
||||
similarly, it is possible to encode backreferences that would
|
||||
go past the end of the block (offset > current decompressed position),
|
||||
which is also nonsensical and thus not allowed.
|
||||
|
||||
As in most LZ77-based compressors, the length can be larger than the offset,
|
||||
yielding a form of run-length encoding (RLE). For instance,
|
||||
"xababab" could be encoded as
|
||||
|
||||
<literal: "xab"> <copy: offset=2 length=4>
|
||||
|
||||
Note that since the current Snappy compressor works in 32 kB
|
||||
blocks and does not do matching across blocks, it will never produce
|
||||
a bitstream with offsets larger than about 32768. However, the
|
||||
decompressor should not rely on this, as it may change in the future.
|
||||
|
||||
There are several different kinds of copy elements, depending on
|
||||
the amount of bytes to be copied (length), and how far back the
|
||||
data to be copied is (offset).
|
||||
|
||||
|
||||
2.2.1. Copy with 1-byte offset (01)
|
||||
|
||||
These elements can encode lengths between [4..11] bytes and offsets
|
||||
between [0..2047] bytes. (len-4) occupies three bits and is stored
|
||||
in bits [2..4] of the tag byte. The offset occupies 11 bits, of which the
|
||||
upper three are stored in the upper three bits ([5..7]) of the tag byte,
|
||||
and the lower eight are stored in a byte following the tag byte.
|
||||
|
||||
|
||||
2.2.2. Copy with 2-byte offset (10)
|
||||
|
||||
These elements can encode lengths between [1..64] and offsets from
|
||||
[0..65535]. (len-1) occupies six bits and is stored in the upper
|
||||
six bits ([2..7]) of the tag byte. The offset is stored as a
|
||||
little-endian 16-bit integer in the two bytes following the tag byte.
|
||||
|
||||
|
||||
2.2.3. Copy with 4-byte offset (11)
|
||||
|
||||
These are like the copies with 2-byte offsets (see previous subsection),
|
||||
except that the offset is stored as a 32-bit integer instead of a
|
||||
16-bit integer (and thus will occupy four bytes).
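
Putting sections 1 and 2 together, a decoder could look roughly like the sketch below. It is an illustrative reading of this description, not the reference implementation: the name `snappy_decompress` is hypothetical, error handling is minimal, and truncated input may surface as an `IndexError`.

```python
def snappy_decompress(buf):
    """Decode one raw Snappy stream (preamble plus elements) into bytes."""
    # Preamble: uncompressed length as a little-endian varint.
    expected, shift, pos = 0, 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        expected |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            break

    out = bytearray()
    while pos < len(buf):
        tag = buf[pos]
        pos += 1
        kind, upper = tag & 0x03, tag >> 2
        if kind == 0:                               # literal
            if upper < 60:
                length = upper + 1
            else:
                nbytes = upper - 59                 # 60..63 -> 1..4 bytes
                length = int.from_bytes(buf[pos:pos + nbytes], "little") + 1
                pos += nbytes
            if pos + length > len(buf):
                raise ValueError("literal runs past end of input")
            out += buf[pos:pos + length]
            pos += length
            continue
        if kind == 1:                               # copy, 1-byte offset
            length = (upper & 0x07) + 4
            offset = ((tag >> 5) << 8) | buf[pos]
            pos += 1
        elif kind == 2:                             # copy, 2-byte offset
            length = upper + 1
            offset = int.from_bytes(buf[pos:pos + 2], "little")
            pos += 2
        else:                                       # copy, 4-byte offset
            length = upper + 1
            offset = int.from_bytes(buf[pos:pos + 4], "little")
            pos += 4
        if offset == 0 or offset > len(out):
            raise ValueError("invalid copy offset")
        # Byte-by-byte copy so that length > offset behaves like RLE.
        for _ in range(length):
            out.append(out[-offset])
    if len(out) != expected:
        raise ValueError("decoded length does not match preamble")
    return bytes(out)
```

For the "xababab" example from section 2.2, the stream is the length byte 0x07, the literal tag 0x08 followed by "xab", and the copy element 0x01 0x02 (length 4, offset 2).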
|
24
lotus/README.md
Normal file
@ -0,0 +1,24 @@
|
||||
# Lotus Worksheet File Format
|
||||
|
||||
## Specifications
|
||||
|
||||
An official set of file format specifications was published by Lotus Development
|
||||
Corporation and released into the public domain.
|
||||
|
||||
The official dedication is in the first document:
|
||||
|
||||
> The information contained in this document has been released into the
|
||||
> public domain and is not considered to be confidential or proprietary
|
||||
> although still the copyright and property of Lotus Development Corporation.
|
||||
> All efforts have been made to ensure that this information is clear and
|
||||
> useful since Lotus will not be providing customer assistance with this
|
||||
> booklet.
|
||||
|
||||
- [WSFF1.TXT](./WSFF1.TXT) covers the generic record structure and includes a
|
||||
summary of the core record types.
|
||||
- [WSFF2.TXT](./WSFF2.TXT) covers each record type, including wire layouts.
|
||||
- [WSFF3.TXT](./WSFF3.TXT) covers the cell format bit field.
|
||||
- [WSFF4.TXT](./WSFF4.TXT) covers the wire layout of formula expressions.
|
||||
- [WSFF5.TXT](./WSFF5.TXT) covers the formula opcodes (addendum to WSFF4).
|
||||
|
||||
|
338
lotus/WSFF1.TXT
Normal file
@ -0,0 +1,338 @@
|
||||
WORKSHEET FILE FORMAT
|
||||
FROM LOTUS
|
||||
|
||||
INTRODUCTION AND QUICK REFERENCE
|
||||
|
||||
Copyright(c) 1984, Lotus Development Corporation
|
||||
161 First Street
|
||||
Cambridge, Massachusetts 02142
|
||||
(617) 492-7171
|
||||
Electronic Edition, December, 1984
|
||||
All Rights Reserved
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
PREFACE
|
||||
|
||||
Lotus Development Corporation's 1-2-3(TM) and Symphony(TM) perform user
|
||||
selected operations upon a data matrix that is termed a "worksheet".
|
||||
|
||||
Worksheet files are such matrices stored on disk.
|
||||
|
||||
A worksheet file is an unbroken sequence of binary coded records defining a
|
||||
single worksheet.
|
||||
|
||||
Both 1-2-3 and Symphony accept externally created data files if the files
|
||||
are in the worksheet file format. Other programs can decode and process
|
||||
worksheet files created by 1-2-3 or Symphony.
|
||||
|
||||
The following document provides information required to create or access a
|
||||
worksheet file by describing the records used to create a worksheet file.
|
||||
It is assumed that the reader is familiar with Lotus products and has ready
|
||||
access to 1-2-3 or Symphony documentation.
|
||||
|
||||
Note that the worksheet files for 1-2-3 and Symphony are similar, but not
|
||||
necessarily interchangeable. 1-2-3 and Symphony share some record types,
|
||||
but also have record types unique to that product. Symphony can read 1-2-3
|
||||
records, but 1-2-3 cannot read Symphony records.
|
||||
|
||||
The information contained in this document has been released into the
|
||||
public domain and is not considered to be confidential or proprietary
|
||||
although still the copyright and property of Lotus Development Corporation.
|
||||
All efforts have been made to ensure that this information is clear and
|
||||
useful since Lotus will not be providing customer assistance with this
|
||||
booklet. Lotus will, however, incorporate any necessary corrections if
|
||||
they are reported in writing to:
|
||||
|
||||
Lotus Development Corporation
|
||||
Worksheet File Format
|
||||
161 First Street
|
||||
Cambridge, MA 02142
|
||||
|
||||
|
||||
WORKSHEET FILE FORMAT
|
||||
|
||||
Worksheet files are organized as an unbroken sequence of variable length
|
||||
binary records. Each record consists of a 4-byte header followed by the
|
||||
record body. The header defines the record's type and length, as the
|
||||
example below shows.
|
||||
|
||||
The header's composition is as follows:
|
||||
|
||||
|
||||
|
||||
Byte Number Byte Description
|
||||
0,1 Record type code
|
||||
2,3 Record body length (bytes)
|
||||
|
||||
|
||||
Example: Record Header
|
||||
|
||||
Record Header
|
||||
|
||||
Record Record
|
||||
Type Length
|
||||
|
||||
Byte Number 0 1 2 3
|
||||
Hex Code 1C 00 20 00
|
||||
Decimal Equivalent 28 32
|
||||
|
||||
|
||||
The record body can be of many different types; most have predetermined
|
||||
length, but some vary in length.
|
||||
|
||||
The record type code is 28.
|
||||
|
||||
In a hex dump of the file, the record type appears as 1C 00h, noting that
|
||||
the 8086/88 stores the most significant byte of a word in the higher memory
|
||||
address.
|
||||
|
||||
The record length is 32 bytes.
|
||||
In a hex dump of the file, the record length appears as 20 00h.
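
As a rough illustration of the header layout, the four bytes can be unpacked as two little-endian 16-bit words. The function name `parse_record_header` is an assumption made for this sketch; the byte order follows the 8086/88 note above.

```python
import struct


def parse_record_header(buf, pos=0):
    """Return (record_type, body_length) from the 4-byte header at pos."""
    # "<HH": two unsigned 16-bit words, least significant byte first.
    record_type, body_length = struct.unpack_from("<HH", buf, pos)
    return record_type, body_length


# The example header from the text: 1C 00 20 00.
example = bytes.fromhex("1C002000")
record_type, body_length = parse_record_header(example)
```

Here `record_type` is 28 and `body_length` is 32, matching the decimal equivalents in the example.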
|
||||
|
||||
|
||||
Record types with Column/Row Coordinates
|
||||
|
||||
Some record types contain column/row coordinates to identify a cell, or one
|
||||
of the two points that define a range. Numbering starts at zero in the
|
||||
upper left corner of the worksheet.
|
||||
For example:
|
||||
|
||||
Cell A1 = column 0, row 0
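
The zero-based coordinates can be turned into the familiar cell name with a short helper. This is a sketch with hypothetical function names; it assumes the usual A..Z, AA..AZ, ... column lettering used by 1-2-3.

```python
def column_letters(col):
    """Convert a zero-based column number to letters (0 -> A, 26 -> AA)."""
    if col < 0:
        raise ValueError("column must be non-negative")
    letters = ""
    n = col + 1                       # switch to 1-based for the base-26 walk
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def cell_name(col, row):
    """Convert zero-based (column, row) coordinates to a name such as A1."""
    return f"{column_letters(col)}{row + 1}"
```

For example, `cell_name(0, 0)` gives `"A1"` and `column_letters(255)` gives `"IV"`.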
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
SUMMARY OF RECORD TYPES
|
||||
|
||||
This section describes the different record types found in 1-2-3 and
|
||||
Symphony.
|
||||
|
||||
There are two Quick Reference tables ordered by Opcode and by Product,
|
||||
followed by a detailed reference section ordered by Opcode. In the
|
||||
reference section, there are examples for the more commonly used records.
|
||||
|
||||
It is assumed that the reader is familiar with 1-2-3 or Symphony and has
|
||||
access to Lotus' documentation.
|
||||
|
||||
Quick Reference by Opcode
|
||||
|
||||
Type Code (hex) Length (bytes) Description
|
||||
|
||||
BOF 0 2 Beginning of file
|
||||
EOF 1 0 End of file
|
||||
CALCMODE 2 1 Calculation mode
|
||||
CALCORDER 3 1 Calculation order
|
||||
SPLIT 4 1 Split window type
|
||||
SYNC 5 1 Split window sync
|
||||
RANGE 6 8 Active worksheet range
|
||||
WINDOW1 7 31 Window 1 record
|
||||
COLW1 8 3 Column width,
|
||||
window 1
|
||||
WINTWO 9 31 Window 2 record
|
||||
COLW2 A 3 Column width,
|
||||
window 2
|
||||
NAME B 24 Named range
|
||||
BLANK C 5 Blank cell
|
||||
INTEGER D 7 Integer number cell
|
||||
NUMBER E 13 Floating point number
|
||||
LABEL F variable Label cell
|
||||
FORMULA 10 variable Formula cell
|
||||
TABLE 18 25 Data table range
|
||||
QRANGE 19 25 Query range
|
||||
PRANGE 1A 8 Print range
|
||||
SRANGE 1B 8 Sort range
|
||||
FRANGE 1C 8 Fill range
|
||||
KRANGE1 1D 9 Primary sort key range
|
||||
HRANGE 20 16 Distribution range
|
||||
KRANGE2 23 9 Secondary sort key
|
||||
range
|
||||
PROTEC 24 1 Global protection
|
||||
FOOTER 25 242 Print footer
|
||||
HEADER 26 242 Print header
|
||||
SETUP 27 40 Print setup
|
||||
MARGINS 28 10 Print margins code
|
||||
|
||||
|
||||
|
||||
Quick Reference by Opcode (continued)
|
||||
|
||||
Type code (hex) Length (bytes) Description
|
||||
|
||||
LABELFMT 29 1 Label alignment
|
||||
TITLES 2A 16 Print borders
|
||||
GRAPH 2D 437 Current graph settings
|
||||
NGRAPH 2E 453 Named graph settings
|
||||
CALCCOUNT 2F 1 Iteration count
|
||||
UNFORMATTED 30 1 Formatted/unformatted
|
||||
print
|
||||
CURSORW12 31 1 Cursor location
|
||||
WINDOW 32 144 Symphony window
|
||||
settings
|
||||
STRING 33 variable Value of string
|
||||
formula
|
||||
PASSWORD 37 4 File lockout (CHKSUM)
|
||||
LOCKED 38 1 Lock flag
|
||||
QUERY 3C 127 Symphony query
|
||||
settings
|
||||
QUERYNAME 3D 16 Query name
|
||||
PRINT 3E 679 Symphony print record
|
||||
PRINTNAME 3F 16 Print record name
|
||||
GRAPH2 40 499 Symphony graph
|
||||
record
|
||||
GRAPHNAME 41 16 Graph record name
|
||||
ZOOM 42 9 Orig coordinates
|
||||
expanded window
|
||||
SYMSPLIT 43 2 Nos. of split windows
|
||||
NSROWS 44 2 Nos. of screen rows
|
||||
NSCOLS 45 2 Nos. of screen columns
|
||||
RULER 46 25 Named ruler range
|
||||
NNAME 47 25 Named sheet range
|
||||
ACOMM 48 65 Autoload.comm code
|
||||
AMACRO 49 8 Autoexecute macro
|
||||
address
|
||||
PARSE 4A 16 Query parse
|
||||
information
|
||||
|
||||
|
||||
|
||||
|
||||
Quick Reference by Product: 1-2-3 only
|
||||
|
||||
Type Code (hex) Length (bytes) Description
|
||||
|
||||
SPLIT 4 1 Split window type
|
||||
SYNC 5 1 Split window sync
|
||||
WINDOW1 7 31 Window 1 record
|
||||
WINTWO 9 31 Window 2 record
|
||||
COLW2 A 3 Column width,
|
||||
window 2
|
||||
NAME B 24 Named range
|
||||
QRANGE 19 25 Query range
|
||||
PRANGE 1A 8 Print range
|
||||
SRANGE 1B 8 Sort range
|
||||
KRANGE1 1D 9 Primary sort key range
|
||||
KRANGE2 23 9 Secondary sort key
|
||||
range
|
||||
FOOTER 25 242 Print footer
|
||||
HEADER 26 242 Print header
|
||||
SETUP 27 40 Print setup
|
||||
MARGINS 28 10 Print margins code
|
||||
TITLES 2A 16 Print borders
|
||||
GRAPH 2D 437 Current graph settings
|
||||
NGRAPH 2E 453 Named graph settings
|
||||
|
||||
|
||||
|
||||
|
||||
Quick Reference by Product: 1-2-3 and Symphony
|
||||
|
||||
Type Code (hex) Length (bytes) Description
|
||||
|
||||
BOF 0 2 Beginning of file
|
||||
EOF 1 0 End of file
|
||||
CALCMODE 2 1 Calculation mode
|
||||
CALCORDER 3 1 Calculation order
|
||||
RANGE 6 8 Active worksheet range
|
||||
COLW1 8 3 Column width
|
||||
BLANK C 5 Blank cell
|
||||
INTEGER D 7 Integer number cell
|
||||
NUMBER E 13 Floating point number
|
||||
LABEL F variable Label cell
|
||||
FORMULA 10 variable Formula cell
|
||||
TABLE 18 25 Data table range
|
||||
FRANGE 1C 8 Fill range
|
||||
HRANGE 20 16 Distribution range
|
||||
PROTEC 24 1 Global protection
|
||||
LABELFMT 29 1 Label alignment
|
||||
CALCCOUNT 2F 1 Iteration count
|
||||
UNFORMATTED 30 1 Formatted/unformatted
|
||||
print
|
||||
CURSORW12 31 1 Cursor location
|
||||
|
||||
|
||||
|
||||
|
||||
Quick Reference by Product: Symphony only
|
||||
|
||||
Type Code (hex) Length (bytes) Description
|
||||
|
||||
WINDOW 32 144 Symphony window
|
||||
settings
|
||||
STRING 33 variable Value of string
|
||||
formula
|
||||
PASSWORD 37 4 File lockout (CHKSUM)
|
||||
LOCKED 38 1 Lock flag
|
||||
QUERY 3C 127 Symphony query
|
||||
settings
|
||||
QUERYNAME 3D 16 Query name
|
||||
PRINT 3E 679 Symphony print record
|
||||
PRINTNAME 3F 16 Print record name
|
||||
GRAPH2 40 499 Symphony graph
|
||||
record
|
||||
GRAPHNAME 41 16 Graph record name
|
||||
ZOOM 42 9 Orig coordinates
|
||||
expanded window
|
||||
SYMSPLIT 43 2 Nos. of split windows
|
||||
NSROWS 44 2 Nos. of screen rows
|
||||
NSCOLS 45 2 Nos. of screen columns
|
||||
RULER 46 25 Named ruler range
|
||||
NNAME 47 25 Named sheet range
|
||||
ACOMM 48 65 Autoload.comm code
|
||||
AMACRO 49 8 Autoexecute macro
|
||||
address
|
||||
PARSE 4A 16 Query parse
|
||||
information
|
2553
lotus/WSFF2.TXT
Normal file
File diff suppressed because it is too large
BIN
lotus/WSFF3.TXT
Normal file
Binary file not shown.
544
lotus/WSFF4.TXT
Normal file
@ -0,0 +1,544 @@
|
||||
WORKSHEET FILE FORMAT
|
||||
FROM LOTUS
|
||||
|
||||
APPENDIX B - THE FORMULA COMPILER
|
||||
|
||||
Copyright(c) 1984, Lotus Development Corporation
|
||||
161 First Street
|
||||
Cambridge, Massachusetts 02142
|
||||
(617) 492-7171
|
||||
Electronic Edition, December, 1984
|
||||
All Rights Reserved
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
APPENDIX B: The Formula Compiler
|
||||
|
||||
This appendix describes the internal workings of the formula compiler. The
|
||||
compiler transforms an ASCII string of characters representing a formula to
|
||||
its Reverse Polish code. The basic algorithm utilizes an SR parser (SR =
|
||||
shift and reduce). The aim of the parser is to apply a set of reduction
|
||||
rules which embody the syntax of the compiler to an input string. Formula
|
||||
code is compiled to a temporary buffer.
|
||||
|
||||
Lexical Analysis
|
||||
|
||||
A lexical analyzer breaks up the input string into lexical units called
|
||||
tokens. A token is a substring of the original input string: an operand,
|
||||
operator, or special symbol (such as a comma or parenthesis). In addition,
|
||||
the lexical analyzer supplies two special tokens, "beginning of formula"
|
||||
(boform) and "end of formula" (eoform), to facilitate the compilation
|
||||
process. The lexical analyzer identifies and processes literals (both
|
||||
number and string), cell and range references, operators, and function
|
||||
calls. It assigns a unique code to each distinct operator, function, or
|
||||
type of operand.
|
||||
|
||||
A function with no arguments is treated like a number.
|
||||
|
||||
Syntax Analysis
|
||||
|
||||
The syntactical analysis of a formula is accomplished by processing a list
|
||||
of tokens in left-to-right order. A stack called the syntax stack is also used
|
||||
during the syntactical scan. The basic algorithm is as follows:
|
||||
|
||||
Repeat the following steps:
|
||||
|
||||
1) Get the next token
|
||||
|
||||
2) If the token is a literal or cell reference:
|
||||
a) Compile code to push the literal or cell reference
|
||||
b) Push the number code on the syntax stack
|
||||
|
||||
3) If the token is a range reference:
|
||||
a) Compile code to push the range reference
|
||||
b) Push the range code on the syntax stack
|
||||
|
||||
4) Otherwise push the token code for the token on the syntax stack.
|
||||
|
||||
For each syntax rule, if the pattern on the top of the syntax stack matches the
|
||||
rule pattern, take the action associated with the rule and start scanning
|
||||
from the beginning for any additional rules which may apply.
|
||||
|
||||
When a token code is pushed on the syntax stack, an additional word of
|
||||
zeros is also pushed on the stack. This is used when compiling function
|
||||
calls to hold the function's argument count.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Rule Matching
|
||||
|
||||
A relatively small number of rules are used to process formulas of arbitrary
|
||||
complexity. If a rule matches the top of the syntax stack, then the
|
||||
compiler takes a specific action and rule scanning starts again with the
|
||||
first rule. Each rule matches certain patterns on the syntax stack. A
|
||||
typical rule might be: if the top of the stack is the token for right
|
||||
parenthesis, and the next-to-top is a number, and the second from the top
|
||||
is a left parenthesis, then pop the top three items from the syntax stack
|
||||
and push the number on the syntax stack.
|
||||
|
||||
This rule can be more succinctly represented as:
|
||||
|
||||
Stack
|
||||
|
||||
Before After Action
|
||||
)
|
||||
number
|
||||
( number none
|
||||
|
||||
|
||||
|
||||
The Rules
|
||||
|
||||
|
||||
The following are the syntax rules used to process formulas. Note that the
|
||||
order of the rules is important. The rules for compilation of operators
|
||||
use additional tables which assign a precedence number and opcode to each
|
||||
legal unary and binary operator. Thus, for example, there is a single
|
||||
token code for minus sign (-), but there are two opcodes: one for unary
|
||||
minus and one for binary minus. In addition, these two operators, while
|
||||
lexically identical, also have different precedence. In general, operators
|
||||
of higher precedence will be performed before operators of lower precedence;
|
||||
operators of equal precedence are performed left-to-right. All special operators (boform, eoform,
|
||||
parentheses, comma, etc.) are implicitly assigned a precedence of zero.
|
||||
|
||||
Rule 1 Termination test
|
||||
|
||||
Stack
|
||||
|
||||
Before After Action
|
||||
eoform Output a return code to compile buffer
|
||||
number Return, indicating successful compile
|
||||
boform
|
||||
|
||||
Rule 2 Function argument processing
|
||||
|
||||
Stack
|
||||
Before After Action
|
||||
, Error if range argument illegal for
|
||||
number or range function.
|
||||
( ( Increment argument count on stack
|
||||
function function
|
||||
|
||||
Rule 3 Process final function argument
|
||||
|
||||
Stack
|
||||
Before After Action
|
||||
) Error if range argument illegal for
|
||||
number or range function.
|
||||
( Increment argument count on stack
|
||||
function number Compile function opcode
|
||||
If list function, compile argument
|
||||
count; otherwise error if wrong
|
||||
argument count.
|
||||
|
||||
|
||||
|
||||
|
||||
Rule 4 Parenthesis removal
|
||||
|
||||
Stack
|
||||
Before After Action
|
||||
) Compile parenthesis opcode
|
||||
number
|
||||
( number
|
||||
operator operator
|
||||
|
||||
|
||||
|
||||
Rule 5 Binary operators
|
||||
|
||||
Stack
|
||||
Before After Action
|
||||
op2 If binary op1 < binary op2, rule does
|
||||
number not match. Otherwise, compile opcode
|
||||
op1 op2 for operator op1.
|
||||
|
||||
|
||||
Rule 6 Unary operators
|
||||
|
||||
Stack
|
||||
Before After Action
|
||||
op2 If unary op1 < binary op2, rule does
|
||||
number op2 not match. Otherwise, compile opcode
|
||||
op1 number for operator op1.
|
||||
|
||||
|
||||
Rule 7 Error detection
|
||||
|
||||
Stack
|
||||
Before After Action
|
||||
eoform Return indicating unsuccessful compile
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Table 9 Operator Precedence Table
|
||||
|
||||
Operator Unary Precedence Binary Precedence
|
||||
+ 6 4
|
||||
- 6 4
|
||||
* na 5
|
||||
/ na 7
|
||||
^ na 3
|
||||
= na 3
|
||||
< > na 3
|
||||
< = na 3
|
||||
> = na 3
|
||||
< na 3
|
||||
> na 3
|
||||
#and# na 1
|
||||
#or# na 1
|
||||
#not# 2 na
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Example:
|
||||
|
||||
Using the above rules, we can now see how a particular formula is
|
||||
compiled. Let us consider the following formula:
|
||||
|
||||
3+5*6
|
||||
|
||||
This is broken up by the lexical analyzer into seven tokens.
|
||||
|
||||
boform
|
||||
3
|
||||
+
|
||||
5
|
||||
*
|
||||
6
|
||||
eoform
|
||||
|
||||
The syntax scans proceed as follows until a matching rule is found:
|
||||
|
||||
Stack
|
||||
|
||||
boform number + number
|
||||
boform number +
|
||||
boform number
|
||||
boform
|
||||
|
||||
Compile buffer
|
||||
|
||||
push 3 push 3 push 3
|
||||
push 5
|
||||
|
||||
At this point, rule 5 is invoked, but since the precedence of boform is
|
||||
zero, no action is taken.
|
||||
|
||||
Stack
|
||||
|
||||
* number
|
||||
number *
|
||||
+ number
|
||||
number +
|
||||
boform number
|
||||
boform
|
||||
|
||||
Compile buffer
|
||||
|
||||
push 3 push 3
|
||||
push 5 push 5
|
||||
push 6
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
At this point, since the binary precedence of + is lower than the binary
|
||||
precedence of *, rule 5 does apply, and the opcode for * is compiled. The
|
||||
stack is reduced by replacing number * number by number and scan is made,
|
||||
but no further rule applies.
|
||||
|
||||
|
||||
Stack
|
||||
|
||||
number eoform
|
||||
+ number
|
||||
number +
|
||||
boform number
|
||||
boform
|
||||
|
||||
Compile buffer
|
||||
|
||||
push 3 push 3
|
||||
push 5 push 5
|
||||
push 6 push 6
|
||||
|
||||
|
||||
|
||||
Rule 5 applies again, and the opcode for + is compiled, reducing the stack
|
||||
to boform, number, eoform. Rescanning finds a match on rule 1 which
|
||||
compiles a return opcode and terminates. The final compiled code is thus:
|
||||
|
||||
push 3
|
||||
push 5
|
||||
push 6
|
||||
*
|
||||
+
|
||||
return
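
The walkthrough above can be reproduced with a much-reduced shift-reduce sketch. This is not Lotus's compiler: it handles only unsigned integers and the binary operators `+`, `-` and `*`, it implements only simplified forms of rules 1, 5 and 7, and it assumes rule 5 reduces when the precedence of the lower operator (op1) is at least that of the operator on top of the stack (op2). All names are hypothetical.

```python
import re

# Binary precedences for the three supported operators, taken from Table 9.
BINARY_PRECEDENCE = {"+": 4, "-": 4, "*": 5}


def precedence(code):
    # Special tokens (boform, eoform) implicitly have precedence zero.
    return BINARY_PRECEDENCE.get(code, 0)


def compile_formula(text):
    """Compile e.g. '3+5*6' into a list of Reverse Polish instructions."""
    tokens = ["boform"] + re.findall(r"\d+|[-+*]", text) + ["eoform"]
    stack = []   # the syntax stack, holding token codes
    code = []    # the compile buffer
    for tok in tokens:
        if tok.isdigit():
            code.append(f"push {tok}")     # compile code to push the literal
            stack.append("number")         # push the number code
        else:
            stack.append(tok)
        while True:                        # rescan the rules after each change
            # Simplified rule 1: termination test.
            if stack == ["boform", "number", "eoform"]:
                code.append("return")
                return code
            # Simplified rule 5: number op1 number op2 -> number op2.
            if (len(stack) >= 4
                    and stack[-4] == "number"
                    and stack[-3] in BINARY_PRECEDENCE
                    and stack[-2] == "number"
                    and stack[-1] != "number"
                    and precedence(stack[-3]) >= precedence(stack[-1])):
                code.append(stack[-3])     # compile opcode for op1
                stack[-4:] = ["number", stack[-1]]
                continue
            break
    # Simplified rule 7: eoform was seen but the stack never reduced.
    raise ValueError("unsuccessful compile")
```

Calling `compile_formula("3+5*6")` yields the same instruction sequence as the listing above: `push 3`, `push 5`, `push 6`, `*`, `+`, `return`.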
|
||||
|
||||
A Note on the Decompiler
|
||||
|
||||
The algorithm for the formula decompiler was taken verbatim from:
|
||||
|
||||
Writing Interactive Compilers and Interpreters, P.J. Brown, John Wiley and
|
||||
Sons, 1979. See chapter 6.2. The algorithm itself is described on pages
|
||||
216 and 217.
|
||||
|
||||
This algorithm is also described in the following article.
|
||||
|
||||
More on the Re-creation of Source Code from Reverse Polish, P.J. Brown,
|
||||
Software Practice and Experience, Vol 7, 545-551 (1977).
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
WORKSHEET COLUMN DESIGNATORS
|
||||
|
||||
Most records within the 1-2-3 Condensed Worksheet format are specified
|
||||
with column/row designators (for example, column 0, row 0 equals A1). When
|
||||
determining the column designator, the table below will help make
|
||||
conversion easier.
|
||||
|
||||
|
||||
Column Hex Dec Column Hex Dec Column Hex Dec
|
||||
A 0 0 BA 34 52 DA 68 104
|
||||
B 1 1 BB 35 53 DB 69 105
|
||||
C 2 2 BC 36 54 DC 6A 106
|
||||
D 3 3 BD 37 55 DD 6B 107
|
||||
E 4 4 BE 38 56 DE 6C 108
|
||||
F 5 5 BF 39 57 DF 6D 109
|
||||
G 6 6 BG 3A 58 DG 6E 110
|
||||
H 7 7 BH 3B 59 DH 6F 111
|
||||
I 8 8 BI 3C 60 DI 70 112
|
||||
J 9 9 BJ 3D 61 DJ 71 113
|
||||
K A 10 BK 3E 62 DK 72 114
|
||||
L B 11 BL 3F 63 DL 73 115
|
||||
M C 12 BM 40 64 DM 74 116
|
||||
N D 13 BN 41 65 DN 75 117
|
||||
O E 14 BO 42 66 DO 76 118
|
||||
P F 15 BP 43 67 DP 77 119
|
||||
Q 10 16 BQ 44 68 DQ 78 120
|
||||
R 11 17 BR 45 69 DR 79 121
|
||||
S 12 18 BS 46 70 DS 7A 122
|
||||
T 13 19 BT 47 71 DT 7B 123
|
||||
U 14 20 BU 48 72 DU 7C 124
|
||||
V 15 21 BV 49 73 DV 7D 125
|
||||
W 16 22 BW 4A 74 DW 7E 126
|
||||
X 17 23 BX 4B 75 DX 7F 127
|
||||
Y 18 24 BY 4C 76 DY 80 128
|
||||
Z 19 25 BZ 4D 77 DZ 81 129
|
||||
AA 1A 26 CA 4E 78 EA 82 130
|
||||
AB 1B 27 CB 4F 79 EB 83 131
|
||||
AC 1C 28 CC 50 80 EC 84 132
|
||||
AD 1D 29 CD 51 81 ED 85 133
|
||||
AE 1E 30 CE 52 82 EE 86 134
|
||||
AF 1F 31 CF 53 83 EF 87 135
|
||||
AG 20 32 CG 54 84 EG 88 136
|
||||
AH 21 33 CH 55 85 EH 89 137
|
||||
AI 22 34 CI 56 86 EI 8A 138
|
||||
AJ 23 35 CJ 57 87 EJ 8B 139
|
||||
AK 24 36 CK 58 88 EK 8C 140
|
||||
AL 25 37 CL 59 89 EL 8D 141
|
||||
AM 26 38 CM 5A 90 EM 8E 142
|
||||
AN 27 39 CN 5B 91 EN 8F 143
|
||||
AO 28 40 CO 5C 92 EO 90 144
|
||||
AP 29 41 CP 5D 93 EP 91 145
|
||||
AQ 2A 42 CQ 5E 94 EQ 92 146
|
||||
AR 2B 43 CR 5F 95 ER 93 147
|
||||
AS 2C 44 CS 60 96 ES 94 148
|
||||
AT 2D 45 CT 61 97 ET 95 149
|
||||
AU 2E 46 CU 62 98 EU 96 150
|
||||
AV 2F 47 CV 63 99 EV 97 151
|
||||
AW 30 48 CW 64 100 EW 98 152
|
||||
AX 31 49 CX 65 101 EX 99 153
|
||||
AY 32 50 CY 66 102 EY 9A 154
|
||||
AZ 33 51 CZ 67 103 EZ 9B 155
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
(CONTINUED)
|
||||
|
||||
|
||||
|
||||
|
||||
Column Hex Dec Column Hex Dec
|
||||
|
||||
FA 9C 156 HA D0 208
|
||||
FB 9D 157 HB D1 209
|
||||
FC 9E 158 HC D2 210
|
||||
FD 9F 159 HD D3 211
|
||||
FE A0 160 HE D4 212
|
||||
FF A1 161 HF D5 213
|
||||
FG A2 162 HG D6 214
|
||||
FH A3 163 HH D7 215
|
||||
FI A4 164 HI D8 216
|
||||
FJ A5 165 HJ D9 217
|
||||
FK A6 166 HK DA 218
|
||||
FL A7 167 HL DB 219
|
||||
FM A8 168 HM DC 220
|
||||
FN A9 169 HN DD 221
|
||||
FO AA 170 HO DE 222
|
||||
FP AB 171 HP DF 223
|
||||
FQ AC 172 HQ E0 224
|
||||
FR AD 173 HR E1 225
|
||||
FS AE 174 HS E2 226
|
||||
FT AF 175 HT E3 227
|
||||
FU B0 176 HU E4 228
|
||||
FV B1 177 HV E5 229
|
||||
FW B2 178 HW E6 230
|
||||
FX B3 179 HX E7 231
|
||||
FY B4 180 HY E8 232
|
||||
FZ B5 181 HZ E9 233
|
||||
GA B6 182 IA EA 234
|
||||
GB B7 183 IB EB 235
|
||||
GC B8 184 IC EC 236
|
||||
GD B9 185 ID ED 237
|
||||
GE BA 186 IE EE 238
|
||||
GF BB 187 IF EF 239
|
||||
GG BC 188 IG F0 240
|
||||
GH BD 189 IH F1 241
|
||||
GI BE 190 II F2 242
|
||||
GJ BF 191 IJ F3 243
|
||||
GK C0 192 IK F4 244
|
||||
GL C1 193 IL F5 245
|
||||
GM C2 194 IM F6 246
|
||||
GN C3 195 IN F7 247
|
||||
GO C4 196 IO F8 248
|
||||
GP C5 197 IP F9 249
|
||||
GQ C6 198 IQ FA 250
|
||||
GR C7 199 IR FB 251
|
||||
GS C8 200 IS FC 252
|
||||
GT C9 201 IT FD 253
|
||||
GU CA 202 IU FE 254
|
||||
GV CB 203 IV FF 255
|
||||
GW CC 204
|
||||
GX CD 205
|
||||
GY CE 206
|
||||
GZ CF 207
|
||||
|
||||
|
||||
|
||||
|
||||
ANALYSIS OF 1-2-3 WORKSHEET FILE
|
||||
|
||||
The worksheet shown below was created in 1-2-3 and saved to disk.
|
||||
|
||||
|
||||
|
||||
Key:
|
||||
|
||||
A2..A5 Named Range (code 11)
|
||||
EXAMPLE A2: Label (code 15)
|
||||
100 A3: Integer (code 13)
|
||||
12.5 A4: Number (code 14)
|
||||
87.5 A5: Formula (+A3-A4)
|
||||
(code 16)
|
||||
|
||||
|
||||
The example shown below is a partial hex dump of this worksheet file. By
|
||||
reading each record header, you can determine the type of record you are
|
||||
encountering. The record header will also tell you the length of the body that
|
||||
follows the header. By analyzing the record header, you can read the
|
||||
records you want and skip unrelated records.
|
||||
|
||||
|
||||
362B:0100 06 00 08 00 00 00 00 00 00 00
|
||||
362B:0110 04 00 2F 00 01 00 01 02 00 01 00 FF 03 00 01 00
|
||||
362B:0120 00 04 00 01 00 00 05 00 01 00 FF 07 00 1F 00 00
|
||||
362B:0130 00 01 00 71 00 09 00 08 00 14 00 00 00 00 00 00
|
||||
362B:0140 00 00 00 00 00 00 00 04 00 04 00 48 00 00 0B 00
|
||||
362B:0150 18 00 54 45 53 54 00 00 00 00 00 00 00 00 00 00
|
||||
362B:0160 00 00 00 00 01 00 00 00 04 00 18 00 19 00 00 FF
|
||||
362B:0170 FF 00 00 FF FF 00 00 FF FF 00 00 FF FF 00 00 FF
|
||||
362B:0180
|
||||
|
||||
|
||||
362B:05C0
|
||||
362B:05D0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
|
||||
362B:05E0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
|
||||
362B:05F0 00 00 00 00 71 71 01 00 0F 00 0E 00 FF 00 00 01
|
||||
362B:0600 00 27 45 58 41 4D 50 4C 45 00 0D 00 07 00 FF 00
|
||||
362B:0610 00 02 00 64 00
|
||||
362B:0620 10 00 1B 00 FF 00 00 04 00 00
|
||||
362B:0630 00 00 00 00 E0 55 40 0C 00 01 00 80 FE BF 01 00
|
||||
362B:0640 80 FF BF 0A 03
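
Reading the dump record by record can be sketched as below. The sample bytes are copied from offsets 05F8 to 0614 of the dump (the LABEL and INTEGER records). The function names are hypothetical, and the INTEGER body layout used here (format byte, column word, row word, signed 16-bit value) is an assumption based on the 7-byte length and the bytes shown; the authoritative layout is in the per-record reference.

```python
import struct


def iter_records(buf):
    """Yield (record_type, body) pairs by following each 4-byte header."""
    pos = 0
    while pos + 4 <= len(buf):
        record_type, length = struct.unpack_from("<HH", buf, pos)
        body = buf[pos + 4:pos + 4 + length]
        if len(body) < length:
            raise ValueError("record body runs past end of data")
        yield record_type, body
        pos += 4 + length              # skip to the next header


def decode_integer(body):
    """Decode an INTEGER (code 13) body under the assumed layout."""
    fmt, col, row, value = struct.unpack("<BHHh", body)
    return fmt, col, row, value


# LABEL record for A2 followed by the INTEGER record for A3.
sample = bytes.fromhex(
    "0F000E00FF00000100274558414D504C4500"
    "0D000700FF000002006400"
)
records = list(iter_records(sample))
```

The first record has type 15 (LABEL) and a body ending in the zero-terminated text `'EXAMPLE`; the second has type 13 (INTEGER) and decodes to column 0, row 2, value 100, which is cell A3 in the key above. Unwanted record types can be skipped simply by ignoring their bodies.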
|
||||
|