sylk and xlsb_short_records

parent a2d9e018bf
commit 89a4acbcdf

README.md (12 changed lines)
@@ -1,2 +1,10 @@
-# notes
-Various file format notes
+# SheetJS File Format Notes
+
+Various spreadsheet file format notes.
+
+- [Symbolic Link (SLK/SYLK)](/sylk/README.md)
+- [XLSB Short Records](/xlsb_short_records/README.md)
+
+Project sponsored by [SheetJS](https://sheetjs.com)
+
+[](https://github.com/SheetJS/notes)

_config.yml (new file, 1 changed line)
@@ -0,0 +1 @@
title: SheetJS File Format Notes

sylk/README.md (new file, 83 changed lines)
@@ -0,0 +1,83 @@
# Symbolic Link format

Files start with `ID` (`0x49 0x44`). Files are interpreted as plaintext in the
system ANSI codepage.


## Basics

The file consists of a series of plaintext records. Records are separated by
newline characters (both `\r\n` and `\n` newlines are accepted by newer versions
of Excel, but generated files should prefer CRLF).

### Fields

A record consists of a record type and a series of fields. Each part of the
record is separated by a single `;` character.

The literal semicolon is encoded as two consecutive semicolons `;;`. Example:

```
C;Y1;X1;K"abc;;def"
```
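
The splitting rule above can be sketched in Python. This is a minimal illustration, not Excel's actual parser: the function name `split_fields` is hypothetical, and it assumes the record has already been stripped of its line terminator.

```python
def split_fields(record: str) -> list:
    """Split a SYLK record on single semicolons; `;;` is one literal semicolon."""
    fields = []
    current = []
    i = 0
    while i < len(record):
        ch = record[i]
        if ch == ";":
            if i + 1 < len(record) and record[i + 1] == ";":
                current.append(";")          # escaped semicolon stays in the field
                i += 2
                continue
            fields.append("".join(current))  # single semicolon ends the field
            current = []
        else:
            current.append(ch)
        i += 1
    fields.append("".join(current))
    return fields


print(split_fields('C;Y1;X1;K"abc;;def"'))
# ['C', 'Y1', 'X1', 'K"abc;def"']
```

The first element is the record type and the remaining elements are the fields, each starting with its one-letter field tag.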

### Encoding

In addition to the escaped semicolon, Excel understands two types of encodings:

#### Raw Byte Trigrams

Trigrams matching the pattern `\x1B[\x20-\x2F][\x30-\x3F]` are decoded into a
single byte whose high four bits are the low nibble of the second character and
whose low four bits are the low nibble of the third character.

For example, `"\x1B :"` (`"\x1B\x20\x3A"`) encodes the byte `"\x0A"` (newline).

`"\x1B#;"` encodes a literal semicolon.
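
A minimal Python sketch of the trigram decoding described above. The names `TRIGRAM` and `decode_trigrams` are hypothetical, and the sketch works on raw bytes on the assumption that codepage decoding happens afterwards.

```python
import re

# ESC, then one byte in 0x20-0x2F, then one byte in 0x30-0x3F
TRIGRAM = re.compile(rb"\x1b([\x20-\x2f])([\x30-\x3f])")


def decode_trigrams(data: bytes) -> bytes:
    """Replace every raw byte trigram with the single byte it encodes."""
    def to_byte(match):
        high = match.group(1)[0] & 0x0F   # low nibble of the second character
        low = match.group(2)[0] & 0x0F    # low nibble of the third character
        return bytes([(high << 4) | low])
    return TRIGRAM.sub(to_byte, data)


print(decode_trigrams(b"\x1b :"))   # b'\n'
print(decode_trigrams(b"\x1b#;"))   # b';'
```

Sequences starting with `\x1BN` are left untouched by this pattern, because `N` (`0x4E`) is outside the `0x20-0x2F` range.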

#### Special Escapes

Excel also understands a set of special escapes that start with `\x1BN`. For
clarity, the `\x1BN` part is not included in the table:

| sequence | text |
|:---------|:-----|
| `AA`     | `À`  |


## Record Types

| Record Type | Description          |
|:------------|:---------------------|
| `ID`        | Header               |
| `E`         | EOF                  |
| `B`         | Worksheet Dimensions |
| `O`         | Options              |
| `P`         | Number Format        |
| `F`         | Formatting           |
| `C`         | Cell                 |


## EOF Record (E)

There are no fields.


## Cell Record (C)


### Comments

The `A` field of the `C` record can specify plaintext comments. They are encoded
using the same text encoding as `K` fields.

### Shared Formulae

The `S` field of the `C` record signals that a cell is using a shared formula.
The `R` and `C` fields are the 1-indexed row and column indices of the cell with
the formula. The formula should be extracted from the original location and
shifted to the current cell (relative references adjusted by the offset).
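
The lookup can be sketched in Python using the three `C` records from `shared_formula.slk`. Everything here is illustrative: `resolve_formulas` and `to_absolute` are hypothetical names, and the sketch assumes that an omitted `X` or `Y` field reuses the value from the previous record (as the sample file suggests) and that `E` fields hold formulas in R1C1 notation. Under that assumption a relative reference such as `R[-1]C` is already an offset, so reusing the formula text at the new cell performs the shift; `to_absolute` only makes the result visible.

```python
import re

SAMPLE = [
    "C;Y1;X1;K1",
    "C;Y2;K2;ER[-1]C+1",
    "C;Y3;K3;S;R2;C1",
]

# R1C1 references such as R[-1]C, RC[2] or R3C4 (sketch only, no function names)
REF = re.compile(r"R(\[-?\d+\]|\d+)?C(\[-?\d+\]|\d+)?")


def to_absolute(formula, row, col):
    """Rewrite relative R1C1 references as absolute ones for the cell (row, col)."""
    def fix(part, base):
        if not part:                  # bare R or C: same row or column
            return str(base)
        if part.startswith("["):      # [n]: offset from the current cell
            return str(base + int(part[1:-1]))
        return part                   # already absolute
    return REF.sub(
        lambda m: "R" + fix(m.group(1), row) + "C" + fix(m.group(2), col), formula
    )


def resolve_formulas(records):
    """Map (row, col) to formula text, following S;R;C shared formula fields."""
    formulas = {}
    row = col = 1
    for record in records:
        fields = record.split(";")    # the sample has no escaped semicolons
        if fields[0] != "C":
            continue
        shared, src_row, src_col, formula = False, None, None, None
        for field in fields[1:]:
            tag, value = field[:1], field[1:]
            if tag == "Y":
                row = int(value)
            elif tag == "X":
                col = int(value)
            elif tag == "E":
                formula = value
            elif tag == "S":
                shared = True
            elif tag == "R":
                src_row = int(value)
            elif tag == "C":
                src_col = int(value)
        if shared:
            formula = formulas[(src_row, src_col)]   # copy from the source cell
        if formula is not None:
            formulas[(row, col)] = formula
    return formulas


formulas = resolve_formulas(SAMPLE)
print(to_absolute(formulas[(2, 1)], 2, 1))   # R1C1+1
print(to_absolute(formulas[(3, 1)], 3, 1))   # R2C1+1
```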

[](https://github.com/SheetJS/notes)

sylk/comment.slk (new file, 10 changed lines)
@@ -0,0 +1,10 @@
ID;PWXL;N;E
P;PGeneral
F;P0;DG0G10;M320
B;Y3;X1;D0 0 9 0
C;Y1;X1;AArthas: :I would gladly bear any curse to save my homeland.
C;Y2;X2;AMuradin: :Leave it be, Arthas. Forget this business and lead your men home.
C;Y1;X1;K1
C;Y1;X2;K2
C;Y2;X1;K3
E

sylk/shared_formula.slk (new file, 8 changed lines)
@@ -0,0 +1,8 @@
ID;PWXL;N;E
P;PGeneral
F;P0;DG0G10;M320
B;Y3;X1;D0 0 9 0
C;Y1;X1;K1
C;Y2;K2;ER[-1]C+1
C;Y3;K3;S;R2;C1
E

xlsb_short_records/README.md (new file, 90 changed lines)
@@ -0,0 +1,90 @@
# XLSB Short Records

There are 7 undocumented XLSB records (record types 12-18) that Excel supports.
They appear to specify cells using a "Short" cell structure.

## Cell Structures

XLSB Cell structures are 8 bytes with the following layout:

```
column index (4 bytes)
style index (3 bytes)
flags (1 byte)
```

A "Short" structure is 4 bytes and omits the column:

```
style index (3 bytes)
flags (1 byte)
```

The actual column index is understood to be the column after the previous cell.
For example, if D3 was the last cell, a record using the Short structure
defines cell E3.

## Cell Records

The various cell records (BrtCellBlank, BrtCellBool, etc.) consist of a Cell
structure followed by the cell data. The various formula records (BrtFmlaBool,
BrtFmlaError, etc.) append the formula structure to the base cell record.

The "Short" cell records follow similar patterns but omit the 4-byte column
field from the cell structure.

For example, record type 18 "BrtShortIsst" is the short form of BrtCellIsst.

BrtCellIsst has the following layout:

```
column index (4 bytes)
style index (3 bytes)
flags (1 byte)
shared string table index (4 bytes)
```

BrtShortIsst omits the column index:

```
style index (3 bytes)
flags (1 byte)
shared string table index (4 bytes)
```
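
As an illustration of the two layouts, the following Python sketch reads both payloads into the same dictionary shape. The function names and dictionary keys are hypothetical, and little-endian byte order and zero-based columns are assumed.

```python
import struct


def parse_brt_cell_isst(payload: bytes) -> dict:
    """12-byte payload: column (4), style index (3), flags (1), SST index (4)."""
    (col,) = struct.unpack_from("<I", payload, 0)
    style = int.from_bytes(payload[4:7], "little")
    (isst,) = struct.unpack_from("<I", payload, 8)
    return {"col": col, "style": style, "flags": payload[7], "isst": isst}


def parse_brt_short_isst(payload: bytes, prev_col: int) -> dict:
    """8-byte payload: style index (3), flags (1), SST index (4)."""
    style = int.from_bytes(payload[0:3], "little")
    (isst,) = struct.unpack_from("<I", payload, 4)
    # The column is implied: one past the column of the previous cell.
    return {"col": prev_col + 1, "style": style, "flags": payload[3], "isst": isst}


long_form = parse_brt_cell_isst(bytes([4, 0, 0, 0, 1, 0, 0, 0, 7, 0, 0, 0]))
short_form = parse_brt_short_isst(bytes([1, 0, 0, 0, 7, 0, 0, 0]), prev_col=3)
print(long_form == short_form)   # True
```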

## Records

| Record | Name          | Long Cell Record |
|-------:|:--------------|:-----------------|
| `12`   | BrtShortBlank | BrtCellBlank     |
| `13`   | BrtShortRk    | BrtCellRk        |
| `14`   | BrtShortError | BrtCellError     |
| `15`   | BrtShortBool  | BrtCellBool      |
| `16`   | BrtShortReal  | BrtCellReal      |
| `17`   | BrtShortSt    | BrtCellSt        |
| `18`   | BrtShortIsst  | BrtCellIsst      |

Record 13 is informally referred to as "BrtShortRk". It is the short form of
BrtCellRk. BrtCellRk is a 12-byte structure:

```
column index (4 bytes)
style index (3 bytes)
flags (1 byte)
value stored as RkNumber (4 bytes)
```

The short form BrtShortRk is therefore an 8-byte structure:

```
style index (3 bytes)
flags (1 byte)
value stored as RkNumber (4 bytes)
```
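
A Python sketch of reading the 8-byte short form, including the RkNumber value. The names `decode_rk` and `parse_brt_short_rk` are hypothetical. The decoding follows the usual RkNumber layout (bit 0 means "divide by 100", bit 1 means "integer", and the upper 30 bits hold either a signed integer or the top 30 bits of an IEEE 754 double); little-endian byte order and zero-based columns are assumed.

```python
import struct


def decode_rk(raw: int):
    """Decode a 32-bit RkNumber into a Python number."""
    div100 = raw & 0x01
    is_int = raw & 0x02
    if is_int:
        # Reinterpret as signed, then drop the two flag bits.
        value = struct.unpack("<i", struct.pack("<I", raw))[0] >> 2
    else:
        # The upper 30 bits are the top of a double; all lower bits are zero.
        value = struct.unpack("<d", struct.pack("<II", 0, raw & 0xFFFFFFFC))[0]
    return value / 100 if div100 else value


def parse_brt_short_rk(payload: bytes, prev_col: int) -> dict:
    """8-byte payload: style index (3), flags (1), RkNumber (4)."""
    style = int.from_bytes(payload[0:3], "little")
    (raw,) = struct.unpack_from("<I", payload, 4)
    return {"col": prev_col + 1, "style": style, "flags": payload[3],
            "value": decode_rk(raw)}


print(decode_rk(0x3FF00000))   # 1.0
print(decode_rk(6))            # 1
```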

## Test Files

- [`brt_str.xlsb`](./brt_str.xlsb) includes types 12, 13, 14, 15, 16, 17
- [`brt_sst.xlsb`](./brt_sst.xlsb) includes types 12, 13, 14, 15, 16, 18

[](https://github.com/SheetJS/notes)

xlsb_short_records/brt_sst.xlsb (new binary file, not shown)
xlsb_short_records/brt_str.xlsb (new binary file, not shown)