SheetJS 2022-06-01 18:59:29 -04:00
parent d61cb0d193
commit 2b3ea9dd7e
6 changed files with 281 additions and 187 deletions

@@ -34,8 +34,18 @@ Write options are described in the [Writing Options](./api/write-options) sectio
## Utilities
Utilities are available in the `XLSX.utils` object and are described in the
[Utility Functions](./api/utilities) section:
Utilities are available in the `XLSX.utils` object.
The following are described in [A1 Utilities](./csf/general#a1-utilities):
**Cell and cell address manipulation:**
- `encode_row / decode_row` converts between 0-indexed rows and 1-indexed rows.
- `encode_col / decode_col` converts between 0-indexed columns and column names.
- `encode_cell / decode_cell` converts cell addresses.
- `encode_range / decode_range` converts cell ranges.
The following are described in the [Utility Functions](./api/utilities) section:
**Constructing:**
@@ -60,11 +70,6 @@ Utilities are available in the `XLSX.utils` object and are described in the
- `sheet_to_formulae` generates a list of the formulae (with value fallbacks).
**Cell and cell address manipulation:**
**Miscellaneous**
- `format_cell` generates the text value for a cell (using number formats).
- `encode_row / decode_row` converts between 0-indexed rows and 1-indexed rows.
- `encode_col / decode_col` converts between 0-indexed columns and column names.
- `encode_cell / decode_cell` converts cell addresses.
- `encode_range / decode_range` converts cell ranges.

@@ -2,153 +2,157 @@
sidebar_position: 1
---
# Core Concepts
# Addresses and Ranges
The "Common Spreadsheet Format" (CSF) is the object model used by SheetJS.
## Cell Addresses and Ranges
## Cell Addresses
Cell address objects are stored as `{c:C, r:R}` where `C` and `R` are 0-indexed
column and row numbers, respectively. For example, the cell address `B5` is
represented by the object `{c:1, r:4}`.
## Cell Ranges
Cell range objects are stored as `{s:S, e:E}` where `S` is the first cell and
`E` is the last cell in the range. The ranges are inclusive. For example, the
range `A3:B7` is represented by the object `{s:{c:0, r:2}, e:{c:1, r:6}}`.
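
Because both ends of a range are inclusive, loops over a range use `<=` for the upper bounds. The following is a minimal sketch of a row-major walk over `A3:B7`; `walk_range` is a hypothetical helper written for illustration and is not part of the library:

```js
// Range A3:B7 as a range object (inclusive on both ends)
var range = { s: { c: 0, r: 2 }, e: { c: 1, r: 6 } };

// Hypothetical helper: collect every cell address in the range, row by row
function walk_range(range) {
  var out = [];
  for (var R = range.s.r; R <= range.e.r; ++R) {
    for (var C = range.s.c; C <= range.e.c; ++C) {
      out.push({ c: C, r: R });
    }
  }
  return out;
}

var cells = walk_range(range);
console.log(cells.length); // 10 (5 rows x 2 columns)
```
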
Utility functions perform a row-major order walk traversal of a sheet range:
### Column and Row Ranges
A column range (spanning every row) is represented with the starting row `0` and
the ending row `1048575`:
```js
for(var R = range.s.r; R <= range.e.r; ++R) {
for(var C = range.s.c; C <= range.e.c; ++C) {
var cell_address = {c:C, r:R};
/* if an A1-style address is needed, encode the address */
var cell_ref = XLSX.utils.encode_cell(cell_address);
}
}
{ s: { c: 0, r: 0 }, e: { c: 0, r: 1048575 } } // A:A
{ s: { c: 1, r: 0 }, e: { c: 2, r: 1048575 } } // B:C
```
## Cell Object
Cell objects are plain JS objects with keys and values following the convention:
| Key | Description |
| --- | ---------------------------------------------------------------------- |
| | **Core Cell Properties** ([More Info](#data-types)) |
| `v` | raw value (number, string, Date object, boolean) |
| `t` | type: `b` Boolean, `e` Error, `n` Number, `d` Date, `s` Text, `z` Stub |
| | **Number Formats** ([More Info](./features#number-formats)) |
| `z` | number format string associated with the cell (if requested) |
| `w` | formatted text (if applicable) |
| | **Formulae** ([More Info](./features/formulae)) |
| `f` | cell formula encoded as an A1-style string (if applicable) |
| `F` | range of enclosing array if formula is array formula (if applicable) |
| `D` | if true, array formula is dynamic (if applicable) |
| | **Other Cell Properties** ([More Info](./features)) |
| `l` | cell hyperlink and tooltip ([More Info](./features/hyperlinks)) |
| `c` | cell comments ([More Info](./features#cell-comments)) |
| `r` | rich text encoding (if applicable) |
| `h` | HTML rendering of the rich text (if applicable) |
| `s` | the style/theme of the cell (if applicable) |
Built-in export utilities (such as the CSV exporter) will use the `w` text if it
is available. To change a value, be sure to delete `cell.w` (or set it to
`undefined`) before attempting to export. The utilities will regenerate the `w`
text from the number format (`cell.z`) and the raw value if possible.
The actual array formula is stored in the `f` field of the first cell in the
array range. Other cells in the range will omit the `f` field.
### Data Types
The raw value is stored in the `v` value property, interpreted based on the `t`
type property. This separation allows for representation of numbers as well as
numeric text. There are 6 valid cell types:
| Type | Description |
| :--: | :-------------------------------------------------------------------- |
| `b` | Boolean: value interpreted as JS `boolean` |
| `e` | Error: value is a numeric code and `w` property stores common name ** |
| `n` | Number: value is a JS `number` ** |
| `d` | Date: value is a JS `Date` object or string to be parsed as Date ** |
| `s` | Text: value interpreted as JS `string` and written as text ** |
| `z` | Stub: blank stub cell that is ignored by data processing utilities ** |
<details>
<summary><b>Error values and interpretation</b> (click to show)</summary>
| Value | Error Meaning |
| -----: | :-------------- |
| `0x00` | `#NULL!` |
| `0x07` | `#DIV/0!` |
| `0x0F` | `#VALUE!` |
| `0x17` | `#REF!` |
| `0x1D` | `#NAME?` |
| `0x24` | `#NUM!` |
| `0x2A` | `#N/A` |
| `0x2B` | `#GETTING_DATA` |
</details>
Type `n` is the Number type. This includes all forms of data that Excel stores
as numbers, such as dates/times and Boolean fields. Excel exclusively uses data
that can be fit in an IEEE754 floating point number, just like JS Number, so the
`v` field holds the raw number. The `w` field holds formatted text. Dates are
stored as numbers by default and converted with `XLSX.SSF.parse_date_code`.
Type `d` is the Date type, generated only when the option `cellDates` is passed.
Since JSON does not have a natural Date type, parsers are generally expected to
store ISO 8601 Date strings like you would get from `date.toISOString()`. On
the other hand, writers and exporters should be able to handle date strings and
JS Date objects. Note that Excel disregards timezone modifiers and treats all
dates in the local timezone. The library does not correct for this error.
Type `s` is the String type. Values are explicitly stored as text. Excel will
interpret these cells as "number stored as text". Generated Excel files
automatically suppress that class of error, but other formats may elicit errors.
Type `z` represents blank stub cells. They are generated in cases where cells
have no assigned value but hold comments or other metadata. They are ignored by
the core library data processing utility functions. By default these cells are
not generated; the parser `sheetStubs` option must be set to `true`.
#### Dates
<details>
<summary><b>Excel Date Code details</b> (click to show)</summary>
By default, Excel stores dates as numbers with a format code that specifies date
processing. For example, the date `19-Feb-17` is stored as the number `42785`
with a number format of `d-mmm-yy`. The `SSF` module understands number formats
and performs the appropriate conversion.
XLSX also supports a special date type `d` where the data is an ISO 8601 date
string. The formatter converts the date back to a number.
The default behavior for all parsers is to generate number cells. Setting
`cellDates` to true will force the generators to store dates.
</details>
<details>
<summary><b>Time Zones and Dates</b> (click to show)</summary>
Excel has no native concept of universal time. All times are specified in the
local time zone. Excel limitations prevent specifying true absolute dates.
Following Excel, this library treats all dates as relative to local time zone.
</details>
<details>
<summary><b>Epochs: 1900 and 1904</b> (click to show)</summary>
Excel supports two epochs (January 1 1900 and January 1 1904).
The workbook's epoch can be determined by examining the workbook's
`wb.Workbook.WBProps.date1904` property:
A row range (spanning every column) is represented with the starting col `0` and
the ending col `16383`:
```js
!!(((wb.Workbook||{}).WBProps||{}).date1904)
{ s: { c: 0, r: 0 }, e: { c: 16383, r: 0 } } // 1:1
{ s: { c: 0, r: 1 }, e: { c: 16383, r: 2 } } // 2:3
```
</details>
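
The column and row range conventions can be wrapped in small builders. This is a sketch under the limits quoted above (last row index `1048575`, last column index `16383`); `column_range` and `row_range` are hypothetical names, not library functions:

```js
var MAX_ROW = 1048575; // last 0-indexed row
var MAX_COL = 16383;   // last 0-indexed column

// Hypothetical helper: range covering every row of the given columns
function column_range(first_col, last_col) {
  return { s: { c: first_col, r: 0 }, e: { c: last_col, r: MAX_ROW } };
}

// Hypothetical helper: range covering every column of the given rows
function row_range(first_row, last_row) {
  return { s: { c: 0, r: first_row }, e: { c: MAX_COL, r: last_row } };
}

var cols_b_c = column_range(1, 2); // B:C
var rows_2_3 = row_range(1, 2);    // 2:3
```
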
# Common Spreadsheet Address Styles
## A1-Style
A1-style is the default address style in Lotus 1-2-3 and Excel.
Columns are specified with letters, counting from `A` to `Z`, then `AA` to `ZZ`,
then `AAA`. Some sample values, along with SheetJS column indices, are listed:
| Ordinal | A1 Name | SheetJS |
|:--------|:--------|--------:|
| First | `A` | `0` |
| Second | `B` | `1` |
| 26th | `Z` | `25` |
| 27th | `AA` | `26` |
| 702nd | `ZZ` | `701` |
| 703rd | `AAA` | `702` |
| 16384th | `XFD` | `16383` |
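
The naming scheme is a base-26 system with no zero digit (`A` is 1 and `Z` is 26). The sketch below reproduces the table values with two hypothetical helpers; it illustrates the arithmetic only and is not the library's implementation (in practice `XLSX.utils.decode_col` and `XLSX.utils.encode_col` perform these conversions):

```js
// Hypothetical helper: A1 column name (uppercase letters) -> 0-indexed column
function col_name_to_index(name) {
  var n = 0;
  for (var i = 0; i < name.length; ++i) {
    // 'A' has char code 65, so subtracting 64 maps A..Z to 1..26
    n = n * 26 + (name.charCodeAt(i) - 64);
  }
  return n - 1; // shift from 1-indexed to 0-indexed
}

// Hypothetical helper: 0-indexed column -> A1 column name
function index_to_col_name(idx) {
  var name = "";
  for (var n = idx + 1; n > 0; n = Math.floor((n - 1) / 26)) {
    name = String.fromCharCode(((n - 1) % 26) + 65) + name;
  }
  return name;
}

console.log(col_name_to_index("ZZ"));  // 701
console.log(index_to_col_name(16383)); // XFD
```
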
Rows are specified with numbers, starting from `1` for the first row. SheetJS
APIs that take row indices start from `0` (ECMAScript convention).
A cell address is the concatenation of column text and row number. For example,
the cell in the third column and fourth row is "C4".
A cell range is represented as the top-left cell of the range, followed by `:`,
followed by the bottom-right cell of the range. For example, the range `"C2:D4"`
includes 6 cells marked with ▒ in the table below:
<table><tbody>
<tr><th> </th><th>A</th><th>B</th><th>C</th><th>D</th><th>E</th></tr>
<tr><th>1</th><td> </td><td> </td><td> </td><td> </td><td> </td></tr>
<tr><th>2</th><td> </td><td> </td><td>▒</td><td>▒</td><td> </td></tr>
<tr><th>3</th><td> </td><td> </td><td>▒</td><td>▒</td><td> </td></tr>
<tr><th>4</th><td> </td><td> </td><td>▒</td><td>▒</td><td> </td></tr>
<tr><th>5</th><td> </td><td> </td><td> </td><td> </td><td> </td></tr>
</tbody></table>
A column range is represented by the left-most column, followed by `:`, followed
by the right-most column. For example, the range `C:D` represents the third and
fourth columns.
A row range is represented by the top-most row, followed by `:`, followed by the
bottom-most row. For example, `2:4` represents the second/third/fourth rows.
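
To make the mapping from A1 strings to 0-indexed addresses concrete, here is a small sketch that handles single cells and cell ranges only (column ranges and row ranges are out of scope). `parse_a1_cell` and `parse_a1_range` are hypothetical helpers for illustration, not library functions:

```js
// Hypothetical helper: "C4" -> { c: 2, r: 3 }
function parse_a1_cell(addr) {
  var m = /^([A-Z]+)([1-9][0-9]*)$/.exec(addr);
  if (!m) throw new Error("not a single-cell A1 address: " + addr);
  var c = 0;
  for (var i = 0; i < m[1].length; ++i) c = c * 26 + (m[1].charCodeAt(i) - 64);
  // both the column and the row shift from 1-indexed to 0-indexed
  return { c: c - 1, r: parseInt(m[2], 10) - 1 };
}

// Hypothetical helper: "C2:D4" -> { s: {...}, e: {...} }
// A lone cell address is treated as a one-cell range
function parse_a1_range(str) {
  var parts = str.split(":");
  return {
    s: parse_a1_cell(parts[0]),
    e: parse_a1_cell(parts.length > 1 ? parts[1] : parts[0])
  };
}

var rng = parse_a1_range("C2:D4");
var count = (rng.e.r - rng.s.r + 1) * (rng.e.c - rng.s.c + 1);
console.log(count); // 6
```
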
### A1 Utilities
#### Column Names
_Get the SheetJS index from an A1-Style column_
```js
var col_index = XLSX.utils.decode_col("D");
```
The argument is expected to be a string representing a column.
_Get the A1-Style column string from a SheetJS index_
```js
var col_name = XLSX.utils.encode_col(3);
```
The argument is expected to be a SheetJS column (non-negative integer).
#### Row Names
_Get the SheetJS index from an A1-Style row_
```js
var row_index = XLSX.utils.decode_row("4");
```
The argument is expected to be a string representing a row.
_Get the A1-Style row string from a SheetJS index_
```js
var row_name = XLSX.utils.encode_row(3);
```
The argument is expected to be a SheetJS row (non-negative integer).
#### Cell Addresses
_Generate a SheetJS cell address from an A1-Style address string_
```js
var address = XLSX.utils.decode_cell("A2");
```
The argument is expected to be a string representing a single cell address.
_Generate an A1-style address string from a SheetJS cell address_
```js
var a1_addr = XLSX.utils.encode_cell({r:1, c:0});
```
The argument is expected to be a SheetJS cell address.
#### Cell Ranges
_Generate a SheetJS cell range from an A1-style range string_
```js
var range = XLSX.utils.decode_range("A1:D3");
```
The argument is expected to be a string representing a range or a single cell
address. The single cell address is interpreted as a single cell range, so
`XLSX.utils.decode_range("D3")` is the same as `XLSX.utils.decode_range("D3:D3")`.
_Generate an A1-style range string from a SheetJS cell range_
```js
var a1_range = XLSX.utils.encode_range({ s: { c: 0, r: 0 }, e: { c: 3, r: 2 } });
```
The argument is expected to be a SheetJS cell range.

docz/docs/07-csf/02-cell.md Normal file

@@ -0,0 +1,129 @@
---
sidebar_position: 2
---
# Cell Object
Cell objects are plain JS objects with keys and values following the convention:
| Key | Description |
| --- | ---------------------------------------------------------------------- |
| | **Core Cell Properties** ([More Info](#data-types)) |
| `v` | raw value (number, string, Date object, boolean) |
| `t` | type: `b` Boolean, `e` Error, `n` Number, `d` Date, `s` Text, `z` Stub |
| | **Number Formats** ([More Info](./features#number-formats)) |
| `z` | number format string associated with the cell (if requested) |
| `w` | formatted text (if applicable) |
| | **Formulae** ([More Info](./features/formulae)) |
| `f` | cell formula encoded as an A1-style string (if applicable) |
| `F` | range of enclosing array if formula is array formula (if applicable) |
| `D` | if true, array formula is dynamic (if applicable) |
| | **Other Cell Properties** ([More Info](./features)) |
| `l` | cell hyperlink and tooltip ([More Info](./features/hyperlinks)) |
| `c` | cell comments ([More Info](./features#cell-comments)) |
| `r` | rich text encoding (if applicable) |
| `h` | HTML rendering of the rich text (if applicable) |
| `s` | the style/theme of the cell (if applicable) |
Built-in export utilities (such as the CSV exporter) will use the `w` text if it
is available. To change a value, be sure to delete `cell.w` (or set it to
`undefined`) before attempting to export. The utilities will regenerate the `w`
text from the number format (`cell.z`) and the raw value if possible.
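
A short sketch of that update pattern follows. The cell contents are illustrative values and `set_cell_value` is a hypothetical helper, not a library function:

```js
// A number cell as a parser might produce it (illustrative values)
var cell = { t: "n", v: 0.5, z: "0.00%", w: "50.00%" };

// Hypothetical helper: change the raw value and discard the stale text
function set_cell_value(cell, value) {
  cell.v = value;
  delete cell.w; // stale formatted text; exporters can rebuild it from z and v
  return cell;
}

set_cell_value(cell, 0.75);
console.log("w" in cell); // false
```

Leaving the old `w` in place would cause exporters that prefer `w` to emit `50.00%` even though the raw value changed.
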
The actual array formula is stored in the `f` field of the first cell in the
array range. Other cells in the range will omit the `f` field.
### Data Types
The raw value is stored in the `v` value property, interpreted based on the `t`
type property. This separation allows for representation of numbers as well as
numeric text. There are 6 valid cell types:
| Type | Description |
| :--: | :-------------------------------------------------------------------- |
| `b` | Boolean: value interpreted as JS `boolean` |
| `e` | Error: value is a numeric code and `w` property stores common name ** |
| `n` | Number: value is a JS `number` ** |
| `d` | Date: value is a JS `Date` object or string to be parsed as Date ** |
| `s` | Text: value interpreted as JS `string` and written as text ** |
| `z` | Stub: blank stub cell that is ignored by data processing utilities ** |
<details>
<summary><b>Error values and interpretation</b> (click to show)</summary>
| Value | Error Meaning |
| -----: | :-------------- |
| `0x00` | `#NULL!` |
| `0x07` | `#DIV/0!` |
| `0x0F` | `#VALUE!` |
| `0x17` | `#REF!` |
| `0x1D` | `#NAME?` |
| `0x24` | `#NUM!` |
| `0x2A` | `#N/A` |
| `0x2B` | `#GETTING_DATA` |
</details>
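
The table can be expressed as a lookup from the numeric code to the common name. This is a sketch for illustration; `ERROR_NAMES` and `error_name` are hypothetical names, not library exports:

```js
// Error codes from the table above, keyed by numeric value
var ERROR_NAMES = {
  0x00: "#NULL!",
  0x07: "#DIV/0!",
  0x0F: "#VALUE!",
  0x17: "#REF!",
  0x1D: "#NAME?",
  0x24: "#NUM!",
  0x2A: "#N/A",
  0x2B: "#GETTING_DATA"
};

// Hypothetical helper: common name for an error cell, undefined otherwise
function error_name(cell) {
  if (!cell || cell.t !== "e") return undefined;
  return ERROR_NAMES[cell.v];
}

console.log(error_name({ t: "e", v: 0x2A })); // #N/A
```
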
Type `n` is the Number type. This includes all forms of data that Excel stores
as numbers, such as dates/times and Boolean fields. Excel exclusively uses data
that can fit in an IEEE754 floating point number, just like JS Number, so the
`v` field holds the raw number. The `w` field holds formatted text. Dates are
stored as numbers by default and converted with `XLSX.SSF.parse_date_code`.
Type `d` is the Date type, generated only when the option `cellDates` is passed.
Since JSON does not have a natural Date type, parsers are generally expected to
store ISO 8601 Date strings like you would get from `date.toISOString()`. On
the other hand, writers and exporters should be able to handle date strings and
JS Date objects. Note that Excel disregards timezone modifiers and treats all
dates in the local timezone. The library does not correct for this error.
Type `s` is the String type. Values are explicitly stored as text. When the text
looks like a number, Excel will flag the cell as "number stored as text".
Generated Excel files automatically suppress that class of error, but other
formats may elicit errors.
Type `z` represents blank stub cells. They are generated in cases where cells
have no assigned value but hold comments or other metadata. They are ignored by
the core library data processing utility functions. By default these cells are
not generated; the parser `sheetStubs` option must be set to `true`.
#### Dates
<details>
<summary><b>Excel Date Code details</b> (click to show)</summary>
By default, Excel stores dates as numbers with a format code that specifies date
processing. For example, the date `19-Feb-17` is stored as the number `42785`
with a number format of `d-mmm-yy`. The `SSF` module understands number formats
and performs the appropriate conversion.
XLSX also supports a special date type `d` where the data is an ISO 8601 date
string. The formatter converts the date back to a number.
The default behavior for all parsers is to generate number cells. Setting
`cellDates` to true will force the generators to store dates.
</details>
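
As a rough illustration of the date code arithmetic, the sketch below converts a 1900-epoch date code to a JS `Date`. It assumes codes of 61 or greater (codes below that are affected by the nonexistent February 29 1900 that Excel counts) and it works in UTC so that the output does not depend on the local time zone. `date_code_to_utc` is a hypothetical helper; the library itself exposes `XLSX.SSF.parse_date_code`:

```js
// Hypothetical helper: 1900-epoch date code (>= 61) -> JS Date in UTC
function date_code_to_utc(code) {
  var MS_PER_DAY = 24 * 60 * 60 * 1000;
  // For codes >= 61, day 0 lines up with 1899-12-30
  var day_zero = Date.UTC(1899, 11, 30);
  return new Date(day_zero + Math.round(code * MS_PER_DAY));
}

console.log(date_code_to_utc(42785).toISOString()); // 2017-02-19T00:00:00.000Z
```
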
<details>
<summary><b>Time Zones and Dates</b> (click to show)</summary>
Excel has no native concept of universal time. All times are specified in the
local time zone. Excel limitations prevent specifying true absolute dates.
Following Excel, this library treats all dates as relative to local time zone.
</details>
<details>
<summary><b>Epochs: 1900 and 1904</b> (click to show)</summary>
Excel supports two epochs (January 1 1900 and January 1 1904).
The workbook's epoch can be determined by examining the workbook's
`wb.Workbook.WBProps.date1904` property:
```js
!!(((wb.Workbook||{}).WBProps||{}).date1904)
```
</details>
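
The check above can be wrapped in a function and combined with the fixed offset between the two date systems (1462 days). `is_date1904` and `to_1900_code` are hypothetical helper names used only for this sketch:

```js
// Hypothetical helper: true if the workbook uses the 1904 epoch
function is_date1904(wb) {
  return !!(((wb.Workbook || {}).WBProps || {}).date1904);
}

// Hypothetical helper: express a date code in the 1900 date system.
// Codes from 1904-epoch workbooks are 1462 days behind 1900-epoch codes.
function to_1900_code(wb, code) {
  return is_date1904(wb) ? code + 1462 : code;
}

var wb1904 = { Workbook: { WBProps: { date1904: true } } };
console.log(is_date1904({}));         // false
console.log(to_1900_code(wb1904, 0)); // 1462
```
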

@@ -1,5 +1,5 @@
---
sidebar_position: 2
sidebar_position: 3
---
# Sheet Objects

@@ -1,5 +1,5 @@
---
sidebar_position: 3
sidebar_position: 4
---
# Workbook Object
@@ -19,8 +19,7 @@ standard, XLS parsing stores core properties in both places.
The various file formats use different internal names for file properties. The
workbook `Props` object normalizes the names:
<details>
<summary><b>File Properties</b> (click to show)</summary>
<details open><summary><b>File Properties</b> (click to hide)</summary>
| JS Name | Excel Description |
|:--------------|:-------------------------------|

@@ -104,50 +104,7 @@ The A1-style formula string is stored in the `f` field of the cell object.
Spreadsheet software typically represents formulae with a leading `=` sign, but
SheetJS formulae omit the `=`.
<details><summary><b>What is A1-style?</b> (click to show)</summary>
A1-style is the default in Excel.
Columns are specified with letters, counting from `A` to `Z`, then `AA` to `ZZ`,
then `AAA`. Some sample values, along with SheetJS column indices, are listed:
| Ordinal | A1 Name | SheetJS |
|:--------|:--------|--------:|
| First | `A` | `0` |
| Second | `B` | `1` |
| 26th | `Z` | `25` |
| 27th | `AA` | `26` |
| 702nd | `ZZ` | `701` |
| 703rd | `AAA` | `702` |
| 16384th | `XFD` | `16383` |
Rows are specified with numbers, starting from `1` for the first row. SheetJS
APIs that take row indices start from `0` (ECMAScript convention).
A cell address is the concatenation of column text and row number. For example,
the cell in the third column and fourth row is "C4".
A cell range is represented as the top-left cell of the range, followed by `:`,
followed by the bottom-right cell of the range. For example, the range `"C2:D4"`
includes 6 cells marked with ▒ in the table below:
<table><tbody>
<tr><th></th><th>A</th><th>B</th><th>C</th><th>D</th><th>E</th></tr>
<tr><th>1</th><td></td><td></td><td></td><td></td><td></td></tr>
<tr><th>2</th><td></td><td></td><td>▒</td><td>▒</td><td></td></tr>
<tr><th>3</th><td></td><td></td><td>▒</td><td>▒</td><td></td></tr>
<tr><th>4</th><td></td><td></td><td>▒</td><td>▒</td><td></td></tr>
<tr><th>5</th><td></td><td></td><td></td><td></td><td></td></tr>
</tbody></table>
A column range is represented by the left-most column, followed by `:`, followed
by the right-most column. For example, the range `C:D` represents the third and
fourth columns.
A row range is represented by the top-most row, followed by `:`, followed by the
bottom-most row. For example, `2:4` represents the second/third/fourth rows.
</details>
["A1-Style"](../general#a1-style) describes A1 style in more detail.
For example, consider [this test file](pathname:///files/concat.xlsx):