From 2b3ea9dd7efb35a586bbcef79a2993e4ba2c5ce3 Mon Sep 17 00:00:00 2001 From: SheetJS Date: Wed, 1 Jun 2022 18:59:29 -0400 Subject: [PATCH] A1 --- docz/docs/05-interface.md | 21 +- docz/docs/07-csf/01-general.md | 266 +++++++++--------- docz/docs/07-csf/02-cell.md | 129 +++++++++ docz/docs/07-csf/{02-sheet.md => 03-sheet.md} | 2 +- docz/docs/07-csf/{03-book.md => 04-book.md} | 5 +- docz/docs/07-csf/07-features/01-formulae.md | 45 +-- 6 files changed, 281 insertions(+), 187 deletions(-) create mode 100644 docz/docs/07-csf/02-cell.md rename docz/docs/07-csf/{02-sheet.md => 03-sheet.md} (99%) rename docz/docs/07-csf/{03-book.md => 04-book.md} (97%) diff --git a/docz/docs/05-interface.md b/docz/docs/05-interface.md index cf9ff0c..0106a4e 100644 --- a/docz/docs/05-interface.md +++ b/docz/docs/05-interface.md @@ -34,8 +34,18 @@ Write options are described in the [Writing Options](./api/write-options) sectio ## Utilities -Utilities are available in the `XLSX.utils` object and are described in the -[Utility Functions](./api/utilities) section: +Utilities are available in the `XLSX.utils` object. + +The following are described in [A1 Utilities](./csf/general#a1-utilities): + +**Cell and cell address manipulation:** + +- `encode_row / decode_row` converts between 0-indexed rows and 1-indexed rows. +- `encode_col / decode_col` converts between 0-indexed columns and column names. +- `encode_cell / decode_cell` converts cell addresses. +- `encode_range / decode_range` converts cell ranges. + +The following are described in the [Utility Functions](./api/utilities) section: **Constructing:** @@ -60,11 +70,6 @@ Utilities are available in the `XLSX.utils` object and are described in the - `sheet_to_formulae` generates a list of the formulae (with value fallbacks). -**Cell and cell address manipulation:** +**Miscellaneous:** - `format_cell` generates the text value for a cell (using number formats). -- `encode_row / decode_row` converts between 0-indexed rows and 1-indexed rows. 
-- `encode_col / decode_col` converts between 0-indexed columns and column names. -- `encode_cell / decode_cell` converts cell addresses. -- `encode_range / decode_range` converts cell ranges. - diff --git a/docz/docs/07-csf/01-general.md b/docz/docs/07-csf/01-general.md index 36d690c..d88f984 100644 --- a/docz/docs/07-csf/01-general.md +++ b/docz/docs/07-csf/01-general.md @@ -2,153 +2,157 @@ sidebar_position: 1 --- -# Core Concepts +# Addresses and Ranges The "Common Spreadsheet Format" (CSF) is the object model used by SheetJS. -## Cell Addresses and Ranges +## Cell Addresses Cell address objects are stored as `{c:C, r:R}` where `C` and `R` are 0-indexed column and row numbers, respectively. For example, the cell address `B5` is represented by the object `{c:1, r:4}`. +## Cell Ranges + Cell range objects are stored as `{s:S, e:E}` where `S` is the first cell and `E` is the last cell in the range. The ranges are inclusive. For example, the range `A3:B7` is represented by the object `{s:{c:0, r:2}, e:{c:1, r:6}}`. 
-Utility functions perform a row-major order walk traversal of a sheet range: + +### Column and Row Ranges + +A column range (spanning every row) is represented with the starting row `0` and +the ending row `1048575`: ```js -for(var R = range.s.r; R <= range.e.r; ++R) { - for(var C = range.s.c; C <= range.e.c; ++C) { - var cell_address = {c:C, r:R}; - /* if an A1-style address is needed, encode the address */ - var cell_ref = XLSX.utils.encode_cell(cell_address); - } -} +{ s: { c: 0, r: 0 }, e: { c: 0, r: 1048575 } } // A:A +{ s: { c: 1, r: 0 }, e: { c: 2, r: 1048575 } } // B:C ``` -## Cell Object - -Cell objects are plain JS objects with keys and values following the convention: - -| Key | Description | -| --- | ---------------------------------------------------------------------- | -| | **Core Cell Properties** ([More Info](#data-types)) | -| `v` | raw value (number, string, Date object, boolean) | -| `t` | type: `b` Boolean, `e` Error, `n` Number, `d` Date, `s` Text, `z` Stub | -| | **Number Formats** ([More Info](./features#number-formats)) | -| `z` | number format string associated with the cell (if requested) | -| `w` | formatted text (if applicable) | -| | **Formulae** ([More Info](./features/formulae)) | -| `f` | cell formula encoded as an A1-style string (if applicable) | -| `F` | range of enclosing array if formula is array formula (if applicable) | -| `D` | if true, array formula is dynamic (if applicable) | -| | **Other Cell Properties** ([More Info](./features)) | -| `l` | cell hyperlink and tooltip ([More Info](./features/hyperlinks)) | -| `c` | cell comments ([More Info](./features#cell-comments)) | -| `r` | rich text encoding (if applicable) | -| `h` | HTML rendering of the rich text (if applicable) | -| `s` | the style/theme of the cell (if applicable) | - -Built-in export utilities (such as the CSV exporter) will use the `w` text if it -is available. 
To change a value, be sure to delete `cell.w` (or set it to -`undefined`) before attempting to export. The utilities will regenerate the `w` -text from the number format (`cell.z`) and the raw value if possible. - -The actual array formula is stored in the `f` field of the first cell in the -array range. Other cells in the range will omit the `f` field. - -### Data Types - -The raw value is stored in the `v` value property, interpreted based on the `t` -type property. This separation allows for representation of numbers as well as -numeric text. There are 6 valid cell types: - -| Type | Description | -| :--: | :-------------------------------------------------------------------- | -| `b` | Boolean: value interpreted as JS `boolean` | -| `e` | Error: value is a numeric code and `w` property stores common name ** | -| `n` | Number: value is a JS `number` ** | -| `d` | Date: value is a JS `Date` object or string to be parsed as Date ** | -| `s` | Text: value interpreted as JS `string` and written as text ** | -| `z` | Stub: blank stub cell that is ignored by data processing utilities ** | - -
- Error values and interpretation (click to show) - -| Value | Error Meaning | -| -----: | :-------------- | -| `0x00` | `#NULL!` | -| `0x07` | `#DIV/0!` | -| `0x0F` | `#VALUE!` | -| `0x17` | `#REF!` | -| `0x1D` | `#NAME?` | -| `0x24` | `#NUM!` | -| `0x2A` | `#N/A` | -| `0x2B` | `#GETTING_DATA` | - -
- -Type `n` is the Number type. This includes all forms of data that Excel stores -as numbers, such as dates/times and Boolean fields. Excel exclusively uses data -that can be fit in an IEEE754 floating point number, just like JS Number, so the -`v` field holds the raw number. The `w` field holds formatted text. Dates are -stored as numbers by default and converted with `XLSX.SSF.parse_date_code`. - -Type `d` is the Date type, generated only when the option `cellDates` is passed. -Since JSON does not have a natural Date type, parsers are generally expected to -store ISO 8601 Date strings like you would get from `date.toISOString()`. On -the other hand, writers and exporters should be able to handle date strings and -JS Date objects. Note that Excel disregards timezone modifiers and treats all -dates in the local timezone. The library does not correct for this error. - -Type `s` is the String type. Values are explicitly stored as text. Excel will -interpret these cells as "number stored as text". Generated Excel files -automatically suppress that class of error, but other formats may elicit errors. - -Type `z` represents blank stub cells. They are generated in cases where cells -have no assigned value but hold comments or other metadata. They are ignored by -the core library data processing utility functions. By default these cells are -not generated; the parser `sheetStubs` option must be set to `true`. - - -#### Dates - -
- Excel Date Code details (click to show) - -By default, Excel stores dates as numbers with a format code that specifies date -processing. For example, the date `19-Feb-17` is stored as the number `42785` -with a number format of `d-mmm-yy`. The `SSF` module understands number formats -and performs the appropriate conversion. - -XLSX also supports a special date type `d` where the data is an ISO 8601 date -string. The formatter converts the date back to a number. - -The default behavior for all parsers is to generate number cells. Setting -`cellDates` to true will force the generators to store dates. - -
- -
- Time Zones and Dates (click to show) - -Excel has no native concept of universal time. All times are specified in the -local time zone. Excel limitations prevent specifying true absolute dates. - -Following Excel, this library treats all dates as relative to local time zone. - -
- -
- Epochs: 1900 and 1904 (click to show) - -Excel supports two epochs (January 1 1900 and January 1 1904). -The workbook's epoch can be determined by examining the workbook's -`wb.Workbook.WBProps.date1904` property: +A row range (spanning every column) is represented with the starting column `0` +and the ending column `16383`: ```js -!!(((wb.Workbook||{}).WBProps||{}).date1904) +{ s: { c: 0, r: 0 }, e: { c: 16383, r: 0 } } // 1:1 +{ s: { c: 0, r: 1 }, e: { c: 16383, r: 2 } } // 2:3 ``` -
+# Common Spreadsheet Address Styles + +## A1-Style + +A1-style is the default address style in Lotus 1-2-3 and Excel. + +Columns are specified with letters, counting from `A` to `Z`, then `AA` to `ZZ`, +then `AAA`. Some sample values, along with SheetJS column indices, are listed: + +| Ordinal | A1 Name | SheetJS | +|:--------|:--------|--------:| +| First | `A` | `0` | +| Second | `B` | `1` | +| 26th | `Z` | `25` | +| 27th | `AA` | `26` | +| 702nd | `ZZ` | `701` | +| 703rd | `AAA` | `702` | +| 16384th | `XFD` | `16383` | + +Rows are specified with numbers, starting from `1` for the first row. SheetJS +APIs that take row indices start from `0` (ECMAScript convention). + +A cell address is the concatenation of column text and row number. For example, +the cell in the third column and fourth row is "C4". + +A cell range is represented as the top-left cell of the range, followed by `:`, +followed by the bottom-right cell of the range. For example, the range `"C2:D4"` +includes 6 cells marked with ▒ in the table below: + + + + + + + + +
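|   | A | B | C | D | E |
|:-:|:-:|:-:|:-:|:-:|:-:|
| 1 |   |   |   |   |   |
| 2 |   |   | ▒ | ▒ |   |
| 3 |   |   | ▒ | ▒ |   |
| 4 |   |   | ▒ | ▒ |   |
| 5 |   |   |   |   |   |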
+ +A column range is represented by the left-most column, followed by `:`, followed +by the right-most column. For example, the range `C:D` represents the third and +fourth columns. + +A row range is represented by the top-most row, followed by `:`, followed by the +bottom-most row. For example, `2:4` represents the second/third/fourth rows. + +### A1 Utilities + +#### Column Names + +_Get the SheetJS index from an A1-Style column_ + +```js +var col_index = XLSX.utils.decode_col("D"); +``` + +The argument is expected to be a string representing a column. + +_Get the A1-Style column string from a SheetJS index_ + +```js +var col_name = XLSX.utils.encode_col(3); +``` + +The argument is expected to be a SheetJS column (non-negative integer). + +#### Row Names + +_Get the SheetJS index from an A1-Style row_ + +```js +var row_index = XLSX.utils.decode_row("4"); +``` + +The argument is expected to be a string representing a row. + +_Get the A1-Style row string from a SheetJS index_ + +```js +var row_name = XLSX.utils.encode_row(3); +``` + +The argument is expected to be a SheetJS row (non-negative integer). + +#### Cell Addresses + +_Generate a SheetJS cell address from an A1-Style address string_ + +```js +var address = XLSX.utils.decode_cell("A2"); +``` + +The argument is expected to be a string representing a single cell address. + +_Generate an A1-style address string from a SheetJS cell address_ + +```js +var a1_addr = XLSX.utils.encode_cell({r:1, c:0}); +``` + +The argument is expected to be a SheetJS cell address. + +#### Cell Ranges + +_Generate a SheetJS cell range from an A1-style range string_ + +```js +var range = XLSX.utils.decode_range("A1:D3"); +``` + +The argument is expected to be a string representing a range or a single cell +address. 
The single cell address is interpreted as a single cell range, so +`XLSX.utils.decode_range("D3")` is the same as `XLSX.utils.decode_range("D3:D3")`. + +_Generate an A1-style range string from a SheetJS cell range_ + +```js +var a1_range = XLSX.utils.encode_range({ s: { c: 0, r: 0 }, e: { c: 3, r: 2 } }); +``` + +The argument is expected to be a SheetJS cell range. diff --git a/docz/docs/07-csf/02-cell.md b/docz/docs/07-csf/02-cell.md new file mode 100644 index 0000000..cccb383 --- /dev/null +++ b/docz/docs/07-csf/02-cell.md @@ -0,0 +1,129 @@ +--- +sidebar_position: 2 +--- + +# Cell Object + +Cell objects are plain JS objects with keys and values following the convention: + +| Key | Description | +| --- | ---------------------------------------------------------------------- | +| | **Core Cell Properties** ([More Info](#data-types)) | +| `v` | raw value (number, string, Date object, boolean) | +| `t` | type: `b` Boolean, `e` Error, `n` Number, `d` Date, `s` Text, `z` Stub | +| | **Number Formats** ([More Info](./features#number-formats)) | +| `z` | number format string associated with the cell (if requested) | +| `w` | formatted text (if applicable) | +| | **Formulae** ([More Info](./features/formulae)) | +| `f` | cell formula encoded as an A1-style string (if applicable) | +| `F` | range of enclosing array if formula is array formula (if applicable) | +| `D` | if true, array formula is dynamic (if applicable) | +| | **Other Cell Properties** ([More Info](./features)) | +| `l` | cell hyperlink and tooltip ([More Info](./features/hyperlinks)) | +| `c` | cell comments ([More Info](./features#cell-comments)) | +| `r` | rich text encoding (if applicable) | +| `h` | HTML rendering of the rich text (if applicable) | +| `s` | the style/theme of the cell (if applicable) | + +Built-in export utilities (such as the CSV exporter) will use the `w` text if it +is available. To change a value, be sure to delete `cell.w` (or set it to +`undefined`) before attempting to export. 
The utilities will regenerate the `w` +text from the number format (`cell.z`) and the raw value if possible. + +The actual array formula is stored in the `f` field of the first cell in the +array range. Other cells in the range will omit the `f` field. + +### Data Types + +The raw value is stored in the `v` value property, interpreted based on the `t` +type property. This separation allows for representation of numbers as well as +numeric text. There are 6 valid cell types: + +| Type | Description | +| :--: | :-------------------------------------------------------------------- | +| `b` | Boolean: value interpreted as JS `boolean` | +| `e` | Error: value is a numeric code and `w` property stores common name ** | +| `n` | Number: value is a JS `number` ** | +| `d` | Date: value is a JS `Date` object or string to be parsed as Date ** | +| `s` | Text: value interpreted as JS `string` and written as text ** | +| `z` | Stub: blank stub cell that is ignored by data processing utilities ** | + +
+ Error values and interpretation (click to show) + +| Value | Error Meaning | +| -----: | :-------------- | +| `0x00` | `#NULL!` | +| `0x07` | `#DIV/0!` | +| `0x0F` | `#VALUE!` | +| `0x17` | `#REF!` | +| `0x1D` | `#NAME?` | +| `0x24` | `#NUM!` | +| `0x2A` | `#N/A` | +| `0x2B` | `#GETTING_DATA` | + +
+ +Type `n` is the Number type. This includes all forms of data that Excel stores +as numbers, such as dates/times and Boolean fields. Excel exclusively uses data +that can be fit in an IEEE754 floating point number, just like JS Number, so the +`v` field holds the raw number. The `w` field holds formatted text. Dates are +stored as numbers by default and converted with `XLSX.SSF.parse_date_code`. + +Type `d` is the Date type, generated only when the option `cellDates` is passed. +Since JSON does not have a natural Date type, parsers are generally expected to +store ISO 8601 Date strings like you would get from `date.toISOString()`. On +the other hand, writers and exporters should be able to handle date strings and +JS Date objects. Note that Excel disregards timezone modifiers and treats all +dates in the local timezone. The library does not correct for this error. + +Type `s` is the String type. Values are explicitly stored as text. Excel will +interpret these cells as "number stored as text". Generated Excel files +automatically suppress that class of error, but other formats may elicit errors. + +Type `z` represents blank stub cells. They are generated in cases where cells +have no assigned value but hold comments or other metadata. They are ignored by +the core library data processing utility functions. By default these cells are +not generated; the parser `sheetStubs` option must be set to `true`. + + +#### Dates + +
+ Excel Date Code details (click to show) + +By default, Excel stores dates as numbers with a format code that specifies date +processing. For example, the date `19-Feb-17` is stored as the number `42785` +with a number format of `d-mmm-yy`. The `SSF` module understands number formats +and performs the appropriate conversion. + +XLSX also supports a special date type `d` where the data is an ISO 8601 date +string. The formatter converts the date back to a number. + +The default behavior for all parsers is to generate number cells. Setting +`cellDates` to true will force the generators to store dates. + +
+ +
+ Time Zones and Dates (click to show) + +Excel has no native concept of universal time. All times are specified in the +local time zone. Excel limitations prevent specifying true absolute dates. + +Following Excel, this library treats all dates as relative to local time zone. + +
+ +
+ Epochs: 1900 and 1904 (click to show) + +Excel supports two epochs (January 1 1900 and January 1 1904). +The workbook's epoch can be determined by examining the workbook's +`wb.Workbook.WBProps.date1904` property: + +```js +!!(((wb.Workbook||{}).WBProps||{}).date1904) +``` + +
diff --git a/docz/docs/07-csf/02-sheet.md b/docz/docs/07-csf/03-sheet.md similarity index 99% rename from docz/docs/07-csf/02-sheet.md rename to docz/docs/07-csf/03-sheet.md index 9d5e6a3..5cb6752 100644 --- a/docz/docs/07-csf/02-sheet.md +++ b/docz/docs/07-csf/03-sheet.md @@ -1,5 +1,5 @@ --- -sidebar_position: 2 +sidebar_position: 3 --- # Sheet Objects diff --git a/docz/docs/07-csf/03-book.md b/docz/docs/07-csf/04-book.md similarity index 97% rename from docz/docs/07-csf/03-book.md rename to docz/docs/07-csf/04-book.md index ee9065c..8373791 100644 --- a/docz/docs/07-csf/03-book.md +++ b/docz/docs/07-csf/04-book.md @@ -1,5 +1,5 @@ --- -sidebar_position: 3 +sidebar_position: 4 --- # Workbook Object @@ -19,8 +19,7 @@ standard, XLS parsing stores core properties in both places. The various file formats use different internal names for file properties. The workbook `Props` object normalizes the names: -
- File Properties (click to show) +
File Properties (click to hide) | JS Name | Excel Description | |:--------------|:-------------------------------| diff --git a/docz/docs/07-csf/07-features/01-formulae.md b/docz/docs/07-csf/07-features/01-formulae.md index 89d826e..460424e 100644 --- a/docz/docs/07-csf/07-features/01-formulae.md +++ b/docz/docs/07-csf/07-features/01-formulae.md @@ -104,50 +104,7 @@ The A1-style formula string is stored in the `f` field of the cell object. Spreadsheet software typically represent formulae with a leading `=` sign, but SheetJS formulae omit the `=`. -
What is A1-style? (click to show) - -A1-style is the default in Excel. - -Columns are specified with letters, counting from `A` to `Z`, then `AA` to `ZZ`, -then `AAA`. Some sample values, along with SheetJS column indices, are listed: - -| Ordinal | A1 Name | SheetJS | -|:--------|:--------|--------:| -| First | `A` | `0` | -| Second | `B` | `1` | -| 26th | `Z` | `25` | -| 27th | `AA` | `26` | -| 702st | `ZZ` | `701` | -| 703rd | `AAA` | `702` | -| 16384th | `XFD` | `16383` | - -Rows are specified with numbers, starting from `1` for the first row. SheetJS -APIs that take row indices start from `0` (ECMAScript convention). - -A cell address is the concatenation of column text and row number. For example, -the cell in the third column and fourth row is "C4". - -A cell range is represented as the top-left cell of the range, followed by `:`, -followed by the bottom-right cell of the range. For example, the range `"C2:D4"` -includes 6 cells marked with ▒ in the table below: - - - - - - - - -
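|   | A | B | C | D | E |
|:-:|:-:|:-:|:-:|:-:|:-:|
| 1 |   |   |   |   |   |
| 2 |   |   | ▒ | ▒ |   |
| 3 |   |   | ▒ | ▒ |   |
| 4 |   |   | ▒ | ▒ |   |
| 5 |   |   |   |   |   |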
- -A column range is represented by the left-most column, followed by `:`, followed -by the right-most column. For example, the range `C:D` represents the third and -fourth columns. - -A row range is represented by the top-most row, followed by `:`, followed by the -bottom-most column. For example, `2:4` represents the second/third/fourth rows. - -
+["A1-Style"](../general#a1-style) describes A1 style in more detail. For example, consider [this test file](pathname:///files/concat.xlsx):