This commit is contained in:
SheetJS 2023-05-18 05:21:08 -04:00
parent c62069a7e4
commit 23cb01e318
9 changed files with 225 additions and 83 deletions

@ -11,7 +11,7 @@ Cell objects are plain JS objects with keys and values following the convention:
| | **Core Cell Properties** ([More Info](#data-types)) |
| `v` | raw value (number, string, Date object, boolean) |
| `t` | type: `b` Boolean, `e` Error, `n` Number, `d` Date, `s` Text, `z` Stub |
| | **Number Formats** ([More Info](/docs/csf/features#number-formats)) |
| | **Number Formats** ([More Info](/docs/csf/features/nf)) |
| `z` | number format string associated with the cell (if requested) |
| `w` | formatted text (if applicable) |
| | **Formulae** ([More Info](/docs/csf/features/formulae)) |

@ -10,7 +10,7 @@ sidebar_position: 5
Defined names have evolved over the decades, with new features added over time:
- "English" refers to defined names with English letters and numbers (ASCII)
- "Unicode" refers to defined names that non-English characters.
- "Unicode" refers to defined names with non-English characters.
- "Comment" refers to comments that can be attached to defined names.
| Formats | English | Unicode | Comment |

@ -0,0 +1,218 @@
---
sidebar_position: 6
---
# Number Formats
<details>
<summary><b>File Format Support</b> (click to show)</summary>
Legacy formats like CSV mix "content" and "presentation". There is no true
concept of a "number format" distinct from the number itself. For specific
formats, the library will guess the number format.
| Formats | Basic | Storage Representation |
|:------------------|:-----:|:-----------------------|
| XLSX / XLSM | ✔ | Number Format Code |
| XLSB | ✔ | Number Format Code |
| XLS | ✔ | Number Format Code |
| XLML | ✔ | Number Format Code |
| SYLK | R | Number Format Code |
| ODS / FODS / UOS | ✔ | XML |
| NUMBERS | | Binary encoding |
| WK\* | | Binary encoding |
| WQ\* / WB\* / QPW | | Binary encoding |
| DBF | | Implied by field types |
| CSV | * | N/A |
| PRN | * | N/A |
| DIF | * | N/A |
| RTF | * | N/A |
Asterisks (*) mark formats that mix content and presentation. Synthetic number
formats may be generated for special values.
The letter R (R) marks features parsed but not written in the format.
</details>
The following example generates a file with some common number formats:
```jsx live
function SheetJSSimpleNF(props) {
const xport = React.useCallback(async () => {
/* Create worksheet from simple data */
const ws = XLSX.utils.aoa_to_sheet([
["Currency", 3.5],
["Thousands", 7262],
["Percent", 0.0219]
]);
/* assign number formats */
ws["B1"].z = '"$"#,##0_);\\("$"#,##0\\)';
ws["B2"].z = '#,##0';
ws["B3"].z = "0.00%";
/* Export to file (start a download) */
const wb = XLSX.utils.book_new();
XLSX.utils.book_append_sheet(wb, ws, "Formats");
XLSX.writeFile(wb, "SheetJSSimpleNF.xlsx");
});
return ( <button onClick={xport}><b>Export XLSX!</b></button> );
}
```
## Values and Formatting
Typically spreadsheets will include formatted text such as currencies (`$3.50`)
or large numbers with thousands separators (`7,262`) or percentages (`2.19%`).
To simplify editing, the applications will store the underlying values and the
number formats separately. For example, `$3.50` will be represented as the value
`3.5` with a number format that mandates a `$` sigil and 2 decimal places.
Some file formats like CSV only support the formatted text. Native formats for
spreadsheet applications including Lotus 1-2-3 and Excel will store the value
and number format separately.
### Dates and Times
Many spreadsheet formats store dates and times using a number that represents
the number of seconds or days after some epoch. Dates are covered in more detail
[in the dedicated section](/docs/csf/features/dates).
## SheetJS Representation
Number formats and values are attached to cells. The following keys are used:
| Key | Description |
| --- | ---------------------------------------------------------------------- |
| `v` | raw value (number, string, Date object, boolean) |
| `z` | number format string associated with the cell (if requested) |
| `w` | formatted text (if applicable) |
The `cell.w` formatted text for each cell is produced from `cell.v` and `cell.z`
format. If the format is not specified, the Excel `General` format is used.
By default, parsers do not attach number formats to cells. The `cellNF` option
instructs `XLSX.read` or `XLSX.readFile` to save the formats.
### Number Format Strings
The `z` format string follows the Excel persistence rules as described in
ECMA-376 18.8.31 (Number Formats). For more info, see the Excel documentation
article `Create or delete a custom number format`
The rules are slightly different from how Excel displays custom number formats.
In particular, literal characters must be wrapped in double quotes or preceded
by a backslash.
The following example prints number formats from a user-specified file:
```jsx live
function SheetJSExtractNF(props) {
const [rows, setRows] = React.useState([])
return ( <>
<input type="file" onChange={async(e) => {
/* parse workbook with cellNF: true */
const file = e.target.files[0];
const data = await file.arrayBuffer();
const wb = XLSX.read(data, {cellNF: true});
const formats = {};
wb.SheetNames.forEach(n => {
var ws = wb.Sheets[n]; if(!ws || !ws["!ref"]) return;
var ref = XLSX.utils.decode_range(ws["!ref"]);
for(var R = 0; R <= ref.e.r; ++R) for(var C = 0; C <= ref.e.c; ++C) {
var addr = XLSX.utils.encode_cell({r:R,c:C});
if(!ws[addr] || !ws[addr].z) continue;
if(formats[ws[addr].z]) continue;
formats[ws[addr].z] = `'${n}'!${addr}`;
setRows(Object.entries(formats));
}
});
}}/>
<table><tr><th>Format</th><th>JSON</th><th>Example Cell</th></tr>
{rows.map((r,R) => ( <tr key={R}>
<td><code>{r[0]}</code></td>
<td><code>{JSON.stringify(r[0])}</code></td>
<td>{r[1]}</td>
</tr> ))}
</table>
</> );
}
```
## Miscellany
The default formats are listed in ECMA-376 18.8.30:
<details>
<summary><b>Default Number Formats</b> (click to show)</summary>
| ID | Format |
|---:|:---------------------------|
| 0 | `General` |
| 1 | `0` |
| 2 | `0.00` |
| 3 | `#,##0` |
| 4 | `#,##0.00` |
| 9 | `0%` |
| 10 | `0.00%` |
| 11 | `0.00E+00` |
| 12 | `# ?/?` |
| 13 | `# ??/??` |
| 14 | `m/d/yy` (see below) |
| 15 | `d-mmm-yy` |
| 16 | `d-mmm` |
| 17 | `mmm-yy` |
| 18 | `h:mm AM/PM` |
| 19 | `h:mm:ss AM/PM` |
| 20 | `h:mm` |
| 21 | `h:mm:ss` |
| 22 | `m/d/yy h:mm` |
| 37 | `#,##0 ;(#,##0)` |
| 38 | `#,##0 ;[Red](#,##0)` |
| 39 | `#,##0.00;(#,##0.00)` |
| 40 | `#,##0.00;[Red](#,##0.00)` |
| 45 | `mm:ss` |
| 46 | `[h]:mm:ss` |
| 47 | `mmss.0` |
| 48 | `##0.0E+0` |
| 49 | `@` |
</details>
Format 14 (`m/d/yy`) is localized by Excel: even though the file specifies that
number format, it will be drawn differently based on system settings. It makes
sense when the producer and consumer of files are in the same locale, but that
is not always the case over the Internet. To get around this ambiguity, parse
functions accept the `dateNF` option to override the interpretation of that
specific format string.
### Excel Format Categories
Excel officially recognizes a small number of formats as "Currency" and another
set of formats as "Accounting". The exact formats in `en-US` are listed below:
#### Currency
| JS String | Decimal Places | Negative Color |
|:-----------------------------------------|:---------------|:---------------|
| `'"$"#,##0_);\\("$"#,##0\\)'` | 0 | Black |
| `'"$"#,##0_);[Red]\\("$"#,##0\\)'` | 0 | Red |
| `'"$"#,##0.00_);\\("$"#,##0.00\\)'` | 2 | Black |
| `'"$"#,##0.00_);[Red]\\("$"#,##0.00\\)'` | 2 | Red |
#### Accounting
| JS String | Decimal | Sigil |
|:-----------------------------------------------------------------|:--------|:------|
| `'_(* #,##0_);_(* \\(#,##0\\);_(* "-"_);_(@_)'` | 0 | |
| `'_("$"* #,##0_);_("$"* \\(#,##0\\);_("$"* "-"_);_(@_)'` | 0 | `$` |
| `'_(* #,##0.00_);_(* \\(#,##0.00\\);_(* "-"??_);_(@_)'` | 2 | |
| `'_("$"* #,##0.00_);_("$"* \\(#,##0.00\\);_("$"* "-"??_);_(@_)'` | 2 | `$` |
For other locales, the formats can be discovered by creating a file with the
desired format and testing with [the Number Format Strings demo](#number-format-strings)

@ -138,82 +138,6 @@ follow the priority order:
</details>
## Number Formats
The `cell.w` formatted text for each cell is produced from `cell.v` and `cell.z`
format. If the format is not specified, the Excel `General` format is used.
The format can either be specified as a string or as an index into the format
table. Readers are expected to populate `workbook.SSF` with the number format
table. Writers are expected to serialize the table.
The following example creates a custom format from scratch:
```js
var wb = {
SheetNames: ["Sheet1"],
Sheets: {
Sheet1: {
"!ref":"A1:C1",
A1: { t:"n", v:10000 }, // <-- General format
B1: { t:"n", v:10000, z: "0%" }, // <-- Builtin format
C1: { t:"n", v:10000, z: "\"T\"\ #0.00" } // <-- Custom format
}
}
}
```
The rules are slightly different from how Excel displays custom number formats.
In particular, literal characters must be wrapped in double quotes or preceded
by a backslash. For more info, see the Excel documentation article
`Create or delete a custom number format` or ECMA-376 18.8.31 (Number Formats)
<details>
<summary><b>Default Number Formats</b> (click to show)</summary>
The default formats are listed in ECMA-376 18.8.30:
| ID | Format |
|---:|:---------------------------|
| 0 | `General` |
| 1 | `0` |
| 2 | `0.00` |
| 3 | `#,##0` |
| 4 | `#,##0.00` |
| 9 | `0%` |
| 10 | `0.00%` |
| 11 | `0.00E+00` |
| 12 | `# ?/?` |
| 13 | `# ??/??` |
| 14 | `m/d/yy` (see below) |
| 15 | `d-mmm-yy` |
| 16 | `d-mmm` |
| 17 | `mmm-yy` |
| 18 | `h:mm AM/PM` |
| 19 | `h:mm:ss AM/PM` |
| 20 | `h:mm` |
| 21 | `h:mm:ss` |
| 22 | `m/d/yy h:mm` |
| 37 | `#,##0 ;(#,##0)` |
| 38 | `#,##0 ;[Red](#,##0)` |
| 39 | `#,##0.00;(#,##0.00)` |
| 40 | `#,##0.00;[Red](#,##0.00)` |
| 45 | `mm:ss` |
| 46 | `[h]:mm:ss` |
| 47 | `mmss.0` |
| 48 | `##0.0E+0` |
| 49 | `@` |
</details>
Format 14 (`m/d/yy`) is localized by Excel: even though the file specifies that
number format, it will be drawn differently based on system settings. It makes
sense when the producer and consumer of files are in the same locale, but that
is not always the case over the Internet. To get around this ambiguity, parse
functions accept the `dateNF` option to override the interpretation of that
specific format string.
## Sheet Visibility
<details>

@ -1,5 +1,5 @@
---
sidebar_position: 5
sidebar_position: 3
hide_table_of_contents: true
title: Reading Files
---

@ -1,5 +1,5 @@
---
sidebar_position: 7
sidebar_position: 5
hide_table_of_contents: true
title: Writing Files
---

@ -1,6 +1,6 @@
---
sidebar_position: 9
title: Formulae Output
title: Array of Formulae
pagination_next: miscellany/formats
---

@ -1,5 +1,5 @@
{
"label": "Utility Functions",
"collapsed": false,
"position": 9
"position": 7
}

@ -440,7 +440,7 @@ can be added to the individual TD elements:
|:----------|:-------------------------------------------------------------|
| `data-t` | Override [Cell Type](/docs/csf/cell#data-types) |
| `data-v` | Override Cell Value |
| `data-z` | Override [Number Format](/docs/csf/features/#number-formats) |
| `data-z` | Override [Number Format](/docs/csf/features/nf) |
For example: