11 KiB
sidebar_position |
---|
6 |
Number Formats
File Format Support (click to show)
Modern applications separate "content" from "presentation". A value like $3.50
is typically stored as the underlying value (3.50
) with a format ($0.00
).
Parsers are expected to render values using the respective number formats.
Text-based file formats like CSV and HTML mix content and presentation. $3.50
is stored as the formatted value. The formatted values can be generated from
many different values and number formats. SheetJS parsers expose options to
control value parsing and number format speculation.
Formats | Basic | Storage Representation |
---|---|---|
XLSX / XLSM | ✔ | Number Format Code |
XLSB | ✔ | Number Format Code |
XLS | ✔ | Number Format Code |
XLML | ✔ | Number Format Code |
SYLK | R | Number Format Code |
ODS / FODS / UOS | ✔ | XML Tokens |
NUMBERS | Binary encoding | |
WK1 | + | Fixed set of formats |
WK3 / WK4 | Binary encoding | |
WKS Lotus | + | Fixed set of formats |
WKS Works | + | Fixed set of formats |
WQ1 | + | Fixed set of formats |
WQ2 | Binary encoding | |
WB1 / WB2 / WB3 | Binary encoding | |
QPW | + | Binary encoding |
DBF | Implied by field types | |
HTML | ! | Special override |
CSV | * | N/A |
PRN | * | N/A |
DIF | * | N/A |
RTF | * | N/A |
(+) mark formats with limited support. The QPW (Quattro Pro Workbooks) parser supports the built-in date and built-in time formats but does not support custom number formats. Date and Time support in modern Excel formats requires limited number format support to distinguish date or time codes from standard numeric data.
Asterisks (*) mark formats that mix content and presentation. Writers will use formatted values if cell objects include formatted text or number formats. Parsers may guess number formats for special values.
The letter R (R) marks features parsed but not written in the format.
(!) HTML mixes content and presentation. The HTML DOM parser supports special attributes to override number formats
This example generates a worksheet with common number formats. sheet_to_html
uses the number formats in generating the HTML table. The "Export" button
generates workbooks with number formatting.
function SheetJSSimpleNF(props) {
const [ws, setWS] = React.useState();
const fmt = React.useRef(null);
/* when the page is loaded, create worksheet and show table */
React.useEffect(() => {
/* Create worksheet from simple data */
const ws = XLSX.utils.aoa_to_sheet([
["General", 54337 ],
["Currency", 3.5 ],
["Thousands", 7262 ],
["Percent", 0.0219 ],
]);
/* assign number formats */
ws["B2"].z = '"$"#,##0.00_);\\("$"#,##0.00\\)';
ws["B3"].z = '#,##0';
ws["B4"].z = "0.00%";
setWS(ws);
}, []);
const xport = (fmt) => {
/* Export to file (start a download) */
const wb = XLSX.utils.book_new();
XLSX.utils.book_append_sheet(wb, ws, "Formats");
XLSX.writeFile(wb, `SheetJSSimpleNF.${fmt}`);
};
const fmts = ["xlsx", "xls", "csv", "xlsb", "html", "ods"];
return ( <>
<select ref={fmt}>{fmts.map(fmt => (<option value={fmt}>{fmt}</option>))}</select>
<button onClick={()=>xport(fmt.current.value)}><b>Export!</b></button>
<div dangerouslySetInnerHTML={{__html: ws && XLSX.utils.sheet_to_html(ws) || "" }}/>
</> );
}
Values and Formatting
Typically spreadsheets will include formatted text such as currencies ($3.50
)
or large numbers with thousands separators (7,262
) or percentages (2.19%
).
To simplify editing, the applications will store the underlying values and the
number formats separately. For example, $3.50
will be represented as the value
3.5
with a number format that mandates a $
sigil and 2 decimal places.
CSV and other formats only support the formatted text. Applications reading CSV files are expected to interpret the values as numbers or dates.
Dates and Times
Many spreadsheet formats store dates and times using a number that represents the number of seconds or days after some epoch. Dates are covered in more detail in the dedicated section.
Percentages
Percentage formats automatically scale values by 100. Multiple percent symbols
repeat the effect. For example, a cell with value 2.19%
is typically stored as
a numeric cell with value 0.0219
and number format 0.00%
The following table uses the en-US
locale (.
as the decimal point symbol):
Number | Format | en-US Text |
---|---|---|
0.0219 |
0.00% |
2.19% |
2.19 |
0.00% |
219% |
0.0219 |
0.00%% |
219%% |
2.19 |
0.00%% |
21900%% |
SheetJS Representation
Number formats and values are attached to cells. The following keys are used:
Key | Description |
---|---|
v |
raw value (number, string, Date object, boolean) |
z |
number format string associated with the cell (if requested) |
w |
formatted text (if applicable) |
The cell.w
formatted text for each cell is produced from cell.v
and cell.z
format. If the format is not specified, the Excel General
format is used.
By default, parsers do not attach number formats to cells. The cellNF
option
instructs XLSX.read
or XLSX.readFile
to save the formats.
Number Format Strings
The z
format string follows the Excel persistence rules as described in
ECMA-376 18.8.31 (Number Formats). For more info, see the Excel documentation
article Create or delete a custom number format
The rules are slightly different from how Excel displays custom number formats. In particular, literal characters must be wrapped in double quotes or preceded by a backslash.
The following example prints number formats from a user-specified file:
function SheetJSExtractNF(props) {
const [rows, setRows] = React.useState([])
return ( <>
<input type="file" onChange={async(e) => {
/* parse workbook with cellNF: true */
const wb = XLSX.read(await e.target.files[0].arrayBuffer(), {cellNF: true});
/* look at each cell in each worksheet */
const formats = {};
wb.SheetNames.forEach(n => {
var ws = wb.Sheets[n]; if(!ws || !ws["!ref"]) return;
var ref = XLSX.utils.decode_range(ws["!ref"]);
for(var R = 0; R <= ref.e.r; ++R) for(var C = 0; C <= ref.e.c; ++C) {
var addr = XLSX.utils.encode_cell({r:R,c:C});
if(!ws[addr] || !ws[addr].z || formats[ws[addr].z]) continue;
/* when a new format is found, save the address */
formats[ws[addr].z] = `'${n}'!${addr}`;
setRows(Object.entries(formats));
}
});
}}/>
<table><tr><th>Format</th><th>JSON</th><th>Example Cell</th></tr>
{rows.map((r,R) => ( <tr key={R}>
<td><code>{r[0]}</code></td>
<td><code>{JSON.stringify(r[0])}</code></td>
<td>{r[1]}</td>
</tr> ))}
</table>
</> );
}
Miscellany
The default formats are listed in ECMA-376 18.8.30:
Default Number Formats (click to show)
ID | Format |
---|---|
0 | General |
1 | 0 |
2 | 0.00 |
3 | #,##0 |
4 | #,##0.00 |
9 | 0% |
10 | 0.00% |
11 | 0.00E+00 |
12 | # ?/? |
13 | # ??/?? |
14 | m/d/yy (see below) |
15 | d-mmm-yy |
16 | d-mmm |
17 | mmm-yy |
18 | h:mm AM/PM |
19 | h:mm:ss AM/PM |
20 | h:mm |
21 | h:mm:ss |
22 | m/d/yy h:mm |
37 | #,##0 ;(#,##0) |
38 | #,##0 ;[Red](#,##0) |
39 | #,##0.00;(#,##0.00) |
40 | #,##0.00;[Red](#,##0.00) |
45 | mm:ss |
46 | [h]:mm:ss |
47 | mmss.0 |
48 | ##0.0E+0 |
49 | @ |
Format 14 (m/d/yy
) is localized by Excel: even though the file specifies that
number format, it will be drawn differently based on system settings. It makes
sense when the producer and consumer of files are in the same locale, but that
is not always the case over the Internet. To get around this ambiguity, parse
functions accept the dateNF
option to override the interpretation of that
specific format string.
Excel Format Categories
Excel officially recognizes a small number of formats as "Currency" and another
set of formats as "Accounting". The exact formats in en-US
are listed below:
Currency
JS String | Decimal Places | Negative Color |
---|---|---|
'"$"#,##0_);\\("$"#,##0\\)' |
0 | Black |
'"$"#,##0_);[Red]\\("$"#,##0\\)' |
0 | Red |
'"$"#,##0.00_);\\("$"#,##0.00\\)' |
2 | Black |
'"$"#,##0.00_);[Red]\\("$"#,##0.00\\)' |
2 | Red |
Accounting
JS String | Decimal | Sigil |
---|---|---|
'_(* #,##0_);_(* \\(#,##0\\);_(* "-"_);_(@_)' |
0 | |
'_("$"* #,##0_);_("$"* \\(#,##0\\);_("$"* "-"_);_(@_)' |
0 | $ |
'_(* #,##0.00_);_(* \\(#,##0.00\\);_(* "-"??_);_(@_)' |
2 | |
'_("$"* #,##0.00_);_("$"* \\(#,##0.00\\);_("$"* "-"??_);_(@_)' |
2 | $ |
For other locales, the formats can be discovered by creating a file with the desired format and testing with the Number Format Strings demo