Notes on Worksheet Ranges

SheetJS 2024-07-08 04:18:18 -04:00
parent 7f0bda6af6
commit 9088dfd430
7 changed files with 134 additions and 41 deletions

@ -279,7 +279,7 @@ The program will run on ARM64 Windows.
| `win10-x64` |`.\out\sheetjs-electron-win32-x64\sheetjs-electron.exe` |
| `win11-arm` |`.\out\sheetjs-electron-win32-x64\sheetjs-electron.exe` |
| `linux-x64` |`./out/sheetjs-electron-linux-x64/sheetjs-electron` |
| `linux-x64` |`./out/sheetjs-electron-linux-arm64/sheetjs-electron` |
| `linux-arm` |`./out/sheetjs-electron-linux-arm64/sheetjs-electron` |
#### Electron API

@ -117,7 +117,7 @@ This demo was tested in the following environments:
| macOS 14.5 | `darwin-arm` | `0.88.0` | 2024-05-28 | |
| Windows 10 | `win10-x64` | `0.83.0` | 2024-03-04 | |
| Windows 11 | `win11-arm` | `0.88.0` | 2024-05-28 | |
| Linux (HoloOS) | `linux-x64` | `0.85.0` | 2024-03-12 | |
| Linux (HoloOS) | `linux-x64` | `0.89.0` | 2024-07-07 | |
| Linux (Debian) | `linux-arm` | `0.60.0` | 2024-05-23 | Unofficial build[^1] |
:::
@ -138,7 +138,7 @@ cd sheetjs-nwjs
"version": "0.0.0",
"main": "index.html",
"dependencies": {
"nw": "0.88.0",
"nw": "0.89.0",
"xlsx": "https://cdn.sheetjs.com/xlsx-${current}/xlsx-${current}.tgz"
}
}`}
@ -182,9 +182,15 @@ file can be opened in Excel or another spreadsheet editor.
5) To build a standalone app, run the builder:
```bash
npx -p nw-builder nwbuild --mode=build --version=0.88.0 --glob=false --outDir=../out ./
npx -p nw-builder nwbuild --mode=build --version=0.89.0 --glob=false --outDir=../out ./
```
This will generate the standalone app in the `..\out\` folder.
6) Launch the generated application:
| Architecture | Command |
|:-------------|:--------------------------------------------------------------|
| `linux-x64` | `../out/sheetjs-nwjs` |
[^1]: The [`nw60-arm64_2022-01-08` release](https://github.com/LeonardLaszlo/nw.js-armv7-binaries/releases/tag/nw60-arm64_2022-01-08) included an ARM64 version of `nw`.

@ -11,17 +11,28 @@ import CodeBlock from '@theme/CodeBlock';
:::danger pass
WebSQL is no longer enabled by default in Chrome. Chrome 123 will officially
remove support. For SQL in the browser, there are a few alternatives:
WebSQL is no longer supported in Chrome or Safari.
For SQL in the browser, there are a few alternatives:
- [SQL.js](/docs/demos/data/sqlite#browser) is a compiled version of SQLite
- [AlaSQL](/docs/demos/data/alasql) is a pure-JS SQL engine backed by IndexedDB
:::
WebSQL (formally "Web SQL Database") is a popular SQL-based in-browser database
available in Chromium and related browsers including Google Chrome. In practice,
it is powered by SQLite. Many SQLite-compatible queries work as-is in WebSQL.
WebSQL (formally "Web SQL Database") was a popular SQL-based in-browser database
available in Chromium and Safari. In practice, it was powered by SQLite. Many
SQLite-compatible queries were supported by WebSQL engines.
:::note Historical Context
Google and Apple developed and supported WebSQL. Legacy browser vendors fought
against standardization and ultimately broke the web by forcing the deprecation
of the storied API.
Leveraging new technologies, many websites ship with an in-browser SQL database.
:::
The public demo https://sheetjs.com/sql generates a database from a workbook.

@ -17,10 +17,38 @@ Excel supports 4 different types of "sheets":
Generic sheets are plain JavaScript objects. Each key that does not start with
`!` is an `A1`-style address whose corresponding value is a cell object.
### Worksheet Range
The `!ref` property stores the [A1-style range](/docs/csf/general#a1-style-1).
Functions that work with sheets should use this property to determine the range.
Cells that are assigned outside of the range are not processed.
For example, in the following sparse worksheet, the cell `A3` will be ignored
since it is outside of the worksheet range (`A1:B2`):
```js
var ws = {
// worksheet range is A1:B2
"!ref": "A1:B2",
// A1 is in the range and will be included
"A1": { t: "s", v: "SheetJS" },
// cell A3 is outside of the range and will be ignored
"A3": { t: "n", v: 5433795 },
};
```
[Utility functions](/docs/api/utilities/) and functions that handle sheets
should test for the presence of the `!ref` field. If the `!ref` is omitted or is
not a valid range, functions should treat the sheet as empty.
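The `!ref` handling described above can be sketched in plain JavaScript. This is a minimal illustration, not the library's implementation: `col_to_index`, `parse_cell`, `parse_range` and `cells_in_range` are hypothetical helpers written for this example, and the sketch assumes the range is written with the top-left cell first.

```js
// Hypothetical helpers for illustration; these are not part of the SheetJS API.

// Convert column letters ("A", "AB") to a zero-based column index.
function col_to_index(letters) {
  var n = 0;
  for (var i = 0; i < letters.length; ++i) n = 26 * n + (letters.charCodeAt(i) - 64);
  return n - 1;
}

// Parse an A1-style address such as "B2"; returns null if it is not an address.
function parse_cell(addr) {
  var m = /^\$?([A-Z]+)\$?([1-9]\d*)$/.exec(addr);
  return m ? { c: col_to_index(m[1]), r: parseInt(m[2], 10) - 1 } : null;
}

// Parse an A1-style range such as "A1:B2"; returns null if it is not valid.
function parse_range(ref) {
  if (typeof ref !== "string") return null;
  var parts = ref.split(":");
  if (parts.length > 2) return null;
  var s = parse_cell(parts[0]);
  var e = parts.length === 2 ? parse_cell(parts[1]) : s;
  return s && e ? { s: s, e: e } : null;
}

// List the addresses of cells that fall inside the worksheet range.
function cells_in_range(ws) {
  var range = parse_range(ws["!ref"]);
  if (!range) return []; // missing or invalid `!ref`: treat the sheet as empty
  return Object.keys(ws).filter(function (key) {
    if (key.charAt(0) === "!") return false; // skip sheet properties
    var cell = parse_cell(key);
    if (!cell) return false;
    return cell.r >= range.s.r && cell.r <= range.e.r &&
           cell.c >= range.s.c && cell.c <= range.e.c;
  });
}

var ws = {
  "!ref": "A1:B2",
  "A1": { t: "s", v: "SheetJS" },
  "A3": { t: "n", v: 5433795 }
};
console.log(cells_in_range(ws)); // [ 'A1' ]
console.log(cells_in_range({ "A1": { t: "n", v: 1 } })); // []
```

The second call has no `!ref` field, so the sketch reports no cells even though `A1` is assigned.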
### Cell Storage
By default, the parsers and utility functions generate "sparse-mode" worksheet
objects. `sheet[address]` returns the cell object for the specified address.
By default, the parsers and utility functions generate "sparse-mode" worksheets.
For a given [A1-style address](/docs/csf/general#a1-style) `ref`, `sheet[ref]` is
the corresponding cell object.
#### Dense Mode
@ -127,27 +155,13 @@ _`json_to_sheet`_
+var sheet = XLSX.utils.json_to_sheet([{x:1,y:2}], {...opts, dense: true});
```
</details>
### Sheet Properties
Each key starts with `!`. The properties are accessible as `sheet[key]`.
- `sheet['!ref']`: A-1 based range representing the sheet range. Functions that
work with sheets should use this parameter to determine the range. Cells that
are assigned outside of the range are not processed. In particular, when
writing a sheet by hand, cells outside of the range are not included
Functions that handle sheets should test for the presence of `!ref` field.
If the `!ref` is omitted or is not a valid range, functions are free to treat
the sheet as empty or attempt to guess the range. The standard utilities that
ship with this library treat sheets as empty (for example, the CSV output is
empty string).
When reading a worksheet with the `sheetRows` property set, the ref parameter
will use the restricted range. The original range is set at `ws['!fullref']`
- `sheet['!ref']`: [A1-style sheet range string](#worksheet-range)
- `sheet['!margins']`: Object representing the page margins. The default values
follow Excel's "normal" preset. Excel also has a "wide" and a "narrow" preset

@ -14,6 +14,19 @@ features are only accessible by inspecting and modifying the objects directly.
This section covers the JS representation of workbooks, worksheets, cells,
ranges, addresses and other features.
:::info Historical Context
[Web Workers](/docs/demos/bigdata/worker), a popular API for parallelism in the
web browser, uses message passing. The "structured clone algorithm"[^1] is used
to pass data between the main renderer thread and Worker instances.
The structured clone algorithm does not preserve functions or prototypes.
In the SheetJS data model, each structure is a simple object. There are no
classes or prototype methods.
:::
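The effect of the structured clone algorithm can be observed with the global `structuredClone` function. The sketch below assumes a runtime that provides it (recent browsers, or NodeJS 17 and later). `FancyCell` is a hypothetical class used only for contrast; it is not part of SheetJS.

```js
// A SheetJS-style cell: a plain object with no methods or custom prototype.
var cell = { t: "n", v: 5433795 };
var cell_copy = structuredClone(cell);
console.log(cell_copy); // { t: 'n', v: 5433795 }

// Hypothetical class-based cell, for contrast only.
class FancyCell {
  constructor(v) { this.v = v; }
  double() { return this.v * 2; }
}
var fancy_copy = structuredClone(new FancyCell(21));
console.log(fancy_copy.v); // 21 (own data survives)
console.log(fancy_copy instanceof FancyCell); // false (prototype is lost)
console.log(typeof fancy_copy.double); // undefined (method is lost)

// Functions stored directly on an object cannot be cloned at all.
var threw = false;
try { structuredClone({ v: 1, f: function () {} }); } catch (e) { threw = true; }
console.log(threw); // true
```

Since the plain cell object is copied intact, it can be passed between the main thread and a Worker without any conversion step.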
### Contents
<ul>{useCurrentSidebarCategory().items.map(globalThis.lambda = (item, index) => {
@ -24,4 +37,6 @@ ranges, addresses and other features.
<a href={item.href}>{item.label}</a>{item.customProps?.summary && (" - " + item.customProps.summary)}
<ul>{item.items && item.items.map(lambda)}</ul>
</li>);
})}</ul>
})}</ul>
[^1]: See [the HTML Living Standard](https://html.spec.whatwg.org/multipage/structured-data.html#structured-cloning) for more details on the "structured clone algorithm".

@ -46,7 +46,7 @@ The read functions accept an options argument:
|:------------|:--------|:-----------------------------------------------------|
|`type` | | [Input data representation](#input-type) |
|`raw` | `false` | If true, plain text parsing will not parse values ** |
|`dense` | `false` | If true, use a dense worksheet representation ** |
|`dense` | `false` | If true, use a [dense sheet representation](#dense) |
|`codepage` | | If specified, use code page when appropriate ** |
|`cellFormula`| `true` | Save [formulae to the `.f` field](#formulae) |
|`cellHTML` | `true` | Parse rich text and save HTML to the `.h` field |
@ -93,13 +93,22 @@ The read functions accept an options argument:
- `WTF` is mainly for development. By default, the parser will suppress read
errors on single worksheets, allowing you to read from the worksheets that do
parse properly. Setting `WTF:true` forces those errors to be thrown.
- By default, "sparse" mode worksheets are generated. Individual cells are
accessed by indexing the worksheet object with an A1-Style address. "dense"
worksheets store cells in an array of arrays at `sheet["!data"]`.
- `UTC` applies to CSV, Text and HTML formats. When explicitly set to `false`,
the parsers will assume the files are specified in local time. By default, as
is the case for other file formats, dates and times are interpreted in UTC.
#### Dense
The ["Cell Storage"](/docs/csf/sheet#cell-storage) section of the SheetJS Data
Model documentation explains the worksheet representation in more detail.
:::note pass
[Utility functions that process SheetJS workbook objects](/docs/api/utilities/)
typically process both sparse and dense worksheets.
:::
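A function that supports both representations can branch on the presence of `sheet["!data"]`. The following sketch is illustrative only: `index_to_col` and `get_cell` are hypothetical helpers, not SheetJS functions, and the sketch assumes the dense array is indexed by zero-based row and then zero-based column.

```js
// Hypothetical helpers for illustration; these are not part of the SheetJS API.

// Convert a zero-based column index to column letters (0 -> "A", 26 -> "AA").
function index_to_col(c) {
  var s = "";
  for (++c; c > 0; c = Math.floor((c - 1) / 26)) {
    s = String.fromCharCode(((c - 1) % 26) + 65) + s;
  }
  return s;
}

// Look up the cell at zero-based row `r` and column `c` in either representation.
function get_cell(ws, r, c) {
  if (Array.isArray(ws["!data"])) {
    // dense: assumed layout is ws["!data"][row][column]
    var row = ws["!data"][r];
    return row ? row[c] : undefined;
  }
  // sparse: index the worksheet object with the A1-style address
  return ws[index_to_col(c) + (r + 1)];
}

var sparse = { "!ref": "A1:B2", "B2": { t: "n", v: 2 } };
var dense = { "!ref": "A1:B2", "!data": [ [], [ undefined, { t: "n", v: 2 } ] ] };
console.log(get_cell(sparse, 1, 1).v); // 2
console.log(get_cell(dense, 1, 1).v); // 2
```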
#### Range
Some file formats, including XLSX and XLS, can self-report worksheet ranges. The
@ -107,7 +116,9 @@ self-reported ranges are used by default.
If the `sheetRows` option is set, up to `sheetRows` rows will be parsed from the
worksheets. `sheetRows-1` rows will be generated when looking at the JSON object
output (since the header row is counted as a row when parsing the data).
output (since the header row is counted as a row when parsing the data). The
`!ref` property of the worksheet will hold the adjusted range. For formats that
self-report sheet ranges, the `!fullref` property will hold the original range.
The `nodim` option instructs the parser to ignore self-reported ranges and use
the actual cells in the worksheet to determine the range. This addresses known

@ -9,7 +9,18 @@ import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import CodeBlock from '@theme/CodeBlock';
**`XLSX.write(wb, options)`**
The main SheetJS method for writing workbooks is `write`. Scripts receive common
[JavaScript data representations](#output-type) and are expected to write or
share files using platform-specific APIs.
The `writeFile` helper method accepts a filename and tries to write to a local
file using [standard APIs](/docs/demos/local/file).
**Export a SheetJS workbook object in a specified file format**
```js
var file_data = XLSX.write(wb, opts);
```
`write` attempts to write the workbook `wb` and return the file.
@ -17,7 +28,11 @@ The `options` argument is required. It must specify
- [`bookType`](#supported-output-formats) (file format of the exported file)
- [`type`](#output-type) (return value type)
**`XLSX.writeFile(wb, filename, options)`**
**Export a SheetJS workbook object and attempt to write a local file**
```js
XLSX.writeFile(wb, filename, options);
```
`writeFile` attempts to write `wb` to a local file with specified `filename`.
@ -27,9 +42,16 @@ It also supports NodeJS, ExtendScript applications, and Chromium extensions.
If `options` is omitted or if `bookType` is missing from the `options` object,
the output file format will be deduced from the filename extension.
**`XLSX.writeXLSX(wb, options)`**
**Special functions for exporting data in the XLSX format**
**`XLSX.writeFileXLSX(wb, filename, options)`**
```js
// limited form of `write`
var file_data = XLSX.writeXLSX(wb, options);
// limited form of `writeFile`
XLSX.writeFileXLSX(wb, filename, options);
```
`writeXLSX` and `writeFileXLSX` are limited versions of `write` and `writeFile`.
They only support writing to the XLSX file format.
@ -42,11 +64,19 @@ more appropriate when exporting to XLS or XLSB or other formats.
<details>
<summary><b>NodeJS-specific methods</b> (click to show)</summary>
**`XLSX.writeFileAsync(filename, wb, cb)`**
**Export a workbook and attempt to write a local file using `fs.writeFile`**
**`XLSX.writeFileAsync(filename, wb, options, cb)`**
```js
// callback equivalent of `XLSX.writeFile`
XLSX.writeFileAsync(filename, wb, cb);
attempt to write `wb` to `filename` and invoke the callback `cb` on completion.
// callback equivalent with options argument
XLSX.writeFileAsync(filename, wb, options, cb);
```
`writeFileAsync` attempts to write `wb` to `filename` and invokes the callback
`cb` on completion.
When an `options` object is specified, it is expected to be the third argument.
@ -275,6 +305,12 @@ The `type` option specifies the JS form of the output:
| `"array"` | ArrayBuffer, fallback array of 8-bit unsigned int |
| `"file"` | string: path of file that will be created (nodejs only) |
- For compatibility with Excel, `csv` output will always include the UTF-8 byte
order mark.
:::note pass
For compatibility with Excel, `csv` output will always include the UTF-8 byte
order mark ("BOM").
The raw [`sheet_to_csv` method](/docs/api/utilities/csv#csv-output) will return
JavaScript strings without the UTF-8 BOM.
:::
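When a script saves a raw CSV string itself, it can add the byte order mark by hand. This is a minimal sketch under stated assumptions: `add_bom` and `strip_bom` are hypothetical helpers, the `csv` string stands in for a value returned by `sheet_to_csv`, and the runtime is assumed to provide the standard `TextEncoder` global.

```js
// The BOM is the single code point U+FEFF, encoded in UTF-8 as EF BB BF.
var BOM = "\uFEFF";

// Hypothetical helpers for illustration; these are not part of the SheetJS API.
function add_bom(csv) {
  return csv.charCodeAt(0) === 0xFEFF ? csv : BOM + csv;
}
function strip_bom(csv) {
  return csv.charCodeAt(0) === 0xFEFF ? csv.slice(1) : csv;
}

// Stand-in for a string returned by `sheet_to_csv` (no BOM).
var csv = "a,b\n1,2";

// Encode to UTF-8 bytes and inspect the first three bytes.
var bytes = new TextEncoder().encode(add_bom(csv));
console.log(bytes[0].toString(16), bytes[1].toString(16), bytes[2].toString(16)); // ef bb bf

// Stripping the mark recovers the original string.
console.log(strip_bom(add_bom(csv)) === csv); // true
```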