# xlsx

Parser and writer for various spreadsheet formats. Pure-JS cleanroom
implementation from official specifications, related documents, and test files.
Emphasis on parsing and writing robustness, cross-format feature compatibility
with a unified JS representation, and ES3/ES5 browser compatibility back to IE6.

[**In-Browser Demo**](http://oss.sheetjs.com/js-xlsx)

[**Source Code**](http://git.io/xlsx)

[**Commercial Support**](http://sheetjs.com/reinforcements)

[**File format support for known spreadsheet data formats:**](#file-formats)

![circo graph of format support](formats.png)

## Table of Contents

<!-- toc -->

- [Installation](#installation)
  * [JS Ecosystem Demos](#js-ecosystem-demos)
  * [Optional Modules](#optional-modules)
  * [ECMAScript 5 Compatibility](#ecmascript-5-compatibility)
- [Parsing Workbooks](#parsing-workbooks)
- [Working with the Workbook](#working-with-the-workbook)
- [Writing Workbooks](#writing-workbooks)
- [Interface](#interface)
  * [Parsing functions](#parsing-functions)
  * [Writing functions](#writing-functions)
  * [Utilities](#utilities)
- [Workbook / Worksheet / Cell Object Description](#workbook--worksheet--cell-object-description)
  * [General Structures](#general-structures)
  * [Cell Object](#cell-object)
    + [Data Types](#data-types)
    + [Dates](#dates)
  * [Worksheet Object](#worksheet-object)
  * [Chartsheet Object](#chartsheet-object)
  * [Workbook Object](#workbook-object)
  * [Document Features](#document-features)
    + [Formulae](#formulae)
    + [Column Properties](#column-properties)
- [Parsing Options](#parsing-options)
  * [Input Type](#input-type)
  * [Guessing File Type](#guessing-file-type)
- [Writing Options](#writing-options)
  * [Supported Output Formats](#supported-output-formats)
  * [Output Type](#output-type)
- [Utility Functions](#utility-functions)
  * [Array of Arrays Input](#array-of-arrays-input)
  * [Formulae Output](#formulae-output)
  * [CSV and general DSV Output](#csv-and-general-dsv-output)
  * [JSON](#json)
- [File Formats](#file-formats)
  * [Excel 2007+ XML (XLSX/XLSM)](#excel-2007-xml-xlsxxlsm)
  * [Excel 2.0-95 (BIFF2/BIFF3/BIFF4/BIFF5)](#excel-20-95-biff2biff3biff4biff5)
  * [Excel 97-2004 Binary (BIFF8)](#excel-97-2004-binary-biff8)
  * [Excel 2003-2004 (SpreadsheetML)](#excel-2003-2004-spreadsheetml)
  * [Excel 2007+ Binary (XLSB, BIFF12)](#excel-2007-binary-xlsb-biff12)
  * [OpenDocument Spreadsheet (ODS/FODS) and Uniform Office Spreadsheet (UOS1/2)](#opendocument-spreadsheet-odsfods-and-uniform-office-spreadsheet-uos12)
  * [Comma-Separated Values](#comma-separated-values)
  * [HTML](#html)
- [Testing](#testing)
  * [Tested Environments](#tested-environments)
  * [Test Files](#test-files)
- [Contributing](#contributing)
- [License](#license)
- [References](#references)
- [Badges](#badges)

<!-- tocstop -->

## Installation

With [npm](https://www.npmjs.org/package/xlsx):

```bash
$ npm install xlsx
```

In the browser:

```html
<script lang="javascript" src="dist/xlsx.core.min.js"></script>
```

With [bower](http://bower.io/search/?q=js-xlsx):

```bash
$ bower install js-xlsx
```

CDNjs automatically pulls the latest version and makes all versions available at
<http://cdnjs.com/libraries/xlsx>

### JS Ecosystem Demos

The `demos` directory includes sample projects for:

- [`angular`](demos/angular/)
- [`browserify`](demos/browserify/)
- [`Adobe ExtendScript`](demos/extendscript/)
- [`requirejs`](demos/requirejs/)
- [`systemjs`](demos/systemjs/)
- [`webpack`](demos/webpack/)

### Optional Modules

The node version automatically requires modules for additional features. Some
of these modules are rather large in size and are only needed in special
circumstances, so they do not ship with the core. For browser use, they must
be included directly:

```html
<!-- international support from js-codepage -->
<script src="dist/cpexcel.js"></script>
```

An appropriate version for each dependency is included in the `dist/` directory.

The complete single-file version is generated at `dist/xlsx.full.min.js`.

Webpack and browserify builds include optional modules by default. Webpack can
be configured to remove support with `resolve.alias`:

```js
  /* add the alias below to the webpack config to remove support */
  resolve: {
    alias: { "./dist/cpexcel.js": "" } // <-- omit international support
  }
```

### ECMAScript 5 Compatibility

Since xlsx.js uses ES5 functions like `Array#forEach`, older browsers require
[Polyfills](http://git.io/QVh77g). This repo and the gh-pages branch include
[a shim](https://github.com/SheetJS/js-xlsx/blob/master/shim.js).

To use the shim, add the shim before the script tag that loads xlsx.js:

```html
<script type="text/javascript" src="/path/to/shim.js"></script>
```

## Parsing Workbooks

For parsing, the first step is to read the file. This involves acquiring the
data and feeding it into the library. Here are a few common scenarios:

- node readFile:

```js
if(typeof require !== 'undefined') XLSX = require('xlsx');
var workbook = XLSX.readFile('test.xlsx');
/* DO SOMETHING WITH workbook HERE */
```

- ajax (for a more complete example that works in older browsers, check the demo
  at <http://oss.sheetjs.com/js-xlsx/ajax.html>):

```js
/* set up XMLHttpRequest */
var url = "test_files/formula_stress_test_ajax.xlsx";
var oReq = new XMLHttpRequest();
oReq.open("GET", url, true);
oReq.responseType = "arraybuffer";

oReq.onload = function(e) {
  var arraybuffer = oReq.response;

  /* convert data to binary string */
  var data = new Uint8Array(arraybuffer);
  var arr = new Array();
  for(var i = 0; i != data.length; ++i) arr[i] = String.fromCharCode(data[i]);
  var bstr = arr.join("");

  /* Call XLSX */
  var workbook = XLSX.read(bstr, {type:"binary"});

  /* DO SOMETHING WITH workbook HERE */
}

oReq.send();
```

- HTML5 drag-and-drop using readAsBinaryString or readAsArrayBuffer:
  note: readAsBinaryString and readAsArrayBuffer may not be available in every
  browser. Use dynamic feature tests to determine which method to use.

```js
/* processing array buffers, only required for readAsArrayBuffer */
function fixdata(data) {
  var o = "", l = 0, w = 10240;
  for(; l<data.byteLength/w; ++l) o+=String.fromCharCode.apply(null,new Uint8Array(data.slice(l*w,l*w+w)));
  o+=String.fromCharCode.apply(null, new Uint8Array(data.slice(l*w)));
  return o;
}

var rABS = true; // true: readAsBinaryString ; false: readAsArrayBuffer
/* set up drag-and-drop event */
function handleDrop(e) {
  e.stopPropagation();
  e.preventDefault();
  var files = e.dataTransfer.files;
  var i,f;
  for (i = 0; i != files.length; ++i) {
    f = files[i];
    var reader = new FileReader();
    var name = f.name;
    reader.onload = function(e) {
      var data = e.target.result;

      var workbook;
      if(rABS) {
        /* if binary string, read with type 'binary' */
        workbook = XLSX.read(data, {type: 'binary'});
      } else {
        /* if array buffer, convert to base64 */
        var arr = fixdata(data);
        workbook = XLSX.read(btoa(arr), {type: 'base64'});
      }

      /* DO SOMETHING WITH workbook HERE */
    };
    if(rABS) reader.readAsBinaryString(f);
    else reader.readAsArrayBuffer(f);
  }
}
drop_dom_element.addEventListener('drop', handleDrop, false);
```

- HTML5 input file element using readAsBinaryString or readAsArrayBuffer:

```js
/* fixdata and rABS are defined in the drag and drop example */
function handleFile(e) {
  var files = e.target.files;
  var i,f;
  for (i = 0; i != files.length; ++i) {
    f = files[i];
    var reader = new FileReader();
    var name = f.name;
    reader.onload = function(e) {
      var data = e.target.result;

      var workbook;
      if(rABS) {
        /* if binary string, read with type 'binary' */
        workbook = XLSX.read(data, {type: 'binary'});
      } else {
        /* if array buffer, convert to base64 */
        var arr = fixdata(data);
        workbook = XLSX.read(btoa(arr), {type: 'base64'});
      }

      /* DO SOMETHING WITH workbook HERE */
    };
    /* pick the read method that matches the rABS branch in onload */
    if(rABS) reader.readAsBinaryString(f);
    else reader.readAsArrayBuffer(f);
  }
}
input_dom_element.addEventListener('change', handleFile, false);
```

## Working with the Workbook

The full object format is described later in this README.

This example extracts the value stored in cell A1 from the first worksheet:

```js
var first_sheet_name = workbook.SheetNames[0];
var address_of_cell = 'A1';

/* Get worksheet */
var worksheet = workbook.Sheets[first_sheet_name];

/* Find desired cell */
var desired_cell = worksheet[address_of_cell];

/* Get the value */
var desired_value = (desired_cell ? desired_cell.v : undefined);
```

**Complete examples:**

- <http://oss.sheetjs.com/js-xlsx/> HTML5 File API / Base64 Text / Web Workers

Note that older versions of IE do not support HTML5 File API, so the base64 mode
is used for testing. On OSX you can get the base64 encoding with:

```bash
$ <target_file base64 | pbcopy
```

On Windows XP and up you can get the base64 encoding using `certutil`:

```cmd
> certutil -encode target_file target_file.b64
```

(note: You have to open the file and remove the header and footer lines)

- <http://oss.sheetjs.com/js-xlsx/ajax.html> XMLHttpRequest

- <https://github.com/SheetJS/js-xlsx/blob/master/bin/xlsx.njs> node

The node version installs a command line tool `xlsx` which can read spreadsheet
files and output the contents in various formats. The source is available at
`xlsx.njs` in the bin directory.

Some helper functions in `XLSX.utils` generate different views of the sheets:

- `XLSX.utils.sheet_to_csv` generates CSV
- `XLSX.utils.sheet_to_json` generates an array of objects
- `XLSX.utils.sheet_to_formulae` generates a list of formulae

## Writing Workbooks

For writing, the first step is to generate output data. The helper functions
`write` and `writeFile` will produce the data in various formats suitable for
dissemination. The second step is to actually share the data with the end point.
Assuming `workbook` is a workbook object:

- nodejs write to file:

```js
/* output format determined by filename */
XLSX.writeFile(workbook, 'out.xlsx');
/* at this point, out.xlsx is a file that you can distribute */
```

- browser generate binary blob and "download" to client
  (using [FileSaver.js](https://github.com/eligrey/FileSaver.js/) for download):

```js
/* bookType can be 'xlsx' or 'xlsm' or 'xlsb' or 'ods' */
var wopts = { bookType:'xlsx', bookSST:false, type:'binary' };

var wbout = XLSX.write(workbook,wopts);

function s2ab(s) {
  var buf = new ArrayBuffer(s.length);
  var view = new Uint8Array(buf);
  for (var i=0; i!=s.length; ++i) view[i] = s.charCodeAt(i) & 0xFF;
  return buf;
}

/* the saveAs call downloads a file on the local machine */
saveAs(new Blob([s2ab(wbout)],{type:"application/octet-stream"}), "test.xlsx");
```

**Complete examples:**

- <http://sheetjs.com/demos/writexlsx.html> generates a simple file
- <http://git.io/WEK88Q> writing an array of arrays in nodejs
- <http://sheetjs.com/demos/table.html> exporting an HTML table

## Interface

`XLSX` is the exposed variable in the browser and the exported node variable

`XLSX.version` is the version of the library (added by the build script).

`XLSX.SSF` is an embedded version of the [format library](http://git.io/ssf).

### Parsing functions

`XLSX.read(data, read_opts)` attempts to parse `data`.

`XLSX.readFile(filename, read_opts)` attempts to read `filename` and parse.

Parse options are described in the [Parsing Options](#parsing-options) section.

### Writing functions

`XLSX.write(wb, write_opts)` attempts to write the workbook `wb`

`XLSX.writeFile(wb, filename, write_opts)` attempts to write `wb` to `filename`

`XLSX.writeFileAsync(filename, wb, o, cb)` attempts to write `wb` to `filename`.
If `o` is omitted, the writer will use the third argument as the callback.

Write options are described in the [Writing Options](#writing-options) section.

### Utilities

Utilities are available in the `XLSX.utils` object:

**Importing:**

- `aoa_to_sheet` converts an array of arrays of JS data to a worksheet.

**Exporting:**

- `sheet_to_json` converts a worksheet object to an array of JSON objects.
  `sheet_to_row_object_array` is an alias that will be removed in the future.
- `sheet_to_csv` generates delimiter-separated-values output.
- `sheet_to_formulae` generates a list of the formulae (with value fallbacks).

Exporters are described in the [Utility Functions](#utility-functions) section.

**Cell and cell address manipulation:**

- `format_cell` generates the text value for a cell (using number formats)
- `{en,de}code_{row,col}` convert between 0-indexed rows/cols and A1 forms.
- `{en,de}code_cell` converts cell addresses
- `{en,de}code_range` converts cell ranges

## Workbook / Worksheet / Cell Object Description

js-xlsx conforms to the Common Spreadsheet Format (CSF):

### General Structures

Cell address objects are stored as `{c:C, r:R}` where `C` and `R` are 0-indexed
column and row numbers, respectively. For example, the cell address `B5` is
represented by the object `{c:1, r:4}`.

Cell range objects are stored as `{s:S, e:E}` where `S` is the first cell and
`E` is the last cell in the range. The ranges are inclusive. For example, the
range `A3:B7` is represented by the object `{s:{c:0, r:2}, e:{c:1, r:6}}`. Utils
use the following pattern to walk each of the cells in a range:

```js
for(var R = range.s.r; R <= range.e.r; ++R) {
  for(var C = range.s.c; C <= range.e.c; ++C) {
    var cell_address = {c:C, r:R};
  }
}
```
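
The mapping between address objects and A1-style text can be sketched in plain
JS. The helpers below (`encodeCol`, `encodeCell`, `encodeRange`, `listCells`)
are hypothetical names used only for illustration; they are not the library's
`XLSX.utils.encode_*` functions, although they aim to show the same idea:

```js
/* hypothetical helpers: convert 0-indexed CSF addresses to A1-style text */
function encodeCol(c) {
  var s = "";
  /* columns form a bijective base-26 number: A..Z, AA..AZ, ... */
  for(++c; c > 0; c = Math.floor((c - 1) / 26)) s = String.fromCharCode(((c - 1) % 26) + 65) + s;
  return s;
}
function encodeCell(addr) { return encodeCol(addr.c) + (addr.r + 1); }
function encodeRange(range) { return encodeCell(range.s) + ":" + encodeCell(range.e); }

/* walk an inclusive range row by row and collect every address */
function listCells(range) {
  var out = [];
  for(var R = range.s.r; R <= range.e.r; ++R) {
    for(var C = range.s.c; C <= range.e.c; ++C) {
      out.push(encodeCell({c:C, r:R}));
    }
  }
  return out;
}

var sampleRange = {s:{c:0, r:2}, e:{c:1, r:6}};
console.log(encodeCell({c:1, r:4}));        // B5
console.log(encodeRange(sampleRange));      // A3:B7
console.log(listCells(sampleRange).length); // 10
```

Because the ranges are inclusive, `A3:B7` covers 5 rows and 2 columns.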

### Cell Object

| Key | Description                                                            |
| --- | ---------------------------------------------------------------------- |
| `v` | raw value (see Data Types section for more info)                       |
| `w` | formatted text (if applicable)                                         |
| `t` | cell type: `b` Boolean, `n` Number, `e` error, `s` String, `d` Date    |
| `f` | cell formula encoded as an A1-style string (if applicable)             |
| `F` | range of enclosing array if formula is array formula (if applicable)   |
| `r` | rich text encoding (if applicable)                                     |
| `h` | HTML rendering of the rich text (if applicable)                        |
| `c` | comments associated with the cell                                      |
| `z` | number format string associated with the cell (if requested)           |
| `l` | cell hyperlink object (.Target holds link, .tooltip is tooltip)        |
| `s` | the style/theme of the cell (if applicable)                            |

Built-in export utilities (such as the CSV exporter) will use the `w` text if it
is available. To change a value, be sure to delete `cell.w` (or set it to
`undefined`) before attempting to export. The utilities will regenerate the `w`
text from the number format (`cell.z`) and the raw value if possible.
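
A minimal sketch of that update step, using a plain object as the cell. The
helper name `setCellValue` and the sample cell are assumptions made for
illustration and are not part of the library:

```js
/* sample cell as a parser might produce it: raw value plus cached text */
var sampleCell = { t:'n', v:1234.5, z:'0.00', w:'1234.50' };

/* hypothetical helper: change the raw value and drop the stale cached text */
function setCellValue(cell, value) {
  cell.v = value;
  delete cell.w; // the old `w` text no longer matches `v`
  return cell;
}

setCellValue(sampleCell, 42);
console.log(sampleCell.v);        // 42
console.log('w' in sampleCell);   // false
```

The number format in `z` is left untouched so that an exporter can rebuild the
formatted text from it.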

The actual array formula is stored in the `f` field of the first cell in the
array range. Other cells in the range will omit the `f` field.

#### Data Types

The raw value is stored in the `v` field, interpreted based on the `t` field.

Type `b` is the Boolean type. `v` is interpreted according to JS truth tables.

Type `e` is the Error type. `v` holds the number and `w` holds the common name:

|  Value | Error Meaning   |
| -----: | :-------------- |
| `0x00` | `#NULL!`        |
| `0x07` | `#DIV/0!`       |
| `0x0F` | `#VALUE!`       |
| `0x17` | `#REF!`         |
| `0x1D` | `#NAME?`        |
| `0x24` | `#NUM!`         |
| `0x2A` | `#N/A`          |
| `0x2B` | `#GETTING_DATA` |
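
The table can be expressed as a small lookup when building error cells by hand.
This is only a sketch: `ERROR_NAMES` and `errorCell` are hypothetical names, not
library exports.

```js
/* error codes from the table, keyed by the numeric `v` value */
var ERROR_NAMES = {
  0x00: "#NULL!",
  0x07: "#DIV/0!",
  0x0F: "#VALUE!",
  0x17: "#REF!",
  0x1D: "#NAME?",
  0x24: "#NUM!",
  0x2A: "#N/A",
  0x2B: "#GETTING_DATA"
};

/* hypothetical helper: build a type `e` cell with `v` and the matching `w` */
function errorCell(code) {
  if(!Object.prototype.hasOwnProperty.call(ERROR_NAMES, code)) throw new Error("unknown error code " + code);
  return { t:'e', v:code, w:ERROR_NAMES[code] };
}

console.log(errorCell(0x07).w); // #DIV/0!
```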

Type `n` is the Number type. This includes all forms of data that Excel stores
as numbers, such as dates/times and Boolean fields. Excel exclusively uses data
that can be fit in an IEEE754 floating point number, just like JS Number, so the
`v` field holds the raw number. The `w` field holds formatted text. Dates are
stored as numbers by default and converted with `XLSX.SSF.parse_date_code`.

Type `d` is the Date type, generated only when the option `cellDates` is passed.
Since JSON does not have a natural Date type, parsers are generally expected to
store ISO 8601 Date strings like you would get from `date.toISOString()`. On
the other hand, writers and exporters should be able to handle date strings and
JS Date objects. Note that Excel disregards timezone modifiers and treats all
dates in the local timezone. js-xlsx does not correct for this error.

Type `s` is the String type. `v` should be explicitly stored as a string to
avoid possible confusion.

Type `z` represents blank stub cells. These do not have any data or type, and
are not processed by any of the core library functions. By default these cells
will not be generated; the parser `sheetStubs` option must be set to `true`.

#### Dates

By default, Excel stores dates as numbers with a format code that specifies date
processing. For example, the date `19-Feb-17` is stored as the number `42785`
with a number format of `d-mmm-yy`. The `SSF` module understands number formats
and performs the appropriate conversion.

XLSX also supports a special date type `d` where the data is an ISO 8601 date
string. The formatter converts the date back to a number.

The default behavior for all parsers is to generate number cells. Setting
`cellDates` to true will force the generators to store dates.
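
To make the number-to-date relationship concrete, here is a rough sketch of the
arithmetic. `dateCodeToDate` is a hypothetical helper, not the library's
`XLSX.SSF.parse_date_code`. It works in UTC and assumes a date code of 61 or
more in the 1900 system (so Excel's nonexistent 29-Feb-1900 does not matter);
the second argument mirrors the workbook-level `date1904` flag.

```js
/* hypothetical helper: convert an Excel date code to a JS Date (UTC) */
function dateCodeToDate(v, date1904) {
  /* 1900 system: day 1 is 1900-01-01, but codes >= 61 are off by one because
     of the phantom leap day, so 1899-12-30 acts as the effective day 0.
     1904 system: day 0 is 1904-01-01 */
  var epoch = date1904 ? Date.UTC(1904, 0, 1) : Date.UTC(1899, 11, 30);
  return new Date(epoch + Math.round(v * 86400 * 1000));
}

console.log(dateCodeToDate(42785).toISOString());       // 2017-02-19T00:00:00.000Z
console.log(dateCodeToDate(41323, true).toISOString()); // 2017-02-19T00:00:00.000Z
```

The two epochs differ by 1462 days, which is why the same calendar date has a
different code in a 1904-based workbook.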

### Worksheet Object

Each key that does not start with `!` maps to a cell (using `A-1` notation)

`worksheet[address]` returns the cell object for the specified address.

Special worksheet keys (accessible as `worksheet[key]`, each starting with `!`):

- `ws['!ref']`: A-1 based range representing the worksheet range. Functions that
  work with sheets should use this parameter to determine the range. Cells that
  are assigned outside of the range are not processed. In particular, when
  writing a worksheet by hand, be sure to update the range. For a longer
  discussion, see <http://git.io/KIaNKQ>

  Functions that handle worksheets should test for the presence of `!ref` field.
  If the `!ref` is omitted or is not a valid range, functions are free to treat
  the sheet as empty or attempt to guess the range. The standard utilities that
  ship with this library treat sheets as empty (for example, the CSV output is
  empty string).

  When reading a worksheet with the `sheetRows` property set, the ref parameter
  will use the restricted range. The original range is set at `ws['!fullref']`

- `ws['!cols']`: array of column properties objects. Column widths are actually
  stored in files in a normalized manner, measured in terms of the "Maximum
  Digit Width" (the largest width of the rendered digits 0-9, in pixels). When
  parsed, the column objects store the pixel width in the `wpx` field, character
  width in the `wch` field, and the maximum digit width in the `MDW` field.

- `ws['!merges']`: array of range objects corresponding to the merged cells in
  the worksheet. Plaintext utilities are unaware of merge cells. CSV export
  will write all cells in the merge range if they exist, so be sure that only
  the first cell (upper-left) in the range is set.
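
Since cells assigned outside of `!ref` are not processed, a hand-written cell
needs a matching range update. The following sketch shows one way to do that
with plain objects. `decodeA1`, `encodeA1` and `setCell` are hypothetical
helpers (not library functions) and only handle plain addresses such as `B5`:

```js
/* hypothetical helpers; only plain A1 addresses like "B5" are supported */
function decodeA1(a1) {
  var m = /^([A-Z]+)(\d+)$/.exec(a1);
  if(!m) throw new Error("bad cell address: " + a1);
  var c = 0;
  for(var i = 0; i < m[1].length; ++i) c = 26 * c + m[1].charCodeAt(i) - 64;
  return { c: c - 1, r: parseInt(m[2], 10) - 1 };
}
function encodeA1(addr) {
  var s = "";
  for(var c = addr.c + 1; c > 0; c = Math.floor((c - 1) / 26)) s = String.fromCharCode(((c - 1) % 26) + 65) + s;
  return s + (addr.r + 1);
}

/* assign a cell and grow `!ref` so that the new address is inside the range */
function setCell(ws, a1, cell) {
  var addr = decodeA1(a1);
  var range = { s: { c: addr.c, r: addr.r }, e: { c: addr.c, r: addr.r } };
  if(ws['!ref']) {
    var parts = ws['!ref'].split(":");
    var s = decodeA1(parts[0]), e = decodeA1(parts[parts.length - 1]);
    range.s.c = Math.min(s.c, addr.c); range.s.r = Math.min(s.r, addr.r);
    range.e.c = Math.max(e.c, addr.c); range.e.r = Math.max(e.r, addr.r);
  }
  ws[a1] = cell;
  ws['!ref'] = encodeA1(range.s) + ":" + encodeA1(range.e);
  return ws;
}

var sampleSheet = { '!ref': "A1:A3", A1: {t:'n', v:1}, A2: {t:'n', v:2}, A3: {t:'n', v:3} };
setCell(sampleSheet, "C5", { t:'s', v:"total" });
console.log(sampleSheet['!ref']); // A1:C5
```

A sheet without `!ref` is treated here as empty, so the first assigned cell
defines the whole range.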

### Chartsheet Object

Chartsheets are represented as standard worksheets. They are distinguished with
the `!type` property set to `"chart"`.

The underlying data and `!ref` refer to the cached data in the chartsheet.

### Workbook Object

`workbook.SheetNames` is an ordered list of the sheets in the workbook

`wb.Sheets[sheetname]` returns an object representing the worksheet.

`wb.Props` is an object storing the standard properties. `wb.Custprops` stores
custom properties. Since the XLS standard properties deviate from the XLSX
standard, XLS parsing stores core properties in both places.

`wb.WBProps` includes more workbook-level properties:

- Excel supports two epochs (January 1 1900 and January 1 1904), see
  [1900 vs. 1904 Date System](http://support2.microsoft.com/kb/180162).
  The workbook's epoch can be determined by examining the workbook's
  `wb.WBProps.date1904` property.

### Document Features

Even for basic features like date storage, the official Excel formats store the
same content in different ways. The parsers are expected to convert from the
underlying file format representation to the Common Spreadsheet Format. Writers
are expected to convert from CSF back to the underlying file format.

#### Formulae

The A1-style formula string is stored in the `f` field. Even though different
file formats store the formulae in different ways, the formats are translated.
Even though some formats store formulae with a leading equal sign, CSF formulae
do not start with `=`.

The worksheet representation of A1=1, A2=2, A3=A1+A2:

```js
{
  "!ref": "A1:A3",
  A1: { t:'n', v:1 },
  A2: { t:'n', v:2 },
  A3: { t:'n', v:3, f:'A1+A2' }
}
```

Shared formulae are decompressed and each cell has the formula corresponding to
its cell. Writers generally do not attempt to generate shared formulae.

Cells with formula entries but no value will be serialized in a way that Excel
and other spreadsheet tools will recognize. This library will not automatically
compute formula results! For example, to compute `BESSELJ` in a worksheet:

```js
{
  "!ref": "A1:A3",
  A1: { t:'n', v:3.14159 },
  A2: { t:'n', v:2 },
  A3: { t:'n', f:'BESSELJ(A1,A2)' }
}
```

**Array Formulae**

Array formulae are stored in the top-left cell of the array block. All cells
of an array formula have a `F` field corresponding to the range. A single-cell
formula can be distinguished from a plain formula by the presence of `F` field.

For example, setting the cell `C1` to the array formula `{=SUM(A1:A3*B1:B3)}`:

```js
worksheet['C1'] = { t:'n', f: "SUM(A1:A3*B1:B3)", F:"C1:C1" };
```

For a multi-cell array formula, every cell has the same array range but only the
first cell has content. Consider `D1:D3=A1:A3*B1:B3`:

```js
worksheet['D1'] = { t:'n', F:"D1:D3", f:"A1:A3*B1:B3" };
worksheet['D2'] = { t:'n', F:"D1:D3" };
worksheet['D3'] = { t:'n', F:"D1:D3" };
```

Utilities and writers are expected to check for the presence of a `F` field and
ignore any possible formula element `f` in cells other than the starting cell.
They are not expected to perform validation of the formulae!

**Formula Output**

The `sheet_to_formulae` method generates one line per formula or array formula.
Array formulae are rendered in the form `range=formula` while plain cells are
rendered in the form `cell=formula or value`. Note that string literals are
prefixed with an apostrophe `'`, consistent with Excel's formula bar display.
|
|
|
|
|
|
|
|
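To make the output rules concrete, here is a minimal sketch of the rendering
logic. It is an illustration under stated assumptions, not the library's
implementation: `formulaLines` is a hypothetical name, cells are visited in
object key order, and only the cell fields described above are consulted.

```js
// Hypothetical re-implementation of the described output rules.
function formulaLines(ws) {
  var out = [];
  Object.keys(ws).forEach(function (addr) {
    if (addr.charAt(0) === "!") return;          // skip "!ref", "!cols", ...
    var cell = ws[addr];
    if (cell.F != null) {
      // array formula: only the starting cell (the one with `f`) is rendered
      if (cell.f != null) out.push(cell.F + "=" + cell.f);
      return;
    }
    if (cell.f != null) out.push(addr + "=" + cell.f);          // plain formula
    else if (cell.t === "s") out.push(addr + "='" + cell.v);    // string literal
    else if (cell.v != null) out.push(addr + "=" + cell.v);     // other values
  });
  return out;
}

var lines = formulaLines({
  "!ref": "A1:D3",
  A1: { t: 'n', v: 1 },
  B1: { t: 's', v: "abc" },
  C1: { t: 'n', f: "SUM(A1:A3*B1:B3)", F: "C1:C1" },
  D1: { t: 'n', F: "D1:D3", f: "A1:A3*B1:B3" },
  D2: { t: 'n', F: "D1:D3" },
  A3: { t: 'n', f: "BESSELJ(A1,A2)" }
});
```
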
**Formulae File Format Details**

| Storage Representation | Formats                  | Read  | Write |
|:-----------------------|:-------------------------|:-----:|:-----:|
| A1-style strings       | XLSX                     |  :o:  |  :o:  |
| RC-style strings       | XLML and plaintext       |  :o:  |  :o:  |
| BIFF Parsed formulae   | XLSB and all XLS formats |  :o:  |       |
| OpenFormula formulae   | ODS/FODS/UOS             |  :o:  |  :o:  |

Since Excel prohibits named cells from colliding with names of A1 or RC style
cell references, a (not-so-simple) regex conversion is possible. BIFF Parsed
formulae have to be explicitly unwound. OpenFormula formulae can be converted
with regexes for the most part.

#### Column Properties

Excel internally stores column widths in a nebulous "Max Digit Width" form. The
Max Digit Width is the width of the largest digit when rendered. The internal
width must be an integer multiple of 1/256 of the Max Digit Width (so
`width*256` is integral). ECMA-376 describes a formula for converting between
pixels and the internal width.

Given the constraints, it is possible to determine the MDW without actually
inspecting the font! The parsers guess the pixel width by converting from width
to pixels and back, repeating for all possible MDW and selecting the MDW that
minimizes the error. XLML actually stores the pixel width, so the guess works
in the opposite direction.

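The round-trip guess can be sketched as follows. This is an illustration and
not the library's code: the helper names are hypothetical, the conversions
assume the column width formulas given in ECMA-376 for the `col` element, and
the candidate range of 1 to 15 pixels is an assumption.

```js
// Conversions between internal width, pixels and character counts for a given
// Max Digit Width `mdw` (in pixels), assuming the ECMA-376 formulas.
function widthToPx(width, mdw) {
  return Math.floor(((256 * width + Math.floor(128 / mdw)) / 256) * mdw);
}
function pxToChars(px, mdw) {
  return Math.floor(((px - 5) / mdw) * 100 + 0.5) / 100;
}
function charsToWidth(chars, mdw) {
  return Math.floor(((chars * mdw + 5) / mdw) * 256) / 256;
}

// width -> pixels -> characters -> width
function roundTrip(width, mdw) {
  return charsToWidth(pxToChars(widthToPx(width, mdw), mdw), mdw);
}

// Try every candidate MDW and keep the one whose round trip lands closest to
// the stored width. Ties are resolved in favor of the smallest candidate.
function guessMDW(width) {
  var best = 1, bestErr = Infinity;
  for (var mdw = 1; mdw <= 15; ++mdw) {
    var err = Math.abs(width - roundTrip(width, mdw));
    if (err < bestErr) { bestErr = err; best = mdw; }
  }
  return best;
}

var storedWidth = charsToWidth(10, 7); // width of 10 characters at MDW = 7
var guessed = guessMDW(storedWidth);
```

A single width does not always identify one MDW, since several candidates can
reproduce it exactly; the sketch only shows the error-minimizing search.
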
The `!cols` array in each worksheet, if present, is a collection of `ColInfo`
objects which have the following properties:

```typescript
type ColInfo = {
  MDW?:number;  // Excel's "Max Digit Width" unit, always integral
  width:number; // width in Excel's "Max Digit Width", width*256 is integral
  wpx?:number;  // width in screen pixels
  wch?:number;  // intermediate character calculation
};
```

Even though all of the information is made available, writers are expected to
follow the priority order:

1) use `width` field if available
2) use `wpx` pixel width if available
3) use `wch` character count if available

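The priority order above can be sketched as a small resolver. `resolveWidth`
is a hypothetical helper, not a library function; the conversions assume the
ECMA-376 column width formulas, and `mdw` is a Max Digit Width in pixels that
the caller must supply.

```js
// Assumed ECMA-376 conversions for a Max Digit Width of `mdw` pixels.
function pxToChars(px, mdw) {
  return Math.floor(((px - 5) / mdw) * 100 + 0.5) / 100;
}
function charsToWidth(chars, mdw) {
  return Math.floor(((chars * mdw + 5) / mdw) * 256) / 256;
}

// Pick the internal width for one ColInfo entry, following the priority order.
function resolveWidth(col, mdw) {
  if (col.width != null) return col.width;                                // 1) width
  if (col.wpx != null) return charsToWidth(pxToChars(col.wpx, mdw), mdw); // 2) wpx
  if (col.wch != null) return charsToWidth(col.wch, mdw);                 // 3) wch
  return undefined;  // no usable field; the caller decides on a default
}

var fromWidth = resolveWidth({ width: 12.5, wpx: 75, wch: 3 }, 7);
var fromPixels = resolveWidth({ wpx: 75, wch: 3 }, 7);
var fromChars = resolveWidth({ wch: 10 }, 7);
```
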
## Parsing Options

The exported `read` and `readFile` functions accept an options argument:

| Option Name | Default | Description                                          |
| :---------- | ------: | :--------------------------------------------------- |
| type        |         | Input data encoding (see Input Type below)           |
| cellFormula | true    | Save formulae to the .f field                        |
| cellHTML    | true    | Parse rich text and save HTML to the .h field        |
| cellNF      | false   | Save number format string to the .z field            |
| cellStyles  | false   | Save style/theme info to the .s field                |
| cellDates   | false   | Store dates as type `d` (default is `n`)             |
| sheetStubs  | false   | Create cell objects of type `z` for stub cells       |
| sheetRows   | 0       | If >0, read the first `sheetRows` rows **            |
| bookDeps    | false   | If true, parse calculation chains                    |
| bookFiles   | false   | If true, add raw files to book object **             |
| bookProps   | false   | If true, only parse enough to get book metadata **   |
| bookSheets  | false   | If true, only parse enough to get the sheet names    |
| bookVBA     | false   | If true, expose vbaProject.bin to `vbaraw` field **  |
| password    | ""      | If defined and file is encrypted, use password **    |
| WTF         | false   | If true, throw errors on unexpected file features ** |

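As a usage sketch, the options are passed as a plain object in the second
argument. `readPreview` is a hypothetical wrapper, and the library object is
passed in as a parameter so that the snippet makes no assumption about how the
module is loaded.

```js
// Read only the first `maxRows` rows of each sheet and keep dates as `d` cells.
function readPreview(XLSX, filename, maxRows) {
  var opts = {
    cellDates: true,     // store dates as type `d` instead of `n`
    cellFormula: false,  // do not save formulae to the .f field
    sheetRows: maxRows   // if >0, read only the first `sheetRows` rows
  };
  return XLSX.readFile(filename, opts);
}
```

In Node, this might be called as `readPreview(require('xlsx'), 'test.xlsx', 20)`,
where `test.xlsx` is a placeholder file name.
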
- Even if `cellNF` is false, formatted text will be generated and saved to `.w`
- In some cases, sheets may be parsed even if `bookSheets` is false.
- `bookSheets` and `bookProps` combine to give both sets of information
- `Deps` will be an empty object if `bookDeps` is falsy
- `bookFiles` behavior depends on file type:
    * `keys` array (paths in the ZIP) for ZIP-based formats
    * `files` hash (mapping paths to objects representing the files) for ZIP
    * `cfb` object for formats using CFB containers
- `sheetRows-1` rows will be generated when looking at the JSON object output