# xlsx

Parser and writer for various spreadsheet formats. Pure-JS cleanroom
implementation from official specifications, related documents, and test files.
Emphasis on parsing and writing robustness, cross-format feature compatibility
with a unified JS representation, and ES3/ES5 browser compatibility back to IE6.

File format support for known spreadsheet data formats:

| Format                                                       | Read  | Write |
|:-------------------------------------------------------------|:-----:|:-----:|
| **Excel Worksheet/Workbook Formats**                         |:-----:|:-----:|
| Excel 2007+ XML Formats (XLSX/XLSM)                          |  :o:  |  :o:  |
| Excel 2007+ Binary Format (XLSB BIFF12)                      |  :o:  |  :o:  |
| Excel 2003-2004 XML Format (XML "SpreadsheetML")             |  :o:  |       |
| Excel 97-2004 (XLS BIFF8)                                    |  :o:  |       |
| Excel 5.0/95 (XLS BIFF5)                                     |  :o:  |       |
| Excel 4.0 (XLS/XLW BIFF4)                                    |  :o:  |       |
| Excel 3.0 (XLS BIFF3)                                        |  :o:  |       |
| Excel 2.0/2.1 (XLS BIFF2)                                    |  :o:  |  :o:  |
| **Excel Supported Text Formats**                             |:-----:|:-----:|
| Delimiter-Separated Values (CSV/TSV/DSV)                     |       |  :o:  |
| **Other Workbook/Worksheet Formats**                         |:-----:|:-----:|
| OpenDocument Spreadsheet (ODS)                               |  :o:  |  :o:  |
| Flat XML ODF Spreadsheet (FODS)                              |  :o:  |  :o:  |
| Uniform Office Format Spreadsheet (标文通 UOS1/UOS2)         |  :o:  |       |

Demo: <http://oss.sheetjs.com/js-xlsx>

Source: <http://git.io/xlsx>

Paid support available through the [reinforcements program](http://sheetjs.com/reinforcements).

## Installation

With [npm](https://www.npmjs.org/package/xlsx):

```bash
$ npm install xlsx
```

In the browser:

```html
<script type="text/javascript" src="dist/xlsx.core.min.js"></script>
```

With [bower](http://bower.io/search/?q=js-xlsx):

```bash
$ bower install js-xlsx
```

CDNjs automatically pulls the latest version and makes all versions available at
<http://cdnjs.com/libraries/xlsx>

### Optional Modules

The node version automatically requires modules for additional features. Some
of these modules are rather large in size and are only needed in special
circumstances, so they do not ship with the core. For browser use, they must
be included directly:

```html
<!-- international support from js-codepage -->
<script src="dist/cpexcel.js"></script>
<!-- ODS support -->
<script src="dist/ods.js"></script>
```

An appropriate version for each dependency is included in the dist/ directory.

The complete single-file version is generated at `dist/xlsx.full.min.js`.

### ECMAScript 5 Compatibility

Since xlsx.js uses ES5 functions like `Array#forEach`, older browsers require
[Polyfills](http://git.io/QVh77g). This repo and the gh-pages branch include
[a shim](https://github.com/SheetJS/js-xlsx/blob/master/shim.js).

To use the shim, add the shim before the script tag that loads xlsx.js:

```html
<script type="text/javascript" src="/path/to/shim.js"></script>
```

## Parsing Workbooks

For parsing, the first step is to read the file. This involves acquiring the
data and feeding it into the library. Here are a few common scenarios:

- node readFile:

```js
if(typeof require !== 'undefined') XLSX = require('xlsx');
var workbook = XLSX.readFile('test.xlsx');
/* DO SOMETHING WITH workbook HERE */
```

- ajax (for a more complete example that works in older browsers, check the demo
  at <http://oss.sheetjs.com/js-xlsx/ajax.html>):

```js
/* set up XMLHttpRequest */
var url = "test_files/formula_stress_test_ajax.xlsx";
var oReq = new XMLHttpRequest();
oReq.open("GET", url, true);
oReq.responseType = "arraybuffer";

oReq.onload = function(e) {
  var arraybuffer = oReq.response;

  /* convert data to binary string */
  var data = new Uint8Array(arraybuffer);
  var arr = new Array();
  for(var i = 0; i != data.length; ++i) arr[i] = String.fromCharCode(data[i]);
  var bstr = arr.join("");

  /* Call XLSX */
  var workbook = XLSX.read(bstr, {type:"binary"});

  /* DO SOMETHING WITH workbook HERE */
};

oReq.send();
```

- HTML5 drag-and-drop using readAsBinaryString or readAsArrayBuffer:
  Note: readAsBinaryString and readAsArrayBuffer may not be available in every
  browser. Use dynamic feature tests to determine which method to use.

```js
/* processing array buffers, only required for readAsArrayBuffer */
function fixdata(data) {
  var o = "", l = 0, w = 10240;
  for(; l<data.byteLength/w; ++l) o+=String.fromCharCode.apply(null,new Uint8Array(data.slice(l*w,l*w+w)));
  o+=String.fromCharCode.apply(null, new Uint8Array(data.slice(l*w)));
  return o;
}

var rABS = true; // true: readAsBinaryString ; false: readAsArrayBuffer
/* set up drag-and-drop event */
function handleDrop(e) {
  e.stopPropagation();
  e.preventDefault();
  var files = e.dataTransfer.files;
  var i,f;
  for (i = 0; i != files.length; ++i) {
    f = files[i];
    var reader = new FileReader();
    var name = f.name;
    reader.onload = function(e) {
      var data = e.target.result;

      var workbook;
      if(rABS) {
        /* if binary string, read with type 'binary' */
        workbook = XLSX.read(data, {type: 'binary'});
      } else {
        /* if array buffer, convert to base64 */
        var arr = fixdata(data);
        workbook = XLSX.read(btoa(arr), {type: 'base64'});
      }

      /* DO SOMETHING WITH workbook HERE */
    };
    if(rABS) reader.readAsBinaryString(f);
    else reader.readAsArrayBuffer(f);
  }
}
drop_dom_element.addEventListener('drop', handleDrop, false);
```

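The note above recommends dynamic feature tests. A minimal sketch of one such test follows. `detectRABS` is a hypothetical helper name (it is not part of the library), and it receives the `FileReader` constructor as a parameter so that it also runs where `FileReader` does not exist, such as node:

```js
/* hypothetical helper: decide which FileReader method is usable.
   returns true for readAsBinaryString, false for readAsArrayBuffer,
   and null when neither method is available */
function detectRABS(FR) {
  if(typeof FR === 'undefined' || !FR || !FR.prototype) return null;
  if(typeof FR.prototype.readAsBinaryString === 'function') return true;
  if(typeof FR.prototype.readAsArrayBuffer === 'function') return false;
  return null;
}

/* in a browser this passes the real constructor; elsewhere it passes undefined */
var rABS_detected = detectRABS(typeof FileReader !== 'undefined' ? FileReader : undefined);
```

When the result is not `null`, it could be used in place of the hard-coded `rABS` value in the drag-and-drop example.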
- HTML5 input file element using readAsBinaryString or readAsArrayBuffer:

```js
/* fixdata and rABS are defined in the drag and drop example */
function handleFile(e) {
  var files = e.target.files;
  var i,f;
  for (i = 0; i != files.length; ++i) {
    f = files[i];
    var reader = new FileReader();
    var name = f.name;
    reader.onload = function(e) {
      var data = e.target.result;

      var workbook;
      if(rABS) {
        /* if binary string, read with type 'binary' */
        workbook = XLSX.read(data, {type: 'binary'});
      } else {
        /* if array buffer, convert to base64 */
        var arr = fixdata(data);
        workbook = XLSX.read(btoa(arr), {type: 'base64'});
      }

      /* DO SOMETHING WITH workbook HERE */
    };
    /* the read method must match the branch taken in onload */
    if(rABS) reader.readAsBinaryString(f);
    else reader.readAsArrayBuffer(f);
  }
}
input_dom_element.addEventListener('change', handleFile, false);
```

## Working with the Workbook

The full object format is described later in this README.

This example extracts the value stored in cell A1 from the first worksheet:

```js
var first_sheet_name = workbook.SheetNames[0];
var address_of_cell = 'A1';

/* Get worksheet */
var worksheet = workbook.Sheets[first_sheet_name];

/* Find desired cell */
var desired_cell = worksheet[address_of_cell];

/* Get the value */
var desired_value = desired_cell.v;
```

This example iterates through every nonempty cell of every sheet and dumps values:

```js
var sheet_name_list = workbook.SheetNames;
sheet_name_list.forEach(function(y) { /* iterate through sheets */
  var worksheet = workbook.Sheets[y];
  for (var z in worksheet) {
    /* all keys that do not begin with "!" correspond to cell addresses */
    if(z[0] === '!') continue;
    console.log(y + "!" + z + "=" + JSON.stringify(worksheet[z].v));
  }
});
```

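The first example above assumes the cell exists. An address with no data has no entry in the worksheet object by default, so `worksheet[address]` is `undefined` and reading `.v` throws. A small sketch of a guarded lookup follows; `get_cell_value` is a hypothetical helper and `sample_workbook` is a hand-built stand-in for a parsed file:

```js
/* hypothetical helper: return the raw value of a cell, or undefined if absent */
function get_cell_value(workbook, sheet_name, address) {
  var worksheet = workbook.Sheets[sheet_name];
  if(!worksheet) return undefined;
  var cell = worksheet[address];
  return cell ? cell.v : undefined;
}

/* hand-built stand-in for a parsed workbook */
var sample_workbook = {
  SheetNames: ['Sheet1'],
  Sheets: { Sheet1: { '!ref': 'A1:A1', A1: { t: 's', v: 'hello' } } }
};

var a1 = get_cell_value(sample_workbook, sample_workbook.SheetNames[0], 'A1');
var z9 = get_cell_value(sample_workbook, sample_workbook.SheetNames[0], 'Z9');
```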
Complete examples:

- <http://oss.sheetjs.com/js-xlsx/> HTML5 File API / Base64 Text / Web Workers

Note that older versions of IE do not support HTML5 File API, so the base64 mode
is used for testing. On OS X you can get the base64 encoding with:

```bash
$ <target_file base64 | pbcopy
```

On Windows XP and up you can get the base64 encoding using `certutil`:

```cmd
> certutil -encode target_file target_file.b64
```

(Note: you have to open the file and remove the header and footer lines.)

- <http://oss.sheetjs.com/js-xlsx/ajax.html> XMLHttpRequest

- <https://github.com/SheetJS/js-xlsx/blob/master/bin/xlsx.njs> node

The node version installs a command line tool `xlsx` which can read spreadsheet
files and output the contents in various formats. The source is available at
`xlsx.njs` in the bin directory.

Some helper functions in `XLSX.utils` generate different views of the sheets:

- `XLSX.utils.sheet_to_csv` generates CSV
- `XLSX.utils.sheet_to_json` generates an array of objects
- `XLSX.utils.sheet_to_formulae` generates a list of formulae

## Writing Workbooks

For writing, the first step is to generate output data. The helper functions
`write` and `writeFile` will produce the data in various formats suitable for
dissemination. The second step is to actually share the data with the end point.
Assuming `workbook` is a workbook object:

- nodejs write to file:

```js
/* output format determined by filename */
XLSX.writeFile(workbook, 'out.xlsx');
/* at this point, out.xlsx is a file that you can distribute */
```

- browser generate binary blob and "download" to client
  (using [FileSaver.js](https://github.com/eligrey/FileSaver.js/) for download):

```js
/* bookType can be 'xlsx' or 'xlsm' or 'xlsb' or 'ods' */
var wopts = { bookType:'xlsx', bookSST:false, type:'binary' };

var wbout = XLSX.write(workbook,wopts);

/* convert the binary string to an ArrayBuffer, one byte per character */
function s2ab(s) {
  var buf = new ArrayBuffer(s.length);
  var view = new Uint8Array(buf);
  for (var i=0; i!=s.length; ++i) view[i] = s.charCodeAt(i) & 0xFF;
  return buf;
}

/* the saveAs call downloads a file on the local machine */
saveAs(new Blob([s2ab(wbout)],{type:"application/octet-stream"}), "test.xlsx");
```

Complete examples:

- <http://sheetjs.com/demos/writexlsx.html> generates a simple file
- <http://git.io/WEK88Q> writing an array of arrays in nodejs
- <http://sheetjs.com/demos/table.html> exporting an HTML table

## Interface

`XLSX` is the exposed variable in the browser and the exported node variable.

`XLSX.version` is the version of the library (added by the build script).

`XLSX.SSF` is an embedded version of the [format library](http://git.io/ssf).

### Parsing functions

`XLSX.read(data, read_opts)` attempts to parse `data`.

`XLSX.readFile(filename, read_opts)` attempts to read `filename` and parse.

### Writing functions

`XLSX.write(wb, write_opts)` attempts to write the workbook `wb`.

`XLSX.writeFile(wb, filename, write_opts)` attempts to write `wb` to `filename`.

### Utilities

Utilities are available in the `XLSX.utils` object:

Exporting:

- `sheet_to_json` converts a worksheet object to an array of JSON objects.
  `sheet_to_row_object_array` is an alias that will be removed in the future.
- `sheet_to_csv` generates delimiter-separated-values output.
- `sheet_to_formulae` generates a list of the formulae (with value fallbacks).

The `sheet_to_*` functions accept a worksheet and an optional options object.

Cell and cell address manipulation:

- `format_cell` generates the text value for a cell (using number formats)
- `{en,de}code_{row,col}` convert between 0-indexed rows/cols and A1 forms.
- `{en,de}code_cell` converts cell addresses
- `{en,de}code_range` converts cell ranges

## Workbook / Worksheet / Cell Object Description

js-xlsx conforms to the Common Spreadsheet Format (CSF):

### General Structures

Cell address objects are stored as `{c:C, r:R}` where `C` and `R` are 0-indexed
column and row numbers, respectively. For example, the cell address `B5` is
represented by the object `{c:1, r:4}`.

Cell range objects are stored as `{s:S, e:E}` where `S` is the first cell and
`E` is the last cell in the range. The ranges are inclusive. For example, the
range `A3:B7` is represented by the object `{s:{c:0, r:2}, e:{c:1, r:6}}`. Utils
use the following pattern to walk each of the cells in a range:

```js
for(var R = range.s.r; R <= range.e.r; ++R) {
  for(var C = range.s.c; C <= range.e.c; ++C) {
    var cell_address = {c:C, r:R};
  }
}
```

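As an illustration of the address objects, here is a sketch that converts between A1-style strings and `{c, r}` objects and then walks the range `A3:B7`. The two helpers are hand-rolled and hypothetical, written only for this example, and they assume plain addresses such as `B5` with no `$` markers. In practice the `{en,de}code_cell` utilities in `XLSX.utils` serve this purpose.

```js
/* hypothetical helper: "B5" -> {c:1, r:4} */
function a1_to_address(a1) {
  var m = /^([A-Z]+)([0-9]+)$/.exec(a1);
  if(!m) throw new Error('unsupported address: ' + a1);
  var c = 0;
  for(var i = 0; i < m[1].length; ++i) c = 26 * c + (m[1].charCodeAt(i) - 64);
  return { c: c - 1, r: parseInt(m[2], 10) - 1 };
}

/* hypothetical helper: {c:1, r:4} -> "B5" */
function address_to_a1(addr) {
  var col = '';
  for(var n = addr.c + 1; n > 0; n = Math.floor((n - 1) / 26)) {
    col = String.fromCharCode(((n - 1) % 26) + 65) + col;
  }
  return col + (addr.r + 1);
}

/* walk the inclusive range A3:B7 using the pattern shown above */
var range = { s: a1_to_address('A3'), e: a1_to_address('B7') };
var visited = [];
for(var R = range.s.r; R <= range.e.r; ++R) {
  for(var C = range.s.c; C <= range.e.c; ++C) {
    visited.push(address_to_a1({c:C, r:R}));
  }
}
```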
### Cell Object

| Key | Description                                                            |
| --- | ---------------------------------------------------------------------- |
| `v` | raw value (see Data Types section for more info)                       |
| `w` | formatted text (if applicable)                                         |
| `t` | cell type: `b` Boolean, `n` Number, `e` error, `s` String, `d` Date    |
| `f` | cell formula encoded as an A1-style string (if applicable)             |
| `F` | range of enclosing array if formula is array formula (if applicable)   |
| `r` | rich text encoding (if applicable)                                     |
| `h` | HTML rendering of the rich text (if applicable)                        |
| `c` | comments associated with the cell                                      |
| `z` | number format string associated with the cell (if requested)           |
| `l` | cell hyperlink object (.Target holds link, .tooltip is tooltip)        |
| `s` | the style/theme of the cell (if applicable)                            |

Built-in export utilities (such as the CSV exporter) will use the `w` text if it
is available. To change a value, be sure to delete `cell.w` (or set it to
`undefined`) before attempting to export. The utilities will regenerate the `w`
text from the number format (`cell.z`) and the raw value if possible.

The actual array formula is stored in the `f` field of the first cell in the
array range. Other cells in the range will omit the `f` field.

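To illustrate the advice about stale `w` text, the following sketch changes a cell's raw value and removes the cached formatted text in one step. `set_cell_value` is a hypothetical helper, and the sample cell is hand-built:

```js
/* hypothetical helper: change a cell's raw value and drop the stale text */
function set_cell_value(cell, value) {
  cell.v = value;
  delete cell.w; /* export utilities can regenerate w from cell.z and cell.v */
  return cell;
}

/* hand-built number cell with a percentage number format */
var cell = { t: 'n', v: 0.5, w: '50%', z: '0%' };
set_cell_value(cell, 0.75);
```

After the call, the cell keeps its type and number format but no longer carries the old `50%` text.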
### Data Types

The raw value is stored in the `v` field, interpreted based on the `t` field.

Type `b` is the Boolean type. `v` is interpreted according to JS truth tables.

Type `e` is the Error type. `v` holds the number and `w` holds the common name:

| Value | Error Meaning  |
| ----: | :------------- |
|  0x00 | #NULL!         |
|  0x07 | #DIV/0!        |
|  0x0F | #VALUE!        |
|  0x17 | #REF!          |
|  0x1D | #NAME?         |
|  0x24 | #NUM!          |
|  0x2A | #N/A           |
|  0x2B | #GETTING\_DATA |

Type `n` is the Number type. This includes all forms of data that Excel stores
as numbers, such as dates/times and Boolean fields. Excel exclusively uses data
that can be fit in an IEEE754 floating point number, just like JS Number, so the
`v` field holds the raw number. The `w` field holds formatted text.

Type `d` is the Date type, generated only when the option `cellDates` is passed.
Since JSON does not have a natural Date type, parsers are generally expected to
store ISO 8601 Date strings like you would get from `date.toISOString()`. On
the other hand, writers and exporters should be able to handle date strings and
JS Date objects. Note that Excel disregards timezone modifiers and treats all
dates in the local timezone. js-xlsx does not correct for this error.

Type `s` is the String type. `v` should be explicitly stored as a string to
avoid possible confusion.

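The error codes in the table above can be kept in a small lookup when building error cells by hand. This is only a sketch: `error_names` copies the table, and `make_error_cell` is a hypothetical helper, not a library function.

```js
/* error codes and names copied from the table in this section */
var error_names = {
  0x00: '#NULL!', 0x07: '#DIV/0!', 0x0F: '#VALUE!', 0x17: '#REF!',
  0x1D: '#NAME?', 0x24: '#NUM!', 0x2A: '#N/A', 0x2B: '#GETTING_DATA'
};

/* hypothetical helper: build a cell object of type `e` from a numeric code */
function make_error_cell(code) {
  if(!Object.prototype.hasOwnProperty.call(error_names, code)) {
    throw new Error('unknown error code: ' + code);
  }
  return { t: 'e', v: code, w: error_names[code] };
}

var div0 = make_error_cell(0x07);
```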
### Formulae

The A1-style formula string is stored in the `f` field. Even though different
file formats store the formulae in different ways, the formats are converted.

Shared formulae are decompressed and each cell has the correct formula.

Array formulae are stored in the top-left cell of the array block. All cells
of an array formula have an `F` field corresponding to the range. A single-cell
array formula can be distinguished from a plain formula by the presence of the
`F` field.

The `sheet_to_formulae` method generates one line per formula or array formula.
Array formulae are rendered in the form `range=formula` while plain cells are
rendered in the form `cell=formula or value`.

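The output shape described above can be sketched with a hand-rolled function. This is an illustration of the `range=formula` and `cell=formula or value` forms only, not the library's `sheet_to_formulae` implementation, and the real utility may render values differently. `formulae_lines` is a hypothetical name, and the sketch assumes `F` holds an A1-style range string:

```js
/* hypothetical sketch: one line per plain cell or array formula */
function formulae_lines(ws) {
  var lines = [], seen = {};
  Object.keys(ws).forEach(function(addr) {
    if(addr.charAt(0) === '!') return; /* skip special keys such as !ref */
    var cell = ws[addr];
    if(cell.F) {
      /* array formula: only the first cell of the range carries f */
      if(cell.f && !seen[cell.F]) {
        seen[cell.F] = true;
        lines.push(cell.F + '=' + cell.f);
      }
      return;
    }
    lines.push(addr + '=' + (cell.f !== undefined ? cell.f : cell.v));
  });
  return lines;
}

/* hand-built worksheet: three plain cells and one two-cell array formula */
var ws = {
  '!ref': 'A1:C2',
  A1: { t: 'n', v: 1 },
  B1: { t: 'n', v: 2 },
  C1: { t: 'n', v: 3, f: 'A1+B1' },
  A2: { t: 'n', v: 2, f: 'A1:B1*2', F: 'A2:B2' },
  B2: { t: 'n', v: 4, F: 'A2:B2' }
};
var lines = formulae_lines(ws);
```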
### Worksheet Object

Each key that does not start with `!` maps to a cell (using A1-style notation).

`worksheet[address]` returns the cell object for the specified address.

Special worksheet keys (accessible as `worksheet[key]`, each starting with `!`):

- `ws['!ref']`: A1-style range representing the worksheet range. Functions that
  work with sheets should use this parameter to determine the range. Cells that
  are assigned outside of the range are not processed. In particular, when
  writing a worksheet by hand, be sure to update the range. For a longer
  discussion, see <http://git.io/KIaNKQ>

  Functions that handle worksheets should test for the presence of the `!ref`
  field. If the `!ref` is omitted or is not a valid range, functions are free to
  treat the sheet as empty or attempt to guess the range. The standard utilities
  that ship with this library treat sheets as empty (for example, the CSV output
  is the empty string).

  When reading a worksheet with the `sheetRows` property set, the ref parameter
  will use the restricted range. The original range is set at `ws['!fullref']`.

- `ws['!cols']`: array of column properties objects. Column widths are actually
  stored in files in a normalized manner, measured in terms of the "Maximum
  Digit Width" (the largest width of the rendered digits 0-9, in pixels). When
  parsed, the column objects store the pixel width in the `wpx` field, character
  width in the `wch` field, and the maximum digit width in the `MDW` field.

- `ws['!merges']`: array of range objects corresponding to the merged cells in
  the worksheet. Plaintext utilities are unaware of merge cells. CSV export
  will write all cells in the merge range if they exist, so be sure that only
  the first cell (upper-left) in the range is set.

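Because cells outside `!ref` are not processed, a worksheet edited by hand needs its range refreshed. The sketch below recomputes `!ref` from the cell keys that are present. `col_name` and `update_ref` are hypothetical helpers written for this example; they assume plain A1-style keys and are not part of the library:

```js
/* hypothetical helper: 0-indexed column number to letters (0 -> A, 26 -> AA) */
function col_name(c) {
  var s = '';
  for(var n = c + 1; n > 0; n = Math.floor((n - 1) / 26)) {
    s = String.fromCharCode(((n - 1) % 26) + 65) + s;
  }
  return s;
}

/* hypothetical helper: recompute ws['!ref'] from the cell keys present */
function update_ref(ws) {
  var min_c = Infinity, min_r = Infinity, max_c = -1, max_r = -1;
  Object.keys(ws).forEach(function(key) {
    var m = /^([A-Z]+)([0-9]+)$/.exec(key);
    if(!m) return; /* skips special keys such as !ref, !cols, !merges */
    var c = 0;
    for(var i = 0; i < m[1].length; ++i) c = 26 * c + (m[1].charCodeAt(i) - 64);
    c -= 1;
    var r = parseInt(m[2], 10) - 1;
    if(c < min_c) min_c = c;
    if(c > max_c) max_c = c;
    if(r < min_r) min_r = r;
    if(r > max_r) max_r = r;
  });
  /* no cells at all: drop the range so utilities treat the sheet as empty */
  if(max_c < 0) { delete ws['!ref']; return ws; }
  ws['!ref'] = col_name(min_c) + (min_r + 1) + ':' + col_name(max_c) + (max_r + 1);
  return ws;
}

/* add a cell outside the old range, then refresh the range */
var sheet = { '!ref': 'A1:B2', A1: { t: 's', v: 'a' }, B2: { t: 'n', v: 1 } };
sheet.D5 = { t: 's', v: 'new' };
update_ref(sheet);
```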
### Workbook Object

`workbook.SheetNames` is an ordered list of the sheets in the workbook.

`wb.Sheets[sheetname]` returns an object representing the worksheet.

`wb.Props` is an object storing the standard properties. `wb.Custprops` stores
custom properties. Since the XLS standard properties deviate from the XLSX
standard, XLS parsing stores core properties in both places.

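Since a workbook is a plain object, a minimal one can be assembled by hand from the structures described in this section. The sketch below builds a single-sheet workbook from an array of arrays. `aoa_to_workbook` and `col_letters` are hypothetical helpers for this example, and only numbers, booleans, and strings are handled:

```js
/* hypothetical helper: 0-indexed column number to letters (0 -> A, 26 -> AA) */
function col_letters(c) {
  var s = '';
  for(var n = c + 1; n > 0; n = Math.floor((n - 1) / 26)) {
    s = String.fromCharCode(((n - 1) % 26) + 65) + s;
  }
  return s;
}

/* hypothetical helper: build { SheetNames, Sheets } from an array of arrays */
function aoa_to_workbook(name, rows) {
  var ws = {}, max_c = 0;
  rows.forEach(function(row, R) {
    row.forEach(function(value, C) {
      if(value === null || value === undefined) return; /* leave empty cells out */
      var t = typeof value === 'number' ? 'n' : typeof value === 'boolean' ? 'b' : 's';
      ws[col_letters(C) + (R + 1)] = { t: t, v: t === 's' ? String(value) : value };
      if(C > max_c) max_c = C;
    });
  });
  if(rows.length > 0) ws['!ref'] = 'A1:' + col_letters(max_c) + rows.length;
  var wb = { SheetNames: [name], Sheets: {} };
  wb.Sheets[name] = ws;
  return wb;
}

var wb = aoa_to_workbook('Data', [['name', 'qty'], ['apple', 3], ['pear', true]]);
```

The resulting object has the `SheetNames` and `Sheets` fields described above; whether a given writer accepts it unchanged is an assumption worth testing against the library version in use.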
## Parsing Options

The exported `read` and `readFile` functions accept an options argument:

| Option Name | Default | Description                                          |
| :---------- | ------: | :--------------------------------------------------- |
| type        |         | Input data encoding (see Input Type below)           |
| cellFormula | true    | Save formulae to the .f field **                     |
| cellHTML    | true    | Parse rich text and save HTML to the .h field        |
| cellNF      | false   | Save number format string to the .z field            |
| cellStyles  | false   | Save style/theme info to the .s field                |
| cellDates   | false   | Store dates as type `d` (default is `n`) **          |
| sheetStubs  | false   | Create cell objects for stub cells                   |
| sheetRows   | 0       | If >0, read the first `sheetRows` rows **            |
| bookDeps    | false   | If true, parse calculation chains                    |
| bookFiles   | false   | If true, add raw files to book object **             |
| bookProps   | false   | If true, only parse enough to get book metadata **   |
| bookSheets  | false   | If true, only parse enough to get the sheet names    |
| bookVBA     | false   | If true, expose vbaProject.bin to `vbaraw` field **  |
| password    | ""      | If defined and file is encrypted, use password **    |

- `cellFormula` option only applies to formats that require extra processing to
  parse formulae (XLS/XLSB).
- Even if `cellNF` is false, formatted text will be generated and saved to `.w`
- In some cases, sheets may be parsed even if `bookSheets` is false.
- `bookSheets` and `bookProps` combine to give both sets of information
- `Deps` will be an empty object if `bookDeps` is falsy
- `bookFiles` behavior depends on file type:
    * `keys` array (paths in the ZIP) for ZIP-based formats
    * `files` hash (mapping paths to objects representing the files) for ZIP
    * `cfb` object for formats using CFB containers
- `sheetRows-1` rows will be generated when looking at the JSON object output
  (since the header row is counted as a row when parsing the data)
- `bookVBA` merely exposes the raw vba object. It does not parse the data.
- `cellDates` currently does not convert numerical dates to JS dates.
- Currently only XOR encryption is supported. An unsupported-encryption error is
  thrown for files employing other encryption methods.

The defaults are enumerated in bits/84\_defaults.js

### Input Type

Strings can be interpreted in multiple ways. The `type` parameter for `read`
tells the library how to parse the data argument:

| `type`     | expected input                                                  |
|------------|-----------------------------------------------------------------|
| `"base64"` | string: base64 encoding of the file                             |
| `"binary"` | string: binary string (`n`-th byte is `data.charCodeAt(n)`)     |
| `"buffer"` | nodejs Buffer                                                   |
| `"array"`  | array: array of 8-bit unsigned int (`n`-th byte is `data[n]`)   |
| `"file"`   | string: filename that will be read and processed (nodejs only)  |

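To make the input types concrete, the node sketch below expresses the same four bytes in each in-memory form from the table. The byte values are just the start of a ZIP archive chosen for illustration, and the variable names are arbitrary:

```js
/* node sketch: one set of bytes expressed as each in-memory input type */
var buf = Buffer.from([0x50, 0x4B, 0x03, 0x04]); /* for type: "buffer" */

var as_base64 = buf.toString('base64');          /* for type: "base64" */
var as_binary = buf.toString('binary');          /* for type: "binary" */
var as_array  = Array.prototype.slice.call(buf); /* for type: "array"  */

/* each form would be passed with the matching option,
   e.g. XLSX.read(as_base64, {type: 'base64'}) */
```

The `"binary"` form is a string in which each character code is one byte, which is why `data.charCodeAt(n)` recovers the `n`-th byte.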
### Guessing File Type

Excel and other spreadsheet tools read the first few bytes and apply other
heuristics to determine a file type. This enables file type punning: renaming
files with the `.xls` extension will tell your computer to use Excel to open the
file but Excel will know how to handle it. This library applies similar logic:

| Byte 0 | Raw File Type | Spreadsheet Types                                   |
|:-------|:--------------|:----------------------------------------------------|
| `0xD0` | CFB Container | BIFF 5/8 or password-protected XLSX/XLSB            |
| `0x09` | BIFF Stream   | BIFF 2/3/4/5                                        |
| `0x3C` | XML           | SpreadsheetML or Flat ODS or UOS1                   |
| `0x50` | ZIP Archive   | XLSB or XLSX/M or ODS or UOS2                       |

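The first-byte table above can be mirrored in a few lines. This is a sketch of the lookup only: `sniff_container` is a hypothetical helper, and the library applies further heuristics beyond byte 0 to pick the actual spreadsheet type.

```js
/* hypothetical helper mirroring the first-byte table in this section */
function sniff_container(first_byte) {
  switch(first_byte) {
    case 0xD0: return 'CFB Container';
    case 0x09: return 'BIFF Stream';
    case 0x3C: return 'XML';
    case 0x50: return 'ZIP Archive';
    default: return 'unknown';
  }
}

var zip_guess = sniff_container('PK'.charCodeAt(0)); /* ZIP files begin with "PK" */
var xml_guess = sniff_container('<'.charCodeAt(0));  /* XML files begin with "<" */
```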
## Writing Options

The exported `write` and `writeFile` functions accept an options argument:

| Option Name |  Default | Description                                         |
| :---------- | -------: | :-------------------------------------------------- |
| type        |          | Output data encoding (see Output Type below)        |
| cellDates   |  `false` | Store dates as type `d` (default is `n`)            |
| bookSST     |  `false` | Generate Shared String Table **                     |
| bookType    | `"xlsx"` | Type of Workbook (see below for supported formats)  |
| sheet       |     `""` | Name of Worksheet for single-sheet formats **       |
| compression |  `false` | Use ZIP compression for ZIP-based formats **        |

- `bookSST` is slower and more memory intensive, but has better compatibility
  with older versions of iOS Numbers
- The raw data is the only thing guaranteed to be saved. Formulae, formatting,
  and other niceties may not be serialized (pending CSF standardization)
- `cellDates` only applies to XLSX output and is not guaranteed to work with
  third-party readers. Excel itself does not usually write cells with type `d`
  so non-Excel tools may ignore the data or blow up in the presence of dates.

### Supported Output Formats

For broad compatibility with third-party tools, this library supports many
output formats. The specific file type is controlled with the `bookType` option:

| bookType | file ext | container | sheets | Description                       |
| :------- | -------: | :-------: | :----- |:--------------------------------- |
| `xlsx`   | `.xlsx`  |    ZIP    | multi  | Excel 2007+ XML Format            |
| `xlsm`   | `.xlsm`  |    ZIP    | multi  | Excel 2007+ Macro XML Format      |
| `xlsb`   | `.xlsb`  |    ZIP    | multi  | Excel 2007+ Binary Format         |
| `ods`    | `.ods`   |    ZIP    | multi  | OpenDocument Spreadsheet          |
| `biff2`  | `.xls`   |   none    | single | Excel 2.0 Worksheet format        |
| `fods`   | `.fods`  |   none    | multi  | Flat OpenDocument Spreadsheet     |
| `csv`    | `.csv`   |   none    | single | Comma Separated Values            |

- `compression` only applies to formats with ZIP containers.
- Formats that only support a single sheet require a `sheet` option specifying
  the worksheet. If the string is empty, the first worksheet is used.

### Output Type
|
|
|
|
|
|
|
|
The `type` argument for `write` mirrors the `type` argument for `read`:
|
|
|
|
|
|
|
|
| `type` | output |
|
|
|
|
|------------|-----------------------------------------------------------------|
|
|
|
|
| `"base64"` | string: base64 encoding of the file |
|
|
|
|
| `"binary"` | string: binary string (`n`-th byte is `data.charCodeAt(n)`) |
|
|
|
|
| `"buffer"` | nodejs Buffer |
|
|
|
|
| `"file"` | string: name of file to be written (nodejs only) |
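
The `"binary"` row describes a string in which each character carries one byte. A minimal sketch of converting such a string to and from a byte array follows; the function names are hypothetical and not part of the library.

```javascript
// Convert a "binary" string (one character per byte) into a Uint8Array.
// The n-th byte is data.charCodeAt(n), masked to 8 bits for safety.
function binaryStringToBytes(data) {
  var bytes = new Uint8Array(data.length);
  for (var n = 0; n < data.length; ++n) {
    bytes[n] = data.charCodeAt(n) & 0xFF;
  }
  return bytes;
}

// Inverse direction: turn a byte array back into a binary string.
function bytesToBinaryString(bytes) {
  var out = "";
  for (var n = 0; n < bytes.length; ++n) {
    out += String.fromCharCode(bytes[n]);
  }
  return out;
}
```

A byte array is usually the more convenient form in browsers, for instance when building a `Blob` for download.
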

## File Formats

Despite the library name `xlsx`, it supports numerous non-XLSX file formats:

### Excel 2.0-95 (BIFF2/BIFF3/BIFF4/BIFF5)

BIFF2/3 XLS files are single-sheet streams of binary records. Excel 4 introduced
the concept of a workbook (`XLW` files) but also had a single-sheet `XLS` format.
The structure is largely similar to the Lotus 1-2-3 file formats. BIFF5/8/12
extended the format in various ways but largely stuck to the same record format.

There is no official specification for any of these formats. Excel 95 can write
files in these formats, so record lengths and fields were backsolved by writing
in all of the supported formats and comparing files. Excel 2016 can generate
BIFF5 files, enabling a full suite of file tests starting from XLSX or BIFF2.

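
To make the "stream of binary records" idea concrete, here is a minimal sketch of a record walker. It assumes the fixed 4-byte header used by BIFF2 through BIFF8 (a 2-byte little-endian record type followed by a 2-byte little-endian payload length); BIFF12 encodes its header differently and is not covered. `readRecords` is a hypothetical name, not the library's parser, and the sample bytes are synthetic.

```javascript
// Walk a stream of BIFF-style records.
// Assumed layout: [type: u16 LE][length: u16 LE][payload: `length` bytes]
function readRecords(bytes) {
  var view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  var records = [];
  var pos = 0;
  // Stop when fewer than 4 bytes remain (no room for another header).
  while (pos + 4 <= bytes.length) {
    var type = view.getUint16(pos, true);
    var length = view.getUint16(pos + 2, true);
    pos += 4;
    if (pos + length > bytes.length) {
      throw new Error("truncated record at offset " + (pos - 4));
    }
    records.push({ type: type, payload: bytes.subarray(pos, pos + length) });
    pos += length;
  }
  return records;
}

// Synthetic stream: one record with a 4-byte payload, then one with none.
// The type values are illustrative only.
var sample = new Uint8Array([
  0x09, 0x00, 0x04, 0x00, 0x02, 0x00, 0x10, 0x00,
  0x0A, 0x00, 0x00, 0x00
]);
var records = readRecords(sample);
```

Comparing record lists produced from the same sheet saved in two formats is one way to carry out the file-comparison approach described above.
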

### Excel 97-2004 Binary (BIFF8)

BIFF8 exclusively uses the Compound File Binary container format, splitting some
content into streams within the file. At its core, it still uses an extended
version of the binary record format from older versions of BIFF.

The `MS-XLS` specification covers the basics of the file format, and other
specifications expand on serialization of features like properties.


### Excel 2003-2004 (SpreadsheetML)

Predating XLSX, SpreadsheetML files are simple XML files. There is no official
and comprehensive specification, although MS has released whitepapers on the
format. Since Excel 2016 can generate SpreadsheetML files, backsolving is
pretty straightforward.


### Excel 2007+ Binary (XLSB, BIFF12)

Introduced in parallel with XLSX, the XLSB filetype combines BIFF architecture
with the content separation and ZIP container of XLSX. For the most part nodes
in an XLSX sub-file can be mapped to XLSB records in a corresponding sub-file.

The `MS-XLSB` specification covers the basics of the file format, and other
specifications expand on serialization of features like properties.

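
The container difference between BIFF8 (Compound File Binary) and XLSX/XLSB (ZIP) shows up in the first bytes of a file. The following is a minimal sketch of sniffing the container from those bytes; `sniffContainer` and its return values are hypothetical and do not describe how the library detects formats.

```javascript
// Leading signatures of the two container formats.
var CFB_MAGIC = [0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1];
var ZIP_MAGIC = [0x50, 0x4B, 0x03, 0x04]; // "PK" 0x03 0x04 (local file header)

// True when `bytes` begins with every byte of `magic`.
function startsWith(bytes, magic) {
  if (bytes.length < magic.length) return false;
  for (var i = 0; i < magic.length; ++i) {
    if (bytes[i] !== magic[i]) return false;
  }
  return true;
}

// Classify the container; "none" covers raw record streams and text formats.
function sniffContainer(bytes) {
  if (startsWith(bytes, CFB_MAGIC)) return "CFB";
  if (startsWith(bytes, ZIP_MAGIC)) return "ZIP";
  return "none";
}
```

A result of `"ZIP"` does not distinguish XLSX from XLSB or ODS; that requires looking at the sub-files inside the archive.
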

### OpenDocument Spreadsheet (ODS/FODS) and Uniform Office Spreadsheet (UOS1/2)

ODS is an XML-in-ZIP format akin to XLSX while FODS is an XML format akin to
SpreadsheetML. Both are detailed in the OASIS standard, but tools like
LibreOffice and OpenOffice add undocumented extensions.

UOS is a very similar format, and it comes in 2 varieties corresponding to ODS
and FODS respectively. For the most part, the difference between the formats
lies in the names of tags and attributes.


### Comma-Separated Values

Excel CSV deviates from RFC4180 in a number of important ways. The generated
CSV files should generally work in Excel although they may not work in RFC4180
compatible readers.

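
For reference, the RFC4180 baseline for field quoting is short enough to sketch: a field containing a comma, a double quote, or a line break is wrapped in double quotes, and embedded double quotes are doubled. The helpers below are hypothetical and illustrate only that rule; they do not reproduce the library's CSV writer or any of Excel's deviations.

```javascript
// Quote a single field following the RFC4180 rule.
function quoteField(value) {
  var s = String(value);
  if (/[",\r\n]/.test(s)) {
    return '"' + s.replace(/"/g, '""') + '"';
  }
  return s;
}

// Join already-stringifiable fields into one CSV record (no line ending).
function toCsvRow(fields) {
  var out = [];
  for (var i = 0; i < fields.length; ++i) {
    out.push(quoteField(fields[i]));
  }
  return out.join(",");
}
```

A strict reader expects output of this shape, which is a useful point of comparison when a generated file opens in Excel but fails elsewhere.
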

## Tested Environments

 - NodeJS 0.8, 0.9, 0.10, 0.11, 0.12, 4.x, 5.x, 6.x, 7.x
 - IE 6/7/8/9/10/11 (IE6-9 browsers require shims for interacting with client)
 - Chrome 24+
 - Safari 6+
 - Firefox 18+

Tests use the mocha testing framework. Travis-CI and Sauce Labs links:

 - <https://travis-ci.org/SheetJS/js-xlsx> for XLSX module in nodejs
 - <https://travis-ci.org/SheetJS/SheetJS.github.io> for XLS\* modules
 - <https://saucelabs.com/u/sheetjs> for XLS\* modules using Sauce Labs


## Test Files

Test files are housed in [another repo](https://github.com/SheetJS/test_files).

Running `make init` will refresh the `test_files` submodule and get the files.


## Testing

`make test` will run the node-based tests. To run the in-browser tests, clone
[the oss.sheetjs.com repo](https://github.com/SheetJS/SheetJS.github.io) and
replace the xlsx.js file (then fire up the browser and go to `stress.html`):

```bash
$ cp xlsx.js ../SheetJS.github.io
$ cd ../SheetJS.github.io
$ simplehttpserver # or "python -mSimpleHTTPServer" (Python 2), "python3 -m http.server", or "serve"
$ open -a Chromium.app http://localhost:8000/stress.html
```

For a much smaller test, run `make test_misc`.


## Contributing

Due to the precarious nature of the Open Specifications Promise, it is very
important to ensure code is cleanroom. Consult CONTRIBUTING.md.

The xlsx.js file is constructed from the files in the `bits` subdirectory. The
build script (run `make`) will concatenate the individual bits to produce the
script. Before submitting a contribution, ensure that running `make` will produce
the xlsx.js file exactly. The simplest way to test is to move the script:

```bash
$ mv xlsx.js xlsx.new.js
$ make
$ diff xlsx.js xlsx.new.js