
## Parsing Workbooks

For parsing, the first step is to read the file.  This involves acquiring the
data and feeding it into the library.  Here are a few common scenarios:

<details>
  <summary><b>nodejs read a file</b> (click to show)</summary>

`readFile` is only available in server environments. Browsers have no API for
reading arbitrary files given a path, so another strategy must be used.

```js
if(typeof require !== 'undefined') XLSX = require('xlsx');
var workbook = XLSX.readFile('test.xlsx');
/* DO SOMETHING WITH workbook HERE */
```

</details>

<details>
  <summary><b>Photoshop ExtendScript read a file</b> (click to show)</summary>

`readFile` wraps the `File` logic in Photoshop and other ExtendScript targets.
The specified path should be an absolute path:

```js
#include "xlsx.extendscript.js"
/* Read test.xlsx from the Documents folder */
var workbook = XLSX.readFile(Folder.myDocuments + '/' + 'test.xlsx');
/* DO SOMETHING WITH workbook HERE */
```

The [`extendscript` demo](demos/extendscript/) includes a more complex example.

</details>

<details>
  <summary><b>Browser read TABLE element from page</b> (click to show)</summary>

The `table_to_book` and `table_to_sheet` utility functions take a DOM TABLE
element and iterate through the child nodes.

```js
var workbook = XLSX.utils.table_to_book(document.getElementById('tableau'));
/* DO SOMETHING WITH workbook HERE */
```

Multiple tables on a web page can be converted to individual worksheets:

```js
/* create new workbook */
var workbook = XLSX.utils.book_new();

/* convert table 'table1' to worksheet named "Sheet1" */
var ws1 = XLSX.utils.table_to_sheet(document.getElementById('table1'));
XLSX.utils.book_append_sheet(workbook, ws1, "Sheet1");

/* convert table 'table2' to worksheet named "Sheet2" */
var ws2 = XLSX.utils.table_to_sheet(document.getElementById('table2'));
XLSX.utils.book_append_sheet(workbook, ws2, "Sheet2");

/* workbook now has 2 worksheets */
```

Alternatively, the HTML code can be extracted and parsed:

```js
var htmlstr = document.getElementById('tableau').outerHTML;
var workbook = XLSX.read(htmlstr, {type:'string'});
```

</details>

<details>
  <summary><b>Browser download file (ajax)</b> (click to show)</summary>

Note: for a more complete example that works in older browsers, check the demo
at <http://oss.sheetjs.com/js-xlsx/ajax.html>.  The [`xhr` demo](demos/xhr/)
includes more examples with `XMLHttpRequest` and `fetch`.

```js
var url = "http://oss.sheetjs.com/test_files/formula_stress_test.xlsx";

/* set up async GET request */
var req = new XMLHttpRequest();
req.open("GET", url, true);
req.responseType = "arraybuffer";

req.onload = function(e) {
  var data = new Uint8Array(req.response);
  var workbook = XLSX.read(data, {type:"array"});

  /* DO SOMETHING WITH workbook HERE */
};

req.send();
```
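
The same download can be sketched with `fetch`. This is a minimal example, not the code from the demo: it assumes the environment provides `fetch` and that `XLSX` is already loaded as a global. `fetchWorkbook` is a hypothetical helper name, not part of the library.

```js
/* Hypothetical helper: download `url` and resolve to a parsed workbook.
   Assumes `fetch` and the global `XLSX` are available. */
function fetchWorkbook(url) {
  return fetch(url).then(function(res) {
    /* fetch resolves even on HTTP errors, so check the status explicitly */
    if(!res.ok) throw new Error("Request failed with status " + res.status);
    return res.arrayBuffer();
  }).then(function(ab) {
    /* same parse call as the XMLHttpRequest example above */
    return XLSX.read(new Uint8Array(ab), {type:"array"});
  });
}
```

A call would look like `fetchWorkbook(url).then(function(workbook) { /* DO SOMETHING WITH workbook HERE */ });`.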

</details>

<details>
  <summary><b>Browser drag-and-drop</b> (click to show)</summary>

Drag-and-drop uses the HTML5 `FileReader` API, loading the data with
`readAsBinaryString` or `readAsArrayBuffer`.  Since not all browsers support the
full `FileReader` API, dynamic feature tests are highly recommended.

```js
var rABS = true; // true: readAsBinaryString ; false: readAsArrayBuffer
function handleDrop(e) {
  e.stopPropagation(); e.preventDefault();
  var files = e.dataTransfer.files, f = files[0];
  var reader = new FileReader();
  reader.onload = function(e) {
    var data = e.target.result;
    if(!rABS) data = new Uint8Array(data);
    var workbook = XLSX.read(data, {type: rABS ? 'binary' : 'array'});

    /* DO SOMETHING WITH workbook HERE */
  };
  if(rABS) reader.readAsBinaryString(f); else reader.readAsArrayBuffer(f);
}
drop_dom_element.addEventListener('drop', handleDrop, false);
```
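
One way to write the recommended feature test is sketched below. `chooseReadMode` is a hypothetical helper, and preferring `readAsArrayBuffer` over `readAsBinaryString` is an assumption made for this sketch, not a rule from the library.

```js
/* Hypothetical helper: inspect `scope` (normally `window`) and report which
   FileReader method can be used: 'array', 'binary', or null if neither. */
function chooseReadMode(scope) {
  var FR = scope && scope.FileReader;
  if(!FR || !FR.prototype) return null;
  if(typeof FR.prototype.readAsArrayBuffer === 'function') return 'array';
  if(typeof FR.prototype.readAsBinaryString === 'function') return 'binary';
  return null;
}

/* pick the value of rABS dynamically instead of hard-coding it */
var mode = chooseReadMode(typeof window !== 'undefined' ? window : {});
var rABS = (mode === 'binary');
```

When `mode` is `null`, no `FileReader` method is usable and the page needs a different strategy, such as the Base64 text mode or the fallback shown in the `oldie` demo.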

</details>

<details>
  <summary><b>Browser file upload form element</b> (click to show)</summary>

Data from file input elements can be processed using the same `FileReader` API
as in the drag-and-drop example:

```js
var rABS = true; // true: readAsBinaryString ; false: readAsArrayBuffer
function handleFile(e) {
  var files = e.target.files, f = files[0];
  var reader = new FileReader();
  reader.onload = function(e) {
    var data = e.target.result;
    if(!rABS) data = new Uint8Array(data);
    var workbook = XLSX.read(data, {type: rABS ? 'binary' : 'array'});

    /* DO SOMETHING WITH workbook HERE */
  };
  if(rABS) reader.readAsBinaryString(f); else reader.readAsArrayBuffer(f);
}
input_dom_element.addEventListener('change', handleFile, false);
```

The [`oldie` demo](demos/oldie/) shows an IE-compatible fallback scenario.

</details>

More specialized cases, including mobile app file processing, are covered in the
[included demos](demos/).

### Parsing Examples

- <http://oss.sheetjs.com/js-xlsx/> HTML5 File API / Base64 Text / Web Workers

Older versions of IE do not support the HTML5 File API, so the Base64 mode is
used for testing.

<details>
  <summary><b>Get Base64 encoding on OSX / Windows</b> (click to show)</summary>

On OSX you can get the Base64 encoding with:

```bash
$ <target_file base64 | pbcopy
```

On Windows XP and later, you can get the Base64 encoding using `certutil`:

```cmd
> certutil -encode target_file target_file.b64
```

Note: `certutil` wraps the output in `-----BEGIN CERTIFICATE-----` and
`-----END CERTIFICATE-----` lines.  Open the file and remove both lines before
using the Base64 text.
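
The header and footer can also be removed in code. The sketch below assumes the whole `.b64` file has been read into a string; `stripCertutilArmor` is a hypothetical helper name, not part of the library.

```js
/* Hypothetical helper: drop the "-----BEGIN ...-----" / "-----END ...-----"
   lines and blank lines, then join the remaining Base64 lines. */
function stripCertutilArmor(text) {
  return text.split(/\r?\n/).filter(function(line) {
    return line.length > 0 && line.indexOf("-----") !== 0;
  }).join("");
}
```

The cleaned string can then be parsed with `XLSX.read(stripCertutilArmor(text), {type:'base64'})`.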

</details>

- <http://oss.sheetjs.com/js-xlsx/ajax.html> XMLHttpRequest