2017-03-20 09:02:25 +00:00
|
|
|
## Parsing Workbooks
|
|
|
|
|
|
|
|
For parsing, the first step is to read the file. This involves acquiring the
|
|
|
|
data and feeding it into the library. Here are a few common scenarios:
|
|
|
|
|
2017-04-30 20:37:53 +00:00
|
|
|
<details>
|
2017-09-24 23:40:09 +00:00
|
|
|
<summary><b>nodejs read a file</b> (click to show)</summary>
|
2017-03-20 09:02:25 +00:00
|
|
|
|
2017-09-30 06:18:11 +00:00
|
|
|
`readFile` is only available in server environments. Browsers have no API for
|
|
|
|
reading arbitrary files given a path, so another strategy must be used.
|
|
|
|
|
2017-03-20 09:02:25 +00:00
|
|
|
```js
|
|
|
|
if(typeof require !== 'undefined') XLSX = require('xlsx');
|
|
|
|
var workbook = XLSX.readFile('test.xlsx');
|
|
|
|
/* DO SOMETHING WITH workbook HERE */
|
2017-03-29 19:14:15 +00:00
|
|
|
```
|
|
|
|
|
2017-04-30 20:37:53 +00:00
|
|
|
</details>
|
|
|
|
|
2018-02-08 18:21:39 +00:00
|
|
|
<details>
|
|
|
|
<summary><b>Photoshop ExtendScript read a file</b> (click to show)</summary>
|
|
|
|
|
|
|
|
`readFile` wraps the `File` logic in Photoshop and other ExtendScript targets.
|
|
|
|
The specified path should be an absolute path:
|
|
|
|
|
|
|
|
```js
|
|
|
|
#include "xlsx.extendscript.js"
|
|
|
|
/* Read test.xlsx from the Documents folder */
|
|
|
|
var workbook = XLSX.readFile(Folder.myDocuments + '/' + 'test.xlsx');
|
|
|
|
/* DO SOMETHING WITH workbook HERE */
|
|
|
|
```
|
|
|
|
|
|
|
|
The [`extendscript` demo](demos/extendscript/) includes a more complex example.
|
|
|
|
|
|
|
|
</details>
|
|
|
|
|
2017-04-30 20:37:53 +00:00
|
|
|
<details>
|
2017-09-24 23:40:09 +00:00
|
|
|
<summary><b>Browser read TABLE element from page</b> (click to show)</summary>
|
2017-03-29 19:14:15 +00:00
|
|
|
|
2017-09-30 06:18:11 +00:00
|
|
|
The `table_to_book` and `table_to_sheet` utility functions take a DOM TABLE
|
|
|
|
element and iterate through the child nodes.
|
|
|
|
|
2017-03-29 19:14:15 +00:00
|
|
|
```js
|
2018-03-19 21:42:55 +00:00
|
|
|
var workbook = XLSX.utils.table_to_book(document.getElementById('tableau'));
|
2017-03-29 19:14:15 +00:00
|
|
|
/* DO SOMETHING WITH workbook HERE */
|
2017-03-20 09:02:25 +00:00
|
|
|
```
|
|
|
|
|
2018-03-19 21:42:55 +00:00
|
|
|
Multiple tables on a web page can be converted to individual worksheets:
|
|
|
|
|
|
|
|
```js
|
|
|
|
/* create new workbook */
|
|
|
|
var workbook = XLSX.utils.book_new();
|
|
|
|
|
|
|
|
/* convert table 'table1' to worksheet named "Sheet1" */
|
2018-06-01 16:32:08 +00:00
|
|
|
var ws1 = XLSX.utils.table_to_sheet(document.getElementById('table1'));
|
2018-03-19 21:42:55 +00:00
|
|
|
XLSX.utils.book_append_sheet(workbook, ws1, "Sheet1");
|
|
|
|
|
|
|
|
/* convert table 'table2' to worksheet named "Sheet2" */
|
2018-06-01 16:32:08 +00:00
|
|
|
var ws2 = XLSX.utils.table_to_sheet(document.getElementById('table2'));
|
2018-03-19 21:42:55 +00:00
|
|
|
XLSX.utils.book_append_sheet(workbook, ws2, "Sheet2");
|
|
|
|
|
|
|
|
/* workbook now has 2 worksheets */
|
|
|
|
```
|
|
|
|
|
2017-09-30 06:18:11 +00:00
|
|
|
Alternatively, the HTML code can be extracted and parsed:
|
|
|
|
|
|
|
|
```js
|
|
|
|
var htmlstr = document.getElementById('tableau').outerHTML;
|
2018-03-19 21:42:55 +00:00
|
|
|
var workbook = XLSX.read(htmlstr, {type:'string'});
|
2017-09-30 06:18:11 +00:00
|
|
|
```
|
|
|
|
|
2017-04-30 20:37:53 +00:00
|
|
|
</details>
|
|
|
|
|
|
|
|
<details>
|
2017-09-24 23:40:09 +00:00
|
|
|
<summary><b>Browser download file (ajax)</b> (click to show)</summary>
|
2017-04-30 20:37:53 +00:00
|
|
|
|
|
|
|
Note: for a more complete example that works in older browsers, check the demo
|
2018-03-19 21:42:55 +00:00
|
|
|
at <http://oss.sheetjs.com/js-xlsx/ajax.html>. The [`xhr` demo](demos/xhr/)
|
2017-09-24 23:40:09 +00:00
|
|
|
includes more examples with `XMLHttpRequest` and `fetch`.
|
2017-03-20 09:02:25 +00:00
|
|
|
|
|
|
|
```js
|
2017-09-24 23:40:09 +00:00
|
|
|
var url = "http://oss.sheetjs.com/test_files/formula_stress_test.xlsx";
|
2017-03-20 09:02:25 +00:00
|
|
|
|
2017-09-24 23:40:09 +00:00
|
|
|
/* set up async GET request */
|
|
|
|
var req = new XMLHttpRequest();
|
|
|
|
req.open("GET", url, true);
|
|
|
|
req.responseType = "arraybuffer";
|
2017-03-20 09:02:25 +00:00
|
|
|
|
2017-09-24 23:40:09 +00:00
|
|
|
req.onload = function(e) {
|
|
|
|
var data = new Uint8Array(req.response);
|
|
|
|
var workbook = XLSX.read(data, {type:"array"});
|
2017-03-20 09:02:25 +00:00
|
|
|
|
|
|
|
/* DO SOMETHING WITH workbook HERE */
|
|
|
|
}
|
|
|
|
|
2017-09-24 23:40:09 +00:00
|
|
|
req.send();
|
2017-03-20 09:02:25 +00:00
|
|
|
```
|
|
|
|
|
2017-04-30 20:37:53 +00:00
|
|
|
</details>
|
|
|
|
|
|
|
|
<details>
|
2017-09-24 23:40:09 +00:00
|
|
|
<summary><b>Browser drag-and-drop</b> (click to show)</summary>
|
2017-04-30 20:37:53 +00:00
|
|
|
|
2017-09-24 23:40:09 +00:00
|
|
|
Drag-and-drop uses the HTML5 `FileReader` API, loading the data with
|
|
|
|
`readAsBinaryString` or `readAsArrayBuffer`. Since not all browsers support the
|
|
|
|
full `FileReader` API, dynamic feature tests are highly recommended.
|
2017-03-20 09:02:25 +00:00
|
|
|
|
|
|
|
```js
|
|
|
|
var rABS = true; // true: readAsBinaryString ; false: readAsArrayBuffer
|
|
|
|
function handleDrop(e) {
|
2017-09-24 23:40:09 +00:00
|
|
|
e.stopPropagation(); e.preventDefault();
|
|
|
|
var files = e.dataTransfer.files, f = files[0];
|
|
|
|
var reader = new FileReader();
|
|
|
|
reader.onload = function(e) {
|
|
|
|
var data = e.target.result;
|
|
|
|
if(!rABS) data = new Uint8Array(data);
|
|
|
|
var workbook = XLSX.read(data, {type: rABS ? 'binary' : 'array'});
|
|
|
|
|
|
|
|
/* DO SOMETHING WITH workbook HERE */
|
|
|
|
};
|
|
|
|
if(rABS) reader.readAsBinaryString(f); else reader.readAsArrayBuffer(f);
|
2017-03-20 09:02:25 +00:00
|
|
|
}
|
|
|
|
drop_dom_element.addEventListener('drop', handleDrop, false);
|
|
|
|
```
|
|
|
|
|
2017-04-30 20:37:53 +00:00
|
|
|
</details>
|
|
|
|
|
|
|
|
<details>
|
2017-09-24 23:40:09 +00:00
|
|
|
<summary><b>Browser file upload form element</b> (click to show)</summary>
|
|
|
|
|
|
|
|
Data from file input elements can be processed using the same `FileReader` API
|
|
|
|
as in the drag-and-drop example:
|
2017-03-20 09:02:25 +00:00
|
|
|
|
|
|
|
```js
|
2017-09-24 23:40:09 +00:00
|
|
|
var rABS = true; // true: readAsBinaryString ; false: readAsArrayBuffer
|
2017-03-20 09:02:25 +00:00
|
|
|
function handleFile(e) {
|
2017-09-24 23:40:09 +00:00
|
|
|
var files = e.target.files, f = files[0];
|
|
|
|
var reader = new FileReader();
|
|
|
|
reader.onload = function(e) {
|
|
|
|
var data = e.target.result;
|
|
|
|
if(!rABS) data = new Uint8Array(data);
|
|
|
|
var workbook = XLSX.read(data, {type: rABS ? 'binary' : 'array'});
|
|
|
|
|
|
|
|
/* DO SOMETHING WITH workbook HERE */
|
|
|
|
};
|
|
|
|
if(rABS) reader.readAsBinaryString(f); else reader.readAsArrayBuffer(f);
|
2017-03-20 09:02:25 +00:00
|
|
|
}
|
|
|
|
input_dom_element.addEventListener('change', handleFile, false);
|
|
|
|
```
|
|
|
|
|
2018-02-14 20:06:35 +00:00
|
|
|
The [`oldie` demo](demos/oldie/) shows an IE-compatible fallback scenario.
|
|
|
|
|
2017-04-30 20:37:53 +00:00
|
|
|
</details>
|
|
|
|
|
2018-02-03 20:46:32 +00:00
|
|
|
More specialized cases, including mobile app file processing, are covered in the
|
|
|
|
[included demos](demos/)
|
2017-04-30 20:37:53 +00:00
|
|
|
|
2017-07-05 22:27:54 +00:00
|
|
|
### Parsing Examples
|
2017-04-16 04:32:13 +00:00
|
|
|
|
|
|
|
- <http://oss.sheetjs.com/js-xlsx/> HTML5 File API / Base64 Text / Web Workers
|
|
|
|
|
2017-09-24 23:40:09 +00:00
|
|
|
Note that older versions of IE do not support HTML5 File API, so the Base64 mode
|
2017-06-03 07:19:09 +00:00
|
|
|
is used for testing.
|
|
|
|
|
|
|
|
<details>
|
2017-09-24 23:40:09 +00:00
|
|
|
<summary><b>Get Base64 encoding on OSX / Windows</b> (click to show)</summary>
|
2017-06-03 07:19:09 +00:00
|
|
|
|
2017-09-24 23:40:09 +00:00
|
|
|
On OSX you can get the Base64 encoding with:
|
2017-04-16 04:32:13 +00:00
|
|
|
|
|
|
|
```bash
|
|
|
|
$ <target_file base64 | pbcopy
|
|
|
|
```
|
|
|
|
|
2017-09-24 23:40:09 +00:00
|
|
|
On Windows XP and up you can get the Base64 encoding using `certutil`:
|
2017-04-16 04:32:13 +00:00
|
|
|
|
|
|
|
```cmd
|
|
|
|
> certutil -encode target_file target_file.b64
|
|
|
|
```
|
|
|
|
|
|
|
|
(note: You have to open the file and remove the header and footer lines)
|
|
|
|
|
2017-06-03 07:19:09 +00:00
|
|
|
</details>
|
|
|
|
|
2017-04-16 04:32:13 +00:00
|
|
|
- <http://oss.sheetjs.com/js-xlsx/ajax.html> XMLHttpRequest
|
|
|
|
|