sheetjs/docbits/13_usage.md

### Usage

Most scenarios involving spreadsheets and data can be broken into 5 parts:

1) **Acquire Data**:  Data may be stored anywhere: local or remote files,
   databases, HTML TABLE, or even generated programmatically in the web browser.

2) **Extract Data**:  For spreadsheet files, this involves parsing raw bytes to
   read the cell data. For general JS data, this involves reshaping the data.

3) **Process Data**:  From generating summary statistics to cleaning data
   records, this step is the heart of the problem.

4) **Package Data**:  This can involve making a new spreadsheet or serializing
   with `JSON.stringify` or writing XML or simply flattening data for UI tools.

5) **Release Data**:  Spreadsheet files can be uploaded to a server or written
   locally.  Data can be presented to users in an HTML TABLE or data grid.

A common problem involves generating a valid spreadsheet export from data stored
in an HTML table.  In this example, an HTML TABLE on the page will be scraped,
a row will be added to the bottom with the date of the report, and a new file
will be generated and downloaded locally. `XLSX.writeFile` takes care of
packaging the data and attempting a local download:

```js
// Acquire Data (reference to the HTML table)
var table_elt = document.getElementById("my-table-id");

// Extract Data (create a workbook object from the table)
var workbook = XLSX.utils.table_to_book(table_elt);

// Process Data (add a new row)
var ws = workbook.Sheets["Sheet1"];
XLSX.utils.sheet_add_aoa(ws, [["Created "+new Date().toISOString()]], {origin:-1});

// Package and Release Data (`writeFile` tries to write and save an XLSB file)
XLSX.writeFile(workbook, "Report.xlsb");
```

This library tries to simplify steps 2 and 4 with functions to extract useful
data from spreadsheet files (`read` / `readFile`) and generate new spreadsheet
files from data (`write` / `writeFile`).  Additional utility functions like
`table_to_book` work with other common data sources like HTML tables.

This documentation and various demo projects cover a number of common scenarios
and approaches for steps 1 and 5.

Utility functions help with step 3.

### The Zen of SheetJS

_Data processing should fit in any workflow_

The library does not impose a separate lifecycle.  It fits nicely in websites
and apps built using any framework.  The plain JS data objects play nice with
Web Workers and future APIs.

["Acquiring and Extracting Data"](#acquiring-and-extracting-data) describes
solutions for common data import scenarios.

["Writing Workbooks"](#writing-workbooks) describes solutions for common data
export scenarios involving actual spreadsheet files.

["Utility Functions"](#utility-functions) details utility functions for
translating JSON Arrays and other common JS structures into worksheet objects.

_JavaScript is a powerful language for data processing_

The ["Common Spreadsheet Format"](#common-spreadsheet-format) is a simple object
representation of the core concepts of a workbook.  The various functions in the
library provide low-level tools for working with the object.

For friendly JS processing, there are utility functions for converting parts of
a worksheet to/from an Array of Arrays.  The following example combines powerful
JS Array methods with a network request library to download data, select the
information we want and create a workbook file:

<details>
  <summary><b>Get Data from a JSON Endpoint and Generate a Workbook</b> (click to show)</summary>

The goal is to generate a XLSB workbook of US President names and birthdays.

**Acquire Data**

_Raw Data_

<https://theunitedstates.io/congress-legislators/executive.json> has the desired
data.  For example, John Adams:

```js
{
  "id": { /* (data omitted) */ },
  "name": {
    "first": "John",          // <-- first name
    "last": "Adams"           // <-- last name
  },
  "bio": {
    "birthday": "1735-10-19", // <-- birthday
    "gender": "M"
  },
  "terms": [
    { "type": "viceprez", /* (other fields omitted) */ },
    { "type": "viceprez", /* (other fields omitted) */ },
    { "type": "prez", /* (other fields omitted) */ } // <-- look for "prez"
  ]
}
```

_Filtering for Presidents_

The dataset includes Aaron Burr, a Vice President who was never President!

`Array#filter` creates a new array with the desired rows.  A President served
at least one term with `type` set to `"prez"`.  To test if a particular row has
at least one `"prez"` term, `Array#some` is another native JS function.  The
complete filter would be:

```js
const prez = raw_data.filter(row => row.terms.some(term => term.type === "prez"));
```

_Lining up the data_

For this example, the name will be the first name combined with the last name
(`row.name.first + " " + row.name.last`) and the birthday will be the subfield
`row.bio.birthday`.  Using `Array#map`, the dataset can be massaged in one call:

```js
const rows = prez.map(row => ({
  name: row.name.first + " " + row.name.last,
  birthday: row.bio.birthday
}));
```

The result is an array of "simple" objects with no nesting:

```js
[
  { name: "George Washington", birthday: "1732-02-22" },
  { name: "John Adams", birthday: "1735-10-19" },
  // ... one row per President
]
```

**Extract Data**

With the cleaned dataset, `XLSX.utils.json_to_sheet` generates a worksheet:

```js
const worksheet = XLSX.utils.json_to_sheet(rows);
```

`XLSX.utils.book_new` creates a new workbook and `XLSX.utils.book_append_sheet`
appends a worksheet to the workbook. The new worksheet will be called "Dates":

```js
const workbook = XLSX.utils.book_new();
XLSX.utils.book_append_sheet(workbook, worksheet, "Dates");
```

**Process Data**

_Fixing headers_

By default, `json_to_sheet` creates a worksheet with a header row. In this case,
the headers come from the JS object keys: "name" and "birthday".

The headers are in cells A1 and B1.  `XLSX.utils.sheet_add_aoa` can write text
values to the existing worksheet starting at cell A1:

```js
XLSX.utils.sheet_add_aoa(worksheet, [["Name", "Birthday"]], { origin: "A1" });
```

_Fixing Column Widths_

Some of the names are longer than the default column width.  Column widths are
set by [setting the `"!cols"` worksheet property](#row-and-column-properties).

The following line sets the width of column A to approximately 10 characters:

```js
worksheet["!cols"] = [ { wch: 10 } ]; // set column A width to 10 characters
```

One `Array#reduce` call over `rows` can calculate the maximum width:

```js
const max_width = rows.reduce((w, r) => Math.max(w, r.name.length), 10);
worksheet["!cols"] = [ { wch: max_width } ];
```

Note: If the starting point was a file or HTML table, `XLSX.utils.sheet_to_json`
will generate an array of JS objects.

**Package and Release Data**

`XLSX.writeFile` creates a spreadsheet file and tries to write it to the system.
In the browser, it will try to prompt the user to download the file.  In NodeJS,
it will write to the local directory.

```js
XLSX.writeFile(workbook, "Presidents.xlsx");
```

**Complete Example**

```js
// Uncomment the next line for use in NodeJS:
// const XLSX = require("xlsx"), axios = require("axios");

(async() => {
  /* fetch JSON data and parse */
  const url = "https://theunitedstates.io/congress-legislators/executive.json";
  const raw_data = (await axios(url, {responseType: "json"})).data;

  /* filter for the Presidents */
  const prez = raw_data.filter(row => row.terms.some(term => term.type === "prez"));

  /* flatten objects */
  const rows = prez.map(row => ({
    name: row.name.first + " " + row.name.last,
    birthday: row.bio.birthday
  }));

  /* generate worksheet and workbook */
  const worksheet = XLSX.utils.json_to_sheet(rows);
  const workbook = XLSX.utils.book_new();
  XLSX.utils.book_append_sheet(workbook, worksheet, "Dates");

  /* fix headers */
  XLSX.utils.sheet_add_aoa(worksheet, [["Name", "Birthday"]], { origin: "A1" });

  /* calculate column width */
  const max_width = rows.reduce((w, r) => Math.max(w, r.name.length), 10);
  worksheet["!cols"] = [ { wch: max_width } ];

  /* create an XLSX file and try to save to Presidents.xlsx */
  XLSX.writeFile(workbook, "Presidents.xlsx");
})();
```

For use in the web browser, assuming the snippet is saved to `snippet.js`,
script tags should be used to include the `axios` and `xlsx` standalone builds:

```html
<script src="https://unpkg.com/xlsx/dist/xlsx.full.min.js"></script>
<script src="https://unpkg.com/axios/dist/axios.min.js"></script>
<script src="snippet.js"></script>
```


</details>

_File formats are implementation details_

The parser covers a wide gamut of common spreadsheet file formats to ensure that
"HTML-saved-as-XLS" files work as well as actual XLS or XLSX files.

The writer supports a number of common output formats for broad compatibility
with the data ecosystem.

To the greatest extent possible, data processing code should not have to worry
about the specific file formats involved.
NUMBERS primary cell storage parse docs clarified row and column props (fixes #2486) (fixes #2511) 2022-02-04 05:29:01 +00:00			`### Usage`

			`Most scenarios involving spreadsheets and data can be broken into 5 parts:`

			`1) Acquire Data: Data may be stored anywhere: local or remote files,`
			`databases, HTML TABLE, or even generated programmatically in the web browser.`

			`2) Extract Data: For spreadsheet files, this involves parsing raw bytes to`
			`read the cell data. For general JS data, this involves reshaping the data.`

			`3) Process Data: From generating summary statistics to cleaning data`
			`records, this step is the heart of the problem.`

			`4) Package Data: This can involve making a new spreadsheet or serializing`
			with `JSON.stringify` or writing XML or simply flattening data for UI tools.

			`5) Release Data: Spreadsheet files can be uploaded to a server or written`
			`locally. Data can be presented to users in an HTML TABLE or data grid.`

			`A common problem involves generating a valid spreadsheet export from data stored`
			`in an HTML table. In this example, an HTML TABLE on the page will be scraped,`
			`a row will be added to the bottom with the date of the report, and a new file`
			will be generated and downloaded locally. `XLSX.writeFile` takes care of
			`packaging the data and attempting a local download:`

			```js
			`// Acquire Data (reference to the HTML table)`
			`var table_elt = document.getElementById("my-table-id");`

			`// Extract Data (create a workbook object from the table)`
			`var workbook = XLSX.utils.table_to_book(table_elt);`

			`// Process Data (add a new row)`
README cleanup in anticipation of node fetch 2022-02-05 13:59:25 +00:00			`var ws = workbook.Sheets["Sheet1"];`
			`XLSX.utils.sheet_add_aoa(ws, [["Created "+new Date().toISOString()]], {origin:-1});`
NUMBERS primary cell storage parse docs clarified row and column props (fixes #2486) (fixes #2511) 2022-02-04 05:29:01 +00:00
			// Package and Release Data (`writeFile` tries to write and save an XLSB file)
			`XLSX.writeFile(workbook, "Report.xlsb");`
			```

			`This library tries to simplify steps 2 and 4 with functions to extract useful`
			data from spreadsheet files (`read` / `readFile`) and generate new spreadsheet
README cleanup in anticipation of node fetch 2022-02-05 13:59:25 +00:00			files from data (`write` / `writeFile`). Additional utility functions like
			`table_to_book` work with other common data sources like HTML tables.
NUMBERS primary cell storage parse docs clarified row and column props (fixes #2486) (fixes #2511) 2022-02-04 05:29:01 +00:00
			`This documentation and various demo projects cover a number of common scenarios`
			`and approaches for steps 1 and 5.`

			`Utility functions help with step 3.`

README cleanup in anticipation of node fetch 2022-02-05 13:59:25 +00:00			`### The Zen of SheetJS`
NUMBERS primary cell storage parse docs clarified row and column props (fixes #2486) (fixes #2511) 2022-02-04 05:29:01 +00:00
			`_Data processing should fit in any workflow_`

			`The library does not impose a separate lifecycle. It fits nicely in websites`
			`and apps built using any framework. The plain JS data objects play nice with`
			`Web Workers and future APIs.`

README cleanup in anticipation of node fetch 2022-02-05 13:59:25 +00:00			`["Acquiring and Extracting Data"](#acquiring-and-extracting-data) describes`
			`solutions for common data import scenarios.`
NUMBERS primary cell storage parse docs clarified row and column props (fixes #2486) (fixes #2511) 2022-02-04 05:29:01 +00:00
			`["Writing Workbooks"](#writing-workbooks) describes solutions for common data`
			`export scenarios involving actual spreadsheet files.`

			`["Utility Functions"](#utility-functions) details utility functions for`
			`translating JSON Arrays and other common JS structures into worksheet objects.`

			`_JavaScript is a powerful language for data processing_`

			`The ["Common Spreadsheet Format"](#common-spreadsheet-format) is a simple object`
			`representation of the core concepts of a workbook. The various functions in the`
			`library provide low-level tools for working with the object.`

			`For friendly JS processing, there are utility functions for converting parts of`
README cleanup in anticipation of node fetch 2022-02-05 13:59:25 +00:00			`a worksheet to/from an Array of Arrays. The following example combines powerful`
			`JS Array methods with a network request library to download data, select the`
			`information we want and create a workbook file:`

			`<details>`
			`<summary><b>Get Data from a JSON Endpoint and Generate a Workbook</b> (click to show)</summary>`

			`The goal is to generate a XLSB workbook of US President names and birthdays.`

			`Acquire Data`

			`_Raw Data_`

			`<https://theunitedstates.io/congress-legislators/executive.json> has the desired`
			`data. For example, John Adams:`
NUMBERS primary cell storage parse docs clarified row and column props (fixes #2486) (fixes #2511) 2022-02-04 05:29:01 +00:00
			```js
README cleanup in anticipation of node fetch 2022-02-05 13:59:25 +00:00			`{`
			`"id": { /* (data omitted) */ },`
			`"name": {`
			`"first": "John", // <-- first name`
			`"last": "Adams" // <-- last name`
			`},`
			`"bio": {`
			`"birthday": "1735-10-19", // <-- birthday`
			`"gender": "M"`
			`},`
			`"terms": [`
			`{ "type": "viceprez", /* (other fields omitted) */ },`
			`{ "type": "viceprez", /* (other fields omitted) */ },`
			`{ "type": "prez", /* (other fields omitted) */ } // <-- look for "prez"`
			`]`
			`}`
NUMBERS primary cell storage parse docs clarified row and column props (fixes #2486) (fixes #2511) 2022-02-04 05:29:01 +00:00			```

README cleanup in anticipation of node fetch 2022-02-05 13:59:25 +00:00			`_Filtering for Presidents_`

			`The dataset includes Aaron Burr, a Vice President who was never President!`

			`Array#filter` creates a new array with the desired rows. A President served
			at least one term with `type` set to `"prez"`. To test if a particular row has
			at least one `"prez"` term, `Array#some` is another native JS function. The
			`complete filter would be:`

			```js
			`const prez = raw_data.filter(row => row.terms.some(term => term.type === "prez"));`
			```

			`_Lining up the data_`

			`For this example, the name will be the first name combined with the last name`
			(`row.name.first + " " + row.name.last`) and the birthday will be the subfield
			`row.bio.birthday`. Using `Array#map`, the dataset can be massaged in one call:

			```js
			`const rows = prez.map(row => ({`
			`name: row.name.first + " " + row.name.last,`
			`birthday: row.bio.birthday`
			`}));`
			```

			`The result is an array of "simple" objects with no nesting:`

			```js
			`[`
			`{ name: "George Washington", birthday: "1732-02-22" },`
			`{ name: "John Adams", birthday: "1735-10-19" },`
			`// ... one row per President`
			`]`
			```

			`Extract Data`

			With the cleaned dataset, `XLSX.utils.json_to_sheet` generates a worksheet:

			```js
			`const worksheet = XLSX.utils.json_to_sheet(rows);`
			```

			`XLSX.utils.book_new` creates a new workbook and `XLSX.utils.book_append_sheet`
			`appends a worksheet to the workbook. The new worksheet will be called "Dates":`

			```js
			`const workbook = XLSX.utils.book_new();`
			`XLSX.utils.book_append_sheet(workbook, worksheet, "Dates");`
			```

			`Process Data`

			`_Fixing headers_`

			By default, `json_to_sheet` creates a worksheet with a header row. In this case,
			`the headers come from the JS object keys: "name" and "birthday".`

			The headers are in cells A1 and B1. `XLSX.utils.sheet_add_aoa` can write text
			`values to the existing worksheet starting at cell A1:`

			```js
			`XLSX.utils.sheet_add_aoa(worksheet, [["Name", "Birthday"]], { origin: "A1" });`
			```

			`_Fixing Column Widths_`

			`Some of the names are longer than the default column width. Column widths are`
			set by [setting the `"!cols"` worksheet property](#row-and-column-properties).

			`The following line sets the width of column A to approximately 10 characters:`

			```js
			`worksheet["!cols"] = [ { wch: 10 } ]; // set column A width to 10 characters`
			```

			One `Array#reduce` call over `rows` can calculate the maximum width:

			```js
			`const max_width = rows.reduce((w, r) => Math.max(w, r.name.length), 10);`
			`worksheet["!cols"] = [ { wch: max_width } ];`
			```

			Note: If the starting point was a file or HTML table, `XLSX.utils.sheet_to_json`
			`will generate an array of JS objects.`

			`Package and Release Data`

			`XLSX.writeFile` creates a spreadsheet file and tries to write it to the system.
			`In the browser, it will try to prompt the user to download the file. In NodeJS,`
			`it will write to the local directory.`

			```js
			`XLSX.writeFile(workbook, "Presidents.xlsx");`
			```

			`Complete Example`

			```js
			`// Uncomment the next line for use in NodeJS:`
			`// const XLSX = require("xlsx"), axios = require("axios");`

			`(async() => {`
			`/* fetch JSON data and parse */`
			`const url = "https://theunitedstates.io/congress-legislators/executive.json";`
			`const raw_data = (await axios(url, {responseType: "json"})).data;`

			`/* filter for the Presidents */`
			`const prez = raw_data.filter(row => row.terms.some(term => term.type === "prez"));`

			`/* flatten objects */`
			`const rows = prez.map(row => ({`
			`name: row.name.first + " " + row.name.last,`
			`birthday: row.bio.birthday`
			`}));`

			`/* generate worksheet and workbook */`
			`const worksheet = XLSX.utils.json_to_sheet(rows);`
			`const workbook = XLSX.utils.book_new();`
			`XLSX.utils.book_append_sheet(workbook, worksheet, "Dates");`

			`/* fix headers */`
			`XLSX.utils.sheet_add_aoa(worksheet, [["Name", "Birthday"]], { origin: "A1" });`

			`/* calculate column width */`
			`const max_width = rows.reduce((w, r) => Math.max(w, r.name.length), 10);`
			`worksheet["!cols"] = [ { wch: max_width } ];`

			`/* create an XLSX file and try to save to Presidents.xlsx */`
			`XLSX.writeFile(workbook, "Presidents.xlsx");`
			`})();`
			```

			For use in the web browser, assuming the snippet is saved to `snippet.js`,
			script tags should be used to include the `axios` and `xlsx` standalone builds:

			```html
			`<script src="https://unpkg.com/xlsx/dist/xlsx.full.min.js"></script>`
			`<script src="https://unpkg.com/axios/dist/axios.min.js"></script>`
			`<script src="snippet.js"></script>`
			```


			`</details>`

			`_File formats are implementation details_`

			`The parser covers a wide gamut of common spreadsheet file formats to ensure that`
			`"HTML-saved-as-XLS" files work as well as actual XLS or XLSX files.`

			`The writer supports a number of common output formats for broad compatibility`
			`with the data ecosystem.`

			`To the greatest extent possible, data processing code should not have to worry`
			`about the specific file formats involved.`

NUMBERS primary cell storage parse docs clarified row and column props (fixes #2486) (fixes #2511) 2022-02-04 05:29:01 +00:00