math
1
.gitignore
vendored
@ -3,3 +3,4 @@
|
||||
package-lock.json
|
||||
pnpm-lock.yaml
|
||||
/docs
|
||||
node_modules
|
||||
|
307
docz/docs/03-demos/01-math/09-danfojs.md
Normal file
@ -0,0 +1,307 @@
|
||||
---
|
||||
title: Sheets in DanfoJS
|
||||
sidebar_label: DanfoJS
|
||||
pagination_prev: demos/index
|
||||
pagination_next: demos/frontend/index
|
||||
---
|
||||
|
||||
<head>
|
||||
<script src="https://cdn.jsdelivr.net/npm/danfojs@1.1.2/lib/bundle.min.js"></script>
|
||||
</head>
|
||||
|
||||
[SheetJS](https://sheetjs.com) is a JavaScript library for reading and writing
|
||||
data from spreadsheets.
|
||||
|
||||
[DanfoJS](https://danfo.jsdata.org/) is a library for processing structured
|
||||
data. It uses SheetJS under the hood for reading and writing spreadsheets.
|
||||
|
||||
This demo covers details elided in the official DanfoJS documentation.
|
||||
|
||||
:::note Tested Deployments
|
||||
|
||||
This example was last tested on 2024 January 03 against DanfoJS 1.1.2.
|
||||
|
||||
:::
|
||||
|
||||
:::info Browser integration
|
||||
|
||||
The live demos on this page include the DanfoJS browser bundle:
|
||||
|
||||
```html
|
||||
<script src="https://cdn.jsdelivr.net/npm/danfojs@1.1.2/lib/bundle.min.js"></script>
|
||||
```
|
||||
|
||||
There are known issues with the documentation generator. If a demo explicitly
|
||||
prints "RELOAD THIS PAGE", please reload or refresh the page.
|
||||
|
||||
:::
|
||||
|
||||
## DataFrames and Worksheets
|
||||
|
||||
The DanfoJS `DataFrame`[^1] represents two-dimensional tabular data. It is the
|
||||
starting point for most DanfoJS data processing tasks. A `DataFrame` typically
|
||||
corresponds to one SheetJS worksheet[^2].
|
||||
|
||||
<table><thead><tr><th>Spreadsheet</th><th>DanfoJS DataFrame</th></tr></thead><tbody><tr><td>
|
||||
|
||||
![`pres.xlsx` data](pathname:///pres.png)
|
||||
|
||||
</td><td>
|
||||
|
||||
```
|
||||
╔════╤═══════════════╤═══════╗
|
||||
║ │ Name │ Index ║
|
||||
╟────┼───────────────┼───────╢
|
||||
║ 0 │ Bill Clinton │ 42 ║
|
||||
╟────┼───────────────┼───────╢
|
||||
║ 1 │ GeorgeW Bush │ 43 ║
|
||||
╟────┼───────────────┼───────╢
|
||||
║ 2 │ Barack Obama │ 44 ║
|
||||
╟────┼───────────────┼───────╢
|
||||
║ 3 │ Donald Trump │ 45 ║
|
||||
╟────┼───────────────┼───────╢
|
||||
║ 4 │ Joseph Biden │ 46 ║
|
||||
╚════╧═══════════════╧═══════╝
|
||||
```
|
||||
|
||||
</td></tr></tbody></table>
|
||||
|
||||
## DanfoJS SheetJS Integration
|
||||
|
||||
:::note pass
|
||||
|
||||
The official documentation inconsistently names the library object `danfo` and
|
||||
`dfd`. Since `dfd` is the browser global, the demos use the name `dfd`.
|
||||
|
||||
:::
|
||||
|
||||
Methods to read and write spreadsheets are attached to the main `dfd` object.
|
||||
|
||||
### Importing DataFrames
|
||||
|
||||
`readExcel`[^3] accepts two arguments: source data and options.
|
||||
|
||||
The source data must be a `string` or `File` object. Strings are interpreted as
|
||||
URLs while `File` objects are treated as data.
|
||||
|
||||
_Selecting a Worksheet_
|
||||
|
||||
DanfoJS will generate a dataframe from one worksheet. The parser normally uses
|
||||
the first worksheet. The `sheet` property of the options object controls the
|
||||
selected worksheet. It is expected to be a zero-indexed number:
|
||||
|
||||
```js
|
||||
const first_sheet = await dfd.readExcel(url, {sheet: 0});
|
||||
const second_sheet = await dfd.readExcel(url, {sheet: 1});
|
||||
```
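
When only the worksheet name is known, the position can be looked up before
calling `readExcel`. The following sketch assumes the list of names is already
available (for example, the `SheetNames` array of a SheetJS workbook object).
`sheet_index_for` is a hypothetical helper and is not part of DanfoJS:

```js
/* hypothetical helper: find the zero-indexed position of a worksheet name */
function sheet_index_for(sheet_names, name) {
  const idx = sheet_names.indexOf(name);
  if(idx == -1) throw new Error(`Worksheet ${name} not found`);
  return idx;
}

/* sample list of worksheet names, in workbook order */
const names = ["Summary", "Data"];
const opts = { sheet: sheet_index_for(names, "Data") };
```

The resulting `opts` object can be passed as the second argument to `readExcel`.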
|
||||
|
||||
_More Parsing Options_
|
||||
|
||||
The `parsingOptions` property of the options argument is passed directly to the
|
||||
SheetJS `read` method[^4].
|
||||
|
||||
For example, the `sheetRows` property controls how many rows are extracted from
|
||||
larger worksheets. To pull 3 data rows, `sheetRows` must be set to 4:
|
||||
|
||||
```js
|
||||
const first_three_rows = await dfd.readExcel(url, { parsingOptions: {
|
||||
// see https://docs.sheetjs.com/docs/api/parse-options for details
|
||||
sheetRows: 4
|
||||
} });
|
||||
```
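
The extra row in the example accounts for the header row. As a small sketch,
assuming the worksheet has exactly one header row, a hypothetical
`options_for_rows` helper can compute the value:

```js
/* hypothetical helper: build readExcel options for `n` data rows */
function options_for_rows(n, header_rows = 1) {
  /* assumption: header rows count against the sheetRows limit */
  return { parsingOptions: { sheetRows: n + header_rows } };
}

/* three data rows plus one header row */
const three_row_opts = options_for_rows(3);
```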
|
||||
|
||||
#### URL source
|
||||
|
||||
The following example fetches a [test file](https://sheetjs.com/pres.xlsx),
|
||||
parses with SheetJS and generates a DanfoJS dataframe.
|
||||
|
||||
```jsx live
|
||||
function DanfoReadExcelURL() {
|
||||
const [text, setText] = React.useState("");
|
||||
React.useEffect(() => { (async() => {
|
||||
if(typeof dfd === "undefined") return setText("RELOAD THIS PAGE!");
|
||||
const df = await dfd.readExcel("https://sheetjs.com/pres.xlsx");
|
||||
setText("" + df.head());
|
||||
})(); }, []);
|
||||
return (<pre>{text}</pre>);
|
||||
}
|
||||
```
|
||||
|
||||
#### File source
|
||||
|
||||
The following example uses a file input element. The "File API"[^5] section of
|
||||
the "Local File Access" demo covers the browser API in more detail.
|
||||
|
||||
```jsx live
|
||||
function DanfoReadExcelFile() {
|
||||
const [text, setText] = React.useState("Select a spreadsheet");
|
||||
|
||||
return (<><pre>{text}</pre><input type="file" onChange={async(e) => {
|
||||
if(typeof dfd === "undefined") return setText("RELOAD THIS PAGE!");
|
||||
|
||||
/* get first file */
|
||||
const file = e.target.files[0];
|
||||
|
||||
/* create dataframe and pretty-print the first 5 rows (the `head` default) */
|
||||
const df = await dfd.readExcel(file);
|
||||
setText("" + df.head());
|
||||
}}/></>);
|
||||
}
|
||||
```
|
||||
|
||||
### Exporting DataFrames
|
||||
|
||||
`toExcel`[^6] accepts two arguments: dataframe and options. Under the hood, it
|
||||
uses the SheetJS `writeFile` method[^7].
|
||||
|
||||
_Exported File Name_
|
||||
|
||||
The relevant property for the file name depends on the platform:
|
||||
|
||||
| Platform | Property |
|
||||
|:---------|:-----------|
|
||||
| NodeJS | `filePath` |
|
||||
| Browser | `fileName` |
|
||||
|
||||
The exporter will deduce the desired file format from the file extension.
|
||||
|
||||
_Worksheet Name_
|
||||
|
||||
The `sheetName` property specifies the name of the worksheet in the workbook:
|
||||
|
||||
```js
|
||||
dfd.toExcel(df, {
|
||||
fileName: "test.xlsx", // generate `test.xlsx`
|
||||
// highlight-next-line
|
||||
sheetName: "Export" // The name of the worksheet will be "Export"
|
||||
});
|
||||
```
|
||||
|
||||
:::caution pass
|
||||
|
||||
The DanfoJS integration forces the `.xlsx` file extension. Exporting to other
|
||||
file formats will require [low-level operations](#generating-files).
|
||||
|
||||
:::
|
||||
|
||||
_More Writing Options_
|
||||
|
||||
The `writingOptions` property of the options argument is passed directly to the
|
||||
SheetJS `writeFile` method[^8].
|
||||
|
||||
For example, the `compression` property enables ZIP compression for XLSX and
|
||||
other formats:
|
||||
|
||||
```js
|
||||
dfd.toExcel(df, {fileName: "export.xlsx", writingOptions: {
|
||||
// see https://docs.sheetjs.com/docs/api/write-options for details
|
||||
compression: true
|
||||
}});
|
||||
```
|
||||
|
||||
#### Export to File
|
||||
|
||||
The following example exports a sample dataframe to an XLSX spreadsheet.
|
||||
|
||||
```jsx live
|
||||
function DanfoToExcel() {
|
||||
if(typeof dfd === "undefined") return (<b>RELOAD THIS PAGE</b>);
|
||||
/* sample dataframe */
|
||||
const df = new dfd.DataFrame([{Sheet:1,JS:2},{Sheet:3,JS:4}]);
|
||||
return ( <><button onClick={async() => {
|
||||
/* dfd.toExcel calls the SheetJS `writeFile` method */
|
||||
dfd.toExcel(df, {fileName: "SheetJSDanfoJS.xlsx", writingOptions: {
|
||||
compression: true
|
||||
}});
|
||||
}}>Click to Export</button><pre>{"Data:\n"+df.head()}</pre></> );
|
||||
}
|
||||
```
|
||||
|
||||
## Low-Level Operations
|
||||
|
||||
DanfoJS and SheetJS provide methods for processing arrays of objects.
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
ws((SheetJS\nWorksheet))
|
||||
aoo[[array of\nobjects]]
|
||||
df[(DanfoJS\nDataFrame)]
|
||||
ws --> |sheet_to_json\n\n| aoo
|
||||
aoo --> |\njson_to_sheet| ws
|
||||
df --> |\ndfd.toJSON| aoo
|
||||
aoo --> |new DataFrame\n\n| df
|
||||
```
|
||||
|
||||
### Creating DataFrames
|
||||
|
||||
The `DataFrame` constructor[^9] creates `DataFrame` objects from arrays of
|
||||
objects. Given a SheetJS worksheet object, the `sheet_to_json` method[^10]
|
||||
generates compatible arrays of objects:
|
||||
|
||||
```js
|
||||
function ws_to_df(ws) {
|
||||
const aoo = XLSX.utils.sheet_to_json(ws);
|
||||
return new dfd.DataFrame(aoo);
|
||||
}
|
||||
```
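
For reference, the arrays of objects in this flow use the header row values as
keys, with one object per data row. The following plain JavaScript sketch builds
that shape by hand from part of the sample table. `rows_to_objects` is a
hypothetical helper for illustration and is not how SheetJS is implemented:

```js
/* hypothetical helper: pair each data row with the header row */
function rows_to_objects(rows) {
  const [header, ...body] = rows;
  return body.map(row =>
    Object.fromEntries(header.map((key, C) => [key, row[C]]))
  );
}

/* first row is the header row, remaining rows are data */
const aoo = rows_to_objects([
  ["Name", "Index"],
  ["Bill Clinton", 42],
  ["GeorgeW Bush", 43]
]);
/* aoo[0] is { Name: "Bill Clinton", Index: 42 } */
```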
|
||||
|
||||
### Generating Files
|
||||
|
||||
`toJSON`[^11] accepts two arguments: dataframe and options.
|
||||
|
||||
The `format` key of the `options` argument dictates the result layout. The
|
||||
`column` layout generates an array of objects in row-major order. The SheetJS
|
||||
`json_to_sheet`[^12] method can generate a worksheet object from the result:
|
||||
|
||||
```js
|
||||
function df_to_ws(df) {
|
||||
const aoo = dfd.toJSON(df, { format: "column" });
|
||||
return XLSX.utils.json_to_sheet(aoo);
|
||||
}
|
||||
```
|
||||
|
||||
The SheetJS `book_new` method creates a workbook object from the worksheet[^13]
|
||||
and the `writeFile` method[^14] will generate the file:
|
||||
|
||||
```js
|
||||
const ws = df_to_ws(df);
|
||||
const wb = XLSX.utils.book_new(ws, "Export");
|
||||
XLSX.writeFile(wb, "SheetJSDanfoJS.xlsb", { compression: true });
|
||||
```
|
||||
|
||||
The following demo exports a sample dataframe to XLSB. This operation is not
|
||||
supported by the DanfoJS `toExcel` method since that method enforces XLSX.
|
||||
|
||||
```jsx live
|
||||
function DanfoToXLSB() {
|
||||
if(typeof dfd === "undefined") return (<b>RELOAD THIS PAGE</b>);
|
||||
/* sample dataframe */
|
||||
const df = new dfd.DataFrame([{Sheet:1,JS:2},{Sheet:3,JS:4}]);
|
||||
return ( <><button onClick={async() => {
|
||||
/* generate worksheet */
|
||||
const aoo = dfd.toJSON(df, { format: "column" });
|
||||
const ws = XLSX.utils.json_to_sheet(aoo);
|
||||
|
||||
/* generate workbook */
|
||||
const wb = XLSX.utils.book_new(ws, "Export");
|
||||
|
||||
/* write to XLSB */
|
||||
XLSX.writeFile(wb, "SheetJSDanfoJS.xlsb", { compression: true });
|
||||
}}>Click to Export</button><pre>{"Data:\n"+df.head()}</pre></> );
|
||||
}
|
||||
```
|
||||
|
||||
[^1]: See ["Dataframe"](https://danfo.jsdata.org/api-reference/dataframe) in the DanfoJS documentation
|
||||
[^2]: See ["Sheet Objects"](/docs/csf/sheet)
|
||||
[^3]: See ["danfo.readExcel"](https://danfo.jsdata.org/api-reference/input-output/danfo.read_excel) in the DanfoJS documentation.
|
||||
[^4]: See ["Reading Files"](/docs/api/parse-options/#parsing-options) for the full list of parsing options.
|
||||
[^5]: See ["File API" in "Local File Access"](/docs/demos/local/file#file-api) for more details.
|
||||
[^6]: See ["danfo.toExcel"](https://danfo.jsdata.org/api-reference/input-output/danfo.to_excel) in the DanfoJS documentation.
|
||||
[^7]: See [`writeFile` in "Writing Files"](/docs/api/write-options)
|
||||
[^8]: See ["Writing Files"](/docs/api/write-options/#writing-options) for the full list of writing options.
|
||||
[^9]: See ["Creating a DataFrame"](https://danfo.jsdata.org/api-reference/dataframe/creating-a-dataframe) in the DanfoJS documentation.
|
||||
[^10]: See [`sheet_to_json` in "Utilities"](/docs/api/utilities/array#array-output)
|
||||
[^11]: See ["danfo.toJSON"](https://danfo.jsdata.org/api-reference/input-output/danfo.to_json) in the DanfoJS documentation.
|
||||
[^12]: See [`json_to_sheet` in "Utilities"](/docs/api/utilities/array#array-of-objects-input)
|
||||
[^13]: See [`book_new` in "Utilities"](/docs/api/utilities/wb)
|
||||
[^14]: See [`writeFile` in "Writing Files"](/docs/api/write-options)
|
440
docz/docs/03-demos/01-math/11-tensorflow.md
Normal file
@ -0,0 +1,440 @@
|
||||
---
|
||||
title: Sheets in TensorFlow
|
||||
sidebar_label: TensorFlow.js
|
||||
pagination_prev: demos/index
|
||||
pagination_next: demos/frontend/index
|
||||
---
|
||||
|
||||
<head>
|
||||
<script src="https://docs.sheetjs.com/tfjs/tf.min.js"></script>
|
||||
</head>
|
||||
|
||||
[TensorFlow.js](https://www.tensorflow.org/js) (shortened to TF.js) is a library
|
||||
for machine learning in JavaScript.
|
||||
|
||||
[SheetJS](https://sheetjs.com) is a JavaScript library for reading and writing
|
||||
data from spreadsheets.
|
||||
|
||||
This demo uses TensorFlow.js and SheetJS to process data in spreadsheets. We'll
|
||||
explore how to load spreadsheet data into TF.js datasets and how to export
|
||||
results back to spreadsheets.
|
||||
|
||||
- ["CSV Data Interchange"](#csv-data-interchange) uses SheetJS to process sheets
|
||||
and generate CSV data that TF.js can import.
|
||||
|
||||
- ["JS Array Interchange"](#js-array-interchange) uses SheetJS to process
|
||||
sheets and generate rows of objects that can be post-processed.
|
||||
|
||||
:::info pass
|
||||
|
||||
Live code blocks in this page use the TF.js `4.14.0` standalone build.
|
||||
|
||||
For use in web frameworks, the `@tensorflow/tfjs` module should be used.
|
||||
|
||||
For use in NodeJS, the native bindings module is `@tensorflow/tfjs-node`.
|
||||
|
||||
:::
|
||||
|
||||
:::note Tested Deployments
|
||||
|
||||
Each browser demo was tested in the following environments:
|
||||
|
||||
| Browser | TF.js version | Date |
|
||||
|:------------|:--------------|:-----------|
|
||||
| Chrome 119 | `4.14.0` | 2023-12-09 |
|
||||
| Safari 16.6 | `4.14.0` | 2023-12-09 |
|
||||
|
||||
:::
|
||||
|
||||
## CSV Data Interchange
|
||||
|
||||
`tf.data.csv`[^1] generates a Dataset from CSV data. The function expects a URL.
|
||||
|
||||
:::note pass
|
||||
|
||||
When this demo was last tested, there was no direct method to pass a CSV string
|
||||
to the underlying parser.
|
||||
|
||||
:::
|
||||
|
||||
Fortunately, `tf.data.csv` accepts blob URLs.
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
ws((SheetJS\nWorksheet))
|
||||
csv(CSV\nstring)
|
||||
url{{Data\nURL}}
|
||||
dataset[(TF.js\nDataset)]
|
||||
ws --> |sheet_to_csv\nSheetJS| csv
|
||||
csv --> |JavaScript\nAPIs| url
|
||||
url --> |tf.data.csv\nTensorFlow.js| dataset
|
||||
```
|
||||
|
||||
The SheetJS `sheet_to_csv` method[^2] generates a CSV string from a worksheet
|
||||
object. Using standard JavaScript techniques, a blob URL can be constructed:
|
||||
|
||||
```js
|
||||
function worksheet_to_csv_url(worksheet) {
|
||||
/* generate CSV */
|
||||
const csv = XLSX.utils.sheet_to_csv(worksheet);
|
||||
|
||||
/* CSV -> Uint8Array -> Blob */
|
||||
const u8 = new TextEncoder().encode(csv);
|
||||
const blob = new Blob([u8], { type: "text/csv" });
|
||||
|
||||
/* generate a blob URL */
|
||||
return URL.createObjectURL(blob);
|
||||
}
|
||||
```
|
||||
|
||||
### CSV Demo
|
||||
|
||||
This demo shows a simple model fitting using the "cars" dataset from TensorFlow.
|
||||
The [sample XLS file](https://sheetjs.com/data/cd.xls) contains the data. The
|
||||
data processing mirrors the official "Making Predictions from 2D Data" demo[^3].
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
file[(Remote\nFile)]
|
||||
subgraph SheetJS Operations
|
||||
ab[(Data\nBytes)]
|
||||
wb(((SheetJS\nWorkbook)))
|
||||
ws((SheetJS\nWorksheet))
|
||||
csv(CSV\nstring)
|
||||
end
|
||||
subgraph TensorFlow.js Operations
|
||||
url{{Data\nURL}}
|
||||
dataset[(TF.js\nDataset)]
|
||||
results((Results))
|
||||
end
|
||||
file --> |fetch\n\n| ab
|
||||
ab --> |read\n\n| wb
|
||||
wb --> |select\nsheet| ws
|
||||
ws --> |sheet_to_csv\n\n| csv
|
||||
csv --> |JS\nAPI| url
|
||||
url --> |tf.data.csv\nTF.js| dataset
|
||||
dataset --> |fitDataset\nTF.js| results
|
||||
```
|
||||
|
||||
The demo builds a model for predicting MPG from Horsepower data. It:
|
||||
|
||||
- fetches <https://sheetjs.com/data/cd.xls>
|
||||
- parses the data with the SheetJS `read`[^4] method
|
||||
- selects the first worksheet[^5] and converts to CSV using `sheet_to_csv`[^6]
|
||||
- generates a blob URL from the CSV text
|
||||
- generates a TF.js dataset with `tf.data.csv`[^7] and selects data columns
|
||||
- builds a model and trains with `fitDataset`[^8]
|
||||
- predicts MPG from a set of sample inputs and displays results in a table
|
||||
|
||||
<details><summary><b>Live Demo</b> (click to show)</summary>
|
||||
|
||||
:::caution pass
|
||||
|
||||
In some test runs, the results did not make sense given the underlying data.
|
||||
The dependent and independent variables are expected to be anti-correlated.
|
||||
|
||||
**This is a known issue in TF.js and affects the official demos.**
|
||||
|
||||
:::
|
||||
|
||||
:::caution pass
|
||||
|
||||
If the live demo shows a message
|
||||
|
||||
```
|
||||
ReferenceError: tf is not defined
|
||||
```
|
||||
|
||||
please refresh the page. This is a known bug in the documentation generator.
|
||||
|
||||
:::
|
||||
|
||||
```jsx live
|
||||
function SheetJSToTFJSCSV() {
|
||||
const [output, setOutput] = React.useState("");
|
||||
const [results, setResults] = React.useState([]);
|
||||
const [disabled, setDisabled] = React.useState(false);
|
||||
|
||||
function worksheet_to_csv_url(worksheet) {
|
||||
/* generate CSV */
|
||||
const csv = XLSX.utils.sheet_to_csv(worksheet);
|
||||
|
||||
/* CSV -> Uint8Array -> Blob */
|
||||
const u8 = new TextEncoder().encode(csv);
|
||||
const blob = new Blob([u8], { type: "text/csv" });
|
||||
|
||||
/* generate a blob URL */
|
||||
return URL.createObjectURL(blob);
|
||||
}
|
||||
|
||||
const doit = React.useCallback(async () => {
|
||||
setResults([]); setOutput(""); setDisabled(true);
|
||||
try {
|
||||
/* fetch file */
|
||||
const f = await fetch("https://sheetjs.com/data/cd.xls");
|
||||
const ab = await f.arrayBuffer();
|
||||
/* parse file and get first worksheet */
|
||||
const wb = XLSX.read(ab);
|
||||
const ws = wb.Sheets[wb.SheetNames[0]];
|
||||
|
||||
/* generate blob URL */
|
||||
const url = worksheet_to_csv_url(ws);
|
||||
|
||||
/* feed to tf.js */
|
||||
const dataset = tf.data.csv(url, {
|
||||
hasHeader: true,
|
||||
configuredColumnsOnly: true,
|
||||
columnConfigs:{
|
||||
"Horsepower": {required: false, default: 0},
|
||||
"Miles_per_Gallon":{required: false, default: 0, isLabel:true}
|
||||
}
|
||||
});
|
||||
|
||||
/* pre-process data */
|
||||
let flat = dataset
|
||||
.map(({xs,ys}) =>({xs: Object.values(xs), ys: Object.values(ys)}))
|
||||
.filter(({xs,ys}) => [...xs,...ys].every(v => v>0));
|
||||
|
||||
/* normalize manually :( */
|
||||
let minX = Infinity, maxX = -Infinity, minY = Infinity, maxY = -Infinity;
|
||||
await flat.forEachAsync(({xs, ys}) => {
|
||||
minX = Math.min(minX, xs[0]); maxX = Math.max(maxX, xs[0]);
|
||||
minY = Math.min(minY, ys[0]); maxY = Math.max(maxY, ys[0]);
|
||||
});
|
||||
flat = flat.map(({xs, ys}) => ({xs:xs.map(v => (v-minX)/(maxX - minX)),ys:ys.map(v => (v-minY)/(maxY-minY))}));
|
||||
flat = flat.batch(32);
|
||||
|
||||
/* build and train model */
|
||||
const model = tf.sequential();
|
||||
model.add(tf.layers.dense({inputShape: [1], units: 1}));
|
||||
model.compile({ optimizer: tf.train.sgd(0.000001), loss: 'meanSquaredError' });
|
||||
await model.fitDataset(flat, { epochs: 100, callbacks: { onEpochEnd: async (epoch, logs) => {
|
||||
setOutput(`${epoch}:${logs.loss}`);
|
||||
}}});
|
||||
|
||||
/* predict values */
|
||||
const inp = tf.linspace(0, 1, 9);
|
||||
const pred = model.predict(inp);
|
||||
const xs = await inp.data(), ys = await pred.data();
|
||||
setResults(Array.from(xs).map((x, i) => [ x * (maxX - minX) + minX, ys[i] * (maxY - minY) + minY ]));
|
||||
setOutput("");
|
||||
|
||||
} catch(e) { setOutput(`ERROR: ${String(e)}`); } finally { setDisabled(false);}
|
||||
});
|
||||
return ( <>
|
||||
<button onClick={doit} disabled={disabled}>Click to run</button><br/>
|
||||
{output && <pre>{output}</pre> || <></>}
|
||||
{results.length && <table><thead><tr><th>Horsepower</th><th>MPG</th></tr></thead><tbody>
|
||||
{results.map((r,i) => <tr key={i}><td>{r[0]}</td><td>{r[1].toFixed(2)}</td></tr>)}
|
||||
</tbody></table> || <></>}
|
||||
</> );
|
||||
}
|
||||
```
|
||||
|
||||
</details>
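
The manual normalization step in the live demo is ordinary min-max scaling. A
minimal sketch in plain JavaScript follows. The sample values are made up and
the helper names (`min_max`, `scale`, `unscale`) are hypothetical:

```js
/* find the smallest and largest values in an array */
function min_max(values) {
  let min = Infinity, max = -Infinity;
  for(const v of values) {
    if(v < min) min = v;
    if(v > max) max = v;
  }
  return { min, max };
}

/* map a value from [min, max] to [0, 1] */
const scale = (v, {min, max}) => (v - min) / (max - min);

/* map a value from [0, 1] back to [min, max] */
const unscale = (v, {min, max}) => v * (max - min) + min;

/* made-up sample values */
const horsepower = [50, 100, 150, 200];
const hp_range = min_max(horsepower);
const normalized = horsepower.map(v => scale(v, hp_range));
```

Predictions made on the normalized scale are mapped back with `unscale`,
mirroring the final step of the demo.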
|
||||
|
||||
## JS Array Interchange
|
||||
|
||||
[The official Linear Regression tutorial](https://www.tensorflow.org/js/tutorials/training/linear_regression)
|
||||
loads data from a JSON file:
|
||||
|
||||
```json
|
||||
[
|
||||
{
|
||||
"Name": "chevrolet chevelle malibu",
|
||||
"Miles_per_Gallon": 18,
|
||||
"Cylinders": 8,
|
||||
"Displacement": 307,
|
||||
"Horsepower": 130,
|
||||
"Weight_in_lbs": 3504,
|
||||
"Acceleration": 12,
|
||||
"Year": "1970-01-01",
|
||||
"Origin": "USA"
|
||||
},
|
||||
// ...
|
||||
]
|
||||
```
|
||||
|
||||
In real use cases, data is often stored in [spreadsheets](https://sheetjs.com/data/cd.xls):
|
||||
|
||||
![cd.xls screenshot](pathname:///files/cd.png)
|
||||
|
||||
Following the tutorial, the data fetching method can be adapted to handle arrays
|
||||
of objects, such as those generated by the SheetJS `sheet_to_json` method[^9].
|
||||
|
||||
Differences from the official example are highlighted below:
|
||||
|
||||
```js
|
||||
/**
|
||||
* Get the car data reduced to just the variables we are interested
|
||||
* and cleaned of missing data.
|
||||
*/
|
||||
async function getData() {
|
||||
// highlight-start
|
||||
/* fetch file */
|
||||
const carsDataResponse = await fetch('https://sheetjs.com/data/cd.xls');
|
||||
/* get file data (ArrayBuffer) */
|
||||
const carsDataAB = await carsDataResponse.arrayBuffer();
|
||||
/* parse */
|
||||
const carsDataWB = XLSX.read(carsDataAB);
|
||||
/* get first worksheet */
|
||||
const carsDataWS = carsDataWB.Sheets[carsDataWB.SheetNames[0]];
|
||||
/* generate array of JS objects */
|
||||
const carsData = XLSX.utils.sheet_to_json(carsDataWS);
|
||||
// highlight-end
|
||||
const cleaned = carsData.map(car => ({
|
||||
mpg: car.Miles_per_Gallon,
|
||||
horsepower: car.Horsepower,
|
||||
}))
|
||||
.filter(car => (car.mpg != null && car.horsepower != null));
|
||||
|
||||
return cleaned;
|
||||
}
|
||||
```
|
||||
|
||||
## Low-Level Operations
|
||||
|
||||
### Data Transposition
|
||||
|
||||
A typical dataset in a spreadsheet will start with one header row and represent
|
||||
each data record in its own row. For example, the Iris dataset might look like:
|
||||
|
||||
![Iris dataset](pathname:///files/iris.png)
|
||||
|
||||
The SheetJS `sheet_to_json` method[^10] will translate worksheet objects into an
|
||||
array of row objects:
|
||||
|
||||
```js
|
||||
var aoo = [
|
||||
{"sepal length": 5.1, "sepal width": 3.5, ...},
|
||||
{"sepal length": 4.9, "sepal width": 3, ...},
|
||||
...
|
||||
];
|
||||
```
|
||||
|
||||
TF.js and other libraries tend to operate on individual columns, equivalent to:
|
||||
|
||||
```js
|
||||
var sepal_lengths = [5.1, 4.9, ...];
|
||||
var sepal_widths = [3.5, 3, ...];
|
||||
```
|
||||
|
||||
When a `tensor2d` is exported, the layout will differ from the spreadsheet:
|
||||
|
||||
```js
|
||||
var data_set_2d = [
|
||||
[5.1, 4.9, ...],
|
||||
[3.5, 3, ...],
|
||||
...
|
||||
]
|
||||
```
|
||||
|
||||
This is the transpose of how people use spreadsheets!
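
A short sketch of the conversion from row objects to columns, using two sample
rows in the shape shown above (`column_of` is a hypothetical helper name):

```js
/* two sample rows in the shape generated by sheet_to_json */
const rows = [
  {"sepal length": 5.1, "sepal width": 3.5},
  {"sepal length": 4.9, "sepal width": 3}
];

/* hypothetical helper: pull one named field from every row */
const column_of = (aoo, key) => aoo.map(row => row[key]);

/* each inner array holds one variable */
const data_set_2d = [
  column_of(rows, "sepal length"),
  column_of(rows, "sepal width")
];
```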
|
||||
|
||||
### Exporting Datasets to a Worksheet
|
||||
|
||||
The `aoa_to_sheet` method[^11] can generate a worksheet from an array of arrays.
|
||||
ML libraries typically provide APIs to pull an array of arrays, but it will be
|
||||
transposed. To export multiple data sets, the data should be transposed:
|
||||
|
||||
```js
|
||||
/* assuming data is an array of typed arrays */
|
||||
var aoa = [];
|
||||
for(var i = 0; i < data.length; ++i) {
|
||||
for(var j = 0; j < data[i].length; ++j) {
|
||||
if(!aoa[j]) aoa[j] = [];
|
||||
aoa[j][i] = data[i][j];
|
||||
}
|
||||
}
|
||||
/* aoa can be directly converted to a worksheet object */
|
||||
var ws = XLSX.utils.aoa_to_sheet(aoa);
|
||||
```
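
As a concrete check of the loop above, the following self-contained sketch wraps
the same logic in a hypothetical `transpose` function and applies it to two
short typed arrays. The sample values are chosen to be exactly representable as
32-bit floats:

```js
/* hypothetical wrapper around the transposition loop */
function transpose(data) {
  const aoa = [];
  for(let i = 0; i < data.length; ++i) {
    for(let j = 0; j < data[i].length; ++j) {
      if(!aoa[j]) aoa[j] = [];
      aoa[j][i] = data[i][j];
    }
  }
  return aoa;
}

/* two variables with three observations each */
const columns = [
  Float32Array.of(1.5, 2.5, 3.5),
  Float32Array.of(10, 20, 30)
];

/* one row per observation, one column per variable */
const row_major = transpose(columns);
```

Each entry of `row_major` is one spreadsheet row, so the result can be passed to
`aoa_to_sheet` in the same way as `aoa` in the previous snippet.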
|
||||
|
||||
### Importing Data from a Spreadsheet
|
||||
|
||||
`sheet_to_json` with the option `header:1`[^12] will generate a row-major array
|
||||
of arrays that can be transposed. However, it is more efficient to walk the
|
||||
sheet manually:
|
||||
|
||||
```js
|
||||
/* find worksheet range */
|
||||
var range = XLSX.utils.decode_range(ws['!ref']);
|
||||
var out = [];
|
||||
/* walk the columns */
|
||||
for(var C = range.s.c; C <= range.e.c; ++C) {
|
||||
/* create the typed array */
|
||||
var ta = new Float32Array(range.e.r - range.s.r + 1);
|
||||
/* walk the rows */
|
||||
for(var R = range.s.r; R <= range.e.r; ++R) {
|
||||
/* find the cell, skip it if the cell isn't numeric or boolean */
|
||||
var cell = ws["!data"] ? (ws["!data"][R]||[])[C] : ws[XLSX.utils.encode_cell({r:R, c:C})];
|
||||
if(!cell || cell.t != 'n' && cell.t != 'b') continue;
|
||||
/* assign to the typed array */
|
||||
ta[R - range.s.r] = cell.v;
|
||||
}
|
||||
out.push(ta);
|
||||
}
|
||||
```
|
||||
|
||||
If the data set has a header row, the loop can be adjusted to skip that row.
|
||||
|
||||
### TF.js Tensors
|
||||
|
||||
A single `Array#map` can pull individual named fields from the result, which
|
||||
can be used to construct TensorFlow.js tensor objects:
|
||||
|
||||
```js
|
||||
const aoo = XLSX.utils.sheet_to_json(worksheet);
|
||||
const lengths = aoo.map(row => row["sepal length"]);
|
||||
const tensor = tf.tensor1d(lengths);
|
||||
```
|
||||
|
||||
`tf.Tensor` objects can be directly transposed using `transpose`:
|
||||
|
||||
```js
|
||||
var aoo = XLSX.utils.sheet_to_json(worksheet);
|
||||
// "x" and "y" are the fields we want to pull from the data
|
||||
var data = aoo.map(row => ([row["x"], row["y"]]));
|
||||
|
||||
// create a tensor representing two column datasets
|
||||
var tensor = tf.tensor2d(data).transpose();
|
||||
|
||||
// individual columns can be accessed
|
||||
var col1 = tensor.slice([0,0], [1,tensor.shape[1]]).flatten();
|
||||
var col2 = tensor.slice([1,0], [1,tensor.shape[1]]).flatten();
|
||||
```
|
||||
|
||||
For exporting, `stack` can be used to combine the columns into a single tensor.
The tensor data can then be pulled into a linear array:
|
||||
|
||||
```js
|
||||
/* pull data into a Float32Array */
|
||||
var result = tf.stack([col1, col2]).transpose();
|
||||
var shape = result.shape;
var f32 = result.dataSync();
|
||||
|
||||
/* construct an array of arrays of the data in spreadsheet order */
|
||||
var aoa = [];
|
||||
for(var j = 0; j < shape[0]; ++j) {
|
||||
aoa[j] = [];
|
||||
for(var i = 0; i < shape[1]; ++i) aoa[j][i] = f32[j * shape[1] + i];
|
||||
}
|
||||
|
||||
/* add headers to the top */
|
||||
aoa.unshift(["x", "y"]);
|
||||
|
||||
/* generate worksheet */
|
||||
var worksheet = XLSX.utils.aoa_to_sheet(aoa);
|
||||
```
|
||||
|
||||
[^1]: See [`tf.data.csv`](https://js.tensorflow.org/api/latest/#data.csv) in the TensorFlow.js documentation
|
||||
[^2]: See [`sheet_to_csv` in "CSV and Text"](/docs/api/utilities/csv#delimiter-separated-output)
|
||||
[^3]: The ["Making Predictions from 2D Data" example](https://codelabs.developers.google.com/codelabs/tfjs-training-regression/) uses a hosted JSON file. The [sample XLS file](https://sheetjs.com/data/cd.xls) includes the same data.
|
||||
[^4]: See [`read` in "Reading Files"](/docs/api/parse-options)
|
||||
[^5]: See ["Workbook Object"](/docs/csf/book)
|
||||
[^6]: See [`sheet_to_csv` in "CSV and Text"](/docs/api/utilities/csv#delimiter-separated-output)
|
||||
[^7]: See [`tf.data.csv`](https://js.tensorflow.org/api/latest/#data.csv) in the TensorFlow.js documentation
|
||||
[^8]: See [`tf.LayersModel.fitDataset`](https://js.tensorflow.org/api/latest/#tf.LayersModel.fitDataset) in the TensorFlow.js documentation
|
||||
[^9]: See [`sheet_to_json` in "Utilities"](/docs/api/utilities/array#array-output)
|
||||
[^10]: See [`sheet_to_json` in "Utilities"](/docs/api/utilities/array#array-output)
|
||||
[^11]: See [`aoa_to_sheet` in "Utilities"](/docs/api/utilities/array#array-of-arrays-input)
|
||||
[^12]: See [`sheet_to_json` in "Utilities"](/docs/api/utilities/array#array-output)
|
4
docz/docs/03-demos/01-math/_category_.json
Normal file
@ -0,0 +1,4 @@
|
||||
{
|
||||
"label": "Math and Statistics",
|
||||
"position": 1
|
||||
}
|
412
docz/docs/03-demos/01-math/index.md
Normal file
@ -0,0 +1,412 @@
|
||||
---
|
||||
title: Math and Statistics
|
||||
pagination_prev: demos/index
|
||||
pagination_next: demos/frontend/index
|
||||
---
|
||||
|
||||
import DocCardList from '@theme/DocCardList';
|
||||
import {useCurrentSidebarCategory} from '@docusaurus/theme-common';
|
||||
|
||||
With full support for IEEE 754 doubles and singles, JavaScript is an excellent
|
||||
language for mathematics and statistical analysis. It has also proven to be a
|
||||
viable platform for machine learning.
|
||||
|
||||
## Demos
|
||||
|
||||
Demos for various libraries are included in separate pages:
|
||||
|
||||
<ul>{useCurrentSidebarCategory().items.map((item, index) => {
|
||||
const listyle = (item.customProps?.icon) ? {
|
||||
listStyleImage: `url("${item.customProps.icon}")`
|
||||
} : {};
|
||||
return (<li style={listyle} {...(item.customProps?.class ? {className: item.customProps.class}: {})}>
|
||||
<a href={item.href}>{item.label}</a>{item.customProps?.summary && (" - " + item.customProps.summary)}
|
||||
</li>);
|
||||
})}</ul>
|
||||
|
||||
|
||||
## Typed Arrays
|
||||
|
||||
Modern JavaScript math and statistics libraries typically use `Float64Array` or
|
||||
`Float32Array` objects to efficiently store data variables.
|
||||
|
||||
<details><summary><b>Technical details</b> (click to show)</summary>
|
||||
|
||||
Under the hood, `ArrayBuffer` objects represent raw binary data. "Typed arrays"
|
||||
such as `Float64Array` and `Float32Array` are objects designed for efficient
|
||||
interpretation and mutation of `ArrayBuffer` data.
|
||||
|
||||
|
||||
:::note pass
|
||||
|
||||
`ArrayBuffer` objects are roughly analogous to heap-allocated memory. Typed arrays
|
||||
behave like typed pointers.
|
||||
|
||||
**JavaScript**
|
||||
|
||||
```js
|
||||
const buf = new ArrayBuffer(16);
|
||||
const dbl = new Float64Array(buf);
|
||||
dbl[1] = 3.14159;
|
||||
const u8 = new Uint8Array(buf);
|
||||
for(let i = 0; i < 8; ++i)
|
||||
console.log(u8[i+8]);
|
||||
```
|
||||
|
||||
**Equivalent C**
|
||||
|
||||
```c
|
||||
void *const buf = malloc(16);
|
||||
double *const dbl = (double *)buf;
|
||||
dbl[1] = 3.14159;
|
||||
uint8_t *const u8 = (uint8_t *)buf;
|
||||
for(uint8_t i = 0; i < 8; ++i)
|
||||
printf("%u\n", u8[i+8]);
|
||||
```
|
||||
|
||||
:::
|
||||
|
||||
</details>
|
||||
|
||||
### Reading from Sheets
|
||||
|
||||
Each typed array class has a `from` static method for converting data into a
|
||||
typed array. `Float64Array.from` returns a `double` typed array (8 bytes per
|
||||
value) and `Float32Array.from` generates a `float` typed array (4 bytes).
|
||||
|
||||
```js
|
||||
const column_f32 = Float32Array.from(arr); // 4-byte floats
|
||||
const column_f64 = Float64Array.from(arr); // 8-byte doubles
|
||||
```
|
||||
|
||||
:::info pass
|
||||
|
||||
Values in the array will be coerced to the relevant data type. Unsupported
|
||||
entries will be converted to quiet `NaN` values.
|
||||
|
||||
:::
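
A quick illustration of the coercion, which follows standard JavaScript number
conversion (the sample values are made up):

```js
/* mixed values, as might appear in a raw worksheet column */
const raw = [1.5, "2", true, false, "abc", undefined];

/* every entry is converted to a number */
const f64 = Float64Array.from(raw);

/* numbers and numeric strings convert directly */
/* true and false become 1 and 0 */
/* "abc" and undefined become NaN */
```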
|
||||
|
||||
#### Extracting Worksheet Data
|
||||
|
||||
The SheetJS `sheet_to_json`[^1] method with the option `header: 1`[^2] generates
|
||||
an array of arrays from a worksheet object. The result is in row-major order:
|
||||
|
||||
```js
|
||||
const aoa = XLSX.utils.sheet_to_json(worksheet, {header: 1});
|
||||
```
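
Since the result is row-major, a column must be gathered across rows before
conversion. A minimal sketch follows, assuming the first row holds headers.
`column_to_f64` is a hypothetical helper and the sample array stands in for
`sheet_to_json` output:

```js
/* hypothetical helper: extract column C (zero-indexed), skipping the header row */
function column_to_f64(aoa, C) {
  return Float64Array.from(aoa.slice(1), row => row[C]);
}

/* sample row-major data with a header row */
const sample_aoa = [
  ["sepal length", "sepal width"],
  [5.1, 3.5],
  [4.9, 3]
];

/* second column as 8-byte doubles */
const sepal_widths = column_to_f64(sample_aoa, 1);
```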
|
||||
|
||||
#### Categorical Variables
|
||||
|
||||
Dichotomous variables are commonly represented as spreadsheet `TRUE` or `FALSE`.
|
||||
The SheetJS `sheet_to_json` method will translate these values to `true` and
|
||||
`false`. Typed array methods will interpret values as `1` and `0` respectively.
|
||||
|
||||
Polychotomous variables must be manually mapped to numeric values. For example,
|
||||
using the Iris dataset:
|
||||
|
||||
![Iris dataset](pathname:///typedarray/iris.png)
|
||||
|
||||
```js
|
||||
[
|
||||
["sepal length", "sepal width", "petal length", "petal width", "class"],
|
||||
[5.1, 3.5, 1.4, 0.2, "Iris-setosa"],
|
||||
[4.9, 3, 1.4, 0.2, "Iris-setosa"],
|
||||
]
|
||||
```
|
||||
|
||||
Column E (`class`) is a polychotomous variable and must be manually translated:
|
||||
|
||||
```js
|
||||
const aoa = XLSX.utils.sheet_to_json(worksheet, {header: 1});
|
||||
|
||||
/* index_to_class will be needed to recover the values later */
|
||||
const index_to_class = [];
|
||||
|
||||
/* map from class name to number */
|
||||
const class_to_index = new Map();
|
||||
|
||||
/* loop over the data */
|
||||
for(let R = 1; R < aoa.length; ++R) {
|
||||
/* Column E = SheetJS column index 4 */
|
||||
const category = aoa[R][4];
|
||||
const val = class_to_index.get(category);
|
||||
if(val == null) {
|
||||
/* assign a new index */
|
||||
class_to_index.set(category, index_to_class.length);
|
||||
aoa[R][4] = index_to_class.length;
|
||||
index_to_class.push(category);
|
||||
} else aoa[R][4] = val;
|
||||
}
|
||||
```
|
||||
|
||||
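The reverse lookup can be sketched as follows. The mapping and the index values are hypothetical stand-ins for the `index_to_class` array built by the loop above:

```js
/* hypothetical mapping, in the order the classes were first seen */
const index_to_class = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"];

/* hypothetical class column after translation to numeric indices */
const classes = Float64Array.from([0, 0, 1, 2, 1]);

/* the typed array `map` method can only produce numbers, so Array.from is used
   to build a plain array of strings */
const names = Array.from(classes, (idx) => index_to_class[idx]);
console.log(names);
```

Calling `classes.map` directly would coerce each class name back to a number and yield `NaN` values.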
<details><summary><b>Live Demo</b> (click to show)</summary>
|
||||
|
||||
This example fetches and parses [`iris.xlsx`](pathname:///typedarray/iris.xlsx).
|
||||
The first worksheet is processed and the new data and mapping are printed.
|
||||
|
||||
```jsx live
|
||||
function SheetJSPolychotomy() {
|
||||
const [cat, setCat] = React.useState([]);
|
||||
const [aoa, setAoA] = React.useState([]);
|
||||
|
||||
React.useEffect(() => { (async() => {
|
||||
const ab = await (await fetch("/typedarray/iris.xlsx")).arrayBuffer();
|
||||
const wb = XLSX.read(ab);
|
||||
const aoa = XLSX.utils.sheet_to_json(wb.Sheets[wb.SheetNames[0]], {header:1});
|
||||
|
||||
const index_to_class = [];
|
||||
const class_to_index = new Map();
|
||||
for(let R = 1; R < aoa.length; ++R) {
|
||||
const category = aoa[R][4];
|
||||
const val = class_to_index.get(category);
|
||||
if(val == null) {
|
||||
class_to_index.set(category, index_to_class.length);
|
||||
aoa[R][4] = index_to_class.length;
|
||||
index_to_class.push(category);
|
||||
} else aoa[R][4] = val;
|
||||
}
|
||||
|
||||
/* display every 25th row, skipping the header row */
|
||||
setAoA(aoa.filter((_, i) => (i % 25) == 1));
|
||||
setCat(index_to_class);
|
||||
})(); }, []);
|
||||
|
||||
return ( <>
|
||||
<b>Mapping</b><br/>
|
||||
<table><thead><tr><th>Index</th><th>Name</th></tr></thead><tbody>
|
||||
{cat.map((name, i) => (<tr><td>{i}</td><td>{name}</td></tr>))}
|
||||
</tbody></table>
|
||||
<b>Sample Data</b><br/>
|
||||
<table><thead><tr>{"ABCDE".split("").map(c => (<th>{c}</th>))}</tr></thead><tbody>
|
||||
{aoa.map(row => (<tr>{row.map(col => (<td>{col}</td>))}</tr>))}
|
||||
</tbody></table>
|
||||
</>
|
||||
);
|
||||
}
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
#### One Variable per Column
|
||||
|
||||
It is common to store datasets where each row represents an observation and each
|
||||
column represents a variable:
|
||||
|
||||
![Iris dataset](pathname:///typedarray/iris.png)
|
||||
|
||||
```js
|
||||
var aoa = [
|
||||
["sepal length", "sepal width", "petal length", "petal width", "class"],
|
||||
[5.1, 3.5, 1.4, 0.2, "Iris-setosa"],
|
||||
[4.9, 3, 1.4, 0.2, "Iris-setosa"],
|
||||
]
|
||||
```
|
||||
|
||||
An array `map` operation can pull data from an individual column. After mapping,
|
||||
a `slice` can remove the header label. For example, the following snippet pulls
|
||||
column C ("petal length") into a `Float64Array`:
|
||||
|
||||
```js
|
||||
const C = XLSX.utils.decode_col("C"); // Column "C" = SheetJS index 2
|
||||
const petal_length = Float64Array.from(aoa.map(row => row[C]).slice(1));
|
||||
```
|
||||
|
||||
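The same approach extends to several columns. In this sketch, `column_to_f64` is a hypothetical helper (not part of SheetJS) and the column indices are written out directly instead of calling `decode_col`:

```js
const aoa = [
  ["sepal length", "sepal width", "petal length", "petal width", "class"],
  [5.1, 3.5, 1.4, 0.2, "Iris-setosa"],
  [4.9, 3, 1.4, 0.2, "Iris-setosa"],
];

/* hypothetical helper: pull one column (0-indexed) into a Float64Array */
function column_to_f64(aoa, C) {
  /* remove the header row first so the label is never converted */
  return Float64Array.from(aoa.slice(1), (row) => row[C]);
}

/* map header labels to typed arrays for the numeric columns A-D */
const columns = {};
for(let C = 0; C < 4; ++C) columns[aoa[0][C]] = column_to_f64(aoa, C);

console.log(Array.from(columns["petal length"])); // 1.4, 1.4
```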
#### One Variable per Row
|
||||
|
||||
Some datasets are stored in tables where each row represents a variable and each
|
||||
column represents an observation:
|
||||
|
||||
<table><thead><tr><th>JavaScript</th><th>Spreadsheet</th></tr></thead><tbody><tr><td>
|
||||
|
||||
```js
|
||||
var aoa = [
|
||||
["sepal length", 5.1, 4.9],
|
||||
["sepal width", 3.5, 3],
|
||||
["petal length", 1.4, 1.4],
|
||||
["petal width", 0.2, 0.2],
|
||||
["class", "setosa", "setosa"]
|
||||
]
|
||||
```
|
||||
|
||||
</td><td>
|
||||
|
||||
![Iris dataset with one variable per row](pathname:///typedarray/iristr.png)
|
||||
|
||||
</td></tr></tbody></table>
|
||||
|
||||
|
||||
From the row-major array of arrays, each entry of the outer array is a row.
|
||||
|
||||
Many sheets include header columns. The `slice` method can remove the header.
|
||||
After removing the header, `Float64Array.from` can generate a typed array. For
|
||||
example, this snippet pulls row 3 ("petal length") into a `Float64Array`:
|
||||
|
||||
```js
|
||||
const petal_length = Float64Array.from(aoa[2].slice(1));
|
||||
```
|
||||
|
||||
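To collect every numeric row at once, one possible sketch builds a `Map` from row titles to typed arrays. The variable names are illustrative, and rows that contain text (such as `class`) are skipped by assumption:

```js
const aoa = [
  ["sepal length", 5.1, 4.9],
  ["sepal width", 3.5, 3],
  ["petal length", 1.4, 1.4],
  ["petal width", 0.2, 0.2],
  ["class", "setosa", "setosa"]
];

/* map each row title to a typed array of the remaining values */
const variables = new Map();
for(const row of aoa) {
  /* the first entry is the title; the rest are observations */
  const [title, ...values] = row;
  /* skip rows with non-numeric data */
  if(!values.every((v) => typeof v == "number")) continue;
  variables.set(title, Float64Array.from(values));
}

console.log(Array.from(variables.get("petal length"))); // 1.4, 1.4
```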
### Writing to Sheets
|
||||
|
||||
The SheetJS `aoa_to_sheet`[^1] method can generate a worksheet from an array of
|
||||
arrays. Similarly, `sheet_add_aoa`[^2] can add an array of arrays of data into
|
||||
an existing worksheet object. The `origin` option[^3] controls where data will
|
||||
be written in the worksheet.
|
||||
|
||||
Neither method understands typed arrays, so data columns must be converted to
|
||||
arrays of arrays.
|
||||
|
||||
#### One Variable per Row
|
||||
|
||||
A single typed array can be converted to a pure JS array with `Array.from`:
|
||||
|
||||
```js
|
||||
const arr = Array.from(column);
|
||||
```
|
||||
|
||||
An array of arrays can be created from the array:
|
||||
|
||||
```js
|
||||
const aoa = [
|
||||
arr // this array is the first element of the array literal
|
||||
];
|
||||
```
|
||||
|
||||
`aoa_to_sheet` and `sheet_add_aoa` treat this as one row. By default, data will
|
||||
be written to cells in the first row of the worksheet.
|
||||
|
||||
Titles can be added to data rows with an `unshift` operation, but it is more
|
||||
efficient to build up the worksheet with `aoa_to_sheet`:
|
||||
|
||||
```js
|
||||
/* sample data */
|
||||
const data = new Float64Array([54337.95, 3.14159, 2.718281828]);
|
||||
const title = "Values";
|
||||
|
||||
/* convert sample data to array */
|
||||
const arr = Array.from(data);
|
||||
/* create worksheet from title (array of arrays) */
|
||||
const ws = XLSX.utils.aoa_to_sheet([ [ title ] ]);
|
||||
/* add data starting at B1 */
|
||||
XLSX.utils.sheet_add_aoa(ws, [ arr ], { origin: "B1" });
|
||||
```
|
||||
|
||||
![Typed Array to single row with title](pathname:///typedarray/ta-row.png)
|
||||
|
||||
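For small datasets where an extra copy is acceptable, the titles and values can instead be combined into one array of arrays. This sketch uses spread syntax in place of `unshift` and only builds the array; the result could then be passed to `aoa_to_sheet` in one call:

```js
/* sample typed arrays */
const ta1 = new Float64Array([54337.95, 3.14159, 2.718281828]);
const ta2 = new Float64Array([281.3308004, 201.8675309, 1900.6492568]);

/* each row starts with the title, followed by the values */
const aoa = [
  ["Values", ...ta1],
  ["Value2", ...ta2],
];

console.log(aoa[0]); // title in the first position, then three numbers
```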
<details open><summary><b>Live Demo</b> (click to hide)</summary>
|
||||
|
||||
In this example, two typed arrays are exported. `aoa_to_sheet` creates the
|
||||
worksheet and `sheet_add_aoa` will add the data to the sheet.
|
||||
|
||||
```jsx live
|
||||
function SheetJSeriesToRows() { return (<button onClick={() => {
|
||||
/* typed arrays */
|
||||
const ta1 = new Float64Array([54337.95, 3.14159, 2.718281828]);
|
||||
const ta2 = new Float64Array([281.3308004, 201.8675309, 1900.6492568]);
|
||||
|
||||
/* create worksheet from first title and add first typed array starting from cell B1 */
|
||||
const ws = XLSX.utils.aoa_to_sheet([ [ "Values" ] ]);
|
||||
const arr1 = Array.from(ta1);
|
||||
XLSX.utils.sheet_add_aoa(ws, [ arr1 ], { origin: "B1" });
|
||||
|
||||
/* add second title to cell A2 */
|
||||
XLSX.utils.sheet_add_aoa(ws, [["Value2"]], { origin: "A2" });
|
||||
|
||||
/* add second typed array starting from cell B2 */
|
||||
const arr2 = Array.from(ta2);
|
||||
XLSX.utils.sheet_add_aoa(ws, [ arr2 ], { origin: "B2" });
|
||||
|
||||
/* export to file */
|
||||
const wb = XLSX.utils.book_new(ws, "Export");
|
||||
XLSX.writeFile(wb, "SheetJSeriesToRows.xlsx");
|
||||
}}><b>Click to export</b></button>); }
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
#### One Variable per Column
|
||||
|
||||
A single typed array can be converted to a pure JS array with `Array.from`. For
|
||||
columns, each value should be individually wrapped in an array:
|
||||
|
||||
<table><thead><tr><th>JavaScript</th><th>Spreadsheet</th></tr></thead><tbody><tr><td>
|
||||
|
||||
```js
|
||||
var data = [
|
||||
[54337.95],
|
||||
[3.14159],
|
||||
[2.718281828]
|
||||
];
|
||||
```
|
||||
|
||||
</td><td>
|
||||
|
||||
![Single column of data](pathname:///typedarray/col.png)
|
||||
|
||||
</td></tr></tbody></table>
|
||||
|
||||
`Array.from` takes a second argument. If it is a function, the function will be
|
||||
called on each element and the value will be used in place of the original value
|
||||
(in effect, mapping over the data). To generate a data column, each element must
|
||||
be wrapped in an array literal:
|
||||
|
||||
```js
|
||||
var arr = Array.from(column, (value) => ([ value ]));
|
||||
```
|
||||
|
||||
`aoa_to_sheet` and `sheet_add_aoa` treat this as rows with one column of data
|
||||
per row. By default, data will be written to cells in column "A".
|
||||
|
||||
Titles can be added to data columns with an `unshift` operation, but it is more
|
||||
efficient to build up the worksheet with `aoa_to_sheet`:
|
||||
|
||||
```js
|
||||
/* sample data */
|
||||
const data = new Float64Array([54337.95, 3.14159, 2.718281828]);
|
||||
const title = "Values";
|
||||
|
||||
/* convert sample data to array */
|
||||
const arr = Array.from(data, (value) => ([value]));
|
||||
/* create worksheet from title (array of arrays) */
|
||||
const ws = XLSX.utils.aoa_to_sheet([ [ title ] ]);
|
||||
/* add data starting at A2 */
|
||||
XLSX.utils.sheet_add_aoa(ws, arr, { origin: "A2" });
|
||||
```
|
||||
|
||||
![Typed Array to single column with title](pathname:///typedarray/ta-col.png)
|
||||
|
||||
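When several typed arrays share the same length, they can also be combined row by row into a single array of arrays. This is a sketch under that equal-length assumption; the `titles` and `cols` names are illustrative:

```js
/* sample typed arrays */
const ta1 = new Float64Array([54337.95, 3.14159, 2.718281828]);
const ta2 = new Float64Array([281.3308004, 201.8675309, 1900.6492568]);

const titles = ["Values", "Value2"];
const cols = [ta1, ta2];

/* first row holds the titles; each later row takes one value from each column */
const aoa = [titles];
for(let R = 0; R < ta1.length; ++R) aoa.push(cols.map((col) => col[R]));

console.log(aoa[1]); // first value of each typed array
```

The resulting array of arrays could be passed to `aoa_to_sheet` in one call, with titles in row 1 and data starting in row 2.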
<details open><summary><b>Live Demo</b> (click to hide)</summary>
|
||||
|
||||
In this example, two typed arrays are exported. `aoa_to_sheet` creates the
|
||||
worksheet and `sheet_add_aoa` will add the data to the sheet.
|
||||
|
||||
```jsx live
|
||||
function SheetJSeriesToCols() { return (<button onClick={() => {
|
||||
/* typed arrays */
|
||||
const ta1 = new Float64Array([54337.95, 3.14159, 2.718281828]);
|
||||
const ta2 = new Float64Array([281.3308004, 201.8675309, 1900.6492568]);
|
||||
|
||||
/* create worksheet from first title */
|
||||
const ws = XLSX.utils.aoa_to_sheet([ [ "Values" ] ]);
|
||||
|
||||
/* add first typed array starting from cell A2 */
|
||||
const arr1 = Array.from(ta1, (value) => ([value]));
|
||||
XLSX.utils.sheet_add_aoa(ws, arr1, { origin: "A2" });
|
||||
|
||||
/* add second title to cell B1 */
|
||||
XLSX.utils.sheet_add_aoa(ws, [["Value2"]], { origin: "B1" });
|
||||
|
||||
/* add second typed array starting from cell B2 */
|
||||
const arr2 = Array.from(ta2, (value) => ([value]));
|
||||
XLSX.utils.sheet_add_aoa(ws, arr2, { origin: "B2" });
|
||||
|
||||
/* export to file */
|
||||
const wb = XLSX.utils.book_new(ws, "Export");
|
||||
XLSX.writeFile(wb, "SheetJSeriesToCols.xlsx");
|
||||
}}><b>Click to export</b></button>); }
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
[^1]: See [`aoa_to_sheet` in "Utilities"](/docs/api/utilities/array#array-of-arrays-input)
|
||||
[^2]: See [`sheet_add_aoa` in "Utilities"](/docs/api/utilities/array#array-of-arrays-input)
|
||||
[^3]: See [the `origin` option of `sheet_add_aoa` in "Utilities"](/docs/api/utilities/array#array-of-arrays-input)
|
@ -43,6 +43,7 @@ This demo was tested in the following environments:
|
||||
|:---------|:-----------|
|
||||
| `5.0.5` | 2023-12-04 |
|
||||
| `4.5.0` | 2023-12-04 |
|
||||
| `3.2.7` | 2023-12-05 |
|
||||
|
||||
:::
|
||||
|
||||
|
@ -1,6 +1,6 @@
|
||||
---
|
||||
title: Web Frameworks
|
||||
pagination_prev: demos/index
|
||||
pagination_prev: demos/math/index
|
||||
pagination_next: demos/grid/index
|
||||
---
|
||||
|
||||
|
@ -11,6 +11,11 @@ pagination_next: demos/net/upload/index
|
||||
import current from '/version.js';
|
||||
import CodeBlock from '@theme/CodeBlock';
|
||||
|
||||
`XMLHttpRequest` and `fetch` browser APIs enable binary data transfer between
|
||||
web browser clients and web servers. Since this library works in web browsers,
|
||||
server conversion work can be offloaded to the client! This demo shows a few
|
||||
common scenarios involving browser APIs and popular wrapper libraries.
|
||||
|
||||
:::info pass
|
||||
|
||||
This demo focuses on downloading files. Other demos cover other HTTP use cases:
|
||||
@ -20,11 +25,6 @@ This demo focuses on downloading files. Other demos cover other HTTP use cases:
|
||||
|
||||
:::
|
||||
|
||||
`XMLHttpRequest` and `fetch` browser APIs enable binary data transfer between
|
||||
web browser clients and web servers. Since this library works in web browsers,
|
||||
server conversion work can be offloaded to the client! This demo shows a few
|
||||
common scenarios involving browser APIs and popular wrapper libraries.
|
||||
|
||||
:::caution Third-Party Hosts and Binary Data
|
||||
|
||||
Third-party cloud platforms such as AWS may corrupt raw binary downloads by
|
||||
@ -45,7 +45,20 @@ The APIs generally have a way to control the interpretation of the downloaded
|
||||
data. The `arraybuffer` response type usually forces the data to be presented
|
||||
as an `ArrayBuffer` which can be parsed directly with the SheetJS `read` method[^1].
|
||||
|
||||
For example, with `fetch`:
|
||||
The following example shows the data flow using `fetch` to download files:
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
server[(Remote\nFile)]
|
||||
response(Response\nobject)
|
||||
subgraph SheetJS operations
|
||||
ab(XLSX Data\nArrayBuffer)
|
||||
wb(((SheetJS\nWorkbook)))
|
||||
end
|
||||
server --> |`fetch`\nGET request| response
|
||||
response --> |`arrayBuffer`\n\n| ab
|
||||
ab --> |`read`\n\n| wb
|
||||
```
|
||||
|
||||
```js
|
||||
/* download data into an ArrayBuffer object */
|
||||
@ -69,7 +82,8 @@ contents match the first worksheet. The table is generated using the SheetJS
|
||||
### XMLHttpRequest
|
||||
|
||||
For downloading data, the `arraybuffer` response type generates an `ArrayBuffer`
|
||||
that can be viewed as an `Uint8Array` and fed to `XLSX.read` using `array` type:
|
||||
that can be viewed as an `Uint8Array` and fed to the SheetJS `read` method. For
|
||||
legacy browsers, the option `type: "array"` should be specified:
|
||||
|
||||
```js
|
||||
/* set up an async GET request */
|
||||
@ -122,7 +136,7 @@ function SheetJSXHRDL() {
|
||||
### fetch
|
||||
|
||||
For downloading data, `Response#arrayBuffer` resolves to an `ArrayBuffer` that
|
||||
can be converted to `Uint8Array` and passed to `XLSX.read`:
|
||||
can be converted to `Uint8Array` and passed to the SheetJS `read` method:
|
||||
|
||||
```js
|
||||
fetch(url).then(function(res) {
|
||||
@ -215,13 +229,14 @@ $.ajax({
|
||||
### Wrapper Libraries
|
||||
|
||||
Before `fetch` shipped with browsers, there were various wrapper libraries to
|
||||
simplify `XMLHttpRequest`. Due to limitations with `fetch`, these libraries
|
||||
are still relevant.
|
||||
simplify `XMLHttpRequest`. Due to limitations with `fetch`, these libraries are
|
||||
still relevant.
|
||||
|
||||
#### axios
|
||||
|
||||
[`axios`](https://axios-http.com/) presents a Promise based interface. Setting
|
||||
`responseType` to `arraybuffer` ensures the return type is an ArrayBuffer:
|
||||
`responseType` to `arraybuffer` ensures the return type is an ArrayBuffer. The
|
||||
`data` property of the result can be passed to the SheetJS `read` method:
|
||||
|
||||
```js
|
||||
async function workbook_dl_axios(url) {
|
||||
@ -491,7 +506,7 @@ npm i --save https://cdn.sheetjs.com/xlsx-${current}/xlsx-${current}.tgz request
|
||||
#### axios
|
||||
|
||||
When the `responseType` is `"arraybuffer"`, `axios` actually captures the data
|
||||
in a NodeJS Buffer. `XLSX.read` will transparently handle Buffers:
|
||||
in a NodeJS Buffer. The SheetJS `read` method handles NodeJS Buffer objects:
|
||||
|
||||
```js title="SheetJSAxios.js"
|
||||
const XLSX = require("xlsx"), axios = require("axios");
|
||||
@ -548,6 +563,8 @@ Other demos show network operations in special platforms:
|
||||
|
||||
- [React Native "Fetching Remote Data"](/docs/demos/mobile/reactnative#fetching-remote-data)
|
||||
- [NativeScript "Fetching Remote Files"](/docs/demos/mobile/nativescript#fetching-remote-files)
|
||||
- [AngularJS "Remote Files"](/docs/demos/frontend/angularjs#remote-files)
|
||||
- [Dojo Toolkit "Parsing Remote Files"](/docs/demos/frontend/dojo#parsing-remote-files)
|
||||
|
||||
[^1]: See [`read` in "Reading Files"](/docs/api/parse-options)
|
||||
[^2]: See [`sheet_to_html` in "Utilities"](/docs/api/utilities/html#html-table-output)
|
||||
|
@ -219,17 +219,17 @@ This demo was tested in the following environments:
|
||||
|
||||
| OS | Type | Device | RN | Date |
|
||||
|:-----------|:-----|:--------------------|:---------|:-----------|
|
||||
| Android 34 | Sim | Pixel 3a | `0.72.7` | 2023-12-04 |
|
||||
| iOS 17.0.1 | Sim | iPhone 15 Pro Max | `0.72.7` | 2023-12-04 |
|
||||
| Android 29 | Real | NVIDIA Shield | `0.72.7` | 2023-12-04 |
|
||||
| iOS 15.1 | Real | iPad Pro | `0.72.7` | 2023-12-04 |
|
||||
| Android 34 | Sim | Pixel 3a | `0.73.1` | 2023-12-21 |
|
||||
| iOS 17.2 | Sim | iPhone 15 Pro Max | `0.73.1` | 2023-12-21 |
|
||||
| Android 29 | Real | NVIDIA Shield | `0.73.1` | 2023-12-21 |
|
||||
| iOS 15.1 | Real | iPad Pro | `0.73.1` | 2023-12-21 |
|
||||
|
||||
:::
|
||||
|
||||
1) Create project:
|
||||
|
||||
```bash
|
||||
npx -y react-native@0.72.7 init SheetJSRNFetch --version="0.72.7"
|
||||
npx -y react-native@0.73.1 init SheetJSRNFetch --version="0.73.1"
|
||||
```
|
||||
|
||||
2) Install shared dependencies:
|
||||
@ -249,16 +249,16 @@ curl -LO https://docs.sheetjs.com/reactnative/App.tsx
|
||||
|
||||
**Android Testing**
|
||||
|
||||
4) Install or switch to Java 11[^6]
|
||||
4) Install or switch to Java 17[^6]
|
||||
|
||||
:::note pass
|
||||
|
||||
When the demo was last tested on macOS, `java -version` displayed the following:
|
||||
|
||||
```
|
||||
openjdk version "11.0.21" 2023-10-17 LTS
|
||||
OpenJDK Runtime Environment Zulu11.68+17-CA (build 11.0.21+9-LTS)
|
||||
OpenJDK 64-Bit Server VM Zulu11.68+17-CA (build 11.0.21+9-LTS, mixed mode)
|
||||
openjdk version "17.0.9" 2023-10-17
|
||||
OpenJDK Runtime Environment Temurin-17.0.9+9 (build 17.0.9+9)
|
||||
OpenJDK 64-Bit Server VM Temurin-17.0.9+9 (build 17.0.9+9, mixed mode)
|
||||
```
|
||||
|
||||
:::
|
||||
@ -274,8 +274,14 @@ npx react-native run-android
|
||||
If the initial launch fails with an error referencing the emulator, manually
|
||||
start the emulator and try again.
|
||||
|
||||
Gradle errors typically stem from a Java version mismatch. Run `java -version`
|
||||
and verify that the Java major version is 11.
|
||||
Gradle errors typically stem from a Java version mismatch:
|
||||
|
||||
```
|
||||
> Failed to apply plugin 'com.android.internal.application'.
|
||||
> Android Gradle plugin requires Java 17 to run. You are currently using Java 11.
|
||||
```
|
||||
|
||||
This error can be resolved by installing and switching to the requested version.
|
||||
|
||||
:::
|
||||
|
||||
@ -299,7 +305,9 @@ tapping "Import data from a spreadsheet", verify that the app shows new data:
|
||||
|
||||
:::warning pass
|
||||
|
||||
iOS testing requires macOS. It does not work on Windows or Linux.
|
||||
**iOS testing can only be performed on Apple hardware running macOS!**
|
||||
|
||||
Xcode and iOS simulators are not available on Windows or Linux.
|
||||
|
||||
:::
|
||||
|
||||
@ -349,7 +357,7 @@ npx react-native run-android
|
||||
|
||||
13) Close any Android / iOS emulators.
|
||||
|
||||
14) Enable developer code signing certificates[^7]
|
||||
14) Enable developer code signing certificates[^7].
|
||||
|
||||
15) Install `ios-deploy` through Homebrew:
|
||||
|
||||
@ -363,6 +371,67 @@ brew install ios-deploy
|
||||
npx react-native run-ios
|
||||
```
|
||||
|
||||
:::caution pass
|
||||
|
||||
When this demo was last tested, the build failed with the following error:
|
||||
|
||||
```
|
||||
PhaseScriptExecution failed with a nonzero exit code
|
||||
```
|
||||
|
||||
This was due to an error in the `react-native` package. The script
|
||||
`node_modules/react-native/scripts/react-native-xcode.sh` must be edited.
|
||||
|
||||
Near the top of the script, there will be a `set` statement:
|
||||
|
||||
```bash title="node_modules/react-native/scripts/react-native-xcode.sh"
|
||||
# Print commands before executing them (useful for troubleshooting)
|
||||
# highlight-next-line
|
||||
set -x -e
|
||||
DEST=$CONFIGURATION_BUILD_DIR/$UNLOCALIZED_RESOURCES_FOLDER_PATH
|
||||
```
|
||||
|
||||
The `-e` argument must be removed:
|
||||
|
||||
```bash title="node_modules/react-native/scripts/react-native-xcode.sh (edit line)"
|
||||
# Print commands before executing them (useful for troubleshooting)
|
||||
# highlight-next-line
|
||||
set -x
|
||||
DEST=$CONFIGURATION_BUILD_DIR/$UNLOCALIZED_RESOURCES_FOLDER_PATH
|
||||
```
|
||||
|
||||
:::
|
||||
|
||||
:::info pass
|
||||
|
||||
By default, React Native generates applications that exclusively target iPhone.
|
||||
On a physical iPad, the app will run as a pixelated iPhone app.
|
||||
|
||||
The "targeted device families" setting must be changed to support iPad:
|
||||
|
||||
A) Open the Xcode workspace:
|
||||
|
||||
```bash
|
||||
open ./ios/SheetJSRNFetch.xcworkspace
|
||||
```
|
||||
|
||||
B) Select the project in the left sidebar:
|
||||
|
||||
![Select the project](pathname:///reactnative/xcode-select-project.png)
|
||||
|
||||
C) Select the "SheetJSRNFetch" target in the sidebar.
|
||||
|
||||
![Settings](pathname:///reactnative/xcode-targets.png)
|
||||
|
||||
D) Select the "Build Settings" tab in the main area.
|
||||
|
||||
E) In the search bar below "Build Settings", type "tar".
|
||||
|
||||
F) Look for the "Targeted Device Families" row. Change the corresponding value
|
||||
to "iPhone, iPad".
|
||||
|
||||
:::
|
||||
|
||||
## Local Files
|
||||
|
||||
:::warning pass
|
||||
@ -987,6 +1056,6 @@ npx xlsx-cli /tmp/sheetjsw.xlsx
|
||||
[^3]: See ["Array Output" in "Utility Functions"](/docs/api/utilities/array#array-output)
|
||||
[^4]: See ["Array of Arrays Input" in "Utility Functions"](/docs/api/utilities/array#array-of-arrays-input)
|
||||
[^5]: React-Native commit [`5b597b5`](https://github.com/facebook/react-native/commit/5b597b5ff94953accc635ed3090186baeecb3873) added the final piece required for `fetch` support. It landed in version `0.72.0-rc.1` and is available in official releases starting from `0.72.0`.
|
||||
[^6]: When the demo was last tested, the Zulu11 distribution of Java 11 was installed through the macOS Brew package manager. [Direct downloads are available at `azul.com`](https://www.azul.com/downloads/?version=java-11-lts&package=jdk#zulu)
|
||||
[^6]: When the demo was last tested, the Temurin distribution of Java 17 was installed through the macOS Brew package manager by running `brew install temurin17`. [Direct downloads are available at `adoptium.net`](https://adoptium.net/temurin/releases/?version=17)
|
||||
[^7]: See ["Running On Device"](https://reactnative.dev/docs/running-on-device) in the React Native documentation
|
||||
[^8]: Follow the ["React Native CLI Quickstart"](https://reactnative.dev/docs/environment-setup) for Android (and iOS, if applicable)
|
@ -178,7 +178,7 @@ npx cap init sheetjs-cap com.sheetjs.cap --web-dir=dist
|
||||
npm run build
|
||||
```
|
||||
|
||||
:::note
|
||||
:::note pass
|
||||
|
||||
If prompted to create an Ionic account, type `N` and press Enter.
|
||||
|
||||
|
@ -184,11 +184,11 @@ This demo was tested in the following environments:
|
||||
|
||||
| OS and Version | Architecture | Electron | Date |
|
||||
|:---------------|:-------------|:---------|:-----------|
|
||||
| macOS 13.5.1 | `darwin-x64` | `26.1.0` | 2023-09-03 |
|
||||
| macOS 13.5.1 | `darwin-x64` | `27.1.3` | 2023-12-09 |
|
||||
| macOS 14.1.2 | `darwin-arm` | `27.1.3` | 2023-12-01 |
|
||||
| Windows 10 | `win10-x64` | `26.1.0` | 2023-09-03 |
|
||||
| Windows 10 | `win10-x64` | `27.1.3` | 2023-12-09 |
|
||||
| Windows 11 | `win11-arm` | `27.1.3` | 2023-12-01 |
|
||||
| Linux (HoloOS) | `linux-x64` | `27.0.0` | 2023-10-11 |
|
||||
| Linux (HoloOS) | `linux-x64` | `27.1.3` | 2023-12-09 |
|
||||
| Linux (Debian) | `linux-arm` | `27.1.3` | 2023-12-01 |
|
||||
|
||||
:::
|
||||
@ -247,7 +247,7 @@ The app will show.
|
||||
npm run make
|
||||
```
|
||||
|
||||
This will create a package in the `out\make` folder.
|
||||
This will create a package in the `out\make` folder and a standalone binary.
|
||||
|
||||
:::caution pass
|
||||
|
||||
@ -266,11 +266,13 @@ The program will run on ARM64 Windows.
|
||||
|
||||
5) Download [the test file `pres.numbers`](https://sheetjs.com/pres.numbers)
|
||||
|
||||
6) Re-launch the application in the test environment:
|
||||
6) Launch the generated application:
|
||||
|
||||
```bash
|
||||
npx -y electron .
|
||||
```
|
||||
| Architecture | Command |
|
||||
|:-------------|:--------------------------------------------------------------|
|
||||
| `darwin-x64` | `open ./out/sheetjs-electron-darwin-x64/sheetjs-electron.app` |
|
||||
| `win10-x64` | `.\out\sheetjs-electron-win32-x64\sheetjs-electron.exe` |
|
||||
| `linux-x64` | `./out/sheetjs-electron-linux-x64/sheetjs-electron` |
|
||||
|
||||
#### Electron API
|
||||
|
||||
@ -284,7 +286,7 @@ to write to `Untitled.xls` in the Downloads folder.
|
||||
|
||||
:::note pass
|
||||
|
||||
During the most recent Linux ARM64 test, the dialog did not have a default name.
|
||||
In some tests, the dialog did not have a default name.
|
||||
|
||||
If there is no default name, enter `Untitled.xls` and click "Save".
|
||||
|
||||
@ -335,4 +337,4 @@ call is required to enable Developer Tools in the window.
|
||||
|
||||
:::
|
||||
|
||||
[^1]: See ["Makers"](https://www.electronforge.io/config/makers) in the Electron Forge documentation. On Linux, the demo generates `rpm` and `deb` distributables.
|
||||
[^1]: See ["Makers"](https://www.electronforge.io/config/makers) in the Electron Forge documentation. On Linux, the demo generates `rpm` and `deb` distributables. On Arch Linux and the Steam Deck, `sudo pacman -Syu rpm-tools dpkg fakeroot` installed required packages.
|
@ -115,9 +115,9 @@ This demo was tested in the following environments:
|
||||
|:---------------|:-------------|:---------|:-----------|
|
||||
| macOS 13.5.2 | `darwin-x64` | `0.78.1` | 2023-09-27 |
|
||||
| macOS 14.1.2 | `darwin-arm` | `0.82.0` | 2023-12-01 |
|
||||
| Windows 10 | `win10-x64` | `0.78.1` | 2023-09-27 |
|
||||
| Windows 10 | `win10-x64` | `0.82.0` | 2023-12-09 |
|
||||
| Windows 11 | `win11-arm` | `0.82.0` | 2023-12-01 |
|
||||
| Linux (HoloOS) | `linux-x64` | `0.78.1` | 2023-10-11 |
|
||||
| Linux (HoloOS) | `linux-x64` | `0.82.0` | 2023-12-07 |
|
||||
|
||||
There is no official Linux ARM64 release. The community release[^1] was tested
|
||||
and verified on 2023-09-27.
|
||||
|
@ -299,7 +299,7 @@ This demo was tested in the following environments:
|
||||
|:---------------|:-------------|:---------|:-----------|
|
||||
| macOS 13.6 | `darwin-x64` | `v2.6.0` | 2023-11-05 |
|
||||
| macOS 14.1.2 | `darwin-arm` | `v2.6.0` | 2023-12-01 |
|
||||
| Windows 10 | `win10-x64` | `v2.5.1` | 2023-08-25 |
|
||||
| Windows 10 | `win10-x64` | `v2.6.0` | 2023-12-09 |
|
||||
| Windows 11 | `win11-arm` | `v2.6.0` | 2023-12-01 |
|
||||
| Linux (HoloOS) | `linux-x64` | `v2.6.0` | 2023-10-11 |
|
||||
| Linux (Debian) | `linux-arm` | `v2.6.0` | 2023-12-01 |
|
||||
|
@ -192,11 +192,11 @@ This demo was tested in the following environments:
|
||||
|
||||
| OS and Version | Architecture | Server | Client | Date |
|
||||
|:---------------|:-------------|:----------|:----------|:-----------|
|
||||
| macOS 13.5.1 | `darwin-x64` | `v4.13.0` | `v3.11.0` | 2023-08-26 |
|
||||
| macOS 13.5.1 | `darwin-x64` | `v4.14.1` | `v3.12.0` | 2023-12-13 |
|
||||
| macOS 14.0 | `darwin-arm` | `v4.14.1` | `v3.12.0` | 2023-10-18 |
|
||||
| Windows 10 | `win10-x64` | `v4.13.0` | `v3.11.0` | 2023-08-26 |
|
||||
| Windows 10 | `win10-x64` | `v4.14.1` | `v3.12.0` | 2023-12-09 |
|
||||
| Windows 11 | `win11-arm` | `v4.14.1` | `v3.12.0` | 2023-12-01 |
|
||||
| Linux (HoloOS) | `linux-x64` | `v4.14.1` | `v3.12.0` | 2023-10-11 |
|
||||
| Linux (HoloOS) | `linux-x64` | `v4.14.1` | `v3.12.0` | 2023-12-09 |
|
||||
| Linux (Debian) | `linux-arm` | `v4.14.1` | `v3.12.0` | 2023-12-01 |
|
||||
|
||||
:::
|
||||
|
@ -55,7 +55,7 @@ This demo was tested in the following deployments:
|
||||
| `darwin-arm` | `4.0.0-rc.2` | `18.18.0` | 2023-12-01 |
|
||||
| `win10-x64` | `4.0.0-rc.2` | `14.15.3` | 2023-10-09 |
|
||||
| `win11-arm` | `4.0.0-rc.2` | `20.10.0` | 2023-12-01 |
|
||||
| `linux-x64` | `4.0.0-rc.2` | `14.15.3` | 2023-10-11 |
|
||||
| `linux-x64` | `4.0.0-rc.2` | `14.15.3` | 2023-12-07 |
|
||||
| `linux-arm` | `4.0.0-rc.2` | `20.10.0` | 2023-12-01 |
|
||||
|
||||
</TabItem>
|
||||
@ -67,7 +67,7 @@ This demo was tested in the following deployments:
|
||||
| `darwin-arm` | `5.8.1` | `18.5.0` | 2023-12-01 |
|
||||
| `win10-x64` | `5.8.1` | `18.5.0` | 2023-10-09 |
|
||||
| `win11-arm` | `5.8.1` | `18.5.0` | 2023-12-01 |
|
||||
| `linux-x64` | `5.8.1` | `18.5.0` | 2023-10-11 |
|
||||
| `linux-x64` | `5.8.1` | `18.5.0` | 2023-12-07 |
|
||||
| `linux-arm` | `5.8.1` | `18.5.0` | 2023-12-01 |
|
||||
|
||||
</TabItem>
|
||||
@ -78,7 +78,7 @@ This demo was tested in the following deployments:
|
||||
| `darwin-x64` | `2.1.2` | `20.8.0` | 2023-10-12 |
|
||||
| `darwin-arm` | `2.3.0` | `21.3.0` | 2023-12-01 |
|
||||
| `win10-x64` | `2.1.2` | `16.20.2` | 2023-10-09 |
|
||||
| `linux-x64` | `2.1.2` | `20.8.0` | 2023-10-11 |
|
||||
| `linux-x64` | `2.3.0` | `21.4.0` | 2023-12-07 |
|
||||
| `linux-arm` | `2.3.0` | `21.3.0` | 2023-12-01 |
|
||||
|
||||
</TabItem>
|
||||
|
@ -24,10 +24,10 @@ This demo was verified by NetSuite consultants in the following deployments:
|
||||
|
||||
| `@NScriptType` | `@NApiVersion` | Date |
|
||||
|:----------------|:---------------|:-----------|
|
||||
| ScheduledScript | 2.1 | 2023-08-18 |
|
||||
| ScheduledScript | 2.1 | 2023-12-13 |
|
||||
| Restlet | 2.1 | 2023-10-05 |
|
||||
| Suitelet | 2.1 | 2023-10-27 |
|
||||
| MapReduceScript | 2.1 | 2023-11-16 |
|
||||
| Suitelet | 2.1 | 2023-12-22 |
|
||||
| MapReduceScript | 2.1 | 2023-12-07 |
|
||||
|
||||
:::
|
||||
|
||||
|
@ -1,5 +1,5 @@
|
||||
---
|
||||
title: Data Processing in GitHub
|
||||
title: Flat Data Processing in GitHub
|
||||
sidebar_label: GitHub
|
||||
pagination_prev: demos/local/index
|
||||
pagination_next: demos/extensions/index
|
||||
@ -8,15 +8,12 @@ pagination_next: demos/extensions/index
|
||||
import current from '/version.js';
|
||||
import CodeBlock from '@theme/CodeBlock';
|
||||
|
||||
Many official data releases by governments and organizations include XLSX or
|
||||
XLS files. Unfortunately some data sources do not retain older versions.
|
||||
|
||||
[Git](https://git-scm.com/) is a popular system for organizing a historical
|
||||
record of text files and changes. Git can also store and track spreadsheets.
|
||||
|
||||
[GitHub](https://github.com/) hosts Git repositories and provides infrastructure
|
||||
to run scheduled tasks. ["Flat Data"](https://octo.github.com/projects/flat-data)
|
||||
explores storing and comparing versions of structured CSV and JSON data.
|
||||
GitHub hosts Git repositories and provides infrastructure to execute workflows.
|
||||
The ["Flat Data" project](https://octo.github.com/projects/flat-data) explores
|
||||
storing and comparing versions of structured data using GitHub infrastructure.
|
||||
|
||||
[SheetJS](https://sheetjs.com) is a JavaScript library for reading and writing
|
||||
data from spreadsheets.
|
||||
@ -29,7 +26,7 @@ changes over time.
|
||||
|
||||
["Excel to CSV"](https://octo.github.com/projects/flat-data#:~:text=Excel) is an
|
||||
official example that pulls XLSX workbooks from an endpoint and uses SheetJS to
|
||||
parse the workbooks and generate CSV files:
|
||||
parse the workbooks and generate CSV files.
|
||||
|
||||
:::
|
||||
|
||||
@ -38,8 +35,8 @@ The following diagram depicts the data dance:
|
||||
```mermaid
|
||||
sequenceDiagram
|
||||
autonumber
|
||||
participant R as GH Repo
|
||||
participant A as GH Action
|
||||
participant R as GitHub Repo
|
||||
participant A as GitHub Action
|
||||
participant S as Data Source
|
||||
loop Regular Interval (cron)
|
||||
A->>R: clone repo
|
||||
@ -56,18 +53,30 @@ sequenceDiagram
|
||||
|
||||
## Flat Data
|
||||
|
||||
Many official data releases by governments and organizations include XLSX or
|
||||
XLS files. Unfortunately some data sources do not retain older versions.
|
||||
|
||||
Software developers typically use version control systems such as Git to track
|
||||
changes in source code.
|
||||
|
||||
The "Flat Data" project starts from the idea that the same version control
|
||||
systems can be used to track changes in data. Third-party data sources can be
|
||||
snapshotted at regular intervals and stored in Git repositories.
|
||||
|
||||
### Components
|
||||
|
||||
As a GitHub project, the entire lifecycle uses GitHub offerings:
|
||||
|
||||
- GitHub offers free hosting for Git repositories
|
||||
- GitHub Actions[^1] infrastructure runs tasks at regular intervals
|
||||
- `githubocto/flat`[^2] library helps fetch data and automate post-processing
|
||||
- `flat-postprocessing`[^3] library provides post-processing helper functions
|
||||
- "Flat Viewer"[^4] displays structured CSV and JSON data from Git repositories
|
||||
- GitHub.com[^1] offers free hosting for Git repositories
|
||||
- GitHub Actions[^2] infrastructure runs tasks at regular intervals
|
||||
- `githubocto/flat`[^3] library helps fetch data and automate post-processing
|
||||
- `flat-postprocessing`[^4] library provides post-processing helper functions
|
||||
- "Flat Viewer"[^5] displays structured CSV and JSON data from Git repositories
|
||||
|
||||
:::caution pass
|
||||
|
||||
A GitHub account is required. When the demo was last tested, "GitHub Free"
|
||||
accounts had no Actions usage limits for public repositories[^5].
|
||||
accounts had no Actions usage limits for public repositories[^6].
|
||||
|
||||
Private GitHub repositories can be used for processing data, but the Flat Viewer
|
||||
will not be able to display private data.
|
||||
@ -143,12 +152,12 @@ for more details.
|
||||
|
||||
The first argument to the post-processing script is the filename.
|
||||
|
||||
The SheetJS `readFile` method[^6] will read the file and generate a SheetJS
|
||||
workbook object[^7]. After extracting the first worksheet, `sheet_to_csv`[^8]
|
||||
The SheetJS `readFile` method[^7] will read the file and generate a SheetJS
|
||||
workbook object[^8]. After extracting the first worksheet, `sheet_to_csv`[^9]
|
||||
generates a CSV string.
|
||||
|
||||
After generating a CSV string, the string should be written to the filesystem
|
||||
using `Deno.writeFileSync`[^9]. By convention, the CSV should preserve the file
|
||||
using `Deno.writeFileSync`[^10]. By convention, the CSV should preserve the file
|
||||
name stem and replace the extension with `.csv`:
|
||||
|
||||
<CodeBlock title="postprocess.ts" language="ts">{`\
|
||||
@ -316,12 +325,13 @@ jobs:
|
||||
|
||||
The column chart in the Index column is a histogram.
|
||||
|
||||
[^1]: See ["GitHub Actions documentation"](https://docs.github.com/en/actions)
|
||||
[^2]: See [`githubocto/flat`](https://github.com/githubocto/flat) repo on GitHub.
|
||||
[^3]: See [`githubocto/flat-postprocessing`](https://github.com/githubocto/flat-postprocessing) repo on GitHub.
|
||||
[^4]: The hosted version is available at <https://flatgithub.com/>
|
||||
[^5]: See ["About billing for GitHub Actions"](https://docs.github.com/en/billing/managing-billing-for-github-actions/about-billing-for-github-actions) in the GitHub documentation.
|
||||
[^6]: See [`readFile` in "Reading Files"](/docs/api/parse-options)
|
||||
[^7]: See ["Workbook Object"](/docs/csf/book)
|
||||
[^8]: See [`sheet_to_csv` in "CSV and Text"](/docs/api/utilities/csv#delimiter-separated-output)
|
||||
[^9]: See [`Deno.writeFileSync`](https://deno.land/api?s=Deno.writeFileSync) in the Deno Runtime APIs documentation.
|
||||
[^1]: See ["Repositories documentation"](https://docs.github.com/en/repositories) in the GitHub documentation.
|
||||
[^2]: See ["GitHub Actions documentation"](https://docs.github.com/en/actions) in the GitHub documentation.
|
||||
[^3]: See [`githubocto/flat`](https://github.com/githubocto/flat) repo on GitHub.
|
||||
[^4]: See [`githubocto/flat-postprocessing`](https://github.com/githubocto/flat-postprocessing) repo on GitHub.
|
||||
[^5]: The hosted version is available at <https://flatgithub.com/>
|
||||
[^6]: See ["About billing for GitHub Actions"](https://docs.github.com/en/billing/managing-billing-for-github-actions/about-billing-for-github-actions) in the GitHub documentation.
|
||||
[^7]: See [`readFile` in "Reading Files"](/docs/api/parse-options)
|
||||
[^8]: See ["Workbook Object"](/docs/csf/book)
|
||||
[^9]: See [`sheet_to_csv` in "CSV and Text"](/docs/api/utilities/csv#delimiter-separated-output)
|
||||
[^10]: See [`Deno.writeFileSync`](https://deno.land/api?s=Deno.writeFileSync) in the Deno Runtime APIs documentation.
|
@ -28,7 +28,7 @@ flowchart LR
|
||||
nfile --> |ExcelTools\nImport|data
|
||||
```
|
||||
|
||||
:::note
|
||||
:::note Tested Deployments
|
||||
|
||||
This demo was last tested by SheetJS users on 2023 October 3 in Maple 2023.
|
||||
|
||||
|
@ -1,347 +0,0 @@
|
||||
---
|
||||
title: Typed Arrays and ML
|
||||
pagination_prev: demos/extensions/index
|
||||
pagination_next: demos/engines/index
|
||||
sidebar_custom_props:
|
||||
summary: Parse and serialize Uint8Array data from TensorFlow
|
||||
---
|
||||
|
||||
<head>
|
||||
<script src="https://docs.sheetjs.com/tfjs/tf.min.js"></script>
|
||||
</head>
|
||||
|
||||
Machine learning libraries in JS typically use "Typed Arrays". Typed Arrays are
|
||||
not JS Arrays! With some data wrangling, translating between SheetJS worksheets
|
||||
and typed arrays is straightforward.
|
||||
|
||||
This demo covers conversions between worksheets and Typed Arrays for use with
|
||||
TensorFlow.js and other ML libraries.
|
||||
|
||||
:::info pass
|
||||
|
||||
Live code blocks in this page load the standalone build from version `4.10.0`.
|
||||
|
||||
For use in web frameworks, the `@tensorflow/tfjs` module should be used.
|
||||
|
||||
For use in NodeJS, the native bindings module is `@tensorflow/tfjs-node`.
|
||||
|
||||
:::
|
||||
|
||||
:::note pass
|
||||
|
||||
Each browser demo was tested in the following environments:
|
||||
|
||||
| Browser | Date | TF.js version |
|
||||
|:------------|:-----------|:--------------|
|
||||
| Chrome 116 | 2023-09-02 | `4.10.0` |
|
||||
| Safari 16.6 | 2023-09-02 | `4.10.0` |
|
||||
| Brave 1.57 | 2023-09-02 | `4.10.0` |
|
||||
|
||||
:::
|
||||
|
||||
## CSV Data Interchange
|
||||
|
||||
`tf.data.csv` generates a Dataset from CSV data. The function expects a URL.
|
||||
Fortunately blob URLs are supported, making data import straightforward:
|
||||
|
||||
```js
|
||||
function worksheet_to_csv_url(worksheet) {
|
||||
/* generate CSV */
|
||||
const csv = XLSX.utils.sheet_to_csv(worksheet);
|
||||
|
||||
/* CSV -> Uint8Array -> Blob */
|
||||
const u8 = new TextEncoder().encode(csv);
|
||||
const blob = new Blob([u8], { type: "text/csv" });
|
||||
|
||||
/* generate a blob URL */
|
||||
return URL.createObjectURL(blob);
|
||||
}
|
||||
```
|
||||
|
||||
<details><summary><b>TF CSV Demo using XLSX files</b> (click to show)</summary>
|
||||
|
||||
This demo shows a simple model fitting using the "Boston Housing" dataset. The
|
||||
[sample XLSX file](https://sheetjs.com/data/bht.xlsx) contains the data.
|
||||
|
||||
The demo first fetches the XLSX file and generates CSV text. A blob URL is
|
||||
generated and fed to `tf.data.csv`. The rest of the demo follows the official
|
||||
example in the TensorFlow documentation.
|
||||
|
||||
:::caution pass
|
||||
|
||||
If the live demo shows a message
|
||||
|
||||
```
|
||||
ReferenceError: tf is not defined
|
||||
```
|
||||
|
||||
please refresh the page. This is a known bug in the documentation generator.
|
||||
|
||||
:::
|
||||
|
||||
```jsx live
|
||||
function SheetJSToTFJSCSV() {
|
||||
const [output, setOutput] = React.useState("");
|
||||
const doit = React.useCallback(async () => {
|
||||
/* fetch file */
|
||||
const f = await fetch("https://sheetjs.com/data/bht.xlsx");
|
||||
const ab = await f.arrayBuffer();
|
||||
/* parse file and get first worksheet */
|
||||
const wb = XLSX.read(ab);
|
||||
const ws = wb.Sheets[wb.SheetNames[0]];
|
||||
|
||||
/* generate CSV */
|
||||
const csv = XLSX.utils.sheet_to_csv(ws);
|
||||
|
||||
/* generate blob URL */
|
||||
const u8 = new TextEncoder().encode(csv);
|
||||
const blob = new Blob([u8], {type: "text/csv"});
|
||||
const url = URL.createObjectURL(blob);
|
||||
|
||||
/* feed to tfjs */
|
||||
const dataset = tf.data.csv(url, {columnConfigs:{"medv":{isLabel:true}}});
|
||||
|
||||
/* this part mirrors the tf.data.csv docs */
|
||||
const flat = dataset.map(({xs,ys}) => ({xs: Object.values(xs), ys: Object.values(ys)})).batch(10);
|
||||
const model = tf.sequential();
|
||||
model.add(tf.layers.dense({inputShape: [(await dataset.columnNames()).length - 1], units: 1}));
|
||||
model.compile({ optimizer: tf.train.sgd(0.000001), loss: 'meanSquaredError' });
|
||||
let base = output;
|
||||
await model.fitDataset(flat, { epochs: 10, callbacks: { onEpochEnd: async (epoch, logs) => {
|
||||
setOutput(base += "\n" + epoch + ":" + logs.loss);
|
||||
}}});
|
||||
model.summary();
|
||||
});
|
||||
return ( <pre>
|
||||
<button onClick={doit}>Click to run</button>
|
||||
{output}
|
||||
</pre> );
|
||||
}
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
In the other direction, `XLSX.read` will readily parse CSV exports.
|
||||
|
||||
## JS Array Interchange
|
||||
|
||||
[The official Linear Regression tutorial](https://www.tensorflow.org/js/tutorials/training/linear_regression)
|
||||
loads data from a JSON file:
|
||||
|
||||
```json
|
||||
[
|
||||
{
|
||||
"Name": "chevrolet chevelle malibu",
|
||||
"Miles_per_Gallon": 18,
|
||||
"Cylinders": 8,
|
||||
"Displacement": 307,
|
||||
"Horsepower": 130,
|
||||
"Weight_in_lbs": 3504,
|
||||
"Acceleration": 12,
|
||||
"Year": "1970-01-01",
|
||||
"Origin": "USA"
|
||||
},
|
||||
{
|
||||
"Name": "buick skylark 320",
|
||||
"Miles_per_Gallon": 15,
|
||||
"Cylinders": 8,
|
||||
"Displacement": 350,
|
||||
"Horsepower": 165,
|
||||
"Weight_in_lbs": 3693,
|
||||
"Acceleration": 11.5,
|
||||
"Year": "1970-01-01",
|
||||
"Origin": "USA"
|
||||
},
|
||||
// ...
|
||||
]
|
||||
```
|
||||
|
||||
In real use cases, data is stored in [spreadsheets](https://sheetjs.com/data/cd.xls):
|
||||
|
||||
![cd.xls screenshot](pathname:///files/cd.png)
|
||||
|
||||
Following the tutorial, the data fetching method is easily adapted. Differences
|
||||
from the official example are highlighted below:
|
||||
|
||||
```js
|
||||
/**
|
||||
* Get the car data reduced to just the variables we are interested
|
||||
* and cleaned of missing data.
|
||||
*/
|
||||
async function getData() {
|
||||
// highlight-start
|
||||
/* fetch file */
|
||||
const carsDataResponse = await fetch('https://sheetjs.com/data/cd.xls');
|
||||
/* get file data (ArrayBuffer) */
|
||||
const carsDataAB = await carsDataResponse.arrayBuffer();
|
||||
/* parse */
|
||||
const carsDataWB = XLSX.read(carsDataAB);
|
||||
/* get first worksheet */
|
||||
const carsDataWS = carsDataWB.Sheets[carsDataWB.SheetNames[0]];
|
||||
/* generate array of JS objects */
|
||||
const carsData = XLSX.utils.sheet_to_json(carsDataWS);
|
||||
// highlight-end
|
||||
const cleaned = carsData.map(car => ({
|
||||
mpg: car.Miles_per_Gallon,
|
||||
horsepower: car.Horsepower,
|
||||
}))
|
||||
.filter(car => (car.mpg != null && car.horsepower != null));
|
||||
|
||||
return cleaned;
|
||||
}
|
||||
```
|
||||
|
||||
## Low-Level Operations
|
||||
|
||||
:::caution pass
|
||||
|
||||
While it is more efficient to use low-level operations, JS or CSV interchange
|
||||
is strongly recommended when possible.
|
||||
|
||||
:::
|
||||
|
||||
### Data Transposition
|
||||
|
||||
A typical dataset in a spreadsheet will start with one header row and represent
|
||||
each data record in its own row. For example, the Iris dataset might look like:
|
||||
|
||||
![Iris dataset](pathname:///files/iris.png)
|
||||
|
||||
`XLSX.utils.sheet_to_json` will translate this into an array of row objects:
|
||||
|
||||
```js
|
||||
var aoo = [
|
||||
{"sepal length": 5.1, "sepal width": 3.5, ...},
|
||||
{"sepal length": 4.9, "sepal width": 3, ...},
|
||||
...
|
||||
];
|
||||
```
|
||||
|
||||
TF.js and other libraries tend to operate on individual columns, equivalent to:
|
||||
|
||||
```js
|
||||
var sepal_lengths = [5.1, 4.9, ...];
|
||||
var sepal_widths = [3.5, 3, ...];
|
||||
```
|
||||
|
||||
When a `tensor2d` can be exported, it will look different from the spreadsheet:
|
||||
|
||||
```js
|
||||
var data_set_2d = [
|
||||
[5.1, 4.9, ...],
|
||||
[3.5, 3, ...],
|
||||
...
|
||||
]
|
||||
```
|
||||
|
||||
This is the transpose of how people use spreadsheets!
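
As a minimal sketch of the column view described above, a single pass over the
array of row objects can build one typed array per field. The helper name
`column_from_rows` and the choice to store `NaN` for missing values are
assumptions for illustration; they are not part of SheetJS or TF.js:

```js
/* hypothetical helper: pull one named field out of an array of row objects */
function column_from_rows(aoo, key) {
  /* missing or non-numeric values become NaN so row positions are preserved */
  return Float32Array.from(aoo, function(row) {
    return typeof row[key] == "number" ? row[key] : NaN;
  });
}

/* sample rows in the shape generated by sheet_to_json */
var aoo = [
  {"sepal length": 5.1, "sepal width": 3.5},
  {"sepal length": 4.9, "sepal width": 3}
];

/* one typed array per column */
var sepal_lengths = column_from_rows(aoo, "sepal length");
var sepal_widths = column_from_rows(aoo, "sepal width");
```

Keeping a placeholder for missing cells means index `i` in every column still
refers to the same spreadsheet row.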
|
||||
|
||||
#### Typed Arrays and Columns
|
||||
|
||||
A single typed array can be converted to a pure JS array with `Array.from`:
|
||||
|
||||
```js
|
||||
var column = Array.from(dataset_typedarray);
|
||||
```
|
||||
|
||||
Similarly, `Float32Array.from` generates a typed array from a normal array:
|
||||
|
||||
```js
|
||||
var dataset = Float32Array.from(column);
|
||||
```
|
||||
|
||||
### Exporting Datasets to a Worksheet
|
||||
|
||||
`XLSX.utils.aoa_to_sheet` can generate a worksheet from an array of arrays.
|
||||
ML libraries typically provide APIs to pull an array of arrays, but it will
|
||||
be transposed. To export multiple data sets, manually "transpose" the data:
|
||||
|
||||
```js
|
||||
/* assuming data is an array of typed arrays */
|
||||
var aoa = [];
|
||||
for(var i = 0; i < data.length; ++i) {
|
||||
for(var j = 0; j < data[i].length; ++j) {
|
||||
if(!aoa[j]) aoa[j] = [];
|
||||
aoa[j][i] = data[i][j];
|
||||
}
|
||||
}
|
||||
/* aoa can be directly converted to a worksheet object */
|
||||
var ws = XLSX.utils.aoa_to_sheet(aoa);
|
||||
```
|
||||
|
||||
### Importing Data from a Spreadsheet
|
||||
|
||||
`sheet_to_json` with the option `header:1` will generate a row-major array of
|
||||
arrays that can be transposed. However, it is more efficient to walk the sheet
|
||||
manually:
|
||||
|
||||
```js
|
||||
/* find worksheet range */
|
||||
var range = XLSX.utils.decode_range(ws['!ref']);
|
||||
var out = []
|
||||
/* walk the columns */
|
||||
for(var C = range.s.c; C <= range.e.c; ++C) {
|
||||
/* create the typed array */
|
||||
var ta = new Float32Array(range.e.r - range.s.r + 1);
|
||||
/* walk the rows */
|
||||
for(var R = range.s.r; R <= range.e.r; ++R) {
|
||||
/* find the cell, skip it if the cell isn't numeric or boolean */
|
||||
var cell = ws[XLSX.utils.encode_cell({r:R, c:C})];
|
||||
if(!cell || cell.t != 'n' && cell.t != 'b') continue;
|
||||
/* assign to the typed array */
|
||||
ta[R - range.s.r] = cell.v;
|
||||
}
|
||||
out.push(ta);
|
||||
}
|
||||
```
|
||||
|
||||
If the data set has a header row, the loop can be adjusted to skip those rows.
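
A minimal sketch of header skipping, assuming the row-major array of arrays
route (`sheet_to_json` with `header:1`) rather than the manual walk. The helper
name `columns_from_rows` and the `header_rows` parameter are hypothetical:

```js
/* hypothetical helper: rows is a row-major array of arrays */
function columns_from_rows(rows, header_rows) {
  /* drop the header rows before transposing */
  var body = rows.slice(header_rows);
  /* the widest row determines the number of columns */
  var ncols = body.reduce(function(max, row) { return Math.max(max, row.length); }, 0);
  var out = [];
  /* walk the columns */
  for(var C = 0; C < ncols; ++C) {
    /* create the typed array (unassigned entries stay 0) */
    var ta = new Float32Array(body.length);
    /* walk the rows, skipping values that are not numeric or boolean */
    for(var R = 0; R < body.length; ++R) {
      var v = body[R][C];
      if(typeof v == "number" || typeof v == "boolean") ta[R] = v;
    }
    out.push(ta);
  }
  return out;
}

/* sample data with one header row */
var rows = [["x", "y"], [1, 2], [3, 4], [5, 6]];
var columns = columns_from_rows(rows, 1);
```

In this sketch, `columns[0]` holds the `x` values and `columns[1]` holds the
`y` values. The same offset idea applies to the manual walk: start the row loop
after the header rows and size the typed array accordingly.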
|
||||
|
||||
### TF.js Tensors
|
||||
|
||||
A single `Array#map` can pull individual named fields from the result, which
|
||||
can be used to construct TensorFlow.js tensor objects:
|
||||
|
||||
```js
|
||||
const aoo = XLSX.utils.sheet_to_json(worksheet);
|
||||
const lengths = aoo.map(row => row["sepal length"]);
|
||||
const tensor = tf.tensor1d(lengths);
|
||||
```
|
||||
|
||||
`tf.Tensor` objects can be directly transposed using `transpose`:
|
||||
|
||||
```js
|
||||
var aoo = XLSX.utils.sheet_to_json(worksheet);
|
||||
// "x" and "y" are the fields we want to pull from the data
|
||||
var data = aoo.map(row => ([row["x"], row["y"]]));
|
||||
|
||||
// create a tensor representing two column datasets
|
||||
var tensor = tf.tensor2d(data).transpose();
|
||||
|
||||
// individual columns can be accessed
|
||||
var col1 = tensor.slice([0,0], [1,tensor.shape[1]]).flatten();
|
||||
var col2 = tensor.slice([1,0], [1,tensor.shape[1]]).flatten();
|
||||
```
|
||||
|
||||
For exporting, `stack` can be used to collapse the columns into a linear array:
|
||||
|
||||
```js
|
||||
/* pull data into a Float32Array */
|
||||
var result = tf.stack([col1, col2]).transpose();
|
||||
var shape = result.shape;
|
||||
var f32 = result.dataSync();
|
||||
|
||||
/* construct an array of arrays of the data in spreadsheet order */
|
||||
var aoa = [];
|
||||
for(var j = 0; j < shape[0]; ++j) {
|
||||
aoa[j] = [];
|
||||
for(var i = 0; i < shape[1]; ++i) aoa[j][i] = f32[j * shape[1] + i];
|
||||
}
|
||||
|
||||
/* add headers to the top */
|
||||
aoa.unshift(["x", "y"]);
|
||||
|
||||
/* generate worksheet */
|
||||
var worksheet = XLSX.utils.aoa_to_sheet(aoa);
|
||||
```
|
||||
|
@ -130,11 +130,11 @@ This demo was tested in the following deployments:
|
||||
|
||||
| Architecture | Version | Date |
|
||||
|:-------------|:--------|:-----------|
|
||||
| `darwin-x64` | `2.7.0` | 2023-10-26 |
|
||||
| `darwin-x64` | `2.7.0` | 2023-12-05 |
|
||||
| `darwin-arm` | `2.7.0` | 2023-10-18 |
|
||||
| `win10-x64` | `2.7.0` | 2023-10-27 |
|
||||
| `win11-arm` | `2.7.0` | 2023-12-01 |
|
||||
| `linux-x64` | `2.7.0` | 2023-10-11 |
|
||||
| `linux-x64` | `2.7.0` | 2023-12-07 |
|
||||
| `linux-arm` | `2.7.0` | 2023-12-01 |
|
||||
|
||||
:::
|
||||
|
@ -17,8 +17,18 @@ result is a JAR.
|
||||
|
||||
:::caution pass
|
||||
|
||||
Rhino does not support Uint8Array, so certain formats like NUMBERS cannot be
|
||||
parsed or written from Rhino JS code!
|
||||
Rhino does not support Uint8Array, so NUMBERS files cannot be read or written.
|
||||
|
||||
:::
|
||||
|
||||
:::note Tested Deployments
|
||||
|
||||
This demo was tested in the following deployments:
|
||||
|
||||
| OpenJDK | Rhino | Date |
|
||||
|:--------|:---------|:-----------|
|
||||
| 21.0.1 | `1.7.14` | 2023-12-05 |
|
||||
| 1.8.0 | `1.7.14` | 2023-12-05 |
|
||||
|
||||
:::
|
||||
|
||||
@ -118,12 +128,6 @@ This string can be loaded into the JS engine and processed:
|
||||
|
||||
## Complete Example
|
||||
|
||||
:::note
|
||||
|
||||
This demo was tested on 2023-10-26 against Rhino 1.7.14.
|
||||
|
||||
:::
|
||||
|
||||
0) Ensure Java is installed.
|
||||
|
||||
1) Create a folder for the project:
|
||||
|
@ -27,7 +27,7 @@ command-line tool for reading data from files.
|
||||
:::note pass
|
||||
|
||||
Many QuickJS functions are not documented. The explanation was verified against
|
||||
the latest release (commit `2788d71`).
|
||||
the latest release (commit `daa35bc`).
|
||||
|
||||
:::
|
||||
|
||||
@ -262,14 +262,14 @@ This demo was tested in the following deployments:
|
||||
|
||||
| Architecture | Git Commit | Date |
|
||||
|:-------------|:-----------|:-----------|
|
||||
| `darwin-x64` | `2788d71` | 2023-10-26 |
|
||||
| `darwin-x64` | `daa35bc` | 2023-12-09 |
|
||||
| `darwin-arm` | `2788d71` | 2023-10-18 |
|
||||
| `win10-x64` | `2788d71` | 2023-10-09 |
|
||||
| `win10-x64` | `daa35bc` | 2023-12-09 |
|
||||
| `win11-arm` | `03cc5ec` | 2023-12-01 |
|
||||
| `linux-x64` | `2788d71` | 2023-10-11 |
|
||||
| `linux-x64` | `03cc5ec` | 2023-12-07 |
|
||||
| `linux-arm` | `03cc5ec` | 2023-12-01 |
|
||||
|
||||
When the demo was tested, commit `03cc5ec` corresponded to the latest commit.
|
||||
When the demo was tested, commit `daa35bc` corresponded to the latest release.
|
||||
|
||||
:::
|
||||
|
||||
@ -285,7 +285,7 @@ tests were run entirely within Windows Subsystem for Linux.
|
||||
```bash
|
||||
git clone https://github.com/bellard/quickjs
|
||||
cd quickjs
|
||||
git checkout 03cc5ec
|
||||
git checkout daa35bc
|
||||
make
|
||||
cd ..
|
||||
```
|
||||
@ -342,10 +342,10 @@ This demo was tested in the following environments:
|
||||
|
||||
| Git Commit | Date |
|
||||
|:-----------|:-----------|
|
||||
| `03cc5ec` | 2023-12-01 |
|
||||
| `2788d71` | 2023-10-11 |
|
||||
| `daa35bc` | 2023-12-09 |
|
||||
| `2788d71` | 2023-12-09 |
|
||||
|
||||
When the demo was tested, commit `03cc5ec` corresponded to the latest commit.
|
||||
When the demo was tested, commit `daa35bc` corresponded to the latest release.
|
||||
|
||||
:::
|
||||
|
||||
@ -354,7 +354,7 @@ When the demo was tested, commit `03cc5ec` corresponded to the latest commit.
|
||||
```bash
|
||||
git clone https://github.com/bellard/quickjs
|
||||
cd quickjs
|
||||
git checkout 03cc5ec
|
||||
git checkout daa35bc
|
||||
make
|
||||
cd ..
|
||||
cp quickjs/qjs .
|
||||
|
@ -135,7 +135,7 @@ This demo was tested in the following deployments:
|
||||
| `darwin-x64` | `c3ead3f` | 2023-11-04 |
|
||||
| `darwin-arm` | `c3ead3f` | 2023-10-19 |
|
||||
| `win10-x64` | `c3ead3f` | 2023-10-28 |
|
||||
| `linux-x64` | `c3ead3f` | 2023-10-11 |
|
||||
| `linux-x64` | `c3ead3f` | 2023-12-09 |
|
||||
|
||||
:::
|
||||
|
||||
|
@ -124,7 +124,7 @@ This demo was tested in the following deployments:
|
||||
| `darwin-arm` | 2023-10-20 |
|
||||
| `win10-x64` | 2023-10-28 |
|
||||
| `win11-arm` | 2023-12-01 |
|
||||
| `linux-x64` | 2023-10-11 |
|
||||
| `linux-x64` | 2023-12-07 |
|
||||
| `linux-arm` | 2023-12-01 |
|
||||
|
||||
:::
|
||||
|
@ -100,9 +100,9 @@ write_file("SheetJE.fods", $fods);
|
||||
|
||||
## Complete Example
|
||||
|
||||
:::note
|
||||
:::note Tested Deployments
|
||||
|
||||
This demo was tested on 2023-08-26 against JE 0.066
|
||||
This demo was tested on 2023-12-05 against JE 0.066.
|
||||
|
||||
:::
|
||||
|
||||
@ -131,5 +131,5 @@ curl -LO https://sheetjs.com/data/cd.xls
|
||||
perl SheetJE.pl cd.xls
|
||||
```
|
||||
|
||||
After a short wait, the contents will be displayed in CSV form. It will also
|
||||
write a file `SheetJE.fods` that can be opened in LibreOffice.
|
||||
After a short wait, the contents will be displayed in CSV form. The script will
|
||||
also generate the spreadsheet `SheetJE.fods` which can be opened in LibreOffice.
|
@ -125,14 +125,22 @@ generates a C library and a standalone CLI tool.
|
||||
|
||||
The simplest way to interact with the engine is to pass Base64 strings.
|
||||
|
||||
:::note pass
|
||||
:::note Tested Environments
|
||||
|
||||
This demo was tested in the following deployments:
|
||||
This demo was tested in the following environments:
|
||||
|
||||
| Architecture | Commit | Date |
|
||||
|:-------------|:----------|:-----------|
|
||||
| `darwin-x64` | `bc408b1` | 2023-11-14 |
|
||||
| `linux-x64` | `a588e49` | 2023-10-11 |
|
||||
| `darwin-x64` | `ef4cb2b` | 2023-12-08 |
|
||||
| `darwin-arm` | `ef4cb2b` | 2023-12-08 |
|
||||
| `win11-x64` | `ef4cb2b` | 2023-12-08 |
|
||||
| `win11-arm` | `ef4cb2b` | 2023-12-08 |
|
||||
| `linux-x64` | `ef4cb2b` | 2023-12-08 |
|
||||
| `linux-arm` | `ef4cb2b` | 2023-12-08 |
|
||||
|
||||
The Windows tests were run in WSL.
|
||||
|
||||
Debian and WSL require the `cmake`, `python3` and `python-is-python3` packages.
|
||||
|
||||
:::
|
||||
|
||||
|
@ -21,7 +21,7 @@ in the [issue tracker](https://git.sheetjs.com/sheetjs/docs.sheetjs.com/issues)
|
||||
- [`XMLHttpRequest and fetch`](/docs/demos/net/network)
|
||||
- [`Clipboard Data`](/docs/demos/local/clipboard)
|
||||
- [`Web Workers`](/docs/demos/bigdata/worker)
|
||||
- [`Typed Arrays for Machine Learning`](/docs/demos/bigdata/ml)
|
||||
- [`Typed Arrays`](/docs/demos/math)
|
||||
- [`Local File Access`](/docs/demos/local/file)
|
||||
- [`LocalStorage and SessionStorage`](/docs/demos/data/storageapi)
|
||||
- [`Web SQL Database`](/docs/demos/data/websql)
|
||||
|
@ -748,7 +748,7 @@ example of fetching data from a JSON Endpoint and generating a workbook.
|
||||
[`x-spreadsheet`](/docs/demos/grid/xs) is an interactive data grid for
|
||||
previewing and modifying structured data in the web browser.
|
||||
|
||||
["Typed Arrays and ML"](/docs/demos/ml) covers strategies for
|
||||
["TensorFlow.js"](/docs/demos/math/tensorflow) covers strategies for
|
||||
creating worksheets from ML library exports (datasets stored in Typed Arrays).
|
||||
|
||||
<details>
|
||||
|
@ -652,7 +652,7 @@ export default function App() {
|
||||
|
||||
### Example: Data Loading
|
||||
|
||||
["Typed Arrays and ML"](/docs/demos/ml) covers strategies for
|
||||
["TensorFlow.js"](/docs/demos/math/tensorflow) covers strategies for
|
||||
generating typed arrays and tensors from worksheet data.
|
||||
|
||||
<details>
|
||||
|
@ -1,6 +1,7 @@
|
||||
---
|
||||
sidebar_position: 8
|
||||
title: Workbook Helpers
|
||||
hide_table_of_contents: true
|
||||
---
|
||||
|
||||
Many utility functions return worksheet objects. Worksheets cannot be written to
|
||||
@ -9,10 +10,12 @@ workbook file formats directly. They must be added to a workbook object.
|
||||
**Create a new workbook**
|
||||
|
||||
```js
|
||||
var workbook = XLSX.utils.book_new();
|
||||
var wb_sans_sheets = XLSX.utils.book_new();
|
||||
```
|
||||
|
||||
The `book_new` utility function creates an empty workbook with no worksheets.
|
||||
With no arguments, the `book_new` utility function creates an empty workbook.
|
||||
|
||||
:::info pass
|
||||
|
||||
Spreadsheet software generally requires at least one worksheet and enforces the
|
||||
requirement in the user interface. For example, if the last worksheet is deleted
|
||||
@ -21,6 +24,29 @@ in the program, Apple Numbers will automatically create a new blank sheet.
|
||||
The SheetJS [write functions](/docs/api/write-options) enforce the requirement.
|
||||
They will throw errors when trying to export empty workbooks.
|
||||
|
||||
:::
|
||||
|
||||
_Single Worksheet_
|
||||
|
||||
:::tip pass
|
||||
|
||||
Version `0.20.1` introduced the one and two argument forms of `book_new`. It is
|
||||
strongly recommended to [upgrade](/docs/getting-started/installation/).
|
||||
|
||||
:::
|
||||
|
||||
```js
|
||||
var wb_with_sheet_named_Sheet1 = XLSX.utils.book_new(worksheet);
|
||||
var wb_with_sheet_named_Blatte = XLSX.utils.book_new(worksheet, "Blatte");
|
||||
```
|
||||
|
||||
`book_new` can accept one or two arguments.
|
||||
|
||||
If provided, the first argument is expected to be a worksheet object. It will
|
||||
be added to the new workbook.
|
||||
|
||||
If provided, the second argument is the name of the worksheet. If omitted, the
|
||||
default name "Sheet1" will be used.
|
||||
|
||||
**Append a Worksheet to a Workbook**
|
||||
|
||||
|
@ -21,7 +21,7 @@ flowchart LR
|
||||
wb(SheetJS\nWorkbook)
|
||||
file[(workbook\nfile)]
|
||||
html --> |table_to_sheet\n\n| ws
|
||||
ws --> |book_new\nbook_append_sheet| wb
|
||||
ws --> |book_new\n\n| wb
|
||||
wb --> |writeFile\n\n| file
|
||||
```
|
||||
|
||||
|
@ -116,7 +116,7 @@ _Exporting Formulae:_
|
||||
|
||||
_Workbook Operations:_
|
||||
|
||||
- `book_new` creates an empty workbook
|
||||
- `book_new` creates a workbook object
|
||||
- `book_append_sheet` adds a worksheet to a workbook
|
||||
|
||||
**[Utility Functions](/docs/api/utilities)**
|
||||
|
@ -37,15 +37,15 @@ building, reproducing official releases, and running NodeJS and browser tests.
|
||||
|
||||
These instructions were tested on the following platforms:
|
||||
|
||||
| Platform | Test Date |
|
||||
|:------------------------------|:-----------|
|
||||
| Linux (Steam Deck Holo x64) | 2023-11-27 |
|
||||
| Linux (Ubuntu 18 AArch64) | 2023-12-01 |
|
||||
| MacOS 10.13.6 (x64) | 2023-09-30 |
|
||||
| MacOS 14.1.2 (ARM64) | 2023-12-01 |
|
||||
| Windows 10 (x64) + WSL Ubuntu | 2023-11-27 |
|
||||
| Windows 11 (x64) + WSL Ubuntu | 2023-10-14 |
|
||||
| Windows 11 (ARM) + WSL Ubuntu | 2023-09-18 |
|
||||
| Platform | Architecture | Test Date |
|
||||
|:------------------------------|:-------------|:-----------|
|
||||
| Linux (Steam Deck Holo x64) | `linux-x64` | 2023-11-27 |
|
||||
| Linux (Ubuntu 18 AArch64) | `linux-arm` | 2023-12-01 |
|
||||
| MacOS 10.13.6 (x64) | `darwin-x64` | 2023-09-30 |
|
||||
| MacOS 14.1.2 (ARM64) | `darwin-arm` | 2023-12-01 |
|
||||
| Windows 10 (x64) + WSL Ubuntu | `win10-x64` | 2023-11-27 |
|
||||
| Windows 11 (x64) + WSL Ubuntu | `win11-x64` | 2023-10-14 |
|
||||
| Windows 11 (ARM) + WSL Ubuntu | `win11-arm` | 2023-09-18 |
|
||||
|
||||
With some additional dependencies, the unminified scripts are reproducible and
|
||||
tests will pass in Windows XP with NodeJS 5.10.0.
|
||||
@ -525,28 +525,28 @@ echo 'export PATH="/usr/local/opt/gnu-sed/libexec/gnubin:$PATH"' >> ~/.profile
|
||||
### Reproduce official builds
|
||||
|
||||
5) Run `git log` and search for the commit that matches a particular release
|
||||
version. For example, version `0.20.0` can be found with:
|
||||
version. For example, version `0.20.1` can be found with:
|
||||
|
||||
```bash
|
||||
git log | grep -B4 "version bump 0.20.0"
|
||||
git log | grep -B4 "version bump 0.20.1"
|
||||
```
|
||||
|
||||
The output should look like:
|
||||
|
||||
```bash
|
||||
$ git log | grep -B4 "version bump 0.20.0"
|
||||
$ git log | grep -B4 "version bump 0.20.1"
|
||||
# highlight-next-line
|
||||
commit 955543147dac0274d20307057c5a9f3e3e5d5307 <-- this is the commit hash
|
||||
commit 29d46c07a895bdfd948d15b5115529ae697ccb48 <-- this is the commit hash
|
||||
Author: SheetJS <dev@sheetjs.com>
|
||||
Date: Fri Jun 23 05:48:47 2023 -0400
|
||||
Date: Tue Dec 5 03:19:42 2023 -0500
|
||||
|
||||
version bump 0.20.0
|
||||
version bump 0.20.1
|
||||
```
|
||||
|
||||
6) Switch to that commit:
|
||||
|
||||
```bash
|
||||
git checkout 955543147dac0274d20307057c5a9f3e3e5d5307
|
||||
git checkout 29d46c07a895bdfd948d15b5115529ae697ccb48
|
||||
```
|
||||
|
||||
7) Run the full build sequence
|
||||
@ -593,36 +593,36 @@ The checksum for the CDN version can be computed with:
|
||||
<TabItem value="wsl" label="Windows WSL">
|
||||
|
||||
```bash
|
||||
curl -L https://cdn.sheetjs.com/xlsx-0.20.0/package/dist/xlsx.full.min.js | md5sum -
|
||||
curl -L https://cdn.sheetjs.com/xlsx-0.20.1/package/dist/xlsx.full.min.js | md5sum -
|
||||
```
|
||||
|
||||
</TabItem>
|
||||
<TabItem value="osx" label="MacOS">
|
||||
|
||||
```bash
|
||||
curl -k -L https://cdn.sheetjs.com/xlsx-0.20.0/package/dist/xlsx.full.min.js | md5
|
||||
curl -k -L https://cdn.sheetjs.com/xlsx-0.20.1/package/dist/xlsx.full.min.js | md5
|
||||
```
|
||||
|
||||
</TabItem>
|
||||
<TabItem value="l" label="Linux">
|
||||
|
||||
```bash
|
||||
curl -L https://cdn.sheetjs.com/xlsx-0.20.0/package/dist/xlsx.full.min.js | md5sum -
|
||||
curl -L https://cdn.sheetjs.com/xlsx-0.20.1/package/dist/xlsx.full.min.js | md5sum -
|
||||
```
|
||||
|
||||
</TabItem>
|
||||
</Tabs>
|
||||
|
||||
When the demo was last tested on macOS, against version `0.20.0`:
|
||||
When the demo was last tested on macOS, against version `0.20.1`:
|
||||
|
||||
>
|
||||
```bash
|
||||
$ md5 dist/xlsx.full.min.js
|
||||
# highlight-next-line
|
||||
MD5 (dist/xlsx.full.min.js) = 0b2f539797f92d35c6394274818f2c22
|
||||
$ curl -k -L https://cdn.sheetjs.com/xlsx-0.20.0/package/dist/xlsx.full.min.js | md5
|
||||
MD5 (dist/xlsx.full.min.js) = c5db4b1d2a1985a4ebfbaa500243f593
|
||||
$ curl -k -L https://cdn.sheetjs.com/xlsx-0.20.1/package/dist/xlsx.full.min.js | md5
|
||||
# highlight-next-line
|
||||
0b2f539797f92d35c6394274818f2c22
|
||||
c5db4b1d2a1985a4ebfbaa500243f593
|
||||
```
|
||||
|
||||
The two hashes should match.
|
||||
@@ -227,9 +227,11 @@ const config = {
 { from: '/docs/getting-started/demos/cli', to: '/docs/demos/desktop/cli/' },
 { from: '/docs/getting-started/demos/desktop', to: '/docs/demos/desktop/' },
 /* bigdata */
-{ from: '/docs/demos/ml', to: '/docs/demos/bigdata/ml/' },
 { from: '/docs/demos/worker', to: '/docs/demos/bigdata/worker/' },
 { from: '/docs/demos/stream', to: '/docs/demos/bigdata/stream/' },
+/* math */
+{ from: '/docs/demos/ml', to: '/docs/demos/math/' },
+{ from: '/docs/demos/bigdata/ml', to: '/docs/demos/math/' },
 /* installation */
 { from: '/docs/installation/standalone', to: '/docs/getting-started/installation/standalone/' },
 { from: '/docs/installation/frameworks', to: '/docs/getting-started/installation/frameworks/' },
@@ -26,7 +26,7 @@
 "prism-react-renderer": "1.3.5",
 "react": "17.0.2",
 "react-dom": "17.0.2",
-"xlsx": "https://cdn.sheetjs.com/xlsx-0.20.0/xlsx-0.20.0.tgz"
+"xlsx": "https://cdn.sheetjs.com/xlsx-0.20.1/xlsx-0.20.1.tgz"
 },
 "devDependencies": {
 "@docusaurus/module-type-aliases": "2.4.1"
@@ -1,6 +1,6 @@
-// @deno-types="https://cdn.sheetjs.com/xlsx-0.20.0/package/types/index.d.ts"
-import { read, utils, set_cptable, version } from 'https://cdn.sheetjs.com/xlsx-0.20.0/package/xlsx.mjs';
-import * as cptable from 'https://cdn.sheetjs.com/xlsx-0.20.0/package/dist/cpexcel.full.mjs';
+// @deno-types="https://cdn.sheetjs.com/xlsx-0.20.1/package/types/index.d.ts"
+import { read, utils, set_cptable, version } from 'https://cdn.sheetjs.com/xlsx-0.20.1/package/xlsx.mjs';
+import * as cptable from 'https://cdn.sheetjs.com/xlsx-0.20.1/package/dist/cpexcel.full.mjs';
 set_cptable(cptable);

 import * as Drash from "https://cdn.jsdelivr.net/gh/drashland/drash@v2.8.1/mod.ts";
@@ -64,6 +64,10 @@ async function do_file(files) {
 process_wb(XLSX.read(data));
 }

+(async() => {
+process_wb(XLSX.read(await (await fetch("https://sheetjs.com/pres.numbers")).arrayBuffer()))
+})();
+
 var drop = document.getElementById('drop');

 function handleDrop(e) {
BIN
docz/static/reactnative/xcode-select-project.png
Normal file
Size: 24 KiB
BIN
docz/static/reactnative/xcode-targets.png
Normal file
Size: 141 KiB
2
docz/static/tfjs/tf.min.js
vendored
BIN
docz/static/typedarray/col.png
Normal file
Size: 14 KiB
BIN
docz/static/typedarray/iris.png
Normal file
Size: 31 KiB
BIN
docz/static/typedarray/iris.xlsx
Normal file
BIN
docz/static/typedarray/iristr.png
Normal file
Size: 32 KiB
BIN
docz/static/typedarray/ta-col.png
Normal file
Size: 17 KiB
BIN
docz/static/typedarray/ta-row.png
Normal file
Size: 16 KiB
@@ -1,3 +1,3 @@
-//const version = "0.20.0";
+//const version = "0.20.1";
 import { version } from "xlsx";
 export default version;