diff --git a/.gitignore b/.gitignore index d3ab5af..f828735 100644 --- a/.gitignore +++ b/.gitignore @@ -3,3 +3,4 @@ package-lock.json pnpm-lock.yaml /docs +node_modules diff --git a/docz/docs/03-demos/01-math/09-danfojs.md b/docz/docs/03-demos/01-math/09-danfojs.md new file mode 100644 index 0000000..6280e2e --- /dev/null +++ b/docz/docs/03-demos/01-math/09-danfojs.md @@ -0,0 +1,307 @@ +--- +title: Sheets in DanfoJS +sidebar_label: DanfoJS +pagination_prev: demos/index +pagination_next: demos/frontend/index +--- + +
+ + + +[SheetJS](https://sheetjs.com) is a JavaScript library for reading and writing +data from spreadsheets. + +[DanfoJS](https://danfo.jsdata.org/) is a library for processing structured +data. It uses SheetJS under the hood for reading and writing spreadsheets. + +This demo covers details elided in the official DanfoJS documentation. + +:::note Tested Deployments + +This example was last tested on 2024 January 03 against DanfoJS 1.1.2. + +::: + +:::info Browser integration + +The live demos on this page include the DanfoJS browser bundle: + +```html + +``` + +There are known issues with the documentation generator. If a demo explicitly +prints "RELOAD THIS PAGE", please reload or refresh the page. + +::: + +## DataFrames and Worksheets + +The DanfoJS `DataFrame`[^1] represents two-dimensional tabular data. It is the +starting point for most DanfoJS data processing tasks. A `DataFrame` typically +corresponds to one SheetJS worksheet[^2]. + +<table><thead><tr><th>Spreadsheet</th><th>DanfoJS DataFrame</th></tr></thead><tbody><tr><td> + +![`pres.xlsx` data](pathname:///pres.png) + +</td><td> + +``` +╔════╤═══════════════╤═══════╗ +║ │ Name │ Index ║ +╟────┼───────────────┼───────╢ +║ 0 │ Bill Clinton │ 42 ║ +╟────┼───────────────┼───────╢ +║ 1 │ GeorgeW Bush │ 43 ║ +╟────┼───────────────┼───────╢ +║ 2 │ Barack Obama │ 44 ║ +╟────┼───────────────┼───────╢ +║ 3 │ Donald Trump │ 45 ║ +╟────┼───────────────┼───────╢ +║ 4 │ Joseph Biden │ 46 ║ +╚════╧═══════════════╧═══════╝ +``` + +</td></tr></tbody></table>
{text}); +} +``` + +#### File source + +The following example uses a file input element. The "File API"[^5] section of +the "Local File Access" demo covers the browser API in more detail. + +```jsx live +function DanfoReadExcelFile() { + const [text, setText] = React.useState("Select a spreadsheet"); + + return (<>
{text}{ + if(typeof dfd === "undefined") return setText("RELOAD THIS PAGE!"); + + /* get first file */ + const file = e.target.files[0]; + + /* create dataframe and pretty-print the first few rows */ + const df = await dfd.readExcel(file); + setText("" + df.head()); + }}/>>); +} +``` + +### Exporting DataFrames + +`toExcel`[^6] accepts two arguments: dataframe and options. Under the hood, it +uses the SheetJS `writeFile` method[^7]. + +_Exported File Name_ + +The relevant property for the file name depends on the platform: + +| Platform | Property | +|:---------|:-----------| +| NodeJS | `filePath` | +| Browser | `fileName` | + +The exporter will deduce the desired file format from the file extension. + +_Worksheet Name_ + +The `sheetName` property specifies the name of the worksheet in the workbook: + +```js +dfd.toExcel(df, { + fileName: "test.xlsx", // generate `test.xlsx` + // highlight-next-line + sheetName: "Export" // The name of the worksheet will be "Export" +}); +``` + +:::caution pass + +The DanfoJS integration forces the `.xlsx` file extension. Exporting to other +file formats will require [low-level operations](#generating-files). + +::: + +_More Writing Options_ + +The `writingOptions` property of the options argument is passed directly to the +SheetJS `writeFile` method[^8]. + +For example, the `compression` property enables ZIP compression for XLSX and +other formats: + +```js +dfd.toExcel(df, {fileName: "export.xlsx", writingOptions: { + // see https://docs.sheetjs.com/docs/api/write-options for details + compression: true +}}); +``` + +#### Export to File + +The following example exports a sample dataframe to an XLSX spreadsheet. + +```jsx live +function DanfoToExcel() { + if(typeof dfd === "undefined") return (RELOAD THIS PAGE); + /* sample dataframe */ + const df = new dfd.DataFrame([{Sheet:1,JS:2},{Sheet:3,JS:4}]); + return ( <>
{"Data:\n"+df.head()}> ); +} +``` + +## Low-Level Operations + +DanfoJS and SheetJS provide methods for processing arrays of objects. + +```mermaid +flowchart LR + ws((SheetJS\nWorksheet)) + aoo[[array of\nobjects]] + df[(DanfoJS\nDataFrame)] + ws --> |sheet_to_json\n\n| aoo + aoo --> |\njson_to_sheet| ws + df --> |\ndfd.toJSON| aoo + aoo --> |new DataFrame\n\n| df +``` + +### Creating DataFrames + +The `DataFrame` constructor[^9] creates `DataFrame` objects from arrays of +objects. Given a SheetJS worksheet object, the `sheet_to_json` method[^10] +generates compatible arrays of objects: + +```js +function ws_to_df(ws) { + const aoo = XLSX.utils.sheet_to_json(ws); + return new dfd.DataFrame(aoo); +} +``` + +### Generating Files + +`toJSON`[^11] accepts two arguments: dataframe and options. + +The `format` key of the `options` argument dictates the result layout. The +`column` layout generates an array of objects in row-major order. The SheetJS +`json_to_sheet`[^12] method can generate a worksheet object from the result: + +```js +function df_to_ws(df) { + const aoo = dfd.toJSON(df, { format: "column" }); + return XLSX.utils.json_to_sheet(aoo); +} +``` + +The SheetJS `book_new` method creates a workbook object from the worksheet[^13] +and the `writeFile` method[^14] will generate the file: + +```js +const ws = df_to_ws(df); +const wb = XLSX.utils.book_new(ws, "Export"); +XLSX.writeFile(wb, "SheetJSDanfoJS.xlsb", { compression: true }); +``` + +The following demo exports a sample dataframe to XLSB. This operation is not +supported by the DanfoJS `toExcel` method since that method enforces XLSX. + +```jsx live +function DanfoToXLS() { + if(typeof dfd === "undefined") return (RELOAD THIS PAGE); + /* sample dataframe */ + const df = new dfd.DataFrame([{Sheet:1,JS:2},{Sheet:3,JS:4}]); + return ( <>
{"Data:\n"+df.head()}> ); +} +``` + +[^1]: See ["Dataframe"](https://danfo.jsdata.org/api-reference/dataframe) in the DanfoJS documentation +[^2]: See ["Sheet Objects"](/docs/csf/sheet) +[^3]: See ["danfo.readExcel"](https://danfo.jsdata.org/api-reference/input-output/danfo.read_excel) in the DanfoJS documentation. +[^4]: See ["Reading Files"](/docs/api/parse-options/#parsing-options) for the full list of parsing options. +[^5]: See ["File API" in "Local File Access"](/docs/demos/local/file#file-api) for more details. +[^6]: See ["danfo.toExcel"](https://danfo.jsdata.org/api-reference/input-output/danfo.to_excel) in the DanfoJS documentation. +[^7]: See [`writeFile` in "Writing Files"](/docs/api/write-options) +[^8]: See ["Writing Files"](/docs/api/write-options/#writing-options) for the full list of writing options. +[^9]: See ["Creating a DataFrame"](https://danfo.jsdata.org/api-reference/dataframe/creating-a-dataframe) in the DanfoJS documentation. +[^10]: See [`sheet_to_json` in "Utilities"](/docs/api/utilities/array#array-output) +[^11]: See ["danfo.toJSON"](https://danfo.jsdata.org/api-reference/input-output/danfo.to_json) in the DanfoJS documentation. +[^12]: See [`json_to_sheet` in "Utilities"](/docs/api/utilities/array#array-of-objects-input) +[^13]: See [`book_new` in "Utilities"](/docs/api/utilities/wb) +[^14]: See [`writeFile` in "Writing Files"](/docs/api/write-options) \ No newline at end of file diff --git a/docz/docs/03-demos/01-math/11-tensorflow.md b/docz/docs/03-demos/01-math/11-tensorflow.md new file mode 100644 index 0000000..f225108 --- /dev/null +++ b/docz/docs/03-demos/01-math/11-tensorflow.md @@ -0,0 +1,440 @@ +--- +title: Sheets in TensorFlow +sidebar_label: TensorFlow.js +pagination_prev: demos/index +pagination_next: demos/frontend/index +--- + + + + + +[TensorFlow.js](https://www.tensorflow.org/js) (shortened to TF.js) is a library +for machine learning in JavaScript. 
+ +[SheetJS](https://sheetjs.com) is a JavaScript library for reading and writing +data from spreadsheets. + +This demo uses TensorFlow.js and SheetJS to process data in spreadsheets. We'll +explore how to load spreadsheet data into TF.js datasets and how to export +results back to spreadsheets. + +- ["CSV Data Interchange"](#csv-data-interchange) uses SheetJS to process sheets + and generate CSV data that TF.js can import. + +- ["JSON Data Interchange"](#json-data-interchange) uses SheetJS to process + sheets and generate rows of objects that can be post-processed. + +:::info pass + +Live code blocks in this page use the TF.js `4.14.0` standalone build. + +For use in web frameworks, the `@tensorflow/tfjs` module should be used. + +For use in NodeJS, the native bindings module is `@tensorflow/tfjs-node`. + +::: + +:::note Tested Deployments + +Each browser demo was tested in the following environments: + +| Browser | TF.js version | Date | +|:------------|:--------------|:-----------| +| Chrome 119 | `4.14.0` | 2023-12-09 | +| Safari 16.6 | `4.14.0` | 2023-12-09 | + +::: + +## CSV Data Interchange + +`tf.data.csv`[^1] generates a Dataset from CSV data. The function expects a URL. + +:::note pass + +When this demo was last tested, there was no direct method to pass a CSV string +to the underlying parser. + +::: + +Fortunately blob URLs are supported. + +```mermaid +flowchart LR + ws((SheetJS\nWorksheet)) + csv(CSV\nstring) + url{{Data\nURL}} + dataset[(TF.js\nDataset)] + ws --> |sheet_to_csv\nSheetJS| csv + csv --> |JavaScript\nAPIs| url + url --> |tf.data.csv\nTensorFlow.js| dataset +``` + +The SheetJS `sheet_to_csv` method[^2] generates a CSV string from a worksheet +object. 
Using standard JavaScript techniques, a blob URL can be constructed: + +```js +function worksheet_to_csv_url(worksheet) { + /* generate CSV */ + const csv = XLSX.utils.sheet_to_csv(worksheet); + + /* CSV -> Uint8Array -> Blob */ + const u8 = new TextEncoder().encode(csv); + const blob = new Blob([u8], { type: "text/csv" }); + + /* generate a blob URL */ + return URL.createObjectURL(blob); +} +``` + +### CSV Demo + +This demo shows a simple model fitting using the "cars" dataset from TensorFlow. +The [sample XLS file](https://sheetjs.com/data/cd.xls) contains the data. The +data processing mirrors the official "Making Predictions from 2D Data" demo[^3]. + +```mermaid +flowchart LR + file[(Remote\nFile)] + subgraph SheetJS Operations + ab[(Data\nBytes)] + wb(((SheetJS\nWorkbook))) + ws((SheetJS\nWorksheet)) + csv(CSV\nstring) + end + subgraph TensorFlow.js Operations + url{{Data\nURL}} + dataset[(TF.js\nDataset)] + results((Results)) + end + file --> |fetch\n\n| ab + ab --> |read\n\n| wb + wb --> |select\nsheet| ws + ws --> |sheet_to_csv\n\n| csv + csv --> |JS\nAPI| url + url --> |tf.data.csv\nTF.js| dataset + dataset --> |fitDataset\nTF.js| results +``` + +The demo builds a model for predicting MPG from Horsepower data. It: + +- fetches
{output}|| <>>} + {results.length &&
Horsepower | MPG |
---|---|
{r[0]} | {r[1].toFixed(2)} |
Index | Name |
---|---|
{i} | {name} |
{c} | ))}
---|
{col} | ))}
+<table><thead><tr><th>JavaScript</th><th>Spreadsheet</th></tr></thead><tbody><tr><td> + +```js +var aoa = [ + ["sepal length", 5.1, 4.9], + ["sepal width", 3.5, 3], + ["petal length", 1.4, 1.4], + ["petal width", 0.2, 0.2], + ["class", "setosa", "setosa"] +] +``` + +</td><td> + +![Single column of data](pathname:///typedarray/iristr.png) + +</td></tr></tbody></table>
+<table><thead><tr><th>JavaScript</th><th>Spreadsheet</th></tr></thead><tbody><tr><td> + +```js +var data = [ + [54337.95], + [3.14159], + [2.718281828] +]; +``` + +</td><td> + +![Single column of data](pathname:///typedarray/col.png) + +</td></tr></tbody></table>
- - {output} -); -} -``` - -