docs.sheetjs.com/docz/docs/03-demos/01-math/09-danfojs.md

309 lines
9.9 KiB
Markdown
Raw Normal View History

2024-01-03 06:47:00 +00:00
---
title: Sheets in DanfoJS
sidebar_label: DanfoJS
pagination_prev: demos/index
pagination_next: demos/frontend/index
---
<head>
<script src="https://cdn.jsdelivr.net/npm/danfojs@1.1.2/lib/bundle.min.js"></script>
</head>
[SheetJS](https://sheetjs.com) is a JavaScript library for reading and writing
data from spreadsheets.
[DanfoJS](https://danfo.jsdata.org/) is a library for processing structured
data. It uses SheetJS under the hood for reading and writing spreadsheets.
This demo covers details elided in the official DanfoJS documentation.
:::note Tested Deployments
This example was last tested on 2024 January 03 against DanfoJS 1.1.2.
:::
:::info Browser integration
The live demos on this page include the DanfoJS browser bundle:
```html
<script src="https://cdn.jsdelivr.net/npm/danfojs@1.1.2/lib/bundle.min.js"></script>
```
There are known issues with the documentation generator. If a demo explicitly
prints "RELOAD THIS PAGE", please reload or refresh the page.
:::
## DataFrames and Worksheets
The DanfoJS `DataFrame`[^1] represents two-dimensional tabular data. It is the
starting point for most DanfoJS data processing tasks. A `DataFrame` typically
corresponds to one SheetJS worksheet[^2].
2024-04-12 01:04:37 +00:00
<table>
<thead><tr><th>Spreadsheet</th><th>DanfoJS DataFrame</th></tr></thead>
<tbody><tr><td>
2024-01-03 06:47:00 +00:00
![`pres.xlsx` data](pathname:///pres.png)
</td><td>
```
╔════╤═══════════════╤═══════╗
║ │ Name │ Index ║
╟────┼───────────────┼───────╢
║ 0 │ Bill Clinton │ 42 ║
╟────┼───────────────┼───────╢
║ 1 │ GeorgeW Bush │ 43 ║
╟────┼───────────────┼───────╢
║ 2 │ Barack Obama │ 44 ║
╟────┼───────────────┼───────╢
║ 3 │ Donald Trump │ 45 ║
╟────┼───────────────┼───────╢
║ 4 │ Joseph Biden │ 46 ║
╚════╧═══════════════╧═══════╝
```
</td></tr></tbody></table>
## DanfoJS SheetJS Integration
:::note pass
The official documentation inconsistently names the library object `danfo` and
`dfd`. Since `dfd` is the browser global, the demos use the name `dfd`.
:::
Methods to read and write spreadsheets are attached to the main `dfd` object.
### Importing DataFrames
`readExcel`[^3] accepts two arguments: source data and options.
The source data must be a `string` or `File` object. Strings are interpreted as
URLs while `File` objects are treated as data.
_Selecting a Worksheet_
DanfoJS will generate a dataframe from one worksheet. The parser normally uses
the first worksheet. The `sheet` property of the options object controls the
selected worksheet. It is expected to be a zero-indexed number:
```js
const first_sheet = await dfd.readExcel(url, {sheet: 0});
const second_sheet = await dfd.readExcel(url, {sheet: 1});
```
_More Parsing Options_
The `parsingOptions` property of the options argument is passed directly to the
SheetJS `read` method[^4].
For example, the `sheetRows` property controls how many rows are extracted from
larger worksheets. To pull 3 data rows, `sheetRows` must be set to 4:
```js
const first_three_rows = await dfd.readExcel(url, { parsingOptions: {
// see https://docs.sheetjs.com/docs/api/parse-options for details
sheetRows: 4
} });
```
#### URL source
The following example fetches a [test file](https://sheetjs.com/pres.xlsx),
parses with SheetJS and generates a DanfoJS dataframe.
```jsx live
function DanfoReadExcelURL() {
const [text, setText] = React.useState("");
React.useEffect(() => { (async() => {
if(typeof dfd === "undefined") return setText("RELOAD THIS PAGE!");
const df = await dfd.readExcel("https://sheetjs.com/pres.xlsx");
setText("" + df.head());
})(); }, []);
return (<pre>{text}</pre>);
}
```
#### File source
The following example uses a file input element. The "File API"[^5] section of
the "Local File Access" demo covers the browser API in more detail.
```jsx live
function DanfoReadExcelFile() {
const [text, setText] = React.useState("Select a spreadsheet");
return (<><pre>{text}</pre><input type="file" onChange={async(e) => {
if(typeof dfd === "undefined") return setText("RELOAD THIS PAGE!");
/* get first file */
const file = e.target.files[0];
/* create dataframe and pretty-print the first 10 rows */
const df = await dfd.readExcel(file);
setText("" + df.head());
}}/></>);
}
```
### Exporting DataFrames
`toExcel`[^6] accepts two arguments: dataframe and options. Under the hood, it
uses the SheetJS `writeFile` method[^7].
_Exported File Name_
The relevant property for the file name depends on the platform:
| Platform | Property |
|:---------|:-----------|
| NodeJS | `filePath` |
| Browser | `fileName` |
The exporter will deduce the desired file format from the file extension.
_Worksheet Name_
The `sheetName` property specifies the name of the worksheet in the workbook:
```js
dfd.toExcel(df, {
fileName: "test.xlsx", // generate `test.xlsx`
// highlight-next-line
sheetName: "Export" // The name of the worksheet will be "Export"
});
```
:::caution pass
The DanfoJS integration forces the `.xlsx` file extension. Exporting to other
file formats will require [low-level operations](#generating-files).
:::
_More Writing Options_
The `writingOptions` property of the options argument is passed directly to the
SheetJS `writeFile` method[^8].
For example, the `compression` property enables ZIP compression for XLSX and
other formats:
```js
dfd.toExcel(df, {fileName: "export.xlsx", writingOptions: {
// see https://docs.sheetjs.com/docs/api/write-options for details
compression: true
}});
```
#### Export to File
The following example exports a sample dataframe to a XLSX spreadsheet.
```jsx live
function DanfoToExcel() {
if(typeof dfd === "undefined") return (<b>RELOAD THIS PAGE</b>);
/* sample dataframe */
const df = new dfd.DataFrame([{Sheet:1,JS:2},{Sheet:3,JS:4}]);
return ( <><button onClick={async() => {
/* dfd.toExcel calls the SheetJS `writeFile` method */
dfd.toExcel(df, {fileName: "SheetJSDanfoJS.xlsx", writingOptions: {
compression: true
}});
}}>Click to Export</button><pre>{"Data:\n"+df.head()}</pre></> );
}
```
## Low-Level Operations
DanfoJS and SheetJS provide methods for processing arrays of objects.
```mermaid
flowchart LR
ws((SheetJS\nWorksheet))
aoo[[array of\nobjects]]
df[(DanfoJS\nDataFrame)]
ws --> |sheet_to_json\n\n| aoo
aoo --> |\njson_to_sheet| ws
df --> |\ndfd.toJSON| aoo
aoo --> |new DataFrame\n\n| df
```
### Creating DataFrames
The `DataFrame` constructor[^9] creates `DataFrame` objects from arrays of
objects. Given a SheetJS worksheet object, the `sheet_to_json` method[^10]
generates compatible arrays of objects:
```js
function ws_to_df(ws) {
const aoo = XLSX.utils.sheet_to_json(ws);
return new dfd.DataFrame(aoo);
}
```
### Generating Files
`toJSON`[^11] accepts two arguments: dataframe and options.
The `format` key of the `options` argument dictates the result layout. The
`column` layout generates an array of objects in row-major order. The SheetJS
`json_to_sheet`[^12] method can generate a worksheet object from the result:
```js
function df_to_ws(df) {
const aoo = dfd.toJSON(df, { format: "column" });
return XLSX.utils.json_to_sheet(aoo);
}
```
The SheetJS `book_new` method creates a workbook object from the worksheet[^13]
and the `writeFile` method[^14] will generate the file:
```js
const ws = df_to_ws(df);
const wb = XLSX.utils.book_new(ws, "Export");
XLSX.writeFile(wb, "SheetJSDanfoJS.xlsb", { compression: true });
```
The following demo exports a sample dataframe to XLSB. This operation is not
supported by the DanfoJS `toExcel` method since that method enforces XLSX.
```jsx live
function DanfoToXLS() {
if(typeof dfd === "undefined") return (<b>RELOAD THIS PAGE</b>);
/* sample dataframe */
const df = new dfd.DataFrame([{Sheet:1,JS:2},{Sheet:3,JS:4}]);
return ( <><button onClick={async() => {
/* generate worksheet */
const aoo = dfd.toJSON(df, { format: "column" });
const ws = XLSX.utils.json_to_sheet(aoo);
/* generate workbook */
const wb = XLSX.utils.book_new(ws, "Export");
/* write to XLS */
XLSX.writeFile(wb, "SheetJSDanfoJS.xlsb", { compression: true });
}}>Click to Export</button><pre>{"Data:\n"+df.head()}</pre></> );
}
```
[^1]: See ["Dataframe"](https://danfo.jsdata.org/api-reference/dataframe) in the DanfoJS documentation
[^2]: See ["Sheet Objects"](/docs/csf/sheet)
[^3]: See ["danfo.readExcel"](https://danfo.jsdata.org/api-reference/input-output/danfo.read_excel) in the DanfoJS documentation.
2024-04-12 01:04:37 +00:00
[^4]: See ["Reading Files"](/docs/api/parse-options#parsing-options) for the full list of parsing options.
2024-01-03 06:47:00 +00:00
[^5]: See ["File API" in "Local File Access"](/docs/demos/local/file#file-api) for more details.
[^6]: See ["danfo.toExcel"](https://danfo.jsdata.org/api-reference/input-output/danfo.to_excel) in the DanfoJS documentation.
[^7]: See [`writeFile` in "Writing Files"](/docs/api/write-options)
2024-04-12 01:04:37 +00:00
[^8]: See ["Writing Files"](/docs/api/write-options#writing-options) for the full list of writing options.
2024-01-03 06:47:00 +00:00
[^9]: See ["Creating a DataFrame"](https://danfo.jsdata.org/api-reference/dataframe/creating-a-dataframe) in the DanfoJS documentation.
[^10]: See [`sheet_to_json` in "Utilities"](/docs/api/utilities/array#array-output)
[^11]: See ["danfo.toJSON"](https://danfo.jsdata.org/api-reference/input-output/danfo.to_json) in the DanfoJS documentation.
[^12]: See [`json_to_sheet` in "Utilities"](/docs/api/utilities/array#array-of-objects-input)
[^13]: See [`book_new` in "Utilities"](/docs/api/utilities/wb)
[^14]: See [`writeFile` in "Writing Files"](/docs/api/write-options)