forked from sheetjs/docs.sheetjs.com
useeffect-async
parent 952244b917
commit 17d9d3d7cf
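The recurring change in this commit replaces `React.useEffect(async() => { ... }, [])` with a synchronous callback that immediately invokes an async function. An `async` function always returns a Promise, while an effect callback is expected to return either nothing or a cleanup function. The sketch below illustrates the difference without React. `runEffect` is a hypothetical stand-in that only classifies the return value; it is an assumption for illustration, not React's actual implementation.

```javascript
/* Hypothetical stand-in for a framework that runs an effect callback.
   It only classifies the return value; it is NOT React's real implementation. */
function runEffect(effect) {
  const result = effect();
  if (result === undefined) return "ok: no cleanup";
  if (typeof result === "function") return "ok: cleanup function";
  return "invalid: effect returned " + Object.prototype.toString.call(result);
}

/* Old pattern: the async callback itself is the effect, so it returns a Promise */
const before = runEffect(async () => {
  await Promise.resolve();
});

/* New pattern: a synchronous callback starts an async IIFE and returns undefined */
const after = runEffect(() => { (async () => {
  await Promise.resolve();
})(); });

console.log(before); // invalid: effect returned [object Promise]
console.log(after);  // ok: no cleanup
```

The wrapped form keeps the `await`-based body unchanged, which is presumably why each hunk only touches the first and last line of the effect.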
@@ -1,7 +1,7 @@
 ---
-title: Spreadsheet Data in Pandas
-sidebar_label: Python + Pandas
-description: Process structured data in Python with Pandas. Seamlessly integrate spreadsheets into your workflow with SheetJS. Analyze complex Excel spreadsheets with confidence.
+title: Spreadsheet Data in Python
+sidebar_label: Python DataFrames
+description: Process structured data in Python DataFrames. Seamlessly integrate spreadsheets into your workflow with SheetJS. Analyze complex Excel spreadsheets with confidence.
 pagination_prev: demos/index
 pagination_next: demos/frontend/index
 ---
@@ -11,7 +11,7 @@ import Tabs from '@theme/Tabs';
 import TabItem from '@theme/TabItem';
 import CodeBlock from '@theme/CodeBlock';

-Pandas[^1] is a Python software library for data analysis.
+[Pandas](https://pandas.pydata.org/) is a Python library for data analysis.

 [SheetJS](https://sheetjs.com) is a JavaScript library for reading and writing
 data from spreadsheets.
@@ -28,7 +28,7 @@ simplifies importing and exporting spreadsheets.
 Pandas includes limited support for reading spreadsheets (`pandas.read_excel`)
 and writing XLSX spreadsheets (`pandas.DataFrame.to_excel`).

-**SheetJS supports common spreadsheet formats that Pandas cannot process.**
+**SheetJS supports many common spreadsheet formats that Pandas cannot process.**

 SheetJS operations also offer more flexibility in processing complex worksheets.

@@ -55,8 +55,8 @@ included in the ["Complete Example"](#complete-example) section.

 JS code cannot be directly evaluated in Python implementations.

-To run JS code from Python, JavaScript engines[^2] can be embedded in Python
-modules or dynamically loaded using the `ctypes` foreign function library[^3].
+To run JS code from Python, JavaScript engines[^1] can be embedded in Python
+modules or dynamically loaded using the `ctypes` foreign function library[^2].
 This demo uses `ctypes` with the [Duktape engine](/docs/demos/engines/duktape).

 ### Wrapper
@@ -138,12 +138,12 @@ flowchart LR

 2) SheetJS libraries parse the string and generate a clean CSV.

-- The `read` method[^4] parses file bytes into a SheetJS workbook object[^5]
-- After selecting a worksheet, `sheet_to_csv`[^6] generates a CSV string
+- The `read` method[^3] parses file bytes into a SheetJS workbook object[^4]
+- After selecting a worksheet, `sheet_to_csv`[^5] generates a CSV string

-3) Python operations convert the CSV string to a stream object.[^7]
+3) Python operations convert the CSV string to a stream object.[^6]

-4) The Pandas `read_csv` method[^8] ingests the stream and generates a DataFrame.
+4) The Pandas `read_csv` method[^7] ingests the stream and generates a DataFrame.

 ### Writing Files

@@ -179,15 +179,15 @@ flowchart LR
 u8a --> |`open`/`write`\nPython ops| file
 ```

-1) The Pandas DataFrame `to_json` method[^9] generates a JSON string.
+1) The Pandas DataFrame `to_json` method[^8] generates a JSON string.

 2) JS engine operations translate the JSON string to an array of objects.

 3) SheetJS libraries process the data array and generate file bytes.

-- The `json_to_sheet` method[^10] creates a SheetJS sheet object from the data.
-- The `book_new` method[^11] creates a SheetJS workbook that includes the sheet.
-- The `write` method[^12] generates the spreadsheet file bytes.
+- The `json_to_sheet` method[^9] creates a SheetJS sheet object from the data.
+- The `book_new` method[^10] creates a SheetJS workbook that includes the sheet.
+- The `write` method[^11] generates the spreadsheet file bytes.

 4) Pure Python operations write the bytes to file.

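Step 2 of the writing flow in the hunk above translates the JSON string into an array of objects. Inside a JS engine this is ordinarily a `JSON.parse` call. The following is a minimal sketch, assuming the string uses the "records" orientation (an array of row objects). The sample payload and the `recordsToRows` helper are hypothetical and SheetJS itself is not used here; the helper only shows the tabular shape that the array of objects implies.

```javascript
/* Sample payload in the "records" orientation (array of row objects).
   The values are illustrative, not read from the demo files. */
const json = '[{"Name":"Bill Clinton","Index":42},{"Name":"Barack Obama","Index":44}]';

/* Step 2: translate the JSON string to an array of objects */
const data = JSON.parse(json);

/* Hypothetical helper: flatten the objects into a header row plus data rows,
   roughly the tabular shape that a worksheet would hold */
function recordsToRows(records) {
  const header = Object.keys(records[0] || {});
  return [header, ...records.map(r => header.map(k => r[k]))];
}

const rows = recordsToRows(data);
console.log(rows);
```

In the demo itself, the parsed array is handed to `json_to_sheet` rather than flattened by hand.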
@@ -317,15 +317,166 @@ dtypes: int64(1), object(1)
 It will also export the DataFrame to `SheetJSPandas.xlsb`. The file can be
 inspected with a spreadsheet editor that supports XLSB files.

-[^1]: The official documentation site is <https://pandas.pydata.org/> and the official distribution point is <https://pypi.org/project/pandas/>
-[^2]: See ["Other Languages"](/docs/demos/engines/) for more examples.
-[^3]: See [`ctypes`](https://docs.python.org/3/library/ctypes.html) in the Python documentation.
-[^4]: See [`read` in "Reading Files"](/docs/api/parse-options)
-[^5]: See ["Workbook Object"](/docs/csf/book)
-[^6]: See [`sheet_to_csv` in "Utilities"](/docs/api/utilities/csv#delimiter-separated-output)
-[^7]: See [the examples in "IO tools"](https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html) in the Pandas documentation.
-[^8]: See [`pandas.read_csv`](https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html) in the Pandas documentation.
-[^9]: See [`pandas.DataFrame.to_json`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_json.html) in the Pandas documentation.
-[^10]: See [`json_to_sheet` in "Utilities"](/docs/api/utilities/array#array-of-objects-input)
-[^11]: See [`book_new` in "Utilities"](/docs/api/utilities/wb)
-[^12]: See [`write` in "Writing Files"](/docs/api/write-options)
+## Other Libraries
+
+Other Python DataFrame libraries mirror the Pandas DataFrame API.
+
+### Polars
+
+[Polars](https://pola.rs/) is a similar DataFrame library that offers many
+features from Pandas DataFrames.
+
+:::info pass
+
+Polars includes limited support for reading and writing spreadsheets by wrapping
+third-party libraries. In practice, Polars communicates with the third-party
+libraries using intermediate CSV files.[^12]
+
+**SheetJS supports many common spreadsheet formats that Polars cannot process.**
+
+SheetJS operations also offer more flexibility in processing complex worksheets.
+
+:::
+
+The Pandas example requires a few slight changes to work with Polars:
+
+- Polars DataFrames expose `write_json` instead of `to_json`:
+
+```diff
+- json = df.to_json(orient="records")
++ json = df.write_json(row_oriented=True)
+```
+
+- Polars DataFrames do not expose `info`
+
+#### Polars Demo
+
+:::note Tested Environments
+
+This demo was tested in the following deployments:
+
+| Architecture | JS Engine       | Polars | Python | Date       |
+|:-------------|:----------------|:-------|:-------|:-----------|
+| `darwin-x64` | Duktape `2.7.0` | 0.20.6 | 3.11.7 | 2024-01-30 |
+| `linux-x64`  | Duktape `2.7.0` | 0.20.6 | 3.11.3 | 2024-01-30 |
+
+:::
+
+0) Follow the [Pandas "Complete Example"](#complete-example) through the end.
+
+1) Edit `sheetjs.py`.
+
+Near the top of the script, change the import from `pandas` to `polars`:
+
+```diff title="sheetjs.py (apply changes)"
+-from pandas import read_csv
++from polars import read_csv
+```
+
+:::note pass
+
+The red lines starting with `-` should be removed from the file and the green
+lines starting with `+` should be added to the file. Black lines show the source
+context and should not be changed.
+
+:::
+
+Within the `export_df_to_wb` function, change the `df.to_json` line:
+
+```diff title="sheetjs.py (apply changes)"
+ def export_df_to_wb(ctx, df, path, sheet_name="Sheet1", book_type=None):
+- json = df.to_json(orient="records")
++ json = df.write_json(row_oriented=True)
+```
+
+2) Edit `SheetJSPandas.py`.
+
+In the script, change `df.info()` to `df`:
+
+```diff title="SheetJSPandas.py (apply changes)"
+ def export_df_to_wb(ctx, df, path, sheet_name="Sheet1", book_type=None):
+- print(df.info())
++ print(df)
+```
+
+Change the export filename from `SheetJSPandas.xlsb` to `SheetJSPolars.xlsb`:
+
+```diff
+ # Export DataFrame to XLSB
+- sheetjs.write_df(df, "SheetJSPandas.xlsb", sheet_name="DataFrame")
++ sheetjs.write_df(df, "SheetJSPolars.xlsb", sheet_name="DataFrame")
+```
+
+3) Install Polars:
+
+```bash
+sudo python3 -m pip install polars
+```
+
+:::caution pass
+
+On Arch Linux-based platforms including the Steam Deck, the install may fail:
+
+```
+error: externally-managed-environment
+```
+
+It is recommended to use a virtual environment for Polars:
+
+```bash
+mkdir sheetjs-polars
+cd sheetjs-polars
+python -m venv .
+./bin/pip install polars
+```
+
+:::
+
+4) Run the script:
+
+```bash
+python3 SheetJSPandas.py pres.numbers
+```
+
+:::note pass
+
+If the virtual environment was configured in the previous step, run:
+
+```bash
+./bin/python3 SheetJSPandas.py pres.numbers
+```
+
+:::
+
+If successful, the script will display DataFrame data:
+
+```
+shape: (5, 2)
+┌──────────────┬───────┐
+│ Name         ┆ Index │
+│ ---          ┆ ---   │
+│ str          ┆ i64   │
+╞══════════════╪═══════╡
+│ Bill Clinton ┆ 42    │
+│ GeorgeW Bush ┆ 43    │
+│ Barack Obama ┆ 44    │
+│ Donald Trump ┆ 45    │
+│ Joseph Biden ┆ 46    │
+└──────────────┴───────┘
+```
+
+It will also export the DataFrame to `SheetJSPolars.xlsb`. The file can be
+inspected with a spreadsheet editor that supports XLSB files.
+
+[^1]: See ["Other Languages"](/docs/demos/engines/) for more examples.
+[^2]: See [`ctypes`](https://docs.python.org/3/library/ctypes.html) in the Python documentation.
+[^3]: See [`read` in "Reading Files"](/docs/api/parse-options)
+[^4]: See ["Workbook Object"](/docs/csf/book)
+[^5]: See [`sheet_to_csv` in "Utilities"](/docs/api/utilities/csv#delimiter-separated-output)
+[^6]: See [the examples in "IO tools"](https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html) in the Pandas documentation.
+[^7]: See [`pandas.read_csv`](https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html) in the Pandas documentation.
+[^8]: See [`pandas.DataFrame.to_json`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_json.html) in the Pandas documentation.
+[^9]: See [`json_to_sheet` in "Utilities"](/docs/api/utilities/array#array-of-objects-input)
+[^10]: See [`book_new` in "Utilities"](/docs/api/utilities/wb)
+[^11]: See [`write` in "Writing Files"](/docs/api/write-options)
+[^12]: As explained [in the Polars documentation](https://docs.pola.rs/py-polars/html/reference/api/polars.read_excel.html), "... the target Excel sheet is first converted to CSV ... and then parsed with Polars’ `read_csv()` function."

@@ -82,7 +82,7 @@ Each browser demo was tested in the following environments:

 | Browser     | Date       |
 |:------------|:-----------|
-| Chrome 120  | 2024-01-15 |
+| Chrome 120  | 2024-01-30 |
 | Safari 17.2 | 2024-01-15 |

 :::
@@ -118,7 +118,7 @@ function SheetJSXHRDL() {
 const [__html, setHTML] = React.useState("");

 /* Fetch and update HTML */
-React.useEffect(async() => {
+React.useEffect(() => { (async() => {
 /* Fetch file */
 const req = new XMLHttpRequest();
 req.open("GET", "https://sheetjs.com/pres.numbers", true);
@@ -132,7 +132,7 @@ function SheetJSXHRDL() {
 setHTML(XLSX.utils.sheet_to_html(ws));
 };
 req.send();
-}, []);
+})(); }, []);

 return ( <div dangerouslySetInnerHTML={{ __html }}/> );
 }
@@ -170,7 +170,7 @@ function SheetJSFetchDL() {
 const [__html, setHTML] = React.useState("");

 /* Fetch and update HTML */
-React.useEffect(async() => {
+React.useEffect(() => { (async() => {
 /* Fetch file */
 const res = await fetch("https://sheetjs.com/pres.numbers");
 const ab = await res.arrayBuffer();
@@ -181,7 +181,7 @@ function SheetJSFetchDL() {

 /* Generate HTML */
 setHTML(XLSX.utils.sheet_to_html(ws));
-}, []);
+})(); }, []);

 return ( <div dangerouslySetInnerHTML={{ __html }}/> );
 }
@@ -276,7 +276,7 @@ function SheetJSAxiosDL() {
 const [__html, setHTML] = React.useState("");

 /* Fetch and update HTML */
-React.useEffect(async() => {
+React.useEffect(() => { (async() => {
 if(typeof axios != "function") return setHTML("ReferenceError: axios is not defined");
 /* Fetch file */
 const res = await axios("https://sheetjs.com/pres.numbers", {responseType: "arraybuffer"});
@@ -287,7 +287,7 @@ function SheetJSAxiosDL() {

 /* Generate HTML */
 setHTML(XLSX.utils.sheet_to_html(ws));
-}, []);
+})(); }, []);

 return ( <div dangerouslySetInnerHTML={{ __html }}/> );
 }
@@ -337,7 +337,7 @@ function SheetJSSuperAgentDL() {
 const [__html, setHTML] = React.useState("");

 /* Fetch and update HTML */
-React.useEffect(async() => {
+React.useEffect(() => { (async() => {
 if(typeof superagent == "undefined" || typeof superagent.get != "function")
 return setHTML("ReferenceError: superagent is not defined");
 /* Fetch file */
@@ -352,7 +352,7 @@ function SheetJSSuperAgentDL() {
 /* Generate HTML */
 setHTML(XLSX.utils.sheet_to_html(ws));
 });
-}, []);
+})(); }, []);

 return ( <div dangerouslySetInnerHTML={{ __html }}/> );
 }

@@ -231,7 +231,7 @@ please refresh the page. This is a known bug in the documentation generator.
 function SheetJSEnregistrez() {
 const [msg, setMsg] = React.useState("Press the button to write XLS file");
 const btn = useRef(), tbl = useRef();
-React.useEffect(async() => {
+React.useEffect(() => { (async() => {
 if(typeof Dropbox == "undefined") return setMsg("Dropbox is not defined");
 /* fetch data and write table (sample data) */
 const f = await(await fetch("https://sheetjs.com/pres.xlsx")).arrayBuffer();
@@ -255,7 +255,7 @@ function SheetJSEnregistrez() {
 });
 /* add button to page */
 btn.current.appendChild(button);
-}, []);
+})(); }, []);
 return ( <><b>{msg}</b><br/><div ref={btn}/><div ref={tbl}/></> );
 }
 ```

@@ -155,9 +155,9 @@ function ConcatFormula(props) {
 };

 /* Fetch sample file */
-React.useEffect(async() => {
+React.useEffect(() => {(async() => {
 process_ab(await (await fetch("/files/concat.xlsx")).arrayBuffer());
-}, []);
+})(); }, []);
 const process_file = async(e) => {
 process_ab(await e.target.files[0].arrayBuffer());
 };
@@ -415,14 +415,14 @@ function Translator(props) {
 const [names, setNames] = React.useState([]);
 const [name, setName] = React.useState("Enter a function name");
 /* Fetch and display formula */
-React.useEffect(async() => {
+React.useEffect(() => { (async() => {
 /* Fetch data */
 const json = await (await fetch("https://oss.sheetjs.com/notes/fmla/table.json")).json();
 setLocales(Object.keys(json));
 setData(json);
 setNames(json.en);
 setName(json.es[0])
-}, []);
+})(); }, []);

 const update_name = React.useCallback(() => {
 const nameelt = document.getElementById("fmla");

@@ -110,14 +110,14 @@ function Visibility(props) {
 const [sheets, setSheets] = React.useState([]);
 const vis = [ "Visible", "Hidden", "Very Hidden" ];

-React.useEffect(async() => {
+React.useEffect(() => { (async() => {
 const f = await fetch("/files/sheet_visibility.xlsx");
 const ab = await f.arrayBuffer();
 const wb = XLSX.read(ab);
 setWB(wb);
 /* State will be set to the `Sheets` property array */
 setSheets(wb.Workbook.Sheets);
-}, []);
+})(); }, []);

 return (<table>
 <thead><tr><th>Name</th><th>Value</th><th>Hidden</th></tr></thead>

@@ -185,22 +185,26 @@ using the `puppeteer` and `playwright` browser automation frameworks.

 <details open><summary><b>Live Example</b> (click to hide)</summary>

+This example uses a ReactJS `ref` to reference the HTML TABLE element. ReactJS
+details are covered in the [ReactJS demo](/docs/demos/frontend/react#html)
+
 ```jsx live
 /* The live editor requires this function wrapper */
 function Table2XLSX(props) {
+/* reference to the table element */
+const tbl = React.useRef();

 /* Callback invoked when the button is clicked */
 const xport = React.useCallback(() => {
 /* Create worksheet from HTML DOM TABLE */
-const table = document.getElementById("Table2XLSX");
-const wb = XLSX.utils.table_to_book(table);
+const wb = XLSX.utils.table_to_book(tbl.current);

 /* Export to file (start a download) */
 XLSX.writeFile(wb, "SheetJSTable.xlsx");
 });

 return ( <>
-<table id="Table2XLSX"><tbody>
+<table ref={tbl}><tbody>
 <tr><td colSpan="3">SheetJS Table Export</td></tr>
 <tr><td>Author</td><td>ID</td><td>你好!</td></tr>
 <tr><td>SheetJS</td><td>7262</td><td>வணக்கம்!</td></tr>
@@ -267,7 +271,7 @@ function Numbers2HTML(props) {
 const [__html, setHTML] = React.useState("");

 /* Fetch and update HTML */
-React.useEffect(async() => {
+React.useEffect(() => { (async() => {
 /* Fetch file */
 const f = await fetch("https://sheetjs.com/pres.numbers");
 const ab = await f.arrayBuffer();
@@ -278,7 +282,7 @@ function Numbers2HTML(props) {

 /* Generate HTML */
 setHTML(XLSX.utils.sheet_to_html(ws));
-}, []);
+})(); }, []);

 return ( <div dangerouslySetInnerHTML={{ __html }}/> );
 }