V8 Python Binding Demo

SheetJS 2024-10-20 13:40:09 -04:00
parent d0f75f27c3
commit 8ccb92b3e5
10 changed files with 378 additions and 73 deletions

13
docz/data/bindings.js Normal file

@ -0,0 +1,13 @@
import url from './engines.xls';
import React, { useEffect, useState } from 'react';
const BindingData = () => {
const [binding, setBinding] = useState("");
useEffect(() => { (async() => {
const html = await (await fetch(url)).json();
setBinding(html["Bindings"]);
})(); }, []);
return ( <p dangerouslySetInnerHTML={{__html: binding}}/> );
};
export default BindingData;

@ -3,19 +3,11 @@ import React, { useEffect, useState } from 'react';
const EngineData = () => {
const [engines, setEngines] = useState("");
const [binding, setBinding] = useState("");
useEffect(() => { (async() => {
const html = await (await fetch(url)).json();
setEngines(html["Engines"]);
setBinding(html["Bindings"]);
})(); }, []);
return ( <>
<p>The following engines have been tested in their native languages:</p>
<div dangerouslySetInnerHTML={{__html: engines}}/>
<p>The following bindings have been tested:</p>
<div dangerouslySetInnerHTML={{__html: binding}}/>
<p>Asterisks (✱) in the Windows columns mark tests that were run in Windows Subsystem for Linux (WSL)</p>
</> );
return ( <p dangerouslySetInnerHTML={{__html: engines}}/> );
};
export default EngineData;

@ -244,7 +244,7 @@
</WorksheetOptions>
</Worksheet>
<Worksheet ss:Name="Bindings">
<Table ss:ExpandedColumnCount="8" ss:ExpandedRowCount="15" x:FullColumns="1" x:FullRows="1" ss:DefaultColumnWidth="65" ss:DefaultRowHeight="16">
<Table ss:ExpandedColumnCount="8" ss:ExpandedRowCount="16" x:FullColumns="1" x:FullRows="1" ss:DefaultColumnWidth="65" ss:DefaultRowHeight="16">
<Column ss:Index="3" ss:Width="24"/>
<Column ss:Width="31"/>
<Column ss:Width="24"/>
@ -337,6 +337,16 @@
<Cell ss:StyleID="s16"><Data ss:Type="String">✔</Data></Cell>
<Cell ss:StyleID="s16"><Data ss:Type="String">✔</Data></Cell>
</Row>
<Row>
<Cell ss:StyleID="s20" ss:HRef="/docs/demos/engines/v8#python"><Data ss:Type="String">V8</Data></Cell>
<Cell><Data ss:Type="String">Python</Data></Cell>
<Cell ss:StyleID="s16"><Data ss:Type="String"></Data></Cell>
<Cell ss:StyleID="s16"><Data ss:Type="String">✔</Data></Cell>
<Cell ss:StyleID="s16"><Data ss:Type="String"></Data></Cell>
<Cell ss:StyleID="s16"><Data ss:Type="String"></Data></Cell>
<Cell ss:StyleID="s16"><Data ss:Type="String"></Data></Cell>
<Cell ss:StyleID="s16"><Data ss:Type="String"></Data></Cell>
</Row>
<Row>
<Cell ss:StyleID="s20" ss:HRef="/docs/demos/engines/jsc#swift"><Data ss:Type="String">JSC</Data></Cell>
<Cell><Data ss:Type="String">Swift</Data></Cell>

@ -134,9 +134,31 @@ sap.ui.define([
"path/to/xlsx.full.min"
], function(/* ... variables for the other libraries ... */, XLSX) {
// use XLSX here
})
});
```
:::caution pass
In some deployments, the function argument was `undefined`.
The standalone scripts add `window.XLSX`, so it is recommended to use `_XLSX`
in the function arguments and access the library with `XLSX` in the callback:
```js
sap.ui.define([
/* ... other libraries ... */
"path/to/xlsx.full.min"
], function(
/* ... variables for the other libraries ... */,
_XLSX // !! NOTE: this is not XLSX! A different variable name must be used
) {
// highlight-next-line
alert(XLSX.version); // use XLSX in the callback
});
```
:::
:::danger pass
**Copying and pasting code does not work** for SheetJS scripts as they contain

@ -59,7 +59,7 @@ can be loaded in the root HTML page (typically `wwwroot/index.html`):
#### ECMAScript Module
The SheetJS ECMAScript module script can be dynamically imported from functions.
This ensures the library is only loaded when necessary. The following example
This ensures the library is only loaded when necessary. The following JS example
loads the library and returns a Promise that resolves to the version string:
<CodeBlock language="js">{`\
@ -75,6 +75,22 @@ async function sheetjs_version(id) {
### Calling JS from C#
Callbacks for events in Razor elements invoke C# methods. The C# methods can use
Blazor APIs to invoke JS methods that are visible in the browser global scope.
```mermaid
sequenceDiagram
actor U as User
participant P as Browser
participant A as Blazor
U-->>P: click button
P-->>A: click event
Note over A: C#35; callback<br/><br/>InvokeVoidAsync
A->>P: call JS function
Note over P: global method<br/><br/>SheetJS logic
P->>U: download workbook
```
#### Setup
The primary mechanism for invoking JS functions from Blazor is `IJSRuntime`[^1].
@ -88,7 +104,9 @@ It should be injected at the top of relevant Razor component scripts:
When exporting a file with the SheetJS `writeFile` method[^2], browser APIs do
not provide success or error feedback. As a result, this demo invokes functions
using the `InvokeVoidAsync` static method[^3]:
using the `InvokeVoidAsync` static method[^3].
The following C# method will invoke the `export_method` method in the browser:
```csharp title="Invoking JS functions from C#"
private async Task ExportDataset() {
@ -96,7 +114,54 @@ private async Task ExportDataset() {
}
```
Methods are commonly bound to buttons in the Razor template using `@onclick`:
:::caution pass
**The JS methods must be defined in the global scope!**
In this demo, the script is added to the `HEAD` block of the root HTML file:
```html title="wwwroot/index.html"
<head>
<!-- ... meta / title / base / link tags -->
<link href="SheetJSBlazorWasm.styles.css" rel="stylesheet" />
<!-- highlight-start -->
<!-- script with `export_method` is in the HEAD block -->
<script>
/* In a normal script tag, Blazor JS can call this method */
async function export_method(...rows) {
/* display the array of objects */
console.log(rows);
}
</script>
<!-- highlight-end -->
</head>
```
When using `<script type="module">`, top-level function definitions are not
visible to Blazor by default. They must be attached to `globalThis`:
```html title="Attaching methods to globalThis"
<script type="module">
/* Using `type="module"`, Blazor JS cannot see this function definition */
async function export_method(...rows) {
/* display the array of objects */
console.log(rows);
}
// highlight-start
/* Once attached to `globalThis`, Blazor JS can call this method */
globalThis.export_method = export_method;
// highlight-end
</script>
```
:::
#### Blazor Callbacks
Methods are commonly bound to buttons in the Razor template using `@onclick`.
When the following button is clicked, Blazor will invoke `ExportDataset`:
```html title="Binding callback to an HTML button"
<button @onclick="ExportDataset">Export Dataset</button>

@ -32,10 +32,10 @@ can be parsed and evaluated in a V8 context.
:::note pass
This section describes a flow where the script is parsed and evaluated every
time the program is run.
This section describes a flow where the script is parsed and evaluated each time
the program is run.
Using V8 snapshots, SheetJS libraries can be parsed and evaluated beforehand.
Using V8 snapshots, SheetJS libraries can be parsed and evaluated at build time.
This greatly improves program startup time.
The ["Snapshots"](#snapshots) section includes a complete example.
@ -96,30 +96,40 @@ To confirm the library is loaded, `XLSX.version` can be inspected:
### Reading Files
V8 supports `ArrayBuffer` natively. Assuming `buf` is a C byte array, with
length `len`, this snippet stores the data as an `ArrayBuffer` in global scope:
length `len`, the following code stores the data in a global `ArrayBuffer`:
```cpp
```cpp title="Loading data into an ArrayBuffer in the V8 engine"
/* load C char array and save to an ArrayBuffer */
std::unique_ptr<v8::BackingStore> back = v8::ArrayBuffer::NewBackingStore(isolate, len);
memcpy(back->Data(), buf, len);
v8::Local<v8::ArrayBuffer> ab = v8::ArrayBuffer::New(isolate, std::move(back));
v8::Maybe<bool> res = context->Global()->Set(context, v8::String::NewFromUtf8Literal(isolate, "buf"), ab);
```
Once the raw data is pulled into the engine, the SheetJS `read` method[^1] can
parse the data. It is recommended to attach the result to a global variable:
```cpp
/* parse with SheetJS */
v8::Local<v8::Value> result = eval_code(isolate, context, "globalThis.wb = XLSX.read(buf)");
```
`wb` will be a variable in the JS environment that can be inspected using the
various SheetJS API functions.
`wb`, a SheetJS workbook object[^2], will be a variable in the JS environment
that can be inspected using the various SheetJS API functions[^3].
### Writing Files
The underlying memory from an `ArrayBuffer` can be recovered:
The SheetJS `write` method[^4] generates file bytes from workbook objects. The
`array` type[^5] instructs the library to generate `ArrayBuffer` objects:
```c
```cpp
/* write with SheetJS using type: "array" */
v8::Local<v8::Value> result = eval_code(isolate, context, "XLSX.write(wb, {type:'array', bookType:'xlsb'})");
```
The underlying memory from an `ArrayBuffer` can be pulled from the engine:
```cpp title="Pulling raw bytes from an ArrayBuffer"
/* pull result back to C++ */
v8::Local<v8::ArrayBuffer> ab = v8::Local<v8::ArrayBuffer>::Cast(result);
size_t sz = ab->ByteLength();
@ -150,13 +160,13 @@ generates an XLSB file and writes to the filesystem.
:::caution pass
When the demo was last tested, there were errors in the official V8 embed guide.
The correct instructions are included below.
Corrected instructions are included below.
:::
:::caution pass
:::danger pass
The build process is long and will test your patience.
**The build process is long and will test your patience.**
:::
@ -894,7 +904,7 @@ may not work on every platform.
### Rust
The `v8` crate provides binary builds and straightforward bindings. The Rust
The `v8` crate[^6] provides binary builds and straightforward bindings. The Rust
code is similar to the C++ code.
Pulling data from an `ArrayBuffer` back into Rust involves an unsafe operation:
@ -1125,7 +1135,8 @@ If the program succeeded, the CSV contents will be printed to console.
### C#
ClearScript provides a .NET interface to the V8 engine.
[ClearScript](https://microsoft.github.io/ClearScript/) is a .NET interface to
the V8 engine.
C# byte arrays (`byte[]`) must be explicitly converted to arrays of bytes:
@ -1231,8 +1242,8 @@ dotnet run
dotnet add package Microsoft.ClearScript.Complete --version 7.4.5
```
5) Download the SheetJS standalone script and test file. Move all three files to
the project directory:
5) Download the SheetJS standalone script and test file. Move both files to the
project directory:
<ul>
<li><a href={`https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js`}>xlsx.full.min.js</a></li>
@ -1282,6 +1293,139 @@ If successful, the program will print the contents of the first sheet as CSV
rows. It will also create `SheetJSClearScript.xlsb`, a workbook that can be
opened in a spreadsheet editor.
### Python
[`pyv8`](https://code.google.com/archive/p/pyv8/) is a Python wrapper for V8.
The `stpyv8` package[^7] is an actively-maintained fork with binary wheels.
:::caution pass
When this demo was last tested, there was no direct conversion between Python
`bytes` and JavaScript `ArrayBuffer` data.
This is a known issue[^8]. The current recommendation is Base64 strings.
:::
#### Python Base64 Strings
The SheetJS `read`[^1] and `write`[^4] methods support Base64 strings through
the `base64` type[^5].
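As a rough illustration of the Base64 bridge, the following sketch round-trips raw bytes through a Base64 string using only the Python standard library. No V8 engine is involved. The helper names `to_base64` and `from_base64` are hypothetical and are not part of SheetJS or STPyV8.

```python
from base64 import b64decode, b64encode

def to_base64(data: bytes) -> str:
    """Encode raw bytes as an ASCII Base64 string that can cross into JS."""
    return b64encode(data).decode("ascii")

def from_base64(text: str) -> bytes:
    """Decode a Base64 string (e.g. returned from JS) back into raw bytes."""
    return b64decode(text)

# Round trip: arbitrary binary data survives the string representation
payload = bytes(range(256))
encoded = to_base64(payload)
assert from_base64(encoded) == payload
```

Because the encoded value is a plain ASCII string, it can be passed between Python and JS without a shared binary type.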
_Reading Files_
It is recommended to create a global context with a special method that handles
file reading from Python. The `read_file` helper in the following snippet will
read bytes from `sheetjs.xlsx` and generate a Base64 string:
```py
from base64 import b64encode;
from STPyV8 import JSContext, JSClass;
# Create context with methods for file i/o
class Base64Context(JSClass):
def read_file(self, path):
with open(path, "rb") as f:
data = f.read();
return b64encode(data).decode("ascii");
globals = Base64Context();
# The JSContext starts and cleans up the V8 engine
with JSContext(globals) as ctxt:
print(ctxt.eval("read_file('sheetjs.xlsx')")); # read base64 data and print
```
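The snippet above hard-codes the file name inside the JS source. When the path comes from user input, quotes or backslashes in the path could break the evaluated expression. One possible approach, sketched below, is to serialize each argument with `json.dumps`, whose default ASCII-only output is also a valid JavaScript literal. The `js_call` helper is hypothetical and is not part of STPyV8.

```python
import json

def js_call(func: str, *args) -> str:
    """Build JS source that calls `func` with JSON-serialized arguments."""
    return f"{func}({', '.join(json.dumps(arg) for arg in args)})"

# Quotes and backslashes in the path are escaped by json.dumps
expr = js_call("read_file", "it's a \"test\".xlsx")
print(expr)
```

The resulting string could be passed to `ctxt.eval` in place of the hard-coded expression.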
_Writing Files_
Since the SheetJS `write` method returns a Base64 string, the result can be
decoded and written to file from Python:
```py
from base64 import b64decode;
from STPyV8 import JSContext;
# The JSContext starts and cleans up the V8 engine
with JSContext() as ctxt:
# ... initialization and workbook creation ...
xlsb = ctxt.eval("XLSX.write(wb, {type: 'base64', bookType: 'xlsb'})");
with open("SheetJSSTPyV8.xlsb", "wb") as f:
f.write(b64decode(xlsb));
```
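Before writing to disk, it can be useful to sanity-check the decoded bytes. XLSB files are ZIP containers, so the decoded data is expected to start with the ZIP local file header signature `PK\x03\x04`. The following is a minimal sketch that uses an in-memory ZIP file as a stand-in for the `XLSX.write` result. The `looks_like_zip` helper and the stand-in data are hypothetical.

```python
import io
import zipfile
from base64 import b64decode, b64encode

ZIP_SIGNATURE = b"PK\x03\x04"  # ZIP local file header magic number

def looks_like_zip(b64_text: str) -> bool:
    """Return True if the Base64 payload decodes to data with a ZIP header."""
    return b64decode(b64_text).startswith(ZIP_SIGNATURE)

# Stand-in for the Base64 string returned by XLSX.write: a tiny in-memory ZIP
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("hello.txt", "hello")
fake_xlsb = b64encode(buf.getvalue()).decode("ascii")

print(looks_like_zip(fake_xlsb))
```

This check only inspects the leading bytes. It does not validate the workbook contents.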
#### Python Demo
:::note Tested Deployments
This demo was last tested in the following deployments:
| Architecture | V8 Version | Python | Date |
|:-------------|:--------------|:---------|:-----------|
| `darwin-arm` | `13.0.245.16` | `3.13.0` | 2024-10-20 |
:::
0) Make a new folder for the project:
```bash
mkdir sheetjs-stpyv8
cd sheetjs-stpyv8
```
1) Install `stpyv8`:
```bash
pip install stpyv8
```
:::caution pass
The install may fail with an `externally-managed-environment` error:
```
error: externally-managed-environment
× This environment is externally managed
```
The wheel can be downloaded and forcefully installed. The following commands
download and install version `13.0.245.16` for Python `3.13` on `darwin-arm`:
```bash
curl -LO https://github.com/cloudflare/stpyv8/releases/download/v13.0.245.16/stpyv8-13.0.245.16-cp313-cp313-macosx_14_0_arm64.whl
sudo python -m pip install --upgrade stpyv8-13.0.245.16-cp313-cp313-macosx_14_0_arm64.whl --break-system-packages
```
:::
2) Download the SheetJS standalone script and test file. Move both files to the
project directory:
<ul>
<li><a href={`https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js`}>xlsx.full.min.js</a></li>
<li><a href="https://docs.sheetjs.com/pres.xlsx">pres.xlsx</a></li>
</ul>
<CodeBlock language="bash">{`\
curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js
curl -LO https://docs.sheetjs.com/pres.xlsx`}
</CodeBlock>
3) Download [`sheetjs-stpyv8.py`](pathname:///v8/sheetjs-stpyv8.py):
```bash
curl -LO https://docs.sheetjs.com/v8/sheetjs-stpyv8.py
```
4) Run the script and pass `pres.xlsx` as the first argument:
```bash
python sheetjs-stpyv8.py pres.xlsx
```
The script will display CSV rows from the first worksheet. It will also create
`SheetJSSTPyV8.xlsb`, a workbook that can be opened with a spreadsheet editor.
## Snapshots
At a high level, V8 snapshots are raw dumps of the V8 engine state. It is much
@ -1403,3 +1547,12 @@ mv target/release/sheet2csv.exe .
</TabItem>
</Tabs>
[^1]: See [`read` in "Reading Files"](/docs/api/parse-options)
[^2]: See ["SheetJS Data Model"](/docs/csf) for more details on the object representation.
[^3]: See ["API Reference"](/docs/api) for a list of functions that ship with the library. ["Spreadsheet Features"](/docs/csf/features) covers workbook and worksheet features that can be modified directly.
[^4]: See [`write` in "Writing Files"](/docs/api/write-options)
[^5]: See ["Supported Output Formats" type in "Writing Files"](/docs/api/write-options#supported-output-formats)
[^6]: The project does not have an official website. The [official Rust crate](https://crates.io/crates/v8) is hosted on `crates.io`.
[^7]: The project does not have a separate website. The source repository is hosted on [GitHub](https://github.com/cloudflare/stpyv8)
[^8]: According to a maintainer, [typed arrays were not supported in the original `pyv8` project](https://github.com/cloudflare/stpyv8/issues/104#issuecomment-2059125389)

@ -22,6 +22,41 @@ This demo uses JSC and SheetJS to read and write spreadsheets. We'll explore how
to load SheetJS in a JSC context and process spreadsheets and structured data
from C++ and Swift programs.
:::note pass
This demo was tested in the following environments:
[**Swift Built-in**](#swift)
Swift on MacOS supports JavaScriptCore without additional dependencies.
| Architecture | Swift | Date |
|:-------------|:--------|:-----------|
| `darwin-x64` | `5.10` | 2024-04-04 |
| `darwin-arm` | `5.10` | 2024-06-30 |
[**C / C++ Compiled from Source**](#c)
JavaScriptCore can be built from source and linked in C / C++ programs.
| Architecture | Version | Date |
|:-------------|:-----------------|:-----------|
| `darwin-x64` | `7618.1.15.14.7` | 2024-04-24 |
| `darwin-arm` | `7618.2.12.11.7` | 2024-05-24 |
| `linux-x64` | `7618.2.12.11.7` | 2024-06-22 |
| `linux-arm` | `7618.2.12.11.7` | 2024-06-22 |
[**Swift Compiled from Source**](#swift-c)
The Swift compiler can link against libraries built from the JavaScriptCore source.
| Architecture | Version | Date |
|:-------------|:-----------------|:-----------|
| `linux-x64` | `7618.2.12.11.7` | 2024-06-22 |
| `linux-arm` | `7618.2.12.11.7` | 2024-06-22 |
:::
## Integration Details
The [SheetJS Standalone scripts](/docs/getting-started/installation/standalone)
@ -308,31 +343,6 @@ FILE *f = fopen("sheetjsw.xlsb", "wb"); fwrite(buf, 1, sz, f); fclose(f);
### Swift
:::note pass
This demo was tested in the following environments:
**Built-in**
Swift on MacOS supports JavaScriptCore without additional dependencies.
| Architecture | Swift | Date |
|:-------------|:--------|:-----------|
| `darwin-x64` | `5.10` | 2024-04-04 |
| `darwin-arm` | `5.10` | 2024-06-30 |
**Compiled**
The ["Swift C"](#swift-c) section starts from the static libraries built in the
["C++"](#c) section and builds Swift bindings.
| Architecture | Version | Date |
|:-------------|:-----------------|:-----------|
| `linux-x64` | `7618.2.12.11.7` | 2024-06-22 |
| `linux-arm` | `7618.2.12.11.7` | 2024-06-22 |
:::
The demo includes a sample `SheetJSCore` Wrapper class to simplify operations.
:::caution This demo only runs on MacOS
@ -399,19 +409,6 @@ to `SheetJSwift.xlsx`. That file can be verified by opening in Excel / Numbers.
### C++
:::note pass
This demo was tested in the following environments:
| Architecture | Version | Date |
|:-------------|:-----------------|:-----------|
| `darwin-x64` | `7618.1.15.14.7` | 2024-04-24 |
| `darwin-arm` | `7618.2.12.11.7` | 2024-05-24 |
| `linux-x64` | `7618.2.12.11.7` | 2024-06-22 |
| `linux-arm` | `7618.2.12.11.7` | 2024-06-22 |
:::
0) Install dependencies
<details>

@ -95,8 +95,7 @@ This demo was tested in the following deployments:
| `linux-arm` | `3.1.2` | `2.9.1` | 2024-05-25 |
When the demo was last tested, there was no official Ruby release for Windows
on ARM. The `win11-arm` test was run in WSL. The `win10-x64` test used the
official Ruby for Windows x64 release.
on ARM. The `win11-arm` test was run in WSL.
:::

@ -5,6 +5,7 @@ pagination_next: solutions/input
---
import EngineData from '/data/engines.js'
import BindingData from '/data/bindings.js'
[SheetJS](https://sheetjs.com) is a JavaScript library for reading and writing
data from spreadsheets.
@ -85,8 +86,27 @@ across multiple architectures (x64 and ARM64).
:::
The following engines have been tested in their native languages:
<EngineData/>
The following bindings have been tested:
<BindingData/>
:::note pass
Asterisks (✱) in the Windows columns mark tests that were run in Windows
Subsystem for Linux (WSL). In some cases, community efforts have produced forks
with native Windows support.
Blank cells mark untested or unsupported configurations. With cross-compilation,
V8 can run natively in Windows on ARM. The `win11-arm` platform is not tested
since the official build infrastructure does not support Windows on ARM and the
V8 project does not distribute shared or static libraries for Windows on ARM.
:::
#### Boa
Boa is an embeddable JS engine written in Rust.
@ -165,9 +185,7 @@ V8 is an embeddable JS engine written in C++. It powers Chromium and Chrome,
NodeJS and Deno, Adobe UXP and other platforms.
This demo has been moved [to a dedicated page](/docs/demos/engines/v8).
The demo includes examples in C++ and Rust.
The ["Python + Pandas" demo](/docs/demos/math/pandas) uses V8 with Python.
The demo includes examples in C++, C#, Python, and Rust.
[^1]: See ["Initialize Hermes"](/docs/demos/engines/hermes#initialize-hermes) in the Hermes demo.
[^2]: See [`read` in "Reading Files"](/docs/api/parse-options)

@ -0,0 +1,36 @@
from sys import stderr, argv;
from base64 import b64encode, b64decode;
from STPyV8 import JSContext, JSClass;
# Create context with methods for file i/o
class Base64Context(JSClass):
def read_file(self, path):
with open(path, "rb") as f:
data = f.read();
return b64encode(data).decode("ascii");
def write_file(self, data, path):
with open(path, "wb") as f:
f.write(b64decode(data));
globals = Base64Context();
# Read xlsx.full.min.js
with open("xlsx.full.min.js", "r") as f:
sheetjs = f.read();
# The JSContext starts and cleans up the V8 engine
with JSContext(globals) as ctxt:
# Load SheetJS library and display version number
ctxt.eval(sheetjs);
version = ctxt.eval("XLSX.version");
print(f"SheetJS Version: {version}", file=stderr);
# Parse workbook
ctxt.eval(f"globalThis.wb = XLSX.read(read_file('{argv[1]}'), {{type:'base64'}}); void 0;");
# Print CSV from first worksheet
csv = ctxt.eval("XLSX.utils.sheet_to_csv(wb.Sheets[wb.SheetNames[0]]);");
print(csv);
# Generate XLSB
xlsb = ctxt.eval("XLSX.write(wb, {type: 'base64', bookType: 'xlsb'})");
globals.write_file(xlsb,"SheetJSSTPyV8.xlsb");