V8 Java Binding demo
parent 30827f4b7f
commit 234c63dcaa
@@ -244,7 +244,7 @@
</WorksheetOptions>
</Worksheet>
<Worksheet ss:Name="Bindings">
<Table ss:ExpandedColumnCount="8" ss:ExpandedRowCount="12" x:FullColumns="1" x:FullRows="1" ss:DefaultColumnWidth="65" ss:DefaultRowHeight="16">
<Table ss:ExpandedColumnCount="8" ss:ExpandedRowCount="15" x:FullColumns="1" x:FullRows="1" ss:DefaultColumnWidth="65" ss:DefaultRowHeight="16">
<Column ss:Index="3" ss:Width="24"/>
<Column ss:Width="31"/>
<Column ss:Width="24"/>
@@ -317,6 +317,16 @@
<Cell ss:StyleID="s16"><Data ss:Type="String">✔</Data></Cell>
<Cell ss:StyleID="s16"><Data ss:Type="String">✔</Data></Cell>
</Row>
<Row>
<Cell ss:StyleID="s20" ss:HRef="/docs/demos/engines/v8#java"><Data ss:Type="String">V8</Data></Cell>
<Cell><Data ss:Type="String">Java</Data></Cell>
<Cell ss:StyleID="s16"><Data ss:Type="String">✔</Data></Cell>
<Cell ss:StyleID="s16"><Data ss:Type="String">✔</Data></Cell>
<Cell ss:StyleID="s16"/>
<Cell ss:StyleID="s16"/>
<Cell ss:StyleID="s16"/>
<Cell ss:StyleID="s16"/>
</Row>
<Row>
<Cell ss:StyleID="s20" ss:HRef="/docs/demos/engines/jsc#swift"><Data ss:Type="String">JSC</Data></Cell>
<Cell><Data ss:Type="String">Swift</Data></Cell>
@@ -23,8 +23,11 @@ In ["SheetJS Conversion"](#sheetjs-conversion), we will use SheetJS libraries to
generate CSV files for the LangChain CSV loader. These conversions can be run in
a preprocessing step without disrupting existing CSV workflows.

In ["SheetJS Loader"](#sheetjs-loader), we will use SheetJS libraries in a custom
loader to directly generate documents and metadata.
In ["SheetJS Loader"](#sheetjs-loader), we will use SheetJS libraries in a
custom loader to directly generate documents and metadata.

["SheetJS Loader Demo"](#sheetjs-loader-demo) is a complete demo that uses the
SheetJS Loader to answer questions based on data from an XLS workbook.

:::note Tested Deployments

@@ -34,6 +37,7 @@ This demo was tested in the following configurations:
|:-----------|:--------------------------------------------------------------|
| 2024-06-19 | Apple M2 Max 12-Core CPU + 30-Core GPU (32 GB unified memory) |
| 2024-06-19 | NVIDIA RTX 4080 SUPER (16 GB VRAM) + i9-10910 (128 GB RAM) |
| 2024-06-19 | NVIDIA RTX 3090 (24 GB VRAM) + Ryzen 9 3900XT (128 GB RAM) |

This explanation was verified against LangChain 0.2.

@@ -103,7 +107,8 @@ Document {
The [SheetJS NodeJS module](/docs/getting-started/installation/nodejs) can be
imported in NodeJS scripts that use LangChain and other JavaScript libraries.

A simple pre-processing step can convert workbooks to spreadsheets
A simple pre-processing step can convert workbooks to CSV files that can be
processed by the existing CSV tooling:

```mermaid
flowchart LR
@@ -150,6 +155,23 @@ const csv = utils.sheet_to_csv(first_ws);
console.log(csv);
```

:::note pass

A number of demos cover spiritually similar workflows:

- [Stata](/docs/demos/extensions/stata), [MATLAB](/docs/demos/extensions/matlab)
and [Maple](/docs/demos/extensions/maple/) support XLSX data import. The SheetJS
integrations generate clean XLSX workbooks from user-supplied spreadsheets.

- [TensorFlow.js](/docs/demos/math/tensorflow), [Pandas](/docs/demos/math/pandas)
and [Mathematica](/docs/demos/extensions/mathematica) support CSV. The SheetJS
integrations generate clean CSVs and use built-in CSV processors.

- The ["Command-Line Tools"](/docs/demos/cli/) demo covers techniques for making
standalone command-line tools for file conversion.

:::

### Single Worksheet

For a single worksheet, a SheetJS pre-processing step can write the CSV rows to
@@ -257,6 +279,17 @@ The demo [`LoadOfSheet` loader](pathname:///loadofsheet/loadofsheet.mjs) will
generate one Document per data row across all worksheets. It will also attempt
to build metadata and attributes for use in self-querying retrievers.

```js title="Sample usage"
/* read and parse `data.xlsb` */
const loader = new LoadOfSheet("./data.xlsb");

/* generate documents */
const docs = await loader.load();

/* synthesized attributes for the SelfQueryRetriever */
const attributes = loader.attributes;
```
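
The "one Document per data row" idea can be sketched without LangChain or SheetJS. This is not the `LoadOfSheet` implementation: `rowsToDocuments`, the `key: value` text layout, the metadata field names and the sample rows are all hypothetical, chosen only to illustrate the shape of the output. The input is assumed to be arrays of row objects, the default shape returned by `XLSX.utils.sheet_to_json`.

```js
/* Hypothetical helper (NOT the LoadOfSheet code): one document per data row */
function rowsToDocuments(sheets, source) {
  const docs = [];
  for (const [sheetName, rows] of Object.entries(sheets)) {
    rows.forEach((row, idx) => {
      /* assumed text layout: one "header: value" line per column */
      const pageContent = Object.entries(row)
        .map(([key, value]) => `${key}: ${value}`)
        .join("\n");
      /* assumed metadata fields: file name, worksheet name, 1-based row number */
      docs.push({ pageContent, metadata: { source, sheet: sheetName, row: idx + 1 } });
    });
  }
  return docs;
}

/* made-up rows standing in for parsed worksheets */
const sheets = {
  People: [ { Name: "Ada", Index: 1 }, { Name: "Grace", Index: 2 } ],
  Notes: [ { Note: "sample" } ]
};

const docs = rowsToDocuments(sheets, "data.xlsb");
console.log(docs.length);         // 3
console.log(docs[0].pageContent); // two lines: "Name: Ada" and "Index: 1"
console.log(docs[2].metadata);    // { source: 'data.xlsb', sheet: 'Notes', row: 1 }
```

Keeping the worksheet name and row number in the metadata is one way to let a retriever filter by origin; the real loader may record different fields.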

<details>
<summary><b>Sample SheetJS Loader</b> (click to show)</summary>

@@ -25,6 +25,6 @@ ultimately displayed to the user in a HTML table.

## Loading Sheets

["Loading Sheets"](/docs/getting-started/examples/loader) explores deep SheetJS
The ["Loader Tutorial"](/docs/getting-started/examples/loader) explores SheetJS
integrations. Based on the existing CSV and binary loaders, a spreadsheet loader
for LangChain is developed and tested.
is developed and tested in a natural language query workflow.
@@ -81,8 +81,8 @@ Each browser demo was tested in the following environments:

| Browser     | Date       |
|:------------|:-----------|
| Chrome 120  | 2024-01-30 |
| Safari 17.2 | 2024-01-15 |
| Chrome 126  | 2024-06-19 |
| Safari 17.3 | 2024-06-19 |

:::

@@ -135,8 +135,8 @@ Each browser demo was tested in the following environments:

| Browser     | Date       |
|:------------|:-----------|
| Chrome 120  | 2024-01-15 |
| Safari 17.3 | 2024-02-21 |
| Chrome 126  | 2024-06-19 |
| Safari 17.3 | 2024-06-19 |

:::

@@ -288,7 +288,7 @@ The script will create a file `SheetJSCheerio.xlsx` that can be opened.
### DenoDOM

[DenoDOM](https://deno.land/x/deno_dom) provides a DOM framework for Deno. For
the tested version (`0.1.43`), the following patches were needed:
the tested version (`0.1.46`), the following patches were needed:

- TABLE `rows` property (explained above)
- TR `cells` property (explained above)
@@ -299,7 +299,7 @@ This example fetches [a sample table](pathname:///dom/SheetJSTable.html):
// @deno-types="https://cdn.sheetjs.com/xlsx-${current}/package/types/index.d.ts"
import * as XLSX from 'https://cdn.sheetjs.com/xlsx-${current}/package/xlsx.mjs';
\n\
import { DOMParser } from 'https://deno.land/x/deno_dom@v0.1.43/deno-dom-wasm.ts';
import { DOMParser } from 'https://deno.land/x/deno_dom@v0.1.46/deno-dom-wasm.ts';
\n\
const doc = new DOMParser().parseFromString(
await (await fetch('https://docs.sheetjs.com/dom/SheetJSTable.html')).text(),
@@ -323,7 +323,12 @@ XLSX.writeFile(workbook, "SheetJSDenoDOM.xlsx");`}

:::note Tested Deployments

This demo was last tested on 2024 January 27 against DenoDOM `0.1.43`
This demo was tested in the following deployments:

| Architecture | DenoDOM | Deno   | Date       |
|:-------------|:--------|:-------|:-----------|
| `darwin-x64` | 0.1.46  | 1.44.4 | 2024-06-19 |
| `darwin-arm` | 0.1.46  | 1.44.4 | 2024-06-19 |

:::

@@ -970,6 +970,73 @@ cargo run pres.numbers
If the program succeeded, the CSV contents will be printed to console and the
file `sheetjsw.xlsb` will be created. That file can be opened with Excel.

### Java

[Javet](https://www.caoccao.com/Javet/) is a Java binding to the V8 engine.
Javet simplifies conversions between Java data structures and V8 equivalents.

Java byte arrays (`byte[]`) are projected in V8 as `Int8Array`. The SheetJS
`read` method expects a `Uint8Array`. The following script snippet performs a
zero-copy conversion:

```js title="Zero-copy conversion from Int8Array to Uint8Array"
// assuming `i8` is an Int8Array
const u8 = new Uint8Array(i8.buffer, i8.byteOffset, i8.length);
```
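
The effect of the conversion can be checked in any JavaScript engine. In this small sketch the `i8` array is a hand-written stand-in for the bytes that Javet would project from a Java `byte[]`; the sample values are arbitrary. Both views share one `ArrayBuffer`, so no bytes are copied and negative Java bytes are read back as values in the range 0 to 255:

```js
// stand-in for a projected Java byte[] (signed bytes); values are arbitrary
const i8 = new Int8Array([80, 75, 3, 4, -1, -128]);

// zero-copy: the new view is backed by the same ArrayBuffer
const u8 = new Uint8Array(i8.buffer, i8.byteOffset, i8.length);

console.log(u8.buffer === i8.buffer); // true
console.log(u8[4], u8[5]);            // 255 128

// a write through one view is visible through the other
i8[0] = -2;
console.log(u8[0]);                   // 254
```

Passing `byteOffset` and `length` matters when the `Int8Array` is itself a window into a larger buffer; `new Uint8Array(i8.buffer)` alone would expose the whole underlying buffer.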

:::note Tested Deployments

This demo was last tested in the following deployments:

| Architecture | V8 Version    | Javet   | Java    | Date       |
|:-------------|:--------------|:--------|:--------|:-----------|
| `darwin-x64` | `12.6.228.13` | `3.1.3` | 22      | 2024-06-19 |
| `darwin-arm` | `12.6.228.13` | `3.1.3` | 11.0.23 | 2024-06-19 |

:::

1) Create a new project:

```bash
mkdir sheetjs-javet
cd sheetjs-javet
```

2) Download the Javet JAR. There are different archives for different platforms.
The following command runs on `darwin-x64` and `darwin-arm`:

```bash
curl -LO https://repo1.maven.org/maven2/com/caoccao/javet/javet-macos/3.1.3/javet-macos-3.1.3.jar
```

3) Download the SheetJS Standalone script and test file. Save both files in the
project directory:

<ul>
<li><a href={`https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js`}>xlsx.full.min.js</a></li>
<li><a href="https://docs.sheetjs.com/pres.xlsx">pres.xlsx</a></li>
</ul>

<CodeBlock language="bash">{`\
curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js
curl -LO https://docs.sheetjs.com/pres.xlsx`}
</CodeBlock>

4) Download [`SheetJSJavet.java`](pathname:///v8/SheetJSJavet.java):

```bash
curl -LO https://docs.sheetjs.com/v8/SheetJSJavet.java
```

5) Build and run the Java application:

```bash
javac -cp ".:javet-macos-3.1.3.jar" SheetJSJavet.java
java -cp ".:javet-macos-3.1.3.jar" SheetJSJavet pres.xlsx
```

If the program succeeded, the CSV contents will be printed to console.

## Snapshots

At a high level, V8 snapshots are raw dumps of the V8 engine state. It is much
30
docz/static/v8/SheetJSJavet.java
Normal file
@@ -0,0 +1,30 @@
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Scanner;
import com.caoccao.javet.interop.V8Host;
import com.caoccao.javet.interop.V8Runtime;

public class SheetJSJavet {
  public static void main(String[] args) throws Exception {
    /* initialize */
    V8Runtime v8Runtime = V8Host.getV8Instance().createV8Runtime();

    /* read script file */
    v8Runtime.getExecutor("var global = (function(){ return this; }).call(null);").executeVoid();
    v8Runtime.getExecutor(new Scanner(SheetJSJavet.class.getResourceAsStream("/xlsx.full.min.js")).useDelimiter("\\Z").next()).executeVoid();

    System.out.println(v8Runtime.getExecutor("'SheetJS Version ' + XLSX.version").executeString());

    /* read spreadsheet bytes */
    v8Runtime.getGlobalObject().set("i8", Files.readAllBytes(Paths.get(args[0])));
    v8Runtime.getExecutor("var u8 = new Uint8Array(i8.buffer, i8.byteOffset, i8.length);").executeVoid();

    /* parse workbook */
    v8Runtime.getExecutor("var wb = XLSX.read(u8, {type: 'array'})").executeVoid();

    /* get first worksheet as CSV */
    v8Runtime.getExecutor("var ws = wb.Sheets[wb.SheetNames[0]];").executeVoid();
    String res = v8Runtime.getExecutor("XLSX.utils.sheet_to_csv(ws)").executeString();
    System.out.println(res);
  }
}