docs.sheetjs.com/docz/docs/03-demos/42-engines/08-quickjs.md

397 lines
10 KiB
Markdown
Raw Permalink Normal View History

2023-03-12 06:25:57 +00:00
---
2023-08-03 02:49:32 +00:00
title: Data Processing with QuickJS
2023-06-20 01:21:34 +00:00
sidebar_label: C + QuickJS
description: Process structured data in C programs. Seamlessly integrate spreadsheets into your program by pairing QuickJS and SheetJS. Supercharge programs with modern data tools.
2023-03-12 06:25:57 +00:00
pagination_prev: demos/bigdata/index
pagination_next: solutions/input
---
2023-04-27 09:12:19 +00:00
import current from '/version.js';
2023-05-07 13:58:36 +00:00
import CodeBlock from '@theme/CodeBlock';
2023-04-27 09:12:19 +00:00
2023-06-20 01:21:34 +00:00
[QuickJS](https://bellard.org/quickjs/) is an embeddable JS engine written in C.
It has built-in support for reading and writing file data stored in memory.
[SheetJS](https://sheetjs.com) is a JavaScript library for reading and writing
data from spreadsheets.
This demo uses QuickJS and SheetJS to pull data from a spreadsheet and print CSV
rows. We'll explore how to load SheetJS in a QuickJS context and process
spreadsheets from C programs.
2023-03-12 06:25:57 +00:00
2023-06-20 01:21:34 +00:00
The ["Integration Example"](#integration-example) section includes a complete
command-line tool for reading data from files.
2023-03-14 08:38:47 +00:00
2023-03-12 06:25:57 +00:00
## Integration Details
2023-07-25 02:31:22 +00:00
:::note pass
2023-03-12 06:25:57 +00:00
2023-06-20 01:21:34 +00:00
Many QuickJS functions are not documented. The explanation was verified against
2024-03-16 16:04:18 +00:00
the latest release (commit `6a89d7c`).
2023-06-20 01:21:34 +00:00
:::
### Initialize QuickJS
Most QuickJS API functions interact with a `JSContext` object[^1], which is
normally created with `JS_NewRuntime` and `JS_NewContext`:
2023-03-12 06:25:57 +00:00
```c
2023-06-20 01:21:34 +00:00
#include "quickjs.h"
/* initialize context */
2023-03-12 06:25:57 +00:00
JSRuntime *rt = JS_NewRuntime();
JSContext *ctx = JS_NewContext(rt);
2023-06-20 01:21:34 +00:00
```
2023-03-12 06:25:57 +00:00
2023-06-20 01:21:34 +00:00
QuickJS provides a `global` object through `JS_GetGlobalObject`:
```c
2023-03-12 06:25:57 +00:00
/* obtain reference to global object */
JSValue global = JS_GetGlobalObject(ctx);
2023-06-20 01:21:34 +00:00
```
2023-03-12 06:25:57 +00:00
<details>
<summary><b>Cleanup</b> (click to show)</summary>
2023-03-12 06:25:57 +00:00
2023-06-20 01:21:34 +00:00
Once finished, programs are expected to cleanup by using `JS_FreeValue` to free
values, `JS_FreeContext` to free the context pointer, and `JS_FreeRuntime` to
free the runtime:
```c
/* global is a JSValue */
2023-03-12 06:25:57 +00:00
JS_FreeValue(ctx, global);
/* cleanup */
JS_FreeContext(ctx);
JS_FreeRuntime(rt);
```
2023-06-20 01:21:34 +00:00
The [Integration Example](#integration-example) frees JS values after use.
2023-03-12 06:25:57 +00:00
2023-06-20 01:21:34 +00:00
</details>
2023-03-12 06:25:57 +00:00
2023-06-20 01:21:34 +00:00
### Load SheetJS Scripts
2023-03-12 06:25:57 +00:00
2023-06-20 01:21:34 +00:00
[SheetJS Standalone scripts](/docs/getting-started/installation/standalone) can
be loaded and executed in QuickJS.
2023-03-12 06:25:57 +00:00
The main library can be loaded by reading the script from the file system and
2023-06-20 01:21:34 +00:00
evaluating in the QuickJS context using `JS_Eval`:
2023-03-12 06:25:57 +00:00
```c
static char *read_file(const char *filename, size_t *sz) {
FILE *f = fopen(filename, "rb");
if(!f) return NULL;
2023-06-20 01:21:34 +00:00
long fsize; { fseek(f, 0, SEEK_END); fsize = ftell(f); fseek(f, 0, SEEK_SET); }
2023-03-12 06:25:57 +00:00
char *buf = (char *)malloc(fsize * sizeof(char));
*sz = fread((void *) buf, 1, fsize, f);
fclose(f);
return buf;
}
// ...
{
2023-06-20 01:21:34 +00:00
/* Read `xlsx.full.min.js` from the filesystem */
2023-03-12 06:25:57 +00:00
size_t len; char *buf = read_file("xlsx.full.min.js", &len);
2023-06-20 01:21:34 +00:00
/* evaluate from the QuickJS context */
2023-03-12 06:25:57 +00:00
JS_Eval(ctx, buf, len, "<input>", 0);
2023-06-20 01:21:34 +00:00
/* Free the file buffer */
2023-03-12 06:25:57 +00:00
free(buf);
}
```
2023-06-20 01:21:34 +00:00
If the library is loaded, `XLSX.version` will be a string. This string can be
pulled into the main C program.
1) Get the `XLSX` property of the global object using `JS_GetPropertyStr`:
2023-03-12 06:25:57 +00:00
```c
/* obtain reference to the XLSX object */
JSValue XLSX = JS_GetPropertyStr(ctx, global, "XLSX");
2023-06-20 01:21:34 +00:00
```
2) Get the `version` property of the `XLSX` object using `JS_GetPropertyStr`:
2023-03-12 06:25:57 +00:00
2023-06-20 01:21:34 +00:00
```c
/* obtain reference to `XLSX.version` */
2023-03-12 06:25:57 +00:00
JSValue version = JS_GetPropertyStr(ctx, XLSX, "version");
2023-06-20 01:21:34 +00:00
```
3) Pull the string into C code with `JS_ToCStringLen`:
```c
/* pull the version string into C */
2023-03-12 06:25:57 +00:00
size_t vlen; const char *vers = JS_ToCStringLen(ctx, &vlen, version);
printf("Version: %s\n", vers);
```
### Reading Files
`JS_NewArrayBuffer` can generate an `ArrayBuffer` from a C byte array. The
function signature expects `uint8_t *` instead of `char *`:
```c
/* read file */
size_t dlen; uint8_t * dbuf = (uint8_t *)read_file("pres.numbers", &dlen);
/* load data into array buffer */
JSValue ab = JS_NewArrayBuffer(ctx, dbuf, dlen, NULL, NULL, 0);
2023-06-20 01:21:34 +00:00
```
2023-03-12 06:25:57 +00:00
2023-06-20 01:21:34 +00:00
The `ArrayBuffer` will be parsed with the SheetJS `read` method[^2]. The CSV row
data will be generated with `sheet_to_csv`[^3].
2023-03-12 06:25:57 +00:00
2023-06-20 01:21:34 +00:00
#### Parse the ArrayBuffer
:::note pass
The goal is to run the equivalent of the following JavaScript code:
```js
/* `ab` is the `ArrayBuffer` from the previous step */
var wb = XLSX.read(ab);
```
:::
1) Get the `XLSX` property of the global object and the `read` property of `XLSX`:
```c
/* obtain reference to XLSX.read */
JSValue XLSX = JS_GetPropertyStr(ctx, global, "XLSX");
2023-03-12 06:25:57 +00:00
JSValue XLSX_read = JS_GetPropertyStr(ctx, XLSX, "read");
2023-06-20 01:21:34 +00:00
```
2) Create an array of arguments to pass to the function. In this case, the
`read` function will be called with one argument (`ArrayBuffer` data):
```c
/* prepare arguments */
2023-03-12 06:25:57 +00:00
JSValue args[] = { ab };
2023-06-20 01:21:34 +00:00
```
3) Use `JS_Call` to call the function with the arguments:
```c
/* call XLSX.read(ab) */
2023-03-12 06:25:57 +00:00
JSValue wb = JS_Call(ctx, XLSX_read, XLSX, 1, args);
```
2023-06-20 01:21:34 +00:00
#### Get First Worksheet
:::note pass
The goal is to get the first worksheet. In JavaScript, the `SheetNames` property
of the workbook is an array of strings and the `Sheets` property holds worksheet
objects[^4]. The desired action looks like:
```js
/* `wb` is the workbook from the previous step */
var wsname = wb.SheetNames[0];
var ws = wb.Sheets[wsname];
```
:::
4) Pull `wb.SheetNames[0]` into a C string using `JS_GetPropertyStr`:
```c
/* get `wb.SheetNames[0]` */
JSValue SheetNames = JS_GetPropertyStr(ctx, wb, "SheetNames");
JSValue Sheet1 = JS_GetPropertyStr(ctx, SheetNames, "0");
/* pull first sheet name into C code */
size_t wslen; const char *wsname = JS_ToCStringLen(ctx, &wslen, Sheet1);
```
5) Get the worksheet object:
```c
/* get wb.Sheets[wsname] */
JSValue Sheets = JS_GetPropertyStr(ctx, wb, "Sheets");
JSValue ws = JS_GetPropertyStr(ctx, Sheets, wsname);
```
#### Convert to CSV
:::note pass
The goal is to call `sheet_to_csv`[^5] and pull the result into C code:
```js
/* `ws` is the worksheet from the previous step */
var csv = XLSX.utils.sheet_to_csv(ws);
```
:::
6) Create a references to `XLSX.utils` and `XLSX.utils.sheet_to_csv`:
```c
/* obtain reference to XLSX.utils.sheet_to_csv */
JSValue utils = JS_GetPropertyStr(ctx, XLSX, "utils");
JSValue sheet_to_csv = JS_GetPropertyStr(ctx, utils, "sheet_to_csv");
```
7) Create arguments array:
```c
/* prepare arguments */
JSValue args[] = { ws };
```
8) Use `JS_Call` to call the function and use `JS_ToCStringLen` to pull the CSV:
```c
JSValue csv = JS_Call(ctx, sheet_to_csv, utils, 1, args);
size_t csvlen; const char *csvstr = JS_ToCStringLen(ctx, &csvlen, csv);
```
At this point, `csvstr` is a C string that can be printed to standard output.
2023-03-12 06:25:57 +00:00
## Complete Example
The "Integration Example" covers a traditional integration in a C application,
while the "CLI Test" demonstrates other concepts using the `quickjs` CLI tool.
### Integration Example
2023-12-02 08:39:35 +00:00
:::note Tested Deployments
2023-03-12 06:25:57 +00:00
2023-06-03 09:10:50 +00:00
This demo was tested in the following deployments:
2023-03-12 06:25:57 +00:00
2023-06-03 09:10:50 +00:00
| Architecture | Git Commit | Date |
|:-------------|:-----------|:-----------|
2024-03-16 16:04:18 +00:00
| `darwin-x64` | `6a89d7c` | 2024-03-15 |
2023-10-19 05:23:55 +00:00
| `darwin-arm` | `2788d71` | 2023-10-18 |
2024-03-12 06:47:52 +00:00
| `win10-x64` | `9e561d5` | 2024-03-04 |
2023-12-02 08:39:35 +00:00
| `win11-arm` | `03cc5ec` | 2023-12-01 |
2024-04-26 04:16:13 +00:00
| `linux-x64` | `3b45d15` | 2024-04-25 |
2023-12-02 08:39:35 +00:00
| `linux-arm` | `03cc5ec` | 2023-12-01 |
2023-03-12 06:25:57 +00:00
2024-04-26 04:16:13 +00:00
When the demo was tested, `3b45d15` was the HEAD commit on the `master` branch.
2023-06-20 01:21:34 +00:00
2023-03-12 06:25:57 +00:00
:::
2023-09-25 07:30:54 +00:00
:::caution pass
QuickJS does not officially support Windows. The `win10-x64` and `win11-arm`
tests were run entirely within Windows Subsystem for Linux.
:::
2023-03-12 06:25:57 +00:00
0) Build `libquickjs.a`:
```bash
2023-04-09 06:58:43 +00:00
git clone https://github.com/bellard/quickjs
2023-03-12 06:25:57 +00:00
cd quickjs
2024-04-26 04:16:13 +00:00
git checkout 3b45d15
2023-03-12 06:25:57 +00:00
make
cd ..
```
1) Copy `libquickjs.a` and `quickjs.h` into the working directory:
```bash
cp quickjs/libquickjs.a .
cp quickjs/quickjs.h .
```
2) Download [`sheetjs.quick.c`](pathname:///quickjs/sheetjs.quick.c):
```bash
curl -LO https://docs.sheetjs.com/quickjs/sheetjs.quick.c
```
3) Build the sample application:
```bash
2023-07-25 02:31:22 +00:00
gcc -o sheetjs.quick -Wall sheetjs.quick.c libquickjs.a -lm
2023-03-12 06:25:57 +00:00
```
This program tries to parse the file specified by the first argument
2023-09-22 06:32:55 +00:00
4) Download the SheetJS Standalone script and test file. Save both files in
the project directory:
2023-03-12 06:25:57 +00:00
<ul>
2023-04-27 09:12:19 +00:00
<li><a href={`https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js`}>xlsx.full.min.js</a></li>
2024-04-26 04:16:13 +00:00
<li><a href="https://docs.sheetjs.com/pres.numbers">pres.numbers</a></li>
2023-03-12 06:25:57 +00:00
</ul>
2023-05-07 13:58:36 +00:00
<CodeBlock language="bash">{`\
2023-04-27 09:12:19 +00:00
curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js
2024-04-26 04:16:13 +00:00
curl -LO https://docs.sheetjs.com/pres.numbers`}
2023-05-07 13:58:36 +00:00
</CodeBlock>
2023-03-12 06:25:57 +00:00
5) Run the test program:
2023-06-20 01:21:34 +00:00
```bash
2023-03-12 06:25:57 +00:00
./sheetjs.quick pres.numbers
```
If successful, the program will print the library version number, file size,
first worksheet name, and the contents of the first sheet as CSV rows.
### CLI Test
2023-12-05 03:46:54 +00:00
:::note Tested Deployments
This demo was tested in the following environments:
| Git Commit | Date |
|:-----------|:-----------|
2024-04-26 04:16:13 +00:00
| `3b45d15` | 2024-04-25 |
2023-03-12 06:25:57 +00:00
2024-04-26 04:16:13 +00:00
When the demo was tested, `3b45d15` was the HEAD commit on the `master` branch.
2023-03-12 06:25:57 +00:00
:::
2023-12-05 03:46:54 +00:00
0) Build the `qjs` command line utility from source:
```bash
git clone https://github.com/bellard/quickjs
cd quickjs
2024-01-23 09:26:06 +00:00
git checkout 9e561d5
2023-12-05 03:46:54 +00:00
make
cd ..
cp quickjs/qjs .
```
2023-03-12 06:25:57 +00:00
2023-09-22 06:32:55 +00:00
1) Download the SheetJS Standalone script and the test file. Save both files in
the project directory:
2023-03-12 06:25:57 +00:00
<ul>
2023-04-27 09:12:19 +00:00
<li><a href={`https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js`}>xlsx.full.min.js</a></li>
2024-04-26 04:16:13 +00:00
<li><a href="https://docs.sheetjs.com/pres.numbers">pres.numbers</a></li>
2023-03-12 06:25:57 +00:00
</ul>
2023-05-07 13:58:36 +00:00
<CodeBlock language="bash">{`\
2023-04-27 09:12:19 +00:00
curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js
2024-04-26 04:16:13 +00:00
curl -LO https://docs.sheetjs.com/pres.numbers`}
2023-05-07 13:58:36 +00:00
</CodeBlock>
2023-03-12 06:25:57 +00:00
2) Download [`SheetJSQuick.js`](pathname:///quickjs/SheetJSQuick.js)
```bash
curl -LO https://docs.sheetjs.com/quickjs/SheetJSQuick.js
```
3) Test the program:
```bash
2023-12-05 03:46:54 +00:00
./qjs SheetJSQuick.js
2023-03-12 06:25:57 +00:00
```
2023-12-05 03:46:54 +00:00
If successful, the script will print CSV rows and generate `SheetJSQuick.xlsx`.
The generated file can be opened in Excel or another spreadsheet editor.
2023-03-12 06:25:57 +00:00
2023-06-20 01:21:34 +00:00
[^1]: See ["Runtime and Contexts"](https://bellard.org/quickjs/quickjs.html#Runtime-and-contexts) in the QuickJS documentation
[^2]: See [`read` in "Reading Files"](/docs/api/parse-options)
[^3]: See [`sheet_to_csv` in "CSV and Text"](/docs/api/utilities/csv#delimiter-separated-output)
[^4]: See ["Workbook Object" in "SheetJS Data Model"](/docs/csf/book)
[^5]: See [`sheet_to_csv` in "CSV and Text"](/docs/api/utilities/csv#delimiter-separated-output)