docs.sheetjs.com/docz/docs/03-demos/12-engines/08_quickjs.md

369 lines
9.5 KiB
Markdown
Raw Normal View History

2023-03-12 06:25:57 +00:00
---
2023-08-03 02:49:32 +00:00
title: Data Processing with QuickJS
2023-06-20 01:21:34 +00:00
sidebar_label: C + QuickJS
description: Process structured data in C programs. Seamlessly integrate spreadsheets into your program by pairing QuickJS and SheetJS. Supercharge programs with modern data tools.
2023-03-12 06:25:57 +00:00
pagination_prev: demos/bigdata/index
pagination_next: solutions/input
---
2023-04-27 09:12:19 +00:00
import current from '/version.js';
2023-05-07 13:58:36 +00:00
import CodeBlock from '@theme/CodeBlock';
2023-04-27 09:12:19 +00:00
2023-06-20 01:21:34 +00:00
[QuickJS](https://bellard.org/quickjs/) is an embeddable JS engine written in C.
It has built-in support for reading and writing file data stored in memory.
[SheetJS](https://sheetjs.com) is a JavaScript library for reading and writing
data from spreadsheets.
This demo uses QuickJS and SheetJS to pull data from a spreadsheet and print CSV
rows. We'll explore how to load SheetJS in a QuickJS context and process
spreadsheets from C programs.
2023-03-12 06:25:57 +00:00
2023-06-20 01:21:34 +00:00
The ["Integration Example"](#integration-example) section includes a complete
command-line tool for reading data from files.
2023-03-14 08:38:47 +00:00
2023-03-12 06:25:57 +00:00
## Integration Details
2023-07-25 02:31:22 +00:00
:::note pass
2023-03-12 06:25:57 +00:00
2023-06-20 01:21:34 +00:00
Many QuickJS functions are not documented. The explanation was verified against
2023-06-23 20:24:44 +00:00
the latest release (commit `2788d71`).
2023-06-20 01:21:34 +00:00
:::
### Initialize QuickJS
Most QuickJS API functions interact with a `JSContext` object[^1], which is
normally created with `JS_NewRuntime` and `JS_NewContext`:
2023-03-12 06:25:57 +00:00
```c
2023-06-20 01:21:34 +00:00
#include "quickjs.h"
/* initialize context */
2023-03-12 06:25:57 +00:00
JSRuntime *rt = JS_NewRuntime();
JSContext *ctx = JS_NewContext(rt);
2023-06-20 01:21:34 +00:00
```
2023-03-12 06:25:57 +00:00
2023-06-20 01:21:34 +00:00
QuickJS provides a `global` object through `JS_GetGlobalObject`:
```c
2023-03-12 06:25:57 +00:00
/* obtain reference to global object */
JSValue global = JS_GetGlobalObject(ctx);
2023-06-20 01:21:34 +00:00
```
2023-03-12 06:25:57 +00:00
2023-06-20 01:21:34 +00:00
<details><summary><b>Cleanup</b> (click to show)</summary>
2023-03-12 06:25:57 +00:00
2023-06-20 01:21:34 +00:00
Once finished, programs are expected to cleanup by using `JS_FreeValue` to free
values, `JS_FreeContext` to free the context pointer, and `JS_FreeRuntime` to
free the runtime:
```c
/* global is a JSValue */
2023-03-12 06:25:57 +00:00
JS_FreeValue(ctx, global);
/* cleanup */
JS_FreeContext(ctx);
JS_FreeRuntime(rt);
```
2023-06-20 01:21:34 +00:00
The [Integration Example](#integration-example) frees JS values after use.
2023-03-12 06:25:57 +00:00
2023-06-20 01:21:34 +00:00
</details>
2023-03-12 06:25:57 +00:00
2023-06-20 01:21:34 +00:00
### Load SheetJS Scripts
2023-03-12 06:25:57 +00:00
2023-06-20 01:21:34 +00:00
[SheetJS Standalone scripts](/docs/getting-started/installation/standalone) can
be loaded and executed in QuickJS.
2023-03-12 06:25:57 +00:00
The main library can be loaded by reading the script from the file system and
2023-06-20 01:21:34 +00:00
evaluating in the QuickJS context using `JS_Eval`:
2023-03-12 06:25:57 +00:00
```c
static char *read_file(const char *filename, size_t *sz) {
FILE *f = fopen(filename, "rb");
if(!f) return NULL;
2023-06-20 01:21:34 +00:00
long fsize; { fseek(f, 0, SEEK_END); fsize = ftell(f); fseek(f, 0, SEEK_SET); }
2023-03-12 06:25:57 +00:00
char *buf = (char *)malloc(fsize * sizeof(char));
*sz = fread((void *) buf, 1, fsize, f);
fclose(f);
return buf;
}
// ...
{
2023-06-20 01:21:34 +00:00
/* Read `xlsx.full.min.js` from the filesystem */
2023-03-12 06:25:57 +00:00
size_t len; char *buf = read_file("xlsx.full.min.js", &len);
2023-06-20 01:21:34 +00:00
/* evaluate from the QuickJS context */
2023-03-12 06:25:57 +00:00
JS_Eval(ctx, buf, len, "<input>", 0);
2023-06-20 01:21:34 +00:00
/* Free the file buffer */
2023-03-12 06:25:57 +00:00
free(buf);
}
```
2023-06-20 01:21:34 +00:00
If the library is loaded, `XLSX.version` will be a string. This string can be
pulled into the main C program.
1) Get the `XLSX` property of the global object using `JS_GetPropertyStr`:
2023-03-12 06:25:57 +00:00
```c
/* obtain reference to the XLSX object */
JSValue XLSX = JS_GetPropertyStr(ctx, global, "XLSX");
2023-06-20 01:21:34 +00:00
```
2) Get the `version` property of the `XLSX` object using `JS_GetPropertyStr`:
2023-03-12 06:25:57 +00:00
2023-06-20 01:21:34 +00:00
```c
/* obtain reference to `XLSX.version` */
2023-03-12 06:25:57 +00:00
JSValue version = JS_GetPropertyStr(ctx, XLSX, "version");
2023-06-20 01:21:34 +00:00
```
3) Pull the string into C code with `JS_ToCStringLen`:
```c
/* pull the version string into C */
2023-03-12 06:25:57 +00:00
size_t vlen; const char *vers = JS_ToCStringLen(ctx, &vlen, version);
printf("Version: %s\n", vers);
```
### Reading Files
`JS_NewArrayBuffer` can generate an `ArrayBuffer` from a C byte array. The
function signature expects `uint8_t *` instead of `char *`:
```c
/* read file */
size_t dlen; uint8_t * dbuf = (uint8_t *)read_file("pres.numbers", &dlen);
/* load data into array buffer */
JSValue ab = JS_NewArrayBuffer(ctx, dbuf, dlen, NULL, NULL, 0);
2023-06-20 01:21:34 +00:00
```
2023-03-12 06:25:57 +00:00
2023-06-20 01:21:34 +00:00
The `ArrayBuffer` will be parsed with the SheetJS `read` method[^2]. The CSV row
data will be generated with `sheet_to_csv`[^3].
2023-03-12 06:25:57 +00:00
2023-06-20 01:21:34 +00:00
#### Parse the ArrayBuffer
:::note pass
The goal is to run the equivalent of the following JavaScript code:
```js
/* `ab` is the `ArrayBuffer` from the previous step */
var wb = XLSX.read(ab);
```
:::
1) Get the `XLSX` property of the global object and the `read` property of `XLSX`:
```c
/* obtain reference to XLSX.read */
JSValue XLSX = JS_GetPropertyStr(ctx, global, "XLSX");
2023-03-12 06:25:57 +00:00
JSValue XLSX_read = JS_GetPropertyStr(ctx, XLSX, "read");
2023-06-20 01:21:34 +00:00
```
2) Create an array of arguments to pass to the function. In this case, the
`read` function will be called with one argument (`ArrayBuffer` data):
```c
/* prepare arguments */
2023-03-12 06:25:57 +00:00
JSValue args[] = { ab };
2023-06-20 01:21:34 +00:00
```
3) Use `JS_Call` to call the function with the arguments:
```c
/* call XLSX.read(ab) */
2023-03-12 06:25:57 +00:00
JSValue wb = JS_Call(ctx, XLSX_read, XLSX, 1, args);
```
2023-06-20 01:21:34 +00:00
#### Get First Worksheet
:::note pass
The goal is to get the first worksheet. In JavaScript, the `SheetNames` property
of the workbook is an array of strings and the `Sheets` property holds worksheet
objects[^4]. The desired action looks like:
```js
/* `wb` is the workbook from the previous step */
var wsname = wb.SheetNames[0];
var ws = wb.Sheets[wsname];
```
:::
4) Pull `wb.SheetNames[0]` into a C string using `JS_GetPropertyStr`:
```c
/* get `wb.SheetNames[0]` */
JSValue SheetNames = JS_GetPropertyStr(ctx, wb, "SheetNames");
JSValue Sheet1 = JS_GetPropertyStr(ctx, SheetNames, "0");
/* pull first sheet name into C code */
size_t wslen; const char *wsname = JS_ToCStringLen(ctx, &wslen, Sheet1);
```
5) Get the worksheet object:
```c
/* get wb.Sheets[wsname] */
JSValue Sheets = JS_GetPropertyStr(ctx, wb, "Sheets");
JSValue ws = JS_GetPropertyStr(ctx, Sheets, wsname);
```
#### Convert to CSV
:::note pass
The goal is to call `sheet_to_csv`[^5] and pull the result into C code:
```js
/* `ws` is the worksheet from the previous step */
var csv = XLSX.utils.sheet_to_csv(ws);
```
:::
6) Create a references to `XLSX.utils` and `XLSX.utils.sheet_to_csv`:
```c
/* obtain reference to XLSX.utils.sheet_to_csv */
JSValue utils = JS_GetPropertyStr(ctx, XLSX, "utils");
JSValue sheet_to_csv = JS_GetPropertyStr(ctx, utils, "sheet_to_csv");
```
7) Create arguments array:
```c
/* prepare arguments */
JSValue args[] = { ws };
```
8) Use `JS_Call` to call the function and use `JS_ToCStringLen` to pull the CSV:
```c
JSValue csv = JS_Call(ctx, sheet_to_csv, utils, 1, args);
size_t csvlen; const char *csvstr = JS_ToCStringLen(ctx, &csvlen, csv);
```
At this point, `csvstr` is a C string that can be printed to standard output.
2023-03-12 06:25:57 +00:00
## Complete Example
The "Integration Example" covers a traditional integration in a C application,
while the "CLI Test" demonstrates other concepts using the `quickjs` CLI tool.
### Integration Example
:::note
2023-06-03 09:10:50 +00:00
This demo was tested in the following deployments:
2023-03-12 06:25:57 +00:00
2023-06-03 09:10:50 +00:00
| Architecture | Git Commit | Date |
|:-------------|:-----------|:-----------|
2023-07-25 02:31:22 +00:00
| `darwin-x64` | `2788d71` | 2023-07-24 |
2023-06-05 20:12:53 +00:00
| `darwin-arm` | `2788d71` | 2023-06-05 |
2023-06-03 09:10:50 +00:00
| `linux-x64` | `2788d71` | 2023-06-02 |
2023-07-25 02:31:22 +00:00
| `win10-x64` | `2788d71` | 2023-07-24 |
2023-03-12 06:25:57 +00:00
2023-06-23 20:24:44 +00:00
When the demo was tested, commit `2788d71` corresponded to the latest release.
2023-06-20 01:21:34 +00:00
2023-03-12 06:25:57 +00:00
:::
0) Build `libquickjs.a`:
```bash
2023-04-09 06:58:43 +00:00
git clone https://github.com/bellard/quickjs
2023-03-12 06:25:57 +00:00
cd quickjs
git checkout 2788d71
make
cd ..
```
1) Copy `libquickjs.a` and `quickjs.h` into the working directory:
```bash
cp quickjs/libquickjs.a .
cp quickjs/quickjs.h .
```
2) Download [`sheetjs.quick.c`](pathname:///quickjs/sheetjs.quick.c):
```bash
curl -LO https://docs.sheetjs.com/quickjs/sheetjs.quick.c
```
3) Build the sample application:
```bash
2023-07-25 02:31:22 +00:00
gcc -o sheetjs.quick -Wall sheetjs.quick.c libquickjs.a -lm
2023-03-12 06:25:57 +00:00
```
This program tries to parse the file specified by the first argument
4) Download the standalone script and test file:
<ul>
2023-04-27 09:12:19 +00:00
<li><a href={`https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js`}>xlsx.full.min.js</a></li>
2023-03-12 06:25:57 +00:00
<li><a href="https://sheetjs.com/pres.numbers">pres.numbers</a></li>
</ul>
2023-05-07 13:58:36 +00:00
<CodeBlock language="bash">{`\
2023-04-27 09:12:19 +00:00
curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js
curl -LO https://sheetjs.com/pres.numbers`}
2023-05-07 13:58:36 +00:00
</CodeBlock>
2023-03-12 06:25:57 +00:00
5) Run the test program:
2023-06-20 01:21:34 +00:00
```bash
2023-03-12 06:25:57 +00:00
./sheetjs.quick pres.numbers
```
If successful, the program will print the library version number, file size,
first worksheet name, and the contents of the first sheet as CSV rows.
### CLI Test
:::note
2023-06-23 20:24:44 +00:00
This demo was last tested on 2023 March 11 against QuickJS commit `2788d71`.
2023-03-12 06:25:57 +00:00
:::
0) Ensure `quickjs` command line utility is installed
1) Download the standalone script and the test file:
<ul>
2023-04-27 09:12:19 +00:00
<li><a href={`https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js`}>xlsx.full.min.js</a></li>
2023-03-12 06:25:57 +00:00
<li><a href="https://sheetjs.com/pres.numbers">pres.numbers</a></li>
</ul>
2023-05-07 13:58:36 +00:00
<CodeBlock language="bash">{`\
2023-04-27 09:12:19 +00:00
curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js
curl -LO https://sheetjs.com/pres.numbers`}
2023-05-07 13:58:36 +00:00
</CodeBlock>
2023-03-12 06:25:57 +00:00
2) Download [`SheetJSQuick.js`](pathname:///quickjs/SheetJSQuick.js)
```bash
curl -LO https://docs.sheetjs.com/quickjs/SheetJSQuick.js
```
3) Test the program:
```bash
quickjs SheetJSQuick.js
```
If successful, the script will generate `SheetJSQuick.xlsx`.
2023-06-20 01:21:34 +00:00
[^1]: See ["Runtime and Contexts"](https://bellard.org/quickjs/quickjs.html#Runtime-and-contexts) in the QuickJS documentation
[^2]: See [`read` in "Reading Files"](/docs/api/parse-options)
[^3]: See [`sheet_to_csv` in "CSV and Text"](/docs/api/utilities/csv#delimiter-separated-output)
[^4]: See ["Workbook Object" in "SheetJS Data Model"](/docs/csf/book)
[^5]: See [`sheet_to_csv` in "CSV and Text"](/docs/api/utilities/csv#delimiter-separated-output)