docs.sheetjs.com/docz/docs/03-demos/12-engines/08_quickjs.md
2023-08-08 21:10:17 -04:00

9.5 KiB

title sidebar_label description pagination_prev pagination_next
Data Processing with QuickJS C + QuickJS Process structured data in C programs. Seamlessly integrate spreadsheets into your program by pairing QuickJS and SheetJS. Supercharge programs with modern data tools. demos/bigdata/index solutions/input

import current from '/version.js'; import CodeBlock from '@theme/CodeBlock';

QuickJS is an embeddable JS engine written in C. It has built-in support for reading and writing file data stored in memory.

SheetJS is a JavaScript library for reading and writing data from spreadsheets.

This demo uses QuickJS and SheetJS to pull data from a spreadsheet and print CSV rows. We'll explore how to load SheetJS in a QuickJS context and process spreadsheets from C programs.

The "Integration Example" section includes a complete command-line tool for reading data from files.

Integration Details

:::note pass

Many QuickJS functions are not documented. The explanation was verified against the latest release (commit 2788d71).

:::

Initialize QuickJS

Most QuickJS API functions interact with a JSContext object1, which is normally created with JS_NewRuntime and JS_NewContext:

#include "quickjs.h"

/* initialize context */
JSRuntime *rt = JS_NewRuntime();
JSContext *ctx = JS_NewContext(rt);

QuickJS provides a global object through JS_GetGlobalObject:

/* obtain reference to global object */
JSValue global = JS_GetGlobalObject(ctx);
Cleanup (click to show)

Once finished, programs are expected to cleanup by using JS_FreeValue to free values, JS_FreeContext to free the context pointer, and JS_FreeRuntime to free the runtime:

/* global is a JSValue */
JS_FreeValue(ctx, global);

/* cleanup */
JS_FreeContext(ctx);
JS_FreeRuntime(rt);

The Integration Example frees JS values after use.

Load SheetJS Scripts

SheetJS Standalone scripts can be loaded and executed in QuickJS.

The main library can be loaded by reading the script from the file system and evaluating in the QuickJS context using JS_Eval:

static char *read_file(const char *filename, size_t *sz) {
  FILE *f = fopen(filename, "rb");
  if(!f) return NULL;
  long fsize; { fseek(f, 0, SEEK_END); fsize = ftell(f); fseek(f, 0, SEEK_SET); }
  char *buf = (char *)malloc(fsize * sizeof(char));
  *sz = fread((void *) buf, 1, fsize, f);
  fclose(f);
  return buf;
}

// ...
  {
    /* Read `xlsx.full.min.js` from the filesystem */
    size_t len; char *buf = read_file("xlsx.full.min.js", &len);
    /* evaluate from the QuickJS context */
    JS_Eval(ctx, buf, len, "<input>", 0);
    /* Free the file buffer */
    free(buf);
  }

If the library is loaded, XLSX.version will be a string. This string can be pulled into the main C program.

  1. Get the XLSX property of the global object using JS_GetPropertyStr:
/* obtain reference to the XLSX object */
JSValue XLSX = JS_GetPropertyStr(ctx, global, "XLSX");
  1. Get the version property of the XLSX object using JS_GetPropertyStr:
/* obtain reference to `XLSX.version` */
JSValue version = JS_GetPropertyStr(ctx, XLSX, "version");
  1. Pull the string into C code with JS_ToCStringLen:
/* pull the version string into C */
size_t vlen; const char *vers = JS_ToCStringLen(ctx, &vlen, version);
printf("Version: %s\n", vers);

Reading Files

JS_NewArrayBuffer can generate an ArrayBuffer from a C byte array. The function signature expects uint8_t * instead of char *:

/* read file */
size_t dlen; uint8_t * dbuf = (uint8_t *)read_file("pres.numbers", &dlen);

/* load data into array buffer */
JSValue ab = JS_NewArrayBuffer(ctx, dbuf, dlen, NULL, NULL, 0);

The ArrayBuffer will be parsed with the SheetJS read method2. The CSV row data will be generated with sheet_to_csv3.

Parse the ArrayBuffer

:::note pass

The goal is to run the equivalent of the following JavaScript code:

/* `ab` is the `ArrayBuffer` from the previous step */
var wb = XLSX.read(ab);

:::

  1. Get the XLSX property of the global object and the read property of XLSX:
/* obtain reference to XLSX.read */
JSValue XLSX = JS_GetPropertyStr(ctx, global, "XLSX");
JSValue XLSX_read = JS_GetPropertyStr(ctx, XLSX, "read");
  1. Create an array of arguments to pass to the function. In this case, the read function will be called with one argument (ArrayBuffer data):
/* prepare arguments */
JSValue args[] = { ab };
  1. Use JS_Call to call the function with the arguments:
/* call XLSX.read(ab) */
JSValue wb = JS_Call(ctx, XLSX_read, XLSX, 1, args);

Get First Worksheet

:::note pass

The goal is to get the first worksheet. In JavaScript, the SheetNames property of the workbook is an array of strings and the Sheets property holds worksheet objects4. The desired action looks like:

/* `wb` is the workbook from the previous step */
var wsname = wb.SheetNames[0];
var ws = wb.Sheets[wsname];

:::

  1. Pull wb.SheetNames[0] into a C string using JS_GetPropertyStr:
/* get `wb.SheetNames[0]` */
JSValue SheetNames = JS_GetPropertyStr(ctx, wb, "SheetNames");
JSValue Sheet1 = JS_GetPropertyStr(ctx, SheetNames, "0");

/* pull first sheet name into C code */
size_t wslen; const char *wsname = JS_ToCStringLen(ctx, &wslen, Sheet1);
  1. Get the worksheet object:
/* get wb.Sheets[wsname] */
JSValue Sheets = JS_GetPropertyStr(ctx, wb, "Sheets");
JSValue ws = JS_GetPropertyStr(ctx, Sheets, wsname);

Convert to CSV

:::note pass

The goal is to call sheet_to_csv5 and pull the result into C code:

/* `ws` is the worksheet from the previous step */
var csv = XLSX.utils.sheet_to_csv(ws);

:::

  1. Create a references to XLSX.utils and XLSX.utils.sheet_to_csv:
/* obtain reference to XLSX.utils.sheet_to_csv */
JSValue utils = JS_GetPropertyStr(ctx, XLSX, "utils");
JSValue sheet_to_csv = JS_GetPropertyStr(ctx, utils, "sheet_to_csv");
  1. Create arguments array:
/* prepare arguments */
JSValue args[] = { ws };
  1. Use JS_Call to call the function and use JS_ToCStringLen to pull the CSV:
JSValue csv = JS_Call(ctx, sheet_to_csv, utils, 1, args);
size_t csvlen; const char *csvstr = JS_ToCStringLen(ctx, &csvlen, csv);

At this point, csvstr is a C string that can be printed to standard output.

Complete Example

The "Integration Example" covers a traditional integration in a C application, while the "CLI Test" demonstrates other concepts using the quickjs CLI tool.

Integration Example

:::note

This demo was tested in the following deployments:

Architecture Git Commit Date
darwin-x64 2788d71 2023-07-24
darwin-arm 2788d71 2023-06-05
linux-x64 2788d71 2023-06-02
win10-x64 2788d71 2023-07-24

When the demo was tested, commit 2788d71 corresponded to the latest release.

:::

  1. Build libquickjs.a:
git clone https://github.com/bellard/quickjs
cd quickjs
git checkout 2788d71
make
cd ..
  1. Copy libquickjs.a and quickjs.h into the working directory:
cp quickjs/libquickjs.a .
cp quickjs/quickjs.h .
  1. Download sheetjs.quick.c:
curl -LO https://docs.sheetjs.com/quickjs/sheetjs.quick.c
  1. Build the sample application:
gcc -o sheetjs.quick -Wall sheetjs.quick.c libquickjs.a -lm

This program tries to parse the file specified by the first argument

  1. Download the standalone script and test file:

{\ curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js curl -LO https://sheetjs.com/pres.numbers}

  1. Run the test program:
./sheetjs.quick pres.numbers

If successful, the program will print the library version number, file size, first worksheet name, and the contents of the first sheet as CSV rows.

CLI Test

:::note

This demo was last tested on 2023 March 11 against QuickJS commit 2788d71.

:::

  1. Ensure quickjs command line utility is installed

  2. Download the standalone script and the test file:

{\ curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js curl -LO https://sheetjs.com/pres.numbers}

  1. Download SheetJSQuick.js
curl -LO https://docs.sheetjs.com/quickjs/SheetJSQuick.js
  1. Test the program:
quickjs SheetJSQuick.js

If successful, the script will generate SheetJSQuick.xlsx.