docs.sheetjs.com/docz/docs/03-demos/42-engines/09_hermes.md
2023-08-29 23:44:38 -04:00


title: Sharing Sheets with Hermes
sidebar_label: C++ + Hermes
description: Process structured data in C++ programs. Seamlessly integrate spreadsheets into your program by pairing Hermes and SheetJS. Handle the most complex files without breaking a sweat.
pagination_prev: demos/bigdata/index
pagination_next: solutions/input

import current from '/version.js'; import CodeBlock from '@theme/CodeBlock';

Hermes is an embeddable JS engine written in C++.

SheetJS is a JavaScript library for reading and writing data from spreadsheets.

This demo uses Hermes and SheetJS to pull data from a spreadsheet and print CSV rows. We'll explore how to load SheetJS in a Hermes context and process spreadsheets from a C++ program.

The "Integration Example" section includes a complete command-line tool for reading data from files.

Integration Details

:::info pass

Many Hermes functions are not documented. The explanation was verified against commit 70af78b.

:::

:::warning pass

The main target for Hermes is React Native. At the time of writing, there was no official documentation for embedding the Hermes engine in C++ programs.

:::

Initialize Hermes

A Hermes engine instance is created with facebook::hermes::makeHermesRuntime:

std::unique_ptr<facebook::jsi::Runtime> rt(facebook::hermes::makeHermesRuntime());

Essential Objects

Hermes does not expose a console or global variable, but they can be synthesized from JS code in the runtime:

  • global can be obtained from a reference to this in an unbound function:
/* create global object */
var global = (function(){ return this; }).call(null);
  • console.log can be constructed from the builtin print function:
/* create a fake `console` from the hermes `print` builtin */
var console = { log: function(x) { print(x); } };

The code can be stored in a C string, prepared with prepareJavaScript and evaluated with evaluatePreparedJavaScript:

const char *init_code =
  /* create global object */
  "var global = (function(){ return this; }).call(null);"
  /* create a fake `console` from the hermes `print` builtin */
  "var console = { log: function(x) { print(x); } };"
  ;
auto src = std::make_shared<facebook::jsi::StringBuffer>(init_code);
auto js = rt->prepareJavaScript(src, std::string("<eval>"));
rt->evaluatePreparedJavaScript(js);
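
The init code above relies on the compiler joining adjacent string literals and discarding the C comments between them. No newlines are inserted, so every JS statement must end with a semicolon. A raw string literal is one alternative that keeps the script readable. The following sketch shows both forms; the names `init_code_joined` and `init_code_raw` are illustrative and not part of the demo source:

```cpp
#include <string>

/* Adjacent literals are concatenated at compile time; the C comments vanish */
static const std::string init_code_joined =
  /* create global object */
  "var global = (function(){ return this; }).call(null);"
  /* create a fake `console` from the hermes `print` builtin */
  "var console = { log: function(x) { print(x); } };"
  ;

/* A raw string literal keeps newlines, so the comments become JS comments */
static const std::string init_code_raw = R"js(
/* create global object */
var global = (function(){ return this; }).call(null);
/* create a fake `console` from the hermes `print` builtin */
var console = { log: function(x) { print(x); } };
)js";
```

Either string could be handed to the StringBuffer constructor in place of `init_code`.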

:::info Exception handling

Standard C++ exception handling patterns are used in Hermes integration code. The base class for Hermes exceptions is facebook::jsi::JSIException:

try {
  const char *init_code = "...";
  auto src = std::make_shared<facebook::jsi::StringBuffer>(init_code);
  auto js = rt->prepareJavaScript(src, std::string("<eval>"));
  rt->evaluatePreparedJavaScript(js);
} catch (const facebook::jsi::JSIException &e) {
  std::cerr << "JavaScript exception: " << e.what() << std::endl;
  return 1;
}

:::
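
When several steps need the same try/catch pattern, the handling can be factored into a small helper. The sketch below assumes that facebook::jsi::JSIException derives from std::exception (as declared in jsi.h), so it catches the standard base class and compiles without Hermes headers. `run_guarded` is a hypothetical name, not a Hermes API:

```cpp
#include <exception>
#include <iostream>
#include <stdexcept>
#include <utility>

/* Run `work` and translate any std::exception into exit code 1.
   `run_guarded` is a hypothetical helper, not part of Hermes. */
template <typename F>
int run_guarded(F &&work) {
  try {
    std::forward<F>(work)();
    return 0;
  } catch (const std::exception &e) {
    /* assumption: in a Hermes build, JSIException subclasses land here */
    std::cerr << "JavaScript exception: " << e.what() << std::endl;
    return 1;
  }
}
```

In a program like the one above, the body of the `try` block would move into a lambda passed to `run_guarded`, and `main` would return the result.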

Load SheetJS Scripts

SheetJS Standalone scripts can be parsed and evaluated in a Hermes context.

The main library can be loaded by reading the script from the file system and evaluating in the Hermes context.

:::tip pass

Nonstandard tricks can embed the entire script in the binary, and there are C++ language proposals such as #embed (mirroring the C23 feature of the same name).

For simplicity, the examples read the script file from the filesystem.

:::

Reading scripts from the filesystem

For the purposes of this demo, the standard C <stdio.h> functions are used:

static char *read_file(const char *filename, size_t *sz) {
  FILE *f = fopen(filename, "rb");
  if(!f) return NULL;
  long fsize; { fseek(f, 0, SEEK_END); fsize = ftell(f); fseek(f, 0, SEEK_SET); }
  char *buf = (char *)malloc(fsize * sizeof(char));
  *sz = fread((void *) buf, 1, fsize, f);
  fclose(f);
  return buf;
}

// ...
  /* read SheetJS library from filesystem */
  size_t sz; char *xlsx_full_min_js = read_file("xlsx.full.min.js", &sz);
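
The C helper above returns a malloc'd buffer that the caller must eventually free, and it does not check the results of ftell or malloc. As a rough alternative, the same read can be sketched with C++ streams. `read_file_cpp` is an illustrative name and is not used in the demo source:

```cpp
#include <fstream>
#include <iterator>
#include <string>

/* Read an entire file into `out`; returns false if the file cannot be read.
   `read_file_cpp` is an illustrative name, not part of the demo source. */
static bool read_file_cpp(const std::string &filename, std::string &out) {
  std::ifstream f(filename, std::ios::binary);
  if(!f) return false;
  /* binary mode keeps NUL bytes and line endings untouched */
  out.assign(std::istreambuf_iterator<char>(f), std::istreambuf_iterator<char>());
  return !f.bad();
}
```

The std::string owns the bytes, so no explicit free call is needed when it goes out of scope.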

Hermes Wrapper

Hermes does not provide a friendly way to prepare JavaScript code stored in a standard heap-allocated C string. Fortunately a wrapper can be created:

/* Unfortunately the library provides no C-friendly Buffer classes */
class CBuffer : public facebook::jsi::Buffer {
  public:
    CBuffer(const uint8_t *data, size_t size) : buf(data), sz(size) {}
    size_t size() const override { return sz; }
    const uint8_t *data() const override { return buf; }

  private:
    const uint8_t *buf;
    size_t sz;
};

// ...
  /* load SheetJS library */
  auto src = std::make_shared<CBuffer>(CBuffer((uint8_t *)xlsx_full_min_js, sz));
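
CBuffer stores a raw pointer and does not own the memory, so the script text must stay alive for as long as Hermes may read from the buffer. One way to sidestep the lifetime question is a buffer that owns its data. The sketch below uses `BufferLike`, a stand-in with the same two virtual methods that CBuffer overrides, so that it compiles without Hermes headers; real code would derive from facebook::jsi::Buffer. `OwningBuffer` is a hypothetical name:

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

/* Stand-in mirroring the methods CBuffer overrides; an assumption made so the
   sketch builds without Hermes. Real code derives from facebook::jsi::Buffer. */
class BufferLike {
  public:
    virtual ~BufferLike() = default;
    virtual size_t size() const = 0;
    virtual const uint8_t *data() const = 0;
};

/* Hypothetical variant of CBuffer that owns the script text */
class OwningBuffer : public BufferLike {
  public:
    explicit OwningBuffer(std::string text) : text_(std::move(text)) {}
    size_t size() const override { return text_.size(); }
    const uint8_t *data() const override {
      return reinterpret_cast<const uint8_t *>(text_.data());
    }

  private:
    std::string text_; /* released automatically with the buffer */
};
```

With this shape, `std::make_shared<OwningBuffer>(std::move(script))` would replace the CBuffer construction, and the shared pointer keeps the text alive.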

Evaluating SheetJS Library Code

The code wrapper can be "prepared" with prepareJavaScript and "evaluated" with evaluatePreparedJavaScript.

The second argument to prepareJavaScript is a C++ std::string that holds the source URL. Typically a name like xlsx.full.min.js helps distinguish SheetJS library exceptions from other parts of the application.

  auto js = rt->prepareJavaScript(src, std::string("xlsx.full.min.js"));
  rt->evaluatePreparedJavaScript(js);

Testing

If the library is loaded, XLSX.version will be a string. This string can be pulled into the main C++ program.

The evaluatePreparedJavaScript method returns a facebook::jsi::Value object that represents the result:

/* evaluate XLSX.version and capture the result */
auto src = std::make_shared<facebook::jsi::StringBuffer>("XLSX.version");
auto js = rt->prepareJavaScript(src, std::string("<eval>"));
facebook::jsi::Value jsver = rt->evaluatePreparedJavaScript(js);

The getString method extracts the string value and returns an internal string object (facebook::jsi::String). Given that string object, the utf8 method returns a proper C++ std::string that can be printed:

/* pull the version string into C++ code and print */
facebook::jsi::String jsstr = jsver.getString(*rt);
std::string cppver = jsstr.utf8(*rt);
std::cout << "SheetJS version " << cppver << std::endl;

Reading Files

Typically C++ code will read files and Hermes will project the data in the JS engine as an ArrayBuffer. SheetJS libraries can parse ArrayBuffer data.

Standard SheetJS operations can pick the first worksheet and generate CSV string data from the worksheet. Hermes provides methods to convert the JS strings back to std::string objects for further processing in C++.

:::note

It is strongly recommended to create a stub function to perform the entire workflow in JS code and pass the final result back to C++.

:::

Hermes Wrapper

Hermes supports ArrayBuffer but has no simple helper to read raw memory. Libraries are expected to implement MutableBuffer:

/* ArrayBuffer constructor expects MutableBuffer */
class CMutableBuffer : public facebook::jsi::MutableBuffer {
  public:
    CMutableBuffer(uint8_t *data, size_t size) : buf(data), sz(size) {}
    size_t size() const override { return sz; }
    uint8_t *data() override { return buf; }

  private:
    uint8_t *buf;
    size_t sz;
};

A facebook::jsi::ArrayBuffer object can be created using the wrapper:

/* load payload as ArrayBuffer */
size_t sz; char *data = read_file("pres.xlsx", &sz);
auto payload = std::make_shared<CMutableBuffer>(CMutableBuffer((uint8_t *)data, sz));
auto ab = facebook::jsi::ArrayBuffer(*rt, payload);
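
CMutableBuffer does not free the block returned by read_file. A variant that takes ownership of the malloc'd memory can be sketched with std::unique_ptr and a custom deleter. As before, `MutableBufferLike` is a stand-in with the same two virtual methods that CMutableBuffer overrides, used so the sketch compiles without Hermes headers; `FreeDeleter` and `MallocBuffer` are hypothetical names:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>

/* Stand-in mirroring the methods CMutableBuffer overrides; an assumption so
   the sketch builds without Hermes. Real code derives from
   facebook::jsi::MutableBuffer. */
class MutableBufferLike {
  public:
    virtual ~MutableBufferLike() = default;
    virtual size_t size() const = 0;
    virtual uint8_t *data() = 0;
};

/* Deleter that releases memory obtained from malloc */
struct FreeDeleter {
  void operator()(void *p) const { std::free(p); }
};

/* Hypothetical variant of CMutableBuffer that owns a malloc'd block */
class MallocBuffer : public MutableBufferLike {
  public:
    MallocBuffer(uint8_t *data, size_t size) : buf(data), sz(size) {}
    size_t size() const override { return sz; }
    uint8_t *data() override { return buf.get(); }

  private:
    std::unique_ptr<uint8_t, FreeDeleter> buf; /* calls free on destruction */
    size_t sz;
};
```

The block is freed when the last shared pointer to the buffer is released, so the caller must not free it separately.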

SheetJS Operations

In this example, the goal is to pull the first worksheet and generate CSV rows.

XLSX.read[^1] parses the ArrayBuffer and returns a SheetJS workbook object:

var wb = XLSX.read(buf);

The SheetNames property[^2] is an array of the sheet names in the workbook. The first sheet name can be obtained with the following JS snippet:

var first_sheet_name = wb.SheetNames[0];

The Sheets property[^3] is an object whose keys are sheet names and whose values are worksheet objects. The first worksheet is looked up by name:

var first_sheet = wb.Sheets[first_sheet_name];

The sheet_to_csv utility function[^4] generates a CSV string from the sheet:

var csv = XLSX.utils.sheet_to_csv(first_sheet);

C++ integration code

:::note pass

The stub function will be passed an ArrayBuffer object:

function(buf) {
  /* `buf` will be an ArrayBuffer */
  var wb = XLSX.read(buf);
  return XLSX.utils.sheet_to_csv(wb.Sheets[wb.SheetNames[0]]);
}

:::

The result after evaluating the stub is a facebook::jsi::Value object:

/* define stub function to read and convert first sheet to CSV */
auto src = std::make_shared<facebook::jsi::StringBuffer>(
  "(function(buf) {"
    "var wb = XLSX.read(buf);"
    "return XLSX.utils.sheet_to_csv(wb.Sheets[wb.SheetNames[0]]);"
  "})"
);
auto js = rt->prepareJavaScript(src, std::string("<eval>"));
facebook::jsi::Value funcval = rt->evaluatePreparedJavaScript(js);

To call this function, the opaque Value must be converted to a Function:

facebook::jsi::Function func = funcval.asObject(*rt).asFunction(*rt);

The Function exposes a call method to perform the function invocation. The stub accepts an ArrayBuffer argument:

/* call stub function and capture result */
facebook::jsi::Value csv = func.call(*rt, ab);

In the same way the library version string was pulled into C++ code, the CSV data can be captured using getString and utf8 methods:

/* interpret as utf8 */
std::string str = csv.getString(*rt).utf8(*rt);
std::cout << str << std::endl;
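
Once the CSV text is in a std::string, further processing is ordinary C++. As a minimal sketch, the rows can be separated at newline characters. This is deliberately naive: a quoted field that contains a newline would be split incorrectly. `split_rows` is a hypothetical helper and is not part of the demo source:

```cpp
#include <sstream>
#include <string>
#include <vector>

/* Split CSV text into rows at newline characters.
   Naive: quoted fields containing newlines are not handled. */
static std::vector<std::string> split_rows(const std::string &csv) {
  std::vector<std::string> rows;
  std::istringstream in(csv);
  std::string line;
  while(std::getline(in, line)) {
    if(!line.empty() && line.back() == '\r') line.pop_back(); /* tolerate CRLF */
    rows.push_back(line);
  }
  return rows;
}
```

Calling `split_rows(str)` on the captured string yields one entry per CSV row.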

Complete Example

The "Integration Example" covers a traditional integration in a C++ application, while the "CLI Test" demonstrates other concepts using the hermes CLI tool.

Integration Example

:::note

This demo was tested in the following deployments:

| Architecture | Git Commit | Date       |
|:-------------|:-----------|:-----------|
| darwin-x64   | 70af78b    | 2023-08-27 |
| darwin-arm   | 869312f    | 2023-06-05 |
| linux-x64    | 70af78b    | 2023-08-27 |

:::

  1. Install dependencies

Installation Notes

The official guidance[^5] has been verified on macOS and HoloOS (Linux).

On macOS:

brew install icu4c cmake ninja

On HoloOS (and other Arch Linux distros):

sudo pacman -Syu cmake git ninja icu python zip readline

  2. Make a project directory:
mkdir sheetjs-hermes
cd sheetjs-hermes
  3. Download the Makefile:
curl -LO https://docs.sheetjs.com/hermes/Makefile
  4. Download sheetjs-hermes.cpp:
curl -LO https://docs.sheetjs.com/hermes/sheetjs-hermes.cpp
  5. Build the library (this is the init target):
make init
  6. Build the application:
make sheetjs-hermes
  7. Download the standalone script and test file:
curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js
curl -LO https://sheetjs.com/pres.numbers
  8. Run the application:
./sheetjs-hermes pres.numbers

If successful, the program will print the library version number and the contents of the first sheet as CSV rows.

CLI Test

:::note

This demo was last tested on 2023 August 27 against Hermes version 0.11.0.

:::

Due to limitations of the standalone binary, this demo will encode a test file as a Base64 string and directly add it to an amalgamated script.

  1. Install the hermes command line tool.

  2. Download the standalone script and test file:
curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js
curl -LO https://sheetjs.com/pres.numbers
  3. Bundle the test file and create payload.js:
node -e "fs.writeFileSync('payload.js', 'var payload = \"' + fs.readFileSync('pres.numbers').toString('base64') + '\";')"
  4. Create support scripts:
  • global.js creates a global variable and defines a fake console:
var global = (function(){ return this; }).call(null);
var console = { log: function(x) { print(x); } };
  • hermes.js will call XLSX.read and XLSX.utils.sheet_to_csv:
var wb = XLSX.read(payload, {type:'base64'});
console.log(XLSX.utils.sheet_to_csv(wb.Sheets[wb.SheetNames[0]]));
  5. Create the amalgamation xlsx.hermes.js:
cat global.js xlsx.full.min.js payload.js hermes.js > xlsx.hermes.js

The final script defines global before loading the standalone library. Once ready, it will read the bundled test data and print the contents as CSV.

  6. Run the script using the Hermes standalone binary:
hermes xlsx.hermes.js

If successful, the script will print CSV data from the test file.
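
The payload.js step in the CLI test uses Node.js to Base64-encode the test file. For environments without Node.js, the same one-line script could be generated from C++. The following is a minimal sketch of a standard Base64 encoder (RFC 4648 alphabet with `=` padding); `base64_encode` and `make_payload_js` are hypothetical helpers and are not part of the demo:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

/* Standard Base64 (RFC 4648) encoder with '=' padding */
static std::string base64_encode(const std::string &in) {
  static const char tbl[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  /* read one input byte as an unsigned value */
  auto byte = [&in](size_t k) -> uint32_t { return static_cast<uint8_t>(in[k]); };
  std::string out;
  out.reserve(((in.size() + 2) / 3) * 4);
  size_t i = 0;
  for(; i + 2 < in.size(); i += 3) { /* full 3-byte groups */
    uint32_t n = (byte(i) << 16) | (byte(i + 1) << 8) | byte(i + 2);
    out += tbl[(n >> 18) & 63]; out += tbl[(n >> 12) & 63];
    out += tbl[(n >> 6) & 63];  out += tbl[n & 63];
  }
  if(i + 1 == in.size()) { /* one trailing byte */
    uint32_t n = byte(i) << 16;
    out += tbl[(n >> 18) & 63]; out += tbl[(n >> 12) & 63]; out += "==";
  } else if(i + 2 == in.size()) { /* two trailing bytes */
    uint32_t n = (byte(i) << 16) | (byte(i + 1) << 8);
    out += tbl[(n >> 18) & 63]; out += tbl[(n >> 12) & 63];
    out += tbl[(n >> 6) & 63];  out += '=';
  }
  return out;
}

/* Build the same one-line script that the Node.js command writes to payload.js */
static std::string make_payload_js(const std::string &file_bytes) {
  return "var payload = \"" + base64_encode(file_bytes) + "\";";
}
```

Writing the result of `make_payload_js` to payload.js would produce a file in the same format that hermes.js reads with the `{type:'base64'}` option.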