docs.sheetjs.com/docz/docs/03-demos/12-engines/02_v8.md
2023-06-03 05:10:50 -04:00

---
title: C++ + V8
pagination_prev: demos/bigdata/index
pagination_next: solutions/input
---

import current from '/version.js';
import CodeBlock from '@theme/CodeBlock';

V8 is an embeddable JS engine written in C++. It powers Chromium and Chrome, NodeJS and Deno, Adobe UXP and other platforms.

The SheetJS Standalone scripts can be parsed and evaluated in a V8 context.

Integration Details

Initialize V8

The official V8 hello-world example covers initialization and cleanup. For the purposes of this demo, the key variables are noted below:

v8::Isolate* isolate = v8::Isolate::New(create_params);
v8::Local<v8::Context> context = v8::Context::New(isolate);

The following helper function evaluates C strings as JS code:

v8::Local<v8::Value> eval_code(v8::Isolate *i, v8::Local<v8::Context> c, const char* code) {
  /* `code` must be a NUL-terminated UTF-8 string */
  v8::Local<v8::String> source = v8::String::NewFromUtf8(i, code).ToLocalChecked();
  /* scripts are compiled against the context, not the isolate */
  v8::Local<v8::Script> script = v8::Script::Compile(c, source).ToLocalChecked();
  return script->Run(c).ToLocalChecked();
}

Load SheetJS Scripts

The main library can be loaded by reading the scripts from the file system and evaluating in the V8 context:

/* simple wrapper to read the entire script file as a NUL-terminated string */
static char *read_file(const char *filename, size_t *sz) {
  FILE *f = fopen(filename, "rb");
  if(!f) return NULL;
  long fsize; { fseek(f, 0, SEEK_END); fsize = ftell(f); fseek(f, 0, SEEK_SET); }
  /* allocate one extra byte so the buffer can be passed as a C string */
  char *buf = (char *)malloc((fsize + 1) * sizeof(char));
  if(!buf) { fclose(f); return NULL; }
  *sz = fread((void *) buf, 1, fsize, f);
  buf[*sz] = 0;
  fclose(f);
  return buf;
}

// ...
  size_t sz; char *file = read_file("xlsx.full.min.js", &sz);
  v8::Local<v8::Value> result = eval_code(isolate, context, file);

To confirm the library is loaded, XLSX.version can be inspected:

  /* get version string */
  v8::Local<v8::Value> result = eval_code(isolate, context, "XLSX.version");
  v8::String::Utf8Value vers(isolate, result);
  printf("SheetJS library version %s\n", *vers);

Reading Files

V8 supports ArrayBuffer natively. Assuming buf is a C byte array, with length len, this snippet stores the data as an ArrayBuffer in global scope:

/* load C char array and save to an ArrayBuffer */
std::unique_ptr<v8::BackingStore> back = v8::ArrayBuffer::NewBackingStore(isolate, len);
memcpy(back->Data(), buf, len);
v8::Local<v8::ArrayBuffer> ab = v8::ArrayBuffer::New(isolate, std::move(back));
v8::Maybe<bool> res = context->Global()->Set(context, v8::String::NewFromUtf8Literal(isolate, "buf"), ab);

/* parse with SheetJS */
v8::Local<v8::Value> result = eval_code(isolate, context, "globalThis.wb = XLSX.read(buf)");

wb will be a variable in the JS environment that can be inspected using the various SheetJS API functions.

Writing Files

The underlying memory from an ArrayBuffer can be recovered:

/* write with SheetJS using type: "array" */
v8::Local<v8::Value> result = eval_code(isolate, context, "XLSX.write(wb, {type:'array', bookType:'xlsb'})");

/* pull result back to C++ */
v8::Local<v8::ArrayBuffer> ab = v8::Local<v8::ArrayBuffer>::Cast(result);
size_t sz = ab->ByteLength();
/* Data() returns a void pointer, so an explicit cast is required in C++ */
char *buf = (char *)ab->Data();

The resulting buf can be written to file with fwrite.

Complete Example

:::note

This demo was tested in the following deployments:

| V8 Version | Platform | OS Version | Compiler | Date |
|:--|:--|:--|:--|:--|
| 11.3.244.11 | darwin-x64 | macOS 13.2 | clang 14.0.3 | 2023-05-20 |
| 11.3.244.11 | linux-x64 | HoloOS 3.4.6 | gcc 12.2.0 | 2023-05-20 |

:::

This program parses a file and prints CSV data from the first worksheet. It also generates an XLSB file and writes to the filesystem.

:::caution

When the demo was last tested, there were errors in the official V8 embed guide. The correct instructions are included below.

:::

:::caution

The build process is long and will test your patience.

:::

Preparation

  1. Download and install depot_tools:
mkdir -p /usr/local/lib
cd /usr/local/lib
git clone https://chromium.googlesource.com/chromium/tools/depot_tools.git
  1. Add the path to the PATH environment variable:
export PATH="/usr/local/lib/depot_tools:$PATH"

At this point, it is strongly recommended to add the line to a shell startup script such as .bashrc or .zshrc.

  1. Run gclient once to update depot_tools:
gclient

Clone V8

  1. Create a base directory and fetch the V8 source:
mkdir -p ~/dev/v8
cd ~/dev/v8
fetch v8
cd v8

Note that the actual repo will be placed in ~/dev/v8/v8.

  1. Check out the desired version. The following command pulls 11.3.244.11:
git checkout refs/tags/11.3.244.11 -b sample -t

Build V8

  1. Build the static library:
tools/dev/v8gen.py x64.release.sample
ninja -C out.gn/x64.release.sample v8_monolith
  1. Ensure the sample hello-world compiles and runs:
g++ -I. -Iinclude samples/hello-world.cc -o hello_world -fno-rtti -lv8_monolith \
    -lv8_libbase -lv8_libplatform -ldl -Lout.gn/x64.release.sample/obj/ -pthread \
    -std=c++17 -DV8_COMPRESS_POINTERS=1 -DV8_ENABLE_SANDBOX
./hello_world

Prepare Project

  1. Make a new project folder:
cd ~/dev
mkdir sheetjs-v8
cd sheetjs-v8
  1. Copy the sample source:
cp ~/dev/v8/v8/samples/hello-world.cc .
  1. Create symbolic links to the include headers and obj library folders:
ln -s ~/dev/v8/v8/include
ln -s ~/dev/v8/v8/out.gn/x64.release.sample/obj
  1. Build and run the hello-world example from this folder:
g++ -I. -Iinclude hello-world.cc -o hello_world -fno-rtti -lv8_monolith \
    -lv8_libbase -lv8_libplatform -ldl -Lobj/ -pthread -std=c++17 \
    -DV8_COMPRESS_POINTERS=1 -DV8_ENABLE_SANDBOX
./hello_world

Add SheetJS

  1. Download the standalone script and test file:

curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js
curl -LO https://sheetjs.com/pres.numbers

  1. Download sheetjs.v8.cc:
curl -LO https://docs.sheetjs.com/v8/sheetjs.v8.cc
  1. Compile the standalone sheetjs.v8 binary:
g++ -I. -Iinclude sheetjs.v8.cc -o sheetjs.v8 -fno-rtti -lv8_monolith \
    -lv8_libbase -lv8_libplatform -ldl -Lobj/ -pthread -std=c++17 \
    -DV8_COMPRESS_POINTERS=1 -DV8_ENABLE_SANDBOX
  1. Run the demo:
./sheetjs.v8 pres.numbers

If the program succeeded, the CSV contents will be printed to the console and the file sheetjsw.xlsb will be created. That file can be opened with Excel.
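
XLSB workbooks are stored in a ZIP container, so a quick sanity check before opening the file in Excel is to confirm that it starts with the ZIP local file header signature. The following is a minimal sketch, assuming a hypothetical helper named `looks_like_zip` that is not part of the demo source:

```cpp
#include <cstdio>
#include <cstring>

/* hypothetical helper: returns true if the file starts with the
   ZIP local file header signature "PK\x03\x04" */
static bool looks_like_zip(const char *filename) {
  FILE *f = fopen(filename, "rb");
  if(!f) return false;
  unsigned char magic[4];
  size_t n = fread(magic, 1, sizeof(magic), f);
  fclose(f);
  static const unsigned char zip_sig[4] = { 0x50, 0x4B, 0x03, 0x04 };
  return n == sizeof(magic) && memcmp(magic, zip_sig, sizeof(zip_sig)) == 0;
}
```

Calling `looks_like_zip("sheetjsw.xlsb")` only verifies the first four bytes. It does not prove that the workbook is valid.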

Bindings

Bindings exist for many languages. As these bindings require "native" code, they may not work on every platform.

Rust

The v8 crate provides binary builds and straightforward bindings. The Rust code is similar to the C++ code.

Pulling data from an ArrayBuffer back into Rust involves an unsafe operation:

/* assuming JS code returns an ArrayBuffer, copy result to a Vec<u8> */
fn eval_code_ab(scope: &mut v8::HandleScope, code: &str) -> Vec<u8> {
  let source = v8::String::new(scope, &code).unwrap();
  let script = v8::Script::compile(scope, source, None).unwrap();
  let result: v8::Local<v8::ArrayBuffer> = script.run(scope).unwrap().try_into().unwrap();
  /* In C++, `Data` returns a pointer. Collecting data into Vec<u8> is unsafe */
  unsafe { return std::slice::from_raw_parts_mut(
    result.data().unwrap().cast::<u8>().as_ptr(),
    result.byte_length()
  ).to_vec(); }
}

:::note

This demo was last tested in the following deployments:

| Architecture | V8 Crate | Date |
|:--|:--|:--|
| darwin-x64 | 0.71.2 | 2023-05-22 |
| linux-x64 | 0.71.2 | 2023-05-23 |
| win32-x64 | 0.71.2 | 2023-05-23 |

:::

  1. Create a new project:
cargo new sheetjs-rustyv8
cd sheetjs-rustyv8
cargo run
  1. Add the v8 crate:
cargo add v8
cargo run
  1. Download the Standalone build:

curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js

  1. Download main.rs and replace src/main.rs:
curl -L -o src/main.rs https://docs.sheetjs.com/v8/main.rs
  1. Download the test file and run:
curl -LO https://sheetjs.com/pres.numbers
cargo run pres.numbers

If the program succeeded, the CSV contents will be printed to the console and the file sheetjsw.xlsb will be created. That file can be opened with Excel.