---
title: Blazing Fast Data Processing with V8
sidebar_label: C++ + V8
description: Process structured data in C++ or Rust programs. Seamlessly integrate spreadsheets by pairing V8 and SheetJS. Modernize workflows while maintaining Excel compatibility.
pagination_prev: demos/bigdata/index
pagination_next: solutions/input
---

import current from '/version.js';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import CodeBlock from '@theme/CodeBlock';

[V8](https://v8.dev/) is an embeddable JavaScript engine written in C++. It powers Chromium and Chrome, NodeJS and Deno, Adobe UXP and other platforms.

[SheetJS](https://sheetjs.com) is a JavaScript library for reading and writing data from spreadsheets.

This demo uses V8 and SheetJS to read and write spreadsheets. We'll explore how to load SheetJS in a V8 context and process spreadsheets and structured data from C++ and Rust programs.

The ["Complete Example"](#complete-example) creates a C++ command-line tool for reading spreadsheet files and generating new workbooks. ["Bindings"](#bindings) covers V8 engine bindings for other programming languages.

## Integration Details

The [SheetJS Standalone scripts](/docs/getting-started/installation/standalone) can be parsed and evaluated in a V8 context.

:::note pass

This section describes a flow where the script is parsed and evaluated each time the program is run.

Using V8 snapshots, SheetJS libraries can be parsed and evaluated at build time. This greatly improves program startup time.

The ["Snapshots"](#snapshots) section includes a complete example.

:::

### Initialize V8

The official V8 `hello-world` example covers initialization and cleanup.
For the purposes of this demo, the key variables are noted below:

```cpp
v8::Isolate* isolate = v8::Isolate::New(create_params);
v8::Local<v8::Context> context = v8::Context::New(isolate);
```

The following helper function evaluates C strings as JS code:

```cpp
/* `code` is const so that string literals can be passed directly */
v8::Local<v8::Value> eval_code(v8::Isolate *isolate, v8::Local<v8::Context> context, const char* code, size_t sz = -1) {
  v8::Local<v8::String> source = v8::String::NewFromUtf8(isolate, code, v8::NewStringType::kNormal, sz).ToLocalChecked();
  v8::Local<v8::Script> script = v8::Script::Compile(context, source).ToLocalChecked();
  return script->Run(context).ToLocalChecked();
}
```

### Load SheetJS Scripts

The main library can be loaded by reading the scripts from the file system and evaluating in the V8 context:

```cpp
/* simple wrapper to read the entire script file */
static char *read_file(const char *filename, size_t *sz) {
  FILE *f = fopen(filename, "rb");
  if(!f) return NULL;
  long fsize; { fseek(f, 0, SEEK_END); fsize = ftell(f); fseek(f, 0, SEEK_SET); }
  char *buf = (char *)malloc(fsize * sizeof(char));
  if(!buf) { fclose(f); return NULL; } /* allocation failed */
  *sz = fread((void *) buf, 1, fsize, f);
  fclose(f);
  return buf;
}

// ...
  size_t sz; char *file = read_file("xlsx.full.min.js", &sz);
  v8::Local<v8::Value> result = eval_code(isolate, context, file, sz);
```

To confirm the library is loaded, `XLSX.version` can be inspected:

```cpp
/* get version string */
v8::Local<v8::Value> result = eval_code(isolate, context, "XLSX.version");
v8::String::Utf8Value vers(isolate, result);
printf("SheetJS library version %s\n", *vers);
```

### Reading Files

V8 supports `ArrayBuffer` natively.
Assuming `buf` is a C byte array, with length `len`, the following code stores the data in a global `ArrayBuffer`:

```cpp title="Loading data into an ArrayBuffer in the V8 engine"
/* load C char array and save to an ArrayBuffer */
std::unique_ptr<v8::BackingStore> back = v8::ArrayBuffer::NewBackingStore(isolate, len);
memcpy(back->Data(), buf, len);
v8::Local<v8::ArrayBuffer> ab = v8::ArrayBuffer::New(isolate, std::move(back));
v8::Maybe<bool> res = context->Global()->Set(context, v8::String::NewFromUtf8Literal(isolate, "buf"), ab);
```

Once the raw data is pulled into the engine, the SheetJS `read` method[^1] can parse the data. It is recommended to attach the result to a global variable:

```cpp
/* parse with SheetJS */
v8::Local<v8::Value> result = eval_code(isolate, context, "globalThis.wb = XLSX.read(buf)");
```

`wb`, a SheetJS workbook object[^2], will be a variable in the JS environment that can be inspected using the various SheetJS API functions[^3].

### Writing Files

The SheetJS `write` method[^4] generates file bytes from workbook objects. The `array` type[^5] instructs the library to generate `ArrayBuffer` objects:

```cpp
/* write with SheetJS using type: "array" */
v8::Local<v8::Value> result = eval_code(isolate, context, "XLSX.write(wb, {type:'array', bookType:'xlsb'})");
```

The underlying memory from an `ArrayBuffer` can be pulled from the engine:

```cpp title="Pulling raw bytes from an ArrayBuffer"
/* pull result back to C++ */
v8::Local<v8::ArrayBuffer> ab = v8::Local<v8::ArrayBuffer>::Cast(result);
size_t sz = ab->ByteLength();
/* Data() returns a void pointer, so an explicit cast is required in C++ */
char *buf = static_cast<char *>(ab->Data());
```

The resulting `buf` can be written to file with `fwrite`.
## Complete Example :::note Tested Deployments This demo was tested in the following deployments: | V8 Version | Platform | OS Version | Compiler | Date | |:--------------|:-------------|:--------------|:-----------------|:-----------| | `12.4.253` | `darwin-x64` | macOS 14.4 | `clang 15.0.0` | 2024-03-15 | | `12.7.130` | `darwin-arm` | macOS 14.5 | `clang 15.0.0` | 2024-05-25 | | `12.5.48` | `win10-x64` | Windows 10 | `CL 19.39.33523` | 2024-03-24 | | `12.5.48` | `linux-x64` | HoloOS 3.5.17 | `gcc 13.1.1` | 2024-03-21 | | `12.7.130` | `linux-arm` | Debian 12 | `gcc 12.2.0` | 2024-05-25 | ::: This program parses a file and prints CSV data from the first worksheet. It also generates an XLSB file and writes to the filesystem. :::caution pass When the demo was last tested, there were errors in the official V8 embed guide. Corrected instructions are included below. ::: :::danger pass **The build process is long and will test your patience.** ::: ### Preparation 0) Prepare `/usr/local/lib`: ```bash mkdir -p /usr/local/lib cd /usr/local/lib ``` :::note pass If this step throws a permission error, run the following commands: ```bash sudo mkdir -p /usr/local/lib sudo chmod 777 /usr/local/lib ``` ::: 0) Follow the official ["Visual Studio"](https://chromium.googlesource.com/chromium/src/+/master/docs/windows_build_instructions.md#visual-studio) installation steps. :::info pass Using the installer tool, the "Desktop development with C++" workload must be installed. In the sidebar, verify the following components are checked: - "C++ ATL for latest ... build tools" (`v143` when last tested) - "C++ MFC for latest ... build tools" (`v143` when last tested) In the "Individual components" tab, search for "Windows 11 SDK" and verify that "Windows 11 SDK (10.0.22621.0)" is checked. Click "Modify" and allow the installer to finish. The SDK debugging tools must be installed after the SDK is installed. 1) Using the Search bar, search "Apps & features". 
2) When the setting panel opens, scroll down to "Windows Software Development Kit - Windows 10.0.22621 and click "Modify". 3) In the new window, select "Change" and click "Next" 4) Check "Debugging Tools for Windows" and click "Change" ::: The following `git` settings should be changed: ```bash git config --global core.autocrlf false git config --global core.filemode false git config --global branch.autosetuprebase always ``` 1) Download and install `depot_tools`: ```bash git clone https://chromium.googlesource.com/chromium/tools/depot_tools.git ``` :::note pass If this step throws a permission error, run the following commands and retry: ```bash sudo mkdir -p /usr/local/lib sudo chmod 777 /usr/local/lib ``` ::: [The bundle](https://storage.googleapis.com/chrome-infra/depot_tools.zip) is a ZIP file that should be downloaded and extracted. The demo was last tested on an exFAT-formatted drive (mounted at `E:\`). After extracting, verify that the `depot_tools` folder is not read-only. 2) Add the path to the `PATH` environment variable: ```bash export PATH="/usr/local/lib/depot_tools:$PATH" ``` At this point, it is strongly recommended to add the line to a shell startup script such as `.bashrc` or `.zshrc` :::caution pass These instructions are for `cmd` use. Do not run in PowerShell! It is strongly recommended to use the "Developer Command Prompt" from Visual Studio as it prepares the console to run build tools. ::: ```bash set DEPOT_TOOLS_WIN_TOOLCHAIN=0 set PATH=E:\depot_tools;%PATH% ``` In addition, the `vs2022_install` variable must be set to the Visual Studio folder. For example, using the "Community Edition", the assignment should be ```bash set vs2022_install="C:\Program Files\Microsoft Visual Studio\2022\Community" ``` These environment variables can be persisted in the Control Panel. 
3) Run `gclient` once to update `depot_tools`: ```bash gclient ``` ```bash gclient ``` :::caution pass `gclient` may throw errors related to `git` and permissions issues: ``` fatal: detected dubious ownership in repository at 'E:/depot_tools' 'E:/depot_tools' is on a file system that doesnot record ownership To add an exception for this directory, call: git config --global --add safe.directory E:/depot_tools ``` These issues are related to the exFAT file system. They were resolved by running the recommended commands and re-running `gclient`. ::: :::caution pass There were errors pertaining to `gitconfig`: ``` error: could not write config file E:/depot_tools/bootstrap-2@3_8_10_chromium_26_bin/git/etc/gitconfig: File exists ``` This can happen if the `depot_tools` folder is read-only. The workaround is to unset the read-only flag for the `E:\depot_tools` folder. ::: ### Clone V8 4) Create a base directory: ```bash mkdir -p ~/dev/v8 cd ~/dev/v8 fetch v8 cd v8 ``` Note that the actual repo will be placed in `~/dev/v8/v8`. ```bash cd E:\ mkdir v8 cd v8 fetch v8 cd v8 ``` :::caution pass On exFAT, every cloned repo elicited the same `git` permissions error. `fetch` will fail with a clear remedy message such as ``` git config --global --add safe.directory E:/v8/v8 ``` Run the command then run `gclient sync`, repeating each time the command fails. ::: :::caution pass There were occasional `git` conflict errors: ``` v8/tools/clang (ERROR) ---------------------------------------- [0:00:01] Started. ... error: Your local changes to the following files would be overwritten by checkout: plugins/FindBadRawPtrPatterns.cpp ... Please commit your changes or stash them before you switch branches. Aborting error: could not detach HEAD ---------------------------------------- Error: 28> Unrecognized error, please merge or rebase manually. 
28> cd E:\v8\v8\tools\clang && git rebase --onto 65ceb79efbc9d1dec9b1a0f4bc0b8d010b9d7a66 refs/remotes/origin/main ``` The recommended fix is to delete the referenced folder and re-run `gclient sync` ::: 5) Checkout the desired version. The following command pulls `12.7.130`: ```bash git checkout tags/12.7.130 -b sample ``` :::caution pass The official documentation recommends: ```bash git checkout refs/tags/12.7.130 -b sample -t ``` This command failed in local testing: ``` E:\v8\v8>git checkout refs/tags/12.7.130 -b sample -t fatal: cannot set up tracking information; starting point 'refs/tags/12.7.130' is not a branch ``` ::: ### Build V8 6) Build the static library. ```bash tools/dev/v8gen.py x64.release.sample ninja -C out.gn/x64.release.sample v8_monolith ``` ```bash tools/dev/v8gen.py arm64.release.sample ninja -C out.gn/arm64.release.sample v8_monolith ``` ```bash tools/dev/v8gen.py x64.release.sample ninja -C out.gn/x64.release.sample v8_monolith ``` :::note pass In some Linux x64 tests using GCC 12, there were build errors that stemmed from warnings. The error messages included the tag `-Werror`: ``` ../../src/compiler/turboshaft/wasm-gc-type-reducer.cc:212:18: error: 'back_insert_iterator' may not intend to support class template argument deduction [-Werror,-Wctad-maybe-unsupported] 212 | std::back_insert_iterator(snapshots), [this](Block* pred) { | ^ ../../build/linux/debian_bullseye_amd64-sysroot/usr/lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/stl_iterator.h:596:11: note: add a deduction guide to suppress this warning 596 | class back_insert_iterator | ^ 1 error generated. ``` This was resolved by manually editing `out.gn/x64.release.sample/args.gn`. 
The option `treat_warnings_as_errors` should be set to `false`: ```ninja title="out.gn/x64.release.sample/args.gn (add to file)" treat_warnings_as_errors = false ``` ::: ```bash tools/dev/v8gen.py arm64.release.sample ``` Append the following lines to `out.gn/arm64.release.sample/args.gn`: ```text title="out.gn/arm64.release.sample/args.gn (add to file)" is_clang = false treat_warnings_as_errors = false ``` Run the build: ```bash ninja -C out.gn/arm64.release.sample v8_monolith ``` :::caution pass When this demo was last tested, an assertion failed: ``` ../../src/base/small-vector.h: In instantiation of ‘class v8::base::SmallVector, 16>’: ../../src/compiler/turboshaft/loop-unrolling-reducer.h:577:11: required from here ../../src/base/macros.h:215:55: error: static assertion failed: T should be trivially copyable 215 | static_assert(::v8::base::is_trivially_copyable::value, \ | ^~~~~ ../../src/base/small-vector.h:25:3: note: in expansion of macro ‘ASSERT_TRIVIALLY_COPYABLE’ 25 | ASSERT_TRIVIALLY_COPYABLE(T); | ^~~~~~~~~~~~~~~~~~~~~~~~~ ``` The build passed after disabling the assertions: ```cpp title="src/base/small-vector.h (edit highlighted lines)" class SmallVector { // Currently only support trivially copyable and trivially destructible data // types, as it uses memcpy to copy elements and never calls destructors. // highlight-start //ASSERT_TRIVIALLY_COPYABLE(T); //static_assert(std::is_trivially_destructible::value); // highlight-end public: static constexpr size_t kInlineSize = kSize; ``` ::: ```bash python3 tools\dev\v8gen.py -vv x64.release.sample ninja -C out.gn\x64.release.sample v8_monolith ``` :::caution pass In local testing, the build sometimes failed with a `dbghelp.dll` error: ``` Exception: dbghelp.dll not found in "C:\Program Files (x86)\Windows Kits\10\Debuggers\x64\dbghelp.dll" ``` This issue was fixed by removing and reinstalling "Debugging Tools for Windows" from the Control Panel as described in step 0. 
::: :::caution pass In local testing, the `ninja` build failed with C++ deprecation errors: ```c++ ../..\src/wasm/wasm-code-manager.h(670,28): error: 'atomic_load>' is deprecated: warning STL4029: std::atomic_*() overloads for shared_ptr are deprecated in C++20. The shared_ptr specialization of std::atomic provides superior functionality. You can define _SILENCE_CXX20_OLD_SHARED_PTR_ATOMIC_SUPPORT_DEPRECATION_WARNING or _SILENCE_ALL_CXX20_DEPRECATION_WARNINGS to suppress this warning. [-Werror,-Wdeprecated-declarations] 670 | auto wire_bytes = std::atomic_load(&wire_bytes_); | ^ C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.37.32822\include\memory(3794,1): note: 'atomic_load>' has been explicitly marked deprecated here 3794 | _CXX20_DEPRECATE_OLD_SHARED_PTR_ATOMIC_SUPPORT _NODISCARD shared_ptr<_Ty> atomic_load( | ^ C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.37.32822\include\yvals_core.h(1317,7): note: expanded from macro '_CXX20_DEPRECATE_OLD_SHARED_PTR_ATOMIC_SUPPORT' 1317 | [[deprecated("warning STL4029: " \ | ^ 2 errors generated. ``` The workaround is to append a line to `out.gn\x64.release.sample\args.gn`: ```text title="out.gn\x64.release.sample\args.gn (add to end)" treat_warnings_as_errors = false ``` After adding the line, run the `ninja` command again: ```bash ninja -C out.gn\x64.release.sample v8_monolith ``` ::: 7) Ensure the sample `hello-world` compiles and runs: ```bash g++ -I. -Iinclude samples/hello-world.cc -o hello_world -fno-rtti -lv8_monolith \ -ldl -Lout.gn/x64.release.sample/obj/ -pthread \ -std=c++17 -DV8_COMPRESS_POINTERS=1 -DV8_ENABLE_SANDBOX ./hello_world ``` :::info pass In older V8 versions, the flags `-lv8_libbase -lv8_libplatform` were required. 
Linking against `libv8_libbase` or `libv8_libplatform` in V8 version `12.4.253` elicited linker errors: ``` ld: multiple errors: unknown file type in '/Users/test/dev/v8/v8/out.gn/x64.release.sample/obj/libv8_libplatform.a'; unknown file type in '/Users/test/dev/v8/v8/out.gn/x64.release.sample/obj/libv8_libbase.a' ``` ::: ```bash g++ -I. -Iinclude samples/hello-world.cc -o hello_world -fno-rtti -lv8_monolith \ -ldl -Lout.gn/arm64.release.sample/obj/ -pthread \ -std=c++17 -DV8_COMPRESS_POINTERS=1 -DV8_ENABLE_SANDBOX ./hello_world ``` :::info pass In older V8 versions, the flags `-lv8_libbase -lv8_libplatform` were required. Linking against `libv8_libbase` or `libv8_libplatform` in V8 version `12.4.253` elicited linker errors: ``` ld: multiple errors: unknown file type in '/Users/test/dev/v8/v8/out.gn/x64.release.sample/obj/libv8_libplatform.a'; unknown file type in '/Users/test/dev/v8/v8/out.gn/x64.release.sample/obj/libv8_libbase.a' ``` ::: ```bash g++ -I. -Iinclude samples/hello-world.cc -o hello_world -fno-rtti -lv8_monolith \ -lv8_libbase -lv8_libplatform -ldl -Lout.gn/x64.release.sample/obj/ -pthread \ -std=c++17 -DV8_COMPRESS_POINTERS=1 -DV8_ENABLE_SANDBOX ./hello_world ``` ```bash g++ -I. -Iinclude samples/hello-world.cc -o hello_world -fno-rtti -lv8_monolith \ -lv8_libbase -lv8_libplatform -ldl -Lout.gn/arm64.release.sample/obj/ -pthread \ -std=c++17 -DV8_COMPRESS_POINTERS=1 -DV8_ENABLE_SANDBOX ./hello_world ``` ```bash cl /I. /Iinclude samples/hello-world.cc /GR- v8_monolith.lib Advapi32.lib Winmm.lib Dbghelp.lib /std:c++17 /DV8_COMPRESS_POINTERS=1 /DV8_ENABLE_SANDBOX /link /out:hello_world.exe /LIBPATH:out.gn\x64.release.sample\obj\ .\hello_world.exe ``` ### Prepare Project 8) Make a new project folder: ```bash cd ~/dev mkdir -p sheetjs-v8 cd sheetjs-v8 ``` ```bash cd E:\ mkdir sheetjs-v8 cd sheetjs-v8 ``` 9) Copy the sample source: ```bash cp ~/dev/v8/v8/samples/hello-world.cc . 
``` 10) Create symbolic links to the `include` headers and `obj` library folders: ```bash ln -s ~/dev/v8/v8/include ln -s ~/dev/v8/v8/out.gn/x64.release.sample/obj ``` ```bash ln -s ~/dev/v8/v8/include ln -s ~/dev/v8/v8/out.gn/arm64.release.sample/obj ``` ```bash ln -s ~/dev/v8/v8/include ln -s ~/dev/v8/v8/out.gn/x64.release.sample/obj ``` ```bash ln -s ~/dev/v8/v8/include ln -s ~/dev/v8/v8/out.gn/arm64.release.sample/obj ``` ```bash copy E:\v8\v8\samples\hello-world.cc .\ ``` 10) Observe that exFAT does not support symbolic links and move on to step 11. 11) Build and run the `hello-world` example from this folder: ```bash g++ -I. -Iinclude hello-world.cc -o hello_world -fno-rtti -lv8_monolith \ -lv8_libbase -lv8_libplatform -ldl -Lobj/ -pthread -std=c++17 \ -DV8_COMPRESS_POINTERS=1 -DV8_ENABLE_SANDBOX ./hello_world ``` :::caution pass In some V8 versions, the command failed in the linker stage: ``` ld: multiple errors: unknown file type in '/Users/test/dev/v8/v8/out.gn/x64.release.sample/obj/libv8_libplatform.a'; unknown file type in '/Users/test/dev/v8/v8/out.gn/x64.release.sample/obj/libv8_libbase.a' ``` The build succeeds after removing `libv8_libbase` and `libv8_libplatform`: ```bash g++ -I. -Iinclude hello-world.cc -o hello_world -fno-rtti -lv8_monolith \ -ldl -Lobj/ -pthread -std=c++17 \ -DV8_COMPRESS_POINTERS=1 -DV8_ENABLE_SANDBOX ./hello_world ``` ::: ```bash cl /MT /I..\v8\v8\ /I..\v8\v8\include hello-world.cc /GR- v8_monolith.lib Advapi32.lib Winmm.lib Dbghelp.lib /std:c++17 /DV8_COMPRESS_POINTERS=1 /DV8_ENABLE_SANDBOX /link /out:hello_world.exe /LIBPATH:..\v8\v8\out.gn\x64.release.sample\obj\ .\hello_world.exe ``` ### Add SheetJS 12) Download the SheetJS Standalone script and test file. 
Save both files in the project directory: {`\ curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js curl -LO https://docs.sheetjs.com/pres.numbers`} 13) Download [`sheetjs.v8.cc`](pathname:///v8/sheetjs.v8.cc): ```bash curl -LO https://docs.sheetjs.com/v8/sheetjs.v8.cc ``` 14) Compile standalone `sheetjs.v8` binary ```bash g++ -I. -Iinclude sheetjs.v8.cc -o sheetjs.v8 -fno-rtti -lv8_monolith \ -lv8_libbase -lv8_libplatform -ldl -Lobj/ -pthread -std=c++17 \ -DV8_COMPRESS_POINTERS=1 -DV8_ENABLE_SANDBOX ``` :::caution pass In some V8 versions, the command failed in the linker stage: ``` ld: multiple errors: unknown file type in '/Users/test/dev/v8/v8/out.gn/x64.release.sample/obj/libv8_libplatform.a'; unknown file type in '/Users/test/dev/v8/v8/out.gn/x64.release.sample/obj/libv8_libbase.a' ``` The build succeeds after removing `libv8_libbase` and `libv8_libplatform`: ```bash g++ -I. -Iinclude sheetjs.v8.cc -o sheetjs.v8 -fno-rtti -lv8_monolith \ -ldl -Lobj/ -pthread -std=c++17 \ -DV8_COMPRESS_POINTERS=1 -DV8_ENABLE_SANDBOX ``` ::: ```bash cl /MT /I..\v8\v8\ /I..\v8\v8\include sheetjs.v8.cc /GR- v8_monolith.lib Advapi32.lib Winmm.lib Dbghelp.lib /std:c++17 /DV8_COMPRESS_POINTERS=1 /DV8_ENABLE_SANDBOX /link /out:sheetjs.v8.exe /LIBPATH:..\v8\v8\out.gn\x64.release.sample\obj\ ``` 15) Run the demo: ```bash ./sheetjs.v8 pres.numbers ``` ```bash .\sheetjs.v8.exe pres.numbers ``` If the program succeeded, the CSV contents will be printed to console and the file `sheetjsw.xlsb` will be created. That file can be opened with Excel. ## Bindings Bindings exist for many languages. As these bindings require "native" code, they may not work on every platform. ### Rust The `v8` crate[^6] provides binary builds and straightforward bindings. The Rust code is similar to the C++ code. 
Pulling data from an `ArrayBuffer` back into Rust involves an unsafe operation:

```rust
/* assuming JS code returns an ArrayBuffer, copy result to a Vec<u8> */
fn eval_code_ab(scope: &mut v8::HandleScope, code: &str) -> Vec<u8> {
  let source = v8::String::new(scope, code).unwrap();
  let script = v8::Script::compile(scope, source, None).unwrap();
  let result: v8::Local<v8::ArrayBuffer> = script.run(scope).unwrap().try_into().unwrap();
  /* In C++, `Data` returns a pointer. Collecting data into Vec<u8> is unsafe */
  unsafe { return std::slice::from_raw_parts_mut(
    result.data().unwrap().cast::<u8>().as_ptr(),
    result.byte_length()
  ).to_vec(); }
}
```

:::note Tested Deployments

This demo was last tested in the following deployments:

| Architecture | V8 Crate | Date       |
|:-------------|:---------|:-----------|
| `darwin-x64` | `0.92.0` | 2024-05-28 |
| `darwin-arm` | `0.92.0` | 2024-05-25 |
| `win10-x64`  | `0.89.0` | 2024-03-24 |
| `linux-x64`  | `0.91.0` | 2024-04-25 |
| `linux-arm`  | `0.92.0` | 2024-05-25 |

:::

1) Create a new project:

```bash
cargo new sheetjs-rustyv8
cd sheetjs-rustyv8
cargo run
```

2) Add the `v8` crate:

```bash
cargo add v8
cargo run
```

3) Download the SheetJS Standalone script and test file. Save both files in the project directory:

<CodeBlock language="bash">{`\
curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js
curl -LO https://docs.sheetjs.com/pres.numbers`}
</CodeBlock>

4) Download [`main.rs`](pathname:///v8/main.rs) and replace `src/main.rs`:

```bash
curl -L -o src/main.rs https://docs.sheetjs.com/v8/main.rs
```

5) Build and run the app:

```bash
cargo run pres.numbers
```

If the program succeeded, the CSV contents will be printed to console and the file `sheetjsw.xlsb` will be created. That file can be opened with Excel.

### Java

[Javet](https://www.caoccao.com/Javet/) is a Java binding to the V8 engine. Javet simplifies conversions between Java data structures and V8 equivalents.

Java byte arrays (`byte[]`) are projected in V8 as `Int8Array`. The SheetJS `read` method expects a `Uint8Array`.
The following script snippet performs a zero-copy conversion: ```js title="Zero-copy conversion from Int8Array to Uint8Array" // assuming `i8` is an Int8Array const u8 = new Uint8Array(i8.buffer, i8.byteOffset, i8.length); ``` :::note Tested Deployments This demo was last tested in the following deployments: | Architecture | V8 Version | Javet | Java | Date | |:-------------|:--------------|:--------|:----------|:-----------| | `darwin-x64` | `12.6.228.13` | `3.1.3` | `22` | 2024-06-19 | | `darwin-arm` | `12.6.228.13` | `3.1.3` | `11.0.23` | 2024-06-19 | | `win10-x64` | `12.6.228.13` | `3.1.3` | `11.0.16` | 2024-06-21 | | `linux-x64` | `12.6.228.13` | `3.1.3` | `17.0.7` | 2024-06-20 | | `linux-arm` | `12.6.228.13` | `3.1.3` | `17.0.11` | 2024-06-20 | ::: 1) Create a new project: ```bash mkdir sheetjs-javet cd sheetjs-javet ``` 2) Download the Javet JAR. There are different archives for different platforms. ```bash curl -LO https://repo1.maven.org/maven2/com/caoccao/javet/javet-macos/3.1.3/javet-macos-3.1.3.jar ``` ```bash curl -LO https://repo1.maven.org/maven2/com/caoccao/javet/javet-macos/3.1.3/javet-macos-3.1.3.jar ``` ```bash curl -LO https://repo1.maven.org/maven2/com/caoccao/javet/javet/3.1.3/javet-3.1.3.jar ``` ```bash curl -LO https://repo1.maven.org/maven2/com/caoccao/javet/javet-linux-arm64/3.1.3/javet-linux-arm64-3.1.3.jar ``` ```bash curl -LO https://repo1.maven.org/maven2/com/caoccao/javet/javet/3.1.3/javet-3.1.3.jar ``` 3) Download the SheetJS Standalone script and test file. 
Save both files in the project directory: {`\ curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js curl -LO https://docs.sheetjs.com/pres.xlsx`} 4) Download [`SheetJSJavet.java`](pathname:///v8/SheetJSJavet.java): ```bash curl -LO https://docs.sheetjs.com/v8/SheetJSJavet.java ``` 5) Build and run the Java application: ```bash javac -cp ".:javet-macos-3.1.3.jar" SheetJSJavet.java java -cp ".:javet-macos-3.1.3.jar" SheetJSJavet pres.xlsx ``` ```bash javac -cp ".:javet-macos-3.1.3.jar" SheetJSJavet.java java -cp ".:javet-macos-3.1.3.jar" SheetJSJavet pres.xlsx ``` ```bash javac -cp ".:javet-3.1.3.jar" SheetJSJavet.java java -cp ".:javet-3.1.3.jar" SheetJSJavet pres.xlsx ``` ```bash javac -cp ".:javet-linux-arm64-3.1.3.jar" SheetJSJavet.java java -cp ".:javet-linux-arm64-3.1.3.jar" SheetJSJavet pres.xlsx ``` ```bash javac -cp ".;javet-3.1.3.jar" SheetJSJavet.java java -cp ".;javet-3.1.3.jar" SheetJSJavet pres.xlsx ``` If the program succeeded, the CSV contents will be printed to console. ### C# [ClearScript](https://microsoft.github.io/ClearScript/) is a .NET interface to the V8 engine. 
C# byte arrays (`byte[]`) must be explicitly converted to JavaScript arrays of bytes:

```csharp
/* read data into a byte array */
byte[] filedata = File.ReadAllBytes("pres.numbers");

// highlight-start
/* generate a JS Array (variable name `buf`) from the data */
engine.Script.buf = engine.Script.Array.from(filedata);
// highlight-end

/* parse data */
engine.Evaluate("var wb = XLSX.read(buf, {type: 'array'});");
```

:::note Tested Deployments

This demo was last tested in the following deployments:

| Architecture | V8 Version    | Date       |
|:-------------|:--------------|:-----------|
| `darwin-x64` | `12.3.219.12` | 2024-07-16 |
| `darwin-arm` | `12.3.219.12` | 2024-07-16 |
| `win10-x64`  | `12.3.219.12` | 2024-07-16 |
| `win11-arm`  | `12.3.219.12` | 2024-07-16 |
| `linux-x64`  | `12.3.219.12` | 2024-07-16 |
| `linux-arm`  | `12.3.219.12` | 2024-07-16 |

:::

0) Set the `DOTNET_CLI_TELEMETRY_OPTOUT` environment variable to `1`.

**How to disable telemetry**

On macOS and Linux, add the following line to `.profile`, `.bashrc` and `.zshrc`:

```bash title="(add to .profile , .bashrc , and .zshrc)"
export DOTNET_CLI_TELEMETRY_OPTOUT=1
```

Close and restart the Terminal to load the changes.

On Windows, type `env` in the search bar and select "Edit the system environment variables".

In the new window, click the "Environment Variables..." button.

In the new window, look for the "System variables" section and click "New..."

Set the "Variable name" to `DOTNET_CLI_TELEMETRY_OPTOUT` and the value to `1`.

Click "OK" in each window (3 windows) and restart your computer.

1) Install .NET

**Installation Notes**

For macOS x64 and ARM64, install the `dotnet-sdk` Cask with Homebrew:

```bash
brew install --cask dotnet-sdk
```

For Steam Deck Holo and other Arch Linux x64 distributions, the `dotnet-sdk` and `dotnet-runtime` packages should be installed using `pacman`:

```bash
sudo pacman -Syu dotnet-sdk dotnet-runtime
```

https://dotnet.microsoft.com/en-us/download/dotnet/6.0 is the official source for Windows and ARM64 Linux versions.

2) Open a new Terminal window in macOS or PowerShell window in Windows.

3) Create a new project:

```bash
mkdir SheetJSClearScript
cd SheetJSClearScript
dotnet new console
dotnet run
```

4) Add ClearScript to the project:

```bash
dotnet add package Microsoft.ClearScript.Complete --version 7.4.5
```

5) Download the SheetJS standalone script and test file. Move both files to the project directory:

<CodeBlock language="bash">{`\
curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js
curl -LO https://docs.sheetjs.com/pres.xlsx`}
</CodeBlock>

6) Replace `Program.cs` with the following:

```csharp title="Program.cs"
using Microsoft.ClearScript.JavaScript;
using Microsoft.ClearScript.V8;

/* initialize ClearScript */
var engine = new V8ScriptEngine();

/* Load SheetJS Scripts */
engine.Evaluate(File.ReadAllText("xlsx.full.min.js"));
Console.WriteLine("SheetJS version {0}", engine.Evaluate("XLSX.version"));

/* Read and Parse File */
byte[] filedata = File.ReadAllBytes(args[0]);
engine.Script.buf = engine.Script.Array.from(filedata);
engine.Evaluate("var wb = XLSX.read(buf, {type: 'array'});");

/* Print CSV of first worksheet */
engine.Evaluate("var ws = wb.Sheets[wb.SheetNames[0]];");
var csv = engine.Evaluate("XLSX.utils.sheet_to_csv(ws)");
Console.Write(csv);

/* Generate XLSB file and save to SheetJSClearScript.xlsb */
var xlsb = (ITypedArray<byte>)engine.Evaluate("XLSX.write(wb, {bookType: 'xlsb', type: 'buffer'})");
File.WriteAllBytes("SheetJSClearScript.xlsb", xlsb.ToArray());
```

After saving, run the program and pass the test file name as an argument:

```bash
dotnet run pres.xlsx
```

If successful, the program will print the contents of the first sheet as CSV rows. It will also create `SheetJSClearScript.xlsb`, a workbook that can be opened in a spreadsheet editor.

### Python

[`pyv8`](https://code.google.com/archive/p/pyv8/) is a Python wrapper for V8.

The `stpyv8` package[^7] is an actively-maintained fork with binary wheels.

:::caution pass

When this demo was last tested, there was no direct conversion between Python `bytes` and JavaScript `ArrayBuffer` data. This is a known issue[^8]. The current recommendation is Base64 strings.

:::

#### Python Base64 Strings

The SheetJS `read`[^1] and `write`[^4] methods support Base64 strings through the `base64` type[^5].

_Reading Files_

It is recommended to create a global context with a special method that handles file reading from Python. The `read_file` helper in the following snippet will read bytes from `sheetjs.xlsx` and generate a Base64 string:

```py
from base64 import b64encode;
from STPyV8 import JSContext, JSClass;

# Create context with methods for file i/o
class Base64Context(JSClass):
  def read_file(self, path):
    with open(path, "rb") as f:
      data = f.read();
    return b64encode(data).decode("ascii");
globals = Base64Context();

# The JSContext starts and cleans up the V8 engine
with JSContext(globals) as ctxt:
  print(ctxt.eval("read_file('sheetjs.xlsx')")); # read base64 data and print
```

_Writing Files_

When called with the `base64` type, the SheetJS `write` method returns a Base64 string. The result can be decoded and written to file from Python:

```py
from base64 import b64decode;
from STPyV8 import JSContext;

# The JSContext starts and cleans up the V8 engine
with JSContext() as ctxt:
  # ... initialization and workbook creation ...
  xlsb = ctxt.eval("XLSX.write(wb, {type: 'base64', bookType: 'xlsb'})");
  with open("SheetJSSTPyV8.xlsb", "wb") as f:
    f.write(b64decode(xlsb));
```

#### Python Demo

:::note Tested Deployments

This demo was last tested in the following deployments:

| Architecture | V8 Version    | Python   | Date       |
|:-------------|:--------------|:---------|:-----------|
| `darwin-arm` | `13.0.245.16` | `3.13.0` | 2024-10-20 |

:::

0) Make a new folder for the project:

```bash
mkdir sheetjs-stpyv8
cd sheetjs-stpyv8
```

1) Install `stpyv8`:

```bash
pip install stpyv8
```

:::caution pass

The install may fail with an `externally-managed-environment` error:

```
error: externally-managed-environment

× This environment is externally managed
```

The wheel can be downloaded and forcefully installed. The following commands download and install version `13.0.245.16` for Python `3.13` on `darwin-arm`:

```bash
curl -LO https://github.com/cloudflare/stpyv8/releases/download/v13.0.245.16/stpyv8-13.0.245.16-cp313-cp313-macosx_14_0_arm64.whl
sudo python -m pip install --upgrade stpyv8-13.0.245.16-cp313-cp313-macosx_14_0_arm64.whl --break-system-packages
```

:::

2) Download the SheetJS standalone script and test file. Move both files to the project directory:

<CodeBlock language="bash">{`\
curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js
curl -LO https://docs.sheetjs.com/pres.xlsx`}
</CodeBlock>

3) Download [`sheetjs-stpyv8.py`](pathname:///v8/sheetjs-stpyv8.py):

```bash
curl -LO https://docs.sheetjs.com/v8/sheetjs-stpyv8.py
```

4) Run the script and pass `pres.xlsx` as the first argument:

```bash
python sheetjs-stpyv8.py pres.xlsx
```

The script will display CSV rows from the first worksheet. It will also create `SheetJSSTPyV8.xlsb`, a workbook that can be opened with a spreadsheet editor.

## Snapshots

At a high level, V8 snapshots are raw dumps of the V8 engine state. It is much more efficient for programs to load snapshots than to evaluate code.
### Snapshot Demo

There are two parts to this demo:

A) The [`snapshot`](pathname:///cli/snapshot.rs) command creates a snapshot with
the [SheetJS standalone script](/docs/getting-started/installation/standalone)
and [supplementary NUMBERS script](/docs/api/write-options#writing-options). It
will dump the snapshot to `snapshot.bin`.

B) The [`sheet2csv`](pathname:///cli/sheet2csv.rs) tool embeds `snapshot.bin`.
The tool will parse a specified file, print CSV contents of a named worksheet,
and export the workbook to NUMBERS.

:::note Tested Deployments

This demo was last tested in the following deployments:

| Architecture | V8 Version    | Crate    | Date       |
|:-------------|:--------------|:---------|:-----------|
| `darwin-x64` | `12.6.228.3`  | `0.92.0` | 2024-05-28 |
| `darwin-arm` | `12.6.228.3`  | `0.92.0` | 2024-05-23 |
| `win10-x64`  | `12.3.219.9`  | `0.88.0` | 2024-03-24 |
| `win11-x64`  | `12.6.228.3`  | `0.92.0` | 2024-05-23 |
| `linux-x64`  | `12.3.219.9`  | `0.88.0` | 2024-03-18 |
| `linux-arm`  | `12.6.228.3`  | `0.92.0` | 2024-05-26 |

:::

0) Make a new folder for the project:

```bash
mkdir sheetjs2csv
cd sheetjs2csv
```

1) Download the following files:

- [`Cargo.toml`](pathname:///cli/Cargo.toml)
- [`snapshot.rs`](pathname:///cli/snapshot.rs)
- [`sheet2csv.rs`](pathname:///cli/sheet2csv.rs)

```bash
curl -o Cargo.toml https://docs.sheetjs.com/cli/Cargo.toml
curl -o snapshot.rs https://docs.sheetjs.com/cli/snapshot.rs
curl -o sheet2csv.rs https://docs.sheetjs.com/cli/sheet2csv.rs
```

2) Download the SheetJS Standalone script and NUMBERS supplementary script.
Save both scripts in the project directory:

<CodeBlock language="bash">{`\
curl -o xlsx.full.min.js https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js
curl -o xlsx.zahl.js https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.zahl.js`}
</CodeBlock>

3) Build the V8 snapshot:

```bash
cargo build --bin snapshot
cargo run --bin snapshot
```

:::caution pass

In some tests, the Linux AArch64 build failed with an error:

```
error[E0080]: evaluation of constant value failed

     |
1715 |   assert!(size_of::() == size_of::());
     |   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the evaluated program panicked at 'assertion failed: size_of::() == size_of::()'
```

Crate versions `0.75.1`, `0.82.0`, and `0.92.0` are known to work.

:::

4) Build `sheet2csv` (`sheet2csv.exe` in Windows):

```bash
cargo build --release --bin sheet2csv
```

5) Download the test file https://docs.sheetjs.com/pres.numbers:

```bash
curl -o pres.numbers https://docs.sheetjs.com/pres.numbers
```

6) Test the application.

On Linux and macOS:

```bash
mv target/release/sheet2csv .
./sheet2csv pres.numbers
```

On Windows:

```bash
mv target/release/sheet2csv.exe .
.\sheet2csv.exe pres.numbers
```

[^1]: See [`read` in "Reading Files"](/docs/api/parse-options).
[^2]: See ["SheetJS Data Model"](/docs/csf) for more details on the object representation.
[^3]: See ["API Reference"](/docs/api) for a list of functions that ship with the library. ["Spreadsheet Features"](/docs/csf/features) covers workbook and worksheet features that can be modified directly.
[^4]: See [`write` in "Writing Files"](/docs/api/write-options).
[^5]: See ["Supported Output Formats" type in "Writing Files"](/docs/api/write-options#supported-output-formats).
[^6]: The project does not have an official website. The [official Rust crate](https://crates.io/crates/v8) is hosted on `crates.io`.
[^7]: The project does not have a separate website. The source repository is hosted on [GitHub](https://github.com/cloudflare/stpyv8).
[^8]: According to a maintainer, [typed arrays were not supported in the original `pyv8` project](https://github.com/cloudflare/stpyv8/issues/104#issuecomment-2059125389).