---
title: Data Processing with Duktape
sidebar_label: C + Duktape
description: Process structured data in C programs. Seamlessly integrate spreadsheets into your program by pairing Duktape and SheetJS. Supercharge programs with modern data tools.
pagination_prev: demos/bigdata/index
pagination_next: solutions/input
---

import current from '/version.js';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import CodeBlock from '@theme/CodeBlock';

[Duktape](https://duktape.org) is an embeddable JS engine written in C. It has
been ported to a number of exotic architectures and operating systems.

[SheetJS](https://sheetjs.com) is a JavaScript library for reading and writing
data from spreadsheets.

The ["Complete Example"](#complete-example) section includes a complete
command-line tool for reading data from spreadsheets and exporting to Excel XLSB
workbooks.

The ["Bindings"](#bindings) section covers bindings for other ecosystems.

## Integration Details

### Initialize Duktape

Duktape does not provide a `global` variable. It can be created in one line:

```c
/* initialize */
duk_context *ctx = duk_create_heap_default();

/* duktape does not expose a standard "global" by default */
// highlight-next-line
duk_eval_string_noresult(ctx, "var global = (function(){ return this; }).call(null);");
```

### Load SheetJS Scripts

The [SheetJS Standalone scripts](/docs/getting-started/installation/standalone)
can be parsed and evaluated in a Duktape context.

The shim and main libraries can be loaded by reading the scripts from the file
system and evaluating in the Duktape context:

```c
/* simple wrapper to read the entire script file */
static duk_int_t eval_file(duk_context *ctx, const char *filename) {
  size_t len;
  /* read script from filesystem */
  FILE *f = fopen(filename, "rb");
  if(!f) { duk_push_undefined(ctx); perror("fopen"); return 1; }
  long fsize; { fseek(f, 0, SEEK_END); fsize = ftell(f); fseek(f, 0, SEEK_SET); }
  char *buf = (char *)malloc(fsize * sizeof(char));
  /* check the allocation before reading into the buffer */
  if(!buf) { fclose(f); duk_push_undefined(ctx); perror("malloc"); return 1; }
  len = fread((void *) buf, 1, fsize, f);
  fclose(f);

  // highlight-start
  /* load script into the context */
  duk_push_lstring(ctx, (const char *)buf, (duk_size_t)len);
  /* duktape copies the string data, so the C buffer can be released */
  free(buf);
  /* eval script */
  duk_int_t retval = duk_peval(ctx);
  /* cleanup */
  duk_pop(ctx);
  // highlight-end
  return retval;
}

// ...
  duk_int_t res = 0;

  if((res = eval_file(ctx, "shim.min.js")) != 0) { /* error handler */ }
  if((res = eval_file(ctx, "xlsx.full.min.js")) != 0) { /* error handler */ }
```

To confirm the library is loaded, `XLSX.version` can be inspected:

```c
  /* get version string */
  duk_eval_string(ctx, "XLSX.version");
  printf("SheetJS library version %s\n", duk_get_string(ctx, -1));
  duk_pop(ctx);
```

### Reading Files

Duktape supports `Buffer` natively, but the buffer should be sliced before
processing. Assuming `buf` is a C byte array with length `len`, this snippet
parses data:

```c
/* load C char array and save to a Buffer */
duk_push_external_buffer(ctx);
duk_config_buffer(ctx, -1, buf, len);
duk_put_global_string(ctx, "buf");

/* parse with SheetJS */
duk_eval_string_noresult(ctx, "workbook = XLSX.read(buf.slice(0, buf.length), {type:'buffer'});");
```

`workbook` will be a variable in the JS environment that can be inspected using
the various SheetJS API functions.

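The reading snippet assumes `buf` and `len` already exist. As a minimal sketch,
assuming the spreadsheet lives on the local file system, a helper along these
lines could produce them using only the C standard library. `read_entire_file`
is a hypothetical name and is not part of Duktape or SheetJS:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not part of Duktape or SheetJS): read a whole file.
 * Returns a malloc'd buffer and stores the byte count in *len.
 * Returns NULL on failure. */
static char *read_entire_file(const char *filename, size_t *len) {
  FILE *f = fopen(filename, "rb");
  if(!f) { perror("fopen"); return NULL; }

  /* measure the file by seeking to the end */
  if(fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
  long fsize = ftell(f);
  if(fsize < 0 || fseek(f, 0, SEEK_SET) != 0) { fclose(f); return NULL; }

  /* allocate at least one byte so an empty file still yields a valid pointer */
  char *buf = (char *)malloc(fsize > 0 ? (size_t)fsize : 1);
  if(!buf) { perror("malloc"); fclose(f); return NULL; }

  *len = fread(buf, 1, (size_t)fsize, f);
  fclose(f);

  /* a short read means the file could not be loaded completely */
  if(*len != (size_t)fsize) { free(buf); return NULL; }
  return buf;
}
```

The returned pointer and length can be passed to `duk_config_buffer`. Since an
external buffer refers to memory managed by the caller, the allocation should
stay alive while the Duktape buffer is in use and be released with `free`
afterwards.
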
### Writing Files

`duk_get_buffer_data` can pull `Buffer` object data into the C code:

```c
/* write with SheetJS using type: "array" */
duk_eval_string(ctx, "XLSX.write(workbook, {type:'array', bookType:'xlsx'})");

/* pull result back to C */
duk_size_t sz;
char *buf = (char *)duk_get_buffer_data(ctx, -1, &sz);

/* use `buf` here: the data belongs to the value on the stack */

/* discard result in duktape */
duk_pop(ctx);
```

The resulting `buf` can be written to a file with `fwrite`. The data is owned by
the Duktape value, so the write should happen before the result is popped.

## Complete Example

:::note

This demo was tested in the following deployments:

| Architecture | Version | Date       |
|:-------------|:--------|:-----------|
| `darwin-x64` | `2.7.0` | 2023-10-26 |
| `darwin-arm` | `2.7.0` | 2023-10-18 |
| `win10-x64`  | `2.7.0` | 2023-10-27 |
| `win11-arm`  | `2.7.0` | 2023-09-26 |
| `linux-x64`  | `2.7.0` | 2023-10-11 |
| `linux-arm`  | `2.7.0` | 2023-08-30 |

:::

This program parses a file and prints CSV data from the first worksheet. It also
generates an XLSB file and writes to the filesystem.

The [flow diagram is displayed after the example steps](#flow-diagram).

:::info pass

On Windows, the Visual Studio "Native Tools Command Prompt" must be used.

:::

0) Create a project folder:

```bash
mkdir sheetjs-duk
cd sheetjs-duk
```

1) Download and extract Duktape:

:::caution pass

The Windows built-in `tar` does not support `xz` archives.

**The commands must be run within WSL `bash`.**

After the `mv` command, exit WSL.

:::

```bash
curl -LO https://duktape.org/duktape-2.7.0.tar.xz
tar -xJf duktape-2.7.0.tar.xz
mv duktape-2.7.0/src/*.{c,h} .
```

2) Download the SheetJS Standalone script, shim script and test file. Move all
three files to the project directory:

:::caution pass

If the `curl` command fails, run the commands within WSL `bash`.

:::

{`\
curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/shim.min.js
curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js
curl -LO https://sheetjs.com/pres.numbers`}

3) Download [`sheetjs.duk.c`](pathname:///duk/sheetjs.duk.c):

```bash
curl -LO https://docs.sheetjs.com/duk/sheetjs.duk.c
```

4) Compile standalone `sheetjs.duk` binary:

```bash
gcc -std=c99 -Wall -osheetjs.duk sheetjs.duk.c duktape.c -lm
```

:::note pass

GCC may generate a warning:

```
duk_js_compiler.c:5628:13: warning: variable 'num_stmts' set but not used [-Wunused-but-set-variable]
    duk_int_t num_stmts;
              ^
```

This warning can be ignored.

:::

```powershell
cl sheetjs.duk.c duktape.c /I .\
```

5) Run the demo:

```bash
./sheetjs.duk pres.numbers
```

```bash
.\sheetjs.duk.exe pres.numbers
```

If the program succeeded, the CSV contents will be printed to console and the
file `sheetjsw.xlsb` will be created. That file can be opened with Excel.

### Flow Diagram

```mermaid
sequenceDiagram
  participant F as Filesystem
  participant C as C Code
  participant D as Duktape
  activate C
  opt
    Note over F,D: ~ Prepare Duktape ~
    C->>+D: Initialize
    deactivate C
    D->>-C: Done
    activate C
    C->>F: Need SheetJS
    F->>C: SheetJS Code
    C->>+D: Load SheetJS Code
    deactivate C
    D->>-C: Loaded
    activate C
    C->>+D: Execute Code
    deactivate C
    Note over D: Eval SheetJS Code
    D->>-C: Done
    activate C
    Note over D: XLSX<br/>ready to rock
  end
  opt
    Note over F,D: ~ Parse File ~
    C->>F: Read Spreadsheet
    F->>C: Spreadsheet File
    C->>+D: Load Data
    deactivate C
    D->>-C: Loaded
    activate C
    C->>+D: eval `var workbook = XLSX.read(...)`
    deactivate C
    Note over D: Parse File
    D->>-C: Done
    activate C
    Note over D: `workbook`<br/>can be used later
  end
  opt
    Note over F,D: ~ Print CSV to screen ~
    C->>+D: eval `XLSX.utils.sheet_to_csv(...)`
    deactivate C
    Note over D: Generate CSV
    D->>-C: CSV Data
    activate C
    Note over C: Print to standard output
  end
  opt
    Note over F,D: ~ Write XLSB File ~
    C->>+D: eval `XLSX.write(...)`
    deactivate C
    Note over D: Generate File
    D->>-C: done
    activate C
    C->>+D: get file bytes
    deactivate C
    D->>-C: binary data
    activate C
    C->>F: Write File
  end
  deactivate C
```

## Bindings

Bindings exist for many languages. As these bindings require "native" code, they
may not work on every platform.

### Perl

The Perl binding for Duktape is available as `JavaScript::Duktape` on CPAN.

The Perl binding does not have raw `Buffer` ops, so Base64 strings are used.

#### Perl Demo

:::note

This demo was tested in the following deployments:

| Architecture | Version | Date       |
|:-------------|:--------|:-----------|
| `darwin-x64` | `2.5.0` | 2023-10-26 |

:::

0) Ensure `perl` and `cpan` are installed and available on the system path.

1) Install the `JavaScript::Duktape` library:

```bash
cpan install JavaScript::Duktape
```

2) Save the following codeblock to `SheetJSDuk.pl`:

```perl title="SheetJSDuk.pl"
# usage: perl SheetJSDuk.pl path/to/file
use JavaScript::Duktape;
use File::Slurp;
use MIME::Base64 qw( encode_base64 decode_base64 );

# Initialize
my $js = JavaScript::Duktape->new( max_memory => 256 * 1024 * 1024 );
$js->eval("var global = (function(){ return this; }).call(null);");

# Load the ExtendScript build
my $src = read_file('xlsx.extendscript.js', { binmode => ':raw' });
$src =~ s/^\xEF\xBB\xBF//;
my $XLSX = $js->eval($src);

# Print version number
$js->set('log' => sub { print $_[0], "\n"; });
$js->eval("log('SheetJS library version ' + XLSX.version);");

# Parse File
my $raw_data = encode_base64(read_file($ARGV[0], { binmode => ':raw' }), "");
$js->set("b64", $raw_data);
$js->eval(qq{
  global.wb = XLSX.read(b64, {type: "base64", WTF:1});
  global.ws = wb.Sheets[wb.SheetNames[0]];
  void 0;
});

# Print first worksheet CSV
$js->eval('log(XLSX.utils.sheet_to_csv(global.ws))');

# Write XLSB file
my $xlsb = $js->eval("XLSX.write(global.wb, {type:'base64', bookType:'xlsb'})");
write_file("SheetJSDuk.xlsb", decode_base64($xlsb));
```

3) Download the SheetJS ExtendScript build and test file:

{`\
curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.extendscript.js
curl -LO https://sheetjs.com/pres.xlsx`}

4) Run the script:

```bash
perl SheetJSDuk.pl pres.xlsx
```

If the script succeeded, the data in the test file will be printed in CSV rows.
The script will also export `SheetJSDuk.xlsb`.