---
title: Data Processing with Duktape
sidebar_label: C + Duktape
description: Process structured data in C programs. Seamlessly integrate spreadsheets into your program by pairing Duktape and SheetJS. Supercharge programs with modern data tools.
pagination_prev: demos/bigdata/index
pagination_next: solutions/input
---
import current from '/version.js';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import CodeBlock from '@theme/CodeBlock';
[Duktape](https://duktape.org) is an embeddable JS engine written in C. It has
been ported to a number of exotic architectures and operating systems.
[SheetJS](https://sheetjs.com) is a JavaScript library for reading and writing
data from spreadsheets.
The ["Complete Example"](#complete-example) section includes a complete
command-line tool for reading data from spreadsheets and exporting to Excel XLSB
workbooks.
The ["Bindings"](#bindings) section covers bindings for other ecosystems.
## Integration Details
### Initialize Duktape
Duktape does not provide a `global` variable. It can be created in one line:
```c
/* initialize */
duk_context *ctx = duk_create_heap_default();
/* duktape does not expose a standard "global" by default */
// highlight-next-line
duk_eval_string_noresult(ctx, "var global = (function(){ return this; }).call(null);");
```
### Load SheetJS Scripts
The [SheetJS Standalone scripts](/docs/getting-started/installation/standalone)
can be parsed and evaluated in a Duktape context.
The shim and main libraries can be loaded by reading the scripts from the file
system and evaluating in the Duktape context:
```c
/* simple wrapper to read the entire script file */
static duk_int_t eval_file(duk_context *ctx, const char *filename) {
  size_t len;
  /* read script from filesystem */
  FILE *f = fopen(filename, "rb");
  if(!f) { duk_push_undefined(ctx); perror("fopen"); return 1; }
  long fsize; { fseek(f, 0, SEEK_END); fsize = ftell(f); fseek(f, 0, SEEK_SET); }
  char *buf = (char *)malloc(fsize * sizeof(char));
  /* check the allocation before reading into the buffer */
  if(!buf) { fclose(f); duk_push_undefined(ctx); perror("malloc"); return 1; }
  len = fread((void *) buf, 1, fsize, f);
  fclose(f);
  // highlight-start
  /* load script into the context */
  duk_push_lstring(ctx, (const char *)buf, (duk_size_t)len);
  /* eval script */
  duk_int_t retval = duk_peval(ctx);
  /* cleanup */
  duk_pop(ctx);
  // highlight-end
  /* duk_push_lstring copied the data, so the C buffer can be released */
  free(buf);
  return retval;
}
// ...
duk_int_t res = 0;
if((res = eval_file(ctx, "shim.min.js")) != 0) { /* error handler */ }
if((res = eval_file(ctx, "xlsx.full.min.js")) != 0) { /* error handler */ }
```
To confirm the library is loaded, `XLSX.version` can be inspected:
```c
/* get version string */
duk_eval_string(ctx, "XLSX.version");
printf("SheetJS library version %s\n", duk_get_string(ctx, -1));
duk_pop(ctx);
```
### Reading Files
Duktape supports `Buffer` natively, but the buffer should be sliced before it is processed.
Assuming `buf` is a C byte array, with length `len`, this snippet parses data:
```c
/* load C char array and save to a Buffer */
duk_push_external_buffer(ctx);
duk_config_buffer(ctx, -1, buf, len);
duk_put_global_string(ctx, "buf");
/* parse with SheetJS */
duk_eval_string_noresult(ctx, "workbook = XLSX.read(buf.slice(0, buf.length), {type:'buffer'});");
```
`workbook` will be a variable in the JS environment that can be inspected using
the various SheetJS API functions.
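The snippet above assumes the file bytes are already in memory. As a minimal sketch, one way to fill `buf` and `len` with the C standard library is shown below. `read_all` is a hypothetical helper name used for illustration; it is not part of Duktape or SheetJS.

```c
#include <stdio.h>
#include <stdlib.h>

/* hypothetical helper: read an entire file into a malloc'd buffer.
 * returns NULL on failure; on success the caller must free() the result */
static char *read_all(const char *filename, size_t *len) {
  FILE *f = fopen(filename, "rb");
  if(!f) return NULL;
  /* determine the file size */
  if(fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
  long fsize = ftell(f);
  if(fsize < 0 || fseek(f, 0, SEEK_SET) != 0) { fclose(f); return NULL; }
  /* allocate at least one byte so that an empty file does not yield NULL */
  char *buf = (char *)malloc(fsize > 0 ? (size_t)fsize : 1);
  if(!buf) { fclose(f); return NULL; }
  /* store the number of bytes actually read */
  *len = fread(buf, 1, (size_t)fsize, f);
  fclose(f);
  return buf;
}
```

A call such as `char *buf = read_all("pres.numbers", &len);` would produce the two values used above. Since an external buffer refers to memory managed by the C code, `buf` should stay allocated for as long as Duktape may access it.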
### Writing Files
`duk_get_buffer_data` can pull `Buffer` object data into the C code:
```c
/* write with SheetJS using type: "array" */
duk_eval_string(ctx, "XLSX.write(workbook, {type:'array', bookType:'xlsx'})");
/* pull result back to C; `sz` receives the length in bytes */
duk_size_t sz;
char *buf = (char *)duk_get_buffer_data(ctx, -1, &sz);
/* use or copy `buf` here: the memory belongs to Duktape */
/* discard result in duktape */
duk_pop(ctx);
```
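The resulting `buf` holds `sz` bytes and can be written to a file with `fwrite`.
The memory is owned by Duktape, so the bytes should be written or copied before
the value is popped from the stack.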
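As a rough sketch of the `fwrite` step, the helper below writes a byte array to a file using only the C standard library. `write_all` is a hypothetical name chosen for illustration; it is not part of Duktape or SheetJS.

```c
#include <stdio.h>

/* hypothetical helper: write `len` bytes from `buf` to `filename`.
 * returns 0 on success and 1 on failure */
static int write_all(const char *filename, const void *buf, size_t len) {
  FILE *f = fopen(filename, "wb");
  if(!f) { perror("fopen"); return 1; }
  size_t written = fwrite(buf, 1, len, f);
  /* fclose flushes buffered data, so its result is checked as well */
  if(fclose(f) != 0 || written != len) { perror("fwrite"); return 1; }
  return 0;
}
```

With the variables from the snippet above, a call like `write_all("SheetJSDuk.xlsx", buf, sz)` (the file name is an arbitrary example) would be placed before `duk_pop(ctx)`, while the Duktape buffer is still on the value stack.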
## Complete Example
:::note
This demo was tested in the following deployments:
| Architecture | Version | Date |
|:-------------|:--------|:-----------|
| `darwin-x64` | `2.7.0` | 2023-10-26 |
| `darwin-arm` | `2.7.0` | 2023-10-18 |
| `win10-x64` | `2.7.0` | 2023-10-27 |
| `win11-arm` | `2.7.0` | 2023-09-26 |
| `linux-x64` | `2.7.0` | 2023-10-11 |
| `linux-arm` | `2.7.0` | 2023-08-30 |
:::
This program parses a file and prints CSV data from the first worksheet. It also
generates an XLSB file and writes it to the filesystem.
The [flow diagram is displayed after the example steps](#flow-diagram).
:::info pass
On Windows, the Visual Studio "Native Tools Command Prompt" must be used.
:::
0) Create a project folder:
```bash
mkdir sheetjs-duk
cd sheetjs-duk
```
1) Download and extract Duktape:
:::caution pass
The Windows built-in `tar` does not support `xz` archives.
**The commands must be run within WSL `bash`.**
After the `mv` command, exit WSL.
:::
```bash
curl -LO https://duktape.org/duktape-2.7.0.tar.xz
tar -xJf duktape-2.7.0.tar.xz
mv duktape-2.7.0/src/*.{c,h} .
```
2) Download the SheetJS Standalone script, shim script and test file. Move all
three files to the project directory:
:::caution pass
If the `curl` command fails, run the commands within WSL `bash`.
:::
<CodeBlock language="bash">{`\
curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/shim.min.js
curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js
curl -LO https://sheetjs.com/pres.numbers`}
</CodeBlock>
3) Download [`sheetjs.duk.c`](pathname:///duk/sheetjs.duk.c):
```bash
curl -LO https://docs.sheetjs.com/duk/sheetjs.duk.c
```
4) Compile the standalone `sheetjs.duk` binary:
```bash
gcc -std=c99 -Wall -osheetjs.duk sheetjs.duk.c duktape.c -lm
```
:::note
GCC may generate a warning:
```
duk_js_compiler.c:5628:13: warning: variable 'num_stmts' set but not used [-Wunused-but-set-variable]
duk_int_t num_stmts;
^
```
This warning can be ignored.
:::
On Windows, compile with `cl` from the Native Tools Command Prompt:
```powershell
cl sheetjs.duk.c duktape.c /I .\
```
5) Run the demo:
```bash
./sheetjs.duk pres.numbers
```
On Windows, run the generated executable:
```bash
.\sheetjs.duk.exe pres.numbers
```
If the program succeeded, the CSV contents will be printed to the console and
the file `sheetjsw.xlsb` will be created. That file can be opened with Excel.
### Flow Diagram
```mermaid
sequenceDiagram
participant F as Filesystem
participant C as C Code
participant D as Duktape
activate C
opt
Note over F,D: ~ Prepare Duktape ~
C->>+D: Initialize
deactivate C
D->>-C: Done
activate C
C->>F: Need SheetJS
F->>C: SheetJS Code
C->>+D: Load SheetJS Code
deactivate C
D->>-C: Loaded
activate C
C->>+D: Execute Code
deactivate C
Note over D: Eval SheetJS Code
D->>-C: Done
activate C
Note over D: XLSX<br/>ready to rock
end
opt
Note over F,D: ~ Parse File ~
C->>F: Read Spreadsheet
F->>C: Spreadsheet File
C->>+D: Load Data
deactivate C
D->>-C: Loaded
activate C
C->>+D: eval `var workbook = XLSX.read(...)`
deactivate C
Note over D: Parse File
D->>-C: Done
activate C
Note over D: `workbook`<br/>can be used later
end
opt
Note over F,D: ~ Print CSV to screen ~
C->>+D: eval `XLSX.utils.sheet_to_csv(...)`
deactivate C
Note over D: Generate CSV
D->>-C: CSV Data
activate C
Note over C: Print to standard output
end
opt
Note over F,D: ~ Write XLSB File ~
C->>+D: eval `XLSX.write(...)`
deactivate C
Note over D: Generate File
D->>-C: done
activate C
C->>+D: get file bytes
deactivate C
D->>-C: binary data
activate C
C->>F: Write File
end
deactivate C
```
## Bindings
Bindings exist for many languages. As these bindings require "native" code, they
may not work on every platform.
### Perl
The Perl binding for Duktape is available as `JavaScript::Duktape` on CPAN.
The Perl binding does not support raw `Buffer` operations, so Base64 strings are used.
#### Perl Demo
:::note
This demo was tested in the following deployments:
| Architecture | Version | Date |
|:-------------|:--------|:-----------|
| `darwin-x64` | `2.5.0` | 2023-10-26 |
:::
0) Ensure `perl` and `cpan` are installed and available on the system path.
1) Install the `JavaScript::Duktape` library:
```bash
cpan install JavaScript::Duktape
```
2) Save the following codeblock to `SheetJSDuk.pl`:
```perl title="SheetJSDuk.pl"
# usage: perl SheetJSDuk.pl path/to/file
use JavaScript::Duktape;
use File::Slurp;
use MIME::Base64 qw( encode_base64 decode_base64 );
# Initialize
my $js = JavaScript::Duktape->new( max_memory => 256 * 1024 * 1024 );
$js->eval("var global = (function(){ return this; }).call(null);");
# Load the ExtendScript build
my $src = read_file('xlsx.extendscript.js', { binmode => ':raw' });
$src =~ s/^\xEF\xBB\xBF//;
my $XLSX = $js->eval($src);
# Print version number
$js->set('log' => sub { print $_[0], "\n"; });
$js->eval("log('SheetJS library version ' + XLSX.version);");
# Parse File
my $raw_data = encode_base64(read_file($ARGV[0], { binmode => ':raw' }), "");
$js->set("b64", $raw_data);
$js->eval(qq{
global.wb = XLSX.read(b64, {type: "base64", WTF:1});
global.ws = wb.Sheets[wb.SheetNames[0]];
void 0;
});
# Print first worksheet CSV
$js->eval('log(XLSX.utils.sheet_to_csv(global.ws))');
# Write XLSB file
my $xlsb = $js->eval("XLSX.write(global.wb, {type:'base64', bookType:'xlsb'})");
write_file("SheetJSDuk.xlsb", decode_base64($xlsb));
```
3) Download the SheetJS ExtendScript build and test file:
<CodeBlock language="bash">{`\
curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.extendscript.js
curl -LO https://sheetjs.com/pres.xlsx`}
</CodeBlock>
4) Run the script:
```bash
perl SheetJSDuk.pl pres.xlsx
```
If the script succeeded, the data in the test file will be printed in CSV rows.
The script will also export `SheetJSDuk.xlsb`.