---
title: Data Processing with Duktape
sidebar_label: C + Duktape
description: Process structured data in C programs. Seamlessly integrate spreadsheets into your program by pairing Duktape and SheetJS. Supercharge programs with modern data tools.
pagination_prev: demos/bigdata/index
pagination_next: solutions/input
---
import current from '/version.js';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import CodeBlock from '@theme/CodeBlock';
[Duktape](https://duktape.org) is an embeddable JS engine written in C. It has
been ported to a number of exotic architectures and operating systems.
[SheetJS](https://sheetjs.com) is a JavaScript library for reading and writing
data from spreadsheets.
The ["Complete Example"](#complete-example) section includes a complete
command-line tool for reading data from spreadsheets and exporting to Excel XLSB
workbooks.
The ["Bindings"](#bindings) section covers bindings for other ecosystems.
## Integration Details
### Initialize Duktape
Duktape does not provide a `global` variable. It can be created in one line:
```c
/* initialize */
duk_context *ctx = duk_create_heap_default();
/* duktape does not expose a standard "global" by default */
// highlight-next-line
duk_eval_string_noresult(ctx, "var global = (function(){ return this; }).call(null);");
```
### Load SheetJS Scripts
The [SheetJS Standalone scripts](/docs/getting-started/installation/standalone)
can be parsed and evaluated in a Duktape context.
The shim and main libraries can be loaded by reading the scripts from the file
system and evaluating in the Duktape context:
```c
/* simple wrapper to read the entire script file */
static duk_int_t eval_file(duk_context *ctx, const char *filename) {
  /* read script from filesystem */
  FILE *f = fopen(filename, "rb");
  if(!f) { duk_push_undefined(ctx); perror("fopen"); return 1; }
  long fsize; { fseek(f, 0, SEEK_END); fsize = ftell(f); fseek(f, 0, SEEK_SET); }
  /* check the allocation before reading into the buffer */
  char *buf = (char *)malloc(fsize * sizeof(char));
  if(!buf) { fclose(f); duk_push_undefined(ctx); perror("malloc"); return 1; }
  size_t len = fread((void *) buf, 1, fsize, f);
  fclose(f);
  // highlight-start
  /* load script into the context */
  duk_push_lstring(ctx, (const char *)buf, (duk_size_t)len);
  /* the string is copied into the Duktape heap, so the C buffer can be freed */
  free(buf);
  /* eval script */
  duk_int_t retval = duk_peval(ctx);
  /* cleanup */
  duk_pop(ctx);
  // highlight-end
  return retval;
}
// ...
duk_int_t res = 0;
if((res = eval_file(ctx, "shim.min.js")) != 0) { /* error handler */ }
if((res = eval_file(ctx, "xlsx.full.min.js")) != 0) { /* error handler */ }
```
To confirm the library is loaded, `XLSX.version` can be inspected:
```c
/* get version string */
duk_eval_string(ctx, "XLSX.version");
printf("SheetJS library version %s\n", duk_get_string(ctx, -1));
duk_pop(ctx);
```
### Reading Files
Duktape supports `Buffer` natively, but the buffer should be sliced before it is
processed. Assuming `buf` is a C byte array with length `len`, this snippet
parses the data:
```c
/* load C char array and save to a Buffer */
duk_push_external_buffer(ctx);
duk_config_buffer(ctx, -1, buf, len);
duk_put_global_string(ctx, "buf");
/* parse with SheetJS */
duk_eval_string_noresult(ctx, "workbook = XLSX.read(buf.slice(0, buf.length), {type:'buffer'});");
```
`workbook` will be a variable in the JS environment that can be inspected using
the various SheetJS API functions.
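
The snippet above assumes that `buf` and `len` are already populated. The following is a minimal sketch of one way to fill them from a file on disk, using only the C standard library. The helper name `read_file` is hypothetical and is not part of Duktape or SheetJS:

```c
#include <stdio.h>
#include <stdlib.h>

/* hypothetical helper: read an entire file into a malloc'd buffer */
static char *read_file(const char *filename, size_t *sz) {
  FILE *f = fopen(filename, "rb");
  if(!f) return NULL;
  /* determine the file size by seeking to the end */
  if(fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
  long fsize = ftell(f);
  if(fsize < 0 || fseek(f, 0, SEEK_SET) != 0) { fclose(f); return NULL; }
  /* allocate at least one byte so an empty file does not look like a failure */
  char *buf = (char *)malloc(fsize > 0 ? (size_t)fsize : 1);
  if(!buf) { fclose(f); return NULL; }
  /* store the number of bytes actually read */
  *sz = fread(buf, 1, (size_t)fsize, f);
  fclose(f);
  return buf; /* the caller is responsible for calling free() */
}
```

A call such as `char *buf = read_file(path, &len);` would supply the two values used above. Since an external buffer refers to memory managed by the C code, the allocation should stay alive until the JS code has finished with the `buf` global.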
### Writing Files
`duk_get_buffer_data` can pull `Buffer` object data into the C code:
```c
/* write with SheetJS using type: "array" */
duk_eval_string(ctx, "XLSX.write(workbook, {type:'array', bookType:'xlsx'})");
/* pull result back to C */
duk_size_t sz;
char *buf = (char *)duk_get_buffer_data(ctx, -1, &sz);
/* use or copy the data here: the pointer is tied to the value on the stack */
/* discard result in duktape */
duk_pop(ctx);
```
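The resulting `buf` can be written to file with `fwrite`. The write should happen
before `duk_pop`, since Duktape may free the underlying data once the value is
no longer reachable.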
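
The `fwrite` step can be sketched as a small helper that uses only the C standard library. The name `write_file` and its return convention (`0` on success, `-1` on failure) are assumptions made for illustration:

```c
#include <stdio.h>

/* hypothetical helper: write `sz` bytes from `buf` to `filename` */
static int write_file(const char *filename, const void *buf, size_t sz) {
  FILE *f = fopen(filename, "wb");
  if(!f) return -1;
  size_t written = fwrite(buf, 1, sz, f);
  /* fclose flushes buffered data, so its result is checked as well */
  if(fclose(f) != 0) return -1;
  return written == sz ? 0 : -1;
}
```

With the snippet above, a call such as `write_file("out.xlsx", buf, sz)` (the file name is hypothetical) would run while the result of `XLSX.write` is still on the Duktape stack.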
## Complete Example
:::note Tested Deployments
This demo was tested in the following deployments:
| Architecture | Version | Date |
|:-------------|:--------|:-----------|
| `darwin-x64` | `2.7.0` | 2023-12-05 |
| `darwin-arm` | `2.7.0` | 2023-10-18 |
| `win10-x64`  | `2.7.0` | 2023-10-27 |
| `win11-arm`  | `2.7.0` | 2023-12-01 |
| `linux-x64`  | `2.7.0` | 2024-01-26 |
| `linux-arm`  | `2.7.0` | 2023-12-01 |
:::
This program parses a file and prints CSV data from the first worksheet. It also
generates an XLSB file and writes it to the filesystem.
The [flow diagram is displayed after the example steps](#flow-diagram).
:::info pass
On Windows, the Visual Studio "Native Tools Command Prompt" must be used.
:::
0) Create a project folder:
```bash
mkdir sheetjs-duk
cd sheetjs-duk
```
1) Download and extract Duktape:
<Tabs groupId="os">
<TabItem value="unix" label="Linux/MacOS">
</TabItem>
<TabItem value="win" label="Windows">
:::caution pass
The Windows built-in `tar` does not support `xz` archives.
**The commands must be run within WSL `bash`.**
After the `mv` command, exit WSL.
:::
</TabItem>
</Tabs>
```bash
curl -LO https://duktape.org/duktape-2.7.0.tar.xz
tar -xJf duktape-2.7.0.tar.xz
mv duktape-2.7.0/src/*.{c,h} .
```
2) Download the SheetJS Standalone script, shim script and test file. Move all
three files to the project directory:
<ul>
<li><a href={`https://cdn.sheetjs.com/xlsx-${current}/package/dist/shim.min.js`}>shim.min.js</a></li>
<li><a href={`https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js`}>xlsx.full.min.js</a></li>
<li><a href="https://sheetjs.com/pres.numbers">pres.numbers</a></li>
</ul>
<Tabs groupId="os">
<TabItem value="unix" label="Linux/MacOS">
</TabItem>
<TabItem value="win" label="Windows">
:::caution pass
If the `curl` command fails, run the commands within WSL `bash`.
:::
</TabItem>
</Tabs>
<CodeBlock language="bash">{`\
curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/shim.min.js
curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js
curl -LO https://sheetjs.com/pres.numbers`}
</CodeBlock>
3) Download [`sheetjs.duk.c`](pathname:///duk/sheetjs.duk.c):
```bash
curl -LO https://docs.sheetjs.com/duk/sheetjs.duk.c
```
4) Compile the standalone `sheetjs.duk` binary:
<Tabs groupId="os">
<TabItem value="unix" label="Linux/MacOS">
```bash
gcc -std=c99 -Wall -osheetjs.duk sheetjs.duk.c duktape.c -lm
```
:::note pass
GCC may generate a warning:
```
duk_js_compiler.c:5628:13: warning: variable 'num_stmts' set but not used [-Wunused-but-set-variable]
duk_int_t num_stmts;
^
```
This warning can be ignored.
:::
</TabItem>
<TabItem value="win" label="Windows">
```powershell
cl sheetjs.duk.c duktape.c /I .\
```
</TabItem>
</Tabs>
5) Run the demo:
<Tabs groupId="os">
<TabItem value="unix" label="Linux/MacOS">
```bash
./sheetjs.duk pres.numbers
```
</TabItem>
<TabItem value="win" label="Windows">
```bash
.\sheetjs.duk.exe pres.numbers
```
</TabItem>
</Tabs>
If the program succeeded, the CSV contents will be printed to console and the
file `sheetjsw.xlsb` will be created. That file can be opened with Excel.
### Flow Diagram
```mermaid
sequenceDiagram
participant F as Filesystem
participant C as C Code
participant D as Duktape
activate C
opt
Note over F,D: ~ Prepare Duktape ~
C->>+D: Initialize
deactivate C
D->>-C: Done
activate C
C->>F: Need SheetJS
F->>C: SheetJS Code
C->>+D: Load SheetJS Code
deactivate C
D->>-C: Loaded
activate C
C->>+D: Execute Code
deactivate C
Note over D: Eval SheetJS Code
D->>-C: Done
activate C
Note over D: XLSX<br/>ready to rock
end
opt
Note over F,D: ~ Parse File ~
C->>F: Read Spreadsheet
F->>C: Spreadsheet File
C->>+D: Load Data
deactivate C
D->>-C: Loaded
activate C
C->>+D: eval `var workbook = XLSX.read(...)`
deactivate C
Note over D: Parse File
D->>-C: Done
activate C
Note over D: `workbook`<br/>can be used later
end
opt
Note over F,D: ~ Print CSV to screen ~
C->>+D: eval `XLSX.utils.sheet_to_csv(...)`
deactivate C
Note over D: Generate CSV
D->>-C: CSV Data
activate C
Note over C: Print to standard output
end
opt
Note over F,D: ~ Write XLSB File ~
C->>+D: eval `XLSX.write(...)`
deactivate C
Note over D: Generate File
D->>-C: done
activate C
C->>+D: get file bytes
deactivate C
D->>-C: binary data
activate C
C->>F: Write File
end
deactivate C
```
## Bindings
Bindings exist for many languages. As these bindings require "native" code, they
may not work on every platform.
### PHP
There is no official PHP binding to the Duktape library. Instead, this demo uses
the raw `FFI` interface[^1] to the Duktape shared library.
#### PHP Demo
:::note Tested Deployments
This demo was tested in the following deployments:
| Architecture | Version | PHP Version | Date |
|:-------------|:--------|:------------|:-----------|
| `darwin-x64` | `2.7.0` | `8.3.2` | 2024-01-26 |
| `linux-x64` | `2.7.0` | `8.2.7` | 2024-01-29 |
:::
0) Ensure `php` is installed and available on the system path.
1) Inspect the `php.ini` configuration file. The location of the file can be
found by running `php --ini`. The following output is from the last macOS test:
```text pass
Configuration File (php.ini) Path: /usr/local/etc/php/8.3
// highlight-next-line
Loaded Configuration File: /usr/local/etc/php/8.3/php.ini
Scan for additional .ini files in: /usr/local/etc/php/8.3/conf.d
Additional .ini files parsed: /usr/local/etc/php/8.3/conf.d/ext-opcache.ini
```
The following line should appear in the configuration:
```ini title="php.ini (add to end)"
extension=ffi
```
If this line is prefixed with a `;`, remove the semicolon. If this line does not
appear in the file, add it to the end.
2) Build the Duktape shared library:
```bash
curl -LO https://duktape.org/duktape-2.7.0.tar.xz
tar -xJf duktape-2.7.0.tar.xz
cd duktape-2.7.0
make -f Makefile.sharedlibrary
cd ..
```
3) Copy the shared library to the current folder. When the demo was last tested,
the shared library file name differed by platform:
| OS | name |
|:-------|:--------------------------|
| Darwin | `libduktape.207.20700.so` |
| Linux | `libduktape.so.207.20700` |
```bash
cp duktape-*/libduktape.* .
```
4) Download the SheetJS Standalone script, shim script and test file. Move all
three files to the project directory:
<ul>
<li><a href={`https://cdn.sheetjs.com/xlsx-${current}/package/dist/shim.min.js`}>shim.min.js</a></li>
<li><a href={`https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js`}>xlsx.full.min.js</a></li>
<li><a href="https://sheetjs.com/pres.numbers">pres.numbers</a></li>
</ul>
<CodeBlock language="bash">{`\
curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/shim.min.js
curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js
curl -LO https://sheetjs.com/pres.numbers`}
</CodeBlock>
5) Download [`SheetJSDuk.php`](pathname:///duk/SheetJSDuk.php):
```bash
curl -LO https://docs.sheetjs.com/duk/SheetJSDuk.php
```
6) Edit the `SheetJSDuk.php` script.
The `$sofile` variable declares the path to the library:
```php title="SheetJSDuk.php (edit highlighted line)"
<?php
// highlight-next-line
$sofile = './libduktape.207.20700.so';
```
<Tabs groupId="triple">
<TabItem value="darwin-x64" label="MacOS">
The name of the library is `libduktape.207.20700.so`:
```php title="SheetJSDuk.php (change highlighted line)"
// highlight-next-line
$sofile = './libduktape.207.20700.so';
```
</TabItem>
<TabItem value="linux-x64" label="Linux">
The name of the library is `libduktape.so.207.20700`:
```php title="SheetJSDuk.php (change highlighted line)"
// highlight-next-line
$sofile = './libduktape.so.207.20700';
```
</TabItem>
</Tabs>
7) Run the script:
```bash
php SheetJSDuk.php pres.numbers
```
If the program succeeded, the CSV contents will be printed to console and the
file `sheetjsw.xlsb` will be created. That file can be opened with Excel.
### Python
There is no official Python binding to the Duktape library. Instead, this demo
uses the raw `ctypes` interface[^2] to the Duktape shared library.
#### Python Demo
:::note Tested Deployments
This demo was tested in the following deployments:
| Architecture | Version | Python | Date |
|:-------------|:--------|:---------|:-----------|
| `darwin-x64` | `2.7.0` | `3.11.7` | 2024-01-29 |
| `linux-x64` | `2.7.0` | `3.11.3` | 2024-01-29 |
:::
0) Ensure `python3` is installed and available on the system path.
1) Build the Duktape shared library:
```bash
curl -LO https://duktape.org/duktape-2.7.0.tar.xz
tar -xJf duktape-2.7.0.tar.xz
cd duktape-2.7.0
make -f Makefile.sharedlibrary
cd ..
```
2) Copy the shared library to the current folder. When the demo was last tested,
the shared library file name differed by platform:
| OS | name |
|:-------|:--------------------------|
| Darwin | `libduktape.207.20700.so` |
| Linux | `libduktape.so.207.20700` |
```bash
cp duktape-*/libduktape.* .
```
3) Download the SheetJS Standalone script, shim script and test file. Move all
three files to the project directory:
<ul>
<li><a href={`https://cdn.sheetjs.com/xlsx-${current}/package/dist/shim.min.js`}>shim.min.js</a></li>
<li><a href={`https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js`}>xlsx.full.min.js</a></li>
<li><a href="https://sheetjs.com/pres.numbers">pres.numbers</a></li>
</ul>
<CodeBlock language="bash">{`\
curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/shim.min.js
curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.full.min.js
curl -LO https://sheetjs.com/pres.numbers`}
</CodeBlock>
4) Download [`SheetJSDuk.py`](pathname:///duk/SheetJSDuk.py):
```bash
curl -LO https://docs.sheetjs.com/duk/SheetJSDuk.py
```
5) Edit the `SheetJSDuk.py` script.
The `lib` variable declares the path to the library:
```python title="SheetJSDuk.py (edit highlighted line)"
#!/usr/bin/env python3
# highlight-next-line
lib = "libduktape.207.20700.so"
```
<Tabs groupId="triple">
<TabItem value="darwin-x64" label="MacOS">
The name of the library is `libduktape.207.20700.so`:
```python title="SheetJSDuk.py (change highlighted line)"
# highlight-next-line
lib = "libduktape.207.20700.so"
```
</TabItem>
<TabItem value="linux-x64" label="Linux">
The name of the library is `libduktape.so.207.20700`:
```python title="SheetJSDuk.py (change highlighted line)"
# highlight-next-line
lib = "libduktape.so.207.20700"
```
</TabItem>
</Tabs>
6) Run the script:
```bash
python3 SheetJSDuk.py pres.numbers
```
If the program succeeded, the CSV contents will be printed to console and the
file `sheetjsw.xlsb` will be created. That file can be opened with Excel.
### Perl
The Perl binding for Duktape is available as `JavaScript::Duktape::XS` on CPAN.
The Perl binding does not have raw `Buffer` ops, so Base64 strings are used.
#### Perl Demo
:::note Tested Deployments
This demo was tested in the following deployments:
| Architecture | Version | Date |
|:-------------|:--------|:-----------|
| `darwin-x64` | `2.2.0` | 2024-01-26 |
| `linux-x64`  | `2.2.0` | 2024-01-26 |
:::
0) Ensure `perl` and `cpan` are installed and available on the system path.
1) Install the `JavaScript::Duktape::XS` library:
```bash
cpan install JavaScript::Duktape::XS
```
:::note pass
On some systems, the command must be run as the root user:
```bash
sudo cpan install JavaScript::Duktape::XS
```
:::
2) Save the following codeblock to `SheetJSDuk.pl`:
```perl title="SheetJSDuk.pl"
# usage: perl SheetJSDuk.pl path/to/file
use JavaScript::Duktape::XS;
use File::Slurp;
use MIME::Base64 qw( encode_base64 decode_base64 );
# Initialize
my $js = JavaScript::Duktape::XS->new({ max_memory_bytes => 256 * 1024 * 1024 });
$js->eval("var global = (function(){ return this; }).call(null);");
# Load the ExtendScript build
my $src = read_file('xlsx.extendscript.js', { binmode => ':raw' });
$src =~ s/^\xEF\xBB\xBF//;
my $XLSX = $js->eval($src);
# Print version number
$js->set('log' => sub { print $_[0], "\n"; });
$js->eval("log('SheetJS library version ' + XLSX.version);");
# Parse File
my $raw_data = encode_base64(read_file($ARGV[0], { binmode => ':raw' }), "");
$js->set("b64", $raw_data);
$js->eval(qq{
  global.wb = XLSX.read(b64, {type: "base64", WTF:1});
  global.ws = wb.Sheets[wb.SheetNames[0]];
  void 0;
});
# Print first worksheet CSV
$js->eval('log(XLSX.utils.sheet_to_csv(global.ws))');
# Write XLSB file
my $xlsb = $js->eval("XLSX.write(global.wb, {type:'base64', bookType:'xlsb'})");
write_file("SheetJSDuk.xlsb", decode_base64($xlsb));
```
3) Download the SheetJS ExtendScript build and test file:
<CodeBlock language="bash">{`\
curl -LO https://cdn.sheetjs.com/xlsx-${current}/package/dist/xlsx.extendscript.js
curl -LO https://sheetjs.com/pres.xlsx`}
</CodeBlock>
4) Run the script:
```bash
perl SheetJSDuk.pl pres.xlsx
```
If the script succeeded, the data in the test file will be printed in CSV rows.
The script will also export `SheetJSDuk.xlsb`.
:::note pass
In the latest Linux ARM64 test, the command failed due to missing `File::Slurp`:
```
Can't locate File/Slurp.pm in @INC (you may need to install the File::Slurp module)
```
The fix is to install `File::Slurp` with `cpan`:
```bash
sudo cpan install File::Slurp
```
:::
[^1]: See [Foreign Function Interface](https://www.php.net/manual/en/book.ffi.php) in the PHP documentation.
[^2]: See [`ctypes`](https://docs.python.org/3/library/ctypes.html) in the Python documentation.