# Headless Browsers

The library avoids unstable and nascent ECMAScript features, so it plays nicely
with most headless browsers. This demo shows a few common headless scenarios.

NodeJS does not ship with its own layout engine. For advanced HTML exports, a
headless browser can fill the gap: it is generally indistinguishable from a
regular browser process.
## Chromium Automation with Puppeteer

[Puppeteer](https://pptr.dev/) enables headless Chromium automation.

[`html.js`](./html.js) shows a dedicated script for converting an HTML file to
XLSB using Puppeteer. The first argument is the path to the HTML file. The
script writes to `output.xlsb`:
```bash
# read from test.html and write to output.xlsb
$ node html.js test.html
```
The script opens the page in headless Chromium and adds a script tag that
references the standalone browser build. This makes the `XLSX` variable
available to scripts that are later evaluated in the page. The browser context
cannot save the file using `writeFile`, so the demo generates the XLSB
spreadsheet bytes with the `base64` type, sends the string back to the main
process, and uses `fs.writeFileSync` to write the file.
## WebKit Automation with PhantomJS

This was tested using [PhantomJS 2.1.1](https://phantomjs.org/download.html):
```bash
$ phantomjs phantomjs.js
```
The flow is similar to the Puppeteer flow: scrape the table and generate the
workbook in the website context, copy the string back, and write the string to
a file from the main process.

The `binary` type generates strings that can be written in PhantomJS using the
`fs.write` method with mode `"wb"`.
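
A minimal sketch of that flow is shown below. It is not the contents of
`phantomjs.js`: the function names `makeWorkbookString` and `main` are
hypothetical, the page URL is taken from a command-line argument as an
assumption, and a copy of the standalone build is assumed to sit next to the
script as `xlsx.full.min.js`. The code sticks to ES5 syntax because PhantomJS
2.1.1 predates most newer language features.

```javascript
// Sketch only: names and file locations below are illustrative assumptions.

// Runs inside the page: scrape the first table and return a binary string.
function makeWorkbookString() {
  var table = document.getElementsByTagName("table")[0];
  var wb = XLSX.utils.table_to_book(table);
  return XLSX.write(wb, { type: "binary", bookType: "xlsb" });
}

// Runs in the PhantomJS main process.
function main(url, outPath) {
  var fs = require("fs");
  var page = require("webpage").create();
  page.open(url, function (status) {
    if (status !== "success") {
      console.log("could not open " + url);
      phantom.exit(1);
      return;
    }
    // Assumes the standalone build was copied next to this script.
    if (!page.injectJs("xlsx.full.min.js")) {
      console.log("could not inject the standalone build");
      phantom.exit(1);
      return;
    }
    // The function is serialized, run in the page, and its string copied back.
    var bin = page.evaluate(makeWorkbookString);
    // Mode "wb" writes the binary string without text encoding.
    fs.write(outPath, bin, "wb");
    phantom.exit(0);
  });
}

// Only runs under PhantomJS, where the `phantom` global is defined.
if (typeof phantom !== "undefined") {
  var args = require("system").args;
  main(args[1], "output.xlsb");
}
```

Under these assumptions the sketch would be started as
`phantomjs sketch.js <url>`, where `sketch.js` is a hypothetical file name.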
## wkhtmltopdf

This was tested in wkhtmltopdf 0.12.4, installed using the official binaries:
```bash
$ wkhtmltopdf --javascript-delay 20000 http://oss.sheetjs.com/sheetjs/tests/ test.pdf
```