---
title: HTTP Server Processing
---
import current from '/version.js';
import CodeBlock from '@theme/CodeBlock';
Server-Side JS platforms like NodeJS and Deno have built-in APIs for listening
on network interfaces. They provide wrappers for requests and responses.
## Overview
#### Reading Data
Typically servers receive form data with content type `multipart/form-data` or
`application/x-www-form-urlencoded`. The platforms themselves typically do not
provide "body parsing" functions, instead leaning on the community to supply
modules that take the encoded data and split it into form fields and files.
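
As a rough illustration of where such modules start, the sketch below inspects a `Content-Type` header value and decodes a URL-encoded body with the standard `URLSearchParams` class. The helper names `classifyBody` and `parseUrlEncoded` are hypothetical and are not part of any library discussed here. Multipart bodies (the file upload case) are deliberately left to a dedicated parser.

```js
/* hypothetical helper: split a Content-Type header into media type and boundary */
function classifyBody(contentType) {
  var parts = String(contentType || "").split(";");
  var type = parts[0].trim().toLowerCase();
  var boundary = null;
  parts.slice(1).forEach(function(p) {
    /* multipart/form-data carries a `boundary` parameter, optionally quoted */
    var m = p.trim().match(/^boundary=(.*)$/i);
    if(m) boundary = m[1].replace(/^"|"$/g, "");
  });
  return { type: type, boundary: boundary };
}

/* hypothetical helper: decode an application/x-www-form-urlencoded body */
function parseUrlEncoded(body) {
  /* URLSearchParams handles percent-decoding and "+" as space */
  return Object.fromEntries(new URLSearchParams(body));
}
```

A handler could call `classifyBody(req.headers["content-type"])` and hand the request to a community parser when the type is `multipart/form-data`.
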
NodeJS servers typically use a parser like `formidable`. In the example below,
`formidable` writes the upload to a temporary file and `XLSX.readFile` reads it:
```js
var XLSX = require("xlsx"); // This is using the CommonJS build
var formidable = require("formidable");

require("http").createServer(function(req, res) {
  if(req.method !== "POST") return res.end("");

  /* parse body and implement logic in callback */
  // highlight-next-line
  (new formidable.IncomingForm()).parse(req, function(err, fields, files) {
    /* if successful, files is an object whose keys are param names */
    // highlight-next-line
    var file = files["upload"]; // <input type="file" id="upload" name="upload">
    /* file.filepath is a location in the filesystem, usually in a temp folder */
    // highlight-next-line
    var wb = XLSX.readFile(file.filepath);
    // print the first worksheet back as a CSV
    res.end(XLSX.utils.sheet_to_csv(wb.Sheets[wb.SheetNames[0]]));
  });
}).listen(process.env.PORT || 3000);
```
`XLSX.read` will accept NodeJS buffers as well as `Uint8Array`, Base64 strings,
binary strings, and plain Arrays of bytes. This covers the interface types of
a wide variety of frameworks.
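
When a client posts the raw file bytes as the request body (no form encoding), the chunks can be gathered into a single `Buffer` by hand. The following is a minimal sketch using only NodeJS stream events. The helper name `collectBody` is hypothetical, and the sketch assumes the upload is small enough to hold in memory.

```js
/* hypothetical helper: gather every chunk of a request (or any readable stream) */
function collectBody(req) {
  return new Promise(function(resolve, reject) {
    var chunks = [];
    req.on("data", function(chunk) { chunks.push(Buffer.from(chunk)); });
    /* once the stream ends, join the pieces into one Buffer */
    req.on("end", function() { resolve(Buffer.concat(chunks)); });
    req.on("error", reject);
  });
}

/* inside a request handler, the resulting Buffer can be passed to XLSX.read:
   collectBody(req).then(function(buf) { var wb = XLSX.read(buf); }); */
```
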
#### Writing Data
Typically server libraries use a response API that accepts `Uint8Array` data.
`XLSX.write` with the option `type: "buffer"` will generate data. To force the
response to be treated as an attachment, set the `Content-Disposition` header:
```js
var XLSX = require("xlsx"); // This is using the CommonJS build
require("http").createServer(function(req, res) {
if(req.method !== "GET") return res.end("");
var wb = XLSX.read("S,h,e,e,t,J,S\n5,4,3,3,7,9,5", {type: "binary"});
// highlight-start
res.setHeader('Content-Disposition', 'attachment; filename="SheetJS.xlsx"');
res.end(XLSX.write(wb, {type:"buffer", bookType: "xlsx"}));
// highlight-end
}).listen(process.env.PORT || 3000);
```
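
File names with quotes or non-ASCII characters need extra care in the header value. The sketch below is one possible way to build the value, assuming the common combination of a quoted ASCII fallback with an RFC 5987 `filename*` parameter. The helper name `attachmentHeader` is hypothetical and not part of SheetJS or NodeJS.

```js
/* hypothetical helper: build a Content-Disposition value for a download */
function attachmentHeader(filename) {
  /* ASCII fallback: replace non-ASCII characters, quotes and backslashes */
  var fallback = filename.replace(/[^\x20-\x7e]/g, "_").replace(/["\\]/g, "_");
  var header = 'attachment; filename="' + fallback + '"';
  /* add the extended parameter only when the fallback lost information */
  if(fallback !== filename) {
    /* encodeURIComponent leaves ' ( ) * unescaped, so escape them manually */
    var encoded = encodeURIComponent(filename).replace(/['()*]/g, function(c) {
      return "%" + c.charCodeAt(0).toString(16).toUpperCase();
    });
    header += "; filename*=UTF-8''" + encoded;
  }
  return header;
}
```

In a handler this would be used as `res.setHeader("Content-Disposition", attachmentHeader("SheetJS.xlsx"))`.
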
## NodeJS
When processing small files, the work is best handled in the server response
handler function. This approach is used in the "Framework Demos" section.
When processing large files, the direct approach will freeze the server. NodeJS
provides ["Worker Threads"](#worker-threads) for this exact use case.
### Framework Demos
#### Express
The `express-formidable` middleware is powered by the `formidable` parser. It
adds a `files` property to the request.
When downloading binary data, Express handles `Buffer` data in `res.end`. The
convenience `attachment` method adds the required header:
```js
// Header 'Content-Disposition: attachment; filename="SheetJS.xlsx"'
res.attachment("SheetJS.xlsx");
```
The following demo Express server will respond to POST requests to `/upload`
with a CSV output of the first sheet. It will also respond to GET requests to
`/download`, responding with a fixed XLSX worksheet:
```js title="SheetJSExpressCSV.js"
var XLSX = require('xlsx'), express = require('express');
/* create app */
var app = express();
/* add express-formidable middleware */
// highlight-next-line
app.use(require('express-formidable')());
/* route for handling uploaded data */
app.post('/upload', function(req, res) {
// highlight-start
var f = req.files["upload"]; // <input type="file" id="upload" name="upload">
var wb = XLSX.readFile(f.path);
// highlight-end
/* respond with CSV data from the first sheet */
res.status(200).end(XLSX.utils.sheet_to_csv(wb.Sheets[wb.SheetNames[0]]));
});
app.get('/download', function(req, res) {
/* generate workbook object */
var ws = XLSX.utils.aoa_to_sheet(["SheetJS".split(""), [5,4,3,3,7,9,5]]);
var wb = XLSX.utils.book_new(); XLSX.utils.book_append_sheet(wb, ws, "Data");
// highlight-start
/* generate buffer */
var buf = XLSX.write(wb, {type: "buffer", bookType: "xlsx"});
/* set headers */
res.attachment("SheetJSExpress.xlsx");
/* respond with file data */
res.status(200).end(buf);
// highlight-end
});
app.listen(+process.env.PORT||3000);
```
<details><summary><b>Testing</b> (click to show)</summary>
0) Save the code sample to `SheetJSExpressCSV.js`
1) Install dependencies:
<CodeBlock language="bash">{`\
npm i --save https://cdn.sheetjs.com/xlsx-${current}/xlsx-${current}.tgz express express-formidable`}
</CodeBlock>
2) Start server (note: it will not print anything to console when running)
```bash
node SheetJSExpressCSV.js
```
3) Test POST requests using <https://sheetjs.com/pres.numbers>:
```bash
curl -LO https://sheetjs.com/pres.numbers
curl -X POST -F upload=@pres.numbers http://localhost:3000/upload
```
The response should show the data in CSV rows.
4) Test GET requests by opening `http://localhost:3000/download` in your browser.
It should prompt to download `SheetJSExpress.xlsx`
</details>
#### NestJS
[The NestJS docs](https://docs.nestjs.com/techniques/file-upload) have detailed
instructions for file upload support. In the controller, the `path` property
works with `XLSX.readFile`.
When downloading binary data, NestJS expects `StreamableFile`-wrapped Buffers.
The following demo NestJS Controller will respond to POST requests to `/upload`
with a CSV output of the first sheet. It will also respond to GET requests to
`/download`, responding with a fixed export:
```ts title="src/sheetjs/sheetjs.controller.ts"
import { Controller, Get, Header, Post, StreamableFile, UploadedFile, UseInterceptors } from '@nestjs/common';
import { FileInterceptor } from '@nestjs/platform-express';
import { readFile, utils, write } from 'xlsx';

@Controller('sheetjs')
export class SheetjsController {
  @Post('upload') // <input type="file" id="upload" name="upload">
  @UseInterceptors(FileInterceptor('upload'))
  async uploadXlsxFile(@UploadedFile() file: Express.Multer.File) {
    /* file.path is a path to the workbook */
    // highlight-next-line
    const wb = readFile(file.path);
    /* generate CSV of first worksheet */
    return utils.sheet_to_csv(wb.Sheets[wb.SheetNames[0]]);
  }

  @Get('download')
  @Header('Content-Disposition', 'attachment; filename="SheetJSNest.xlsx"')
  async downloadXlsxFile(): Promise<StreamableFile> {
    var ws = utils.aoa_to_sheet(["SheetJS".split(""), [5,4,3,3,7,9,5]]);
    var wb = utils.book_new(); utils.book_append_sheet(wb, ws, "Data");
    // highlight-start
    /* generate buffer */
    var buf = write(wb, {type: "buffer", bookType: "xlsx"});
    /* Return a streamable file */
    return new StreamableFile(buf);
    // highlight-end
  }
}
```
<details><summary><b>Testing</b> (click to show)</summary>
1) Create a new project:
<CodeBlock language="bash">{`\
npx @nestjs/cli new -p npm sheetjs-nest
cd sheetjs-nest
npm i --save https://cdn.sheetjs.com/xlsx-${current}/xlsx-${current}.tgz
npm i --save-dev @types/multer
mkdir -p upload`}
</CodeBlock>
2) Create a new controller and a new module:
```bash
npx @nestjs/cli generate module sheetjs
npx @nestjs/cli generate controller sheetjs
```
3) Add `multer` to the new module by editing `src/sheetjs/sheetjs.module.ts`.
Changes are highlighted below:
```ts title="src/sheetjs/sheetjs.module.ts"
import { Module } from '@nestjs/common';
import { SheetjsController } from './sheetjs.controller';
// highlight-next-line
import { MulterModule } from '@nestjs/platform-express';
@Module({
// highlight-start
imports: [
MulterModule.register({
dest: './upload',
}),
],
// highlight-end
controllers: [SheetjsController]
})
export class SheetjsModule {}
```
4) Copy the `src/sheetjs/sheetjs.controller.ts` example from earlier, replacing
the contents of the existing file.
5) Start the server with
```bash
npx @nestjs/cli start
```
6) Test POST requests using <https://sheetjs.com/pres.numbers>:

```bash
curl -LO https://sheetjs.com/pres.numbers
curl -X POST -F upload=@pres.numbers http://localhost:3000/sheetjs/upload
```

The response should show the data in CSV rows.

7) Test GET requests by opening `http://localhost:3000/sheetjs/download` in your browser.

It should prompt to download `SheetJSNest.xlsx`

</details>
#### Fastify

:::note

This demo was verified on 2023 April 06 using `fastify@4.15.0`

:::

_Reading Data_
`@fastify/multipart`, which uses `busboy` under the hood, can be registered:
```js
/* load SheetJS Library */
const XLSX = require("xlsx");
/* load fastify and enable body parsing */
const fastify = require('fastify')({logger: true});
// highlight-next-line
fastify.register(require('@fastify/multipart'), { attachFieldsToBody: true });
```
Once registered with the option `attachFieldsToBody`, route handlers can use
`req.body` directly:
```js
/* POST / reads submitted file and responds with a CSV */
fastify.post('/', async(req, reply) => {
  /* "upload" is the name of the field in the HTML form */
  const file = req.body.upload;

  /* toBuffer returns a promise that resolves to a Buffer */
  // highlight-next-line
  const buf = await file.toBuffer();

  /* `XLSX.read` can read the Buffer */
  const wb = XLSX.read(buf);

  /* reply with a CSV */
  reply.send(XLSX.utils.sheet_to_csv(wb.Sheets[wb.SheetNames[0]]));
});
```
:::caution
Out of the box, Fastify will return an error `FST_ERR_CTP_BODY_TOO_LARGE` when
processing large spreadsheets (`statusCode 413`). This is a Fastify issue.
The default body size limit (including all uploaded files and fields) is 1 MB.
It can be increased by setting the `bodyLimit` option during server creation:
```js
/* increase request body size limit to 5MB = 5 * 1024 * 1024 bytes */
const fastify = require('fastify')({bodyLimit: 5 * 1024 * 1024});
```
:::
_Writing Data_
The `Content-Disposition` header must be set manually:
```js
/* GET / returns a workbook */
fastify.get('/', (req, reply) => {
/* make a workbook */
var wb = XLSX.read("S,h,e,e,t,J,S\n5,4,3,3,7,9,5", {type: "binary"});
/* write to Buffer */
const buf = XLSX.write(wb, {type:"buffer", bookType: "xlsx"});
/* set Content-Disposition header and send data */
// highlight-next-line
reply.header('Content-Disposition', 'attachment; filename="SheetJSFastify.xlsx"').send(buf);
});
```
<details><summary><b>Testing</b> (click to show)</summary>
0) Save the following snippet to `SheetJSFastify.js`:

```js title="SheetJSFastify.js"
/* load SheetJS Library */
const XLSX = require("xlsx");
/* load fastify and enable body parsing */
const fastify = require('fastify')({logger: true});
fastify.register(require('@fastify/multipart'), { attachFieldsToBody: true });

/* GET / returns a workbook */
fastify.get('/', (req, reply) => {
  /* make a workbook */
  var wb = XLSX.read("S,h,e,e,t,J,S\n5,4,3,3,7,9,5", {type: "binary"});

  /* write to Buffer */
  const buf = XLSX.write(wb, {type:"buffer", bookType: "xlsx"});

  /* set Content-Disposition header and send data */
  reply.header('Content-Disposition', 'attachment; filename="SheetJSFastify.xlsx"').send(buf);
});

/* POST / reads submitted file and responds with a CSV */
fastify.post('/', async(req, reply) => {
  /* "upload" is the name of the field in the HTML form */
  const file = req.body.upload;

  /* toBuffer returns a promise that resolves to a Buffer */
  const wb = XLSX.read(await file.toBuffer());

  /* send back a CSV */
  reply.send(XLSX.utils.sheet_to_csv(wb.Sheets[wb.SheetNames[0]]));
});

/* start */
fastify.listen({port: process.env.PORT || 3000}, (err, addr) => { if(err) throw err; });
```

1) Install dependencies:
<CodeBlock language="bash">{`\
npm i --save https://cdn.sheetjs.com/xlsx-${current}/xlsx-${current}.tgz fastify @fastify/multipart`}
</CodeBlock>
2) Start server
```bash
node SheetJSFastify.js
```
3) Test POST requests using <https://sheetjs.com/pres.numbers>:
```bash
curl -LO https://sheetjs.com/pres.numbers
curl -X POST -F upload=@pres.numbers http://localhost:3000/
```
The response should show the data in CSV rows.
4) Test GET requests by opening `http://localhost:3000/` in your browser.
It should prompt to download `SheetJSFastify.xlsx`
</details>
### Worker Threads
NodeJS "Worker Threads" first shipped as an experimental module in v10.5 and
were marked as stable in the v12 release line. Coupled with `AsyncResource`, a
simple thread pool enables processing without blocking the server. The official
NodeJS docs include a sample worker pool implementation.
This example uses ExpressJS to create a general XLSX conversion service, but
the same approach applies to any NodeJS server side framework.
When reading large files, it is strongly recommended to run the body parser in
the main server process. Body parsers like `formidable` will write uploaded
files to the filesystem, and the file path should be passed to the worker (and
the worker would be responsible for reading and cleaning up the files).
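
The hand-off described above can be sketched with the NodeJS standard library alone. This is a simplified illustration, not the worker pool from the NodeJS docs: it spawns one worker per task, the helper name `processInWorker` is hypothetical, and counting bytes stands in for the SheetJS work that a real worker would perform.

```js
var { Worker } = require("node:worker_threads");

/* worker source, evaluated off the main thread */
var WORKER_SRC = `
  var fs = require("node:fs");
  var { parentPort, workerData } = require("node:worker_threads");
  /* read the uploaded file; a real worker would call XLSX.readFile here */
  var buf = fs.readFileSync(workerData.path);
  /* the worker owns the temp file, so it removes the file before replying */
  fs.unlinkSync(workerData.path);
  parentPort.postMessage({ bytes: buf.length });
`;

/* hypothetical helper: process one uploaded file in a fresh worker */
function processInWorker(filepath) {
  return new Promise(function(resolve, reject) {
    var worker = new Worker(WORKER_SRC, { eval: true, workerData: { path: filepath } });
    worker.once("message", resolve);
    worker.once("error", reject);
    worker.once("exit", function(code) {
      if(code !== 0) reject(new Error("worker exited with code " + code));
    });
  });
}
```

Only the file path crosses the thread boundary, so the main thread never holds the file contents. A pool avoids the startup cost of creating a worker for every request.
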
:::note
The `child_process` module can also spawn [command-line tools](/docs/demos/cli).
That approach is not explored in this demo.
:::
<details><summary><b>Complete Example</b> (click to show)</summary>
:::note

This demo was last tested on 2023 March 14.

Versions: NodeJS 18.15.0 + ExpressJS 4.18.2 + Formidable 2.1.1

:::

0) Create a simple ECMAScript-Module-enabled `package.json`:
```json title="package.json"
{ "type": "module" }
```
1) Install the dependencies:
<CodeBlock language="bash">{`\
npm i --save https://cdn.sheetjs.com/xlsx-${current}/xlsx-${current}.tgz express@4.18.2 formidable@2.1.1`}
</CodeBlock>
2) Create a worker script `worker.js` that listens for messages. When a message
is received, it will read the file from the filesystem, generate and pass back a
new XLSX file, and delete the original file:
```js title="worker.js"
/* load the worker_threads module */
import { parentPort } from 'node:worker_threads';
/* load the SheetJS module and hook to FS */
import { set_fs, readFile, write } from 'xlsx';
import * as fs from 'fs';
set_fs(fs);
/* the server will send a message with the `path` field */
parentPort.on('message', (task) => {
/* highlight-start */
// read file
const wb = readFile(task.path, { dense: true });
// send back XLSX
parentPort.postMessage(write(wb, { type: "buffer", bookType: "xlsx" }));
/* highlight-end */
// remove file
fs.unlink(task.path, ()=>{});
});
```
3) Download [`worker_pool.js`](pathname:///server/worker_pool.js):
```bash
curl -LO https://docs.sheetjs.com/server/worker_pool.js
```
(this is a slightly modified version of the example in the NodeJS docs)
4) Save the following server code to `main.mjs`:
```js title="main.mjs"
/* load dependencies */
import os from 'node:os';
import process from 'node:process'
import express from 'express';
import formidable from 'formidable';
/* load worker pool */
import WorkerPool from './worker_pool.js';
const pool = new WorkerPool(os.cpus().length);
process.on("beforeExit", () => { pool.close(); })
/* create server */
const app = express();
app.post('/', (req, res, next) => {
// parse body
const form = formidable({});
form.parse(req, (err, fields, files) => {
// look for "upload" field
if(err) return next(err);
if(!files["upload"]) return next(new Error("missing `upload` file"));
// send a message to the worker with the path to the uploaded file
// highlight-next-line
pool.runTask({ path: files["upload"].filepath }, (err, result) => {
if(err) return next(err);
// send the file back as an attachment
res.attachment("SheetJSPool.xlsx");
res.status(200).end(result);
});
});
});
// start server
app.listen(7262, () => { console.log(`Example app listening on port 7262`); });
```
5) Run the server:
```bash
node main.mjs
```
Test with the [`pres.numbers` sample file](https://sheetjs.com/pres.numbers):
```bash
curl -LO https://sheetjs.com/pres.numbers
curl -X POST -F upload=@pres.numbers http://localhost:7262/ -J -O
```
This will generate `SheetJSPool.xlsx`.
</details>
## Deno
:::caution
Many hosted services like Deno Deploy do not offer filesystem access.
This breaks web frameworks that use the filesystem in body parsing.
:::
Deno provides the basic elements to implement a server. It does not provide a
body parser out of the box.
#### Drash
In testing, [Drash](https://drash.land/drash/) had an in-memory body parser
which could handle file uploads on hosted services like Deno Deploy.
_Reading Data_
`Request#bodyParam` reads body parameters. For uploaded files, the `content`
property is a `Uint8Array`:
<CodeBlock language="ts">{`\
// @deno-types="https://cdn.sheetjs.com/xlsx-${current}/package/types/index.d.ts"
import { read, utils } from 'https://cdn.sheetjs.com/xlsx-${current}/package/xlsx.mjs';
\n\
import * as Drash from "https://deno.land/x/drash@v2.5.4/mod.ts";
\n\
class ParseResource extends Drash.Resource {
public paths = ["/"];
\n\
public POST(request: Drash.Request, response: Drash.Response) {
// assume a form upload like <input type="file" id="upload" name="upload">
// highlight-next-line
const file = request.bodyParam<Drash.Types.BodyFile>("upload");
if (!file) throw new Error("File is required!");
// highlight-next-line
var wb = read(file.content);
return response.html( utils.sheet_to_html(wb.Sheets[wb.SheetNames[0]]));
}
}`}
</CodeBlock>
_Writing Data_
Headers are manually set with `Response#headers.set` while the raw body is set
with `Response#send`:
<CodeBlock language="ts">{`\
// @deno-types="https://cdn.sheetjs.com/xlsx-${current}/package/types/index.d.ts"
import { utils, write } from 'https://cdn.sheetjs.com/xlsx-${current}/package/xlsx.mjs';
\n\
import * as Drash from "https://deno.land/x/drash@v2.5.4/mod.ts";
\n\
class WriteResource extends Drash.Resource {
public paths = ["/export"];
\n\
public GET(request: Drash.Request, response: Drash.Response): void {
// create some fixed workbook
const data = ["SheetJS".split(""), [5,4,3,3,7,9,5]];
const ws = utils.aoa_to_sheet(data);
const wb = utils.book_new(); utils.book_append_sheet(wb, ws, "data");
// write the workbook to XLSX as a Uint8Array
// highlight-next-line
const file = write(wb, { bookType: "xlsx", type: "buffer"});
// set headers
response.headers.set("Content-Disposition", 'attachment; filename="SheetJSDrash.xlsx"');
// send data
// highlight-next-line
return response.send("application/vnd.ms-excel", file);
}
}`}
</CodeBlock>
<details><summary><b>Complete Example</b> (click to show)</summary>
1) Download [`SheetJSDrash.ts`](pathname:///server/SheetJSDrash.ts):
```bash
curl -LO https://docs.sheetjs.com/server/SheetJSDrash.ts
```
2) Run the server:
```bash
deno run --allow-net SheetJSDrash.ts
```
3) Download the test file <https://sheetjs.com/pres.numbers>
4) Open `http://localhost:7262/` in your browser.
Click "Choose File" and select `pres.numbers`. Then click "Submit"
The page should show the contents of the file as an HTML table.
5) Open `http://localhost:7262/export` in your browser.
The page should attempt to download `SheetJSDrash.xlsx`. Open the new file.
</details>