docs.sheetjs.com/docz/docs/03-demos/03-net/01-network/index.mdx

620 lines
16 KiB
Plaintext
Raw Normal View History

2022-08-05 05:10:11 +00:00
---
2023-11-20 02:51:39 +00:00
title: HTTP Downloads
pagination_next: demos/net/upload/index
2022-08-05 05:10:11 +00:00
---
<head>
2024-01-17 20:22:38 +00:00
<script src="https://unpkg.com/axios@1.6.5/dist/axios.min.js"></script>
<script src="https://unpkg.com/superagent@8.1.2/dist/superagent.min.js"></script>
2022-08-05 05:10:11 +00:00
</head>
2023-05-21 07:03:46 +00:00
import current from '/version.js';
import CodeBlock from '@theme/CodeBlock';
2022-08-13 22:01:26 +00:00
2024-01-03 06:47:00 +00:00
`XMLHttpRequest` and `fetch` browser APIs enable binary data transfer between
web browser clients and web servers. Since this library works in web browsers,
server conversion work can be offloaded to the client! This demo shows a few
common scenarios involving browser APIs and popular wrapper libraries.
2023-11-20 02:51:39 +00:00
:::info pass
This demo focuses on downloading files. Other demos cover other HTTP use cases:
- ["HTTP Uploads"](/docs/demos/net/upload) covers uploading files
- ["HTTP Server Processing"](/docs/demos/net/server) covers HTTP servers
:::
2022-08-05 05:10:11 +00:00
:::caution Third-Party Hosts and Binary Data
2023-11-20 02:51:39 +00:00
Third-party cloud platforms such as AWS may corrupt raw binary downloads by
encoding requests and responses in UTF-8 strings.
2022-08-05 05:10:11 +00:00
For AWS, in the "Binary Media Types" section of the API Gateway console, the
2023-11-20 02:51:39 +00:00
`"application/vnd.ms-excel"` type should be added to ensure that AWS Lambda
functions functions can send files to clients.
2022-08-05 05:10:11 +00:00
:::
## Downloading Binary Data
Most interesting spreadsheet files are binary data that contain byte sequences
that represent invalid UTF-8 characters.
2024-01-17 20:22:38 +00:00
APIs generally provide options to control how downloaded data is interpreted.
The `arraybuffer` response type usually forces the data to be presented as an
`ArrayBuffer` object which can be parsed with the SheetJS `read` method[^1].
2022-08-05 05:10:11 +00:00
2024-01-03 06:47:00 +00:00
The following example shows the data flow using `fetch` to download files:
```mermaid
flowchart LR
server[(Remote\nFile)]
response(Response\nobject)
subgraph SheetJS operations
ab(XLSX Data\nArrayBuffer)
wb(((SheetJS\nWorkbook)))
end
server --> |`fetch`\nGET request| response
response --> |`arrayBuffer`\n\n| ab
ab --> |`read`\n\n| wb
```
2022-08-05 05:10:11 +00:00
```js
2023-11-20 02:51:39 +00:00
/* download data into an ArrayBuffer object */
2022-08-05 05:10:11 +00:00
const res = await fetch("https://sheetjs.com/pres.numbers");
2022-08-14 02:10:41 +00:00
const ab = await res.arrayBuffer(); // recover data as ArrayBuffer
2022-08-05 05:10:11 +00:00
2023-11-20 02:51:39 +00:00
/* parse file */
2022-08-05 05:10:11 +00:00
const wb = XLSX.read(ab);
```
## Browser Demos
When the page is accessed, the browser will attempt to download https://sheetjs.com/pres.numbers
2023-11-20 02:51:39 +00:00
and read the workbook. The old table will be replaced with a table whose
contents match the first worksheet. The table is generated using the SheetJS
`sheet_to_html` method[^2]
2022-08-05 05:10:11 +00:00
2024-01-17 20:22:38 +00:00
:::note Tested Deployments
Each browser demo was tested in the following environments:
| Browser | Date |
|:------------|:-----------|
2024-01-31 08:14:07 +00:00
| Chrome 120 | 2024-01-30 |
2024-01-17 20:22:38 +00:00
| Safari 17.2 | 2024-01-15 |
:::
2022-08-05 05:10:11 +00:00
### XMLHttpRequest
For downloading data, the `arraybuffer` response type generates an `ArrayBuffer`
2024-01-03 06:47:00 +00:00
that can be viewed as an `Uint8Array` and fed to the SheetJS `read` method. For
legacy browsers, the option `type: "array"` should be specified:
2022-08-05 05:10:11 +00:00
```js
/* set up an async GET request */
var req = new XMLHttpRequest();
req.open("GET", url, true);
req.responseType = "arraybuffer";
req.onload = function(e) {
/* parse the data when it is received */
var data = new Uint8Array(req.response);
var workbook = XLSX.read(data, {type:"array"});
/* DO SOMETHING WITH workbook HERE */
};
req.send();
```
<details>
<summary><b>Live Download demo</b> (click to show)</summary>
2022-08-05 05:10:11 +00:00
This demo uses `XMLHttpRequest` to download https://sheetjs.com/pres.numbers
2022-08-05 05:10:11 +00:00
and show the data in an HTML table.
```jsx live
function SheetJSXHRDL() {
2022-10-21 00:10:10 +00:00
const [__html, setHTML] = React.useState("");
2022-08-05 05:10:11 +00:00
/* Fetch and update HTML */
2024-01-31 08:14:07 +00:00
React.useEffect(() => { (async() => {
2022-08-05 05:10:11 +00:00
/* Fetch file */
const req = new XMLHttpRequest();
req.open("GET", "https://sheetjs.com/pres.numbers", true);
req.responseType = "arraybuffer";
req.onload = e => {
/* Parse file */
const wb = XLSX.read(new Uint8Array(req.response));
const ws = wb.Sheets[wb.SheetNames[0]];
/* Generate HTML */
setHTML(XLSX.utils.sheet_to_html(ws));
};
req.send();
2024-01-31 08:14:07 +00:00
})(); }, []);
2022-08-05 05:10:11 +00:00
2023-02-28 11:40:44 +00:00
return ( <div dangerouslySetInnerHTML={{ __html }}/> );
2022-08-05 05:10:11 +00:00
}
```
</details>
### fetch
For downloading data, `Response#arrayBuffer` resolves to an `ArrayBuffer` that
2024-01-03 06:47:00 +00:00
can be converted to `Uint8Array` and passed to the SheetJS `read` method:
2022-08-05 05:10:11 +00:00
```js
fetch(url).then(function(res) {
/* get the data as a Blob */
if(!res.ok) throw new Error("fetch failed");
return res.arrayBuffer();
}).then(function(ab) {
/* parse the data when it is received */
var data = new Uint8Array(ab);
var workbook = XLSX.read(data, {type:"array"});
/* DO SOMETHING WITH workbook HERE */
});
```
<details>
<summary><b>Live Download demo</b> (click to show)</summary>
2022-08-05 05:10:11 +00:00
This demo uses `fetch` to download https://sheetjs.com/pres.numbers and show
2022-08-05 05:10:11 +00:00
the data in an HTML table.
```jsx live
function SheetJSFetchDL() {
2022-10-21 00:10:10 +00:00
const [__html, setHTML] = React.useState("");
2022-08-05 05:10:11 +00:00
/* Fetch and update HTML */
2024-01-31 08:14:07 +00:00
React.useEffect(() => { (async() => {
2022-08-05 05:10:11 +00:00
/* Fetch file */
const res = await fetch("https://sheetjs.com/pres.numbers");
const ab = await res.arrayBuffer();
/* Parse file */
const wb = XLSX.read(ab);
const ws = wb.Sheets[wb.SheetNames[0]];
/* Generate HTML */
setHTML(XLSX.utils.sheet_to_html(ws));
2024-01-31 08:14:07 +00:00
})(); }, []);
2022-08-05 05:10:11 +00:00
2023-02-28 11:40:44 +00:00
return ( <div dangerouslySetInnerHTML={{ __html }}/> );
2022-08-05 05:10:11 +00:00
}
```
</details>
2023-03-20 18:30:42 +00:00
### jQuery
2023-08-30 03:44:38 +00:00
[jQuery](https://jquery.com/) is a JavaScript library that includes helpers for
performing "Ajax" network requests. `jQuery.ajax` (`$.ajax`) does not support
2023-11-20 02:51:39 +00:00
binary data out of the box[^3]. A custom `ajaxTransport` can add support.
2023-08-30 03:44:38 +00:00
2023-11-20 02:51:39 +00:00
SheetJS users have reported success with `jquery.binarytransport.js`[^4] in IE10.
2023-03-20 18:30:42 +00:00
After including the main `jquery.js` and `jquery.binarytransport.js` scripts,
`$.ajax` will support `dataType: "binary"` and `processData: false`.
**[Live Download Demo](pathname:///jquery/index.html)**
In a GET request, the default behavior is to return a `Blob` object. Passing
`responseType: "arraybuffer"` returns a proper `ArrayBuffer` object in IE10:
```js
$.ajax({
type: "GET", url: "https://sheetjs.com/pres.numbers",
/* suppress jQuery post-processing */
// highlight-next-line
processData: false,
/* use the binary transport */
// highlight-next-line
dataType: "binary",
/* pass an ArrayBuffer in the callback */
// highlight-next-line
responseType: "arraybuffer",
success: function (ab) {
/* at this point, ab is an ArrayBuffer */
// highlight-next-line
var wb = XLSX.read(ab);
/* do something with workbook here */
var ws = wb.Sheets[wb.SheetNames[0]];
var html = XLSX.utils.sheet_to_html(ws);
$("#out").html(html);
}
});
```
2022-08-05 05:10:11 +00:00
### Wrapper Libraries
Before `fetch` shipped with browsers, there were various wrapper libraries to
2024-01-03 06:47:00 +00:00
simplify `XMLHttpRequest`. Due to limitations with `fetch`, these libraries are
still relevant.
2022-08-05 05:10:11 +00:00
#### axios
2023-08-30 03:44:38 +00:00
[`axios`](https://axios-http.com/) presents a Promise based interface. Setting
2024-01-03 06:47:00 +00:00
`responseType` to `arraybuffer` ensures the return type is an ArrayBuffer. The
`data` property of the result can be passed to the SheetJS `read` method:
2022-08-05 05:10:11 +00:00
```js
async function workbook_dl_axios(url) {
const res = await axios(url, {responseType:'arraybuffer'});
const workbook = XLSX.read(res.data);
return workbook;
}
```
<details>
<summary><b>Live Download demo</b> (click to show)</summary>
2022-08-05 05:10:11 +00:00
This demo uses `axios` to download https://sheetjs.com/pres.numbers and show
2022-08-05 05:10:11 +00:00
the data in an HTML table.
2023-09-24 03:59:48 +00:00
:::caution pass
2023-02-18 02:33:30 +00:00
If the live demo shows a message
```
ReferenceError: axios is not defined
```
please refresh the page. This is a known bug in the documentation generator.
:::
2022-08-05 05:10:11 +00:00
```jsx live
function SheetJSAxiosDL() {
2022-10-21 00:10:10 +00:00
const [__html, setHTML] = React.useState("");
2022-08-05 05:10:11 +00:00
/* Fetch and update HTML */
2024-01-31 08:14:07 +00:00
React.useEffect(() => { (async() => {
2023-05-21 07:03:46 +00:00
if(typeof axios != "function") return setHTML("ReferenceError: axios is not defined");
2022-08-05 05:10:11 +00:00
/* Fetch file */
const res = await axios("https://sheetjs.com/pres.numbers", {responseType: "arraybuffer"});
/* Parse file */
const wb = XLSX.read(res.data);
const ws = wb.Sheets[wb.SheetNames[0]];
/* Generate HTML */
setHTML(XLSX.utils.sheet_to_html(ws));
2024-01-31 08:14:07 +00:00
})(); }, []);
2022-08-05 05:10:11 +00:00
2023-02-28 11:40:44 +00:00
return ( <div dangerouslySetInnerHTML={{ __html }}/> );
2022-08-05 05:10:11 +00:00
}
```
</details>
2022-08-13 22:01:26 +00:00
#### superagent
2022-08-05 05:10:11 +00:00
2024-01-17 20:22:38 +00:00
[`superagent`](https://ladjs.github.io/superagent/) is a network request library
with a "Fluent Interface". Calling the `responseType` method with
2023-08-30 03:44:38 +00:00
`"arraybuffer"` will ensure the final response object is an `ArrayBuffer`:
2022-08-05 05:10:11 +00:00
```js
/* set up an async GET request with superagent */
2023-08-30 03:44:38 +00:00
superagent
.get(url)
.responseType('arraybuffer')
.end(function(err, res) {
/* parse the data when it is received */
var data = new Uint8Array(res.body);
var workbook = XLSX.read(data, {type:"array"});
/* DO SOMETHING WITH workbook HERE */
});
2022-08-05 05:10:11 +00:00
```
<details>
<summary><b>Live Download demo</b> (click to show)</summary>
2022-08-05 05:10:11 +00:00
This demo uses `superagent` to download https://sheetjs.com/pres.numbers and
2022-08-05 05:10:11 +00:00
show the data in an HTML table.
2023-09-24 03:59:48 +00:00
:::caution pass
2023-02-18 02:33:30 +00:00
If the live demo shows a message
```
ReferenceError: superagent is not defined
```
please refresh the page. This is a known bug in the documentation generator.
:::
2022-08-05 05:10:11 +00:00
```jsx live
function SheetJSSuperAgentDL() {
2022-10-21 00:10:10 +00:00
const [__html, setHTML] = React.useState("");
2022-08-05 05:10:11 +00:00
/* Fetch and update HTML */
2024-01-31 08:14:07 +00:00
React.useEffect(() => { (async() => {
2023-05-21 07:03:46 +00:00
if(typeof superagent == "undefined" || typeof superagent.get != "function")
return setHTML("ReferenceError: superagent is not defined");
2022-08-05 05:10:11 +00:00
/* Fetch file */
superagent
.get("https://sheetjs.com/pres.numbers")
.responseType("arraybuffer")
.end((err, res) => {
/* Parse file */
const wb = XLSX.read(res.body);
const ws = wb.Sheets[wb.SheetNames[0]];
/* Generate HTML */
setHTML(XLSX.utils.sheet_to_html(ws));
});
2024-01-31 08:14:07 +00:00
})(); }, []);
2022-08-05 05:10:11 +00:00
2023-02-28 11:40:44 +00:00
return ( <div dangerouslySetInnerHTML={{ __html }}/> );
2022-08-05 05:10:11 +00:00
}
```
</details>
2022-08-21 00:46:10 +00:00
## NodeJS Demos
2022-11-11 08:53:04 +00:00
These examples show how to download data in NodeJS.
2022-08-21 00:46:10 +00:00
2022-11-11 08:53:04 +00:00
### HTTPS GET
The `https` module provides a low-level `get` method for HTTPS GET requests:
2023-05-21 07:03:46 +00:00
```js title="SheetJSHTTPSGet.js"
2022-11-11 08:53:04 +00:00
var https = require("https"), XLSX = require("xlsx");
https.get('https://sheetjs.com/pres.numbers', function(res) {
var bufs = [];
res.on('data', function(chunk) { bufs.push(chunk); });
res.on('end', function() {
var buf = Buffer.concat(bufs);
var wb = XLSX.read(buf);
/* print the first worksheet to console */
var ws = wb.Sheets[wb.SheetNames[0]];
console.log(XLSX.utils.sheet_to_csv(ws));
});
});
```
<details>
<summary><b>Complete Example</b> (click to show)</summary>
2023-05-21 07:03:46 +00:00
2024-01-23 09:26:06 +00:00
:::note Tested Environments
2023-05-21 07:03:46 +00:00
2024-01-17 20:22:38 +00:00
This demo was last tested on 2024 January 15 against NodeJS `20.11.0`
2023-05-21 07:03:46 +00:00
:::
1) Install the [NodeJS module](/docs/getting-started/installation/nodejs)
<CodeBlock language="bash">{`\
npm i --save https://cdn.sheetjs.com/xlsx-${current}/xlsx-${current}.tgz`}
</CodeBlock>
2) Copy the `SheetJSHTTPSGet.js` code snippet to a file `SheetJSHTTPSGet.js`
2024-01-17 20:22:38 +00:00
3) Run the script:
```bash
node SheetJSHTTPSGet.js
```
If successful, the script will print CSV contents of the test file.
2023-05-21 07:03:46 +00:00
</details>
2022-11-11 08:53:04 +00:00
### fetch
2024-01-17 20:22:38 +00:00
:::caution pass
Experimental support for `fetch` was introduced in NodeJS `16.15.0`. It will be
considered stable in NodeJS LTS version `22`.
:::
2022-11-11 08:53:04 +00:00
The `fetch` implementation has the same return types as the browser version:
```js
async function parse_from_url(url) {
const res = await fetch(url);
if(!res.ok) throw new Error("fetch failed");
const ab = await res.arrayBuffer();
const workbook = XLSX.read(ab);
return workbook;
}
```
<details>
<summary><b>Complete Example</b> (click to show)</summary>
2023-05-21 07:03:46 +00:00
2024-01-23 09:26:06 +00:00
:::note Tested Environments
2023-05-21 07:03:46 +00:00
2024-01-17 20:22:38 +00:00
This demo was last tested on 2024 January 15 against NodeJS `20.11.0`
2023-05-21 07:03:46 +00:00
:::
1) Install the [NodeJS module](/docs/getting-started/installation/nodejs)
<CodeBlock language="bash">{`\
npm i --save https://cdn.sheetjs.com/xlsx-${current}/xlsx-${current}.tgz`}
</CodeBlock>
2) Save the following to `SheetJSFetch.js`:
```js title="SheetJSFetch.js"
var XLSX = require("xlsx");
async function parse_from_url(url) {
const res = await fetch(url);
if(!res.ok) throw new Error("fetch failed");
const ab = await res.arrayBuffer();
const workbook = XLSX.read(ab);
return workbook;
}
(async() => {
const wb = await parse_from_url('https://sheetjs.com/pres.numbers');
/* print the first worksheet to console */
var ws = wb.Sheets[wb.SheetNames[0]];
console.log(XLSX.utils.sheet_to_csv(ws));
})();
```
2024-01-17 20:22:38 +00:00
3) Run the script:
```bash
node SheetJSFetch.js
```
If successful, the script will print CSV contents of the test file.
2023-05-21 07:03:46 +00:00
</details>
2022-11-11 08:53:04 +00:00
### Wrapper Libraries
The latest releases of NodeJS support `fetch` natively. Before `fetch` support
was added to the platform, third party modules wrapped the native APIs.
#### request
2022-08-21 00:46:10 +00:00
2023-09-19 19:08:29 +00:00
:::warning pass
2022-08-21 00:46:10 +00:00
`request` has been deprecated and should only be used in legacy deployments.
:::
Setting the option `encoding: null` passes raw buffers:
2023-05-21 07:03:46 +00:00
```js title="SheetJSRequest.js"
2022-08-21 00:46:10 +00:00
var XLSX = require('xlsx'), request = require('request');
var url = 'https://sheetjs.com/pres.numbers';
/* call `request` with the option `encoding: null` */
// highlight-next-line
request(url, {encoding: null}, function(err, res, data) {
if(err || res.statusCode !== 200) return;
2023-08-30 03:44:38 +00:00
/* if the request was successful, parse the data */
2022-08-21 00:46:10 +00:00
// highlight-next-line
var wb = XLSX.read(data);
/* print the first worksheet to console */
var ws = wb.Sheets[wb.SheetNames[0]];
2022-11-11 08:53:04 +00:00
console.log(XLSX.utils.sheet_to_csv(ws));
2022-08-21 00:46:10 +00:00
});
2022-11-11 08:53:04 +00:00
```
<details>
<summary><b>Complete Example</b> (click to show)</summary>
2023-05-21 07:03:46 +00:00
2024-01-23 09:26:06 +00:00
:::note Tested Environments
2023-05-21 07:03:46 +00:00
2024-01-17 20:22:38 +00:00
This demo was last tested on 2024 January 15 against request `2.88.2`
2023-05-21 07:03:46 +00:00
:::
1) Install the [NodeJS module](/docs/getting-started/installation/nodejs)
<CodeBlock language="bash">{`\
npm i --save https://cdn.sheetjs.com/xlsx-${current}/xlsx-${current}.tgz request@2.88.2`}
</CodeBlock>
2) Copy the `SheetJSRequest.js` code snippet to a file `SheetJSRequest.js`
2024-01-17 20:22:38 +00:00
3) Run the script:
```bash
node SheetJSRequest.js
```
If successful, the script will print CSV contents of the test file.
2023-05-21 07:03:46 +00:00
</details>
2022-11-11 08:53:04 +00:00
#### axios
When the `responseType` is `"arraybuffer"`, `axios` actually captures the data
2024-01-03 06:47:00 +00:00
in a NodeJS Buffer. The SheetJS `read` method handles NodeJS Buffer objects:
2022-11-11 08:53:04 +00:00
2023-05-21 07:03:46 +00:00
```js title="SheetJSAxios.js"
2022-11-11 08:53:04 +00:00
const XLSX = require("xlsx"), axios = require("axios");
async function workbook_dl_axios(url) {
const res = await axios(url, {responseType:'arraybuffer'});
/* at this point, res.data is a Buffer */
const workbook = XLSX.read(res.data);
return workbook;
}
```
2023-05-21 07:03:46 +00:00
<details>
<summary><b>Complete Example</b> (click to show)</summary>
2023-05-21 07:03:46 +00:00
2024-01-23 09:26:06 +00:00
:::note Tested Environments
2023-05-21 07:03:46 +00:00
2024-01-17 20:22:38 +00:00
This demo was last tested on 2024 January 15 against Axios `1.6.5`
2023-05-21 07:03:46 +00:00
:::
1) Install the [NodeJS module](/docs/getting-started/installation/nodejs)
<CodeBlock language="bash">{`\
2024-01-17 20:22:38 +00:00
npm i --save https://cdn.sheetjs.com/xlsx-${current}/xlsx-${current}.tgz axios@1.6.5`}
2023-05-21 07:03:46 +00:00
</CodeBlock>
2) Save the following to `SheetJSAxios.js`:
```js title="SheetJSAxios.js"
const XLSX = require("xlsx"), axios = require("axios");
async function workbook_dl_axios(url) {
const res = await axios(url, {responseType:'arraybuffer'});
/* at this point, res.data is a Buffer */
const workbook = XLSX.read(res.data);
return workbook;
}
(async() => {
const wb = await workbook_dl_axios('https://sheetjs.com/pres.numbers');
/* print the first worksheet to console */
var ws = wb.Sheets[wb.SheetNames[0]];
console.log(XLSX.utils.sheet_to_csv(ws));
})();
```
2024-01-17 20:22:38 +00:00
3) Run the script:
```bash
node SheetJSAxios.js
```
If successful, the script will print CSV contents of the test file.
2023-05-21 07:03:46 +00:00
</details>
## Other Platforms
Other demos show network operations in special platforms:
- [React Native "Fetching Remote Data"](/docs/demos/mobile/reactnative#fetching-remote-data)
- [NativeScript "Fetching Remote Files"](/docs/demos/mobile/nativescript#fetching-remote-files)
2024-01-03 06:47:00 +00:00
- [AngularJS "Remote Files"](/docs/demos/frontend/angularjs#remote-files)
- [Dojo Toolkit "Parsing Remote Files"](/docs/demos/frontend/dojo#parsing-remote-files)
2023-08-30 03:44:38 +00:00
2023-11-20 02:51:39 +00:00
[^1]: See [`read` in "Reading Files"](/docs/api/parse-options)
[^2]: See [`sheet_to_html` in "Utilities"](/docs/api/utilities/html#html-table-output)
[^3]: See [`dataType` in `jQuery.ajax`](https://api.jquery.com/jQuery.ajax/) in the official jQuery documentation.
[^4]: See [the official `jquery.binarytransport.js` repo](https://github.com/henrya/js-jquery/tree/master/BinaryTransport) for more details.