docs.sheetjs.com/docz/docs/03-demos/08-local/01-file.md
2023-10-13 06:04:30 -04:00

23 KiB

title pagination_prev pagination_next sidebar_custom_props
Local File Access demos/data/index demos/cloud/index
summary
Reading and writing files using various platform APIs

import current from '/version.js'; import CodeBlock from '@theme/CodeBlock';

Reading from and writing to files requires native platform support.

SheetJS readFile and writeFile methods include support for some platforms. Due to sandboxing and security settings, readFile does not work in the web browser and writeFile is not guaranteed to work in all cases.

This demo looks at various web APIs for reading and writing files. We'll explore how to pass data between SheetJS functions and various APIs.

:::note pass

Some snippets are also available in the "Common Use Cases" section:

Other demos cover APIs for local file access on special platforms:

:::

Binary Data

JavaScript engines represent binary data in a number of structures.

The type option for SheetJS read function1 controls how the data should be interpreted. This parameter distinguishes binary strings from Base64 strings.

The type option for SheetJS write function2 controls the output storage.

Uint8Array and Buffer

A Uint8Array is a Typed Array where each value is a 8-bit unsigned integer. Server-side platforms including NodeJS typically use Uint8Array, or a subclass such as Buffer3, to represent data from files.

The SheetJS read method can read data from Uint8Array without any options:

const wb = XLSX.read(u8);

The SheetJS write method can generate workbooks stored in Uint8Array structures with the option type: "buffer":

const u8 = XLSX.write(wb, {bookType: "xlsx", type: "buffer"});

:::note pass

In NodeJS, the write method will generate a Buffer instance.

:::

ArrayBuffer

An ArrayBuffer represents an array of bytes. The Uint8Array constructor can synchronously create a view without copying the underlying data:

/* create a Uint8Array "view" */
const u8 = new Uint8Array(array_buffer);

The SheetJS read method can read data from ArrayBuffer without special options, as it performs the aforementioned conversion. The SheetJS write method can generate workbooks stored in ArrayBuffer structures with the option type: "array"

Blob and File

Blob is an opaque pointer to data. The data is not immediately accessible.

File extends Blob with support for storing file names and other metadata.

The SheetJS read method does not handle Blob or File. The underlying data must be pulled into an ArrayBuffer before parsing. There are two approaches:

A) Modern browsers support the arrayBuffer method. It returns a promise that resolves to ArrayBuffer:

// usage: const wb = await blob_to_wb(blob);
async function blob_to_wb(blob) {
  const ab = await blob.arrayBuffer(); // pull data from Blob
  return XLSX.read(ab);                // parse ArrayBuffer
}

B) For broader browser support, the FileReader API can pull ArrayBuffer data using the readAsArrayBuffer method:

// usage: file_to_wb(file, function(wb) { /* wb is a workbook object */ });
function file_to_wb(file, callback) {
  var reader = new FileReader();
  reader.onload = function(e) {
    /* e.target.result is an ArrayBuffer */
    callback(XLSX.read(e.target.result));
  };
  reader.readAsArrayBuffer(file);
}
FileReaderSync in Web Workers (click to show)

FileReaderSync is only available in Web Workers. It returns an ArrayBuffer:

// assuming main thread called worker.postMessage({ file: file_object })
self.addEventListener('message', (e) => {
  /* get file object from message */
  var file = e.data.file;
  /* Read file data */
  const ab = new FileReaderSync().readAsArrayBuffer(file);
  /* Parse file */
  const wb = XLSX.read(ab);
  /* DO SOMETHING WITH wb HERE */
});

"User-Submitted File" example includes a live demo.

The SheetJS write method can generate a Uint8Array which can be passed to the Blob constructor:

function wb_to_blob(wb, bookType) {
  /* write workbook to Uint8Array */
  const u8 = XLSX.write(wb, { bookType: bookType || "xlsx", type: "buffer" });
  /* create array of parts */
  const parts = [ u8 ]; // `Blob` constructor expects this
  /* create Blob */
  const blob = new Blob(parts, { type: "application/vnd.ms-excel" });
  return blob;
}

The File constructor accepts an additional name argument:

function wb_to_file(wb, filename) {
  /* impute bookType from file extension */
  const ext = filename.slice(filename.lastIndexOf(".") + 1);
  /* write workbook to Uint8Array */
  const u8 = XLSX.write(wb, { bookType: ext, type: "buffer" });
  /* create array of parts */
  const parts = [ u8 ]; // `File` constructor expects this
  /* create File */
  const file = new File(parts, filename, { type: "application/vnd.ms-excel" });
  return file;
}

Binary Strings

Binary strings are strings where each character code is between 0 and 255. This structure is generated from the FileReader#readAsBinaryString method.

The SheetJS read method supports binary strings with type: "binary". The following snippet shows how readAsBinaryString can be paired with SheetJS:

// usage: file_bs_to_wb(file, function(wb) { /* wb is a workbook object */ });
function file_bs_to_wb(file, callback) {
  var reader = new FileReader();
  reader.onload = function(e) {
    /* e.target.result is a binary string */
    callback(XLSX.read(e.target.result, { type: "binary" }));
  };
  reader.readAsBinaryString(file);
}

The SheetJS write method can generate binary strings using type: "binary":

/* write workbook to binary string */
const bstr = XLSX.write(wb, { bookType: "xlsx", type: "binary" });

Base64 Strings

Base64 strings are encoded using 64 display ASCII characters. This structure is generated from btoa, the FileReader#readAsDataURL method, and many platform APIs in mobile and desktop app frameworks.

The SheetJS read method supports Base64 strings with type: "base64". The following snippet shows how readAsDataURL can be paired with SheetJS:

// usage: file_b64_to_wb(file, function(wb) { /* wb is a workbook object */ });
function file_b64_to_wb(file, callback) {
  var reader = new FileReader();
  reader.onload = function(e) {
    /* e.target.result is a base64 string */
    callback(XLSX.read(e.target.result, { type: "base64" }));
  };
  reader.readAsDataURL(file);
}

The SheetJS write method can generate Base64 strings using type: "base64":

/* write workbook to Base64 string */
const b64 = XLSX.write(wb, { bookType: "xlsx", type: "base64" });

Arrays of Numbers

Some platforms represent binary data as arrays of numbers, where each number represents one byte in the file.

The SheetJS read method supports arrays of unsigned bytes (where each value is between 0 and 255) with type: "array".

:::caution Java and Signed Bytes

Google Sheets follows Java signed data type conventions. Byte arrays include values from -128 to 127.

How to Fix Signed Arrays (click to show)

The unsigned value for a negative byte can be calculated with a bitwise AND (&) operation against 0xFF:

const unsigned_byte = signed_byte & 0xFF;

For legacy platforms including NetSuite 2.0, the bitwise AND assignment operator (&=) can rectify an array in place:

/* convert a signed byte array to an unsigned byte array in place */
for(var i = 0; i < array.length; ++i) array[i] &= 0xFF;

For modern platforms, the Uint8Array constructor understands signed bytes:

/* copy data into a new Uint8Array */
const u8 = new Uint8Array(array);

:::

Web Browsers

:::warning pass

Not all web APIs are supported in all browsers. For example, Firefox does not support the "File System Access API".

Even when a browser technically supports a web API, it may be disabled in the client browser. Some APIs do not give any feedback.

:::

HTML5 Download Attribute

Writing Files

writeFile will attempt a download in the browser using the attribute.

XLSX.writeFile(wb, "SheetJS.xlsx");
Implementation Details (click to show)

Under the hood, it creates a special URL and clicks a link. The library method includes a few workarounds for legacy browsers

XLSX.writeFile(wb, "SheetJS.xlsx"); is roughly equivalent to:

/* write data -- `writeFile` infers bookType from filename but `write` cannot */
const u8 = XLSX.write(wb, { bookType: "xlsx", type: "buffer" });
/* create Blob */
const blob = new Blob([u8]);
/* create object URL */
const url = URL.createObjectURL(blob);

/* create `A` DOM element */
const a = document.createElement("a");
/* set export file name */
a.download = "SheetJS.xlsx";
/* wire up the object URL to the DOM element */
a.href = url;
/* add to the page */
document.body.appendChild(a);
/* click the link */
a.click();
/* remove the element from the page */
document.body.removeChild(a);

:::caution Web Workers

XLSX.writeFile requires DOM access and will not work in a Web Worker!

The workaround is to generate the file data from the Worker (using XLSX.write) and send the data back to the main context for the actual download action.

"Creating a Local File" includes a live demo.

:::

Google Tag Manager

:::caution pass

Google Tag Manager is known to intercept and corrupt links. This issue will manifest as UUID file names like 01234567-89ab-cdef-0123-456789abcdef .

:::

For sites using GTM, it is recommended to patch document.createElement and revert after performing the export.

GTM Workaround (click to show)

The workaround is to ensure new A elements created by document.createElement have the target attribute set to _blank.

After calling writeFile, the old version of the method should be restored.

/* preparation */
document.createElement2 = document.createElement;
document.createElement = function(...args) {
  if(args.length == 1 && args[0].toLowerCase() == "a") {
    const a = document.createElement2("a");
    a.target = "_blank";
    return a;
  }
  return document.createElement2.call(this, ...args);
};

/* export (XLSX.writeFile) */
XLSX.writeFile(wb, "SheetJS.xlsx");

/* cleanup */
setTimeout(() => {
  document.createElement = document.createElement2;
  delete document.createElement2;
}, 1000);

File API

Reading Files

In the change event of <input type="file">, the event object will have a target property. The files property of target is a list of File objects.

async function handleFileAsync(e) {
  /* get first file */
  const file = e.target.files[0];
  /* get raw data */
  const data = await file.arrayBuffer();
  /* data is an ArrayBuffer */
  const workbook = XLSX.read(data);
  /* do something with the workbook here */
  console.log(XLSX.utils.sheet_to_json(wb.Sheets[wb.SheetNames[0]]));
}
input_dom_element.addEventListener("change", handleFileAsync, false);

HTML Drag and Drop API

Reading Files

The dataTransfer property of the drop event holds a list of File objects:

/* suppress default behavior for drag and drop events */
function suppress(e) { e.stopPropagation(); e.preventDefault(); }

/* handle data from drop event */
async function handleDropAsync(e) {
  suppress(e);
  /* get first file */
  const f = e.dataTransfer.files[0];
  /* get raw data */
  const data = await f.arrayBuffer();
  /* data is an ArrayBuffer */
  const wb = XLSX.read(data);
  /* do something with the workbook here */
  console.log(XLSX.utils.sheet_to_json(wb.Sheets[wb.SheetNames[0]]));
}

drop_dom_element.addEventListener("drop", handleDropAsync, false);
drop_dom_element.addEventListener("dragover", suppress, false);
drop_dom_element.addEventListener("dragenter", suppress, false);

File System Access API

:::warning Limited Browser Support

At the time of writing, browser support was fairly limited. Chrome introduced the feature in version 86. Safari did not support File System Access API.

:::

:::caution pass

When this demo was last tested, Google Chrome did not add an entry to the "Downloads" list. Nevertheless the actual file was written correctly.

:::

:::note

This demo was last tested on 2023 August 30 in Google Chrome.

:::

Live Example (click to show)

This live example reads a file then tries to save as XLSX. If the File System Access API is not supported, the result will be a clear message.

function SheetJSRoundTripFileSystemAPI() { return window.showSaveFilePicker ? (
<button onClick={async () => {
  /* Show picker and get data */
  const [rFile] = await window.showOpenFilePicker({
    types: [{
      description: 'Spreadsheets',
      accept: { 'application/vnd.ms-excel': ['.xlsx', '.xls', '.xlsb', /*...*/] }
    }],
    excludeAcceptAllOption: true,
    multiple: false
  });
  const ab = await (await rFile.getFile()).arrayBuffer();

  /* parse */
  const wb = XLSX.read(ab);

  /* Show picker and get handle to file */
  const wFile = await window.showSaveFilePicker({
    suggestedName: "SheetJSRT.xlsx",
    types: [ { description: 'XLSX', accept: { 'application/vnd.ms-excel': ['.xlsx'] } } ]
  });
  const wstream = await wFile.createWritable();

  /* write */
  const buf = XLSX.write(wb, { bookType: "xlsx", type: "buffer" });
  wstream.write(buf);

  /* close stream to commit file */
  wstream.close();

}}>Click to read then save as XLSX</button>
  ) : ( <b>This browser does not support File System Access API</b> ); }

Reading Files

window.showOpenFilePicker shows a file picker and resolves to an array of file handles. When multiple: false is set, the array has one element.

The getFile method resolves to a File object whose data can be read with the arrayBuffer method:

/* Show picker and get data */
const [hFile] = await window.showOpenFilePicker({
  types: [{
    description: 'Spreadsheets',
    accept: { 'application/vnd.ms-excel': ['.xlsx', '.xls', '.xlsb', /*...*/] }
  }],
  excludeAcceptAllOption: true,
  multiple: false
});
const ab = await (await hFile.getFile()).arrayBuffer();

/* parse */
const wb = XLSX.read(ab);

/* do something with the workbook */
console.log(XLSX.utils.sheet_to_json(wb.Sheets[wb.SheetNames[0]]));

Writing Files

window.showSaveFilePicker shows a file picker and resolves to a file handle. The createWritable method resolves to a FileSystemWritableFileStream, which readily accepts Uint8Array data from XLSX.write:

/* Show picker and get handle to file */
const hFile = await window.showSaveFilePicker({
  suggestedName: "SheetJS.xlsx",
  types: [
    { description: 'Excel 2007+ (XLSX)', accept: { 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet': ['.xlsx'] } },
    { description: 'Excel 97-2004 (XLS)', accept: { 'application/vnd.ms-excel': ['.xls'] } },
    { description: 'Excel 2007+ Binary (XLSB)', accept: { 'application/vnd.ms-excel.sheet.binary.macroEnabled.12': ['.xlsb'] } },
    /* note that each MIME type must be unique! */
  ]
});
const wstream = await hFile.createWritable();

/* get extension */
const ext = hFile.name.slice(hFile.name.lastIndexOf(".")+1)
/* write */
wstream.write(XLSX.write(wb, { bookType: ext, type: "buffer" }))
/* close stream to commit file */
wstream.close();

File and Directory Entries API

:::caution Deprecated

In the web browser, the File and Directory Entries API has been deprecated and is not recommended for new applications.

cordova-plugin-file still uses the API patterns.

:::

Writing Files

The API is callback-based. At a high level:

  1. window.requestFileSystem requests access to the filesystem. The callback receives a FileSystem object.

  2. A file is created using the getFile method. The callback receives a FileSystemFileEntry object representing the file.

  3. A writer is created using the createWriter method of the file object. The callback receives a FileWriter object representing a file handle for writing.

  4. Data is written using the write method of the FileWriter object. Unlike the other methods, callbacks are attached to the FileWriter object directly.

// Request File System Access
window.requestFileSystem(window.PERSISTENT, 0, (fs) => {
  // Request a handle to "SheetJS.xlsx", making a new file if necessary
  fs.root.getFile("SheetJS.xlsx", {create: true}, entry => {
    // Request a FileWriter for writing data
    entry.createWriter(writer => {
      // The FileWriter API needs an actual Blob
      const u8 = XLSX.write(wb, { type: "buffer", bookType: "xlsx" });
      const data = new Blob([u8], { type: "application/vnd.ms-excel" });
      // `onwriteend` is called on success, `onerror` called on error
      writer.onwriteend = () => {}; writer.onerror = () => {};
      // write the data
      writer.write(data);
    });
  });
});

Internet Explorer

Internet Explorer offered proprietary APIs that were not adopted by Chromium.

Blob API

Writing Files

IE10 and IE11 support navigator.msSaveBlob. XLSX.writeFile will use this method if it is available.

Implementation Details (click to show)

XLSX.writeFile(wb, "SheetJS.xlsx"); is roughly equivalent to:

/* write data -- `writeFile` infers bookType from filename but `write` cannot */
const u8 = XLSX.write(wb, { bookType: "xlsx", type: "buffer" });
/* create Blob */
const blob = new Blob([u8]);
/* call msSaveBlob */
navigator.msSaveBlob(blob, "SheetJS.xlsx");

VBScript

Reading and Writing Files

Internet Explorer 6-9 with VBScript support Scripting.FileSystemObject. This is not supported in modern browsers.

This approach is implemented in the library readFile and writeFile methods. It requires the shim script to be loaded before the main library script:

<!-- load the shim script first -->
<script src="shim.min.js"></script>
<!-- then load the main script -->
<script src="xlsx.full.min.js"></script>

Other Platforms

NodeJS

fs.readFileSync and fs.writeFileSync allow for reading and writing files.

When using require, these are supported in readFile and writeFile:

var XLSX = require("xlsx");
var wb = XLSX.readFile("sheetjs.numbers");
XLSX.writeFile(wb, "sheetjs.xls");

Installation has a special note for use with NodeJS ECMAScript Modules:

import { readFile, writeFile, set_fs } from 'xlsx';
import * as fs from 'fs';
set_fs(fs);

var wb = readFile("sheetjs.numbers");
writeFile(wb, "sheetjs.xlsx");
Implementation Details (click to show)

XLSX.readFile(filepath) is equivalent to:

CommonJS

var fs = require("fs");
var buf = fs.readFileSync(filepath);
var wb = XLSX.read(buf);

ECMAScript Modules

import { read } from "xlsx";
import { readFileSync } from "fs";

var buf = readFileSync(filepath);
var wb = read(buf);

XLSX.writeFile(wb, filepath) is equivalent to:

CommonJS

var fs = require("fs"), path = require("path");
var buf = XLSX.write(wb, { bookType: path.extname(filepath).slice(1), type: "buffer" });
fs.writeFileSync(filepath, buf);

ECMAScript Modules

import { write } from "xlsx";
import { writeFileSync } from "fs";
import { extname } from "path";

var buf = write(wb, { bookType: extname(filepath).slice(1), type: "buffer" });
writeFileSync(filepath, buf);

ExtendScript

In Photoshop and other Adobe apps, readFile and writeFile use the File object under the hood:

#include "xlsx.extendscript.js"

var wb = XLSX.readFile("sheetjs.xlsx");
XLSX.writeFile(wb, "sheetjs.csv");

The ExtendScript demo also covers "Common Extensibility Platform" (CEP) and "Unified Extensibility Platform" (UXP) details.

Chrome Extensions

In Manifest v2 Chrome extensions, writeFile calls chrome.downloads.download.

This approach uses URL.createObjectURL, an API that is not supported in a Manifest v3 Background Service Worker. For small exports, raw Base64 URLs can be generated and downloaded.

The Chromium demo covers the details.

Deno

readFile uses Deno.readFileSync and writeFile uses Deno.writeFileSync:

{\ // @deno-types="https://cdn.sheetjs.com/xlsx-${current}/package/types/index.d.ts" import * as XLSX from 'https://cdn.sheetjs.com/xlsx-${current}/package/xlsx.mjs'; \n\ const wb: XLSX.WorkBook = XLSX.readFile("sheetjs.numbers"); XLSX.writeFile(wb, "sheetjs.xlsx");}

:::caution Deno entitlements

Any Deno script using XLSX.readFile requires the --allow-read entitlement.

Any Deno script using XLSX.writeFile requires the --allow-write entitlement.

:::

Implementation Details (click to show)

XLSX.readFile(filepath) is equivalent to:

ECMAScript Modules

{\ // @deno-types="https://cdn.sheetjs.com/xlsx-${current}/package/types/index.d.ts" import * as XLSX from 'https://cdn.sheetjs.com/xlsx-${current}/package/xlsx.mjs'; \n\ const u8: Uint8Array = Deno.readFileSync(filepath); const wb: XLSX.WorkBook = XLSX.read(u8);}

XLSX.writeFile(wb, filepath) is equivalent to:

ECMAScript Modules

{\ // @deno-types="https://cdn.sheetjs.com/xlsx-${current}/package/types/index.d.ts" import * as XLSX from 'https://cdn.sheetjs.com/xlsx-${current}/package/xlsx.mjs'; \n\ const u8 = XLSX.write(wb, { bookType: filepath.slice(filepath.lastIndexOf(".")+1), type: "buffer" }); Deno.writeFileSync(filepath, u8);}

Bun

Bun requires the fs module:

import { readFile, writeFile, set_fs } from 'xlsx';
import * as fs from 'fs';
set_fs(fs);

var wb = readFile("sheetjs.numbers");
writeFile(wb, "sheetjs.xlsx");

The implementation is identical to NodeJS ECMAScript Modules.

Apps

Desktop and mobile apps have their own specific APIs covered in separate demos: