docs.sheetjs.com/docz/docs/03-demos/30-cloud/21-gsheet.md
2024-04-08 00:57:39 -04:00

title: Google Sheets
pagination_prev: demos/local/index
pagination_next: demos/extensions/index

import current from '/version.js';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import CodeBlock from '@theme/CodeBlock';

:::note pass

This demo focuses on external data processing. For Google Apps Script custom functions, the "Google Sheets" extension demo covers Apps Script integration.

:::

Google Sheets is a collaborative spreadsheet service with powerful external APIs for automation.

SheetJS is a JavaScript library for reading and writing data from spreadsheets.

This demo uses SheetJS to properly exchange data with spreadsheet files. We'll explore how to use NodeJS integration libraries and SheetJS in three data flows:

  • "Importing data": Data in a NUMBERS spreadsheet will be parsed using SheetJS libraries and written to a Google Sheets Document

  • "Exporting data": Data in Google Sheets will be pulled into arrays of objects. A workbook will be assembled and exported to Excel Binary workbooks (XLSB).

  • "Exporting files": SheetJS libraries will read XLSX and ODS files exported by Google Sheets and generate CSV rows from every worksheet.

:::warning pass

It is strongly recommended to create a new Google account for testing.

One small mistake could result in a block or ban from Google services.

:::

:::caution pass

Google Sheets deprecates APIs quickly and there is no guarantee that the referenced APIs will be available in the future.

:::

Integration Details

This demo uses the following NodeJS modules:

  • google-auth-library[1] simplifies authentication with Google APIs
  • node-google-spreadsheet[2] interacts with Google Sheets v4 API

:::info Initial Setup

There are a number of steps to enable the Google Sheets API and Google Drive API for an account. The Complete Example covers the process.

:::

Authentication

It is strongly recommended to use a service account for Google API operations. The "Service Account Setup" section covers how to create a service account and generate a JSON key file.

The generated JSON key file includes client_email and private_key fields. These fields can be used in JWT authentication:

import { JWT } from 'google-auth-library'
import { GoogleSpreadsheet } from 'google-spreadsheet';

// adjust the path to the actual key file.
// highlight-next-line
import creds from './sheetjs-test-726272627262.json' assert { type: "json" };

const jwt = new JWT({
  email: creds.client_email,
  key: creds.private_key,
  scopes: [
    'https://www.googleapis.com/auth/spreadsheets',
    'https://www.googleapis.com/auth/drive.file',
  ]
});
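
Before constructing the JWT client, it can help to confirm that the parsed key file actually has the two fields used above. The following is a minimal sketch: `check_creds` is a hypothetical helper name (not part of google-auth-library) and the sample object is a placeholder, not a real key.

```javascript
/* hypothetical helper: verify that a parsed key file has the required fields */
function check_creds(creds) {
  const missing = ["client_email", "private_key"].filter(
    (field) => typeof creds?.[field] !== "string" || creds[field].length === 0
  );
  if (missing.length > 0) throw new Error(`key file is missing: ${missing.join(", ")}`);
  return creds;
}

/* placeholder object standing in for the parsed JSON key file */
const sample_creds = { client_email: "service@example.invalid", private_key: "PLACEHOLDER" };
check_creds(sample_creds);
```

If the check passes, `creds.client_email` and `creds.private_key` can be passed to the `JWT` constructor as shown above.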

Connecting to Documents

To connect to existing documents, the document ID must be specified. This ID can be found from the edit URL.

The edit URL starts with https://docs.google.com/spreadsheets/d/ and includes /edit. The ID is the string of characters between the slashes. For example:

https://docs.google.com/spreadsheets/d/a_long_string_of_characters/edit#gid=0
---------------------------------------^^^^^^^^^^^^^^^^^^^^^^^^^^^--- ID
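
The ID can also be extracted programmatically. This is a small sketch using a regular expression; `extract_doc_id` is a hypothetical helper name and assumes the URL follows the pattern described above.

```javascript
/* hypothetical helper: pull the document ID out of a Google Sheets edit URL */
function extract_doc_id(url) {
  /* the ID is the path segment right after /spreadsheets/d/ */
  const match = /\/spreadsheets\/d\/([^\/]+)/.exec(url);
  if (!match) throw new Error("not a Google Sheets document URL");
  return match[1];
}

const id = extract_doc_id("https://docs.google.com/spreadsheets/d/a_long_string_of_characters/edit#gid=0");
console.log(id); // a_long_string_of_characters
```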

The GoogleSpreadsheet constructor accepts a document ID and auth object:

const doc = new GoogleSpreadsheet('DOCUMENT_ID', jwt);
await doc.loadInfo();

Creating New Documents

The createNewSpreadsheetDocument method makes a request to create a new document. It is strongly recommended to add a blank sheet to the new document:

const doc = await GoogleSpreadsheet.createNewSpreadsheetDocument(jwt, { title: 'Document Title' });
const newSheet = await doc.addSheet({ title: 'Sheet Title' });

Array of Arrays

"Arrays of Arrays" are the main data format for interchange with Google Sheets. The outer array object includes row arrays, and each row array includes data.

SheetJS provides methods for working with Arrays of Arrays:

  • aoa_to_sheet[3] creates SheetJS worksheet objects from arrays of arrays
  • sheet_to_json[4] can generate arrays of arrays from SheetJS worksheets
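
To make the shape concrete, here is a small hand-written array of arrays and a plain JavaScript conversion to row objects. The sample values and the `aoa_to_objects` helper are illustrative assumptions; the helper is not part of SheetJS and only shows how the header row relates to the data rows.

```javascript
/* sample data (made up for illustration): the first row is the header row */
const aoa = [
  ["Name", "Index"],
  ["Alpha", 1],
  ["Beta", 2]
];

/* hypothetical helper: pair each data row with the header row */
function aoa_to_objects(aoa) {
  const [header, ...rows] = aoa;
  return rows.map(row => Object.fromEntries(header.map((key, C) => [key, row[C]])));
}

console.log(JSON.stringify(aoa_to_objects(aoa)));
// [{"Name":"Alpha","Index":1},{"Name":"Beta","Index":2}]
```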

Export Document Data

The goal is to create an XLSB export from a Google Sheet. Google Sheets does not natively support the XLSB format. SheetJS fills the gap.

Convert a Single Sheet

The idea is to extract the raw data from the Google Sheet headers and combine with the raw data rows to produce a large array of arrays.

async function wb_append_sheet(sheet, name, wb) {
  /* get the header and data rows */
  await sheet.loadHeaderRow();
  const header = sheet.headerValues;
  const rows = await sheet.getRows();

  /* construct the array of arrays */
  const aoa = [header].concat(rows.map(r => r._rawData));

  /* generate a SheetJS Worksheet */
  const ws = XLSX.utils.aoa_to_sheet(aoa);

  /* add to workbook */
  XLSX.utils.book_append_sheet(wb, ws, name);
}
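
The `concat` step can be tried without a network connection. In this sketch, `fake_rows` is a made-up stand-in for the result of `sheet.getRows()`; only the `_rawData` property used above is modeled. The padding step is an optional assumption for the case where a row has fewer cells than the header, not something the demo requires.

```javascript
/* made-up stand-ins for sheet.headerValues and the rows from sheet.getRows() */
const header = ["Name", "Index"];
const fake_rows = [ { _rawData: ["Alpha", "1"] }, { _rawData: ["Beta"] } ];

/* same construction as above: header row followed by the raw data rows */
const aoa = [header].concat(fake_rows.map(r => r._rawData));

/* optional: pad short rows with null so every row has the header's length */
const padded = aoa.map(row => {
  const out = row.slice();
  while(out.length < header.length) out.push(null);
  return out;
});

console.log(JSON.stringify(padded));
// [["Name","Index"],["Alpha","1"],["Beta",null]]
```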

Convert a Workbook

doc.sheetsByIndex is an array of worksheets in the Google Sheet Document. By looping across the sheets, the entire workbook can be written:

async function doc_to_wb(doc) {
  /* Create a new workbook object */
  const wb = XLSX.utils.book_new();

  /* Loop across the Document sheets */
  for(let i = 0; i < doc.sheetsByIndex.length; ++i) {
    const sheet = doc.sheetsByIndex[i];
    /* Get the worksheet name */
    const name = sheet.title;

    /* Add sheet to workbook */
    await wb_append_sheet(sheet, name, wb);
  }

  return wb;
}

Update Document Data

The goal is to import data from a NUMBERS file to Google Sheets. Google Sheets does not natively support the NUMBERS format. SheetJS fills the gap.

Clear the Document

Google Sheets does not allow users to delete every worksheet. This function deletes every worksheet after the first, then clears the first worksheet:

/* clear google sheets doc */
async function doc_clear(doc) {
  /* delete all sheets after the first sheet */
  const old_sheets = doc.sheetsByIndex;
  for(let i = 1; i < old_sheets.length; ++i) await old_sheets[i].delete();

  /* clear first worksheet */
  await old_sheets[0].clear();
}

Update First Sheet

There are two steps: "update worksheet name" and "update worksheet data":

Update Sheet Name

The worksheet name is assigned by using the updateProperties method. The desired sheet name is the name of the first worksheet from the file.

async function doc_update_first_sheet_name(doc, wb) {
  /* get first worksheet name */
  const wsname = wb.SheetNames[0];

  /* get first gsheet */
  const sheet = doc.sheetsByIndex[0];

  /* update worksheet name */
  await sheet.updateProperties({title: wsname});
}

Update Sheet Data

sheet.addRows accepts an Array of Arrays of values. XLSX.utils.sheet_to_json can generate this exact shape with the option header: 1. Unfortunately, Google Sheets requires at least one "Header Row". This can be implemented by converting the entire worksheet to an Array of Arrays and setting the header row to the first row of the result:

async function doc_update_first_sheet_data(doc, wb) {
  /* get first worksheet */
  const ws = wb.Sheets[wb.SheetNames[0]];
  /* generate array of arrays from the first worksheet */
  const aoa = XLSX.utils.sheet_to_json(ws, {header: 1});

  /* get first gsheet */
  const sheet = doc.sheetsByIndex[0];
  /* set document header row to first row of the AOA */
  await sheet.setHeaderRow(aoa[0]);

  /* add the remaining rows */
  await sheet.addRows(aoa.slice(1));
}
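
Since the first row is treated specially, an empty worksheet (where `aoa[0]` is `undefined`) deserves a guard. Below is a minimal sketch of the split using a hypothetical `split_header` helper and made-up rows. Replacing missing header cells with empty strings is an assumption; how `setHeaderRow` reacts to blank header cells is not covered here.

```javascript
/* hypothetical helper: separate the header row from the data rows */
function split_header(aoa) {
  if(aoa.length === 0) throw new Error("worksheet has no rows");
  /* replace missing header cells (including holes) with empty strings */
  const header = Array.from(aoa[0], cell => cell == null ? "" : String(cell));
  return { header, rows: aoa.slice(1) };
}

const { header, rows } = split_header([["Name", "Index"], ["Alpha", 1], ["Beta", 2]]);
console.log(JSON.stringify(header)); // ["Name","Index"]
console.log(JSON.stringify(rows));   // [["Alpha",1],["Beta",2]]
```

The `header` array would be passed to `sheet.setHeaderRow` and the `rows` array to `sheet.addRows`.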

Append Remaining Worksheets

Each name in the SheetJS Workbook SheetNames array maps to a worksheet. The list of names not including the first sheet is wb.SheetNames.slice(1).

There are two steps for each sheet: "create new sheet" and "load data".

Each request must finish before the next one starts. Array methods such as forEach do not wait for async callbacks, so a plain for loop with await must be used:

async function doc_append_remaining_sheets(doc, wb) {
  const names = wb.SheetNames.slice(1);

  /* loop across names */
  for(let i = 0; i < names.length; ++i) {
    /* names[i] is the sheet name (the first worksheet is skipped) */
    const name = names[i];
    /* wb.Sheets[name] is the worksheet object */
    const ws = wb.Sheets[name];

    /* create new google sheet */
    const sheet = await doc_add_new_sheet(doc, name);
    /* load sheet with data */
    await sheet_load_from_ws(sheet, ws);
  }
}
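
The difference between the two looping styles can be observed with timers. In this sketch, `fake_request` is a hypothetical stand-in for a network call such as `doc.addSheet`; the delays are arbitrary and chosen so that later requests would finish first if they ran concurrently.

```javascript
/* hypothetical stand-in for a network request: logs the name after a delay */
function fake_request(name, delay_ms, log) {
  return new Promise(resolve => setTimeout(() => { log.push(name); resolve(name); }, delay_ms));
}

/* plain for loop: each request finishes before the next one starts */
async function with_for_loop(names) {
  const log = [];
  for(let i = 0; i < names.length; ++i) {
    await fake_request(names[i], 10 * (names.length - i), log);
  }
  return log; // every request has finished, in order
}

/* forEach: the async callbacks are started but never awaited */
async function with_forEach(names) {
  const log = [];
  names.forEach(async (name, i) => {
    await fake_request(name, 10 * (names.length - i), log);
  });
  return log; // returned before any request has finished
}

with_for_loop(["a", "b", "c"]).then(log => console.log(log.join(","))); // a,b,c
with_forEach(["a", "b", "c"]).then(log => console.log(log.length));     // 0
```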

Add a New Worksheet

doc.addSheet accepts a properties object that includes the worksheet name:

async function doc_add_new_sheet(doc, name) {
  return await doc.addSheet({title: name});
}

This creates a new worksheet, sets the tab name, and returns a reference to the created worksheet.

Update Worksheet Data

async function sheet_load_from_ws(sheet, ws) {
  /* generate array of arrays from the worksheet */
  const aoa = XLSX.utils.sheet_to_json(ws, {header: 1});

  /* set document header row to first row of the AOA */
  await sheet.setHeaderRow(aoa[0]);

  /* add the remaining rows */
  await sheet.addRows(aoa.slice(1));
}

Raw File Exports

In the web interface, Google Sheets can export documents to XLSX or ODS. The NodeJS library includes similar methods to perform the download[5]:

| Format | Google Sheets Description | Method         |
|:-------|:--------------------------|:---------------|
| XLSX   | Microsoft Excel (.xlsx)   | downloadAsXLSX |
| ODS    | OpenDocument (.ods)       | downloadAsODS  |

The functions resolve to Buffer data. The Buffer objects can be parsed using the SheetJS read[6] method:

/* download XLSX */
const buf = await doc.downloadAsXLSX();

/* parse */
const wb = XLSX.read(buf);

At this point wb is a SheetJS workbook object[7].

Complete Example

:::note Tested Deployments

This demo was last tested on 2023 September 17 using google-auth-library for authentication (v8.9.0) and google-spreadsheet for API access (v4.1.0).

:::

Account Setup

  0. Create a new Google account or log into an existing account.

:::caution pass

A valid phone number (for SMS verification) may be required.

:::

  1. Open https://console.cloud.google.com in a web browser. Review the Google Cloud Platform Terms of Service.

:::warning pass

You must agree to the Google Cloud Platform Terms of Service to use the APIs.

:::

Project Setup

  2. Create a new Project.

If the account does not have an existing project, click "CREATE PROJECT".

If the account has an existing project, click the project selector ( icon) and click "NEW PROJECT" in the modal.

  3. In the New Project screen, enter "SheetJS Test" in the Project name textbox and select "No organization" in the Location box. Click "CREATE".

API Setup

:::info pass

The goal of this section is to enable Google Sheets API and Google Drive API.

:::

  4. Click the Project Selector ( icon) and select "SheetJS Test".

  5. In the left sidebar, click "Enabled APIs and services".

  6. Near the top of the page, click "+ ENABLE APIS AND SERVICES".

  7. In the search bar near the middle of the page, type "Sheets" and look for "Google Sheets API". Click the card.

  8. In the Product Details screen, click the blue "ENABLE" button.

  9. Click the left arrow (<-) next to API/Service details.

  10. In the search bar near the middle of the page, type "Drive" and look for "Google Drive API". Click the card.

  11. In the Product Details screen, click the blue "ENABLE" button.

Service Account Setup

:::info pass

The goal of this section is to create a service account and generate a JSON key.

:::

Create Service Account

  12. Go to https://console.cloud.google.com.

  13. Click the Project Selector ( icon) and select "SheetJS Test".

  14. Click "Dashboard".

  15. In the left sidebar, hover over "APIs and Services" and select "Credentials".

  16. Click "+ CREATE CREDENTIALS". In the dropdown, select "Service Account".

  17. Enter "SheetJService" for Service account name. Click "CREATE AND CONTINUE".

:::note pass

The Service account ID is generated automatically.

:::

  18. In Step 2 "Grant this service account access to project", click "CONTINUE".

  19. In Step 3 click "DONE". You will be taken back to the credentials screen.

Create JSON Key

  20. Look for "SheetJService" in the "Service Accounts" table and click the email address in the row.

  21. Click the email address of the account in the "Service Accounts" table.

  22. Click "KEYS" in the horizontal bar near the top of the page.

  23. Click "ADD KEY" and select "Create new key" in the dropdown.

  24. In the popup, select the "JSON" radio button and click "CREATE". The page will download a JSON file.

  25. Click "CLOSE".

Create Document

:::info pass

The goal of this section is to create a document from the service account and share with the main account.

:::

  26. Create a SheetJSGS folder and initialize:

mkdir SheetJSGS
cd SheetJSGS
npm init -y

  27. Copy the JSON file from step 24 into the project folder.

  28. Install dependencies:

npm i --save https://cdn.sheetjs.com/xlsx-${current}/xlsx-${current}.tgz google-spreadsheet google-auth-library

  29. Save the following script to init.mjs:

import { JWT } from 'google-auth-library'
import { GoogleSpreadsheet } from 'google-spreadsheet';

// highlight-next-line
import creds from './sheetjs-test-726272627262.json' assert { type: "json" };

const jwt = new JWT({
  email: creds.client_email,
  key: creds.private_key,
  scopes: [
    'https://www.googleapis.com/auth/spreadsheets',
    'https://www.googleapis.com/auth/drive.file',
  ]
});

const doc = await GoogleSpreadsheet.createNewSpreadsheetDocument(jwt, { title: 'test from NodeJS' });
const newSheet = await doc.addSheet({ title: 'SheetJSTest' });
// highlight-next-line
await doc.share('YOUR_ADDRESS@gmail.com');

Edit the highlighted lines as follows:

  • './sheetjs-test-726272627262.json' should be replaced with the name of the JSON file in step 27. The ./ prefix is required!

  • 'YOUR_ADDRESS@gmail.com' should be replaced with the Google Account email address from step 0.

  30. Run the script:

node init.mjs

  31. Sign into Google Sheets. A shared document "test from NodeJS" should be displayed in the table. It will be owned by the service account.

  32. Open the shared document from step 31.

  33. Copy the URL and extract the document ID.

The URL of the document will look like

https://docs.google.com/spreadsheets/d/a_long_string_of_characters/edit#gid=0
---------------------------------------^^^^^^^^^^^^^^^^^^^^^^^^^^^--- ID

The ID is a long string of letters and numbers and underscore characters (_) just before the /edit part of the URL.

Load Data from NUMBERS

:::info pass

The goal of this section is to update the new document with data from a sample NUMBERS file.

:::

  34. Download the test file pres.numbers:

curl -LO https://sheetjs.com/pres.numbers

  35. Save the following script to load.mjs:

import { JWT } from 'google-auth-library'
import { GoogleSpreadsheet } from 'google-spreadsheet';

import { set_fs, readFile, utils } from 'xlsx';
import * as fs from 'fs';
set_fs(fs);

// highlight-next-line
import creds from './sheetjs-test-726272627262.json' assert { type: "json" };

const jwt = new JWT({
  email: creds.client_email,
  key: creds.private_key,
  scopes: [
    'https://www.googleapis.com/auth/spreadsheets',
    'https://www.googleapis.com/auth/drive.file',
  ]
});

// highlight-next-line
const doc = new GoogleSpreadsheet('DOCUMENT_ID', jwt);
await doc.loadInfo();

const wb = readFile("pres.numbers");

/* clear workbook */
{
  /* delete all sheets after the first sheet */
  const old_sheets = doc.sheetsByIndex;
  for(let i = 1; i < old_sheets.length; ++i) {
    await old_sheets[i].delete();
  }
  /* clear first worksheet */
  await old_sheets[0].clear();
}

/* write worksheets */
{
  const name = wb.SheetNames[0];
  const ws = wb.Sheets[name];
  /* first worksheet already exists */
  const sheet = doc.sheetsByIndex[0];

  /* update worksheet name */
  await sheet.updateProperties({title: name});

  /* generate array of arrays from the first worksheet */
  const aoa = utils.sheet_to_json(ws, {header: 1});

  /* set document header row to first row of the AOA */
  await sheet.setHeaderRow(aoa[0])

  /* add the remaining rows */
  await sheet.addRows(aoa.slice(1));

  /* the other worksheets must be created manually */
  for(let i = 1; i < wb.SheetNames.length; ++i) {
    const name = wb.SheetNames[i];
    const ws = wb.Sheets[name];

    const sheet = await doc.addSheet({title: name});
    const aoa = utils.sheet_to_json(ws, {header: 1});
    await sheet.setHeaderRow(aoa[0])
    await sheet.addRows(aoa.slice(1));
  }
}

Edit the highlighted lines as follows:

  • './sheetjs-test-726272627262.json' should be replaced with the name of the JSON file in step 27. The ./ prefix is required!

  • 'DOCUMENT_ID' should be replaced with the Document ID from step 33.

  36. Run the script:

node load.mjs

  37. Sign into Google Sheets and open the "test from NodeJS" shared document. It should show a list of Presidents, matching the contents of the test file.

Export Data to XLSB

:::info pass

The goal of this section is to export the raw data from Google Sheets to XLSB.

:::

  38. Save the following script to dump.mjs:

import { JWT } from 'google-auth-library'
import { GoogleSpreadsheet } from 'google-spreadsheet';

import { set_fs, writeFile, utils } from 'xlsx';
import * as fs from 'fs';
set_fs(fs);

// highlight-next-line
import creds from './sheetjs-test-726272627262.json' assert { type: "json" };

const jwt = new JWT({
  email: creds.client_email,
  key: creds.private_key,
  scopes: [
    'https://www.googleapis.com/auth/spreadsheets',
    'https://www.googleapis.com/auth/drive.file',
  ]
});

// highlight-next-line
const doc = new GoogleSpreadsheet('DOCUMENT_ID', jwt);
await doc.loadInfo();

const wb = utils.book_new();

for(let i = 0; i < doc.sheetsByIndex.length; ++i) {
  const sheet = doc.sheetsByIndex[i];
  const name = sheet.title;

  /* get the header and data rows */
  await sheet.loadHeaderRow();
  const header = sheet.headerValues;
  const rows = await sheet.getRows();
  const aoa = [header].concat(rows.map(r => r._rawData));

  /* generate a SheetJS Worksheet */
  const ws = utils.aoa_to_sheet(aoa);

  /* add to workbook */
  utils.book_append_sheet(wb, ws, name);
}

/* write to SheetJS.xlsb */
writeFile(wb, "SheetJS.xlsb");

Edit the highlighted lines as follows:

  • './sheetjs-test-726272627262.json' should be replaced with the name of the JSON file in step 27. The ./ prefix is required!

  • 'DOCUMENT_ID' should be replaced with the Document ID from step 33.

  39. Run the script:

node dump.mjs

The script should create a file SheetJS.xlsb in the project folder. This file can be opened in Excel.

  40. Sign into Google Sheets and open the "test from NodeJS" shared document. It should show a list of Presidents, matching the contents of the test file.

Export Raw Files

:::info pass

The goal of this section is to parse the Google Sheets XLSX export and generate CSV files for each worksheet.

:::

  41. Sign into Google Sheets and open the "test from NodeJS" shared document.

  42. Click the Plus (+) icon in the lower left corner to create a new worksheet.

  43. In the new worksheet, set cell A1 to the formula =SEQUENCE(3,5). This will assign a grid of values.

  44. Save the following script to raw.mjs:

import { JWT } from 'google-auth-library'
import { GoogleSpreadsheet } from 'google-spreadsheet';

import { read, utils } from 'xlsx';

// highlight-next-line
import creds from './sheetjs-test-726272627262.json' assert { type: "json" };

const jwt = new JWT({
  email: creds.client_email,
  key: creds.private_key,
  scopes: [
    'https://www.googleapis.com/auth/spreadsheets',
    'https://www.googleapis.com/auth/drive.file',
  ]
});

// highlight-next-line
const doc = new GoogleSpreadsheet('DOCUMENT_ID', jwt);
await doc.loadInfo();

const buf = await doc.downloadAsXLSX();

/* Parse with SheetJS */
const wb = read(buf);

/* Loop over the worksheet names */
wb.SheetNames.forEach(name => {
  /* Print the name to the console */
  console.log(name);

  /* Get the corresponding worksheet object */
  const sheet = wb.Sheets[name];

  /* Print a CSV export of the worksheet */
  console.log(utils.sheet_to_csv(sheet));
});

Edit the highlighted lines as follows:

  • './sheetjs-test-726272627262.json' should be replaced with the name of the JSON file in step 27. The ./ prefix is required!

  • 'DOCUMENT_ID' should be replaced with the Document ID from step 33.

  45. Run the script:

node raw.mjs

The script will display the sheet names and CSV rows from both worksheets.
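
For reference, `=SEQUENCE(3,5)` fills 3 rows and 5 columns with the numbers 1 through 15, row by row. The expected grid can be sketched in plain JavaScript; the `sequence` helper is hypothetical and only mimics the formula. The CSV text printed for the new worksheet should resemble this output.

```javascript
/* hypothetical helper mimicking =SEQUENCE(rows, cols): count up row by row */
function sequence(rows, cols) {
  return Array.from({length: rows}, (_, R) =>
    Array.from({length: cols}, (_, C) => R * cols + C + 1));
}

/* join cells with commas and rows with newlines, like a simple CSV */
const csv = sequence(3, 5).map(row => row.join(",")).join("\n");
console.log(csv);
// 1,2,3,4,5
// 6,7,8,9,10
// 11,12,13,14,15
```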


  1. The package name is google-auth-library

  2. The project name is node-google-spreadsheet but the module name is google-spreadsheet.

  3. See aoa_to_sheet in "Utilities"

  4. See sheet_to_json in "Utilities"

  5. See "Exporting Data" in the node-google-spreadsheet documentation

  6. See read in "Reading Files"

  7. See "Workbook Object" for a description of the workbook object or "API Reference" for various methods to work with workbook and sheet objects.