title | pagination_prev | pagination_next |
---|---|---|
Google Sheets | demos/local/index | demos/extensions/index |
import current from '/version.js'; import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem'; import CodeBlock from '@theme/CodeBlock';
:::note pass
This demo focuses on external data processing. For Google Apps Script custom functions, the "Google Sheets" extension demo covers Apps Script integration.
:::
Google Sheets is a collaborative spreadsheet service with powerful external APIs for automation.
SheetJS is a JavaScript library for reading and writing data from spreadsheets.
This demo uses SheetJS to properly exchange data with spreadsheet files. We'll explore how to use NodeJS integration libraries and SheetJS in three data flows:
- "Importing data": Data in a NUMBERS spreadsheet will be parsed using SheetJS libraries and written to a Google Sheets Document.
- "Exporting data": Data in Google Sheets will be pulled into arrays of objects. A workbook will be assembled and exported to Excel Binary workbooks (XLSB).
- "Exporting files": SheetJS libraries will read XLSX and ODS files exported by Google Sheets and generate CSV rows from every worksheet.
:::warning pass
It is strongly recommended to create a new Google account for testing.
One small mistake could result in a block or ban from Google services.
:::
:::caution pass
Google Sheets deprecates APIs quickly and there is no guarantee that the referenced APIs will be available in the future.
:::
Integration Details
This demo uses the following NodeJS modules:
- google-auth-library [1] to authenticate with Google APIs
- node-google-spreadsheet [2] to interact with Google Sheets v4 API
:::info Initial Setup
There are a number of steps to enable the Google Sheets API and Google Drive API for an account. The Complete Example covers the process.
:::
Authentication
It is strongly recommended to use a service account for Google API operations. The "Service Account Setup" section covers how to create a service account and generate a JSON key file.
import { JWT } from 'google-auth-library'
import { GoogleSpreadsheet } from 'google-spreadsheet';
// adjust the path to the actual key file.
// highlight-next-line
import creds from './sheetjs-test-726272627262.json' assert { type: "json" };
const jwt = new JWT({
email: creds.client_email,
key: creds.private_key,
scopes: [
'https://www.googleapis.com/auth/spreadsheets',
'https://www.googleapis.com/auth/drive.file',
]
});
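The JWT setup above reads only two fields from the key file. As a minimal sketch, a small check can report a wrong or incomplete file before any request is made. `checkServiceAccountKey` is a hypothetical helper, not part of either library, and the values passed to it are placeholders:

```javascript
// Hypothetical helper: verifies the two fields that the JWT setup reads.
function checkServiceAccountKey(creds) {
  const missing = ['client_email', 'private_key'].filter(
    (field) => typeof creds[field] !== 'string' || creds[field].length === 0
  );
  if (missing.length > 0) {
    throw new Error('Key file is missing: ' + missing.join(', '));
  }
  // Return the values in the shape used by the JWT options object
  return { email: creds.client_email, key: creds.private_key };
}

// Placeholder values for illustration only; a real key file holds a PEM private key.
const checked = checkServiceAccountKey({
  client_email: 'name@project.iam.gserviceaccount.com',
  private_key: 'PLACEHOLDER',
});
console.log(checked.email);
```

Recent NodeJS releases use the `with` keyword in place of `assert` for JSON imports, so the import line may need adjusting depending on the installed version.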
Connecting to Documents
To connect to existing documents, the document ID must be specified. This ID can be found from the edit URL.
The edit URL starts with https://docs.google.com/spreadsheets/d/ and includes /edit. The ID is the string of characters between the slashes. For example:
https://docs.google.com/spreadsheets/d/a_long_string_of_characters/edit#gid=0
---------------------------------------^^^^^^^^^^^^^^^^^^^^^^^^^^^--- ID
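The ID can also be pulled out of the URL programmatically. This is a minimal sketch; `extractDocumentId` is a hypothetical helper name and it assumes the URL follows the edit URL layout described above:

```javascript
// Hypothetical helper: pulls the document ID out of a Google Sheets edit URL.
function extractDocumentId(url) {
  // The ID sits between "/spreadsheets/d/" and the next slash
  const match = /\/spreadsheets\/d\/([^/]+)/.exec(url);
  if (!match) throw new Error('Not a Google Sheets document URL: ' + url);
  return match[1];
}

const id = extractDocumentId(
  'https://docs.google.com/spreadsheets/d/a_long_string_of_characters/edit#gid=0'
);
console.log(id); // a_long_string_of_characters
```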
The GoogleSpreadsheet constructor accepts a document ID and auth object:
const doc = new GoogleSpreadsheet('DOCUMENT_ID', jwt);
await doc.loadInfo();
Creating New Documents
The createNewSpreadsheetDocument method makes a request to create a new document. It is strongly recommended to add a blank sheet to the new document:
const doc = await GoogleSpreadsheet.createNewSpreadsheetDocument(jwt, { title: 'Document Title' });
const newSheet = await doc.addSheet({ title: 'Sheet Title' });
Array of Arrays
"Arrays of Arrays" are the main data format for interchange with Google Sheets. The outer array object includes row arrays, and each row array includes data.
SheetJS provides methods for working with Arrays of Arrays:
- aoa_to_sheet [3] creates SheetJS worksheet objects from arrays of arrays
- sheet_to_json [4] can generate arrays of arrays from SheetJS worksheets
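To make the shape concrete, the sketch below builds a small array of arrays in plain JavaScript and splits it into a header row and data rows. It uses no library calls, and the sample values are invented for illustration:

```javascript
// The outer array holds rows; each inner array holds the cell values of one row.
const aoa = [
  ['Name', 'Index'], // first row acts as the header row
  ['Alpha', 1],
  ['Beta', 2],
];

// Separate the header row from the data rows
const [header, ...dataRows] = aoa;

// Pair each data row with the header to get an array of objects
const objects = dataRows.map((row) =>
  Object.fromEntries(header.map((key, i) => [key, row[i]]))
);
console.log(objects);
```

The same header / data split is what the import and export flows in this demo rely on.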
Export Document Data
The goal is to create an XLSB export from a Google Sheet. Google Sheets does not natively support the XLSB format. SheetJS fills the gap.
Convert a Single Sheet
The idea is to extract the raw data from the Google Sheet headers and combine with the raw data rows to produce a large array of arrays.
async function wb_append_sheet(sheet, name, wb) {
/* get the header and data rows */
await sheet.loadHeaderRow();
const header = sheet.headerValues;
const rows = await sheet.getRows();
/* construct the array of arrays */
const aoa = [header].concat(rows.map(r => r._rawData));
/* generate a SheetJS Worksheet */
const ws = XLSX.utils.aoa_to_sheet(aoa);
/* add to workbook */
XLSX.utils.book_append_sheet(wb, ws, name);
}
Convert a Workbook
doc.sheetsByIndex is an array of worksheets in the Google Sheet Document. By looping across the sheets, the entire workbook can be written:
async function doc_to_wb(doc) {
/* Create a new workbook object */
const wb = XLSX.utils.book_new();
/* Loop across the Document sheets */
for(let i = 0; i < doc.sheetsByIndex.length; ++i) {
const sheet = doc.sheetsByIndex[i];
/* Get the worksheet name */
const name = sheet.title;
/* Add sheet to workbook */
await wb_append_sheet(sheet, name, wb);
}
return wb;
}
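Excel worksheet names are limited to 31 characters and cannot contain the characters \ / ? * [ ] or :, and a Google Sheets tab title may not satisfy those rules. The sketch below shows one way to derive a usable name before appending a sheet. `safeSheetName` is a hypothetical helper, and it covers only the length limit, the reserved characters, and exact-match collisions:

```javascript
// Hypothetical helper: makes a tab title usable as an Excel worksheet name.
// `used` is a Set of names already present in the workbook.
function safeSheetName(title, used) {
  // Replace reserved characters, then trim to the 31 character limit
  const name = title.replace(/[\\\/?*\[\]:]/g, '_').slice(0, 31) || 'Sheet';
  // Resolve collisions by appending a counter while staying within the limit
  let candidate = name;
  let n = 1;
  while (used.has(candidate)) {
    const suffix = '_' + (++n);
    candidate = name.slice(0, 31 - suffix.length) + suffix;
  }
  used.add(candidate);
  return candidate;
}

const used = new Set();
const first = safeSheetName('Q1/Q2: totals', used);
const second = safeSheetName('Q1/Q2: totals', used);
console.log(first, second);
```

In the loop above, the result could be passed in place of sheet.title when calling the append helper.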
Update Document Data
The goal is to import data from a NUMBERS file to Google Sheets. Google Sheets does not natively support the NUMBERS format. SheetJS fills the gap.
Clear the Document
Google Sheets does not allow users to delete every worksheet. This function deletes every worksheet after the first, then clears the first worksheet:
/* clear google sheets doc */
async function doc_clear(doc) {
/* delete all sheets after the first sheet */
const old_sheets = doc.sheetsByIndex;
for(let i = 1; i < old_sheets.length; ++i) await old_sheets[i].delete();
/* clear first worksheet */
await old_sheets[0].clear();
}
Update First Sheet
There are two steps: "update worksheet name" and "update worksheet data":
Update Sheet Name
The worksheet name is assigned by using the updateProperties method. The desired sheet name is the name of the first worksheet from the file.
async function doc_update_first_sheet_name(doc, wb) {
/* get first worksheet name */
const wsname = wb.SheetNames[0];
/* get first gsheet */
const sheet = doc.sheetsByIndex[0];
/* update worksheet name */
await sheet.updateProperties({title: wsname});
}
Update Sheet Data
sheet.addRows reads an Array of Arrays of values. XLSX.utils.sheet_to_json can generate this exact shape with the option header: 1. Unfortunately Google Sheets requires at least one "Header Row". This can be implemented by converting the entire worksheet to an Array of Arrays and setting the header row to the first row of the result:
async function doc_update_first_sheet_data(doc, wb) {
/* get first worksheet */
const ws = wb.Sheets[wb.SheetNames[0]];
/* generate array of arrays from the first worksheet */
const aoa = XLSX.utils.sheet_to_json(ws, {header: 1});
/* get first gsheet */
const sheet = doc.sheetsByIndex[0];
/* set document header row to first row of the AOA */
await sheet.setHeaderRow(aoa[0]);
/* add the remaining rows */
await sheet.addRows(aoa.slice(1));
}
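If the first row of the file has blank or repeated cells, it may not be usable as a header row as-is. The following is a minimal sketch, assuming header values need to be non-empty and unique (an assumption that should be checked against the integration library documentation). `normalizeHeader` is a hypothetical helper and is not part of SheetJS or the integration library:

```javascript
// Hypothetical helper: fills blank header cells and renames repeated values.
function normalizeHeader(row) {
  const seen = new Map();
  // Array.from visits holes in sparse arrays as undefined
  return Array.from(row, (cell, i) => {
    const text = cell == null ? '' : String(cell).trim();
    // Blank cells get a positional placeholder name
    const base = text === '' ? 'Column' + (i + 1) : text;
    const count = (seen.get(base) || 0) + 1;
    seen.set(base, count);
    // Second and later occurrences get a numeric suffix
    return count === 1 ? base : base + '_' + count;
  });
}

const fixed = normalizeHeader(['Name', '', 'Name', null]);
console.log(fixed);
```

The result could be passed to setHeaderRow in place of aoa[0]. The sketch does not guard against a generated name colliding with an existing header.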
Append Remaining Worksheets
Each name in the SheetJS Workbook SheetNames array maps to a worksheet. The list of names not including the first sheet is wb.SheetNames.slice(1).
There are two steps for each sheet: "create new sheet" and "load data".
Array methods such as forEach do not wait for async callbacks. A plain for loop with await is used so that each request completes before the next one starts:
async function doc_append_remaining_sheets(doc, wb) {
const names = wb.SheetNames.slice(1);
/* loop across names */
for(let i = 0; i < names.length; ++i) {
/* names[i] is the sheet name (the first sheet was skipped by the slice) */
const name = names[i];
/* wb.Sheets[name] is the worksheet object */
const ws = wb.Sheets[name];
/* create new google sheet */
const sheet = await doc_add_new_sheet(doc, name);
/* load sheet with data */
await sheet_load_from_ws(sheet, ws);
}
}
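The difference between the two looping styles can be seen without any network access. In this sketch, `fakeRequest` is a hypothetical stand-in for an API call such as creating a sheet:

```javascript
// Hypothetical stand-in for a network request such as creating a sheet
function fakeRequest(name, log) {
  return new Promise((resolve) => {
    setTimeout(() => { log.push(name); resolve(name); }, 5);
  });
}

// Plain for loop: each request finishes before the next one starts
async function sequential(names) {
  const log = [];
  for (let i = 0; i < names.length; ++i) {
    await fakeRequest(names[i], log);
  }
  return log;
}

// forEach: the callbacks are started but the caller never waits for them
async function fireAndForget(names) {
  const log = [];
  names.forEach(async (name) => { await fakeRequest(name, log); });
  return log.slice(); // snapshot taken before any request has finished
}
```

With the plain loop, the log holds every name in order once the function resolves. With forEach, the function resolves while the log is still empty. A for...of loop with await behaves like the plain for loop.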
Add a New Worksheet
doc.addSheet accepts a properties object that includes the worksheet name:
async function doc_add_new_sheet(doc, name) {
return await doc.addSheet({title: name});
}
This creates a new worksheet, sets the tab name, and returns a reference to the created worksheet.
Update Worksheet Data
async function sheet_load_from_ws(sheet, ws) {
/* generate array of arrays from the worksheet */
const aoa = XLSX.utils.sheet_to_json(ws, {header: 1});
/* set document header row to first row of the AOA */
await sheet.setHeaderRow(aoa[0]);
/* add the remaining rows */
await sheet.addRows(aoa.slice(1));
}
Raw File Exports
In the web interface, Google Sheets can export documents to XLSX or ODS. The NodeJS library includes similar methods to perform the download [5]:
Format | Google Sheets Description | Method |
---|---|---|
XLSX | Microsoft Excel (.xlsx) | downloadAsXLSX |
ODS | OpenDocument (.ods) | downloadAsODS |
The functions resolve to Buffer data. The Buffer objects can be parsed using the SheetJS read [6] method:
/* download XLSX */
const buf = await doc.downloadAsXLSX();
/* parse */
const wb = XLSX.read(buf);
At this point wb is a SheetJS workbook object [7].
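XLSX and ODS files are ZIP containers, so a valid download starts with the bytes "PK". As a minimal sketch, a quick check before parsing can make a failed download easier to diagnose. `looksLikeZip` is a hypothetical helper and the byte arrays are made-up inputs:

```javascript
// Hypothetical helper: checks for the two-byte ZIP signature "PK".
function looksLikeZip(data) {
  // Accept Buffer, Uint8Array, or ArrayBuffer input
  const bytes = data instanceof ArrayBuffer ? new Uint8Array(data) : data;
  return bytes.length >= 2 && bytes[0] === 0x50 && bytes[1] === 0x4b;
}

// Made-up inputs: a ZIP-style header and the start of an HTML page
const zipLike = new Uint8Array([0x50, 0x4b, 0x03, 0x04]);
const htmlLike = new Uint8Array([0x3c, 0x68, 0x74, 0x6d]);
console.log(looksLikeZip(zipLike), looksLikeZip(htmlLike));
```

In the snippet above, the check would be applied to buf before it is passed to the read method.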
Complete Example
:::note
This demo was last tested on 2023 September 17 using google-auth-library for authentication (v8.9.0) and google-spreadsheet for API access (v4.1.0).
:::
Account Setup
- Create a new Google account or log into an existing account.
:::caution pass
A valid phone number (for SMS verification) may be required.
:::
- Open https://console.cloud.google.com in a web browser. Review the Google Cloud Platform Terms of Service.
:::warning pass
You must agree to the Google Cloud Platform Terms of Service to use the APIs.
:::
Project Setup
- Create a new Project.
If the account does not have an existing project, click "CREATE PROJECT".
If the account has an existing project, click the project selector (:· icon) and click "NEW PROJECT" in the modal.
- In the New Project screen, enter "SheetJS Test" in the Project name textbox and select "No organization" in the Location box. Click "CREATE"
API Setup
:::info pass
The goal of this section is to enable Google Sheets API and Google Drive API.
:::
- Click the Project Selector (:· icon) and select "SheetJS Test".
- In the left sidebar, click "Enabled APIs and services".
- Near the top of the page, click "+ ENABLE APIS AND SERVICES".
- In the search bar near the middle of the page, type "Sheets" and look for "Google Sheets API". Click the card.
- In the Product Details screen, click the blue "ENABLE" button.
- Click the left arrow (<-) next to API/Service details.
- In the search bar near the middle of the page, type "Drive" and look for "Google Drive API". Click the card.
- In the Product Details screen, click the blue "ENABLE" button.
Service Account Setup
:::info pass
The goal of this section is to create a service account and generate a JSON key.
:::
Create Service Account
- Click the Project Selector (:· icon) and select "SheetJS Test".
- Click "Dashboard".
- In the left sidebar, hover over "APIs and Services" and select "Credentials".
- Click "+ CREATE CREDENTIALS". In the dropdown, select "Service Account".
- Enter "SheetJService" for Service account name. Click "CREATE AND CONTINUE".
:::note pass
The Service account ID is generated automatically.
:::
- In Step 2 "Grant this service account access to project", click "CONTINUE".
- In Step 3 click "DONE". You will be taken back to the credentials screen.
Create JSON Key
- Look for "SheetJService" in the "Service Accounts" table and click the email address in the row.
- Click the email address of the account in the "Service Accounts" table.
- Click "KEYS" in the horizontal bar near the top of the page.
- Click "ADD KEY" and select "Create new key" in the dropdown.
- In the popup, select the "JSON" radio button and click "CREATE". The page will download a JSON file.
- Click "CLOSE".
Create Document
:::info pass
The goal of this section is to create a document from the service account and share with the main account.
:::
- Create a SheetJSGS folder and initialize:
mkdir SheetJSGS
cd SheetJSGS
npm init -y
- Copy the JSON key file downloaded in the "Create JSON Key" section into the project folder.
- Install dependencies:
npm i --save https://cdn.sheetjs.com/xlsx-${current}/xlsx-${current}.tgz google-spreadsheet google-auth-library
- Save the following script to init.mjs:
import { JWT } from 'google-auth-library'
import { GoogleSpreadsheet } from 'google-spreadsheet';
// highlight-next-line
import creds from './sheetjs-test-726272627262.json' assert { type: "json" };
const jwt = new JWT({
email: creds.client_email,
key: creds.private_key,
scopes: [
'https://www.googleapis.com/auth/spreadsheets',
'https://www.googleapis.com/auth/drive.file',
]
});
const doc = await GoogleSpreadsheet.createNewSpreadsheetDocument(jwt, { title: 'test from NodeJS' });
const newSheet = await doc.addSheet({ title: 'SheetJSTest' });
// highlight-next-line
await doc.share('YOUR_ADDRESS@gmail.com');
Edit the highlighted lines as follows:
- './sheetjs-test-726272627262.json' should be replaced with the name of the JSON key file downloaded in the "Create JSON Key" section. The ./ prefix is required!
- 'YOUR_ADDRESS@gmail.com' should be replaced with the Google Account email address from the "Account Setup" section.
- Run the script:
node init.mjs
- Sign into Google Sheets. A shared document "test from NodeJS" should be displayed in the table. It will be owned by the service account.
- Open the shared document from the previous step.
- Copy the URL and extract the document ID.
The URL of the document will look like
https://docs.google.com/spreadsheets/d/a_long_string_of_characters/edit#gid=0
---------------------------------------^^^^^^^^^^^^^^^^^^^^^^^^^^^--- ID
The ID is a long string of letters, numbers, and underscore characters (_) just before the /edit part of the URL.
Load Data from NUMBERS
:::info pass
The goal of this section is to update the new document with data from a sample NUMBERS file.
:::
- Download the test file pres.numbers:
curl -LO https://sheetjs.com/pres.numbers
- Save the following script to load.mjs:
import { JWT } from 'google-auth-library'
import { GoogleSpreadsheet } from 'google-spreadsheet';
import { set_fs, readFile, utils } from 'xlsx';
import * as fs from 'fs';
set_fs(fs);
// highlight-next-line
import creds from './sheetjs-test-726272627262.json' assert { type: "json" };
const jwt = new JWT({
email: creds.client_email,
key: creds.private_key,
scopes: [
'https://www.googleapis.com/auth/spreadsheets',
'https://www.googleapis.com/auth/drive.file',
]
});
// highlight-next-line
const doc = new GoogleSpreadsheet('DOCUMENT_ID', jwt);
await doc.loadInfo();
const wb = readFile("pres.numbers");
/* clear workbook */
{
/* delete all sheets after the first sheet */
const old_sheets = doc.sheetsByIndex;
for(let i = 1; i < old_sheets.length; ++i) {
await old_sheets[i].delete();
}
/* clear first worksheet */
await old_sheets[0].clear();
}
/* write worksheets */
{
const name = wb.SheetNames[0];
const ws = wb.Sheets[name];
/* first worksheet already exists */
const sheet = doc.sheetsByIndex[0];
/* update worksheet name */
await sheet.updateProperties({title: name});
/* generate array of arrays from the first worksheet */
const aoa = utils.sheet_to_json(ws, {header: 1});
/* set document header row to first row of the AOA */
await sheet.setHeaderRow(aoa[0])
/* add the remaining rows */
await sheet.addRows(aoa.slice(1));
/* the other worksheets must be created manually */
for(let i = 1; i < wb.SheetNames.length; ++i) {
const name = wb.SheetNames[i];
const ws = wb.Sheets[name];
const sheet = await doc.addSheet({title: name});
const aoa = utils.sheet_to_json(ws, {header: 1});
await sheet.setHeaderRow(aoa[0])
await sheet.addRows(aoa.slice(1));
}
}
Edit the highlighted lines as follows:
- './sheetjs-test-726272627262.json' should be replaced with the name of the JSON key file downloaded in the "Create JSON Key" section. The ./ prefix is required!
- 'DOCUMENT_ID' should be replaced with the Document ID extracted in the "Create Document" section.
- Run the script:
node load.mjs
- Sign into Google Sheets and open the "test from NodeJS" shared document. It should show a list of Presidents, matching the contents of the test file.
Export Data to XLSB
:::info pass
The goal of this section is to export the raw data from Google Sheets to XLSB.
:::
- Save the following script to dump.mjs:
import { JWT } from 'google-auth-library'
import { GoogleSpreadsheet } from 'google-spreadsheet';
import { set_fs, writeFile, utils } from 'xlsx';
import * as fs from 'fs';
set_fs(fs);
// highlight-next-line
import creds from './sheetjs-test-726272627262.json' assert { type: "json" };
const jwt = new JWT({
email: creds.client_email,
key: creds.private_key,
scopes: [
'https://www.googleapis.com/auth/spreadsheets',
'https://www.googleapis.com/auth/drive.file',
]
});
// highlight-next-line
const doc = new GoogleSpreadsheet('DOCUMENT_ID', jwt);
await doc.loadInfo();
const wb = utils.book_new();
for(let i = 0; i < doc.sheetsByIndex.length; ++i) {
const sheet = doc.sheetsByIndex[i];
const name = sheet.title;
/* get the header and data rows */
await sheet.loadHeaderRow();
const header = sheet.headerValues;
const rows = await sheet.getRows();
const aoa = [header].concat(rows.map(r => r._rawData));
/* generate a SheetJS Worksheet */
const ws = utils.aoa_to_sheet(aoa);
/* add to workbook */
utils.book_append_sheet(wb, ws, name);
}
/* write to SheetJS.xlsb */
writeFile(wb, "SheetJS.xlsb");
Edit the highlighted lines as follows:
- './sheetjs-test-726272627262.json' should be replaced with the name of the JSON key file downloaded in the "Create JSON Key" section. The ./ prefix is required!
- 'DOCUMENT_ID' should be replaced with the Document ID extracted in the "Create Document" section.
- Run the script:
node dump.mjs
The script should create a file SheetJS.xlsb in the project folder. This file can be opened in Excel.
Export Raw Files
:::info pass
The goal of this section is to parse the Google Sheets XLSX export and generate CSV files for each worksheet.
:::
- Sign into Google Sheets and open the "test from NodeJS" shared document.
- Click the Plus (+) icon in the lower left corner to create a new worksheet.
- In the new worksheet, set cell A1 to the formula =SEQUENCE(3,5). This will assign a grid of values.
- Save the following script to raw.mjs:
import { JWT } from 'google-auth-library'
import { GoogleSpreadsheet } from 'google-spreadsheet';
import { read, utils } from 'xlsx';
// highlight-next-line
import creds from './sheetjs-test-726272627262.json' assert { type: "json" };
const jwt = new JWT({
email: creds.client_email,
key: creds.private_key,
scopes: [
'https://www.googleapis.com/auth/spreadsheets',
'https://www.googleapis.com/auth/drive.file',
]
});
// highlight-next-line
const doc = new GoogleSpreadsheet('DOCUMENT_ID', jwt);
await doc.loadInfo();
const buf = await doc.downloadAsXLSX();
/* Parse with SheetJS */
const wb = read(buf);
/* Loop over the worksheet names */
wb.SheetNames.forEach(name => {
/* Print the name to the console */
console.log(name);
/* Get the corresponding worksheet object */
const sheet = wb.Sheets[name];
/* Print a CSV export of the worksheet */
console.log(utils.sheet_to_csv(sheet));
});
Edit the highlighted lines as follows:
- './sheetjs-test-726272627262.json' should be replaced with the name of the JSON key file downloaded in the "Create JSON Key" section. The ./ prefix is required!
- 'DOCUMENT_ID' should be replaced with the Document ID extracted in the "Create Document" section.
- Run the script:
node raw.mjs
The script will display the sheet names and CSV rows from both worksheets.
[1] The package name is google-auth-library.
[2] The project name is node-google-spreadsheet but the module name is google-spreadsheet.
[5] See "Exporting Data" in the node-google-spreadsheet documentation.
[7] See "Workbook Object" for a description of the workbook object or "API Reference" for various methods to work with workbook and sheet objects.