forked from sheetjs/docs.sheetjs.com
547 lines
16 KiB
Markdown
547 lines
16 KiB
Markdown
---
|
|
title: Sheets with PostgreSQL
|
|
sidebar_label: PostgreSQL
|
|
pagination_prev: demos/cli/index
|
|
pagination_next: demos/local/index
|
|
sidebar_custom_props:
|
|
sql: true
|
|
---
|
|
|
|
import current from '/version.js';
|
|
import CodeBlock from '@theme/CodeBlock';
|
|
|
|
[PostgreSQL](https://postgresql.org/) (colloquially referenced as "Postgres") is
|
|
an open source object-relational database system.
|
|
|
|
[SheetJS](https://sheetjs.com) is a JavaScript library for reading and writing
|
|
data from spreadsheets.
|
|
|
|
This demo uses SheetJS to exchange data between spreadsheets and PostgreSQL
|
|
databases. We'll explore how to save tables from a database to spreadsheets and
|
|
how to add data from spreadsheets into a database.
|
|
|
|
:::caution pass
|
|
|
|
**It is strongly recommended to use PostgreSQL with a query builder or ORM.**
|
|
|
|
While it is possible to generate SQL statements directly, there are many subtle
|
|
details and pitfalls. Battle-tested solutions generally provide mitigations
|
|
against SQL injection and other vulnerabilities.
|
|
|
|
:::
|
|
|
|
:::note Tested Deployments
|
|
|
|
This demo was tested in the following environments:
|
|
|
|
| Postgres | Connector Library | Date |
|
|
|:---------|:------------------|:-----------|
|
|
| `17.2` | `pg` (`8.13.1`) | 2025-01-03 |
|
|
| `16.6` | `pg` (`8.13.1`) | 2025-01-03 |
|
|
| `15.10` | `pg` (`8.13.1`) | 2025-01-03 |
|
|
| `14.15` | `pg` (`8.13.1`) | 2025-01-03 |
|
|
| `13.18` | `pg` (`8.13.1`) | 2025-01-03 |
|
|
| `12.22` | `pg` (`8.13.1`) | 2025-01-03 |
|
|
|
|
:::
|
|
|
|
## Integration Details
|
|
|
|
The [SheetJS NodeJS module](/docs/getting-started/installation/nodejs) can be
|
|
loaded in NodeJS scripts that connect to PostgreSQL databases.
|
|
|
|
This demo uses the `pg` connector module[^1], but the same mechanics apply to
|
|
other PostgreSQL libraries.
|
|
|
|
### Exporting Data
|
|
|
|
`Client#query` returns a Promise that resolves to a result set. The `rows`
|
|
property of the result is an array of objects.
|
|
|
|
The SheetJS `json_to_sheet` method[^2] can generate a worksheet object[^3] from
|
|
the array of objects:
|
|
|
|
```js
|
|
const table_name = "Tabeller1"; // name of table
|
|
|
|
/* fetch all data from specified table */
|
|
const res = await client.query(`SELECT * FROM ${table_name}`);
|
|
|
|
/* generate a SheetJS worksheet object from the data */
|
|
const worksheet = XLSX.utils.json_to_sheet(res.rows);
|
|
```
|
|
|
|
A workbook object can be built from the worksheet using utility functions[^4].
|
|
The workbook can be exported using the SheetJS `writeFile` method[^5]:
|
|
|
|
```js
|
|
/* create a new workbook and add the worksheet */
|
|
const wb = XLSX.utils.book_new();
|
|
XLSX.utils.book_append_sheet(wb, worksheet, "Sheet1");
|
|
|
|
/* export workbook to XLSX */
|
|
XLSX.writeFile(wb, "SheetJSPGExport.xlsx");
|
|
```
|
|
|
|
### Importing Data
|
|
|
|
The SheetJS `sheet_to_json` function[^6] takes a worksheet object and generates
|
|
an array of objects.
|
|
|
|
Queries must be manually generated from the objects. Assuming the field names
|
|
in the object match the column headers, a loop can generate `INSERT` queries.
|
|
|
|
:::danger pass
|
|
|
|
**PostgreSQL does not allow parameterized queries with variable column names**
|
|
|
|
```sql
|
|
INSERT INTO table_name (?) VALUES (?);
|
|
-- ---------------------^ variable column names are not valid
|
|
```
|
|
|
|
Queries are generated manually. To help prevent SQL injection vulnerabilities,
|
|
the `pg-format`[^7] module escapes identifiers and fields.
|
|
|
|
:::
|
|
|
|
```js
|
|
/* generate an array of arrays from the worksheet */
|
|
const aoo = XLSX.utils.sheet_to_json(ws);
|
|
|
|
const table_name = "Blatte1"; // name of table
|
|
|
|
/* loop through the data rows */
|
|
for(let row of aoo) {
|
|
|
|
/* generate format helper strings */
|
|
const ent = Object.entries(row);
|
|
const Istr = Array.from({length: entries.length}, ()=>"%I").join(", ");
|
|
const Lstr = Array.from({length: entries.length}, ()=>"%L").join(", ");
|
|
|
|
/* generate INSERT statement */
|
|
let query = format.withArray(
|
|
`INSERT INTO %I (${Istr}) VALUES(${Lstr})`,
|
|
[ table_name, ...ent.map(x => x[0]), ...ent.map(x => x[1]) ]
|
|
);
|
|
|
|
/* execute INSERT statement */
|
|
await client.query(query);
|
|
}
|
|
```
|
|
|
|
### Creating a Table
|
|
|
|
The worksheet can be scanned to determine column names and types. With the names
|
|
and types, a `CREATE TABLE` query can be written.
|
|
|
|
<details>
|
|
<summary><b>Implementation Details</b> (click to show)</summary>
|
|
|
|
The `sheet_to_pg_table` function:
|
|
|
|
- scans worksheet cells to determine column names and types
|
|
- drops and creates a new table with the determined column names and types
|
|
- loads the entire dataset into the new table
|
|
|
|
```js
|
|
/* create table and load data given a worksheet and a PostgreSQL client */
|
|
async function sheet_to_pg_table(client, worksheet, tableName) {
|
|
if (!worksheet['!ref']) return;
|
|
|
|
const range = XLSX.utils.decode_range(worksheet['!ref']);
|
|
|
|
/* Extract headers from first row, clean names for PostgreSQL */
|
|
const headers = [];
|
|
for (let col = range.s.c; col <= range.e.c; col++) {
|
|
const cellAddress = XLSX.utils.encode_cell({ r: range.s.r, c: col });
|
|
const cell = worksheet[cellAddress];
|
|
const headerValue = cell ? String(cell.v).replace(/[^a-zA-Z0-9_]/g, '_') : `column_${col + 1}`;
|
|
headers.push(headerValue.toLowerCase());
|
|
}
|
|
|
|
/* Group cell values by column for type deduction */
|
|
const columnValues = headers.map(() => []);
|
|
for (let row = range.s.r + 1; row <= range.e.r; row++) {
|
|
for (let col = range.s.c; col <= range.e.c; col++) {
|
|
const cellAddress = XLSX.utils.encode_cell({ r: row, c: col });
|
|
const cell = worksheet[cellAddress];
|
|
columnValues[col].push(cell);
|
|
}
|
|
}
|
|
|
|
/* Deduce PostgreSQL type for each column */
|
|
const types = {};
|
|
headers.forEach((header, idx) => {
|
|
types[header] = deduceType(columnValues[idx]);
|
|
});
|
|
|
|
/* Delete table if it exists in the DB */
|
|
await client.query(format('DROP TABLE IF EXISTS %I', tableName));
|
|
|
|
/* Create table */
|
|
const createTableSQL = format(
|
|
'CREATE TABLE %I (%s)',
|
|
tableName,
|
|
headers.map(header => format('%I %s', header, types[header])).join(', ')
|
|
);
|
|
await client.query(createTableSQL);
|
|
|
|
/* Insert data row by row */
|
|
for (let row = range.s.r + 1; row <= range.e.r; row++) {
|
|
const values = headers.map((header, col) => {
|
|
const cellAddress = XLSX.utils.encode_cell({ r: row, c: col });
|
|
const cell = worksheet[cellAddress];
|
|
return parseValue(cell, types[header]);
|
|
});
|
|
|
|
const insertSQL = format(
|
|
'INSERT INTO %I (%s) VALUES (%s)',
|
|
tableName,
|
|
headers.map(h => format('%I', h)).join(', '),
|
|
values.map(() => '%L').join(', ')
|
|
);
|
|
await client.query(format(insertSQL, ...values));
|
|
}
|
|
}
|
|
|
|
function deduceType(cells) {
|
|
if (!cells || cells.length === 0) return 'text';
|
|
|
|
const nonEmptyCells = cells.filter(cell => cell && cell.v != null);
|
|
if (nonEmptyCells.length === 0) return 'text';
|
|
|
|
// Check for dates by looking at both cell type and formatted value
|
|
const isDateCell = cell => cell?.t === 'd' || (cell?.t === 'n' && cell.w && /\d{4}-\d{2}-\d{2}|\d{1,2}\/\d{1,2}\/\d{4}|\d{2}-[A-Za-z]{3}-\d{4}|[A-Za-z]{3}-\d{2}|\d{1,2}-[A-Za-z]{3}/.test(cell.w));
|
|
|
|
if (nonEmptyCells.some(isDateCell)) { return 'date'; }
|
|
|
|
const allBooleans = nonEmptyCells.every(cell => cell.t === 'b');
|
|
if (allBooleans) { return 'boolean'; }
|
|
|
|
const allNumbers = nonEmptyCells.every(cell => cell.t === 'n' || (cell.t === 's' && !isNaN(cell.v.replace(/[,$\s%()]/g, ''))));
|
|
|
|
if (allNumbers) {
|
|
const numbers = nonEmptyCells.map(cell => {
|
|
if (cell.t === 'n') return cell.v;
|
|
return parseFloat(cell.v.replace(/[,$\s%()]/g, ''));
|
|
});
|
|
|
|
const needsPrecision = numbers.some(num => {
|
|
const str = num.toString();
|
|
return str.includes('e')
|
|
|| (str.includes('.') && str.split('.')[1].length > 6)
|
|
|| Math.abs(num) > 1e15;
|
|
});
|
|
|
|
return needsPrecision ? 'numeric' : 'double precision';
|
|
}
|
|
return 'text';
|
|
}
|
|
|
|
function parseValue(cell, type) {
|
|
if (!cell || cell.v == null) return null;
|
|
|
|
switch (type) {
|
|
case 'date':
|
|
if (cell.t === 'd') { return cell.v.toISOString().split('T')[0]; }
|
|
if (cell.t === 'n') {
|
|
const date = new Date((cell.v - 25569) * 86400 * 1000);
|
|
return date.toISOString().split('T')[0];
|
|
}
|
|
return null;
|
|
|
|
case 'numeric':
|
|
case 'double precision':
|
|
if (cell.t === 'n') return cell.v;
|
|
if (cell.t === 's') {
|
|
const cleaned = cell.v.replace(/[,$\s%()]/g, '');
|
|
if (!isNaN(cleaned)) return parseFloat(cleaned);
|
|
}
|
|
return null;
|
|
|
|
case 'boolean':
|
|
return cell.t === 'b' ? cell.v : null;
|
|
|
|
default:
|
|
return String(cell.v);
|
|
}
|
|
}
|
|
```
|
|
|
|
</details>
|
|
|
|
|
|
## Complete Example
|
|
|
|
0) Install and start the PostgreSQL serve r.
|
|
|
|
<details>
|
|
<summary><b>Installation Notes</b> (click to show)</summary>
|
|
|
|
On macOS, install the `postgresql` formula with Homebrew:
|
|
|
|
```bash
|
|
brew install postgresql@16
|
|
```
|
|
|
|
The last few lines of the installer explain how to start the database:
|
|
|
|
```text
|
|
Or, if you don't want/need a background service you can just run:
|
|
// highlight-next-line
|
|
LC_ALL="C" /usr/local/opt/postgresql@16/bin/postgres -D /usr/local/var/postgresql@16
|
|
```
|
|
|
|
On Linux, install `postgresql` by running the following script:
|
|
|
|
```bash
|
|
echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" | sudo tee /etc/apt/sources.list.d/pgdg.list
|
|
wget -qO - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
|
|
|
|
sudo apt update
|
|
sudo apt install postgresql-16
|
|
sudo systemctl start postgresql
|
|
|
|
# Optional: Create user with password
|
|
sudo -u postgres createuser -P $USER
|
|
sudo -u postgres psql -c "ALTER USER $USER WITH SUPERUSER;"
|
|
```
|
|
|
|
If running the optional user creation steps above, a PostgreSQL password will be required. [^8]
|
|
|
|
Run the command to start a local database instance.
|
|
|
|
</details>
|
|
|
|
1) Drop any existing database with the name `SheetJSPG`:
|
|
|
|
```bash
|
|
dropdb SheetJSPG
|
|
```
|
|
|
|
:::caution pass
|
|
|
|
Some Linux installations do not create the `dropdb` command. The command can be
|
|
run through the `postgres` command-line tool:
|
|
|
|
```bash title="dropdb in Ubuntu/Debian"
|
|
sudo -i -u postgres dropdb SheetJSPG
|
|
```
|
|
|
|
:::
|
|
|
|
:::info pass
|
|
|
|
If the server is running elsewhere, or if the username is different from the
|
|
current user, command-line flags can override the defaults.
|
|
|
|
| Option | Explanation
|
|
|:--------------|:--------------------------|
|
|
| `-h HOSTNAME` | Name of the server |
|
|
| `-p PORT` | specifies the port number |
|
|
| `-U USERNAME` | specifies the username |
|
|
|
|
:::
|
|
|
|
:::note pass
|
|
|
|
If the database does not exist, the command will fail with the following error:
|
|
|
|
```
|
|
dropdb: error: database removal failed: ERROR: database "SheetJSPG" does not exist
|
|
```
|
|
|
|
This error can be safely ignored.
|
|
|
|
:::
|
|
|
|
2) Create an empty `SheetJSPG` database using the `createdb` command:
|
|
|
|
```bash
|
|
createdb SheetJSPG
|
|
```
|
|
|
|
:::caution pass
|
|
|
|
Some Linux installations do not create the `createdb` command. The command can be
|
|
run through the `postgres` command-line tool:
|
|
|
|
```bash title="createdb in Ubuntu/Debian"
|
|
sudo -i -u postgres createdb SheetJSPG
|
|
```
|
|
|
|
:::
|
|
|
|
:::note pass
|
|
|
|
`createdb` supports the same `-h`, `-p`, and `-U` flags as `dropdb`.
|
|
|
|
:::
|
|
|
|
### Connector Test
|
|
|
|
3) Create a project folder:
|
|
|
|
```bash
|
|
mkdir sheetjs-pg
|
|
cd sheetjs-pg
|
|
npm init -y
|
|
```
|
|
|
|
4) Install the `pg` connector module:
|
|
|
|
```bash
|
|
npm i --save pg@8.13.1
|
|
```
|
|
|
|
5) Save the following example codeblock to `PGTest.js`:
|
|
|
|
```js title="PGTest.js"
|
|
const pg = require("pg");
|
|
const client = new pg.Client({
|
|
database:"SheetJSPG",
|
|
// highlight-start
|
|
host: "127.0.0.1", // localhost
|
|
port: 5432,
|
|
//user: "",
|
|
//password: ""
|
|
// highlight-end
|
|
});
|
|
(async() => {
|
|
|
|
await client.connect();
|
|
const res = await client.query('SELECT $1::text as message', ['Hello world!']);
|
|
console.log(res.rows[0].message); // Hello world!
|
|
await client.end();
|
|
|
|
})();
|
|
```
|
|
|
|
6) Edit the new `PGTest.js` script and modify the highlighted lines from the
|
|
codeblock to reflect the database deployment settings.
|
|
|
|
The settings in the codeblock match the default configuration for macOS Homebrew
|
|
PostgreSQL server. For other deployments:
|
|
|
|
- If the server is not running on your computer, set `host` and `port` to the
|
|
correct host name and port number.
|
|
|
|
- If the server expects a different username and password, uncomment the `user`
|
|
and `password` lines and replace the values with the username and password.
|
|
|
|
- For Ubuntu/Debian PostgreSQL installations, the default user is `postgres`.
|
|
The password must be set during installation or using `sudo -u postgres psql`
|
|
followed by `\password postgres` in the `psql` prompt.
|
|
|
|
7) Run the script:
|
|
|
|
```bash
|
|
node PGTest.js
|
|
```
|
|
|
|
It should print `Hello world!`
|
|
|
|
:::caution pass
|
|
|
|
If the output is not `Hello world!` or if there is an error, please report the
|
|
issue to the `pg` connector project for further diagnosis.
|
|
|
|
:::
|
|
|
|
### Add SheetJS
|
|
|
|
8) Install dependencies:
|
|
|
|
<CodeBlock language="bash">{`\
|
|
npm i --save https://cdn.sheetjs.com/xlsx-${current}/xlsx-${current}.tgz pg-format@1.0.4`}
|
|
</CodeBlock>
|
|
|
|
9) Download [`SheetJSPG.js`](pathname:///postgresql/SheetJSPG.js):
|
|
|
|
```bash
|
|
curl -LO https://docs.sheetjs.com/postgresql/SheetJSPG.js
|
|
```
|
|
|
|
This script will:
|
|
- read and parse the test file `pres.numbers`
|
|
- create a connection to the `SheetJSPG` database on a local PostgreSQL server
|
|
- load data from the first worksheet into a table with name `Presidents`
|
|
- disconnect and reconnect to the database
|
|
- dump data from the table `Presidents`
|
|
- export the dataset to `SheetJSPG.xlsx`
|
|
|
|
10) Edit the `SheetJSPG.js` script.
|
|
|
|
The script defines an `opts` object:
|
|
|
|
```js title="SheetJSPG.js (configuration lines)"
|
|
const XLSX = require("xlsx");
|
|
const opts = {
|
|
database:"SheetJSPG",
|
|
// highlight-start
|
|
host: "127.0.0.1", // localhost
|
|
port: 5432,
|
|
//user: "",
|
|
//password: ""
|
|
// highlight-end
|
|
};
|
|
```
|
|
|
|
The settings in the codeblock match the default configuration for macOS Homebrew
|
|
PostgreSQL server. For other deployments:
|
|
|
|
- If the server is not running on your computer, set `host` and `port` to the
|
|
correct host name and port number.
|
|
|
|
- If the server expects a different username and password, uncomment the `user`
|
|
and `password` lines and replace the values with the username and password.
|
|
|
|
- For Ubuntu/Debian PostgreSQL installations, the default user is postgres.
|
|
The password must be set during installation or using `sudo -u postgres psql`
|
|
followed by `\password postgres` in the `psql` prompt.
|
|
|
|
|
|
11) Fetch the example file [`pres.numbers`](https://docs.sheetjs.com/pres.numbers):
|
|
|
|
```bash
|
|
curl -L -O https://docs.sheetjs.com/pres.numbers
|
|
```
|
|
|
|
12) Run the script:
|
|
|
|
```bash
|
|
node SheetJSPG.js
|
|
```
|
|
|
|
13) Verify the result:
|
|
|
|
- `SheetJSPGExport.xlsx` can be opened in a spreadsheet app or tested in the terminal
|
|
|
|
```bash
|
|
npx xlsx-cli SheetJSPGExport.xlsx
|
|
```
|
|
|
|
- The database server can be queried using the `psql` command line tool.
|
|
|
|
If the server is running locally, the command will be:
|
|
|
|
```bash
|
|
psql SheetJSPG -c 'SELECT * FROM "Presidents";'
|
|
```
|
|
|
|
:::note pass
|
|
|
|
`psql` supports the same `-h`, `-p`, and `-U` flags as `dropdb` and `createdb`.
|
|
|
|
:::
|
|
|
|
|
|
[^1]: See [the official `pg` website](https://node-postgres.com/) for more info.
|
|
[^2]: See [`json_to_sheet` in "Utilities"](/docs/api/utilities/array#array-of-objects-input)
|
|
[^3]: See ["Sheet Objects"](/docs/csf/sheet) in "SheetJS Data Model" for more details.
|
|
[^4]: See ["Workbook Helpers" in "Utilities"](/docs/api/utilities/wb) for details on `book_new` and `book_append_sheet`.
|
|
[^5]: See [`writeFile` in "Writing Files"](/docs/api/write-options)
|
|
[^6]: See [`sheet_to_json` in "Utilities"](/docs/api/utilities/array#array-output)
|
|
[^7]: The [`pg-format`](https://npm.im/pg-format) package is available on the public NPM registry. Even though the project is marked as deprecated, the official [`pg` website still recommends `pg-format`](https://node-postgres.com/features/queries#parameterized-query:~:text=use%20pg%2Dformat%20package%20for%20handling%20escaping)
|
|
[^8]: PostgreSQL on Linux uses [SCRAM authentication by default, which requires a password](https://www.postgresql.org/docs/current/auth-password.html) |