27 KiB
title |
---|
Content and Site Generation |
import current from '/version.js';
With the advent of server-side frameworks and content management systems, it is possible to build sites whose source of truth is a spreadsheet! This demo explores a number of approaches.
Lume
The official Sheets plugin uses SheetJS to load data from spreadsheets.
Lume Demo
:::note
This was tested against lume v1.14.2
on 2022 December 27.
:::
Complete Example (click to show)
- Create a stock site:
mkdir -p sheetjs-lume
cd sheetjs-lume
deno run -Ar https://deno.land/x/lume/init.ts
When prompted, enter the following options:
Use TypeScript for the configuration file
: press Enter (use defaultN
)Do you want to use plugins
: typesheets
and press Enter
The project will be configured and modules will be installed.
- Download https://sheetjs.com/pres.numbers and place in a
_data
folder:
mkdir -p _data
curl -L -o _data/pres.numbers https://sheetjs.com/pres.numbers
- Create a
index.njk
file that references the file. Since the file ispres.numbers
, the parameter name ispres
:
<h2>Presidents</h2>
<table><thead><th>Name</th><th>Index</th></thead>
<tbody>
{% for row in pres %}{% if (loop.index >= 1) %}
<tr>
<td>{{ row.Name }}</td>
<td>{{ row.Index }}</td>
</tr>
{% endif %}{% endfor %}
</tbody>
</table>
- Run the development server:
deno task lume --serve
To verify it works, access http://localhost:3000 from your web browser.
Adding a new row and saving pres.numbers
should refresh the data
- Stop the server (press
CTRL+C
in the terminal window) and run
deno task lume
This will create a static site in the _site
folder, which can be served with:
npx http-server _site
Accessing the page http://localhost:8080 will show the page contents.
GatsbyJS
gatsby-transformer-excel
generates nodes for each data row of each worksheet. The official documentation
includes examples and more detailed usage instructions.
:::note
gatsby-transformer-excel
is maintained by the Gatsby core team and all bugs
should be directed to the main Gatsby project. If it is determined to be a bug
in the parsing logic, issues should then be raised with the SheetJS project.
:::
GraphQL details (click to show)
gatsby-transformer-excel
generates nodes for each data row of each worksheet.
Under the hood, it uses sheet_to_json
to generate row objects using the headers in the first row as keys.
Assuming the file name is pres.xlsx
and the data is stored in "Sheet1", the
following nodes will be created:
[
{ Name: "Bill Clinton", Index: 42, type: "PresXlsxSheet1" },
{ Name: "GeorgeW Bush", Index: 43, type: "PresXlsxSheet1" },
{ Name: "Barack Obama", Index: 44, type: "PresXlsxSheet1" },
{ Name: "Donald Trump", Index: 45, type: "PresXlsxSheet1" },
{ Name: "Joseph Biden", Index: 46, type: "PresXlsxSheet1" },
]
The type is a proper casing of the file name concatenated with the sheet name.
The following query pulls the Name
and Index
fields from each row:
{
allPresXlsxSheet1 { # "all" followed by type
edges {
node { # each line in this block should be a field in the data
Name
Index
}
}
}
}
:::caution
gatsby-transformer-excel
uses an older version of the library. It can be
overridden through a package.json
override in the latest versions of NodeJS:
{`\
{
"overrides": {
"xlsx": "https://cdn.sheetjs.com/xlsx-${current}/xlsx-${current}.tgz"
}
}`}
:::
GatsbyJS Demo
Complete Example (click to show)
:::note
This demo was tested on 2022 November 11 against create-gatsby@3.0.0
. The
generated project used gatsby@5.0.0
and react@18.2.0
.
:::
-
Run
npm init gatsby -- -y sheetjs-gatsby
to create the template site. -
Follow the on-screen instructions for starting the local development server:
cd sheetjs-gatsby
npm run develop
Open a web browser to the displayed URL (typically http://localhost:8000/
)
- Edit
package.json
and add the highlighted lines in the JSON object:
{`\
{
// highlight-start
"overrides": {
"xlsx": "https://cdn.sheetjs.com/xlsx-${current}/xlsx-${current}.tgz"
},
// highlight-end
"name": "sheetjs-gatsby",
"version": "1.0.0",
`}
- Install the library and plugins:
{`\
npm i --save https://cdn.sheetjs.com/xlsx-${current}/xlsx-${current}.tgz
npm i --save gatsby-transformer-excel gatsby-source-filesystem
`}
- Edit
gatsby-config.js
and add the following lines to theplugins
array:
plugins: [
{
resolve: `gatsby-source-filesystem`,
options: {
name: `data`,
path: `${__dirname}/src/data/`,
},
},
`gatsby-transformer-excel`,
],
Stop and restart the development server process (npm run develop
).
- Make a
src/data
directory, download https://sheetjs.com/pres.xlsx, and move the downloaded file into the new folder:
mkdir -p src/data
curl -L -o src/data/pres.xlsx https://sheetjs.com/pres.xlsx
- To verify, open the GraphiQL editor at
http://localhost:8000/___graphql
.
There is an editor in the left pane. Paste the following query into the editor:
{
allPresXlsxSheet1 {
edges {
node {
Name
Index
}
}
}
}
Press the Execute Query button and data should show up in the right pane:
- Create a new file
src/pages/pres.js
that uses the query and displays the result:
import { graphql } from "gatsby"
import * as React from "react"
export const query = graphql`query {
allPresXlsxSheet1 {
edges {
node {
Name
Index
}
}
}
}`;
const PageComponent = ({data}) => {
return ( <pre>{JSON.stringify(data, 2, 2)}</pre>);
};
export default PageComponent;
After saving the file, access http://localhost:8000/pres
. The displayed JSON
is the data that the component receives:
{
"allPresXlsxSheet1": {
"edges": [
{
"node": {
"Name": "Bill Clinton",
"Index": 42
}
},
// ....
- Change
PageComponent
to display a table based on the data:
import { graphql } from "gatsby"
import * as React from "react"
export const query = graphql`query {
allPresXlsxSheet1 {
edges {
node {
Name
Index
}
}
}
}`;
// highlight-start
const PageComponent = ({data}) => {
const rows = data.allPresXlsxSheet1.edges.map(r => r.node);
return ( <table>
<thead><tr><th>Name</th><th>Index</th></tr></thead>
<tbody>{rows.map(row => ( <tr>
<td>{row.Name}</td>
<td>{row.Index}</td>
</tr>))}</tbody>
</table> );
};
// highlight-end
export default PageComponent;
Going back to the browser, http://localhost:8000/pres
will show a table:
- Open the file
src/data/pres.xlsx
in Excel or LibreOffice or Numbers. Add a new row at the end of the file:
Save the file and notice that the table has refreshed with the new data:
- Stop the development server and run
npm run build
. Once the build is finished, the display will confirm that the/pres
route is static:
Pages
┌ src/pages/404.js
│ ├ /404/
│ └ /404.html
├ src/pages/index.js
│ └ /
└ src/pages/pres.js
└ /pres/
╭────────────────────────────────────────────────────────────────╮
│ │
│ (SSG) Generated at build time │
│ D (DSG) Deferred static generation - page generated at runtime │
│ ∞ (SSR) Server-side renders at runtime (uses getServerData) │
│ λ (Function) Gatsby function │
│ │
╰────────────────────────────────────────────────────────────────╯
The built page will be placed in public/pres/index.html
. Open the page with a
text editor and search for "SheetJS" to verify raw HTML was generated:
<tr><td>SheetJS Dev</td><td>47</td></tr>
Bundling Data
Bundlers can run JS code and process assets during development and during site builds. Custom plugins can extract data from spreadsheets.
ViteJS
:::note
This demo covers static asset imports. For processing files in the browser, the "Bundlers" demo includes an example.
:::
ViteJS supports static asset imports, but the default raw loader interprets data as UTF-8 strings. This corrupts binary formats like XLSX and XLS, but a custom loader can override the default behavior.
For a pure static site, a plugin can load data into an array of row objects. The SheetJS work is performed in the plugin. The library is not loaded in the page!
import { readFileSync } from 'fs';
import { read, utils } from 'xlsx';
import { defineConfig } from 'vite';
export default defineConfig({
assetsInclude: ['**/*.xlsx'], // xlsx file should be treated as assets
plugins: [
{ // this plugin handles ?sheetjs tags
name: "vite-sheet",
transform(code, id) {
if(!id.match(/\?sheetjs$/)) return;
var wb = read(readFileSync(id.replace(/\?sheetjs$/, "")));
var data = utils.sheet_to_json(wb.Sheets[wb.SheetNames[0]]);
return `export default JSON.parse('${JSON.stringify(data)}')`;
}
}
]
});
This loader uses the query sheetjs
:
import data from './data.xlsx?sheetjs';
document.querySelector('#app').innerHTML = `<div><pre>
${data.map(row => JSON.stringify(row)).join("\n")}
</pre></div>`;
Base64 plugin (click to show)
This loader pulls in data as a Base64 string that can be read with XLSX.read
.
While this approach works, it is not recommended since it loads the library in
the front-end site.
import { readFileSync } from 'fs';
import { defineConfig } from 'vite';
export default defineConfig({
assetsInclude: ['**/*.xlsx'], // mark that xlsx file should be treated as assets
plugins: [
{ // this plugin handles ?b64 tags
name: "vite-b64-plugin",
transform(code, id) {
if(!id.match(/\?b64$/)) return;
var path = id.replace(/\?b64/, "");
var data = readFileSync(path, "base64");
return `export default '${data}'`;
}
}
]
});
When importing using the b64
query, the raw Base64 string will be exposed.
This can be read directly with XLSX.read
in JS code:
import { read, utils } from "xlsx";
/* reference workbook */
import b64 from './data.xlsx?b64';
/* parse workbook and export first sheet to CSV */
const wb = await read(b64);
const wsname = wb.SheetNames[0];
const csv = utils.sheet_to_csv(wb.Sheets[wsname]);
document.querySelector('#app').innerHTML = `<div><pre>
<b>${wsname}</b>
${csv}
</pre></div>`;
Site Generators
NextJS
:::note
This was tested against next v13.1.1
on 2022 December 28.
:::
:::info
At a high level, there are two ways to pull spreadsheet data into NextJS apps:
loading an asset module or performing the file read operations from the NextJS
lifecycle. At the time of writing, NextJS does not offer an out-of-the-box
asset module solution, so this demo focuses on raw operations. NextJS does not
watch the spreadsheets, so next dev
hot reloading will not work!
:::
The general strategy with NextJS apps is to generate HTML snippets or data from the lifecycle functions and reference them in the template.
HTML output can be generated using XLSX.utils.sheet_to_html
and inserted into
the document using the dangerouslySetInnerHTML
attribute:
export default function Index({html, type}) { return (
// ...
// highlight-next-line
<div dangerouslySetInnerHTML={{ __html: html }} />
// ...
); }
:::warning Reading and writing files during the build process
fs
cannot be statically imported from the top level in NextJS pages. The
dynamic import must happen within a lifecycle function. For example:
/* it is safe to import the library from the top level */
import { readFile, utils, set_fs } from 'xlsx';
/* it is not safe to import 'fs' from the top level ! */
// import * as fs from 'fs'; // this will fail
import { join } from 'path';
import { cwd } from 'process';
export async function getServerSideProps() {
// highlight-next-line
set_fs(await import("fs")); // dynamically import 'fs' when needed
const wb = readFile(join(cwd(), "public", "sheetjs.xlsx")); // works
// ...
}
:::
:::caution Next 13+ and SWC
Next 13 switched to the SWC minifier. There are known issues with the minifier.
Until those issues are resolved, SWC should be disabled in next.config.js
:
module.exports = {
// highlight-next-line
swcMinify: false
};
:::
Demo
Complete Example (click to show)
- Disable NextJS telemetry:
npx next@13.1.1 telemetry disable
Confirm it is disabled by running
npx next@13.1.1 telemetry status
- Set up folder structure. At the end, a
pages
folder with asheets
subfolder must be created. On Linux or MacOS or WSL:
mkdir -p pages/sheets/
- Download the test file and place in the project root. On Linux or MacOS or WSL:
curl -LO https://docs.sheetjs.com/next/sheetjs.xlsx
- Install dependencies:
npm i --save https://cdn.sheetjs.com/xlsx-latest/xlsx-latest.tgz next@13.1.1
- Download test scripts:
Download and place the following scripts in the pages
subfolder:
Download [id].js
and place in the
pages/sheets
subfolder.
:::caution Percent-Encoding in the script name
The [id].js
script must have the literal square brackets in the name. If your
browser saved the file to %5Bid%5D.js
. rename the file.
:::
On Linux or MacOS or WSL:
cd pages
curl -LO https://docs.sheetjs.com/next/index.js
curl -LO https://docs.sheetjs.com/next/getServerSideProps.js
curl -LO https://docs.sheetjs.com/next/getStaticPaths.js
curl -LO https://docs.sheetjs.com/next/getStaticProps.js
cd sheets
curl -LOg 'https://docs.sheetjs.com/next/[id].js'
cd ../..
- Test the deployment:
npx next@13.1.1
Open a web browser and access:
- http://localhost:3000 landing page
- http://localhost:3000/getStaticProps shows data from the first sheet
- http://localhost:3000/getServerSideProps shows data from the first sheet
- http://localhost:3000/getStaticPaths shows a list (3 sheets)
The individual worksheets are available at
- Stop the server and run a production build:
npx next@13.1.1 build
The final output will show a list of the routes and types:
Route (pages) Size First Load JS
┌ ○ / 541 B 77.4 kB
├ ○ /404 181 B 73.7 kB
├ λ /getServerSideProps 594 B 77.4 kB
├ ● /getStaticPaths 2.56 kB 79.4 kB
├ ● /getStaticProps 591 B 77.4 kB
└ ● /sheets/[id] (447 ms) 569 B 77.4 kB
├ /sheets/0
├ /sheets/1
└ /sheets/2
As explained in the summary, the /getStaticPaths
and /getStaticProps
routes
are completely static. 3 /sheets/#
pages were generated, corresponding to 3
worksheets in the file. /getServerSideProps
is server-rendered.
- Try to build a static site:
npx next@13.1.1 export
:::note The static export will fail!
A static page cannot be generated at this point because /getServerSideProps
is still server-rendered.
:::
- Remove
pages/getServerSideProps.js
and rebuild withnpx next@13.1.1 build
Inspecting the output, there should be no lines with the λ
symbol:
Route (pages) Size First Load JS
┌ ○ / 541 B 77.4 kB
├ ○ /404 181 B 73.7 kB
├ ● /getStaticPaths 2.56 kB 79.4 kB
├ ● /getStaticProps 591 B 77.4 kB
└ ● /sheets/[id] (459 ms) 569 B 77.4 kB
├ /sheets/0
├ /sheets/1
└ /sheets/2
- Generate the static site:
npx next@13.1.1 export
The static site will be written to the out
subfolder, which can be hosted with
npx http-server out
The command will start a local HTTP server on port 8080.
"Static Site Generation" using getStaticProps
When using getStaticProps
, the file will be read once during build time.
import { readFile, set_fs, utils } from 'xlsx';
export async function getStaticProps() {
/* read file */
set_fs(await import("fs"));
const wb = readFile(path_to_file)
/* generate and return the html from the first worksheet */
const html = utils.sheet_to_html(wb.Sheets[wb.SheetNames[0]]);
return { props: { html } };
};
"Static Site Generation with Dynamic Routes" using getStaticPaths
Typically a static site with dynamic routes has an endpoint /sheets/[id]
that
implements both getStaticPaths
and getStaticProps
.
getStaticPaths
should return an array of worksheet indices:
export async function getStaticPaths() {
/* read file */
set_fs(await import("fs"));
const wb = readFile(path);
/* generate an array of objects that will be used for generating pages */
const paths = wb.SheetNames.map((name, idx) => ({ params: { id: idx.toString() } }));
return { paths, fallback: false };
};
:::note
For a pure static site, fallback
must be set to false
!
:::
getStaticProps
will generate the actual HTML for each page:
export async function getStaticProps(ctx) {
/* read file */
set_fs(await import("fs"));
const wb = readFile(path);
/* get the corresponding worksheet and generate HTML */
const ws = wb.Sheets[wb.SheetNames[ctx.params.id]]; // id from getStaticPaths
const html = utils.sheet_to_html(ws);
return { props: { html } };
};
"Server-Side Rendering" using getServerSideProps
:::caution Do not use on a static site
These routes require a NodeJS dynamic server. Static page generation will fail!
getStaticProps
and getStaticPaths
support static site generation (SSG).
getServerSideProps
is suited for NodeJS hosted deployments where the workbook
changes frequently and a static site is undesirable.
:::
When using getServerSideProps
, the file will be read on each request.
import { readFile, set_fs, utils } from 'xlsx';
export async function getServerSideProps() {
/* read file */
set_fs(await import("fs"));
const wb = readFile(path_to_file);
/* generate and return the html from the first worksheet */
const html = utils.sheet_to_html(wb.Sheets[wb.SheetNames[0]]);
return { props: { html } };
};
NuxtJS
@nuxt/content
is a file-based CMS for Nuxt, enabling static-site generation
and on-demand server rendering powered by spreadsheets.
:::note
This demo was tested on 2022 November 18 against Nuxt Content v1.15.1
.
:::
Nuxt Content Demo
Complete Example (click to show)
:::note
The project was generated using create-nuxt-app v4.0.0
. The generated project
used Nuxt v2.15.8
and Nuxt Content v1.15.1
.
:::
- Create a stock app:
npx create-nuxt-app@4.0.0 SheetJSNuxt
When prompted, enter the following options:
Project name
: press Enter (use defaultSheetJSNuxt
)Programming language
: press Down Arrow (TypeScript
selected) then EnterPackage manager
: selectNpm
and press EnterUI framework
: selectNone
and press EnterNuxt.js modules
: scroll toContent
, select with Space, then press EnterLinting tools
: press Enter (do not select any Linting tools)Testing framework
: selectNone
and press EnterRendering mode
: selectUniversal (SSR / SSG)
and press EnterDeployment target
: selectStatic (Static/Jamstack hosting)
and press EnterDevelopment tools
: press Enter (do not select any Development tools)What is your GitHub username?
: press EnterVersion control system
: selectNone
The project will be configured and modules will be installed.
- Install the SheetJS library and start the server:
cd SheetJSNuxt
npm i --save https://cdn.sheetjs.com/xlsx-latest/xlsx-latest.tgz
npm run dev
When the build finishes, the terminal will display a URL like:
ℹ Listening on: http://localhost:64688/
The server is listening on that URL. Open the link in a web browser.
- Download https://sheetjs.com/pres.xlsx and move to the
content
folder.
curl -L -o content/pres.xlsx https://sheetjs.com/pres.xlsx
- Modify
nuxt.config.js
as follows:
- Add the following to the top of the script:
import { readFile, utils } from 'xlsx';
// This will be called when the files change
const parseSheet = (file, { path }) => {
// `path` is a path that can be read with `XLSX.readFile`
const wb = readFile(path);
const o = wb.SheetNames.map(name => ({ name, data: utils.sheet_to_json(wb.Sheets[name])}));
return { data: o };
}
- Look for the exported object. There should be a
content
property:
// Content module configuration: https://go.nuxtjs.dev/config-content
content: {},
Replace the property with the following definition:
// content.extendParser allows us to hook into the parsing step
content: {
extendParser: {
// the keys are the extensions that will be matched. The "." is required
".numbers": parseSheet,
".xlsx": parseSheet,
".xls": parseSheet,
// can add other extensions like ".fods" as desired
}
},
(If the property is missing, add it to the end of the exported object)
- Replace
pages/index.vue
with the following:
<!-- sheetjs (C) 2013-present SheetJS -- https://sheetjs.com -->
<template><div>
<div v-for="item in data.data" v-bind:key="item.name">
<h2>{{ item.name }}</h2>
<table><thead><tr><th>Name</th><th>Index</th></tr></thead><tbody>
<tr v-for="row in item.data" v-bind:key="row.Index">
<td>{{ row.Name }}</td>
<td>{{ row.Index }}</td>
</tr>
</tbody></table>
</div>
</div></template>
<script>
export default {
async asyncData ({$content}) {
return {
data: await $content('pres').fetch()
};
}
};
</script>
The browser should refresh to show the contents of the spreadsheet. If it does not, click Refresh manually or open a new browser window.
- To verify that hot loading works, open
pres.xlsx
from thecontent
folder in Excel. Add a new row to the bottom and save the file:
The server terminal window should show a line like:
ℹ Updated ./content/pres.xlsx @nuxt/content 05:43:37
The page should automatically refresh with the new content:
- Stop the server (press
CTRL+C
in the terminal window) and run
npm run generate
This will create a static site in the dist
folder, which can be served with:
npx http-server dist
Accessing the page http://localhost:8080 will show the page contents. Verifying the static nature is trivial: make another change in Excel and save. The page will not change.
nuxt.config.js configuration
Through an override in nuxt.config.js
, Nuxt Content will use custom parsers.
Differences from a stock create-nuxt-app
config are shown below:
import { readFile, utils } from 'xlsx';
// This will be called when the files change
const parseSheet = (file, { path }) => {
// `path` is a path that can be read with `XLSX.readFile`
const wb = readFile(path);
const o = wb.SheetNames.map(name => ({ name, data: utils.sheet_to_json(wb.Sheets[name])}));
return { data: o };
}
export default {
// ...
// content.extendParser allows us to hook into the parsing step
content: {
extendParser: {
// the keys are the extensions that will be matched. The "." is required
".numbers": parseSheet,
".xlsx": parseSheet,
".xls": parseSheet,
// can add other extensions like ".fods" as desired
}
},
// ...
}
Template Use
When a spreadsheet is placed in the content
folder, Nuxt will find it. The
data can be referenced in a view with asyncData
. The name should not include
the extension, so "sheetjs.numbers"
would be referenced as "sheetjs"
:
async asyncData ({$content}) {
return {
// $content('sheetjs') will match files with extensions in nuxt.config.js
data: await $content('sheetjs').fetch()
};
}
In the template, data.data
is an array of objects. Each object has a name
property for the worksheet name and a data
array of row objects. This maps
neatly with nested v-for
:
<!-- loop over the worksheets -->
<div v-for="item in data.data" v-bind:key="item.name">
<table>
<!-- loop over the rows of each worksheet -->
<tr v-for="row in item.data" v-bind:key="row.Index">
<!-- here `row` is a row object generated from sheet_to_json -->
<td>{{ row.Name }}</td>
<td>{{ row.Index }}</td>
</tr>
</table>
</div>