# Databases "Database" is a catch-all term referring to traditional RDBMS as well as K/V stores, document databases, and other "NoSQL" storages. There are many external database systems as well as browser APIs like WebSQL and `localStorage` This demo discusses general strategies and provides examples for a variety of database systems. The examples are merely intended to demonstrate very basic functionality. ## Structured Tables Database tables are a common import and export target for spreadsheets. One common representation of a database table is an array of JS objects whose keys are column headers and whose values are the underlying data values. For example, | Name | Index | | :----------- | ----: | | Barack Obama | 44 | | Donald Trump | 45 | is naturally represented as an array of objects ```js [ { Name: "Barack Obama", Index: 44 }, { Name: "Donald Trump", Index: 45 } ] ``` The `sheet_to_json` and `json_to_sheet` helper functions work with objects of similar shape, converting to and from worksheet objects. The corresponding worksheet would include a header row for the labels: ``` XXX| A | B | ---+--------------+-------+ 1 | Name | Index | 2 | Barack Obama | 44 | 3 | Donald Trump | 45 | ``` ## Building Schemas from Worksheets The `sheet_to_json` helper function generates arrays of JS objects that can be scanned to determine the column "types", and there are third-party connectors that can push arrays of JS objects to database tables. The [`sexql`](http://sheetjs.com/sexql) browser demo uses WebSQL, which is limited to the SQLite fundamental types.
Implementation details (click to show) The `sexql` schema builder scans the first row to find headers: ```js if(!ws || !ws['!ref']) return; var range = XLSX.utils.decode_range(ws['!ref']); if(!range || !range.s || !range.e || range.s > range.e) return; var R = range.s.r, C = range.s.c; var names = new Array(range.e.c-range.s.c+1); for(C = range.s.c; C<= range.e.c; ++C){ var addr = XLSX.utils.encode_cell({c:C,r:R}); names[C-range.s.c] = ws[addr] ? ws[addr].v : XLSX.utils.encode_col(C); } ``` After finding the headers, a deduplication step ensures that data is not lost. Duplicate headers will be suffixed with `_1`, `_2`, etc. ```js for(var i = 0; i < names.length; ++i) if(names.indexOf(names[i]) < i) for(var j = 0; j < names.length; ++j) { var _name = names[i] + "_" + (j+1); if(names.indexOf(_name) > -1) continue; names[i] = _name; } ``` A column-major walk helps determine the data type. For SQLite the only relevant data types are `REAL` and `TEXT`. If a string or date or error is seen in any value of a column, the column is marked as `TEXT`: ```js var types = new Array(range.e.c-range.s.c+1); for(C = range.s.c; C<= range.e.c; ++C) { var seen = {}, _type = ""; for(R = range.s.r+1; R<= range.e.r; ++R) seen[(ws[XLSX.utils.encode_cell({c:C,r:R})]||{t:"z"}).t] = true; if(seen.s || seen.str) _type = "TEXT"; else if(seen.n + seen.b + seen.d + seen.e > 1) _type = "TEXT"; else switch(true) { case seen.b: case seen.n: _type = "REAL"; break; case seen.e: _type = "TEXT"; break; case seen.d: _type = "TEXT"; break; } types[C-range.s.c] = _type || "TEXT"; } ```
The included `SheetJSSQL.js` script demonstrates SQL statement generation. ## Objects, K/V and "Schema-less" Databases So-called "Schema-less" databases allow for arbitrary keys and values within the entries in the database. K/V stores and Objects add additional restrictions. There is no natural way to translate arbitrarily shaped schemas to worksheets in a workbook. One common trick is to dedicate one worksheet to holding named keys. For example, considering the JS object: ```json { "title": "SheetDB", "metadata": { "author": "SheetJS", "code": 7262 }, "data": [ { "Name": "Barack Obama", "Index": 44 }, { "Name": "Donald Trump", "Index": 45 }, ] } ``` A dedicated worksheet should store the one-off named values: ``` XXX| A | B | ---+-----------------+---------+ 1 | Path | Value | 2 | title | SheetDB | 3 | metadata.author | SheetJS | 4 | metadata.code | 7262 | ``` The included `ObjUtils.js` script demonstrates object-workbook conversion:
Implementation details (click to show) ```js function deepset(obj, path, value) { if(path.indexOf(".") == -1) return obj[path] = value; var parts = path.split("."); if(!obj[parts[0]]) obj[parts[0]] = {}; return deepset(obj[parts[0]], parts.slice(1).join("."), value); } function workbook_to_object(wb) { var out = {}; /* assign one-off keys */ var ws = wb.Sheets["_keys"]; if(ws) { var data = XLSX.utils.sheet_to_json(ws, {raw:true}); data.forEach(function(r) { deepset(out, r.path, r.value); }); } /* assign arrays from worksheet tables */ wb.SheetNames.forEach(function(n) { if(n == "_keys") return; out[n] = XLSX.utils.sheet_to_json(wb.Sheets[n], {raw:true}); }); return out; } function walk(obj, key, arr) { if(Array.isArray(obj)) return; if(typeof obj != "object") { arr.push({path:key, value:obj}); return; } Object.keys(obj).forEach(function(k) { walk(obj[k], key?key+"."+k:k, arr); }); } function object_to_workbook(obj) { var wb = XLSX.utils.book_new(); /* keyed entries */ var base = []; walk(obj, "", base); var ws = XLSX.utils.json_to_sheet(base, {header:["path", "value"]}); XLSX.utils.book_append_sheet(wb, ws, "_keys"); /* arrays */ Object.keys(obj).forEach(function(k) { if(!Array.isArray(obj[k])) return; XLSX.utils.book_append_sheet(wb, XLSX.utils.json_to_sheet(obj[k]), k); }); return wb; } ```
## Browser APIs #### WebSQL WebSQL is a popular SQL-based in-browser database available on Chrome / Safari. In practice, it is powered by SQLite, and most simple SQLite-compatible queries work as-is in WebSQL. The public demo generates a database from workbook. #### LocalStorage and SessionStorage The Storage API, encompassing `localStorage` and `sessionStorage`, describes simple key-value stores that only support string values and keys. Objects can be stored as JSON using `JSON.stringify` and `JSON.parse` to set and get keys. `SheetJSStorage.js` extends the `Storage` prototype with a `load` function to populate the db based on an object and a `dump` function to generate a workbook from the data in the storage. `LocalStorage.html` tests `localStorage`. #### IndexedDB IndexedDB is a more complex storage solution, but the `localForage` wrapper supplies a Promise-based interface mimicking the `Storage` API. `SheetJSForage.js` extends the `localforage` object with a `load` function to populate the db based on an object and a `dump` function to generate a workbook from the data in the storage. `LocalForage.html` forces IndexedDB mode. ## External Database Demos ### SQL Databases There are nodejs connector libraries for all of the popular RDBMS systems. They have facilities for connecting to a database, executing queries, and obtaining results as arrays of JS objects that can be passed to `json_to_sheet`. The main differences surround API shape and supported data types. #### SQLite [The `better-sqlite3` module](https://www.npmjs.com/package/better-sqlite3) provides a very simple API for working with SQLite databases. `Statement#all` runs a prepared statement and returns an array of JS objects. `SQLiteTest.js` generates a simple two-table SQLite database (`SheetJS1.db`), exports to XLSX (`sqlite.xlsx`), imports the new XLSX file to a new database (`SheetJS2.db`) and verifies the tables are preserved. #### MySQL / MariaDB [The `mysql2` module](https://www.npmjs.com/package/mysql2) supplies a callback API as well as a Promise wrapper. `Connection#query` runs a statement and returns an array whose first element is an array of JS objects. `MySQLTest.js` connects to the MySQL instance running on `localhost`, builds two tables in the `sheetjs` database, exports to XLSX, imports the new XLSX file to the `sheetj5` database and verifies the tables are preserved. #### PostgreSQL [The `pg` module](https://www.npmjs.com/package/pg) supplies a Promise wrapper. Like with `mysql2`, `Client#query` runs a statement and returns a result object. The `rows` key of the object is an array of JS objects. `PgSQLTest.js` connects to the PostgreSQL server on `localhost`, builds two tables in the `sheetjs` database, exports to XLSX, imports the new XLSX file to the `sheetj5` database and verifies the tables are preserved. #### Knex Query Builder [The `knex` module](https://www.npmjs.com/package/knex) builds SQL queries. The same exact code can be used against Oracle Database, MSSQL, and other engines. `KnexTest.js` uses the `sqlite3` connector and follows the same procedure as the SQLite test. The included `SheetJSKnex.js` script converts between the query builder and the common spreadsheet format. ### Key/Value Stores #### Redis Redis is a powerful data structure server that can store simple strings, sets, sorted sets, hashes and lists. One simple database representation stores the strings in a special worksheet (`_strs`), the manifest in another worksheet (`_manifest`), and each object in its own worksheet (`obj##`). `RedisTest.js` connects to a local Redis server, populates data based on the official Redis tutorial, exports to XLSX, flushes the server, imports the new XLSX file and verifies the data round-tripped correctly. `SheetJSRedis.js` includes the implementation details. #### LowDB LowDB is a small schemaless database powered by `lodash`. `_.get` and `_.set` helper functions make storing metadata a breeze. The included `SheetJSLowDB.js` script demonstrates a simple adapter that can load and dump data. [![Analytics](https://ga-beacon.appspot.com/UA-36810333-1/SheetJS/js-xlsx?pixel)](https://github.com/SheetJS/js-xlsx)