# Databases

"Database" is a catch-all term referring to traditional RDBMS as well as K/V
stores, document databases, and other "NoSQL" stores.  There are many external
database systems as well as browser APIs like WebSQL and `localStorage`.

This demo discusses general strategies and provides examples for a variety of
database systems.  The examples are merely intended to demonstrate very basic
functionality.


## Structured Tables

Database tables are a common import and export target for spreadsheets.  One
common representation of a database table is an array of JS objects whose keys
are column headers and whose values are the underlying data values. For example,

| Name         | Index |
| :----------- | ----: |
| Barack Obama |    44 |
| Donald Trump |    45 |

is naturally represented as an array of objects

```js
[
  { Name: "Barack Obama", Index: 44 },
  { Name: "Donald Trump", Index: 45 }
]
```

The `sheet_to_json` and `json_to_sheet` helper functions work with objects of
similar shape, converting to and from worksheet objects.  The corresponding
worksheet would include a header row for the labels:

```
XXX|      A       |   B   |
---+--------------+-------+
 1 | Name         | Index |
 2 | Barack Obama |    44 |
 3 | Donald Trump |    45 |
```
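
As a rough illustration of how the header row relates to the object keys, the
following sketch flattens an array of objects into an array of arrays with a
leading header row.  The `objects_to_aoa` name is hypothetical and is not part
of the library; `json_to_sheet` performs the real conversion and also handles
cell types.

```js
/* hypothetical helper: collect headers in first-seen order, then emit rows */
function objects_to_aoa(rows) {
  var headers = [];
  rows.forEach(function(row) {
    Object.keys(row).forEach(function(k) {
      if(headers.indexOf(k) == -1) headers.push(k);
    });
  });
  var aoa = [headers];
  rows.forEach(function(row) {
    /* keys missing from a row are left as undefined */
    aoa.push(headers.map(function(h) { return row[h]; }));
  });
  return aoa;
}

var aoa = objects_to_aoa([
  { Name: "Barack Obama", Index: 44 },
  { Name: "Donald Trump", Index: 45 }
]);
/* aoa[0] is the header row; aoa[1] and aoa[2] hold the data rows */
```

An array of arrays in this shape can be passed to `XLSX.utils.aoa_to_sheet` to
build a worksheet.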


## Building Schemas from Worksheets

The `sheet_to_json` helper function generates arrays of JS objects that can be
scanned to determine the column "types", and there are third-party connectors
that can push arrays of JS objects to database tables.

The [`sexql`](http://sheetjs.com/sexql) browser demo uses WebSQL, which is
limited to the SQLite fundamental types.  Its schema builder scans the first row
to find headers:

```js
  if(!ws || !ws['!ref']) return;
  var range = XLSX.utils.decode_range(ws['!ref']);
  if(!range || !range.s || !range.e || range.s.r > range.e.r || range.s.c > range.e.c) return;
  var R = range.s.r, C = range.s.c;

  var names = new Array(range.e.c-range.s.c+1);
  for(C = range.s.c; C<= range.e.c; ++C){
    var addr = XLSX.utils.encode_cell({c:C,r:R});
    names[C-range.s.c] = ws[addr] ? ws[addr].v : XLSX.utils.encode_col(C);
  }
```

After finding the headers, a deduplication step ensures that data is not lost.
Duplicate headers will be suffixed with `_1`, `_2`, etc.

```js
  for(var i = 0; i < names.length; ++i) if(names.indexOf(names[i]) < i)
    for(var j = 0; j < names.length; ++j) {
      var _name = names[i] + "_" + (j+1);
      if(names.indexOf(_name) > -1) continue;
      names[i] = _name;
      break;
    }
```

A column-major walk helps determine the data type.  For SQLite the only relevant
data types are `REAL` and `TEXT`.  If a string or date or error is seen in any
value of a column, the column is marked as `TEXT`:

```js
  var types = new Array(range.e.c-range.s.c+1);
  for(C = range.s.c; C<= range.e.c; ++C) {
    var seen = {}, _type = "";
    for(R = range.s.r+1; R<= range.e.r; ++R)
      seen[(ws[XLSX.utils.encode_cell({c:C,r:R})]||{t:"z"}).t] = true;
    if(seen.s || seen.str) _type = "TEXT";
    else if((!!seen.n) + (!!seen.b) + (!!seen.d) + (!!seen.e) > 1) _type = "TEXT";
    else switch(true) {
      case seen.b:
      case seen.n: _type = "REAL"; break;
      case seen.e: _type = "TEXT"; break;
      case seen.d: _type = "TEXT"; break;
    }
    types[C-range.s.c] = _type || "TEXT";
  }
```

The included `SheetJSSQL.js` script demonstrates SQL statement generation.
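
A minimal sketch of statement generation, assuming the `names` and `types`
arrays from the schema builder above.  The `create_table_sql` and `insert_sql`
helpers are hypothetical illustrations rather than the contents of
`SheetJSSQL.js`, and the quoting assumes SQLite conventions (double quotes for
identifiers, single quotes for string literals).

```js
/* hypothetical helpers; quoting rules assume SQLite */
function quote_ident(name) {
  return '"' + String(name).replace(/"/g, '""') + '"';
}
function quote_value(v) {
  if(v == null) return "NULL";
  if(typeof v == "number") return isFinite(v) ? String(v) : "NULL";
  if(typeof v == "boolean") return v ? "1" : "0";
  return "'" + String(v).replace(/'/g, "''") + "'";
}

/* build a CREATE TABLE statement from parallel name / type arrays */
function create_table_sql(table, names, types) {
  var cols = names.map(function(n, i) { return quote_ident(n) + " " + types[i]; });
  return "CREATE TABLE " + quote_ident(table) + " (" + cols.join(", ") + ");";
}

/* build an INSERT statement for one row object from sheet_to_json */
function insert_sql(table, names, row) {
  var cols = names.map(function(n) { return quote_ident(n); });
  var vals = names.map(function(n) { return quote_value(row[n]); });
  return "INSERT INTO " + quote_ident(table) + " (" + cols.join(", ") +
    ") VALUES (" + vals.join(", ") + ");";
}
```

String building keeps the example short.  Where the connector supports
prepared statements with bound parameters, those are generally a safer choice.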

## Objects, K/V and "Schema-less" Databases

So-called "Schema-less" databases allow for arbitrary keys and values within the
entries in the database.  K/V stores and Objects add additional restrictions.

There is no natural way to translate arbitrarily shaped schemas to worksheets
in a workbook.  One common trick is to dedicate one worksheet to holding named
keys.  For example, considering the JS object:

```json
{
  "title": "SheetDB",
  "metadata": {
    "author": "SheetJS",
    "code": 7262
  },
  "data": [
    { "Name": "Barack Obama", "Index": 44 },
    { "Name": "Donald Trump", "Index": 45 }
  ]
}
```

A dedicated worksheet should store the one-off named values:

```
XXX|        A        |    B    |
---+-----------------+---------+
 1 | path            | value   |
 2 | title           | SheetDB |
 3 | metadata.author | SheetJS |
 4 | metadata.code   |    7262 |
```

The included `ObjUtils.js` script demonstrates object-workbook conversion:

```js
function deepset(obj, path, value) {
  if(path.indexOf(".") == -1) return obj[path] = value;
  var parts = path.split(".");
  if(!obj[parts[0]]) obj[parts[0]] = {};
  return deepset(obj[parts[0]], parts.slice(1).join("."), value);
}
function workbook_to_object(wb) {
  var out = {};

  /* assign one-off keys */
  var ws = wb.Sheets["_keys"]; if(ws) {
    var data = XLSX.utils.sheet_to_json(ws, {raw:true});
    data.forEach(function(r) { deepset(out, r.path, r.value); });
  }

  /* assign arrays from worksheet tables */
  wb.SheetNames.forEach(function(n) {
    if(n == "_keys") return;
    out[n] = XLSX.utils.sheet_to_json(wb.Sheets[n], {raw:true});
  });

  return out;
}

function walk(obj, key, arr) {
  if(Array.isArray(obj)) return;
  /* null is reported as "object" by typeof, so treat it as a leaf value */
  if(obj == null || typeof obj != "object") { arr.push({path:key, value:obj}); return; }
  Object.keys(obj).forEach(function(k) { walk(obj[k], key?key+"."+k:k, arr); });
}
function object_to_workbook(obj) {
  var wb = XLSX.utils.book_new();

  /* keyed entries */
  var base = []; walk(obj, "", base);
  var ws = XLSX.utils.json_to_sheet(base, {header:["path", "value"]});
  XLSX.utils.book_append_sheet(wb, ws, "_keys");

  /* arrays */
  Object.keys(obj).forEach(function(k) {
    if(!Array.isArray(obj[k])) return;
    XLSX.utils.book_append_sheet(wb, XLSX.utils.json_to_sheet(obj[k]), k);
  });

  return wb;
}
```


## Browser APIs

#### WebSQL

WebSQL is a popular SQL-based in-browser database available on Chrome / Safari.
In practice, it is powered by SQLite, and most simple SQLite-compatible queries
work as-is in WebSQL.

The public demo <http://sheetjs.com/sexql> generates a database from a workbook.

#### LocalStorage and SessionStorage

The Storage API, encompassing `localStorage` and `sessionStorage`, describes
simple key-value stores that only support string values and keys. Objects can be
stored as JSON using `JSON.stringify` and `JSON.parse` to set and get keys.
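
The JSON round trip can be sketched as follows.  The `set_object` and
`get_object` names are hypothetical; `storage` can be `localStorage`,
`sessionStorage`, or any object exposing `setItem` and `getItem`.  A small
in-memory stand-in is included as an assumption so that the snippet also runs
outside of a browser.

```js
/* hypothetical helpers wrapping any Storage-like object */
function set_object(storage, key, obj) {
  storage.setItem(key, JSON.stringify(obj));
}
function get_object(storage, key) {
  var raw = storage.getItem(key);
  return raw == null ? null : JSON.parse(raw);
}

/* in-memory stand-in that coerces values to strings, as the Storage API does */
function MemoryStorage() { this._data = {}; }
MemoryStorage.prototype.setItem = function(k, v) { this._data[k] = String(v); };
MemoryStorage.prototype.getItem = function(k) {
  return Object.prototype.hasOwnProperty.call(this._data, k) ? this._data[k] : null;
};

/* in a browser, pass localStorage or sessionStorage instead */
var store = new MemoryStorage();
set_object(store, "metadata", { author: "SheetJS", code: 7262 });
var metadata = get_object(store, "metadata");
```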

`SheetJSStorage.js` extends the `Storage` prototype with a `load` function to
populate the db based on an object and a `dump` function to generate a workbook
from the data in the storage.  `LocalStorage.html` tests `localStorage`.

#### IndexedDB

IndexedDB is a more complex storage solution, but the `localForage` wrapper
supplies a Promise-based interface mimicking the `Storage` API.

`SheetJSForage.js` extends the `localforage` object with a `load` function to
populate the db based on an object and a `dump` function to generate a workbook
from the data in the storage.  `LocalForage.html` forces IndexedDB mode.


## External Database Demos

### SQL Databases

There are Node.js connector libraries for all of the popular RDBMS.  They
have facilities for connecting to a database, executing queries, and obtaining
results as arrays of JS objects that can be passed to `json_to_sheet`.  The main
differences surround API shape and supported data types.

#### SQLite

[The `better-sqlite3` module](https://www.npmjs.com/package/better-sqlite3)
provides a very simple API for working with SQLite databases.  `Statement#all`
runs a prepared statement and returns an array of JS objects.

`SQLiteTest.js` generates a simple two-table SQLite database (`SheetJS1.db`),
exports to XLSX (`sqlite.xlsx`), imports the new XLSX file to a new database
(`SheetJS2.db`) and verifies the tables are preserved.

#### MySQL / MariaDB

[The `mysql2` module](https://www.npmjs.com/package/mysql2) supplies a callback
API as well as a Promise wrapper.  `Connection#query` runs a statement and
returns an array whose first element is an array of JS objects.

`MySQLTest.js` connects to the MySQL instance running on `localhost`, builds two
tables in the `sheetjs` database, exports to XLSX, imports the new XLSX file to
the `sheetj5` database and verifies the tables are preserved.

#### PostgreSQL

[The `pg` module](https://www.npmjs.com/package/pg) supplies a Promise wrapper.
As with `mysql2`, `Client#query` runs a statement and returns a result object.
The `rows` key of the object is an array of JS objects.

`PgSQLTest.js` connects to the PostgreSQL server on `localhost`, builds two
tables in the `sheetjs` database, exports to XLSX, imports the new XLSX file to
the `sheetj5` database and verifies the tables are preserved.
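
The differences in result shape can be smoothed over with a small normalizer.
This is a sketch based only on the shapes described above: a plain array for
`better-sqlite3`, an array whose first element holds the rows for `mysql2`, and
an object with a `rows` key for `pg`.  The `rows_from_result` name is
hypothetical and the shape detection is a heuristic, not part of any connector.

```js
/* hypothetical helper: extract the array of row objects from a query result */
function rows_from_result(result) {
  /* pg: result object with a `rows` array */
  if(result && !Array.isArray(result) && Array.isArray(result.rows)) return result.rows;
  /* mysql2: array whose first element is the array of rows */
  if(Array.isArray(result) && Array.isArray(result[0])) return result[0];
  /* better-sqlite3: already an array of row objects */
  if(Array.isArray(result)) return result;
  throw new Error("unrecognized result shape");
}
```

The returned array can be passed directly to `XLSX.utils.json_to_sheet`.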

### Key/Value Stores

#### Redis

Redis is a powerful data structure server that can store simple strings, sets,
sorted sets, hashes and lists.  One simple database representation stores the
strings in a special worksheet (`_strs`), the manifest in another worksheet
(`_manifest`), and each object in its own worksheet (`obj##`).
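
One way to picture this layout is a function that splits a plain description
of the data into per-worksheet tables.  Everything here is an assumption made
for illustration: the input shape, the `redis_dump_to_tables` name, the column
headers, and the `obj0`, `obj1`, ... numbering.  It is not the layout used by
`SheetJSRedis.js`, and sorted sets are omitted for brevity.

```js
/* hypothetical input: { key: { type: "string" | "list" | "set" | "hash", value: ... } } */
function redis_dump_to_tables(dump) {
  var tables = {
    _strs: [["key", "value"]],
    _manifest: [["key", "type", "sheet"]]
  };
  var count = 0;
  Object.keys(dump).forEach(function(key) {
    var entry = dump[key];
    /* simple strings share one worksheet */
    if(entry.type == "string") { tables._strs.push([key, entry.value]); return; }
    /* every other object gets its own worksheet plus a manifest row */
    var sheet = "obj" + (count++);
    tables._manifest.push([key, entry.type, sheet]);
    if(entry.type == "hash") {
      /* hashes: one field / value pair per row */
      tables[sheet] = Object.keys(entry.value).map(function(f) {
        return [f, entry.value[f]];
      });
    } else {
      /* lists and sets: one member per row */
      tables[sheet] = entry.value.map(function(v) { return [v]; });
    }
  });
  return tables;
}
```

Each value in the returned object is an array of arrays, so it could be
converted with `XLSX.utils.aoa_to_sheet` and added to a workbook with
`XLSX.utils.book_append_sheet`.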

`RedisTest.js` connects to a local Redis server, populates data based on the
official Redis tutorial, exports to XLSX, flushes the server, imports the new
XLSX file and verifies the data round-tripped correctly.  `SheetJSRedis.js`
includes the implementation details.

#### LowDB

LowDB is a small schemaless database powered by `lodash`.  `_.get` and `_.set`
helper functions make storing metadata a breeze.  The included `SheetJSLowDB.js`
script demonstrates a simple adapter that can load and dump data.

