sheet_to_html

- added to TS definition and tests
- clarified behavior of plaintext files (fixes #641 h/t @dskrvk)
- removed old test files
SheetJS 2017-05-16 13:45:35 -04:00
parent 409581b317
commit 3fde651a8c
35 changed files with 856 additions and 323 deletions

View File

@ -1,5 +1,6 @@
test_files/
tests/files/
types/
demos/
index.html
misc/

173
README.md
View File

@ -103,6 +103,7 @@ enhancements, additional features by request, and dedicated support.
* [Formulae Output](#formulae-output)
* [Delimiter-Separated Output](#delimiter-separated-output)
+ [UTF-16 Unicode Text](#utf-16-unicode-text)
* [HTML Output](#html-output)
* [JSON](#json)
- [File Formats](#file-formats)
* [Excel 2007+ XML (XLSX/XLSM)](#excel-2007-xml-xlsxxlsm)
@ -128,9 +129,9 @@ enhancements, additional features by request, and dedicated support.
* [Tested Environments](#tested-environments)
* [Test Files](#test-files)
- [Contributing](#contributing)
* [Tests](#tests)
* [OSX/Linux](#osxlinux)
* [Windows](#windows)
* [Tests](#tests)
- [License](#license)
- [References](#references)
@ -457,6 +458,7 @@ files and output the contents in various formats. The source is available at
Some helper functions in `XLSX.utils` generate different views of the sheets:
- `XLSX.utils.sheet_to_csv` generates CSV
- `XLSX.utils.sheet_to_html` generates HTML
- `XLSX.utils.sheet_to_json` generates an array of objects
- `XLSX.utils.sheet_to_formulae` generates a list of formulae
@ -485,7 +487,7 @@ Note: browser generates binary blob and forces a "download" to client. This
example uses [FileSaver.js](https://github.com/eligrey/FileSaver.js/):
```js
/* bookType can be any supported output type */
/* bookType can be any supported output type */
var wopts = { bookType:'xlsx', bookSST:false, type:'binary' };
var wbout = XLSX.write(workbook,wopts);
@ -514,7 +516,18 @@ take the same arguments as the normal write functions but return a readable
stream. They are only exposed in node.
- `XLSX.stream.to_csv` is the streaming version of `XLSX.utils.sheet_to_csv`.
- `XLSX.stream.to_html` is the streaming version of the HTML output type.
- `XLSX.stream.to_html` is the streaming version of `XLSX.utils.sheet_to_html`.
<details>
<summary><b>nodejs convert to CSV and write file</b> (click to show)</summary>
```js
var output_file_name = "out.csv";
var stream = XLSX.stream.to_csv(worksheet);
stream.pipe(fs.createWriteStream(output_file_name));
```
</details>
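The streaming writers emit a header, then one chunk per row, then a footer. The following is a minimal sketch of that pattern using only the node `stream` module. `rows_to_html_stream` is a hypothetical stand-in that works on an array of arrays, not the library implementation, and it does not escape cell text:

```js
var Readable = require('stream').Readable;

/* hypothetical stand-in for XLSX.stream.to_html (illustration only, no escaping) */
function rows_to_html_stream(rows) {
  var stream = new Readable();
  var R = 0, started = false, ended = false;
  stream._read = function() {
    /* first call: emit the header */
    if(!started) { started = true; stream.push("<table>"); return; }
    /* one chunk per row */
    if(R < rows.length) {
      var tds = rows[R++].map(function(v) { return "<td>" + String(v) + "</td>"; });
      stream.push("<tr>" + tds.join("") + "</tr>");
      return;
    }
    /* emit the footer once, then signal the end of the stream */
    if(!ended) { ended = true; stream.push("</table>"); return; }
    stream.push(null);
  };
  return stream;
}

/* collect the chunks; a real script would pipe to fs.createWriteStream */
var chunks = [];
var html_stream = rows_to_html_stream([["a", "b"], [1, 2]]);
html_stream.on('data', function(d) { chunks.push(d.toString()); });
html_stream.on('end', function() { console.log(chunks.join("")); });
```

Because rows are generated on demand inside `_read`, the full output string never has to be held in memory.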
<https://github.com/sheetjs/sheetaki> pipes write streams to nodejs response.
@ -555,11 +568,13 @@ Utilities are available in the `XLSX.utils` object:
- `aoa_to_sheet` converts an array of arrays of JS data to a worksheet.
- `json_to_sheet` converts an array of JS objects to a worksheet.
- `table_to_sheet` converts a DOM TABLE element to a worksheet.
**Exporting:**
- `sheet_to_json` converts a worksheet object to an array of JSON objects.
- `sheet_to_csv` generates delimiter-separated-values output.
- `sheet_to_html` generates HTML output.
- `sheet_to_formulae` generates a list of the formulae (with value fallbacks).
These utilities are described in [Utility Functions](#utility-functions) below.
@ -572,6 +587,8 @@ These utilities are described in [Utility Functions](#utility-functions) below.
- `{en,de}code_cell` converts cell addresses
- `{en,de}code_range` converts cell ranges
Utilities are described in the [Utility Functions](#utility-functions) section.
## Common Spreadsheet Format
js-xlsx conforms to the Common Spreadsheet Format (CSF):
@ -979,23 +996,39 @@ objects which have the following properties:
```typescript
type ColInfo = {
/* visibility */
hidden:?boolean; // if true, the column is hidden
hidden?: boolean; // if true, the column is hidden
/* column width is specified in one of the following ways: */
wpx?:number; // width in screen pixels
width:number; // width in Excel's "Max Digit Width", width*256 is integral
wch?:number; // width in characters
wpx?: number; // width in screen pixels
width?: number; // width in Excel's "Max Digit Width", width*256 is integral
wch?: number; // width in characters
/* other fields for preserving features from files */
MDW?:number; // Excel's "Max Digit Width" unit, always integral
MDW?: number; // Excel's "Max Digit Width" unit, always integral
};
```
Excel internally stores column widths in a nebulous "Max Digit Width" form. The
<details>
<summary><b>Why are there three width types?</b> (click to show)</summary>
There are three different width types corresponding to the three different ways
spreadsheets store column widths:
SYLK and other plaintext formats use raw character count. Contemporaneous tools
like Visicalc and Multiplan were character based. Since the characters had the
same width, it sufficed to store a count. This tradition was continued into the
BIFF formats.
SpreadsheetML (2003) tried to align with HTML by standardizing on screen pixel
count throughout the file. Column widths, row heights, and other measures use
pixels. When the pixel and character counts do not align, Excel rounds values.
XLSX internally stores column widths in a nebulous "Max Digit Width" form. The
Max Digit Width is the width of the largest digit when rendered (generally the
"0" character is the widest). The internal width must be an integer multiple of
the width divided by 256. ECMA-376 describes a formula for converting
between pixels and the internal width.
between pixels and the internal width. This represents a hybrid approach.
</details>
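The conversions between the three width types can be sketched as follows. This is a rough illustration of the formulae as described for the `col` element in ECMA-376, assuming a Max Digit Width of 7 pixels. The helper names `char2width`, `width2px` and `px2char` are chosen for this example:

```js
/* assumed Max Digit Width in pixels (depends on the default font) */
var MDW = 7;

/* character count -> internal width (always a multiple of 1/256) */
function char2width(chr) { return Math.floor((chr * MDW + 5) / MDW * 256) / 256; }

/* internal width -> screen pixels */
function width2px(width) { return Math.floor((256 * width + Math.floor(128 / MDW)) / 256 * MDW); }

/* screen pixels -> character count (two decimal places) */
function px2char(px) { return Math.floor((px - 5) / MDW * 100 + 0.5) / 100; }

var width = char2width(8); /* 8 characters wide */
var px = width2px(width);
var wch = px2char(px);
console.log(width, px, wch); // 8.7109375 61 8
```

The 5 pixels in the formulae account for cell padding and gridlines, which is why an 8-character column is wider than 8 times the digit width.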
<details>
<summary><b>Implementation details</b> (click to show)</summary>
@ -1022,11 +1055,11 @@ objects which have the following properties:
```typescript
type RowInfo = {
/* visibility */
hidden:?boolean; // if true, the row is hidden
hidden?: boolean; // if true, the row is hidden
/* row height is specified in one of the following ways: */
hpx?:number; // height in screen pixels
hpt?:number; // height in points
hpx?: number; // height in screen pixels
hpt?: number; // height in points
};
```
@ -1060,7 +1093,6 @@ at index 164. The following example creates a custom format from scratch:
<summary><b>New worksheet with custom format</b> (click to show)</summary>
```js
var tbl = {};
var wb = {
SheetNames: ["Sheet1"],
Sheets: {
@ -1285,6 +1317,24 @@ Plaintext format guessing follows the priority order:
| PRN | (default) |
</details>
<details>
<summary><b>Why are random text files valid?</b> (click to show)</summary>
Excel is extremely aggressive in reading files. Adding an XLS extension to any
display text file (where the only characters are ANSI display chars) tricks
Excel into thinking that the file is potentially a CSV or TSV file, even if it
is only one column! This library attempts to replicate that behavior.
The best approach is to validate the desired worksheet and ensure it has the
expected number of rows or columns. Extracting the range is extremely simple:
```js
var range = XLSX.utils.decode_range(worksheet['!ref']);
var ncols = range.e.c - range.s.c + 1, nrows = range.e.r - range.s.r + 1;
```
</details>
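A validation step along these lines can reject files that parsed "successfully" but do not look like the expected table. The sketch below runs without the library: `decode_a1_range` is a hypothetical stand-in for `XLSX.utils.decode_range` that only handles plain `A1:B2` style references, and `check_dims` is an illustrative helper:

```js
/* hypothetical stand-in for XLSX.utils.decode_range (plain A1-style refs only) */
function decode_a1_range(ref) {
  var parts = ref.split(":");
  if(parts.length == 1) parts.push(parts[0]); /* single cell range */
  function decode_cell(addr) {
    var m = addr.match(/^\$?([A-Z]+)\$?(\d+)$/);
    if(!m) throw new Error("bad cell address: " + addr);
    var c = 0;
    /* column letters are a base-26 number with digits A=1 .. Z=26 */
    for(var i = 0; i < m[1].length; ++i) c = 26 * c + (m[1].charCodeAt(i) - 64);
    return { c: c - 1, r: parseInt(m[2], 10) - 1 }; /* 0-indexed */
  }
  return { s: decode_cell(parts[0]), e: decode_cell(parts[1]) };
}

/* true if the worksheet has at least min_rows rows and min_cols columns */
function check_dims(worksheet, min_rows, min_cols) {
  var range = decode_a1_range(worksheet['!ref']);
  var ncols = range.e.c - range.s.c + 1, nrows = range.e.r - range.s.r + 1;
  return nrows >= min_rows && ncols >= min_cols;
}

var one_column = { '!ref': "A1:A20" }; /* e.g. a random text file */
var table = { '!ref': "A1:D20" };
console.log(check_dims(one_column, 2, 3), check_dims(table, 2, 3)); // false true
```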
## Writing Options
The exported `write` and `writeFile` functions accept an options argument:
@ -1506,6 +1556,29 @@ The `txt` output type uses the tab character as the field separator. If the
codepage library is available (included in the full distribution but not core),
the output will be encoded in codepage `1200` and the BOM will be prepended.
### HTML Output
As an alternative to the `writeFile` HTML type, `XLSX.utils.sheet_to_html` also
produces HTML output. The function takes an options argument:
| Option Name | Default | Description |
| :---------- | :------: | :-------------------------------------------------- |
| editable | false | If true, set `contenteditable="true"` for every TD |
| header | | Override header (default `html body table`) |
| footer | | Override footer (default `/table /body /html`) |
<details>
<summary><b>Examples</b> (click to show)</summary>
For the example sheet:
```js
> console.log(XLSX.utils.sheet_to_html(ws));
// ...
```
</details>
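To show how the three options interact, here is a simplified sketch that works on an array of arrays rather than a worksheet. `aoa_to_html` and `escape_html` are hypothetical helpers written for this example. The default header and footer strings mirror the ones in the source, but merged cells and number formatting are ignored:

```js
var DEFAULT_HEADER = "<html><head><title>SheetJS Table Export</title></head><body><table>";
var DEFAULT_FOOTER = "</table></body></html>";

/* minimal escaping for text placed inside a TD */
function escape_html(s) {
  return String(s).replace(/&/g, "&amp;").replace(/</g, "&lt;")
    .replace(/>/g, "&gt;").replace(/"/g, "&quot;");
}

/* simplified stand-in for sheet_to_html (illustration only) */
function aoa_to_html(rows, opts) {
  var o = opts || {};
  /* editable: every TD gets contenteditable="true", including empty cells */
  var attr = o.editable ? ' contenteditable="true"' : "";
  var body = rows.map(function(row) {
    return "<tr>" + row.map(function(v) {
      return "<td" + attr + ">" + (v == null ? "" : escape_html(v)) + "</td>";
    }).join("") + "</tr>";
  }).join("");
  /* header / footer: an empty string is a valid override, so compare against null */
  var header = o.header != null ? o.header : DEFAULT_HEADER;
  var footer = o.footer != null ? o.footer : DEFAULT_FOOTER;
  return header + body + footer;
}

console.log(aoa_to_html([["a", 1]], { editable: true, header: "<table>", footer: "</table>" }));
// <table><tr><td contenteditable="true">a</td><td contenteditable="true">1</td></tr></table>
```

Overriding `header` and `footer` is useful when the table is injected into an existing page, where a full `html` wrapper would be unwanted.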
### JSON
`XLSX.utils.sheet_to_json` generates different types of JS objects. The function
@ -1926,37 +1999,50 @@ Tests utilize the mocha testing framework. Travis-CI and Sauce Labs links:
Test files are housed in [another repo](https://github.com/SheetJS/test_files).
Running `make init` will refresh the `test_files` submodule and get the files.
Note that this requires `svn`, `git`, `hg` and other commands that may not be
available. If `make init` fails, please download the latest version of the test
files snapshot from [the repo](https://github.com/SheetJS/test_files/releases).
<details>
<summary><b>Latest Snapshot</b> (click to show)</summary>
Latest test files snapshot:
<http://github.com/SheetJS/test_files/releases/download/20170409/test_files.zip>
(download and unzip to the `test_files` subdirectory)
</details>
## Contributing
Due to the precarious nature of the Open Specifications Promise, it is very
important to ensure code is cleanroom. Consult CONTRIBUTING.md.
### Tests
<details>
<summary>(click to show)</summary>
<summary><b>File organization</b> (click to show)</summary>
The `test_misc` target (`make test_misc` on Linux/OSX / `make misc` on Windows)
runs the targeted feature tests. It should take 5-10 seconds to perform feature
tests without testing against the entire test battery. New features should be
accompanied with tests for the relevant file formats and features.
At a high level, the final script is a concatenation of the individual files in
the `bits` folder. Running `make` should reproduce the final output on all
platforms. The README is similarly split into bits in the `docbits` folder.
For tests involving the read side, an appropriate feature test would involve
reading an existing file and checking the resulting workbook object. If a
parameter is involved, files should be read with different values for the param
to verify that the feature is working as expected.
Folders:
For tests involving a new write feature which can already be parsed, appropriate
feature tests would involve writing a workbook with the feature and then opening
and verifying that the feature is preserved.
| folder | contents |
|:-------------|:--------------------------------------------------------------|
| `bits` | raw source files that make up the final script |
| `docbits` | raw markdown files that make up README.md |
| `bin` | server-side bin scripts (`xlsx.njs`) |
| `dist` | dist files for web browsers and nonstandard JS environments |
| `demos` | demo projects for platforms like ExtendScript and Webpack |
| `tests` | browser tests (run `make ctest` to rebuild) |
| `types` | typescript definitions and tests |
| `misc` | miscellaneous supporting scripts |
| `test_files` | test files (pulled from the test files repository) |
For tests involving a new write feature without an existing read ability, please
add a feature test to the kitchen sink `tests/write.js`.
</details>
After cloning the repo, running `make help` will display a list of commands.
### OSX/Linux
<details>
@ -2007,14 +2093,29 @@ make book -- rebuild README and summary
make help -- display this message
```
The normal approach uses a variety of command line tools to grab the test files.
For Windows users, please download the latest version of the test files snapshot
from [GitHub](https://github.com/SheetJS/test_files/releases).
</details>
Latest test files snapshot:
<https://github.com/SheetJS/test_files/releases/download/20170409/test_files.zip>
### Tests
Download and unzip to the `test_files` subdirectory.
<details>
<summary>(click to show)</summary>
The `test_misc` target (`make test_misc` on Linux/OSX / `make misc` on Windows)
runs the targeted feature tests. It should take 5-10 seconds to perform feature
tests without testing against the entire test battery. New features should be
accompanied with tests for the relevant file formats and features.
For tests involving the read side, an appropriate feature test would involve
reading an existing file and checking the resulting workbook object. If a
parameter is involved, files should be read with different values for the param
to verify that the feature is working as expected.
For tests involving a new write feature which can already be parsed, appropriate
feature tests would involve writing a workbook with the feature and then opening
and verifying that the feature is preserved.
For tests involving a new write feature without an existing read ability, please
add a feature test to the kitchen sink `tests/write.js`.
</details>
## License

View File

@ -91,8 +91,8 @@ function wb_fmt() {
}
workbook_formats.forEach(function(m) { if(program[m]) { wb_fmt(); } });
wb_formats_2.forEach(function(m) { if(program[m[0]]) { wb_fmt(); } });
if(seen);
else if(program.formulae) opts.cellFormula = true;
if(seen) {
} else if(program.formulae) opts.cellFormula = true;
else opts.cellFormula = false;
if(program.all) {
@ -107,11 +107,9 @@ if(program.all) {
if(program.sparse) opts.dense = false; else opts.dense = true;
if(program.dev) {
X.verbose = 2;
opts.WTF = true;
wb = X.readFile(filename, opts);
}
else try {
} else try {
wb = X.readFile(filename, opts);
} catch(e) {
var msg = (program.quiet) ? "" : n + ": error parsing ";
@ -151,7 +149,10 @@ if(target_sheet === '') {
var ws;
try {
ws = wb.Sheets[target_sheet];
if(!ws) throw "Sheet " + target_sheet + " cannot be found";
if(!ws) {
console.error("Sheet " + target_sheet + " cannot be found");
process.exit(3);
}
} catch(e) {
console.error(n + ": error parsing "+filename+" "+target_sheet+": " + e);
process.exit(4);
@ -176,7 +177,7 @@ if(program.readOnly) process.exit(0);
var oo = "";
var strm = false;
if(!program.quiet) console.error(target_sheet);
if(program.formulae) oo = X.utils.get_formulae(ws).join("\n");
if(program.formulae) oo = X.utils.sheet_to_formulae(ws).join("\n");
else if(program.json) oo = JSON.stringify(X.utils.sheet_to_json(ws));
else if(program.rawJs) oo = JSON.stringify(X.utils.sheet_to_json(ws,{raw:true}));
else if(program.arrays) oo = JSON.stringify(X.utils.sheet_to_json(ws,{raw:true, header:1}));

View File

@ -51,9 +51,10 @@ var HTML_ = (function() {
function html_to_book(str/*:string*/, opts)/*:Workbook*/ {
return sheet_to_workbook(html_to_sheet(str, opts), opts);
}
function make_html_row(ws/*:Worksheet*/, r/*:Range*/, R/*:number*/, o)/*:string*/ {
function make_html_row(ws/*:Worksheet*/, r/*:Range*/, R/*:number*/, o/*:Sheet2HTMLOpts*/)/*:string*/ {
var M = (ws['!merges'] ||[]);
var oo = [];
var nullcell = "<td" + (o.editable ? ' contenteditable="true"' : "" ) + "></td>";
for(var C = r.s.c; C <= r.e.c; ++C) {
var RS = 0, CS = 0;
for(var j = 0; j < M.length; ++j) {
@ -65,29 +66,36 @@ var HTML_ = (function() {
if(RS < 0) continue;
var coord = encode_cell({r:R,c:C});
var cell = o.dense ? (ws[R]||[])[C] : ws[coord];
if(!cell || cell.v == null) { oo.push("<td></td>"); continue; }
if(!cell || cell.v == null) { oo.push(nullcell); continue; }
/* TODO: html entities */
var w = cell.h || escapexml(cell.w || (format_cell(cell), cell.w) || "");
var sp = {};
if(RS > 1) sp.rowspan = RS;
if(CS > 1) sp.colspan = CS;
if(o.editable) sp.contenteditable = "true";
oo.push(writextag('td', w, sp));
}
return "<tr>" + oo.join("") + "</tr>";
}
function sheet_to_html(ws/*:Worksheet*/, opts/*:Sheet2HTMLOpts*/)/*:string*/ {
var _BEGIN = "<html><head><title>SheetJS Table Export</title></head><body><table>";
var _END = "</table></body></html>";
function sheet_to_html(ws/*:Worksheet*/, opts/*:?Sheet2HTMLOpts*/)/*:string*/ {
var o = opts || {};
var out/*:Array<string>*/ = [];
var r = decode_range(ws['!ref']);
o.dense = Array.isArray(ws);
for(var R = r.s.r; R <= r.e.r; ++R) out.push(make_html_row(ws, r, R, o));
return "<html><body><table>" + out.join("") + "</table></body></html>";
var header = o.header != null ? o.header : _BEGIN;
var footer = o.footer != null ? o.footer : _END;
return header + out.join("") + footer ;
}
return {
to_workbook: html_to_book,
to_sheet: html_to_sheet,
_row: make_html_row,
BEGIN: _BEGIN,
END: _END,
from_sheet: sheet_to_html
};
})();

View File

@ -211,6 +211,7 @@ var utils/*:any*/ = {
table_to_book: table_to_book,
sheet_to_csv: sheet_to_csv,
sheet_to_json: sheet_to_json,
sheet_to_html: HTML_.from_sheet,
sheet_to_formulae: sheet_to_formulae,
sheet_to_row_object_array: sheet_to_json
};

View File

@ -29,22 +29,19 @@ if(has_buf && typeof require != 'undefined') (function() {
return stream;
};
var HTML_BEGIN = "<html><body><table>";
var HTML_END = "</table></body></html>";
var write_html_stream = function(sheet/*:Worksheet*/, opts/*:?Sheet2HTMLOpts*/) {
var stream = Readable();
var o = opts == null ? {} : opts;
var r = decode_range(sheet['!ref']), cell/*:Cell*/;
o.dense = Array.isArray(sheet);
stream.push(HTML_BEGIN);
stream.push(HTML_.BEGIN);
var R = r.s.r;
var end = false;
stream._read = function() {
if(R > r.e.r) {
if(!end) { end = true; stream.push(HTML_END); }
if(!end) { end = true; stream.push(HTML_.END); }
return stream.push(null);
}
while(R <= r.e.r) {

View File

@ -29,6 +29,7 @@ files and output the contents in various formats. The source is available at
Some helper functions in `XLSX.utils` generate different views of the sheets:
- `XLSX.utils.sheet_to_csv` generates CSV
- `XLSX.utils.sheet_to_html` generates HTML
- `XLSX.utils.sheet_to_json` generates an array of objects
- `XLSX.utils.sheet_to_formulae` generates a list of formulae

View File

@ -23,7 +23,7 @@ Note: browser generates binary blob and forces a "download" to client. This
example uses [FileSaver.js](https://github.com/eligrey/FileSaver.js/):
```js
/* bookType can be any supported output type */
/* bookType can be any supported output type */
var wopts = { bookType:'xlsx', bookSST:false, type:'binary' };
var wbout = XLSX.write(workbook,wopts);

View File

@ -5,7 +5,18 @@ take the same arguments as the normal write functions but return a readable
stream. They are only exposed in node.
- `XLSX.stream.to_csv` is the streaming version of `XLSX.utils.sheet_to_csv`.
- `XLSX.stream.to_html` is the streaming version of the HTML output type.
- `XLSX.stream.to_html` is the streaming version of `XLSX.utils.sheet_to_html`.
<details>
<summary><b>nodejs convert to CSV and write file</b> (click to show)</summary>
```js
var output_file_name = "out.csv";
var stream = XLSX.stream.to_csv(worksheet);
stream.pipe(fs.createWriteStream(output_file_name));
```
</details>
<https://github.com/sheetjs/sheetaki> pipes write streams to nodejs response.

View File

@ -35,11 +35,13 @@ Utilities are available in the `XLSX.utils` object:
- `aoa_to_sheet` converts an array of arrays of JS data to a worksheet.
- `json_to_sheet` converts an array of JS objects to a worksheet.
- `table_to_sheet` converts a DOM TABLE element to a worksheet.
**Exporting:**
- `sheet_to_json` converts a worksheet object to an array of JSON objects.
- `sheet_to_csv` generates delimiter-separated-values output.
- `sheet_to_html` generates HTML output.
- `sheet_to_formulae` generates a list of the formulae (with value fallbacks).
These utilities are described in [Utility Functions](#utility-functions) below.
@ -52,3 +54,5 @@ These utilities are described in [Utility Functions](#utility-functions) below.
- `{en,de}code_cell` converts cell addresses
- `{en,de}code_range` converts cell ranges
Utilities are described in the [Utility Functions](#utility-functions) section.

View File

@ -6,23 +6,39 @@ objects which have the following properties:
```typescript
type ColInfo = {
/* visibility */
hidden:?boolean; // if true, the column is hidden
hidden?: boolean; // if true, the column is hidden
/* column width is specified in one of the following ways: */
wpx?:number; // width in screen pixels
width?:number; // width in Excel's "Max Digit Width", width*256 is integral
wch?:number; // width in characters
wpx?: number; // width in screen pixels
width?: number; // width in Excel's "Max Digit Width", width*256 is integral
wch?: number; // width in characters
/* other fields for preserving features from files */
MDW?:number; // Excel's "Max Digit Width" unit, always integral
MDW?: number; // Excel's "Max Digit Width" unit, always integral
};
```
Excel internally stores column widths in a nebulous "Max Digit Width" form. The
<details>
<summary><b>Why are there three width types?</b> (click to show)</summary>
There are three different width types corresponding to the three different ways
spreadsheets store column widths:
SYLK and other plaintext formats use raw character count. Contemporaneous tools
like Visicalc and Multiplan were character based. Since the characters had the
same width, it sufficed to store a count. This tradition was continued into the
BIFF formats.
SpreadsheetML (2003) tried to align with HTML by standardizing on screen pixel
count throughout the file. Column widths, row heights, and other measures use
pixels. When the pixel and character counts do not align, Excel rounds values.
XLSX internally stores column widths in a nebulous "Max Digit Width" form. The
Max Digit Width is the width of the largest digit when rendered (generally the
"0" character is the widest). The internal width must be an integer multiple of
the width divided by 256. ECMA-376 describes a formula for converting
between pixels and the internal width.
between pixels and the internal width. This represents a hybrid approach.
</details>
<details>
<summary><b>Implementation details</b> (click to show)</summary>
@ -49,11 +65,11 @@ objects which have the following properties:
```typescript
type RowInfo = {
/* visibility */
hidden?:boolean; // if true, the row is hidden
hidden?: boolean; // if true, the row is hidden
/* row height is specified in one of the following ways: */
hpx?:number; // height in screen pixels
hpt?:number; // height in points
hpx?: number; // height in screen pixels
hpt?: number; // height in points
};
```

View File

@ -14,7 +14,6 @@ at index 164. The following example creates a custom format from scratch:
<summary><b>New worksheet with custom format</b> (click to show)</summary>
```js
var tbl = {};
var wb = {
SheetNames: ["Sheet1"],
Sheets: {

View File

@ -90,3 +90,21 @@ Plaintext format guessing follows the priority order:
| PRN | (default) |
</details>
<details>
<summary><b>Why are random text files valid?</b> (click to show)</summary>
Excel is extremely aggressive in reading files. Adding an XLS extension to any
display text file (where the only characters are ANSI display chars) tricks
Excel into thinking that the file is potentially a CSV or TSV file, even if it
is only one column! This library attempts to replicate that behavior.
The best approach is to validate the desired worksheet and ensure it has the
expected number of rows or columns. Extracting the range is extremely simple:
```js
var range = XLSX.utils.decode_range(worksheet['!ref']);
var ncols = range.e.c - range.s.c + 1, nrows = range.e.r - range.s.r + 1;
```
</details>

View File

@ -152,6 +152,29 @@ The `txt` output type uses the tab character as the field separator. If the
codepage library is available (included in the full distribution but not core),
the output will be encoded in codepage `1200` and the BOM will be prepended.
### HTML Output
As an alternative to the `writeFile` HTML type, `XLSX.utils.sheet_to_html` also
produces HTML output. The function takes an options argument:
| Option Name | Default | Description |
| :---------- | :------: | :-------------------------------------------------- |
| editable | false | If true, set `contenteditable="true"` for every TD |
| header | | Override header (default `html body table`) |
| footer | | Override footer (default `/table /body /html`) |
<details>
<summary><b>Examples</b> (click to show)</summary>
For the example sheet:
```js
> console.log(XLSX.utils.sheet_to_html(ws));
// ...
```
</details>
### JSON
`XLSX.utils.sheet_to_json` generates different types of JS objects. The function

View File

@ -84,6 +84,17 @@ Tests utilize the mocha testing framework. Travis-CI and Sauce Labs links:
Test files are housed in [another repo](https://github.com/SheetJS/test_files).
Running `make init` will refresh the `test_files` submodule and get the files.
Note that this requires `svn`, `git`, `hg` and other commands that may not be
available. If `make init` fails, please download the latest version of the test
files snapshot from [the repo](https://github.com/SheetJS/test_files/releases).
<details>
<summary><b>Latest Snapshot</b> (click to show)</summary>
Latest test files snapshot:
<http://github.com/SheetJS/test_files/releases/download/20170409/test_files.zip>
(download and unzip to the `test_files` subdirectory)
</details>

View File

@ -3,29 +3,31 @@
Due to the precarious nature of the Open Specifications Promise, it is very
important to ensure code is cleanroom. Consult CONTRIBUTING.md.
### Tests
<details>
<summary>(click to show)</summary>
<summary><b>File organization</b> (click to show)</summary>
The `test_misc` target (`make test_misc` on Linux/OSX / `make misc` on Windows)
runs the targeted feature tests. It should take 5-10 seconds to perform feature
tests without testing against the entire test battery. New features should be
accompanied with tests for the relevant file formats and features.
At a high level, the final script is a concatenation of the individual files in
the `bits` folder. Running `make` should reproduce the final output on all
platforms. The README is similarly split into bits in the `docbits` folder.
For tests involving the read side, an appropriate feature test would involve
reading an existing file and checking the resulting workbook object. If a
parameter is involved, files should be read with different values for the param
to verify that the feature is working as expected.
Folders:
For tests involving a new write feature which can already be parsed, appropriate
feature tests would involve writing a workbook with the feature and then opening
and verifying that the feature is preserved.
| folder | contents |
|:-------------|:--------------------------------------------------------------|
| `bits` | raw source files that make up the final script |
| `docbits` | raw markdown files that make up README.md |
| `bin` | server-side bin scripts (`xlsx.njs`) |
| `dist` | dist files for web browsers and nonstandard JS environments |
| `demos` | demo projects for platforms like ExtendScript and Webpack |
| `tests` | browser tests (run `make ctest` to rebuild) |
| `types` | typescript definitions and tests |
| `misc` | miscellaneous supporting scripts |
| `test_files` | test files (pulled from the test files repository) |
For tests involving a new write feature without an existing read ability, please
add a feature test to the kitchen sink `tests/write.js`.
</details>
After cloning the repo, running `make help` will display a list of commands.
### OSX/Linux
<details>
@ -76,13 +78,28 @@ make book -- rebuild README and summary
make help -- display this message
```
The normal approach uses a variety of command line tools to grab the test files.
For Windows users, please download the latest version of the test files snapshot
from [GitHub](https://github.com/SheetJS/test_files/releases).
Latest test files snapshot:
<https://github.com/SheetJS/test_files/releases/download/20170409/test_files.zip>
Download and unzip to the `test_files` subdirectory.
</details>
### Tests
<details>
<summary>(click to show)</summary>
The `test_misc` target (`make test_misc` on Linux/OSX / `make misc` on Windows)
runs the targeted feature tests. It should take 5-10 seconds to perform feature
tests without testing against the entire test battery. New features should be
accompanied with tests for the relevant file formats and features.
For tests involving the read side, an appropriate feature test would involve
reading an existing file and checking the resulting workbook object. If a
parameter is involved, files should be read with different values for the param
to verify that the feature is working as expected.
For tests involving a new write feature which can already be parsed, appropriate
feature tests would involve writing a workbook with the feature and then opening
and verifying that the feature is preserved.
For tests involving a new write feature without an existing read ability, please
add a feature test to the kitchen sink `tests/write.js`.
</details>

View File

@ -9,7 +9,7 @@ Dual licenced under the MIT license or GPLv3. See https://raw.github.com/Stuk/js
JSZip uses the library pako released under the MIT license :
https://github.com/nodeca/pako/blob/master/LICENSE
*/
!function(e){
(function(e){
if("object"==typeof exports&&"undefined"!=typeof module)module.exports=e();
else if("function"==typeof define&&define.amd){JSZip=e();define([],e);}
else{
@ -8985,4 +8985,4 @@ function ZStream() {
module.exports = ZStream;
},{}]},{},[9])
(9)
});
}));

View File

@ -51,6 +51,7 @@
* [Formulae Output](README.md#formulae-output)
* [Delimiter-Separated Output](README.md#delimiter-separated-output)
+ [UTF-16 Unicode Text](README.md#utf-16-unicode-text)
* [HTML Output](README.md#html-output)
* [JSON](README.md#json)
- [File Formats](README.md#file-formats)
* [Excel 2007+ XML (XLSX/XLSM)](README.md#excel-2007-xml-xlsxxlsm)
@ -76,8 +77,8 @@
* [Tested Environments](README.md#tested-environments)
* [Test Files](README.md#test-files)
- [Contributing](README.md#contributing)
* [Tests](README.md#tests)
* [OSX/Linux](README.md#osxlinux)
* [Windows](README.md#windows)
* [Tests](README.md#tests)
- [License](README.md#license)
- [References](README.md#references)

View File

@ -29,8 +29,10 @@
"mocha":"",
"xlsjs":"",
"@sheetjs/uglify-js":"",
"@types/node":"",
"@types/commander":"",
"dtslint": "^0.1.2",
"typescript": "^2.2.0"
"typescript": "2.2.0"
},
"repository": { "type":"git", "url":"git://github.com/SheetJS/js-xlsx.git" },
"scripts": {

View File

@ -1,25 +0,0 @@
var XLSX = require('../');
var tests = {
'should be able to open workbook': function (file) {
var xlsx = XLSX.readFile('tests/files/' + file);
expect(xlsx).toBeTruthy();
expect(xlsx).toEqual(jasmine.any(Object));
},
'should define all api properties correctly': function (file) {
var xlsx = XLSX.readFile('tests/files/' + file);
expect(xlsx.Workbook).toEqual(jasmine.any(Object));
expect(xlsx.Props).toBeDefined();
expect(xlsx.Deps).toBeDefined();
expect(xlsx.Sheets).toEqual(jasmine.any(Object));
expect(xlsx.SheetNames).toEqual(jasmine.any(Array));
expect(xlsx.Strings).toBeDefined();
expect(xlsx.Styles).toBeDefined();
}
};
module.exports = function (file) {
for (var key in tests) {
it(key, tests[key].bind(undefined, file));
}
};

View File

@ -1,8 +0,0 @@
var XLSX = require('../');
var testCommon = require('./Common.js');
var file = 'ישוב_נקודות_זיכוי.xlsx';
describe(file, function () {
testCommon(file);
});

View File

@ -1,9 +0,0 @@
var XLSX = require('../');
var testCommon = require('./Common.js');
var file = 'formula_stress_test.xlsx';
describe(file, function () {
// Opening the file currently crashes node
//testCommon(file);
});

View File

@ -1,8 +0,0 @@
var XLSX = require('../');
var testCommon = require('./Common.js');
var file = 'interview.xlsx';
describe(file, function () {
testCommon(file);
});

View File

@ -1,8 +0,0 @@
var XLSX = require('../');
var testCommon = require('./Common.js');
var file = 'issue.xlsx';
describe(file, function () {
testCommon(file);
});

View File

@ -1,8 +0,0 @@
var XLSX = require('../');
var testCommon = require('./Common.js');
var file = 'mixed_sheets.xlsx';
describe(file, function () {
testCommon(file);
});

View File

@ -1,8 +0,0 @@
var XLSX = require('../');
var testCommon = require('./Common.js');
var file = 'named_ranges_2011.xlsx';
describe(file, function () {
testCommon(file);
});

3
types/Makefile Normal file
View File

@ -0,0 +1,3 @@
.PHONY: tslint
tslint:
@make -C.. tslint

195
types/bin_xlsx.ts Executable file
View File

@ -0,0 +1,195 @@
/* xlsx.js (C) 2013-present SheetJS -- http://sheetjs.com */
/* eslint-env node */
const n = "xlsx";
/* vim: set ts=2 ft=javascript: */
import XLSX = require("xlsx");
import 'exit-on-epipe';
import * as fs from 'fs';
import program = require('commander');
program
.version(XLSX.version)
.usage('[options] <file> [sheetname]')
.option('-f, --file <file>', 'use specified workbook')
.option('-s, --sheet <sheet>', 'print specified sheet (default first sheet)')
.option('-N, --sheet-index <idx>', 'use specified sheet index (0-based)')
.option('-p, --password <pw>', 'if file is encrypted, try with specified pw')
.option('-l, --list-sheets', 'list sheet names and exit')
.option('-o, --output <file>', 'output to specified file')
.option('-B, --xlsb', 'emit XLSB to <sheetname> or <file>.xlsb')
.option('-M, --xlsm', 'emit XLSM to <sheetname> or <file>.xlsm')
.option('-X, --xlsx', 'emit XLSX to <sheetname> or <file>.xlsx')
.option('-Y, --ods', 'emit ODS to <sheetname> or <file>.ods')
.option('-2, --biff2','emit XLS to <sheetname> or <file>.xls (BIFF2)')
.option('-6, --xlml', 'emit SSML to <sheetname> or <file>.xls (2003 XML)')
.option('-T, --fods', 'emit FODS to <sheetname> or <file>.fods (Flat ODS)')
.option('-S, --formulae', 'print formulae')
.option('-j, --json', 'emit formatted JSON (all fields text)')
.option('-J, --raw-js', 'emit raw JS object (raw numbers)')
.option('-A, --arrays', 'emit rows as JS arrays (raw numbers)')
.option('-H, --html', 'emit HTML')
.option('-D, --dif', 'emit data interchange format (dif)')
.option('-K, --sylk', 'emit symbolic link (sylk)')
.option('-P, --prn', 'emit formatted text (prn)')
.option('-t, --txt', 'emit delimited text (txt)')
.option('-F, --field-sep <sep>', 'CSV field separator', ",")
.option('-R, --row-sep <sep>', 'CSV row separator', "\n")
.option('-n, --sheet-rows <num>', 'Number of rows to process (0=all rows)')
.option('--sst', 'generate shared string table for XLS* formats')
.option('--compress', 'use compression when writing XLSX/M/B and ODS')
.option('--read-only', 'do not generate output')
.option('--all', 'parse everything; write as much as possible')
.option('--dev', 'development mode')
.option('--read', 'read but do not print out contents')
.option('-q, --quiet', 'quiet mode');
program.on('--help', function() {
console.log(' Default output format is CSV');
console.log(' Support email: dev@sheetjs.com');
console.log(' Web Demo: http://oss.sheetjs.com/js-'+n+'/');
});
/* output formats, update list with full option name */
const workbook_formats = ['xlsx', 'xlsm', 'xlsb', 'ods', 'fods'];
/* flag, bookType, default ext */
const wb_formats_2 = [
['xlml', 'xlml', 'xls']
];
program.parse(process.argv);
let filename = '', sheetname = '';
if(program.args[0]) {
filename = program.args[0];
if(program.args[1]) sheetname = program.args[1];
}
if(program.sheet) sheetname = program.sheet;
if(program.file) filename = program.file;
if(!filename) {
console.error(n + ": must specify a filename");
process.exit(1);
}
/*:: if(filename) { */
if(!fs.existsSync(filename)) {
console.error(n + ": " + filename + ": No such file or directory");
process.exit(2);
}
let opts: XLSX.ParsingOptions = {};
let wb: XLSX.WorkBook;
if(program.listSheets) opts.bookSheets = true;
if(program.sheetRows) opts.sheetRows = program.sheetRows;
if(program.password) opts.password = program.password;
let seen = false;
function wb_fmt() {
seen = true;
opts.cellFormula = true;
opts.cellNF = true;
if(program.output) sheetname = program.output;
}
workbook_formats.forEach(function(m) { if(program[m]) { wb_fmt(); } });
wb_formats_2.forEach(function(m) { if(program[m[0]]) { wb_fmt(); } });
if(seen) {
} else if(program.formulae) opts.cellFormula = true;
else opts.cellFormula = false;
if(program.all) {
opts.cellFormula = true;
opts.bookVBA = true;
opts.cellNF = true;
opts.cellHTML = true;
opts.cellStyles = true;
opts.sheetStubs = true;
opts.cellDates = true;
}
if(program.dev) {
opts.WTF = true;
wb = XLSX.readFile(filename, opts);
} else try {
wb = XLSX.readFile(filename, opts);
} catch(e) {
let msg = (program.quiet) ? "" : n + ": error parsing ";
msg += filename + ": " + e;
console.error(msg);
process.exit(3);
}
if(program.read) process.exit(0);
/*:: if(wb) { */
if(program.listSheets) {
console.log((wb.SheetNames||[]).join("\n"));
process.exit(0);
}
let wopts: XLSX.WritingOptions = ({WTF:opts.WTF, bookSST:program.sst}/*:any*/);
if(program.compress) wopts.compression = true;
/* full workbook formats */
workbook_formats.forEach(function(m) { if(program[m]) {
XLSX.writeFile(wb, sheetname || ((filename || "") + "." + m), wopts);
process.exit(0);
} });
wb_formats_2.forEach(function(m) { if(program[m[0]]) {
wopts.bookType = <XLSX.BookType>(m[1]);
XLSX.writeFile(wb, sheetname || ((filename || "") + "." + m[2]), wopts);
process.exit(0);
} });
let target_sheet = sheetname || '';
if(target_sheet === '') {
if(program.sheetIndex < (wb.SheetNames||[]).length) target_sheet = wb.SheetNames[program.sheetIndex];
else target_sheet = (wb.SheetNames||[""])[0];
}
let ws: XLSX.WorkSheet;
try {
ws = wb.Sheets[target_sheet];
if(!ws) {
console.error("Sheet " + target_sheet + " cannot be found");
process.exit(3);
}
} catch(e) {
console.error(n + ": error parsing "+filename+" "+target_sheet+": " + e);
process.exit(4);
}
if(program.readOnly) process.exit(0);
/* single worksheet formats */
[
['biff2', '.xls'],
['sylk', '.slk'],
['html', '.html'],
['prn', '.prn'],
['txt', '.txt'],
['dif', '.dif']
].forEach(function(m) { if(program[m[0]]) {
wopts.bookType = <XLSX.BookType>(m[0]);
XLSX.writeFile(wb, sheetname || ((filename || "") + m[1]), wopts);
process.exit(0);
} });
let oo = "";
let strm = false;
if(!program.quiet) console.error(target_sheet);
if(program.formulae) oo = XLSX.utils.sheet_to_formulae(ws).join("\n");
else if(program.json) oo = JSON.stringify(XLSX.utils.sheet_to_json(ws));
else if(program.rawJs) oo = JSON.stringify(XLSX.utils.sheet_to_json(ws,{raw:true}));
else if(program.arrays) oo = JSON.stringify(XLSX.utils.sheet_to_json(ws,{raw:true, header:1}));
else {
strm = true;
let stream: NodeJS.ReadableStream = XLSX.stream.to_csv(ws, {FS:program.fieldSep, RS:program.rowSep});
if(program.output) stream.pipe(fs.createWriteStream(program.output));
else stream.pipe(process.stdout);
}
if(!strm) {
if(program.output) fs.writeFileSync(program.output, oo);
else console.log(oo);
}
/*:: } */
/*:: } */

406
types/index.d.ts vendored

@ -1,28 +1,50 @@
/* index.d.ts (C) 2015-present SheetJS and contributors */
// TypeScript Version: 2.2
/** Version string */
export const version: string;
/** Attempts to read filename and parse */
export function readFile(filename: string, opts?: ParsingOptions): WorkBook;
/** Attempts to parse data */
export function read(data: any, opts?: ParsingOptions): WorkBook;
/** Attempts to write workbook data to filename */
/** NODE ONLY! Attempts to write workbook data to filename */
export function writeFile(data: WorkBook, filename: string, opts?: WritingOptions): any;
/** Attempts to write the workbook data */
export function write(data: WorkBook, opts?: WritingOptions): any;
export const utils: Utils;
export const stream: StreamUtils;
/** Number Format (either a string or an index to the format table) */
export type NumberFormat = string | number;
/** Basic File Properties */
export interface Properties {
/** Summary tab "Title" */
Title?: string;
/** Summary tab "Subject" */
Subject?: string;
/** Summary tab "Author" */
Author?: string;
/** Summary tab "Manager" */
Manager?: string;
/** Summary tab "Company" */
Company?: string;
/** Summary tab "Category" */
Category?: string;
/** Summary tab "Keywords" */
Keywords?: string;
/** Summary tab "Comments" */
Comments?: string;
/** Statistics tab "Last saved by" */
LastAuthor?: string;
/** Statistics tab "Created" */
CreatedDate?: Date;
}
/** Other supported properties */
export interface FullProperties extends Properties {
ModifiedDate?: Date;
Application?: string;
AppVersion?: string;
@ -33,13 +55,33 @@ export interface Properties {
ScaleCrop?: boolean;
Worksheets?: number;
SheetNames?: string[];
ContentStatus?: string;
LastPrinted?: string;
Revision?: string | number;
Version?: string;
Identifier?: string;
Language?: string;
}
export interface ParsingOptions {
export interface CommonOptions {
/**
* Input data encoding
* If true, throw errors when features are not understood
* @default false
*/
type?: 'base64' | 'binary' | 'buffer' | 'array' | 'file';
WTF?: boolean;
/**
* When reading a file, store dates as type d (default is n)
* When writing XLSX/XLSM file, use native date (default uses date codes)
* @default false
*/
cellDates?: boolean;
}
/** Options for read and readFile */
export interface ParsingOptions extends CommonOptions {
/** Input data encoding */
type?: 'base64' | 'binary' | 'buffer' | 'file' | 'array';
/**
* Save formulae to the .f field
@ -66,10 +108,13 @@ export interface ParsingOptions {
cellStyles?: boolean;
/**
* Store dates as type d (default is n)
* @default false
* Generate formatted text to the .w field
* @default true
*/
cellDates?: boolean;
cellText?: boolean;
/** Override default date format (code 14) */
dateNF?: string;
/**
* Create cell objects for stub cells
@ -120,18 +165,11 @@ export interface ParsingOptions {
password?: string;
}
export interface WritingOptions {
/**
* Output data encoding
*/
/** Options for write and writeFile */
export interface WritingOptions extends CommonOptions {
/** Output data encoding */
type?: 'base64' | 'binary' | 'buffer' | 'file';
/**
* Store dates as type d (default is n)
* @default false
*/
cellDates?: boolean;
/**
* Generate Shared String Table
* @default false
@ -139,13 +177,13 @@ export interface WritingOptions {
bookSST?: boolean;
/**
* Type of Workbook
* File format of generated workbook
* @default 'xlsx'
*/
bookType?: 'xlsx' | 'xlsm' | 'xlsb' | 'biff2' | 'xlml' | 'ods' | 'fods' | 'csv' | 'txt' | 'sylk' | 'html' | 'dif' | 'prn';
bookType?: BookType;
/**
* Name of Worksheet for single-sheet formats
* Name of Worksheet (for single-sheet formats)
* @default ''
*/
sheet?: string;
@ -155,8 +193,12 @@ export interface WritingOptions {
* @default false
*/
compression?: boolean;
/** Override workbook properties on save */
Props?: Properties;
}
/** Workbook Object */
export interface WorkBook {
/**
* A dictionary of the worksheets in the workbook.
@ -164,59 +206,72 @@ export interface WorkBook {
*/
Sheets: { [sheet: string]: WorkSheet };
/**
* ordered list of the sheet names in the workbook
*/
/** Ordered list of the sheet names in the workbook */
SheetNames: string[];
/**
* an object storing the standard properties. wb.Custprops stores custom properties.
* Since the XLS standard properties deviate from the XLSX standard, XLS parsing stores core properties in both places.
*/
Props?: Properties;
Props?: FullProperties;
Workbook?: WBProps;
}
export interface SheetProps {
/** Sheet Visibility (0=Visible 1=Hidden 2=VeryHidden) */
Hidden?: 0 | 1 | 2;
}
export interface DefinedName {
Name: string;
Ref: string;
Sheet?: number;
Comment?: string;
}
/** Workbook-Level Attributes */
export interface WBProps {
Sheets?: any[];
/** Sheet Properties */
Sheets?: SheetProps[];
/** Defined Names */
Names?: DefinedName[];
}
export interface ColInfo {
/**
* Excel's "Max Digit Width" unit, always integral
*/
MDW?: number;
/**
* width in Excel's "Max Digit Width", width*256 is integral
*/
width?: number;
/**
* width in screen pixels
*/
wpx?: number;
/**
* intermediate character calculation
*/
wch?: number;
/**
* if true, the column is hidden
*/
/* --- visibility --- */
/** if true, the column is hidden */
hidden?: boolean;
/* --- column width --- */
/** width in Excel's "Max Digit Width", width*256 is integral */
width?: number;
/** width in screen pixels */
wpx?: number;
/** width in "characters" */
wch?: number;
/** Excel's "Max Digit Width" unit, always integral */
MDW?: number;
}
export interface RowInfo {
/**
* height in screen pixels
*/
hpx?: number;
/**
* height in points
*/
hpt?: number;
/**
* if true, the column is hidden
*/
/* --- visibility --- */
/** if true, the row is hidden */
hidden?: boolean;
/* --- row height --- */
/** height in screen pixels */
hpx?: number;
/** height in points */
hpt?: number;
}
/**
@ -305,31 +360,62 @@ export interface ProtectInfo {
scenarios?: boolean;
}
/**
* object representing any sheet (worksheet or chartsheet)
*/
export interface Sheet {
'!ref'?: string;
'!margins'?: {
left: number,
right: number,
top: number,
bottom: number,
header: number,
footer: number,
};
/** Page Margins -- see Excel Page Setup .. Margins diagram for explanation */
export interface MarginInfo {
/** Left side margin (inches) */