Improve documentation regarding !ref #82

Closed
opened 2014-07-09 02:11:22 +00:00 by lszhu · 12 comments
lszhu commented 2014-07-09 02:11:22 +00:00 (Migrated from github.com)

when I create a sheet with 49 columns, it can only fill the first 20000 rows.
this is the code:

var xlsx = require('xlsx');
var workbook = xlsx.readFile('template.xlsx');

var alpha = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
var sheet = workbook.Sheets[workbook.SheetNames[0]];
//console.log(sheet.A1.v);
console.log(new Date());
for (var i = 2; i < 100000; i++) {
    for (var j = 0; j < 26; j++) {
        var str = Math.floor(Math.random() * 1E+10).toString();
        str += Math.floor(Math.random() * 1E+10).toString();
        sheet[alpha[j] + i] = {v: str, t: 's'};
    }
    for (; j < 49; j++) {
        var str = Math.floor(Math.random() * 1E+10).toString();
        str += Math.floor(Math.random() * 1E+10).toString();
        sheet['A' + alpha[j - 26] + i] = {v: str, t: 's'};
    }
}
xlsx.writeFile(workbook, 'rand.xlsx');
console.log(new Date());
when I create a sheet with 49 columns, it can only fill the first 20000 rows. this is the code: ``` var xlsx = require('xlsx'); var workbook = xlsx.readFile('template.xlsx'); var alpha = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'; var sheet = workbook.Sheets[workbook.SheetNames[0]]; //console.log(sheet.A1.v); console.log(new Date()); for (var i = 2; i < 100000; i++) { for (var j = 0; j < 26; j++) { var str = Math.floor(Math.random() * 1E+10).toString(); str += Math.floor(Math.random() * 1E+10).toString(); sheet[alpha[j] + i] = {v: str, t: 's'}; } for (; j < 49; j++) { var str = Math.floor(Math.random() * 1E+10).toString(); str += Math.floor(Math.random() * 1E+10).toString(); sheet['A' + alpha[j - 26] + i] = {v: str, t: 's'}; } } xlsx.writeFile(workbook, 'rand.xlsx'); console.log(new Date()); ```
SheetJSDev commented 2014-07-09 02:33:50 +00:00 (Migrated from github.com)

I ran your script against a really dumb template and it took roughly 2 minutes:

$ node --version
v0.10.29
$ xlsx --version
0.7.7
$ cat t.js
var xlsx = require('xlsx');
var workbook = xlsx.readFile('template.xlsx');

var alpha = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
var sheet = workbook.Sheets[workbook.SheetNames[0]];
//console.log(sheet.A1.v);
console.log(new Date());
for (var i = 2; i < 100000; i++) {
    for (var j = 0; j < 26; j++) {
        var str = Math.floor(Math.random() * 1E+10).toString();
        str += Math.floor(Math.random() * 1E+10).toString();
        sheet[alpha[j] + i] = {v: str, t: 's'};
    }
    for (; j < 49; j++) {
        var str = Math.floor(Math.random() * 1E+10).toString();
        str += Math.floor(Math.random() * 1E+10).toString();
        sheet['A' + alpha[j - 26] + i] = {v: str, t: 's'};
    }
}
console.log(new Date());
sheet['!ref'] = "A1:AW100000"; // <-- i added this line to force it to write all of the cells
xlsx.writeFile(workbook, 'rand.xlsx');
console.log(new Date());
$ node t.js
Tue Jul 08 2014 22:24:55 GMT-0400 (EDT)
Tue Jul 08 2014 22:25:10 GMT-0400 (EDT)
Tue Jul 08 2014 22:26:47 GMT-0400 (EDT)

This is the generated file -- I warn you, it is big! The file is 262.5MB, which may explain why you are seeing issues? On a side note, it took more time to upload to dropbox than to generate the file.

Can you test the following:

  1. Add the line with the comment to your script and run

  2. Change the template so that it only has the header row, then run the script from above

  3. Generate CSV instead of XLSX

Also, can you share the template?

I ran your script against a really dumb template and it took roughly 2 minutes: ``` $ node --version v0.10.29 $ xlsx --version 0.7.7 $ cat t.js var xlsx = require('xlsx'); var workbook = xlsx.readFile('template.xlsx'); var alpha = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'; var sheet = workbook.Sheets[workbook.SheetNames[0]]; //console.log(sheet.A1.v); console.log(new Date()); for (var i = 2; i < 100000; i++) { for (var j = 0; j < 26; j++) { var str = Math.floor(Math.random() * 1E+10).toString(); str += Math.floor(Math.random() * 1E+10).toString(); sheet[alpha[j] + i] = {v: str, t: 's'}; } for (; j < 49; j++) { var str = Math.floor(Math.random() * 1E+10).toString(); str += Math.floor(Math.random() * 1E+10).toString(); sheet['A' + alpha[j - 26] + i] = {v: str, t: 's'}; } } console.log(new Date()); sheet['!ref'] = "A1:AW100000"; // <-- i added this line to force it to write all of the cells xlsx.writeFile(workbook, 'rand.xlsx'); console.log(new Date()); $ node t.js Tue Jul 08 2014 22:24:55 GMT-0400 (EDT) Tue Jul 08 2014 22:25:10 GMT-0400 (EDT) Tue Jul 08 2014 22:26:47 GMT-0400 (EDT) ``` [This is the generated file](https://www.dropbox.com/s/jogizngjalucvl5/rand.xlsx) -- I warn you, it is big! The file is 262.5MB, which may explain why you are seeing issues? On a side note, it took more time to upload to dropbox than to generate the file. Can you test the following: 1) Add the line with the comment to your script and run 2) Change the template so that it only has the header row, then run the script from above 3) Generate CSV instead of XLSX Also, can you share the template?
lszhu commented 2014-07-09 03:01:45 +00:00 (Migrated from github.com)

Thank you , it's ok now with the line you added.
but can you explain the line you add in detail? I can't find any documents yet.

Thank you , it's ok now with the line you added. but can you explain the line you add in detail? I can't find any documents yet.
SheetJSDev commented 2014-07-09 03:11:59 +00:00 (Migrated from github.com)

If this explanation makes sense, I'll add it to the readme:

The keys of a worksheet object are A-1 style references to the cell. For example, to get the cell object for B2, you would just access worksheet.B2 or worksheet['B2'].

There are special keys (all starting with !) that correspond to sheet metadata. In particular, !ref is the A-1 style cell range for the worksheet. For example, if the worksheet's range is A1:B2, there are 4 cells (A1, B1, A2, B2). The output functions iterate through the rows and columns as specified in the range. The relevant code segment is in the write_ws_xml_data function: https://github.com/SheetJS/js-xlsx/blob/master/bits/67_wsxml.js#L233

    ... range = safe_decode_range(ws['!ref']) ... // <-- this decodes the range into a range object
    for(var R = range.s.r; R <= range.e.r; ++R) { // <-- walk through each row.  range.s is the start cell (upper-left corner) and range.e is the end cell (lower-right corner)
...
        for(var C = range.s.c; C <= range.e.c; ++C) { // <-- walk through each column

I suspect your template had 20000 rows (or at least, that was the internal range storage). Your original script added cells but didn't update the range, which is why the output functions ignored them. If you want to see the range from your template, print it in the script:

var xlsx = require('xlsx');
var workbook = xlsx.readFile('template.xlsx');
var sheet = workbook.Sheets[workbook.SheetNames[0]];
console.log(sheet["!ref"]);
If this explanation makes sense, I'll add it to the readme: The keys of a worksheet object are A-1 style references to the cell. For example, to get the cell object for B2, you would just access `worksheet.B2` or `worksheet['B2']`. There are special keys (all starting with `!`) that correspond to sheet metadata. In particular, `!ref` is the A-1 style cell range for the worksheet. For example, if the worksheet's range is `A1:B2`, there are 4 cells (A1, B1, A2, B2). The output functions iterate through the rows and columns as specified in the range. The relevant code segment is in the `write_ws_xml_data` function: https://github.com/SheetJS/js-xlsx/blob/master/bits/67_wsxml.js#L233 ``` ... range = safe_decode_range(ws['!ref']) ... // <-- this decodes the range into a range object for(var R = range.s.r; R <= range.e.r; ++R) { // <-- walk through each row. range.s is the start cell (upper-left corner) and range.e is the end cell (lower-right corner) ... for(var C = range.s.c; C <= range.e.c; ++C) { // <-- walk through each column ``` I suspect your template had 20000 rows (or at least, that was the internal range storage). Your original script added cells but didn't update the range, which is why the output functions ignored them. If you want to see the range from your template, print it in the script: ``` var xlsx = require('xlsx'); var workbook = xlsx.readFile('template.xlsx'); var sheet = workbook.Sheets[workbook.SheetNames[0]]; console.log(sheet["!ref"]); ```
lszhu commented 2014-07-09 04:03:17 +00:00 (Migrated from github.com)

awesome!
my template was saved from an xls file which had 20000 rows.
the codes are really long and complex, a more detailed guide is preferred, thanks again.

awesome! my template was saved from an xls file which had 20000 rows. the codes are really long and complex, a more detailed guide is preferred, thanks again.
SheetJSDev commented 2014-07-09 04:08:05 +00:00 (Migrated from github.com)

@lszhu heh if you think this is complex, the XLS parsing is much more intense

@lszhu heh if you think this is complex, the [XLS parsing](https://github.com/SheetJS/js-xls) is much more intense
SheetJSDev commented 2014-08-26 18:00:00 +00:00 (Migrated from github.com)

@lszhu added a short explanation of !ref in the README: https://github.com/SheetJS/js-xlsx#worksheet-object

@lszhu added a short explanation of `!ref` in the README: https://github.com/SheetJS/js-xlsx#worksheet-object
vikash52 commented 2015-04-08 16:59:06 +00:00 (Migrated from github.com)

I am trying to convert html table to xls and its getting converted to xsl successfully. However, I am facing issues with formula. Formula in the html table is not visible in xsl file, I just see a value at the place of formula.

Refering this example https://github.com/SheetJS/js-xlsx/blob/master/tests/write.js , it doesn't have any implementation for cell formula in xlsx spreadsheet I tried using cell.f = "=SUM(A1+B1)" for the cell C1 and cell.v as the summation value which was 3. But i didn't succeed. With the exported file, when opened in MS excel, the cell contained just the data and when selected, didn't show any formula which i assigned in f(x) field.

Can someone post me a example which actually uses the functions/property '.f' and 'cellFormula'
Will be very helpful. I just need a working example with static values.

I am trying to convert html table to xls and its getting converted to xsl successfully. However, I am facing issues with formula. Formula in the html table is not visible in xsl file, I just see a value at the place of formula. Refering this example https://github.com/SheetJS/js-xlsx/blob/master/tests/write.js , it doesn't have any implementation for cell formula in xlsx spreadsheet I tried using cell.f = "=SUM(A1+B1)" for the cell C1 and cell.v as the summation value which was 3. But i didn't succeed. With the exported file, when opened in MS excel, the cell contained just the data and when selected, didn't show any formula which i assigned in f(x) field. Can someone post me a example which actually uses the functions/property '.f' and 'cellFormula' Will be very helpful. I just need a working example with static values.
aboodh95 commented 2017-03-19 08:29:10 +00:00 (Migrated from github.com)

@vikash52 did you solve the issue or find a way to solve it

@vikash52 did you solve the issue or find a way to solve it
Crusader4Christ commented 2017-06-28 07:52:06 +00:00 (Migrated from github.com)

@SheetJSDev
Sorry, but I can't understand this line:
sheet['!ref'] = "A1:AW100000"; // <-- i added this line to force it to write all of the cells
How did you get last range "AW100000"?

@SheetJSDev Sorry, but I can't understand this line: `sheet['!ref'] = "A1:AW100000"; // <-- i added this line to force it to write all of the cells ` How did you get last range "AW100000"?
reviewher commented 2017-06-28 08:05:41 +00:00 (Migrated from github.com)

@Crusader4Christ AW100000 is the cell in the 49th column and 100000th row:

> XLSX.utils.decode_range("A1:AW100000")
{ s: { c: 0, r: 0 }, e: { c: 48, r: 99999 } }

The original case from @lszhu used 100k rows and 49 columns, but since the original file specified a range of 20000 cells the rest were omitted.

The range needs to be changed since the writer won't include cells outside of the worksheet's range. The wiki includes some examples of how to update the range: https://github.com/SheetJS/js-xlsx/wiki/General-Utility-Functions

@Crusader4Christ AW100000 is the cell in the 49th column and 100000th row: ```js > XLSX.utils.decode_range("A1:AW100000") { s: { c: 0, r: 0 }, e: { c: 48, r: 99999 } } ``` The original case from @lszhu used 100k rows and 49 columns, but since the original file specified a range of 20000 cells the rest were omitted. The range needs to be changed since the writer won't include cells outside of the worksheet's range. The wiki includes some examples of how to update the range: https://github.com/SheetJS/js-xlsx/wiki/General-Utility-Functions
Crusader4Christ commented 2017-06-28 08:34:47 +00:00 (Migrated from github.com)

@reviewher So, if I need to work with OO Calc xlsx bigger than 65536 rows I should use update_sheet_range(ws) from https://github.com/SheetJS/js-xlsx/wiki/General-Utility-Functions?

@reviewher So, if I need to work with OO Calc xlsx bigger than 65536 rows I should use `update_sheet_range(ws)` from https://github.com/SheetJS/js-xlsx/wiki/General-Utility-Functions?
reviewher commented 2017-06-28 15:33:51 +00:00 (Migrated from github.com)

@Crusader4Christ if you don't know the bottom-right address, update_sheet_range(ws) scans all of the addresses and fixes the worksheet. That is not the most efficient approach but it works in general

If you do know that address, add_to_sheet(ws, bottom_right_cell_address) is more efficient:

var nrows = 100000;
var ncols = 49;
var bottom_right_cell_address = XLSX.utils.encode_cell({r:nrows-1, c:ncols-1});
add_to_sheet(ws, bottom_right_cell_address);
@Crusader4Christ if you don't know the bottom-right address, `update_sheet_range(ws)` scans all of the addresses and fixes the worksheet. That is not the most efficient approach but it works in general If you do know that address, `add_to_sheet(ws, bottom_right_cell_address)` is more efficient: ```js var nrows = 100000; var ncols = 49; var bottom_right_cell_address = XLSX.utils.encode_cell({r:nrows-1, c:ncols-1}); add_to_sheet(ws, bottom_right_cell_address); ```
Sign in to join this conversation.
No Milestone
No Assignees
1 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: sheetjs/sheetjs#82
No description provided.