Only first sheet is returning when working with Huge data sets #563
Labels
No Label
DBF
Dates
Defined Names
Features
Formula
HTML
Images
Infrastructure
Integration
International
ODS
Operations
Performance
PivotTables
Pro
Protection
Read Bug
SSF
SYLK
Style
Write Bug
good first issue
No Milestone
No Assignees
1 Participants
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: sheetjs/sheetjs#563
Loading…
Reference in New Issue
No description provided.
Delete Branch "%!s(<nil>)"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
While working or bigger data sets( tested with 2000 cols X 1000 rows) with multiple sheets. In that case, after certain MB, (in my case 40MB),
sheet_to_row_object_array
is only returning first sheet.Is it a limitation? Are there any alternatives to get all the sheets regardless of files size?
sheet_to_row_object_array
, as the name suggests, takes a single worksheet as input. If you want to convert the second or third worksheet in a workbook you need to read the workbook first, then get the sheet name you want from the SheetNames array, then pull from the Sheets object. For example in node, if you want to convert the third worksheet:The browser demo iterates across the sheet names in a similar manner:
If you have an example file where you can read one worksheet but not another, then there's probably a parsing issue with that specific sheet. If you could send us a sample, we'd be glad to take a look. Alternatively, if you can enable dev mode (set the option
WTF:1
in the read or readFile options object) you may see a concrete error message.@reviewher Thanks for the quick reply.
continue with my above reply... Actually, the problem is with.
let excelData = XLSX.read("<some file>", {type: 'binary'});
, then callingexcelData.Sheets
not returning the object of all sheets, its only returning object with one key which is first sheet name and its data.Just to mention there are no errors in the sheet. And there is no special data to share with. Just dummy set of columns and rows are fine to test.
One last thing. While doing the testing
js-xlsx
is not returning any errors.(I will try with WTF:1 option set, and share the details)Hi @reviewher, any alternative for this? any help is appreciated.
@gtd57 what does
excelData.SheetNames
show?If you can share a file then we can take a look. If you don't want to upload to github you can email a file to dev at sheetjs dot com and someone can take a look
As per SheetJS/js-xlsx API,
excelData.SheetNames
must contain sheet name as key for each sheet and it data as value, However in my case it is only showing one key and value of first sheet data only.I can send you the data, but its is huge file above 50MB. However, that sheet does not contain any special data, You can create data on your side, I just popped some, one word strings in the first column and first row and rest are just numbers and in total I have used 2000 cols in 1200 rows and total of 3 sheets. My requirement is upto 8 sheets
Hello @gtd57 ! Just for you I ran a test: https://gist.github.com/SheetJSDev/205d9d9c14c5203e602da70ed57e0a43
It generates an XLSX workbook with 8 sheets. Each sheet happens to be the same but it doesn't affect the nature of the issue. The worksheet is 2000 columns and 1000 rows. You can run it to generate and to verify.
On a side note: the files are massive! without compression the generated file was 475MB. With compression it was a more manageable 98M, both well beyond the 50M limit.
I tried using the node command line tool to read the data. Turns out you hit a memory snag with the default settings, so you need to override:
After waiting a minute and 15 seconds, finally saw the CSV output!
So if it's only showing you one sheet, I'm guessing there's an issue with the other sheets. I'd like to take a look. If you can post the file to a hosting service like Dropbox, you can email me a link and I can download and take a look.
I'm getting the same or similar issue. I have a workbook with 7 sheets.
SheetNames
contains all 7 sheetnames but one of the sheets is missing altogether fromSheets
. The missing sheet has 382,000 rows. Do you think I'm hitting some kind of limit in xlsx? Do you have any suggested workaround or if it sounds like I bug can I help to debug?@mtharrison The limit is dictated by memory, and honestly you would be able to tell in either case when you hit the memory limit (chrome shows "oh snap" and node gives you a long message about GC). Assuming you don't see that, there's something in the file causing trouble.
The simplest thing you could do is pass the option
WTF:1
to the reader:If you are using node, and if you installed xlsx globally, you can just run
xlsx --dev --perf file_name
the --dev enables all errors and the --perf tells the xlsx command not to output anythingIf you are using XLSX.read, the second argument is the options argument, so just add
WTF:1
there. For example, in the browser using theindex.html
demo in the repo, add that key to the object{type: 'binary'}
and{type: 'base64'}
If you are using XLSX.readFile, just pass a second argument with
{WTF:1}
or add that key to the file. For example, the example in the README https://github.com/SheetJS/js-xlsx/#parsing-workbooks would bevar workbook = XLSX.readFile('test.xlsx', {WTF:1});
That option will expose any sheet-specific errors. You should get an error message that you can share.
I'd like to see a file if you can share it. Since it probably would exceed the github or email limit, please post somewhere (like dropbox) and email or share a link.
Thanks @SheetJSDev for the super-quick response. I tried with the debug option and the error I get is:
Looks like I'm probably hitting V8's max string size. I logged the buffer size out before
toString()
is called on it and it's 356mb.@mtharrison you caught me in a late night session!
If you are running in a recent version of node, can you try calling node with
--max-old-space-size=4096
? To do this against the global install, runIf you are running something like
node main.js
normally, just add the flag:@mtharrison scratch that.
The nodejs string limit is 16 bytes shy of 256MB: https://github.com/nodejs/node/issues/3175 . I thought the previous write test case would cover it, but the underlying xml files are 60M.
I'm going to resolve this in the next version by trying to catch that particular error and give more informative output.
Can you do one last thing: try saving that file as XLSB in Excel and seeing if you hit the same problem? The XLSB parser keeps data as buffers so it theoretically should not trigger a 256MB string conversion
@SheetJSDev Using an XLSB does indeed sidestep that issue! Thanks a lot for the help.
ERROR in src/app/app.component.ts(50,20): error TS2339: Property 'rows' does not exist on type 'HTMLElement'.
src/app/app.component.ts(59,27): error TS2339: Property 'rows' does not exist on type 'HTMLElement'.
src/app/app.component.ts(60,15): error TS2339: Property 'rows' does not exist on type 'HTMLElement'.
src/app/app.component.ts(63,25): error TS2339: Property 'rows' does not exist on type 'HTMLElement'.
src/app/app.component.ts(98,20): error TS2339: Property 'rows' does not exist on type 'HTMLElement'.
src/app/app.component.ts(109,27): error TS2339: Property 'rows' does not exist on type 'HTMLElement'.
src/app/app.component.ts(113,26): error TS2339: Property 'rows' does not exist on type 'HTMLElement'.
src/app/app.component.ts(172,31): error TS2339: Property 'sheet_to_row_object_array' does not exist on type 'XLSX$Utils'.
src/app/app.component.ts(186,36): error TS2339: Property 'files' does not exist on type 'HTMLElement'.
please help...
CODE:
submit()
{
var myTad = document.getElementById('rr');
var len = myTad.rows.length ;
var arr = [];
var values =[];
var key1 ="Task_NO";var key2 ="EFFORT_MONTH";var key3 ="ROLE";var key4 ="LOC"; var key5 ="ESTIMATED_HOURS"; var key6 ="CNTY";
var key7 ="EMPLOYEE_LEVEL";var key8 ="AUTHENTED_HOURS";var key9 ="ESTIMATED_COST"; var key10 ="ACTUAL_COST";
console.log("Row_length:" +len);
}
Grouping all related reading issues to #61 please follow up there