Unable to load large XLS document, "Sheet1 undefined" #2999
Labels
No Label
DBF
Dates
Defined Names
Features
Formula
HTML
Images
Infrastructure
Integration
International
ODS
Operations
Performance
PivotTables
Pro
Protection
Read Bug
SSF
SYLK
Style
Write Bug
good first issue
No Milestone
No Assignees
2 Participants
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: sheetjs/sheetjs#2999
Loading…
Reference in New Issue
No description provided.
Delete Branch "%!s(<nil>)"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
I have a single sheet xlsx document with a side of 79,815,440 bytes and 446,790 rows and 44 columns.
I have failed to load this document using my own app with this XLSX library. But it also fails using the xlsx-cli tool and the WebSQL demo site found in the sheetjs documentation.
It seems like it is failing silently. I do not get any error in my browser console.
After an
XLSX.read
the wb.SheetNames[] array correctly has "Sheet1" in it. However the wb.Sheets[] array is empty.My Code Example:
WebSQL Demo Site Comparison
https://sheetjs.com/sql/
This page seems to run into the same issue. Doesn't crash or throw error, just no data in the Sheets array.
Attempt using XLSX-CLI
Expected Results / Comment
Is it possible to load a xlsx document of this size with this library? Given that it fail with two different browser examples, and a command line example on a modern, powerful computer I think the limitation is something other than compute resources.
Is there a way to only load a subset of the data? I don't need to parse or render 400,000+ rows on Chrome. I really just need to load the document to show a preview of the file. The first 10 lines even.
Thankjs!
It is noted in the documentation:
https://docs.sheetjs.com/docs/miscellany/errors#aw-snap-or-oops-an-error-has-occurred
Expand "Technical Limitations" for details:
We'll add a comment in the docs mentioning
ERR_STRING_TOO_LONG
and other error messages.Current SheetJS Process
Currently the whole XML is decoded before processing.
This approach does not lean on advanced features and still works in IE6 and legacy JavaScript engines (including engines used in Photoshop and InDesign, certain NetSuite SuiteScripts deployments, and other extension platforms).
This also means the library is subject to the string length limit.
V8 Limit
The V8 string length limit has changed over the years. https://bugs.chromium.org/p/v8/issues/detail?id=6148 was raised in the V8 bug tracker in 2017.
Note that IE11 did not have the same hardcoded limit. It was able to support longer strings in 2017 than V8 can support in 2023
Future
With modern JS and web APIs there are new ways to work around the limitations, but it would drop compatibility with most browsers and environments (possibly Chromium / NodeJS only).
If NodeJS is the only target, it is possible to unzip the entire XLSX file into a temporary folder and seek/stream individual entries from the filesystem in the correct order. It's possible to do something similar in the browser using deferred
Blob
operations.If this is of interest, we can take a closer look again.
A new section has been added to the Troubleshooting page: https://docs.sheetjs.com/docs/miscellany/errors#invalid-string-length-or-err_string_too_long
PS:
dense
mode will help for worksheets of ~10M cells (that do not trip the string length limit):OK thanks for the quick response and the details.
My sheet is well over that 10M cell limit at 19M cells. So it looks like I will have to come up with another solution which is OK because it was never my intent to render the full sheet in chrome anyway. I only wish for a preview so I will have to do some pre-processing on the file.
Regarding the string length error, that only happens with the CLI, not chrome. In Chrome it fails silently and the sheets array is simply empty. So that made the initial investigation confusing. Making more sense now.