json_to_sheet hanging when using large JSON file #2750
Reference: sheetjs/sheetjs#2750
I've got a fairly large JSON file I'm trying to write to a sheet. The file is ~353MB, with a length (when stringified) of 361,514,182.
It just hangs. I can get it working by chunking the Object and writing it to different sheets, but I can't write it all to a single sheet. I've also tried appending the chunks to the bottom of the sheet, but that also fails once it gets to the "latter" chunks.
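The chunking workaround described above can be sketched roughly as follows (a hypothetical helper, not code from this issue; `fullResult` and the chunk size of 50,000 are illustrative assumptions):

```javascript
// Hypothetical sketch of the chunking workaround: split the big array
// of row objects into fixed-size slices so each slice can be passed to
// json_to_sheet as its own worksheet.
function chunkRows(rows, chunkSize) {
  const chunks = [];
  for (let i = 0; i < rows.length; i += chunkSize) {
    chunks.push(rows.slice(i, i + chunkSize));
  }
  return chunks;
}

// Each chunk would then become one sheet, e.g. (requires the xlsx package):
// const wb = XLSX.utils.book_new();
// chunkRows(fullResult, 50000).forEach((chunk, idx) => {
//   const ws = XLSX.utils.json_to_sheet(chunk);
//   XLSX.utils.book_append_sheet(wb, ws, "Sheet" + (idx + 1));
// });
```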
I've just tried moving from v0.17 to 0.18 to see if that helps, but also no luck.
Running Node 16.15.1, Windows 11. Any ideas or workarounds? It's annoying having to manually open the files and paste the sheets back together.
Perhaps also worth noting that since the move to 0.18, the filesize has gone from 37MB to 98MB.
You might be running into https://bugs.chromium.org/p/v8/issues/detail?id=6696
Try passing the option `dense: true` to `json_to_sheet` or `aoa_to_sheet`.
Thanks for the suggestion! Tried this

```js
let ws = XLSX.utils.json_to_sheet(fullResult, { dense: true })
```

but it still hangs. I don't even see a reference to `dense` in the docs! I moved back to v0.17.5 on account of the file size issue. Is that an option for 0.18 only?
The worksheet representation has a long history. We compared the hash vs array of arrays back in 2012 and found that IE6 performed much better with the Object. V8 also performed better with the object approach until some major changes under the hood heavily penalized large objects.
After V8 regressed, with no sign of improvement from Google's side, we revisited the array-of-arrays approach. That's what `dense` enables. Changing the default representation will require a major version bump.

If you are using `compression: true` and seeing a jump in file size, it's because we fully switched the ZIP engine and the sliding window hasn't been optimized.

Can you give a rough estimate for the number of rows in the object? If you could find a way to share the data, that would also be helpful.
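The two worksheet shapes being compared can be illustrated with plain objects (a simplified sketch only; real SheetJS cell objects can carry more fields than the `t`/`v` shown here, and the dense layout may differ across versions):

```javascript
// Sparse (the historical default): one big object keyed by
// A1-style addresses. Large objects are what V8 penalizes.
const sparse = {
  "!ref": "A1:B2",
  A1: { t: "s", v: "name" }, B1: { t: "s", v: "qty" },
  A2: { t: "s", v: "foo" },  B2: { t: "n", v: 3 }
};

// Dense: an array of row arrays, indexed numerically,
// which V8 handles much better at this scale.
const dense = [
  [{ t: "s", v: "name" }, { t: "s", v: "qty" }],
  [{ t: "s", v: "foo" },  { t: "n", v: 3 }]
];
```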
Ahh IE6, to blame for so many things! ;)
In this particular case I really can't share the data because it's confidential, but I'll try and provide as much detail as possible outside the actual data.
It's a data pull from a number of data sources that are processed and put together as an array of objects. The most recent one, run just this afternoon, is 269,496 rows total: an array of that many objects.
While not every object contains every one of these keys, most do. I've changed the actual names in most cases, but this is a list of name and type:
To be clear, I saw the jump in size (roughly 3x) with `compression: true` in v0.18. I dropped back to v0.17 and it went back to the normal size, approx 36MB.

Got the same error on the json_to_sheet function with 421,060 rows and 20 columns. It's been stuck for almost an hour with no output.
Any help?
Is there any debug mode so that I can see progress and where it is stuck?
`json_to_sheet(data, { dense: true })` hangs with 766,734 rows (11 columns) on 0.19.1.
Unless there's a workaround or a fix for this issue, are the limitations mentioned somewhere?
https://docs.sheetjs.com/docs/miscellany/errors#aw-snap-or-oops-an-error-has-occurred
That said, the current limits apply on a per-worksheet level. The limit is driven by how the worksheet is written (a large string is constructed).
For the maximum worksheet rows (1048576 rows), you can write 10 columns (https://jsfiddle.net/L462gdjy/ demo)
For sheets with a small number of strings repeated many times, `bookSST` reduces size by de-duplicating text using the shared string table.

Can confirm: this is on the server side (NestJS) and simply hanging at worksheet creation (not workbook) with 766,734 rows and 11 columns. Can't find anything related in the link provided.
Let's try the equivalent of the demo in NodeJS:
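The original script did not survive the scrape; a sketch of what the generation half of the jsfiddle demo might look like follows (the XLSX calls are commented out since they assume the xlsx package is installed; the 1,048,576 x 10 grid matches the demo described above):

```javascript
// Hypothetical reconstruction of the demo's data-generation step:
// build a rowCount x colCount array-of-arrays of short strings.
function makeRows(rowCount, colCount) {
  const rows = [];
  for (let r = 0; r < rowCount; r++) {
    const row = [];
    for (let c = 0; c < colCount; c++) row.push("R" + r + "C" + c);
    rows.push(row);
  }
  return rows;
}

// The demo writes the full maximum-size grid:
// const XLSX = require("xlsx");
// const ws = XLSX.utils.aoa_to_sheet(makeRows(1048576, 10), { dense: true });
// const wb = XLSX.utils.book_new();
// XLSX.utils.book_append_sheet(wb, ws, "Sheet1");
// XLSX.writeFile(wb, "out.xlsx");
```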
Locally, the file is created and the process prints:
During the processing, it will seem to hang the server.
Since it takes a while to write, if you are doing this on the server, it is recommended to use a worker thread or subprocess. An example of worker threads is now in the docs: https://docs.sheetjs.com/docs/demos/server#worker-threads
See #2834 for more context on the example.