Streaming read-support for large files #3179

Open
opened 2024-08-14 18:22:48 +00:00 by mmarkell · 1 comment

We commonly receive the error: `Cannot create a string longer than 0x1fffffe8 characters`

It would be immensely helpful to have a way of streaming the file rather than loading the XML into one very large string and crashing NodeJS. I briefly looked into using Bun (which has a larger string limit), but other compatibility issues rule that out for us.

Do you have any suggestions or guidance on how we can use SheetJS as a *streaming* reader that does not load the entire XML into memory at once? I'd be happy to contribute back any patches we make to the project, as large file support is our biggest priority right now.
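
For context, the limit in the error message is the engine's maximum string length, which NodeJS exposes as `buffer.constants.MAX_STRING_LENGTH`. The sketch below is only an illustration of that limit: `fitsInString` is a hypothetical helper, not part of SheetJS, and it assumes the bytes are UTF-8, where decoding N bytes never yields more than N string characters.

```javascript
// Hypothetical helper (not a SheetJS API): conservative check for whether
// a UTF-8 payload of `byteLength` bytes can be decoded into a single string.
const { constants } = require("node:buffer");

function fitsInString(byteLength) {
  // Decoding N bytes of UTF-8 produces at most N UTF-16 code units, so a
  // payload no larger than the limit is always safe to decode. A larger
  // payload may still fit, but this check treats it as too big.
  return byteLength <= constants.MAX_STRING_LENGTH;
}

// The exact limit depends on the NodeJS / V8 version in use.
console.log("max string length:", constants.MAX_STRING_LENGTH);
console.log("1 KiB fits:", fitsInString(1024));
```

A check like this only reports the problem earlier. It does not remove the need for a reader that avoids building the full string.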

Owner

Feel free to ask for clarifying details in [the chat](https://sheetjs.com/chat).

The SheetJS libraries date back to 2012. Many of the JavaScript features and NodeJS and browser APIs we take for granted today did not exist back then.

There are a number of different approaches, but many have other limitations or are incompatible with the current API. For example, it is possible to jump around and parse parts of a Blob object without pulling the full file into the engine, but that would force a Promise-based interface (browser vendors decided that synchronous reading would be permitted in Web Workers but not in the renderer thread).
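
As a rough illustration of the Blob approach, here is a minimal sketch using only the standard `Blob.slice` and `Blob.arrayBuffer` methods (available in browsers and as a global in NodeJS 18+). The helper names `readRange` and `readTail` are hypothetical and are not part of the SheetJS API. Note that every read returns a Promise, which is the interface constraint described above.

```javascript
// Hypothetical helpers (not SheetJS APIs) for reading part of a Blob.

// Read bytes [start, end) of the blob. slice() only describes a range;
// the bytes are materialized when arrayBuffer() resolves.
async function readRange(blob, start, end) {
  const buf = await blob.slice(start, end).arrayBuffer();
  return new Uint8Array(buf);
}

// Read the last `n` bytes, e.g. to inspect trailing records of a
// container format without touching the rest of the file.
async function readTail(blob, n) {
  const start = Math.max(0, blob.size - n);
  return readRange(blob, start, blob.size);
}

// Usage with a small in-memory Blob standing in for a large file.
const demo = new Blob([new Uint8Array([10, 20, 30, 40, 50])]);
readTail(demo, 2).then((bytes) => console.log(Array.from(bytes))); // [ 40, 50 ]
```

Callers have to `await` (or chain `.then` on) each read, so a parser built this way could not keep a synchronous call signature.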

Reference: sheetjs/sheetjs#3179