How do I read worksheet rows on demand using a Node.js stream? #2918

src-rodrigues · 2023-04-13T18:42:45Z

src-rodrigues commented

2023-04-13 18:42:45 +00:00

I am following this [doc](https://docs.sheetjs.com/docs/demos/bigdata/stream). I would like to read the file incrementally: that is, I want to read only the first few records, not the whole file. How do I do this with a Node.js stream?

sheetjs commented

2023-04-13 19:49:27 +00:00

There are multiple related questions:

"How do I read the first N records?": The [`sheetRows` option](https://docs.sheetjs.com/docs/api/parse-options) limits the number of rows processed.

"How do I read from a NodeJS stream?": This is not currently supported for technical reasons. [The docs](https://docs.sheetjs.com/docs/solutions/input#example-readable-streams) have an explanation:

> XLSX, XLSB, NUMBERS, and ODS files are ultimately ZIP files that contain binary and XML entries. The ZIP file format stores the table of contents ("end of central directory" record) at the end of the file, so **a proper parse of a ZIP file requires scanning from the end**. Streams do not provide random access into the data, so the only correct approach involves buffering the entire stream.

You can buffer the stream and call `XLSX.read` at the end.

"How do I incrementally parse a file?": This flow is not currently supported in the open source software but there is a plan to address this flow.
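
A minimal sketch of the buffering approach, combined with `sheetRows`, might look like the following. `bufferStream` and `readFirstRows` are hypothetical helper names, not part of the library, and the sketch assumes the `xlsx` package is installed. The entire file is still held in memory; `sheetRows` only limits how many rows are processed.

```javascript
// Collect every chunk of a readable stream into a single Buffer.
// The whole file ends up in memory, which the ZIP layout requires.
async function bufferStream(stream) {
  const chunks = [];
  for await (const chunk of stream) {
    // Streams with an encoding set yield strings, so normalize to Buffer.
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
  }
  return Buffer.concat(chunks);
}

// Hypothetical helper: buffer the stream, then process only the first
// `maxRows` rows of the first worksheet.
async function readFirstRows(stream, maxRows) {
  // Assumption: the `xlsx` package is installed. It is loaded lazily so
  // that bufferStream can be used without it.
  const XLSX = require("xlsx");
  const buf = await bufferStream(stream);
  const wb = XLSX.read(buf, { type: "buffer", sheetRows: maxRows });
  const ws = wb.Sheets[wb.SheetNames[0]];
  return XLSX.utils.sheet_to_json(ws);
}

// Usage (the file name is a placeholder):
// const fs = require("node:fs");
// readFirstRows(fs.createReadStream("book.xlsx"), 10).then(console.log);
```

If the first row of the sheet is a header row, it counts toward `maxRows`, so the number of records returned would be one less than the limit.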
