Partial xlsx file causes indefinite loop #3215
A partial xlsx file smaller than 43 bytes will cause an (almost) indefinite loop.
SYSTEM SETUP:
The last known xlsx version that works correctly and does not have this problem is 0.17.5.
STEPS TO REPRODUCE:
A hexdump of the first 20 bytes might look like:
EXPECTED RESULT:
A partial file should trigger an error such as 'End of zip file not found'
ACTUAL RESULT:
The xlsx.parse_zip function gets called. The variable fcnt is set to a value read from the partial data, in this case 2056, so the for loop iterates 2056 times, with each iteration taking 1 second. While the loop runs, no other node actions get any processing time, so the system hangs.
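The expected behaviour described above can be sketched as an up-front check for the ZIP End of Central Directory record, which a file cut off near the start will not contain. This is only an illustration, not the library's code: `findEndOfCentralDirectory` and `assertCompleteZip` are hypothetical names, and the error text is borrowed from the expected result above.

```javascript
// Hypothetical helpers, not part of the xlsx library.
// A ZIP archive ends with an End of Central Directory (EOCD) record
// that starts with the signature "PK\x05\x06" and is at least 22 bytes long.
const EOCD_SIGNATURE = [0x50, 0x4b, 0x05, 0x06];
const EOCD_MIN_SIZE = 22;

// Return the offset of the last EOCD signature, or -1 if none is present.
function findEndOfCentralDirectory(bytes) {
  // Scan backwards: the record sits at the end, optionally followed by a comment.
  for (let i = bytes.length - EOCD_MIN_SIZE; i >= 0; --i) {
    if (EOCD_SIGNATURE.every((b, j) => bytes[i + j] === b)) return i;
  }
  return -1;
}

// Throw early instead of trusting counts read from a truncated file.
function assertCompleteZip(bytes) {
  if (findEndOfCentralDirectory(bytes) === -1) {
    throw new Error("End of zip file not found");
  }
}
```

Any input shorter than 22 bytes is rejected without entering the scan loop, so a 20-byte fragment fails immediately.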
To be sure, let's start from a sample file https://docs.sheetjs.com/pres.xlsx and start a new project:
We can inspect the contents using xxd:
The output appears byte-swapped compared to your output, likely because your tool was interpreting values as little-endian 16-bit integers and xxd prints the bytes as they appear in the file:
Following your procedure, the following node invocation should hang:
This fails fairly quickly in local testing with the error 'Bad compressed size: 0 != 350'.
To test against your output, we need to recover the original Uint8Array. The following should be run in the NodeJS REPL:
This command immediately fails with 'Error: Unsupported ZIP file'.
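The byte-order difference and the recovery step mentioned above can be illustrated with a minimal sketch. It is not the command from this thread: `bytesFromLE16Words` is a hypothetical helper, and the only data used is the standard ZIP local file header signature, not the reporter's dump.

```javascript
// Hypothetical helper: rebuild file bytes from a dump of little-endian 16-bit words.
function bytesFromLE16Words(dump) {
  const words = dump.trim().split(/\s+/).filter(Boolean).map((w) => parseInt(w, 16));
  const out = new Uint8Array(words.length * 2);
  words.forEach((w, i) => {
    out[2 * i] = w & 0xff;            // the low byte is stored first in the file
    out[2 * i + 1] = (w >> 8) & 0xff; // the high byte follows
  });
  return out;
}

// ZIP local file header signature "PK\x03\x04", in the order the bytes appear in the file:
const inFileOrder = Uint8Array.of(0x50, 0x4b, 0x03, 0x04);
// The same four bytes when a tool prints them as little-endian 16-bit words:
const asWords = "4b50 0403";
const recovered = bytesFromLE16Words(asWords);
```

Here `recovered` holds the same four bytes as `inFileOrder`, which is why the two dumps look swapped pairwise while describing the same file.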
Can you test the aforementioned steps locally and compare the performance? If you agree with the analysis, can you reinstall the library using the "NodeJS" installation guide
Thank you for your response. I have gone through the steps you posted above.
The result is that with pres.xlsx the expected exception is thrown:
Error: Bad compressed size: 0 != 350
With the file that I have the method call hangs as reported in this ticket.
I will paste the outcome of the pres.xlsx file and my test file below to easily compare the two:
It prints the bytes below, similar to what you expect.
Now I use my test file, which was created by LibreOffice Calc. It's an empty sheet. The file on disk is 5094 bytes, but we are interested in the first 20 bytes. We print them below:
Running the above partial file (first 20 bytes) through the read method causes the function to seemingly hang (a very long loop).
Thanks for following up! Can confirm the slowdown with the following script based on your file data:
That I can confirm on my side as well.
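For anyone comparing timings between versions, a small harness along these lines may help. It is a sketch under stated assumptions, not the script used in this thread: `truncate` and `timeParse` are hypothetical helpers, and the parser is passed in as a function so the harness itself does not depend on any library.

```javascript
// Hypothetical helper: keep only the first `n` bytes of a buffer.
function truncate(bytes, n) {
  return bytes.subarray(0, n);
}

// Hypothetical helper: run `parse` synchronously and report the elapsed
// time in milliseconds together with any error that was thrown.
function timeParse(parse, bytes) {
  const start = Date.now();
  let error = null;
  try {
    parse(bytes);
  } catch (e) {
    error = e;
  }
  return { ms: Date.now() - start, error };
}
```

Assuming the xlsx package is installed and `test.xlsx` is a placeholder file name, a call might look like `timeParse((b) => XLSX.read(b, { type: "buffer" }), truncate(fs.readFileSync("test.xlsx"), 20))`. Because parsing is synchronous, a hang in the parser also blocks the harness until the loop finishes.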