SheetJS generated excel workbook size is too large #2859
Labels
No Label
DBF
Dates
Defined Names
Features
Formula
HTML
Images
Infrastructure
Integration
International
ODS
Operations
Performance
PivotTables
Pro
Protection
Read Bug
SSF
SYLK
Style
Write Bug
good first issue
No Milestone
No Assignees
2 Participants
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: sheetjs/sheetjs#2859
Loading…
Reference in New Issue
Block a user
No description provided.
Delete Branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
I am using xlsx to prepare the excel workbook through AWS lambda. I need to read the workbook from S3 and add the multiple sheets. I am using below code to read the excel workbook from S3 but final workbook size is too large.
xlsx.read(buffer, { cellStyles: true })
If I remove
cellStyles: true
then it is generating workbook with correct size but it removes the formatting (date, number, cell width etc)I am writing workbook to S3 with ContentType
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
If you are writing files, you can pass the option
compression: true
to force ZIP DEFLATE compression (https://docs.sheetjs.com/docs/api/write-options more details). By default it is disabled.I have tried passing
compression: true
and I was able to reduce the file size to 50% but still it was too large.My file size was 52 MB without compression and 23 MB with compression. But if I remove
cellStyles: true
then it gets reduced to less than 2 MBThat sounds unexpected, can you share the file or data? If it can't be shared publicly, email hello@sheetjs.com
I won't be able to share the actual files. I have generated dummy files using dummy script and shared the files at hello@sheetjs.com. Please note that I have generated the files without compression.
Size with cellStyles: 2.54
Size without cellStyles: 2.28
Thanks for following up! It looks like an issue with AWS S3/Lambda. The demos https://docs.sheetjs.com/docs/demos/cloud/aws were last tested in August 2022 but should still work. Please follow them exactly.
I have followed the demo exactly but the file size is still the same. After analyzing the theme files, I have found that below four font scripts are creating issue. Also, attached screenshot of font scripts from theme files.
Seems non-English typeface are creating the issue. Is there a way to remove/disable the above four font scripts?
Can you log the buffer sizes when you write them to S3 and log the reconstructed buffer size when you receive the data? If the sizes are the same, can you compute a checksum (e.g. CRC32 using https://github.com/sheetjs/js-crc32) to compare the data as it flows through S3?
Basically, this is either a bug in the library or an issue with the S3 integration code or a corruption issue within S3 itself.
Is there a way to remove/disable the font scripts(Jpan, Hang, Hans, Hant) which are creating the issue? I have found a file where font scripts are available but my node_modules folder is generated automatically during deployment.
themeXLSX
option when writing https://docs.sheetjs.com/docs/api/write-options -- you can specify your own theme data.I am using 0.18.5 version of the library. I don't have
themeXLSX
option available forXLSX.write
. Below are the options available to me.WritingOptions
CommonOptions
SheetOption
Also, I have gone through the repo https://github.com/SheetJS/sheetjs but unable to find the
themeXLSX
option.Is there any update on how to do the
themeXLSX
changes? I am unable to find thethemeXLSX
option in current version (0.18.5) of the library.