SheetJS generated excel workbook size is too large #2859

Open
opened 2023-01-19 18:15:32 +00:00 by anupganatra · 11 comments

I am using xlsx to prepare the excel workbook through AWS lambda. I need to read the workbook from S3 and add the multiple sheets. I am using below code to read the excel workbook from S3 but final workbook size is too large.

xlsx.read(buffer, { cellStyles: true })

If I remove cellStyles: true then it is generating workbook with correct size but it removes the formatting (date, number, cell width etc)

I am writing workbook to S3 with ContentType application/vnd.openxmlformats-officedocument.spreadsheetml.sheet

I am using [xlsx](https://www.npmjs.com/package/xlsx) to prepare the excel workbook through AWS lambda. I need to read the workbook from S3 and add the multiple sheets. I am using below code to read the excel workbook from S3 but final workbook size is too large. `xlsx.read(buffer, { cellStyles: true })` If I remove `cellStyles: true` then it is generating workbook with correct size but it removes the formatting (date, number, cell width etc) I am writing workbook to S3 with ContentType `application/vnd.openxmlformats-officedocument.spreadsheetml.sheet`
Owner

If you are writing files, you can pass the option compression: true to force ZIP DEFLATE compression (https://docs.sheetjs.com/docs/api/write-options more details). By default it is disabled.

If you are writing files, you can pass the option `compression: true` to force ZIP DEFLATE compression (https://docs.sheetjs.com/docs/api/write-options more details). By default it is disabled.
Author

I have tried passing compression: true and I was able to reduce the file size to 50% but still it was too large.

My file size was 52 MB without compression and 23 MB with compression. But if I remove cellStyles: true then it gets reduced to less than 2 MB

I have tried passing `compression: true` and I was able to reduce the file size to 50% but still it was too large. My file size was 52 MB without compression and 23 MB with compression. But if I remove `cellStyles: true` then it gets reduced to less than 2 MB
Owner

That sounds unexpected, can you share the file or data? If it can't be shared publicly, email hello@sheetjs.com

That sounds unexpected, can you share the file or data? If it can't be shared publicly, email hello@sheetjs.com
Author

I won't be able to share the actual files. I have generated dummy files using dummy script and shared the files at hello@sheetjs.com. Please note that I have generated the files without compression.

Size with cellStyles: 2.54
Size without cellStyles: 2.28

I won't be able to share the actual files. I have generated dummy files using dummy script and shared the files at hello@sheetjs.com. Please note that I have generated the files without compression. Size with cellStyles: 2.54 Size without cellStyles: 2.28
Owner

Thanks for following up! It looks like an issue with AWS S3/Lambda. The demos https://docs.sheetjs.com/docs/demos/cloud/aws were last tested in August 2022 but should still work. Please follow them exactly.

Thanks for following up! It looks like an issue with AWS S3/Lambda. The demos https://docs.sheetjs.com/docs/demos/cloud/aws were last tested in August 2022 but should still work. Please follow them exactly.
Author

I have followed the demo exactly but the file size is still the same. After analyzing the theme files, I have found that below four font scripts are creating issue. Also, attached screenshot of font scripts from theme files.

  1. Jpan
  2. Hang
  3. Hans
  4. Hant

Seems non-English typeface are creating the issue. Is there a way to remove/disable the above four font scripts?

I have followed the demo exactly but the file size is still the same. After analyzing the theme files, I have found that below four font scripts are creating issue. Also, attached screenshot of font scripts from theme files. 1. Jpan 2. Hang 3. Hans 4. Hant Seems non-English typeface are creating the issue. Is there a way to remove/disable the above four font scripts?
Owner

Can you log the buffer sizes when you write them to S3 and log the reconstructed buffer size when you receive the data? If the sizes are the same, can you compute a checksum (e.g. CRC32 using https://github.com/sheetjs/js-crc32) to compare the data as it flows through S3?

Basically, this is either a bug in the library or an issue with the S3 integration code or a corruption issue within S3 itself.

Can you log the buffer sizes when you write them to S3 and log the reconstructed buffer size when you receive the data? If the sizes are the same, can you compute a checksum (e.g. CRC32 using https://github.com/sheetjs/js-crc32) to compare the data as it flows through S3? Basically, this is either a bug in the library or an issue with the S3 integration code or a corruption issue within S3 itself.
Author

Is there a way to remove/disable the font scripts(Jpan, Hang, Hans, Hant) which are creating the issue? I have found a file where font scripts are available but my node_modules folder is generated automatically during deployment.

Is there a way to remove/disable the font scripts(Jpan, Hang, Hans, Hant) which are creating the issue? I have found a file where font scripts are available but my node_modules folder is generated automatically during deployment.
Owner

themeXLSX option when writing https://docs.sheetjs.com/docs/api/write-options -- you can specify your own theme data.

`themeXLSX` option when writing https://docs.sheetjs.com/docs/api/write-options -- you can specify your own theme data.
Author

themeXLSX option when writing https://docs.sheetjs.com/docs/api/write-options -- you can specify your own theme data.

I am using 0.18.5 version of the library. I don't have themeXLSX option available for XLSX.write. Below are the options available to me.

WritingOptions

  • type
  • bookSST
  • bookType
  • compression
  • ignoreEC
  • Props
  • numbers

CommonOptions

  • WTF
  • bookVBA
  • cellDates
  • sheetStubs
  • cellStyles
  • password

SheetOption

  • sheet

Also, I have gone through the repo https://github.com/SheetJS/sheetjs but unable to find the themeXLSX option.

> `themeXLSX` option when writing https://docs.sheetjs.com/docs/api/write-options -- you can specify your own theme data. I am using 0.18.5 version of the library. I don't have `themeXLSX` option available for `XLSX.write`. Below are the options available to me. **WritingOptions** * type * bookSST * bookType * compression * ignoreEC * Props * numbers **CommonOptions** * WTF * bookVBA * cellDates * sheetStubs * cellStyles * password **SheetOption** * sheet Also, I have gone through the repo https://github.com/SheetJS/sheetjs but unable to find the `themeXLSX` option.
Author

Is there any update on how to do the themeXLSX changes? I am unable to find the themeXLSX option in current version (0.18.5) of the library.

Is there any update on how to do the `themeXLSX` changes? I am unable to find the `themeXLSX` option in current version (0.18.5) of the library.
Sign in to join this conversation.
No Milestone
No Assignees
2 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: sheetjs/sheetjs#2859
No description provided.