Error when writing a .xlsb if cell value is a number #2795

Closed
opened 2022-09-21 07:35:54 +00:00 by laurelgr · 3 comments
laurelgr commented 2022-09-21 07:35:54 +00:00 (Migrated from github.com)

Currently using community version xlsx-0.18.11, I can generate my data as .xlsx, but when I tried to change to .xlsb I got the following error:

    RangeError [ERR_INVALID_OPT_VALUE]: The value "NaN" is invalid for option "size"
        at Function.alloc (buffer.js:370:3)
        at new_raw_buf (/var/www/my-project/node_modules/xlsx/xlsx.js:175:43)
        at new_buf (/var/www/my-project/node_modules/xlsx/xlsx.js:4067:10)
        at write_BrtShortSt (/var/www/my-project/node_modules/xlsx/xlsx.js:16071:20)
        at write_ws_bin_cell (/var/www/my-project/node_modules/xlsx/xlsx.js:16631:61)
        at write_CELLTABLE (/var/www/my-project/node_modules/xlsx/xlsx.js:16679:16)
        at write_ws_bin (/var/www/my-project/node_modules/xlsx/xlsx.js:16795:2)
        at write_zip_xlsb (/var/www/my-projecty/node_modules/xlsx/xlsx.js:24825:25)
        at write_zip (/var/www/my-project/node_modules/xlsx/xlsx.js:25190:23)
        at write_zip_type (/var/www/my-project/node_modules/xlsx/xlsx.js:25197:10)
        at writeSync (/var/www/my-project/node_modules/xlsx/xlsx.js:25335:22)
        at Object.writeFile [as writeFileSync] (/var/www/my-project/node_modules/xlsx/xlsx.js:25358:9)

I had to dig a bit, but here is what I noticed:

  1. When using XLSX.utils.sheet_add_aoa, the content of the aoa cells are directly put in the sheet cells' v value property (unless object, which is merged)
  2. As a consequence, some cells had pure integer or boolean values instead of string.
  3. Boolean value is ok (when opening the excel it shows up as 'true' but I'm not sure if it was converted to string or just Excel UI displaying this). My test cell: { "v": true, "t": "s", "z": "@" }
  4. Number value causes the error: { "v": 123456, "t": "s", "z": "@" }

(I kept my custom formatting in the examples in case this is relevant, but I don't believe it should be)

No problem when generating a regular .xlsx though. I also did not test on other formats than .xlsx and .xlsb. Or with non-integer numbers.

Currently using community version `xlsx-0.18.11`, I can generate my data as .xlsx, but when I tried to change to .xlsb I got the following error: ``` RangeError [ERR_INVALID_OPT_VALUE]: The value "NaN" is invalid for option "size" at Function.alloc (buffer.js:370:3) at new_raw_buf (/var/www/my-project/node_modules/xlsx/xlsx.js:175:43) at new_buf (/var/www/my-project/node_modules/xlsx/xlsx.js:4067:10) at write_BrtShortSt (/var/www/my-project/node_modules/xlsx/xlsx.js:16071:20) at write_ws_bin_cell (/var/www/my-project/node_modules/xlsx/xlsx.js:16631:61) at write_CELLTABLE (/var/www/my-project/node_modules/xlsx/xlsx.js:16679:16) at write_ws_bin (/var/www/my-project/node_modules/xlsx/xlsx.js:16795:2) at write_zip_xlsb (/var/www/my-projecty/node_modules/xlsx/xlsx.js:24825:25) at write_zip (/var/www/my-project/node_modules/xlsx/xlsx.js:25190:23) at write_zip_type (/var/www/my-project/node_modules/xlsx/xlsx.js:25197:10) at writeSync (/var/www/my-project/node_modules/xlsx/xlsx.js:25335:22) at Object.writeFile [as writeFileSync] (/var/www/my-project/node_modules/xlsx/xlsx.js:25358:9) ``` I had to dig a bit, but here is what I noticed: 1. When using `XLSX.utils.sheet_add_aoa`, the content of the aoa cells are directly put in the sheet cells' `v` value property (unless object, which is merged) 2. As a consequence, some cells had pure integer or boolean values instead of string. 4. Boolean value is ok (when opening the excel it shows up as 'true' but I'm not sure if it was converted to string or just Excel UI displaying this). My test cell: `{ "v": true, "t": "s", "z": "@" }` 5. Number value causes the error: `{ "v": 123456, "t": "s", "z": "@" }` (I kept my custom formatting in the examples in case this is relevant, but I don't believe it should be) No problem when generating a regular .xlsx though. I also did not test on other formats than .xlsx and .xlsb. Or with non-integer numbers.
SheetJSDev commented 2022-09-21 08:01:35 +00:00 (Migrated from github.com)

https://docs.sheetjs.com/docs/api/utilities#array-of-arrays-input needs to clarify the logic:

  • Strings are mapped to string cells ("abc" -> {v: "abc", t: "s"})
  • Numbers are mapped to numeric cells (123 -> {v: 123, t: "n"})
  • Raw cell objects are preserved ({ "v": 123456, "t": "s", "z": "@" } -> { "v": 123456, "t": "s", "z": "@" }).

{ "v": 123456, "t": "s", "z": "@" } does not make sense. Either
A) the cell should be numeric { "v": 123456, "t": "n", "z": "@" }
B) the value should be a string { "v": "123456", "t": "s", "z": "@" }.

The XLSX export "works" thanks to JS string conversions.

It is still a bug in the XLSB writer. There are a few parts that need to be fixed in bits/68_wsbin.js.

.

We'd need to decide how to stringify values. Feedback appreciated :)

For numbers, there are a few approaches:

  1. Write the raw value String(cell.v)
  2. Format the number using the number format and store the formatted string (cell.w)
  3. Force the writer to generate a Number cell (e.g. BrtShortRk or BrtCellRk or BrtShortReal or BrtCellReal)

For booleans:

  1. Write the JS true/false strings
  2. Write the Excel TRUE / FALSE values.
  3. Write actual boolean cells (BrtCellBool / BrtShortBool)

XLSB does not have a native date cell type, so the options are

  1. Write the stringified date
  2. Write the formatted value
  3. Generate a numeric cell using the date code
https://docs.sheetjs.com/docs/api/utilities#array-of-arrays-input needs to clarify the logic: - Strings are mapped to string cells ("abc" -> `{v: "abc", t: "s"}`) - Numbers are mapped to numeric cells (123 -> `{v: 123, t: "n"}`) - Raw cell objects are preserved (`{ "v": 123456, "t": "s", "z": "@" }` -> `{ "v": 123456, "t": "s", "z": "@" }`). `{ "v": 123456, "t": "s", "z": "@" }` does not make sense. Either A) the cell should be numeric `{ "v": 123456, "t": "n", "z": "@" }` B) the value should be a string `{ "v": "123456", "t": "s", "z": "@" }`. The XLSX export "works" thanks to JS string conversions. It is still a bug in the XLSB writer. There are a few parts that need to be fixed in [`bits/68_wsbin.js`](https://github.com/SheetJS/sheetjs/blob/master/bits/68_wsbin.js). . We'd need to decide how to stringify values. Feedback appreciated :) For numbers, there are a few approaches: 1) Write the raw value `String(cell.v)` 2) Format the number using the number format and store the formatted string (`cell.w`) 3) Force the writer to generate a Number cell (e.g. `BrtShortRk` or `BrtCellRk` or `BrtShortReal` or `BrtCellReal`) For booleans: 1) Write the JS `true`/`false` strings 2) Write the Excel `TRUE` / `FALSE` values. 3) Write actual boolean cells (BrtCellBool / BrtShortBool) XLSB does not have a native date cell type, so the options are 1) Write the stringified date 2) Write the formatted value 3) Generate a numeric cell using the date code
laurelgr commented 2022-09-21 08:58:28 +00:00 (Migrated from github.com)

Oh, good to know!

I had overwritten the t and z parameter for custom formatting after the aoa input of the number value, with "t": "s" because this value was actually supposed to be a string and not a number (bad leftover in database), that's why my error case is unusual!

But yes, it would be better to at least generate a clear error message on the cause if stringifying is not yet implemented.


I'm not that familiar with Excel/ExcelB formats, and I don't have requirements impacted by these, so please understand this feedback is not expert level. :)

For numbers: I would go with either 1 (simple) or 3 (exact)
For booleans: 2 (Excel value over JS value, in the case of xlsb) or 3
For date: 1 or 3

I'm not sure how these choices would impact the user experience, so instead I would say to find which remains closest to the original when generating the file, then reading the generated file with the XLSX package. And maybe check how Excel chooses to do these types too.

You could also add writing options to choose behaviour other than default, but this is already getting complex. :)

Oh, good to know! I had overwritten the `t` and `z` parameter for custom formatting **after** the aoa input of the number value, with `"t": "s"` because this value was actually supposed to be a string and not a number (bad leftover in database), that's why my error case is unusual! But yes, it would be better to at least generate a clear error message on the cause if stringifying is not yet implemented. ---- I'm not *that* familiar with Excel/ExcelB formats, and I don't have requirements impacted by these, so please understand this feedback is not expert level. :) For numbers: I would go with either 1 (simple) or 3 (exact) For booleans: 2 (Excel value over JS value, in the case of xlsb) or 3 For date: 1 or 3 I'm not sure how these choices would impact the user experience, so instead I would say to find which remains closest to the original when generating the file, then reading the generated file with the XLSX package. And maybe check how Excel chooses to do these types too. You could also add writing options to choose behaviour other than default, but this is already getting complex. :)
SheetJSDev commented 2022-09-21 18:20:41 +00:00 (Migrated from github.com)

Probably should have explained the choices with an example. Suppose we were talking about a cell like:

{ t: "s", v: 123456, z: "#,##0" }

If this were marked as a numeric cell (t: "n") there would be no ambiguity -- it's a number. If the value were a string (v: "123456") there would be no ambiguity -- it's a string. But when you mark a cell as a string and do not pass a string, the library has to decide how to handle that value. The choices are equivalent to:

  1. Entering the formula '123456 in the formula bar
  2. Entering the formula '123,456 in the formula bar
  3. Entering 123456 and assigning the format #,##0

They map to cells B2/C2/D2 in the sample: issue2795.xlsx -- which cell best reflects the intention? Same question applies to boolean (row 3) and date (row 4)

Probably should have explained the choices with an example. Suppose we were talking about a cell like: ```js { t: "s", v: 123456, z: "#,##0" } ``` If this were marked as a numeric cell (`t: "n"`) there would be no ambiguity -- it's a number. If the value were a string (`v: "123456"`) there would be no ambiguity -- it's a string. But when you mark a cell as a string and do not pass a string, the library has to decide how to handle that value. The choices are equivalent to: 1. Entering the formula `'123456` in the formula bar 2. Entering the formula `'123,456` in the formula bar 3. Entering `123456` and assigning the format `#,##0` They map to cells B2/C2/D2 in the sample: [issue2795.xlsx](https://github.com/SheetJS/sheetjs/files/9619305/issue2795.xlsx) -- which cell best reflects the intention? Same question applies to boolean (row 3) and date (row 4)
Sign in to join this conversation.
No Milestone
No Assignees
1 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: sheetjs/sheetjs#2795
No description provided.