Custom Number Format with "cellText: false, cellDates: true" broken #2349

Open
opened 2021-08-11 03:32:18 +00:00 by woshiguabi · 10 comments
woshiguabi commented 2021-08-11 03:32:18 +00:00 (Migrated from github.com)

File

sheetjs-bug.xlsx

Problem

Date column is standard date format, Custom Date column is custom format
image

image

when XLSX.read set cellDates: true
3542d62fff/bits/67_wsxml.js (L453)
fmtid is 30, and SSF.is_date got undefined so broken
image

Discussion

  1. SSF.is_date validate input and return false for unexpected input.
  2. use safe_is_date like safe_format with SSFImplicit table, and add more implicit format.
  3. not broken whole worksheet, because workbook.Sheets still includes the broken worksheet, just remove the broken cell.
## File [sheetjs-bug.xlsx](https://github.com/SheetJS/sheetjs/files/6960203/sheetjs-bug.xlsx) ## Problem `Date` column is standard date format, `Custom Date` column is custom format ![image](https://user-images.githubusercontent.com/17465858/128835785-beee6bf9-4a28-4a95-828c-0bcfb86628b9.png) ![image](https://user-images.githubusercontent.com/17465858/128959068-3da7c98e-c962-4f5c-b2ad-3529317133d6.png) when `XLSX.read` set `cellDates: true` https://github.com/SheetJS/sheetjs/blob/3542d62fffc155dd505a23230ba182c4402a0e2c/bits/67_wsxml.js#L453 `fmtid` is `30`, and `SSF.is_date` got `undefined` so broken ![image](https://user-images.githubusercontent.com/17465858/128963938-3e1cf1eb-639f-4fbf-b938-5815b47cbb0a.png) ## Discussion 1. `SSF.is_date` validate input and return `false` for unexpected input. 2. use `safe_is_date` like `safe_format` with `SSFImplicit` table, and add more implicit format. 3. not broken whole worksheet, because `workbook.Sheets` still includes the broken worksheet, just remove the broken cell.
SheetJSDev commented 2021-08-11 04:08:40 +00:00 (Migrated from github.com)

ECMA-376 (XLSX) does not specify a default for code 30, hence nothing is in the SSFImplicit table. It further states:

When values not present in the lists below are used, the behavior is implementation-defined.

There are some locale-specific overrides for that particular code:

locale fmt 30
zh_TW m/d/yy
zh_CN m"-"d"-"yy
ja_JP m/d/yy
ko_KR mm"-"dd"-"yy

XLSB (2.5.76 Ifmt) refers to XLSX

XLS (2.4.126 Format) is even stranger, not prescribing a default for 30 and stating

The value of ifmt.ifmt MUST be a value within one of the following ranges or within 383 to 392.

  • 5 to 8
  • 23 to 26
  • 41 to 44
  • 63 to 66
  • 164 to 382

Out of curiosity, can you repeat that process and save the file in a few different formats, namely "Excel 97-2003 Workbook (.xls)" and "SYLK (Symbolic Link) (.slk)"?

ECMA-376 (XLSX) does not specify a default for code 30, hence nothing is in the SSFImplicit table. It further states: > When values not present in the lists below are used, the behavior is implementation-defined. There are some locale-specific overrides for that particular code: | locale | fmt 30 | |:-------|:-------------| | zh_TW | m/d/yy | | zh_CN | m"-"d"-"yy | | ja_JP | m/d/yy | | ko_KR | mm"-"dd"-"yy | [XLSB (2.5.76 Ifmt)](https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-xlsb/1259bdd0-9443-44d2-af67-42ee0ec082ff) refers to XLSX [XLS (2.4.126 Format)](https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-xls/300280fd-e4fe-4675-a924-4d383af48d3b) is even stranger, not prescribing a default for 30 and stating > The value of ifmt.ifmt MUST be a value within one of the following ranges or within 383 to 392. > - 5 to 8 > - 23 to 26 > - 41 to 44 > - 63 to 66 > - 164 to 382 Out of curiosity, can you repeat that process and save the file in a few different formats, namely "Excel 97-2003 Workbook (*.xls)" and "SYLK (Symbolic Link) (*.slk)"?
woshiguabi commented 2021-08-11 04:32:47 +00:00 (Migrated from github.com)

Out of curiosity, can you repeat that process and save the file in a few different formats, namely "Excel 97-2003 Workbook (.xls)" and "SYLK (Symbolic Link) (.slk)"?

For *.xls, SSF.is_date(SSF._table[fmtid] || String(fmtid)) will skip fmt 30 and read as normal number
3542d62fff/bits/76_xls.js (L92)
image


For *.slk, the behavior is opposite to *.xls.
Custom date format can be recognized but normal date format cant.
image

> Out of curiosity, can you repeat that process and save the file in a few different formats, namely "Excel 97-2003 Workbook (_.xls)" and "SYLK (Symbolic Link) (_.slk)"? For *.xls, `SSF.is_date(SSF._table[fmtid] || String(fmtid))` will skip fmt 30 and read as normal number https://github.com/SheetJS/sheetjs/blob/3542d62fffc155dd505a23230ba182c4402a0e2c/bits/76_xls.js#L92 ![image](https://user-images.githubusercontent.com/17465858/128969851-a1650ce5-89fb-4c45-a203-11c608d91588.png) --- For *.slk, the behavior is opposite to *.xls. Custom date format can be recognized but normal date format cant. ![image](https://user-images.githubusercontent.com/17465858/128969820-6e24fa7a-c109-49bd-b709-f6b09ac1e6be.png)
woshiguabi commented 2021-08-11 04:44:33 +00:00 (Migrated from github.com)

ECMA-376 (XLSX) does not specify a default for code 30, hence nothing is in the SSFImplicit table. It further states:

When values not present in the lists below are used, the behavior is implementation-defined.

Certainly, ECMA-376 not includes code 30, number format to string should be implementation-defined.
I just considering whether to add these special cases when cellDates: true, these values should be converted to Date .

> ECMA-376 (XLSX) does not specify a default for code 30, hence nothing is in the SSFImplicit table. It further states: > > > When values not present in the lists below are used, the behavior is implementation-defined. > Certainly, ECMA-376 not includes code 30, number format to string should be implementation-defined. I just considering whether to add these special cases when `cellDates: true`, these values should be converted to `Date` .
SheetJSDev commented 2021-08-11 04:52:54 +00:00 (Migrated from github.com)

We'd have to test across a few more locales. If Excel is always treating format code 30 as the default date format, we can just add it to SSFImplicit.

Can you share the SLK file? You may need to compress as ZIP and attach.

We'd have to test across a few more locales. If Excel is always treating format code 30 as the default date format, we can just add it to SSFImplicit. Can you share the SLK file? You may need to compress as ZIP and attach.
woshiguabi commented 2021-08-11 04:53:34 +00:00 (Migrated from github.com)

😉Request for an extra help, I want to know which custom format code should be recognized as Date, is there any specs to read ?
Many thanks!

😉Request for an extra help, I want to know which custom format code should be recognized as `Date`, is there any specs to read ? Many thanks!
woshiguabi commented 2021-08-11 04:55:01 +00:00 (Migrated from github.com)

We'd have to test across a few more locales. If Excel is always treating format code 30 as the default date format, we can just add it to SSFImplicit.

Can you share the SLK file? You may need to compress as ZIP and attach.

sheetjs-bug.slk.zip

> We'd have to test across a few more locales. If Excel is always treating format code 30 as the default date format, we can just add it to SSFImplicit. > > Can you share the SLK file? You may need to compress as ZIP and attach. [sheetjs-bug.slk.zip](https://github.com/SheetJS/sheetjs/files/6965955/sheetjs-bug.slk.zip)
SheetJSDev commented 2021-08-11 05:06:12 +00:00 (Migrated from github.com)

Thanks!

The function is a stripped down version of the tokenizer https://github.com/SheetJS/ssf/blob/master/bits/81_fmttype.js#L3

It looks for the following date tokens:

  • B1 and B2 (calendar system)
  • Y and E (year and era)
  • M, D, H, S (month/minute, day, hour, second)
  • A/P, AM/PM, 上午/下午 (12-hour indicator)
Thanks! The function is a stripped down version of the tokenizer https://github.com/SheetJS/ssf/blob/master/bits/81_fmttype.js#L3 It looks for the following date tokens: - `B1` and `B2` (calendar system) - `Y` and `E` (year and era) - `M`, `D`, `H`, `S` (month/minute, day, hour, second) - `A/P`, `AM/PM`, `上午/下午` (12-hour indicator)
SheetJSDev commented 2021-08-11 05:14:49 +00:00 (Migrated from github.com)

Can you check what version of the library you are using (console.log(XLSX.version)) ?
With 0.17.0 setting cellDates: true generates dates for every cell:

> require("xlsx").readFile("sheetjs-bug.xlsx", {cellNF: true, cellDates: true}).Sheets.Sheet1
...
  A2: { t: 'd', v: 2021-08-10T04:00:00.000Z, w: '8/10/21' },
  B2: { t: 'd', v: 2021-08-10T04:00:00.000Z, w: '8/10/21' },
  A3: { t: 'd', v: 2021-08-11T04:00:00.000Z, w: '8/11/21' },
  B3: { t: 'd', v: 2021-08-11T04:00:00.000Z, w: '8/11/21' },
  A4: { t: 'd', v: 2021-08-12T04:00:00.000Z, w: '8/12/21' },
  B4: { t: 'd', v: 2021-08-12T04:00:00.000Z, w: '8/12/21' },
  A5: { t: 'd', v: 2021-08-13T04:00:00.000Z, w: '8/13/21' },
  B5: { t: 'd', v: 2021-08-13T04:00:00.000Z, w: '8/13/21' },
...

With cellNF: true it looks like the format for cell B2 is missed (so something is still incorrect):

  A2: { t: 'n', v: 44418, z: 'm/d/yy', w: '8/10/21' },
  B2: { t: 'n', v: 44418, z: undefined, w: '8/10/21' },
  A3: { t: 'n', v: 44419, z: 'm/d/yy', w: '8/11/21' },
Can you check what version of the library you are using (`console.log(XLSX.version)`) ? With 0.17.0 setting cellDates: true generates dates for every cell: ``` > require("xlsx").readFile("sheetjs-bug.xlsx", {cellNF: true, cellDates: true}).Sheets.Sheet1 ... A2: { t: 'd', v: 2021-08-10T04:00:00.000Z, w: '8/10/21' }, B2: { t: 'd', v: 2021-08-10T04:00:00.000Z, w: '8/10/21' }, A3: { t: 'd', v: 2021-08-11T04:00:00.000Z, w: '8/11/21' }, B3: { t: 'd', v: 2021-08-11T04:00:00.000Z, w: '8/11/21' }, A4: { t: 'd', v: 2021-08-12T04:00:00.000Z, w: '8/12/21' }, B4: { t: 'd', v: 2021-08-12T04:00:00.000Z, w: '8/12/21' }, A5: { t: 'd', v: 2021-08-13T04:00:00.000Z, w: '8/13/21' }, B5: { t: 'd', v: 2021-08-13T04:00:00.000Z, w: '8/13/21' }, ... ``` With `cellNF: true` it looks like the format for cell B2 is missed (so something is still incorrect): ``` A2: { t: 'n', v: 44418, z: 'm/d/yy', w: '8/10/21' }, B2: { t: 'n', v: 44418, z: undefined, w: '8/10/21' }, A3: { t: 'n', v: 44419, z: 'm/d/yy', w: '8/11/21' }, ```
woshiguabi commented 2021-08-11 06:17:26 +00:00 (Migrated from github.com)

Can you check what version of the library you are using (console.log(XLSX.version)) ?

image

My options:

XLSX.read(ev.data.arrayBuffer, {
    type: 'array',
    cellFormula: false,
    cellHTML: false,
    cellText: false,
    cellStyles: true,
    cellDates: true,
    sheetRows: 100000,
    WTF: true,
}),

Addtional: I'm using xlsx in Worker.

> Can you check what version of the library you are using (`console.log(XLSX.version)`) ? ![image](https://user-images.githubusercontent.com/17465858/128978862-1ba5cfc6-57c3-4cb0-acde-b79039e49554.png) My options: ``` XLSX.read(ev.data.arrayBuffer, { type: 'array', cellFormula: false, cellHTML: false, cellText: false, cellStyles: true, cellDates: true, sheetRows: 100000, WTF: true, }), ``` Addtional: I'm using `xlsx` in `Worker`.
woshiguabi commented 2021-08-11 06:31:35 +00:00 (Migrated from github.com)

Can you check what version of the library you are using (console.log(XLSX.version)) ?

image

My options:

XLSX.read(ev.data.arrayBuffer, {
    type: 'array',
    cellFormula: false,
    cellHTML: false,
    cellText: false,
    cellStyles: true,
    cellDates: true,
    sheetRows: 100000,
    WTF: true,
}),

Addtional: I'm using xlsx in Worker.

Ok finally i find cellText: false will cause this error.

cellText: true

// SSF._table
{
    "0": "General",
    "1": "0",
    "2": "0.00",
    "3": "#,##0",
    "4": "#,##0.00",
    "9": "0%",
    "10": "0.00%",
    "11": "0.00E+00",
    "12": "# ?/?",
    "13": "# ??/??",
    "14": "m/d/yy",
    "15": "d-mmm-yy",
    "16": "d-mmm",
    "17": "mmm-yy",
    "18": "h:mm AM/PM",
    "19": "h:mm:ss AM/PM",
    "20": "h:mm",
    "21": "h:mm:ss",
    "22": "m/d/yy h:mm",
    "30": "m/d/yy", // added fmtid 30 in _table
    "37": "#,##0 ;(#,##0)",
    "38": "#,##0 ;[Red](#,##0)",
    "39": "#,##0.00;(#,##0.00)",
    "40": "#,##0.00;[Red](#,##0.00)",
    "45": "mm:ss",
    "46": "[h]:mm:ss",
    "47": "mmss.0",
    "48": "##0.0E+0",
    "49": "@",
    "56": "\"上午/下午 \"hh\"時\"mm\"分\"ss\"秒 \"",
    "176": "[$-F800]dddd\\,\\ mmmm\\ dd\\,\\ yyyy"
}

cellText: false

// SSF._table
{
    "0": "General",
    "1": "0",
    "2": "0.00",
    "3": "#,##0",
    "4": "#,##0.00",
    "9": "0%",
    "10": "0.00%",
    "11": "0.00E+00",
    "12": "# ?/?",
    "13": "# ??/??",
    "14": "m/d/yy",
    "15": "d-mmm-yy",
    "16": "d-mmm",
    "17": "mmm-yy",
    "18": "h:mm AM/PM",
    "19": "h:mm:ss AM/PM",
    "20": "h:mm",
    "21": "h:mm:ss",
    "22": "m/d/yy h:mm",
    "37": "#,##0 ;(#,##0)",
    "38": "#,##0 ;[Red](#,##0)",
    "39": "#,##0.00;(#,##0.00)",
    "40": "#,##0.00;[Red](#,##0.00)",
    "45": "mm:ss",
    "46": "[h]:mm:ss",
    "47": "mmss.0",
    "48": "##0.0E+0",
    "49": "@",
    "56": "\"上午/下午 \"hh\"時\"mm\"分\"ss\"秒 \"",
    "176": "[$-F800]dddd\\,\\ mmmm\\ dd\\,\\ yyyy"
}

333deae63f/bits/11_ssfutils.js (L27)

333deae63f/bits/66_wscommon.js (L93-L94)

SSFImplicit already includes fmt30, but when cellText: false, it won't be used.

> > Can you check what version of the library you are using (`console.log(XLSX.version)`) ? > > ![image](https://user-images.githubusercontent.com/17465858/128978862-1ba5cfc6-57c3-4cb0-acde-b79039e49554.png) > > My options: > > ``` > XLSX.read(ev.data.arrayBuffer, { > type: 'array', > cellFormula: false, > cellHTML: false, > cellText: false, > cellStyles: true, > cellDates: true, > sheetRows: 100000, > WTF: true, > }), > ``` > > Addtional: I'm using `xlsx` in `Worker`. Ok finally i find `cellText: false` will cause this error. ### cellText: true ``` // SSF._table { "0": "General", "1": "0", "2": "0.00", "3": "#,##0", "4": "#,##0.00", "9": "0%", "10": "0.00%", "11": "0.00E+00", "12": "# ?/?", "13": "# ??/??", "14": "m/d/yy", "15": "d-mmm-yy", "16": "d-mmm", "17": "mmm-yy", "18": "h:mm AM/PM", "19": "h:mm:ss AM/PM", "20": "h:mm", "21": "h:mm:ss", "22": "m/d/yy h:mm", "30": "m/d/yy", // added fmtid 30 in _table "37": "#,##0 ;(#,##0)", "38": "#,##0 ;[Red](#,##0)", "39": "#,##0.00;(#,##0.00)", "40": "#,##0.00;[Red](#,##0.00)", "45": "mm:ss", "46": "[h]:mm:ss", "47": "mmss.0", "48": "##0.0E+0", "49": "@", "56": "\"上午/下午 \"hh\"時\"mm\"分\"ss\"秒 \"", "176": "[$-F800]dddd\\,\\ mmmm\\ dd\\,\\ yyyy" } ``` ### cellText: false ``` // SSF._table { "0": "General", "1": "0", "2": "0.00", "3": "#,##0", "4": "#,##0.00", "9": "0%", "10": "0.00%", "11": "0.00E+00", "12": "# ?/?", "13": "# ??/??", "14": "m/d/yy", "15": "d-mmm-yy", "16": "d-mmm", "17": "mmm-yy", "18": "h:mm AM/PM", "19": "h:mm:ss AM/PM", "20": "h:mm", "21": "h:mm:ss", "22": "m/d/yy h:mm", "37": "#,##0 ;(#,##0)", "38": "#,##0 ;[Red](#,##0)", "39": "#,##0.00;(#,##0.00)", "40": "#,##0.00;[Red](#,##0.00)", "45": "mm:ss", "46": "[h]:mm:ss", "47": "mmss.0", "48": "##0.0E+0", "49": "@", "56": "\"上午/下午 \"hh\"時\"mm\"分\"ss\"秒 \"", "176": "[$-F800]dddd\\,\\ mmmm\\ dd\\,\\ yyyy" } ``` https://github.com/SheetJS/sheetjs/blob/333deae63fbe13d1ff2db42e599211a062001c32/bits/11_ssfutils.js#L27 https://github.com/SheetJS/sheetjs/blob/333deae63fbe13d1ff2db42e599211a062001c32/bits/66_wscommon.js#L93-L94 **`SSFImplicit` already includes `fmt30`, but when `cellText: false`, it won't be used.**
Sign in to join this conversation.
No Milestone
No Assignees
1 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: sheetjs/sheetjs#2349
No description provided.