Support of cellDates #581

Closed
Aymkdn wants to merge 1 commits from patch-1 into master
Aymkdn commented 2017-03-03 16:23:15 +00:00 (Migrated from github.com)

Currently the cellDates option doesn't work and the Date cells are not correctly recognized.

I propose a fix that returns .d that contains the JavaScript Date.

I also propose to provide fmt and fmtid to the object so the user will be able to deal with any customized formats. It will give more flexibility.

Currently the `cellDates` option doesn't work and the Date cells are not correctly recognized. I propose a fix that returns `.d` that contains the JavaScript Date. I also propose to provide `fmt` and `fmtid` to the object so the user will be able to deal with any customized formats. It will give more flexibility.
SheetJSDev commented 2017-03-03 17:30:13 +00:00 (Migrated from github.com)

What should cellDates do in general?

The original intention of the cellDates option was to normalize files for some ECMA-376 logic. Excel generally writes files with dates stored as numbers (type n) with the date code, but the spec technically allows for storing ISO date strings (type d). The default behavior is to convert the date cells to numeric, but setting cellDates:true preserves the native format. On the write side, the default behavior is to convert cell objects with t:"d" to numeric cells. Setting cellDates:true will save as XLSX date cells.

Note to self: SpreadsheetML and ODS should also follow the option

What should `cellDates` do in general? The original intention of the `cellDates` option was to normalize files for some ECMA-376 logic. Excel generally writes files with dates stored as numbers (type `n`) with the date code, but the spec technically allows for storing ISO date strings (type `d`). The default behavior is to convert the date cells to numeric, but setting `cellDates:true` preserves the native format. On the write side, the default behavior is to convert cell objects with `t:"d"` to numeric cells. Setting `cellDates:true` will save as XLSX date cells. Note to self: SpreadsheetML and ODS should also follow the option
Aymkdn commented 2017-03-03 18:49:00 +00:00 (Migrated from github.com)

The user should be able to know when the cell contains a date. Right now nothing permits to know that, or I didn't understand how to proceed?!

With the changes I did the intent is to indicate to the user that the cell contains a date and to parse it to the .d. So I'm opened to suggestions if you have another way to find this information.

Also it looks important for the user to get the fmt in order to correctly treat the format of the content on his/her hand, don't you think?

Thanks!

The user should be able to know when the cell contains a date. Right now nothing permits to know that, or I didn't understand how to proceed?! With the changes I did the intent is to indicate to the user that the cell contains a date and to parse it to the `.d`. So I'm opened to suggestions if you have another way to find this information. Also it looks important for the user to get the `fmt` in order to correctly treat the format of the content on his/her hand, don't you think? Thanks!
SheetJSDev commented 2017-03-03 19:27:26 +00:00 (Migrated from github.com)

Also it looks important for the user to get the fmt in order to correctly treat the format of the content on his/her hand, don't you think?

The cellNF option should expose the actual number format string to the z field of the cell objects. For example with the test file https://github.com/sheetjs/test_files/blob/master/number_format.xlsm?raw=true cell B24 of sheet "Implied" is set to the format #,##0.00;[Red](#,##0.00):

> require('xlsx').readFile('number_format.xlsm').Sheets.Implied.B24
{ t: 'n', v: 12345.6789, w: '12,345.68' }
> require('xlsx').readFile('number_format.xlsm', {cellNF:true}).Sheets.Implied.B24
{ t: 'n',
  v: 12345.6789,
  w: '12,345.68',
  z: '#,##0.00;[Red]\\(#,##0.00\\)' }

A quick comparison shows the same behavior for XLS and XLSB.

It does the same thing for the internal number formats, so you don't have to do a big switch on the format id -- you can just scan the string.


The user should be able to know when the cell contains a date. Right now nothing permits to know that, or I didn't understand how to proceed?!

This should be something exposed by the SSF formatter, but you are right that the information is not available at the moment.


With the changes I did the intent is to indicate to the user that the cell contains a date and to parse it to the .d. So I'm opened to suggestions if you have another way to find this information.

Disregard my last comment. You want to proactively change number cells with a date format, and that's fine :)


Conclusion:

  1. try using the cellNF:true option and reading the format from the .z field of the cell. If that works, we should not try to expose the internal number format ID.

  2. I will change the format parser so that it tells you if it thinks the format is a date. Basically it would be a simplified form of the tokenizer

Your commit is good, but the simple regex doesn't handle cases like the custom format 0 "months" (since the "m" is in quotation marks it's a literal character).

Also, can you shoot an email to sheetjs@gmail.com -- the git log shows an anonymized address :/

> Also it looks important for the user to get the fmt in order to correctly treat the format of the content on his/her hand, don't you think? The `cellNF` [option](https://github.com/SheetJS/js-xlsx/#parsing-options) should expose the actual number format string to the `z` field of the cell objects. For example with the test file https://github.com/sheetjs/test_files/blob/master/number_format.xlsm?raw=true cell B24 of sheet "Implied" is set to the format `#,##0.00;[Red](#,##0.00)`: ``` > require('xlsx').readFile('number_format.xlsm').Sheets.Implied.B24 { t: 'n', v: 12345.6789, w: '12,345.68' } > require('xlsx').readFile('number_format.xlsm', {cellNF:true}).Sheets.Implied.B24 { t: 'n', v: 12345.6789, w: '12,345.68', z: '#,##0.00;[Red]\\(#,##0.00\\)' } ``` A quick comparison shows the same behavior for [XLS](https://github.com/sheetjs/test_files/blob/master/number_format.xls?raw=true) and [XLSB](https://github.com/sheetjs/test_files/blob/master/number_format.xlsb?raw=true). It does the same thing for the internal number formats, so you don't have to do a big switch on the format id -- you can just scan the string. ---- > The user should be able to know when the cell contains a date. Right now nothing permits to know that, or I didn't understand how to proceed?! This should be something exposed by the `SSF` formatter, but you are right that the information is not available at the moment. ---- > With the changes I did the intent is to indicate to the user that the cell contains a date and to parse it to the .d. So I'm opened to suggestions if you have another way to find this information. Disregard my last comment. You want to proactively change number cells with a date format, and that's fine :) --- Conclusion: 1) try using the `cellNF:true` option and reading the format from the `.z` field of the cell. If that works, we should not try to expose the internal number format ID. 2) I will change the format parser so that it tells you if it thinks the format is a date. Basically it would be a simplified form of the [tokenizer](https://github.com/SheetJS/js-xlsx/blob/master/bits/10_ssf.js#L527-L596) Your commit is good, but the simple regex doesn't handle cases like the custom format `0 "months"` (since the "m" is in quotation marks it's a literal character). Also, can you shoot an email to sheetjs@gmail.com -- the git log shows an anonymized address :/
Aymkdn commented 2017-03-04 10:15:01 +00:00 (Migrated from github.com)

Thanks for all these details. It helped me, and I'm now able to parse a date cell without modifying the original js-xlsx code by using .z with cellNF:true ! :-)

I know the way to parse the format is not perfect, but it should work in most cases for now.

This is for my ExcelPlus library that I use as an interface of js-xslx to easily manipulate an Excel file.

Thanks for all these details. It helped me, and I'm now able to parse a date cell without modifying the original js-xlsx code by using `.z` with `cellNF:true` ! :-) I know the way to parse the format is not perfect, but it should work in most cases for now. This is for my [ExcelPlus library](http://aymkdn.github.io/ExcelPlus/) that I use as an interface of js-xslx to easily manipulate an Excel file.
reidblomquist commented 2017-03-13 16:54:30 +00:00 (Migrated from github.com)

@SheetJSDev I'm having issues getting a value in z on date objects coming in from an xls formatted file. They seem to just not be there... ropts= { type: 'binary', cellNF: true } is what I'm using (doing this all client side/in browser) - wondering if the file type I'm using or the binary flag is interfering with z value parsing or if you have any advice. Alternatively two columns will always have this formatting - and I think I should be able to manually transform the epoch into a normal date format and manually set the format as I walk it into xlsx, but hoping to avoid that.

@SheetJSDev I'm having issues getting a value in z on date objects coming in from an `xls` formatted file. They seem to just not be there... `ropts= { type: 'binary', cellNF: true }` is what I'm using (doing this all client side/in browser) - wondering if the file type I'm using or the binary flag is interfering with z value parsing or if you have any advice. Alternatively two columns will always have this formatting - and I think I should be able to manually transform the epoch into a normal date format and manually set the format as I walk it into xlsx, but hoping to avoid that.
SheetJSDev commented 2017-03-13 17:03:34 +00:00 (Migrated from github.com)

@reidblomquist if you aren't seeing cell formats, that's a bug in the library.

I just checked against an XLS file with a cell encoding a date and it appears to populate the data for dates:

> require('xlsx').readFile('xlsx-stream-d-date-cell.xls', {cellNF:true}).Sheets.Sheet1.B5
{ ixfe: 62,
  XF:
   { ifnt: 0,
     ifmt: 14,
     flags: 1,
     fStyle: 0,
     data: { patternType: null, icvFore: 64, icvBack: 65 } },
  v: 41684.35264774306,
  t: 'n',
  w: '2/14/14',
  z: 'm/d/yy' }

(note: the whole concept of the "date cell" type is specific to XLSX -- XLSB and XLS do not have a date cell type and always represent dates as numeric cells)

Can you share a file where cellNF is not adding formats to a cell? If you don't want to make it public, you can email us at dev@sheetjs.com

@reidblomquist if you aren't seeing cell formats, that's a bug in the library. I just checked against an [XLS file with a cell encoding a date](https://github.com/SheetJS/test_files/blob/master/xlsx-stream-d-date-cell.xls?raw=true) and it appears to populate the data for dates: ```js > require('xlsx').readFile('xlsx-stream-d-date-cell.xls', {cellNF:true}).Sheets.Sheet1.B5 { ixfe: 62, XF: { ifnt: 0, ifmt: 14, flags: 1, fStyle: 0, data: { patternType: null, icvFore: 64, icvBack: 65 } }, v: 41684.35264774306, t: 'n', w: '2/14/14', z: 'm/d/yy' } ``` (note: the whole concept of the "date cell" type is specific to XLSX -- XLSB and XLS do not have a date cell type and always represent dates as numeric cells) Can you share a file where `cellNF` is not adding formats to a cell? If you don't want to make it public, you can email us at dev@sheetjs.com
reidblomquist commented 2017-03-13 17:10:05 +00:00 (Migrated from github.com)

@SheetJSDev before I do that, I'm going to pull down the sheet that test uses and run it through my code. Would you also mind running a check using { type: 'binary' } in addition to cellNF true - wondering if that combo is what's causing. Thanks! Also, you could very well be right that my source sheet has something going on - since it's being generated by some ruby code (hahahaha)

@SheetJSDev before I do that, I'm going to pull down the sheet that test uses and run it through my code. Would you also mind running a check using { type: 'binary' } in addition to cellNF true - wondering if that combo is what's causing. Thanks! Also, you could very well be right that my source sheet has something going on - since it's being generated by some ruby code (hahahaha)
reidblomquist commented 2017-03-13 17:18:53 +00:00 (Migrated from github.com)

@SheetJSDev disregard! it is indeed an issue with formatting in the sheet my Ruby code is generating haha. Man sheet inception. Thanks again for being so prompt! Lmk if you want me to delete some of my comments on here to clean it up - and thanks for your hard work/awesome free software!

@SheetJSDev disregard! it is indeed an issue with formatting in the sheet my Ruby code is generating haha. Man sheet inception. Thanks again for being so prompt! Lmk if you want me to delete some of my comments on here to clean it up - and thanks for your hard work/awesome free software!
SheetJSDev commented 2017-03-13 17:25:47 +00:00 (Migrated from github.com)

@reidblomquist It's probably best to move this discussion to a new issue, as it is somewhat unrelated to the core issue of this PR and @Aymkdn probably doesn't want to see all of these unrelated notifications!

If you are generating a file that does the right thing in Excel (formats a cell properly) but not with js-xlsx, it's a bug in js-xlsx and I recommend raising a new issue with that file.

If your generated file isn't doing the right thing in Excel, then clearly your file generation process has an issue.

@reidblomquist It's probably best to move this discussion to a new issue, as it is somewhat unrelated to the core issue of this PR and @Aymkdn probably doesn't want to see all of these unrelated notifications! If you are generating a file that does the right thing in Excel (formats a cell properly) but not with js-xlsx, it's a bug in js-xlsx and I recommend raising a new issue with that file. If your generated file isn't doing the right thing in Excel, then clearly your file generation process has an issue.
reidblomquist commented 2017-03-13 17:31:21 +00:00 (Migrated from github.com)

@SheetJSDev last comment here - but yes it does seem to be something with the file generation for the xls I'm reading from - I'll do some testing to confirm my findings but so far I'm thinking it has to do with the date not being formatted with traditional / delineators but with . notation. When I open my source xls file they're shown as Date types in excel, but if I re-select that format it reformats the cells to use / delineators and if I then run that "fixed" xls file through js-xlsx everything works as expected. If you'd like me to pull together some sample code and files and open up a new issue I'd be more than happy to!

@SheetJSDev last comment here - but yes it does seem to be something with the file generation for the xls I'm reading from - I'll do some testing to confirm my findings but so far I'm thinking it has to do with the date not being formatted with traditional / delineators but with . notation. When I open my source xls file they're shown as Date types in excel, but if I re-select that format it reformats the cells to use / delineators and if I then run that "fixed" xls file through js-xlsx everything works as expected. If you'd like me to pull together some sample code and files and open up a new issue I'd be more than happy to!
SheetJSDev commented 2017-03-13 17:33:51 +00:00 (Migrated from github.com)

@reidblomquist yes that sounds like a different sort of bug in js-xlsx, so a new issue with a test file or some sample code would be appreciated :) Once you raise that issue, I will go back and clean up these comments

@reidblomquist yes that sounds like a different sort of bug in js-xlsx, so a new issue with a test file or some sample code would be appreciated :) Once you raise that issue, I will go back and clean up these comments
SheetJSDev commented 2017-03-21 08:04:15 +00:00 (Migrated from github.com)

@Aymkdn we added an is_date helper to SSF, factored out of the eval function.

Next version will have this fix and will ensure consistent cellDates handling

@Aymkdn we added an [`is_date` helper](https://github.com/SheetJS/ssf/blob/master/ssf.flow.js#L544) to SSF, factored out of the eval function. Next version will have this fix and will ensure consistent `cellDates` handling
Aymkdn commented 2017-03-21 08:13:23 +00:00 (Migrated from github.com)

Awesome !!! Thank you 👍

Awesome !!! Thank you 👍
SheetJSDev commented 2017-03-21 20:45:57 +00:00 (Migrated from github.com)

We expect to push the next release sometime this week

We expect to push the next release sometime this week

Pull request closed

Sign in to join this conversation.
No description provided.