A way to specify date formating when using XLSX.utils.sheet_to_row_object_array() #134

Closed
opened 2014-10-10 07:26:20 +00:00 by rotemtam · 9 comments
rotemtam commented 2014-10-10 07:26:20 +00:00 (Migrated from github.com)

Hi

Is there a way to specify date formatting to "dd/mm/yyyy" when converting a sheet to object array?

Currently function returns "mm/dd/yyyy" seems like this is hardcoded into SSF somewhere as a default.

Hi Is there a way to specify date formatting to "dd/mm/yyyy" when converting a sheet to object array? Currently function returns "mm/dd/yyyy" seems like this is hardcoded into SSF somewhere as a default.
SheetJSDev commented 2014-10-10 14:41:38 +00:00 (Migrated from github.com)

@rotemtam Let's take the write example https://github.com/SheetJS/test_files/blob/master/write.xlsx in node:

var XLSX = require('xlsx');
var wb = XLSX.readFile('write.xlsx');

Cell C3 is a date:

> wb.Sheets.SheetJS.C3
{ t: 'n',
  v: 41689.604166666664,
  w: '2/19/14' } 

To reformat this, first vacate the formatted text and add the desired number format:

delete wb.Sheets.SheetJS.C3.w
wb.Sheets.SheetJS.C3.z = 'dd/mm/yyyy'

Now apply the number format:

> XLSX.utils.format_cell(wb.Sheets.SheetJS.C3);
> wb.Sheets.SheetJS.C3
{ t: 'n',
  v: 41689.604166666664,
  z: 'dd/mm/yyyy',
  w: '19/02/2014' } // <-- note that the formatted text is now correct

Now all of the output functions will use the new format.

@rotemtam Let's take the write example https://github.com/SheetJS/test_files/blob/master/write.xlsx in node: ``` var XLSX = require('xlsx'); var wb = XLSX.readFile('write.xlsx'); ``` Cell C3 is a date: ``` > wb.Sheets.SheetJS.C3 { t: 'n', v: 41689.604166666664, w: '2/19/14' } ``` To reformat this, first vacate the formatted text and add the desired number format: ``` delete wb.Sheets.SheetJS.C3.w wb.Sheets.SheetJS.C3.z = 'dd/mm/yyyy' ``` Now apply the number format: ``` > XLSX.utils.format_cell(wb.Sheets.SheetJS.C3); > wb.Sheets.SheetJS.C3 { t: 'n', v: 41689.604166666664, z: 'dd/mm/yyyy', w: '19/02/2014' } // <-- note that the formatted text is now correct ``` Now all of the output functions will use the new format.
rotemtam commented 2014-10-11 19:57:06 +00:00 (Migrated from github.com)

@SheetJSDev thanks for the reply! I've managed to figure that much out.

I wanted to check (and perhaps suggest ) that we should be able to specify a default date format to use for cells containing date data.

@SheetJSDev thanks for the reply! I've managed to figure that much out. I wanted to check (and perhaps suggest ) that we should be able to specify a default date format to use for cells containing date data.
SheetJSDev commented 2014-10-11 20:39:35 +00:00 (Migrated from github.com)

@rotemtam you probably know this, but for posterity's sake let's review dates in Excel:

Generally Excel stores dates as numbers with a special formatting. For example, today is October 11 2014. If you use the formula =TODAY() you see a date rendering, but that comes from the number format. If you change the format to General you will see the date code 41923.

Right now only the formatting library is aware that the number is a date. To make the change you propose, we would need to make the core library more aware of dates. The big stumbling block, which is why we store the date code rather than storing the raw date object, is a bug that was replicated from Lotus 1-2-3: 1900 was not a leap year but Lotus treated it as one. Who would have thought a 30+ year old bug would haunt us today?

One solution, as discussed in https://github.com/SheetJS/js-xlsx/issues/126 ( h/t @elad ), is to just store the raw dates internally. I've been playing with it, and quirks aside it works fairly well (you see the problems when trying to perform date arithmetic -- some versions of excel store the difference as a date, and the dreaded 1900 bug shows up at 60 days)

Another solution (which fakes a "default date") is to add an override to the output functions which would effectively apply the approach I suggested earlier (reformat cells that are assumed to be dates).

A third solution involves some not-yet-released logic: as part of https://github.com/SheetJS/js-xlsx/issues/78 ( @sysarchitect the issue was not forgotten :) the whole localization logic is changing. It is possible to create a fake locale with the default date.

The third option definitely will be implemented. The second option can be added really quickly. The first may happen, although I'm not sure when. Which sounds like the most appealing option?

@rotemtam you probably know this, but for posterity's sake let's review dates in Excel: Generally Excel stores dates as numbers with a special formatting. For example, today is October 11 2014. If you use the formula `=TODAY()` you see a date rendering, but that comes from the number format. If you change the format to `General` you will see the date code `41923`. Right now only the formatting library is aware that the number is a date. To make the change you propose, we would need to make the core library more aware of dates. The big stumbling block, which is why we store the date code rather than storing the raw date object, is a bug that was replicated from Lotus 1-2-3: 1900 was not a leap year but Lotus treated it as one. Who would have thought a 30+ year old bug would haunt us today? One solution, as discussed in https://github.com/SheetJS/js-xlsx/issues/126 ( h/t @elad ), is to just store the raw dates internally. I've been playing with it, and quirks aside it works fairly well (you see the problems when trying to perform date arithmetic -- some versions of excel store the difference as a date, and the dreaded 1900 bug shows up at 60 days) Another solution (which fakes a "default date") is to add an override to the output functions which would effectively apply the approach I suggested earlier (reformat cells that are assumed to be dates). A third solution involves some not-yet-released logic: as part of https://github.com/SheetJS/js-xlsx/issues/78 ( @sysarchitect the issue was not forgotten :) the whole localization logic is changing. It is possible to create a fake locale with the default date. The third option definitely will be implemented. The second option can be added really quickly. The first may happen, although I'm not sure when. Which sounds like the most appealing option?
rotemtam commented 2014-10-12 14:00:07 +00:00 (Migrated from github.com)

@SheetJSDev not sure I understand all the options, but thanks for the thorough replies and explanations.

Not sure what I'm suggesting fits in one of these three. What I was thinking is that I as a user of the library should be able to do is something like:

wb = XLSX.read(data, {'type': 'binary'});
ws = wb.Sheets[wb.SheetNames[0]];
data = XLSX.utils.sheet_to_row_object_array(ws, {'date_format':'dd/mm/yyyy'});

The reason i think we should be able to override whatever the user did in the original excel file, is that we get all sorts of funky data from all sorts of users and we need to be able to standardize it and process it.

Also what @elad suggests sounds useful to me, if i could tell by looking at a cell object whether it contains date data i could scan the results and deal with them myself, currently i could not find an easy way to tell that a cell contains date data.

@SheetJSDev not sure I understand all the options, but thanks for the thorough replies and explanations. Not sure what I'm suggesting fits in one of these three. What I was thinking is that I as a user of the library should be able to do is something like: ``` js wb = XLSX.read(data, {'type': 'binary'}); ws = wb.Sheets[wb.SheetNames[0]]; data = XLSX.utils.sheet_to_row_object_array(ws, {'date_format':'dd/mm/yyyy'}); ``` The reason i think we should be able to override whatever the user did in the original excel file, is that we get all sorts of funky data from all sorts of users and we need to be able to standardize it and process it. Also what @elad suggests sounds useful to me, if i could tell by looking at a cell object whether it contains date data i could scan the results and deal with them myself, currently i could not find an easy way to tell that a cell contains date data.
elad commented 2014-10-13 16:21:23 +00:00 (Migrated from github.com)

@rotemtam do you need the date in a consistent format so that you're able to parse it into a javascript date object (or a moment.js object or whatever)?

@rotemtam do you need the date in a consistent format so that you're able to parse it into a javascript date object (or a moment.js object or whatever)?
rotemtam commented 2014-10-13 20:56:03 +00:00 (Migrated from github.com)

@elad exactly!

@elad exactly!
elad commented 2014-10-14 11:48:10 +00:00 (Migrated from github.com)

@SheetJSDev I'd say this is another instance of #126 then...

@SheetJSDev I'd say this is another instance of #126 then...
SheetJSDev commented 2017-03-22 07:52:42 +00:00 (Migrated from github.com)

Note: these changes are in the repo and will appear in version 0.9.5.

If you read in with cellDates:true, the date-stored-as-number cells will be proper dates.

If you set the dateNF key of the options object to a number format string, the formatter will use it. Following the example from my previous comment:

> var wb = XLSX.readFile('write.xlsx', {cellDates:true});
> wb.Sheets.SheetJS.C3
{ t: 'd', v: 2014-02-19T14:30:00.000Z, w: '2/19/14' } // <-- note: v is a date object
> delete wb.Sheets.SheetJS.C3.w
true
> XLSX.utils.format_cell(wb.Sheets.SheetJS.C3);
'2/19/14' <-- defaults to the number format code 14
> delete wb.Sheets.SheetJS.C3.w
true
> XLSX.utils.format_cell(wb.Sheets.SheetJS.C3, null, {dateNF:"YYYY-MM-DD"});
'2014-02-19' <-- uses the override
Note: these changes are in the repo and will appear in version 0.9.5. If you read in with `cellDates:true`, the date-stored-as-number cells will be proper dates. If you set the `dateNF` key of the options object to a number format string, the formatter will use it. Following the example from [my previous comment](https://github.com/SheetJS/js-xlsx/issues/134#issuecomment-58664998): ```js > var wb = XLSX.readFile('write.xlsx', {cellDates:true}); > wb.Sheets.SheetJS.C3 { t: 'd', v: 2014-02-19T14:30:00.000Z, w: '2/19/14' } // <-- note: v is a date object > delete wb.Sheets.SheetJS.C3.w true > XLSX.utils.format_cell(wb.Sheets.SheetJS.C3); '2/19/14' <-- defaults to the number format code 14 > delete wb.Sheets.SheetJS.C3.w true > XLSX.utils.format_cell(wb.Sheets.SheetJS.C3, null, {dateNF:"YYYY-MM-DD"}); '2014-02-19' <-- uses the override ```
EvanCarroll commented 2019-02-27 19:30:45 +00:00 (Migrated from github.com)

Where is the arguments to format_cell even documented? I just see it mentioned in the docs.

Where is the arguments to `format_cell` even documented? I just see it mentioned in the docs.
Sign in to join this conversation.
No Milestone
No Assignees
1 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: sheetjs/sheetjs#134
No description provided.