dateNF seems to be ignored #718

Closed
opened 2017-07-06 15:25:11 +00:00 by trading-peter · 14 comments
trading-peter commented 2017-07-06 15:25:11 +00:00 (Migrated from github.com)

Hey there,

First of all, I've prepared a minimal demo of the issue I'm running into. You can find it here: https://github.com/pkaske/xlsx-date-issue

You should just need to clone, npm install and node ./index.js it to reproduce.

Input data
input.xlsx contains the following data:
auswahl_077

The date format you see here is German (DD.MM.YYYY).
The cells are formated as dates in excel (2010).

What I want to accomplish
I want to convert that table to CSV and keep the date format as it is displayed in the screenshot (DD.MM.YYY).

This is what I tried

const stream = XLSX.stream.to_csv(worksheet, {
  FS: ';',
  RS: '\n',
  dateNF: 'dd.mm.yyyy',   // <--- Seems to be ignored.
  strip: false,
  blankrows: true
});

stream.pipe(Fs.createWriteStream(output))
  .on('finish', () => {
    console.log(output, 'written');
  });

The actual output is

Name;Date
Person 1;12/31/18
Person 2;2/13/18
Person 3;4/5/18

Expected output
The dates should read 31.12.2018, 13.02.2018 and 05.04.2018.

Hopefully I just did something silly and that's all :)

Hey there, First of all, I've prepared a minimal demo of the issue I'm running into. You can find it here: https://github.com/pkaske/xlsx-date-issue You should just need to `clone`, `npm install` and `node ./index.js` it to reproduce. **Input data** `input.xlsx` contains the following data: ![auswahl_077](https://user-images.githubusercontent.com/11567985/27918247-a054d4c2-626e-11e7-807d-3c0cc43fde25.jpg) The date format you see here is German (DD.MM.YYYY). The cells are formated as dates in excel (2010). **What I want to accomplish** I want to convert that table to CSV and keep the date format as it is displayed in the screenshot (DD.MM.YYY). **This is what I tried** ```js const stream = XLSX.stream.to_csv(worksheet, { FS: ';', RS: '\n', dateNF: 'dd.mm.yyyy', // <--- Seems to be ignored. strip: false, blankrows: true }); stream.pipe(Fs.createWriteStream(output)) .on('finish', () => { console.log(output, 'written'); }); ``` **The actual output is** ``` Name;Date Person 1;12/31/18 Person 2;2/13/18 Person 3;4/5/18 ``` **Expected output** The dates should read `31.12.2018`, `13.02.2018` and `05.04.2018`. Hopefully I just did something silly and that's all :)
SheetJSDev commented 2017-07-06 17:32:20 +00:00 (Migrated from github.com)

Both of us did something silly :)

tl;dr:

const workbook = XLSX.readFile(input, {cellText:false, cellDates:true});
// ...
const stream = XLSX.stream.to_csv(worksheet, {
  FS: ';',
  RS: '\n',
  dateNF: 'dd"."mm"."yyyy', // <-- notice the double quotes
  strip: false,
  blankrows: true
});

How does Excel handle international dates?

Excel dates are somewhat hokey. In the file, if you unzip and take a peek at xl/worksheets/sheet1.xml you'll see entries like

      <c r="B2" s="1">
        <v>43465</v>
      </c>

The reader has to figure out that the number corresponds to a date and convert it back. To do that it looks at the cell format from xl/styles.xml:

    <xf numFmtId="14" fontId="0" fillId="0" borderId="0" xfId="0" applyNumberFormat="1" applyFill="1"/>

All it stores is the number 14. Based in new york, your file looks like:

screen shot 2017-07-06 at 13 18 27

By fiddling with the regional settings, the same exact file will be displayed differently. https://github.com/SheetJS/js-xlsx/issues/326#issuecomment-286014758 is one comment showing a few different regional setting screenshots against Excel 95

What you should do

To override the date format, you must force the reader to ignore the original number format and force the writer to use a valid cell format.

On the read side, the line should be:

const workbook = XLSX.readFile(input, {cellText:false, cellDates:true});

the cellText:false option skips generation of the original strings (the w field is omitted) and cellDates:true forces generation of date objects for cells that are possibly dates (instead of the file's number values). After doing this, the worksheet date cells look like:

{
// ...
  B2: { t: 'd', v: 2018-12-31T00:00:00.000Z },
//...
}

On the write side, you need to quote the periods:

const stream = XLSX.stream.to_csv(worksheet, {
  FS: ';',
  RS: '\n',
  dateNF: 'dd"."mm"."yyyy',
  strip: false,
  blankrows: true
});

Why are quotes required in the date format?

The format displayed in the UI is not necessarily the same as the format displayed in the file. I took your file and changed one of the cells to follow the dd.mm.yyyy format in the UI. When saving the file, the actual format entry looks like:

    <numFmt numFmtId="165" formatCode="dd\.mm\.yyyy"/>

Excel is automagically fixing the decimal points as literal. The double quotes achieve the same effect.

What we should be doing

  1. The library should really take a locale option that sets everything up automatically. The ideal resolution is:
const workbook = XLSX.readFile(input, {locale:"en-US"});
  1. Common format errors should be autocorrected in the same way that Excel corrects them.
Both of us did something silly :) tl;dr: ```js const workbook = XLSX.readFile(input, {cellText:false, cellDates:true}); // ... const stream = XLSX.stream.to_csv(worksheet, { FS: ';', RS: '\n', dateNF: 'dd"."mm"."yyyy', // <-- notice the double quotes strip: false, blankrows: true }); ``` **How does Excel handle international dates?** Excel dates are somewhat hokey. In the file, if you unzip and take a peek at `xl/worksheets/sheet1.xml` you'll see entries like ```xml <c r="B2" s="1"> <v>43465</v> </c> ``` The reader has to figure out that the number corresponds to a date and convert it back. To do that it looks at the cell format from `xl/styles.xml`: ```xml <xf numFmtId="14" fontId="0" fillId="0" borderId="0" xfId="0" applyNumberFormat="1" applyFill="1"/> ``` All it stores is the number 14. Based in new york, your file looks like: <img width="256" alt="screen shot 2017-07-06 at 13 18 27" src="https://user-images.githubusercontent.com/6070939/27923804-aa3e5bdc-624d-11e7-94fe-341ce6a7b0bd.png"> By fiddling with the regional settings, the same exact file will be displayed differently. https://github.com/SheetJS/js-xlsx/issues/326#issuecomment-286014758 is one comment showing a few different regional setting screenshots against Excel 95 **What you should do** To override the date format, you must force the reader to ignore the original number format and force the writer to use a valid cell format. On the read side, the line should be: ```js const workbook = XLSX.readFile(input, {cellText:false, cellDates:true}); ``` the `cellText:false` option skips generation of the original strings (the `w` field is omitted) and `cellDates:true` forces generation of date objects for cells that are possibly dates (instead of the file's number values). After doing this, the worksheet date cells look like: ```js { // ... B2: { t: 'd', v: 2018-12-31T00:00:00.000Z }, //... } ``` On the write side, you need to quote the periods: ```js const stream = XLSX.stream.to_csv(worksheet, { FS: ';', RS: '\n', dateNF: 'dd"."mm"."yyyy', strip: false, blankrows: true }); ``` **Why are quotes required in the date format?** The format displayed in the UI is not necessarily the same as the format displayed in the file. I took your file and changed one of the cells to follow the `dd.mm.yyyy` format in the UI. When saving the file, the actual format entry looks like: ```xml <numFmt numFmtId="165" formatCode="dd\.mm\.yyyy"/> ``` Excel is automagically fixing the decimal points as literal. The double quotes achieve the same effect. **What we should be doing** 1) The library should really take a locale option that sets everything up automatically. The ideal resolution is: ```js const workbook = XLSX.readFile(input, {locale:"en-US"}); ``` 2) Common format errors should be autocorrected in the same way that Excel corrects them.
trading-peter commented 2017-07-07 06:34:07 +00:00 (Migrated from github.com)

Works like a charm 🎉🎉🎉
Thank you!

Works like a charm 🎉🎉🎉 Thank you!
SheetJSDev commented 2017-07-07 06:48:20 +00:00 (Migrated from github.com)

@pkaske good to hear that worked! Let's keep this issue open until the locale stuff is resolved

@pkaske good to hear that worked! Let's keep this issue open until the locale stuff is resolved
haio commented 2018-01-25 07:49:17 +00:00 (Migrated from github.com)

@SheetJSDev cellText:false seems not work, I set it to false but still got w in cell object.

@SheetJSDev `cellText:false` seems not work, I set it to false but still got `w` in cell object.
sonjadeissenboeck commented 2018-10-17 13:28:51 +00:00 (Migrated from github.com)

@SheetJSDev @pkaske I'm using the exact same code, however, it's still the default format that gets applied..
var data = XLSX.utils.sheet_to_json(worksheet, {dateNF: 'dd"."mm"."yy', raw: false});
what's wrong with my code?

@SheetJSDev @pkaske I'm using the exact same code, however, it's still the default format that gets applied.. var data = XLSX.utils.sheet_to_json(worksheet, {dateNF: 'dd"."mm"."yy', raw: false}); what's wrong with my code?
Fabio-Pires commented 2018-10-30 10:34:49 +00:00 (Migrated from github.com)

Thank you, did work for me!

const workbook = XLSX.readFile(input, {cellText:false, cellDates:true});
var excelCSV = XLSX.utils.sheet_to_csv(worksheet, { FS: ";", RS: "\n" , strip: true, blankrows: false, skipHidden: true, dateNF: 'DD"/"MM"/"YYYY HH":"mm":"ss' });

Thank you, did work for me! const workbook = XLSX.readFile(input, {cellText:false, cellDates:true}); var excelCSV = XLSX.utils.sheet_to_csv(worksheet, { FS: ";", RS: "\n" , strip: true, blankrows: false, skipHidden: true, dateNF: 'DD"/"MM"/"YYYY HH":"mm":"ss' });
kalinchernev commented 2019-04-22 11:36:08 +00:00 (Migrated from github.com)

@sonjadeissenboeck regarding https://github.com/SheetJS/js-xlsx/issues/718#issuecomment-430628199, having same situation.

For me, it helped to work with the formatting in the data reading:

const workbook = XLSX.read(data, {
  cellDates: true,
  dateNF: 'dd/mm/yyyy',
});

And you leave the XLSX.utils.sheet_to_json({ raw: false })

@sonjadeissenboeck regarding https://github.com/SheetJS/js-xlsx/issues/718#issuecomment-430628199, having same situation. For me, it helped to work with the formatting in the data reading: ```javascript const workbook = XLSX.read(data, { cellDates: true, dateNF: 'dd/mm/yyyy', }); ``` And you leave the `XLSX.utils.sheet_to_json({ raw: false })`
TerryZ commented 2019-07-15 03:45:09 +00:00 (Migrated from github.com)

@sonjadeissenboeck regarding #718 (comment), having same situation.

For me, it helped to work with the formatting in the data reading:

const workbook = XLSX.read(data, {
  cellDates: true,
  dateNF: 'dd/mm/yyyy',
});

And you leave the XLSX.utils.sheet_to_json({ raw: false })

Thanks, XLSX.utils.sheet_to_json({ raw: false }) worked for me.

> @sonjadeissenboeck regarding [#718 (comment)](https://github.com/SheetJS/js-xlsx/issues/718#issuecomment-430628199), having same situation. > > For me, it helped to work with the formatting in the data reading: > > ```js > const workbook = XLSX.read(data, { > cellDates: true, > dateNF: 'dd/mm/yyyy', > }); > ``` > > And you leave the `XLSX.utils.sheet_to_json({ raw: false })` Thanks, `XLSX.utils.sheet_to_json({ raw: false })` worked for me.
elaliwat commented 2020-06-19 13:15:52 +00:00 (Migrated from github.com)

In my case, I solved it by your all help as below;

workbook.SheetNames.forEach(function (sheetName, index) {
        var roa = XLSX.utils.sheet_to_json(workbook.Sheets[sheetName], {
            raw: false,
            dateNF: 'dd"."mm"."yyyy',
        });
var sheets = ExcelUtils.readSheets(
            decodeURIComponent(this.state.tempFile.file),
            {   type: 'base64', 
                cellText:false, 
                cellDates:true
            }
        );
In my case, I solved it by your all help as below; ```js workbook.SheetNames.forEach(function (sheetName, index) { var roa = XLSX.utils.sheet_to_json(workbook.Sheets[sheetName], { raw: false, dateNF: 'dd"."mm"."yyyy', }); ``` ```js var sheets = ExcelUtils.readSheets( decodeURIComponent(this.state.tempFile.file), { type: 'base64', cellText:false, cellDates:true } ); ```
serena97 commented 2020-12-18 18:39:46 +00:00 (Migrated from github.com)

@SheetJSDev what format should I pass in for dates with deliberate whitespaces, eg. Czech dates 5. 1. 2021 (d. m. yyyy)?

I tried

  • raw: false

    • dateNF: 'd"." m"." yyyy', but it returned "30. 4. 2021".

    • dateNF: 'd". "m". "yyyy', but it returned "30. 4. 2021".

    • dateNF: 'd. m. yyyy', but it threw an error: bad second format

  • raw: true

    • dateNF: 'd"." m"." yyyy', but it returned date object May 01 2021 instead of Jan 05 2021

    • dateNF: 'd". "m". "yyyy', but it returned date object May 01 2021 instead of Jan 05 2021

    • dateNF: 'd. m. yyyy', but it threw an error: bad second format

Also, I'd like to ask, is there a option to completely disable date localisation and just get the date string as it is?

@SheetJSDev what format should I pass in for dates with deliberate whitespaces, eg. Czech dates 5. 1. 2021 (d. m. yyyy)? I tried - raw: false - dateNF: 'd"." m"." yyyy', but it returned "30. 4. 2021". - dateNF: 'd". "m". "yyyy', but it returned "30. 4. 2021". - dateNF: 'd. m. yyyy', but it threw an error: bad second format - raw: true - dateNF: 'd"." m"." yyyy', but it returned date object May 01 2021 instead of Jan 05 2021 - dateNF: 'd". "m". "yyyy', but it returned date object May 01 2021 instead of Jan 05 2021 - dateNF: 'd. m. yyyy', but it threw an error: bad second format Also, I'd like to ask, is there a option to completely disable date localisation and just get the date string as it is?
SheetJSDev commented 2020-12-18 19:14:33 +00:00 (Migrated from github.com)

@serena97 just to be clear, the "date string as it is" does not exist in XLSX or XLS or XLSB formats. They just store a number and a number format. That's why we need the formatting library (https://github.com/sheetjs/ssf).

As for delimiters, Excel generally stores the generic "/", which in certain contexts is to be interpreted as the date delimiter character in your current locale.

As for the actual date values (raw: false) what is your current locale? (Intl.DateTimeFormat().resolvedOptions().locale)

@serena97 just to be clear, the "date string as it is" does not exist in XLSX or XLS or XLSB formats. They just store a number and a number format. That's why we need the formatting library (https://github.com/sheetjs/ssf). As for delimiters, Excel generally stores the generic "/", which in certain contexts is to be interpreted as the date delimiter character in your current locale. As for the actual date values (raw: false) what is your current locale? (`Intl.DateTimeFormat().resolvedOptions().locale`)
serena97 commented 2020-12-18 19:41:31 +00:00 (Migrated from github.com)

@SheetJSDev Thanks for your prompt response! My current locale is 'en-US'.

Also, thanks for letting me know that "date string as it is" doesn't exist in XLSX/XLS/XLSB. I understand now that the dates must be interpreted based on the date formats of the system settings for these formats, but is there a option to disable date localisation for other formats such as csv at least?

@SheetJSDev Thanks for your prompt response! My current locale is 'en-US'. Also, thanks for letting me know that "date string as it is" doesn't exist in XLSX/XLS/XLSB. I understand now that the dates must be interpreted based on the date formats of the system settings for these formats, but is there a option to disable date localisation for other formats such as csv at least?
0416vidya commented 2021-06-01 10:14:12 +00:00 (Migrated from github.com)

@SheetJSDev I am trying for an XLS file and 'dateNF' in readFile didn't work for me. For the XLSX file, the dateNF worked fine.
For XLS file:
image

For XLSX file:
image

Could anyone help me with this.

@SheetJSDev I am trying for an XLS file and 'dateNF' in readFile didn't work for me. For the XLSX file, the dateNF worked fine. For XLS file: ![image](https://user-images.githubusercontent.com/42372699/120306986-19757600-c2f0-11eb-8f33-bdf8b3348ba7.png) For XLSX file: ![image](https://user-images.githubusercontent.com/42372699/120306867-f5b23000-c2ef-11eb-8b7d-de1e5abd7e77.png) Could anyone help me with this.
SheetJSDev commented 2021-09-17 07:42:35 +00:00 (Migrated from github.com)

@serena97 to disable date parsing for CSV, pass the option raw: true to XLSX.read or XLSX.readFile

@0416vidya #2305

@serena97 to disable date parsing for CSV, pass the option `raw: true` to `XLSX.read` or `XLSX.readFile` @0416vidya #2305
Sign in to join this conversation.
No Milestone
No Assignees
1 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: sheetjs/sheetjs#718
No description provided.