Support x: tags in styles.xml #1114

Closed
tombousso wants to merge 0 commits from master into master
tombousso commented 2018-05-22 17:53:35 +00:00 (Migrated from github.com)

The XML tags in `styles.xml` sometimes carry an extra `x:` namespace prefix, which breaks `parse_sty_xml`:

<x:cellXfs count="2">
    <x:xf numFmtId="0" fontId="0" fillId="0" borderId="0" xfId="0" />
    <x:xf numFmtId="14" fontId="0" fillId="0" borderId="0" xfId="0" applyNumberFormat="1" />
</x:cellXfs>

(see https://social.msdn.microsoft.com/Forums/vstudio/en-US/a72442fd-a1a6-446e-9416-876f7669d8e2/openxml-excel-date-formatting for more examples)

This is a simple fix for the problem

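To illustrate why the prefix matters, here is a minimal sketch. Both patterns below are hypothetical and only modeled on the shape of the `numFmts` regex quoted later in this thread; they are not the actual expressions inside `parse_sty_xml`.

```js
// Hypothetical patterns for illustration only; not copied from js-xlsx.
// Matches only the unprefixed tag name.
var strictRegex = /<cellXfs([^>]*)>[\S\s]*?<\/cellXfs>/;
// Also accepts an optional namespace prefix such as "x:".
var prefixRegex = /<(?:\w+:)?cellXfs([^>]*)>[\S\s]*?<\/(?:\w+:)?cellXfs>/;

var prefixed = '<x:cellXfs count="2"><x:xf numFmtId="14" applyNumberFormat="1" /></x:cellXfs>';
var plain = '<cellXfs count="2"><xf numFmtId="14" applyNumberFormat="1" /></cellXfs>';

var strictSeesPrefixed = strictRegex.test(prefixed); // the "x:" tag is skipped
var prefixSeesPrefixed = prefixRegex.test(prefixed); // the "x:" tag is found
var prefixSeesPlain = prefixRegex.test(plain);       // unprefixed files still match

console.log(strictSeesPrefixed, prefixSeesPrefixed, prefixSeesPlain);
```

If the style block is never matched, the `numFmtId="14"` entry is never read, which would be consistent with the date cell coming through as a plain number.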
SheetJSDev commented 2018-05-22 17:58:12 +00:00 (Migrated from github.com)

@tombousso Do you have, or can you generate, a sample file?

tombousso commented 2018-05-22 18:17:23 +00:00 (Migrated from github.com)

[problem.xlsx](https://github.com/SheetJS/js-xlsx/files/2028051/problem.xlsx)

Here's a sample file. In Excel and LibreOffice the cell is recognized as a date, but `js-xlsx` does not currently recognize it as a date.
SheetJSDev commented 2018-05-22 18:35:49 +00:00 (Migrated from github.com)

Thanks for sharing! Oddly, that's the only part that uses namespaced XML. Some strict files use namespaced XML in the worksheet and workbook XML but not in the styles, go figure.

FYI the new expressions change the capture group positions. The fix should make those groups non-capturing by [putting `?:` just after the open parenthesis -- see Note 1 from ECMA-262 v5.1 section 15.10.2.8](http://ecma-international.org/ecma-262/5.1/#sec-15.10.2.8):

```js
var numFmtRegex = /<(?:\w+:)?numFmts([^>]*)>[\S\s]*?<\/(?:\w+:)?numFmts>/;
```
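A small sketch of the capture-group shift described above. The `capturing` pattern is an assumption about what a plain parenthesized prefix group would look like (the PR's actual expressions are not shown in this thread), and the sample XML string is made up for illustration.

```js
// Made-up sample input; not taken from problem.xlsx.
var xml = '<x:numFmts count="1"><x:numFmt numFmtId="164" formatCode="yyyy-mm-dd" /></x:numFmts>';

// Assumed variant: the prefix is wrapped in a capturing group,
// so the attribute string moves from index 1 to index 2.
var capturing = /<(\w+:)?numFmts([^>]*)>[\S\s]*?<\/(\w+:)?numFmts>/;

// Variant from the comment above: "?:" keeps the prefix group out of the
// match array, so the attribute string stays at index 1.
var nonCapturing = /<(?:\w+:)?numFmts([^>]*)>[\S\s]*?<\/(?:\w+:)?numFmts>/;

var a = xml.match(capturing);
var b = xml.match(nonCapturing);

console.log(a[1], "|", a[2]); // prefix, then attributes
console.log(b[1]);            // attributes only
```

Any downstream code that reads index 1 expecting the attribute string would keep working with the non-capturing form, whereas with the capturing form it would receive the prefix instead.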
tombousso commented 2018-05-22 22:41:08 +00:00 (Migrated from github.com)

Thanks for the fix!

SheetJSDev commented 2018-05-22 22:42:33 +00:00 (Migrated from github.com)

We amended the commit, so unfortunately GitHub will show this PR as closed. If you link your email address to your account, it will show you as a contributor to the project.


Pull request closed
