Invalid date recognition #1646
Labels
No Label
DBF
Dates
Defined Names
Features
Formula
HTML
Images
Infrastructure
Integration
International
ODS
Operations
Performance
PivotTables
Pro
Protection
Read Bug
SSF
SYLK
Style
Write Bug
good first issue
No Milestone
No Assignees
1 Participants
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: sheetjs/sheetjs#1646
Loading…
Reference in New Issue
No description provided.
Delete Branch "%!s(<nil>)"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
fuzzydate() function incorrectly guesses date due to invalid regex pattern. For example, let's look at "Trollahaugen 64" text. First, it tries to convert it to date and it ends up being 01.01.1964 due to 64 at the end. Then it proceeds with looking for a month using /jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec/ regex. This is the source of error, since the text contains "aug" inside but it's not a separate word, hence should not be considered a month. Yet, the function assumes it is and returns 01.01.1964.
Just so we're on the same page: the root problem is the incredible flexibility of the Date constructor in V8 and other JS engines. In chrome,
new Date("Trollahaugen 64")
is interpreted as the beginning of the year 1964 (Jan 1 0:00 local time).There are a handful of workarounds for common cases we've seen but it's not pretty. The "right" answer is to skip the date constructor entirely and just regex against a series of culture-specific formats
I understand the inherent problem with solving such issue - there always will be some edge cases. Working around this particular one seems to be trivial - simply expect the month name to be a whole word, surrounded with non-alphabetic characters, if we assume (which might not be true) that the string is a ECMA-262 Date.toString() representation.
Hi, I had a similar issue with "SEPEM GRENOBLE 2018" text which is transformed in 9/1/18 date.
Moving the discussion to https://github.com/SheetJS/sheetjs/issues/1300
@krojew The spec uses the phrase "implementation-dependent":
That Date Time String Format is related to ISO 8601 and does not cover the set of date expressions understood by Excel