8.3 KiB
sidebar_position |
---|
3 |
Dates and Times
Lotus 1-2-3, Excel, and other spreadsheet software do not have a true concept of date or time. Instead, dates and times are stored as offsets from an epoch. The magic behind date interpretations is hidden in functions or number formats.
SheetJS attempts to create a friendly JS date experience while also exposing options to use the traditional date codes
How Spreadsheets Understand Time
Excel stores dates as numbers. When displaying dates, the format code should
include special date and time tokens like yyyyy
for long year. EDATE
and
other date functions operate on and return date numbers.
For date formats like yyyy-mm-dd
, the integer part represents the number of
days from a starting epoch. For example, the date 19-Feb-17
is stored as the
number 42785
with a number format of d-mmm-yy
.
The fractional part of the date code serves as the time marker. Excel assumes
each day has exactly 86400 seconds. For example, the date code 0.25
has a
time component corresponding to 6:00 AM.
For absolute time formats like [hh]:mm
, the integer part represents a whole
number of 24-hour (or 1440 minute) intervals. The value 1.5
in the format
[hh]:mm
is interpreted as 36 hours 0 minutes.
Date and Time Number Formats
Assuming a cell has a formatted date, re-formatting as "General" will reveal
the underlying value. Alternatively, the TEXT
function can be used to return
the date code.
The following table covers some common formats:
Common Date-Time Formats (click to hide)
Fragment | Interpretation |
---|---|
yy |
Short (2-digit) year |
yyyy |
Long (4-digit) year |
m |
Short (1-digit) month |
mm |
Long (2-digit) month |
mmm |
Short (3-letter) month name |
mmmm |
Full month name |
mmmmm |
First letter of month name |
d |
Short (1-digit) day of month |
dd |
Long (2-digit) day of month |
ddd |
Short (3-letter) day of week |
dddd |
Full day of week |
h |
Short (1-digit) hours |
hh |
Long (2-digit) hours |
m |
Short (1-digit) minutes |
mm |
Long (2-digit) minutes |
s |
Short (1-digit) seconds |
ss |
Long (2-digit) seconds |
A/P |
Meridiem ("A" or "P") |
AM/PM |
Meridiem ("AM" or "PM") |
:::note
m
and mm
are context-dependent. It is interpreted as "minutes" when the
previous or next date token represents a time (hours or seconds):
yyyy-mm-dd hh:mm:ss
^^ ^^
month minutes
:::
1904 and 1900 Date Systems
The interpretation of date codes requires a shared understanding of date code
0
, otherwise known as the "epoch". Excel supports two epochs:
-
The default epoch is "January 0 1900". The
0
value is 00:00 on December 31 of the year 1899, but it is formatted as January 0 1900. -
Enabling "1904 Date System" sets the default epoch to "January 1 1904". The
0
value is 00:00 on January 1 of the year 1904.
The workbook's epoch can be determined by examining the workbook's wb.Workbook.WBProps.date1904
property:
if(!(wb?.Workbook?.WBProps?.date1904)) {
/* uses 1904 date system */
} else {
/* uses 1900 date system */
}
:::note Why does the 1904 date system exist?
1900 was not a leap year. For the Gregorian calendar, the general rules are:
- every multiple of 400 is a leap year
- every multiple of 100 that is not a multiple of 400 is not a leap year
- every multiple of 4 that is not a multiple of 100 is a leap year
- all other years are not leap years.
Lotus 1-2-3 erroneously treated 1900 as a leap year. This can be verified with
the @date
function:
@date(0,2,28) -> 59 // Lotus accepts 2/28/1900
@date(0,2,29) -> 60 // <--2/29/1900 was not a real date
@date(0.2,30) -> ERR // Lotus rejects 2/30/1900
Excel extends the tradition in the default date system. The 1904 date system starts the count in 1904, skipping the bad date.
:::
Relative Epochs
The epoch is based on the system timezone. The epoch in New York is midnight in Eastern time, while the epoch in Seattle is midnight in Pacific time.
This design has the advantage of uniform time display: "12 PM" is 12 PM irrespective of the timezone of the viewer. However, this design precludes any international coordination (there is no way to create a value that represents an absolute time) and makes JavaScript processing somewhat ambiguous (since JavaScript Date objects are timezone-aware)
This is a deficiency of the spreadsheet software. Excel has no native concept of universal time.
The library attempts to normalize the dates. All times are specified in the local time zone. SheetJS cannot magically fix the technical problems with Excel and other spreadsheet software, but this represents .
How Files Store Dates and Times
XLS, XLSB, and most binary formats store the raw date codes. Special number formats are used to indicate that the values are intended to be dates/times.
CSV and other text formats typically store actual formatted date values. They are interpreted as dates and times in the user timezone.
XLSX actually supports both! Typically dates are stored as n
numeric cells,
but the format supports a special type d
where the data is an ISO 8601 date
string. This is not used in the default Excel XLSX export and third-party
support is poor.
ODS does support absolute time values but drops the actual timezone indicator when parsing. In that sense, LibreOffice follows the same behavior as Excel.
How SheetJS handles Dates and Times
The default behavior for all parsers is to generate number cells. Passing the
cellDates
to true will force the parsers to store dates:
// cell A1 will be { t: 'n', v: 54337 }
var wb_sans_date = XLSX.read("10/6/2048", {type:"binary"});
// cell A1 will be { t: 'd', v: <Date object representing 2048 October 06> }
var wb_with_date = XLSX.read("10/6/2048", {type:"binary", cellDates: true});
When writing, date cells are automatically translated back to numeric cells with an appropriate number format.
The actual values stored in cells are intended to be correct from the perspective of an Excel user in the current timezone.
The value formatting logic understands date formats and converts when relevant.
Utility Functions
Utility functions that deal with JS data accept a cellDates
argument which
dictates how dates should be handled.
Functions that create a worksheet will adjust date cells and use a number
format like m/d/yy
to mark dates:
// Cell A1 will be a numeric cell whose value is the date code
var ws = XLSX.utils.aoa_to_sheet([[new Date()]]);
// Cell A1 will be a date cell
var ws = XLSX.utils.aoa_to_sheet([[new Date()]], { cellDates: true });
Functions that create an array of JS objects with raw values will keep the native representation:
// Cell A1 is numeric -> output is a number
var ws = XLSX.utils.aoa_to_sheet([[new Date()]]);
var A1 = XLSX.utils.sheet_to_json(ws, { header: 1 })[0][0];
// Cell A1 is a date -> output is a date
var ws = XLSX.utils.aoa_to_sheet([[new Date()]], { cellDates: true });
var A1 = XLSX.utils.sheet_to_json(ws, { header: 1 })[0][0];
Number Formats
By default, the number formats are not emitted. For Excel-based file formats,
passing the option cellNF: true
adds the z
field.
The helper function XLSX.SSF.is_date
parses formats and returns true
if the
format represents a date or time:
XLSX.SSF.is_date("yyyy-mm-dd"); // true
XLSX.SSF.is_date("0.00"); // false
Live Demo (click to show)
function SSFIsDate() {
const [format, setFormat] = React.useState("yyyy-mm-dd");
const cb = React.useCallback((evt) => {
setFormat(evt.target.value);
});
const is_date = XLSX.SSF.is_date(format);
return ( <>
<div>Format <b>|{format}|</b> is {is_date ? "" : "not"} a date/time</div>
<input type="text" onChange={cb}/>
</> );
}