Spreadsheet Features
Even for basic features like date storage, the official Excel formats store the same content in different ways. The parsers are expected to convert from the underlying file format representation to the Common Spreadsheet Format. Writers are expected to serialize SheetJS workbooks in the underlying file format.
Row and Column Properties
Format Support
Row Properties: XLSX/M, XLSB, BIFF8 XLS, XLML, SYLK, DOM, ODS
Column Properties: XLSX/M, XLSB, BIFF8 XLS, XLML, SYLK, DOM
Row and Column properties are not extracted by default when reading from a file and are not persisted by default when writing to a file. The option cellStyles: true must be passed to the relevant read or write function.
Column Properties
The !cols array in each worksheet, if present, is a collection of ColInfo objects which have the following properties:
type ColInfo = {
/* visibility */
hidden?: boolean; // if true, the column is hidden
/* column width is specified in one of the following ways: */
wpx?: number; // width in screen pixels
width?: number; // width in Excel "Max Digit Width", width*256 is integral
wch?: number; // width in characters
/* other fields for preserving features from files */
level?: number; // 0-indexed outline / group level
MDW?: number; // Excel "Max Digit Width" unit, always integral
};
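As a brief illustration, the sketch below hides the second column of a worksheet. The worksheet is a bare object standing in for a SheetJS worksheet, and the helper name hideColumn is hypothetical; only the !cols array and the hidden property come from the description above.

```typescript
// Local copy of the ColInfo fields listed above.
type ColInfo = {
  hidden?: boolean;
  wpx?: number;
  width?: number;
  wch?: number;
  level?: number;
  MDW?: number;
};

// Minimal stand-in for a worksheet object; real worksheets also hold cells.
type SheetLike = { "!cols"?: ColInfo[] };

// Hypothetical helper: mark the column at 0-indexed position `idx` as hidden.
function hideColumn(ws: SheetLike, idx: number): void {
  // Create the !cols array and the ColInfo entry on demand.
  const cols: ColInfo[] = ws["!cols"] || (ws["!cols"] = []);
  const info: ColInfo = cols[idx] || (cols[idx] = {});
  info.hidden = true;
}

const wsCols: SheetLike = {};
hideColumn(wsCols, 1); // index 1 corresponds to column B
```

Entries for columns that were never touched stay undefined, so the array may be sparse.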
Row Properties
The !rows array in each worksheet, if present, is a collection of RowInfo objects which have the following properties:
type RowInfo = {
/* visibility */
hidden?: boolean; // if true, the row is hidden
/* row height is specified in one of the following ways: */
hpx?: number; // height in screen pixels
hpt?: number; // height in points
level?: number; // 0-indexed outline / group level
};
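The same pattern applies to rows. The following sketch sets a point height on the first row and hides the third row. The worksheet is again a bare stand-in object and rowInfoAt is a hypothetical helper, not a library function.

```typescript
// Local copy of the RowInfo fields listed above.
type RowInfo = { hidden?: boolean; hpx?: number; hpt?: number; level?: number };

// Minimal stand-in for a worksheet object.
type RowSheetLike = { "!rows"?: RowInfo[] };

// Hypothetical helper: return the RowInfo for 0-indexed row `idx`,
// creating the !rows array and the entry if they do not exist yet.
function rowInfoAt(ws: RowSheetLike, idx: number): RowInfo {
  const rows: RowInfo[] = ws["!rows"] || (ws["!rows"] = []);
  return rows[idx] || (rows[idx] = {});
}

const wsRows: RowSheetLike = {};
rowInfoAt(wsRows, 0).hpt = 24;      // first row: 24 points tall
rowInfoAt(wsRows, 2).hidden = true; // third row: hidden
```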
Outline / Group Levels Convention
The Excel UI displays the base outline level as 1 and the max level as 8.
Following JS conventions, SheetJS uses 0-indexed outline levels wherein the base outline level is 0 and the max level is 7.
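The offset between the two conventions can be expressed as a pair of small conversions. This is a sketch only: the function names are hypothetical, and the range check simply encodes the limits stated above.

```typescript
// Convert an outline level as shown in the Excel UI (1 to 8)
// to the 0-indexed level stored in the level property (0 to 7).
function uiLevelToStored(uiLevel: number): number {
  if (Math.floor(uiLevel) !== uiLevel || uiLevel < 1 || uiLevel > 8) {
    throw new RangeError("Excel UI outline levels run from 1 to 8");
  }
  return uiLevel - 1;
}

// Convert a stored 0-indexed level back to the number the Excel UI displays.
function storedLevelToUI(level: number): number {
  if (Math.floor(level) !== level || level < 0 || level > 7) {
    throw new RangeError("stored outline levels run from 0 to 7");
  }
  return level + 1;
}
```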
Why are there three width types?
There are three different width types corresponding to the three different ways spreadsheets store column widths:
- SYLK and other plain text formats use raw character count. Contemporaneous tools like VisiCalc and Multiplan were character-based. Since the characters had the same width, it sufficed to store a count. This tradition was continued into the BIFF formats.
- SpreadsheetML (2003) tried to align with HTML by standardizing on screen pixel count throughout the file. Column widths, row heights, and other measures use pixels. When the pixel and character counts do not align, Excel rounds values.
- XLSX internally stores column widths in a nebulous "Max Digit Width" form. The Max Digit Width is the width of the largest digit when rendered (generally the "0" character is the widest). The internal width is measured in units of the Max Digit Width and must be an integer multiple of 1/256, so width*256 is integral. ECMA-376 describes a formula for converting between pixels and the internal width. This represents a hybrid approach.
Read functions attempt to populate all three properties. Write functions will try to convert the specified values to the type that the output format requires. To avoid potential conflicts, code that changes one width property should delete the other properties first. For example, when changing the pixel width, delete the wch and width properties.
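A minimal sketch of that advice follows. The helper name setPixelWidth and the sample values are illustrative, not part of the library.

```typescript
// Width-related fields of a ColInfo object.
type WidthInfo = {
  wpx?: number;
  width?: number;
  wch?: number;
  MDW?: number;
  hidden?: boolean;
  level?: number;
};

// Hypothetical helper: set a new pixel width and drop the other two
// width representations so a writer cannot pick up a stale value.
function setPixelWidth(col: WidthInfo, px: number): WidthInfo {
  delete col.wch;
  delete col.width;
  col.wpx = px;
  return col;
}

// Example entry as a reader might populate it (values are illustrative).
const colD: WidthInfo = { wpx: 64, width: 9.140625, wch: 8.43, MDW: 7 };
setPixelWidth(colD, 120);
```

After the call, colD carries only the new wpx value plus the unrelated MDW field.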
Implementation details
Row Heights
Excel internally stores row heights in points. The default resolution is 72 DPI or 96 PPI, so the pixel and point size should agree. For different resolutions they may not agree, so the library separates the concepts.
Even though all of the information is made available, writers are expected to follow the priority order:
- use hpx pixel height if available
- use hpt point height if available
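The priority order above can be sketched as a small selection function. The name pickRowHeight and the shape of its return value are assumptions made for illustration; no unit conversion is attempted.

```typescript
// Height-related fields of a RowInfo object.
type RowHeightInfo = { hpx?: number; hpt?: number };

// Which field was chosen, and its value; null when no height is specified.
type PickedHeight = { unit: "px" | "pt"; value: number } | null;

// Hypothetical helper following the stated priority: hpx first, then hpt.
function pickRowHeight(row: RowHeightInfo): PickedHeight {
  if (row.hpx !== undefined) return { unit: "px", value: row.hpx };
  if (row.hpt !== undefined) return { unit: "pt", value: row.hpt };
  return null;
}
```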
Column Widths
Given the constraints, it is possible to determine the MDW without actually inspecting the font! The parsers guess the pixel width by converting from width to pixels and back, repeating for all possible MDW and selecting the value that minimizes the error. XLML actually stores the pixel width, so the guess works in the opposite direction.
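The search described above might look like the following sketch. It is not the library's code: the conversion formulas are paraphrased from the column width description in ECMA-376 and should be checked against the specification, the function names are hypothetical, and the candidate range of 1 to 15 is an assumption.

```typescript
// Internal width -> screen pixels for a given Max Digit Width (assumed formula).
function widthToPx(width: number, mdw: number): number {
  return Math.floor(((256 * width + Math.floor(128 / mdw)) / 256) * mdw);
}

// Screen pixels -> character count, with 5 pixels of padding (assumed formula).
function pxToChar(px: number, mdw: number): number {
  return Math.floor(((px - 5) / mdw) * 100 + 0.5) / 100;
}

// Character count -> internal width, truncated to a multiple of 1/256.
function charToWidth(chr: number, mdw: number): number {
  return Math.floor(((chr * mdw + 5) / mdw) * 256) / 256;
}

// Convert width -> pixels -> characters -> width for one candidate MDW.
function roundTripWidth(width: number, mdw: number): number {
  return charToWidth(pxToChar(widthToPx(width, mdw), mdw), mdw);
}

// Try every candidate MDW and keep the one whose round trip lands
// closest to the stored width. Ties keep the smaller candidate.
function guessMDW(width: number, minMDW: number = 1, maxMDW: number = 15): number {
  let best = minMDW;
  let bestErr = Infinity;
  for (let mdw = minMDW; mdw <= maxMDW; ++mdw) {
    const err = Math.abs(width - roundTripWidth(width, mdw));
    if (err < bestErr) {
      bestErr = err;
      best = mdw;
    }
  }
  return best;
}
```

A width produced by charToWidth for a known MDW round-trips exactly for that MDW, which is what makes the guess possible without font metrics.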
Even though all of the information is made available, writers are expected to follow the priority order:
- use width field if available
- use wpx pixel width if available
- use wch character count if available
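As with row heights, the priority order can be sketched as a selection function. The name pickColumnWidth and its return shape are assumptions for illustration; the function only selects a field and does not convert between units.

```typescript
// Width-related fields of a ColInfo object.
type ColWidthInfo = { width?: number; wpx?: number; wch?: number };

// Which field was chosen, and its value; null when no width is specified.
type PickedWidth = { source: "width" | "wpx" | "wch"; value: number } | null;

// Hypothetical helper following the stated priority: width, then wpx, then wch.
function pickColumnWidth(col: ColWidthInfo): PickedWidth {
  if (col.width !== undefined) return { source: "width", value: col.width };
  if (col.wpx !== undefined) return { source: "wpx", value: col.wpx };
  if (col.wch !== undefined) return { source: "wch", value: col.wch };
  return null;
}
```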