---
title: File Formats
sidebar_position: 1
pagination_prev: api/utilities/index
---

SheetJS supports reading and writing a number of spreadsheet file formats.

| Format                                                       | Read  | Write |
|:-------------------------------------------------------------|:-----:|:-----:|
| **Excel Worksheet/Workbook Formats**                         |:-----:|:-----:|
| Excel 2007+ XML Formats (XLSX/XLSM)                          |   ✔   |   ✔   |
| Excel 2007+ Binary Format (XLSB BIFF12)                      |   ✔   |   ✔   |
| Excel 2003-2004 XML Format (XML "SpreadsheetML")             |   ✔   |   ✔   |
| Excel 97-2004 (XLS BIFF8)                                    |   ✔   |   ✔   |
| Excel 5.0/95 (XLS BIFF5)                                     |   ✔   |   ✔   |
| Excel 4.0 (XLS/XLW BIFF4)                                    |   ✔   |   ✔   |
| Excel 3.0 (XLS BIFF3)                                        |   ✔   |   ✔   |
| Excel 2.0/2.1 / Multiplan 4.x DOS (XLS BIFF2)                |   ✔   |   ✔   |
| **Excel Supported Text Formats**                             |:-----:|:-----:|
| Delimiter-Separated Values (CSV/TXT)                         |   ✔   |   ✔   |
| Data Interchange Format (DIF)                                |   ✔   |   ✔   |
| Symbolic Link (SYLK/SLK)                                     |   ✔   |   ✔   |
| Lotus Formatted Text (PRN)                                   |   ✔   |   ✔   |
| UTF-16 Unicode Text (TXT)                                    |   ✔   |   ✔   |
| **Other Workbook/Worksheet Formats**                         |:-----:|:-----:|
| Numbers 3.0+ / iWork 2013+ Spreadsheet (NUMBERS)             |   ✔   |   ✔   |
| WPS 电子表格 (ET)                                            |   ✔   |       |
| OpenDocument Spreadsheet (ODS)                               |   ✔   |   ✔   |
| Flat XML ODF Spreadsheet (FODS)                              |   ✔   |   ✔   |
| Uniform Office Format Spreadsheet (标文通 UOS1/UOS2)         |   ✔   |       |
| dBASE II/III/IV / Visual FoxPro (DBF)                        |   ✔   |   ✔   |
| Lotus 1-2-3 (WK1/WK3)                                        |   ✔   |   ✔   |
| Lotus 1-2-3 (WKS/WK2/WK4/123)                                |   ✔   |       |
| Quattro Pro Spreadsheet (WQ1/WQ2/WB1/WB2/WB3/QPW)            |   ✔   |       |
| Works 1.x-3.x DOS / 2.x-5.x Windows Spreadsheet (WKS)        |   ✔   |       |
| Works 6.x-9.x Spreadsheet (XLR)                              |   ✔   |       |
| **Other Common Spreadsheet Output Formats**                  |:-----:|:-----:|
| HTML Tables                                                  |   ✔   |   ✔   |
| Rich Text Format tables (RTF)                                |   ✔   |   ✔   |
| Ethercalc Record Format (ETH)                                |   ✔   |   ✔   |

![graph of format support](pathname:///formats.png)

![graph legend](pathname:///legend.png)

Features not supported by a given file format will not be written.

## Worksheet Range Limits

Formats with range limits will be silently truncated.
For example, the Lotus WKS format has a limit of 2048 rows, so data after the 2048th row will not be saved.

| Format                                    | Last Cell  | Max Cols | Max Rows |
|:------------------------------------------|:-----------|---------:|---------:|
| Excel 2007+ XML Formats (XLSX/XLSM)       |`XFD1048576`|    16384 |  1048576 |
| Excel 2007+ Binary Format (XLSB BIFF12)   |`XFD1048576`|    16384 |  1048576 |
| Numbers 13.1 (NUMBERS)                    |`ALL1000000`|     1000 |  1000000 |
| Quattro Pro 9+ (QPW)                      |`IV1000000 `|      256 |  1000000 |
| Excel 97-2004 (XLS BIFF8)                 |`IV65536   `|      256 |    65536 |
| Excel 5.0/95 (XLS BIFF5)                  |`IV16384   `|      256 |    16384 |
| Excel 4.0 (XLS BIFF4)                     |`IV16384   `|      256 |    16384 |
| Excel 3.0 (XLS BIFF3)                     |`IV16384   `|      256 |    16384 |
| Excel 2.0/2.1 (XLS BIFF2)                 |`IV16384   `|      256 |    16384 |
| Lotus 1-2-3 R2 - R5 (WK1/WK3/WK4)         |`IV8192    `|      256 |     8192 |
| Lotus 1-2-3 R1 (WKS)                      |`IV2048    `|      256 |     2048 |

Excel 2003 SpreadsheetML range limits are governed by the version of Excel and are not enforced by the writer.

## Common File Formats

#### Excel 2007+ XML (XLSX/XLSM)

XLSX and XLSM files are ZIP containers containing a series of XML files in accordance with the Open Packaging Conventions (OPC). The XLSM format, almost identical to XLSX, is used for files containing macros.

The format is standardized in `ECMA-376` and `ISO/IEC 29500`. Excel does not follow the specification, and there are additional documents discussing how Excel deviates from the specification.

#### Excel 2.0-95 (BIFF2/BIFF3/BIFF4/BIFF5)

BIFF 2/3 XLS are single-sheet streams of binary records. Excel 4 introduced the concept of a workbook (`XLW` files) but also had single-sheet `XLS` format. The structure is largely similar to the Lotus 1-2-3 file formats. BIFF5/8/12 extended the format in various ways but largely stuck to the same record format.

Multiplan 4 "Normal" files are identical in structure to BIFF2 and use the same cell value records. There are some different record types for more advanced features like Print Settings.
The BIFF2 writer generates files that can be read in Multiplan 4, and the parser can extract values from "Normal" files.

There is no official specification for any of these formats. Excel 95 can write files in these formats, so record lengths and fields were determined by writing in all of the supported formats and comparing files. Excel 2016 can generate BIFF5 files, enabling a full suite of file tests starting from XLSX or BIFF2.

#### Excel 97-2004 Binary (BIFF8)

BIFF8 exclusively uses the Compound File Binary container format, splitting some content into streams within the file. At its core, it still uses an extended version of the binary record format from older versions of BIFF.

The `MS-XLS` specification covers the basics of the file format, and other specifications expand on serialization of features like properties.

#### Excel 2003-2004 (SpreadsheetML)

Predating XLSX, SpreadsheetML files are simple XML files. There is no official and comprehensive specification, although Microsoft has released documentation on the format. Since Excel 2016 can generate SpreadsheetML files, mapping features is straightforward.

#### Excel 2007+ Binary (XLSB, BIFF12)

Introduced in parallel with XLSX, the XLSB format combines the BIFF architecture with the content separation and ZIP container of XLSX. For the most part, nodes in an XLSX sub-file can be mapped to XLSB records in a corresponding sub-file.

The `MS-XLSB` specification covers the basics of the file format, and other specifications expand on serialization of features like properties.

#### Delimiter-Separated Values (CSV/TXT)

Excel CSV deviates from RFC4180 in a number of important ways. The generated CSV files should generally work in Excel although they may not work in RFC4180 compatible readers. The parser should generally understand Excel CSV. The writer proactively generates cells for formulae if values are unavailable.

Excel TXT uses tab as the delimiter and code page 1200.
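
To make the TXT convention concrete, here is a minimal sketch of tab-delimited text encoded in code page 1200 (UTF-16LE). It does not use the SheetJS API and is not how the library's writer is implemented: `rowsToUnicodeText` and `quoteField` are hypothetical helpers, and the leading byte order mark, the CRLF line endings, and the quoting rule are assumptions made for illustration. It relies on the Node.js `Buffer` global.

```js
// Hypothetical helper (not part of SheetJS): wrap a field in double quotes
// when it contains a tab, quote, or line break, doubling embedded quotes.
function quoteField(value) {
  if (/["\t\r\n]/.test(value)) return '"' + value.replace(/"/g, '""') + '"';
  return value;
}

// Hypothetical helper (not part of SheetJS): serialize an array of rows as
// tab-delimited text and encode it in code page 1200 (UTF-16LE).
function rowsToUnicodeText(rows) {
  const text = rows
    .map((row) => row.map((cell) => quoteField(String(cell ?? ""))).join("\t"))
    .join("\r\n"); // assumption: CRLF line endings
  // Assumption: prefix a byte order mark (bytes FF FE) so that readers can
  // detect the UTF-16LE encoding.
  return Buffer.from("\ufeff" + text, "utf16le");
}

const bytes = rowsToUnicodeText([["Name", "Value"], ["π", 3.14]]);
console.log(bytes.length);         // total number of encoded bytes
console.log(bytes.subarray(0, 2)); // the byte order mark
```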
Like in Excel, files starting with `0x49 0x44 ("ID")` are treated as Symbolic Link files. Unlike Excel, if the file does not have a valid SYLK header, it will be proactively reinterpreted as CSV. There are some files with semicolon delimiter that align with a valid SYLK file. For the broadest compatibility, all cells with the value of `ID` are automatically wrapped in double-quotes.

#### HTML

Excel HTML worksheets include special metadata encoded in styles. For example, `mso-number-format` is a localized string containing the number format. Despite the metadata the output is valid HTML, although it does accept bare `&` symbols.

The writer adds type metadata to the TD elements via the `t` tag. The parser looks for those tags and overrides the default interpretation. For example, text like `