Compound File Binary Format
Pure-JS implementation of MS-CFB: Compound File Binary File Format, a container format used in many Microsoft file types (XLS, DOC, VBA blobs in XLSX and XLSB)
Installation
In the browser:
<script src="dist/cfb.min.js" type="text/javascript"></script>
With npm:
$ npm install cfb
The xlscfb.js file is designed to be embedded in js-xlsx.
Library Usage
In node:
var CFB = require('cfb');
For example, to get the Workbook content from an XLS file:
var cfb = CFB.read(filename, {type: 'file'});
var workbook = CFB.find(cfb, 'Workbook');
var data = workbook.content;
Command-Line Utility Usage
It is preferable to install the library globally with npm:
$ npm install -g cfb
The global installation adds a command cfb which can work with existing files:
- cfb file will extract the contents of the file to the current directory. It will make the corresponding subdirectories.
- cfb --list-files file will show a listing of the contained files. The format follows the unzip -l "short format".
- cfb --repair file will attempt to repair by reading and re-writing the file. This fixes some issues with files generated by non-standard tools.
JS API
TypeScript definitions are maintained in types/index.d.ts.
The CFB object exposes the following methods and properties:
CFB.parse(blob) takes a nodejs Buffer or an array of bytes and returns a parsed representation of the data.
CFB.read(blob, opts) wraps parse. opts.type controls the behavior:
- file: blob is interpreted as a file name that will be read
- base64: blob is interpreted as base64 string
- binary: blob is interpreted as binary string
- default: blob is interpreted as nodejs buffer or array of bytes
CFB.find(cfb, path) performs a case-insensitive match for the path (or file name, if there are no slashes) and returns an entry object or null if not found.
CFB.write(cfb, opts) generates a file based on the container. opts.type controls the behavior:
- base64: returns a base64 string
- binary: returns a binary string
- default: returns a nodejs buffer or array of bytes
CFB.writeFile(cfb, filename, opts) creates a file with the specified name.
Utility Functions
The utility functions are available in the CFB.utils object. Functions that accept a name argument strictly deal with absolute file names:
- .cfb_new(?opts) creates a new container object.
- .cfb_add(cfb, name, ?content, ?opts) adds a new file to the cfb.
- .cfb_del(cfb, name) deletes the specified file
- .cfb_mov(cfb, old_name, new_name) moves the old file to new path and name
Container Object Description
The objects returned by parse and read have the following properties:
- .FullPaths is an array of the names of all of the streams (files) and storages (directories) in the container. The paths are properly prefixed from the root entry (so the entries are unique)
- .FullPathDir is an object whose keys are entries in .FullPaths and whose values are objects with metadata and content (described below)
- .FileIndex is an array of the objects from .FullPathDir, in the same order as .FullPaths.
- .raw contains the raw header and sectors
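Because FullPaths and FileIndex share the same order, they can be walked in parallel. The sketch below is a hypothetical helper (listStreams is not part of the library) that collects the full path and size of every stream. It runs against a hand-built stand-in object rather than a real parsed file, and assumes type 2 marks a file, as noted in the entry description.

```javascript
// Hypothetical helper, not part of the cfb API: list every stream (file)
// in a container together with its full path and content length.
function listStreams(container) {
  var out = [];
  for (var i = 0; i < container.FullPaths.length; ++i) {
    var entry = container.FileIndex[i]; // same index as FullPaths[i]
    if (entry.type === 2) {
      out.push({
        path: container.FullPaths[i],
        size: entry.content ? entry.content.length : 0
      });
    }
  }
  return out;
}

// Hand-built stand-in for a parsed container, for illustration only.
var sample = {
  FullPaths: [
    'Root Entry/',
    'Root Entry/Storage/',
    'Root Entry/Storage/Stream',
    'Root Entry/Workbook'
  ],
  FileIndex: [
    { name: 'Root Entry', type: 5 },
    { name: 'Storage', type: 1 },
    { name: 'Stream', type: 2, content: [1, 2, 3] },
    { name: 'Workbook', type: 2, content: [] }
  ]
};

var streams = listStreams(sample);
console.log(streams);
```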
Entry Object Description
The entry objects are available from FullPathDir and FileIndex elements of the container object:
interface CFBEntry {
  name: string; /** Case-sensitive internal name */
  type: number; /** 1 = dir, 2 = file, 5 = root ; see [MS-CFB] 2.6.1 */
  content: Buffer | number[] | Uint8Array; /** Raw Content */
  ct?: Date; /** Creation Time */
  mt?: Date; /** Modification Time */
}
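The fields above can be turned into a one-line summary for logging or debugging. This is a minimal sketch with a hypothetical helper (describeEntry is not provided by the library); it relies only on the type codes and optional timestamps listed in the interface, and the sample entry is made up.

```javascript
// Type codes as listed in the CFBEntry interface comment.
var ENTRY_KINDS = { 1: 'dir', 2: 'file', 5: 'root' };

// Hypothetical helper, not part of the cfb API: summarize an entry object.
function describeEntry(entry) {
  var kind = ENTRY_KINDS[entry.type] || 'unknown';
  // Storages and the root entry may carry no content.
  var size = entry.content ? entry.content.length : 0;
  // mt is optional, so fall back to a placeholder when it is absent or invalid.
  var hasTime = entry.mt instanceof Date && !isNaN(entry.mt.getTime());
  var modified = hasTime ? entry.mt.toISOString() : 'n/a';
  return entry.name + ' [' + kind + '] ' + size + ' bytes, modified ' + modified;
}

// Made-up entry for illustration.
var summary = describeEntry({
  name: 'Workbook',
  type: 2,
  content: [0x09, 0x08],
  mt: new Date(Date.UTC(2020, 0, 1))
});
console.log(summary);
```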
License
Please consult the attached LICENSE file for details. All rights not explicitly granted by the Apache 2.0 License are reserved by the Original Author.
References
OSP-covered Specifications
- [MS-CFB]: Compound File Binary File Format