# Compound File Binary Format

This is a pure-JS implementation of MS-CFB: Compound File Binary File Format, a
format used in many Microsoft file types (such as XLS, DOC, and other Microsoft
Office file types).

# Utility Installation and Usage

The package is available on npm:

```
$ npm install -g cfb
$ cfb path/to/CFB/file
```

The command extracts the storages and streams in the container, generating
files that mirror the tree-based structure of the storages. Metadata
such as the red-black tree is discarded (and in the future, new CFB containers
will exclusively use black nodes).

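As a rough illustration of that mapping, the sketch below writes each stream of a container-shaped object to a file and creates one directory per storage. It is not the code behind the `cfb` command: `extractAll` and the hand-built `mock` object are hypothetical, and the paths inside `mock` are made up for the example. It relies only on the `FullPaths`, `FullPathDir`, `.type`, and `.content` fields documented in the API section of this README.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical helper (not part of the cfb package): write every stream of a
// container-shaped object under outDir, with one directory per storage.
function extractAll(container, outDir) {
  var written = [];
  container.FullPaths.forEach(function (fullPath) {
    var entry = container.FullPathDir[fullPath];
    // Drop empty segments so leading or trailing slashes do not matter.
    var parts = fullPath.split('/').filter(Boolean);
    var target = path.join.apply(path, [outDir].concat(parts));
    if (entry.type === 'stream') {
      fs.mkdirSync(path.dirname(target), { recursive: true });
      fs.writeFileSync(target, Buffer.from(entry.content));
      written.push(target);
    } else {
      // 'root' and 'storage' entries become directories.
      fs.mkdirSync(target, { recursive: true });
    }
  });
  return written;
}

// Hand-built stand-in for a parsed container; real paths may look different.
var mock = {
  FullPaths: [
    'Root Entry',
    'Root Entry/Workbook',
    'Root Entry/Macros',
    'Root Entry/Macros/VBA'
  ],
  FullPathDir: {
    'Root Entry': { name: 'Root Entry', type: 'root' },
    'Root Entry/Workbook': { name: 'Workbook', type: 'stream', content: [1, 2, 3] },
    'Root Entry/Macros': { name: 'Macros', type: 'storage' },
    'Root Entry/Macros/VBA': { name: 'VBA', type: 'stream', content: [4, 5] }
  }
};

var outDir = fs.mkdtempSync(path.join(os.tmpdir(), 'cfb-extract-'));
var written = extractAll(mock, outDir);
console.log('wrote ' + written.length + ' files under ' + outDir);
```

The sketch uses entry names verbatim as file names. A real extractor would likely need to sanitize names that are not valid on the target file system.
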
# Library Installation and Usage

In the browser:

    <script src="cfb.js" type="text/javascript"></script>

In node:

    var CFB = require('cfb');

For example, to get the Workbook content from an XLS file:

    var cfb = CFB.read(filename, {type: 'file'});
    var workbook = cfb.find('Workbook');

# API

TypeScript definitions are maintained in `misc/cfb.d.ts`.

The CFB object exposes the following methods and properties:

`CFB.parse(blob)` takes a Node.js Buffer or an array of bytes and returns a
parsed representation of the data.

`CFB.read(blob, options)` wraps `parse`. `options.type` controls the behavior:

- `file`: `blob` should be a file name
- `base64`: `blob` should be a base64 string
- `binary`: `blob` should be a binary string

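To make the three input types concrete, here is a minimal sketch of how each one can be turned into bytes before parsing. It is an assumption about the general approach, not the library's actual code: `toBytes` is a hypothetical helper, and throwing on an unknown type is a choice made for this example only.

```javascript
var fs = require('fs');

// Hypothetical helper (not the library's code): turn each documented
// options.type input into bytes that a parser could consume.
function toBytes(blob, options) {
  var type = options && options.type;
  switch (type) {
    case 'file':
      return fs.readFileSync(blob);        // blob is a file name
    case 'base64':
      return Buffer.from(blob, 'base64');  // blob is a base64 string
    case 'binary':
      return Buffer.from(blob, 'binary');  // one character per byte
    default:
      throw new Error('unsupported type: ' + type);
  }
}

// The first 8 bytes of a CFB file are the signature D0 CF 11 E0 A1 B1 1A E1.
var signature = [0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1];
var asBinary = String.fromCharCode.apply(null, signature);
var asBase64 = Buffer.from(signature).toString('base64');

// Both encodings decode to the same bytes.
var fromBinary = toBytes(asBinary, { type: 'binary' });
var fromBase64 = toBytes(asBase64, { type: 'base64' });
console.log(fromBinary.equals(fromBase64)); // prints true
```
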
## Container Object Description

The object returned by `parse` and `read` is built in the source as `rval`.
It has the following properties and methods:

- `.find(path)` performs a case-insensitive match for the path (or file name, if
  there are no slashes) and returns an entry object (described below) or null if
  not found

- `.FullPaths` is an array of the names of all of the streams (files) and
  storages (directories) in the container. The paths are properly prefixed from
  the root entry (so the entries are unique)

- `.FullPathDir` is an object whose keys are entries in `.FullPaths` and whose
  values are objects with metadata and content (described below)

- `.FileIndex` is an array of the objects from `.FullPathDir`, in the same order
  as `.FullPaths`

- `.raw` contains the raw header and sectors

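The relationship between these fields can be sketched with a hand-built object. The lookup below re-creates the documented behavior of `.find` (full-path match when the query contains a slash, bare-name match otherwise); `findEntry` and the sample paths are hypothetical and are not taken from the library's source.

```javascript
// Hypothetical re-creation of the documented lookup; not the library's code.
function findEntry(container, query) {
  var wanted = query.toUpperCase();
  var matchFullPath = query.indexOf('/') !== -1;
  for (var i = 0; i < container.FullPaths.length; i++) {
    var entry = container.FileIndex[i];
    // With a slash, compare whole paths; without one, compare bare names.
    var candidate = matchFullPath ? container.FullPaths[i] : entry.name;
    if (candidate.toUpperCase() === wanted) return entry;
  }
  return null;
}

// Hand-built stand-in for a parsed container; real paths may look different.
var paths = [
  'Root Entry',
  'Root Entry/Workbook',
  'Root Entry/Macros',
  'Root Entry/Macros/VBA'
];
var entries = [
  { name: 'Root Entry', type: 'root' },
  { name: 'Workbook', type: 'stream', content: [1, 2, 3] },
  { name: 'Macros', type: 'storage' },
  { name: 'VBA', type: 'stream', content: [4, 5] }
];

// FullPathDir maps each path to the entry at the same index in FileIndex.
var dir = {};
paths.forEach(function (p, i) { dir[p] = entries[i]; });
var container = { FullPaths: paths, FileIndex: entries, FullPathDir: dir };

console.log(findEntry(container, 'workbook').type);              // prints stream
console.log(findEntry(container, 'ROOT ENTRY/macros/vba').name); // prints VBA
console.log(findEntry(container, 'Missing'));                    // prints null
```

The comparison uses `toUpperCase`, which is only a reasonable stand-in for ASCII names.
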
## Entry Object Description

The entry objects are the values of `FullPathDir` and the elements of
`FileIndex` in the container object.

- `.name` is the (case-sensitive) internal name
- `.type` is the type (`stream` for files, `storage` for dirs, `root` for root)
- `.content` is a Buffer/Array with the raw content
- `.ct`/`.mt` are the creation and modification times (if provided in the file)

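As a small usage sketch, the helper below summarizes an entry using only the fields listed above. `describeEntry` is hypothetical, the sample entries are hand-built, and the code assumes that `.mt` is a `Date` object when present, which this README does not state.

```javascript
// Hypothetical helper: one-line summary of an entry object.
// Assumes .mt is a Date when present; the README does not specify its type.
function describeEntry(entry) {
  var size = entry.content ? entry.content.length : 0;
  var modified = entry.mt instanceof Date ? entry.mt.toISOString() : 'unknown';
  return entry.type + ' "' + entry.name + '": ' + size + ' bytes, modified ' + modified;
}

// Hand-built sample entries (not read from a real file).
var stream = {
  name: 'Workbook',
  type: 'stream',
  content: Buffer.from([1, 2, 3, 4]),
  mt: new Date(Date.UTC(2013, 8, 5)) // month index 8 is September
};
var storage = { name: 'Macros', type: 'storage' };

console.log(describeEntry(stream));
console.log(describeEntry(storage));
```
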
# Notes

Case comparison has not been verified for non-ASCII characters.

Writing is not supported. It is in the works, but it has not yet been released.

# License

This implementation is covered under the Apache 2.0 license. It complies with the
[Open Specifications Promise](http://www.microsoft.com/openspecifications/).