Compound File Binary Format

Pure-JS implementation of MS-CFB: Compound File Binary File Format, a container format used in many Microsoft file types (XLS, DOC, and the VBA blobs in XLSX and XLSB).

Installation

In the browser:

<script src="dist/cfb.min.js" type="text/javascript"></script>

With npm:

$ npm install cfb

The xlscfb.js file is designed to be embedded in js-xlsx.

Library Usage

In node:

var CFB = require('cfb');

For example, to get the Workbook content from an XLS file:

var cfb = CFB.read(filename, {type: 'file'});
var workbook = CFB.find(cfb, 'Workbook');
var data = workbook.content;

Command-Line Utility Usage

It is preferable to install the library globally with npm:

$ npm install -g cfb

The global installation adds a command cfb which can work with existing files:

  • cfb file will extract the contents of the file to the current directory. It will make the corresponding subdirectories.
  • cfb --list-files file will show a listing of the contained files. The format follows the unzip -l "short format".
  • cfb --repair file will attempt to repair by reading and re-writing the file. This fixes some issues with files generated by non-standard tools.

JS API

TypeScript definitions are maintained in types/index.d.ts.

The CFB object exposes the following methods and properties:

CFB.parse(blob) takes a nodejs Buffer or an array of bytes and returns a parsed representation of the data.

CFB.read(blob, opts) wraps parse. opts.type controls the behavior:

  • file: blob is interpreted as a file name that will be read
  • base64: blob is interpreted as base64 string
  • binary: blob is interpreted as binary string
  • default: blob is interpreted as nodejs buffer or array of bytes

CFB.find(cfb, path) performs a case-insensitive match for the path (or file name, if there are no slashes) and returns an entry object or null if not found.
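The matching rule can be illustrated with a simplified lookup over the .FullPaths and .FileIndex arrays of a container object. This is only a sketch of the documented behavior, not the library's implementation: findSketch is a hypothetical name, the sample container is hand-built, and paths containing slashes are assumed to be spelled as they appear in .FullPaths.

```javascript
// Simplified sketch of the lookup rule; not the library's actual code.
// findSketch is a hypothetical helper name.
function findSketch(cfb, path) {
  var wanted = path.toUpperCase();
  var byFullPath = path.indexOf('/') !== -1;
  for (var i = 0; i < cfb.FullPaths.length; ++i) {
    var full = cfb.FullPaths[i].toUpperCase();
    // With a slash, compare whole paths; otherwise compare only the file name.
    var candidate = byFullPath ? full : full.slice(full.lastIndexOf('/') + 1);
    if (candidate === wanted) return cfb.FileIndex[i];
  }
  return null;
}

// Hand-built container with the two parallel arrays used above.
var sample = {
  FullPaths: ['Root Entry/', 'Root Entry/Workbook'],
  FileIndex: [
    { name: 'Root Entry', type: 5, content: [] },
    { name: 'Workbook', type: 2, content: [1, 2, 3] }
  ]
};
var hit = findSketch(sample, 'workbook');            // matched by file name
var miss = findSketch(sample, 'Root Entry/Missing'); // no such path
```

In real code, CFB.find should be used directly; the sketch only shows why a bare file name and a full path can both resolve to the same entry.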

CFB.write(cfb, opts) generates a file based on the container. opts.type controls the behavior:

  • base64: returns a base64 string
  • binary: returns a binary string
  • default: returns a nodejs buffer or array of bytes

CFB.writeFile(cfb, filename, opts) creates a file with the specified name.
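The read and write calls compose into a simple round trip in Node, which is roughly what the command-line --repair option is described as doing. The sketch below is a minimal illustration under stated assumptions: rewriteContainer and its file-name arguments are hypothetical, the CFB object is passed in as a parameter so the function does not depend on how the library was loaded, and omitting opts in writeFile is assumed to select the default output.

```javascript
// rewriteContainer is a hypothetical helper; CFB is expected to be the
// object returned by require('cfb').
function rewriteContainer(CFB, inputName, outputName) {
  // Read and parse the existing container from disk.
  var cfb = CFB.read(inputName, { type: 'file' });
  // Serialize the parsed container to a new file
  // (assumption: opts may be omitted for the default behavior).
  CFB.writeFile(cfb, outputName);
  // Also return the serialized container as a base64 string.
  return CFB.write(cfb, { type: 'base64' });
}
```

A call would look like rewriteContainer(require('cfb'), 'input.xls', 'output.xls'), where both file names are placeholders.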

Utility Functions

The utility functions are available in the CFB.utils object. Functions that accept a name argument strictly deal with absolute file names:

  • .cfb_new(?opts) creates a new container object.
  • .cfb_add(cfb, name, ?content, ?opts) adds a new file to the cfb.
  • .cfb_del(cfb, name) deletes the specified file.
  • .cfb_mov(cfb, old_name, new_name) moves the file at old_name to the path and name given by new_name.
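The four utilities can be combined to build a container from scratch. The following is a minimal sketch rather than a canonical recipe: buildSample and the file names are hypothetical, the leading-slash spelling of absolute names is an assumption, and the CFB object is passed in as a parameter.

```javascript
// buildSample is a hypothetical helper; CFB is expected to be the object
// returned by require('cfb').
function buildSample(CFB) {
  var cfb = CFB.utils.cfb_new();
  // Names are absolute; the leading-slash spelling is an assumption.
  CFB.utils.cfb_add(cfb, '/notes.txt', [104, 105]);   // bytes for "hi"
  CFB.utils.cfb_add(cfb, '/scratch.bin', [0, 1, 2]);
  // Rename notes.txt, then drop the scratch file.
  CFB.utils.cfb_mov(cfb, '/notes.txt', '/readme.txt');
  CFB.utils.cfb_del(cfb, '/scratch.bin');
  return cfb;
}
```

The returned container can then be handed to CFB.write or CFB.writeFile.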

Container Object Description

The objects returned by parse and read have the following properties:

  • .FullPaths is an array of the names of all of the streams (files) and storages (directories) in the container. The paths are fully prefixed from the root entry, so each path is unique.

  • .FullPathDir is an object whose keys are entries in .FullPaths and whose values are objects with metadata and content (described below).

  • .FileIndex is an array of the objects from .FullPathDir, in the same order as .FullPaths.

  • .raw contains the raw header and sectors.
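Because .FullPaths and .FileIndex share the same order, the two arrays can be walked together by index. The sketch below assumes a hand-built container with only those two properties; listEntries is a hypothetical helper, not part of the library.

```javascript
// listEntries is a hypothetical helper, not part of the library.
function listEntries(cfb) {
  return cfb.FullPaths.map(function (path, i) {
    var entry = cfb.FileIndex[i];   // entry at the same position as its path
    return {
      path: path,
      name: entry.name,
      bytes: entry.content ? entry.content.length : 0
    };
  });
}

// Hand-built container used only for illustration.
var container = {
  FullPaths: ['Root Entry/', 'Root Entry/Workbook'],
  FileIndex: [
    { name: 'Root Entry', type: 5, content: [] },
    { name: 'Workbook', type: 2, content: [1, 2, 3] }
  ]
};
var listing = listEntries(container);
```

With a real file, the container would come from CFB.read or CFB.parse instead of being written by hand.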

Entry Object Description

The entry objects are available from FullPathDir and FileIndex elements of the container object:

interface CFBEntry {
  name: string; /** Case-sensitive internal name */
  type: number; /** 1 = dir, 2 = file, 5 = root ; see [MS-CFB] 2.6.1 */
  content: Buffer | number[] | Uint8Array; /** Raw Content */
  ct?: Date; /** Creation Time */
  mt?: Date; /** Modification Time */
}
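As an illustration of how these fields might be consumed, the sketch below turns an entry into a one-line summary. describeEntry and ENTRY_KINDS are hypothetical names; the type codes are taken from the interface comment above, and the sample entry is hand-built.

```javascript
// Type codes as listed in the CFBEntry interface comment.
var ENTRY_KINDS = { 1: 'dir', 2: 'file', 5: 'root' };

// describeEntry is a hypothetical helper, not part of the library.
function describeEntry(entry) {
  var kind = ENTRY_KINDS[entry.type] || 'unknown';
  var size = entry.content ? entry.content.length : 0;
  // ct and mt are optional, so fall back when the timestamp is absent.
  var modified = entry.mt ? entry.mt.toISOString() : 'n/a';
  return kind + ' "' + entry.name + '" (' + size + ' bytes, modified ' + modified + ')';
}

var text = describeEntry({
  name: 'Workbook',
  type: 2,
  content: [1, 2, 3],
  mt: new Date(Date.UTC(2020, 0, 2))
});
```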

License

Please consult the attached LICENSE file for details. All rights not explicitly granted by the Apache 2.0 License are reserved by the Original Author.

References

OSP-covered Specifications:
  • [MS-CFB]: Compound File Binary File Format