# Codepages for JS
[Codepages](https://en.wikipedia.org/wiki/Codepage) are character encodings. In
many contexts, single- or double-byte character sets are used in lieu of Unicode
encodings. The codepages map between characters and numbers.
[unicode.org](http://www.unicode.org/Public/MAPPINGS/) hosts lists of mappings.
The build script automatically downloads and parses the mappings in order to
generate the full script. The `pages.csv` description in `codepage.md` controls
which codepages are used.
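
To illustrate the kind of parsing involved, here is a minimal sketch that reads mapping text in the common unicode.org layout (a hex byte code, a hex Unicode value, then a `#` comment). The function name `parseMapping` and the sample lines are assumptions for illustration; this is not the repo's actual build code.

```javascript
// Hypothetical parser for "0xXX <tab> 0xYYYY <tab> # comment" mapping lines.
function parseMapping(text) {
  const dec = {};
  text.split(/\r?\n/).forEach(function (line) {
    // Drop the trailing comment and surrounding whitespace.
    const body = line.split('#')[0].trim();
    if (!body) return;
    const fields = body.split(/\s+/);
    // Lines with a code but no Unicode value are treated as undefined and skipped.
    if (fields.length < 2) return;
    const code = parseInt(fields[0], 16);
    const unicode = parseInt(fields[1], 16);
    if (isNaN(code) || isNaN(unicode)) return;
    dec[code] = String.fromCodePoint(unicode);
  });
  return dec;
}

// Sample input written by hand for this sketch.
const sample = [
  '# sample mapping lines',
  '0x41\t0x0041\t# LATIN CAPITAL LETTER A',
  '0xFF\t0x02C7\t# CARON'
].join('\n');

const parsed = parseMapping(sample);
```

The result maps each numeric code to a one-character string, which is the same shape the `dec` lookups in the Usage section rely on.
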
## Setup
In node:

    var cptable = require('codepage');

In the browser:
Alternatively, use the full version in the dist folder:
The complete set of codepages is large due to some Double Byte Character Set
(DBCS) encodings. A much smaller file that includes only the Single Byte
Character Set (SBCS) codepages is provided in this repo (`sbcs.js`), as well as
a file for other projects (`cpexcel.js`).
If you know which codepages you need, you can include individual scripts for
each codepage. The individual files are provided in the `bits/` directory.
For example, to include only the Mac codepages:
All of the browser scripts define and append to the `cptable` object. To rename
the object, edit the `JSVAR` shell variable in `make.sh` and run the script.
The utility functions are contained in `cputils.js`, which assumes that the
appropriate codepage scripts have already been loaded.
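
Because the utilities depend on the codepage scripts being present, it can help to check for the tables you need before calling them. The sketch below uses a hypothetical helper, `missingCodepages`, and a hand-made stand-in for the global `cptable` object; neither is part of the library.

```javascript
// Hypothetical helper: list the requested codepages that are not loaded.
function missingCodepages(table, wanted) {
  return wanted.filter(function (cp) {
    const entry = table && table[cp];
    // A usable entry is assumed to expose both lookup directions.
    return !(entry && entry.dec && entry.enc);
  });
}

// Stand-in for `cptable` with only one (tiny, partial) codepage loaded.
const loaded = {
  10000: { dec: { 255: '\u02C7' }, enc: { '\u02C7': 255 } }
};

const missing = missingCodepages(loaded, [10000, 936]);
if (missing.length > 0) {
  console.log('codepages not loaded: ' + missing.join(', '));
}
```
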
## Usage
The codepages are indexed by number. To get the Unicode character for a given
codepoint, use the `dec` property:

    var unicode_cp10000_255 = cptable[10000].dec[255]; // ˇ

To get the codepoint for a given character, use the `enc` property:

    var cp10000_711 = cptable[10000].enc[String.fromCharCode(711)]; // 255

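
The two properties are inverses of each other. As a rough sketch of that relationship, the snippet below builds an `enc`-style lookup from a `dec`-style one. The table here is a hand-made two-entry stand-in, and `invert` is a hypothetical helper; the real tables are generated by the build script and may be stored differently.

```javascript
// Hand-made partial table for illustration only (not the real codepage 10000 data).
const dec = { 65: 'A', 255: '\u02C7' };

// Hypothetical helper: swap keys and values so characters map back to codes.
function invert(decTable) {
  const encTable = {};
  Object.keys(decTable).forEach(function (code) {
    encTable[decTable[code]] = Number(code);
  });
  return encTable;
}

const miniTable = { 10000: { dec: dec, enc: invert(dec) } };

const ch = miniTable[10000].dec[255];
const code = miniTable[10000].enc[String.fromCharCode(711)];
```
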
There are a few utilities that deal with strings and buffers:

    var 汇总 = cptable.utils.decode(936, [0xbb,0xe3,0xd7,0xdc]);
    var buf = cptable.utils.encode(936, 汇总);
    var sushi = cptable.utils.decode(65001, [0xf0,0x9f,0x8d,0xa3]); // 🍣
    var sbuf = cptable.utils.encode(65001, sushi);

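
Codepage 65001 is UTF-8, so the byte sequence in the example above can be cross-checked with the platform's built-in `TextDecoder` and `TextEncoder`, independently of this library. This is only a sanity check under the assumption that those globals are available (modern browsers and recent Node versions); it does not exercise `cptable.utils`.

```javascript
// Cross-check with built-in APIs rather than cptable.utils.
const sushiBytes = new Uint8Array([0xf0, 0x9f, 0x8d, 0xa3]);

// Decode the four UTF-8 bytes into a JavaScript string.
const sushiText = new TextDecoder('utf-8').decode(sushiBytes);

// Encode the string back to bytes to confirm the round trip.
const roundTrip = Array.from(new TextEncoder().encode(sushiText));
```

Note that the decoded string has length 2 in JavaScript, because the character lies outside the Basic Multilingual Plane and is stored as a surrogate pair.
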
`cptable.utils.encode(CP, data, ofmt)` accepts a String or Array of characters
and returns a representation controlled by `ofmt`:
- Default output is a Buffer (or Array) of bytes (integers between 0 and 255).
- If `ofmt == 'str'`, return a String where `o.charCodeAt(i)` is the i-th byte.
- If `ofmt == 'arr'`, return an Array of bytes.
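
The three output formats can be sketched for a single-byte table as follows. `encodeSketch` is a hypothetical stand-in, not the library's implementation; the `enc` table is hand-made, and throwing on unmapped characters is a choice made for this sketch rather than documented library behavior.

```javascript
// Hand-made partial lookup from character to byte (illustration only).
const enc = { 'A': 65, '\u02C7': 255 };

// Hypothetical single-byte encoder showing the effect of `ofmt`.
function encodeSketch(encTable, data, ofmt) {
  // Accept either a String or an Array of one-character strings.
  const chars = typeof data === 'string' ? data.split('') : data;
  const bytes = chars.map(function (ch) {
    const code = encTable[ch];
    if (code === undefined) throw new Error('unmapped character: ' + ch);
    return code;
  });
  // 'str': each character code of the result is one output byte.
  if (ofmt === 'str') {
    return bytes.map(function (b) { return String.fromCharCode(b); }).join('');
  }
  // 'arr': plain Array of byte values.
  if (ofmt === 'arr') return bytes;
  // Default: a Buffer where available, otherwise the plain Array.
  return typeof Buffer !== 'undefined' ? Buffer.from(bytes) : bytes;
}

const asArr = encodeSketch(enc, 'A\u02C7', 'arr');
const asStr = encodeSketch(enc, ['A', '\u02C7'], 'str');
const asDefault = encodeSketch(enc, 'A\u02C7');
```
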
## Known Excel Codepages
A much smaller script, including only the codepages known to be used in Excel,
is available under the name `cpexcel`. It exposes the same variable `cptable`
and is suitable as a drop-in replacement when the full codepage tables are not
needed.
In node:

    var cptable = require('codepage/dist/cpexcel.full');

## Rolling your own script
The `make.sh` script in the repo can take a manifest and generate JS source.
Usage:

    bash make.sh path_to_manifest output_file_name JSVAR

where
- `path_to_manifest` is the path to the manifest file.
- `output_file_name` is the output file (e.g. `cpexcel.js`, `cptable.js`).
- `JSVAR` is the name of the exported variable (generally `cptable`).
The manifest file is expected to be a CSV with 3 columns:
,