--- title: Sheets in Ghidra sidebar_label: Ghidra pagination_prev: demos/cloud/index pagination_next: demos/bigdata/index sidebar_custom_props: summary: Generate spreadsheets from Ghidra-generated bitfield tables --- import current from '/version.js'; import CodeBlock from '@theme/CodeBlock'; [Ghidra](https://ghidra-sre.org/) is a software reverse engineering platform with a robust Java-based extension system. [SheetJS](https://sheetjs.com) is a JavaScript library for reading and writing data from spreadsheets. The [Complete Demo](#complete-demo) uses SheetJS to export data from a Ghidra script. We'll create an extension that loads the [V8](/docs/demos/engines/v8) JavaScript engine through the Ghidra.js[^1] integration and uses the SheetJS library to export a bitfield table from Apple Numbers to a XLSX workbook. :::note Tested Deployments This demo was tested by SheetJS users in the following deployments: | Architecture | Ghidra | Date | |:-------------|:---------|:-----------| | `darwin-arm` | `11.1.2` | 2024-10-13 | ::: ## Integration Details Ghidra natively supports scripts that are run in Java. JS extension scripts require a [JavaScript engine](/docs/demos/engines/) with Java bindings. Ghidra.js[^1] is a Ghidra integration for [RhinoJS](/docs/demos/engines/rhino), [GraalJS](/docs/demos/engines/graaljs) and [V8](/docs/demos/engines/v8#java). The current version uses [the Javet V8 binding](https://www.caoccao.com/Javet). ### Loading SheetJS Scripts The [SheetJS NodeJS module](/docs/getting-started/installation/nodejs) can be loaded in Ghidra.js scripts using `require`: ```js title="Loading SheetJS scripts in Ghidra.js" const XLSX = require("xlsx"); ``` :::caution pass SheetJS NodeJS modules must be installed in a folder in the Ghidra script path! ::: ### Bitfields and Sheets Binary file formats commonly use bitfields to compactly store a set of Boolean (true or false) flags. For example, in the XLSB file format, the `BrtRowHdr` record[^2] encodes [row properties](/docs/csf/features/rowprops). Bit offsets 91-96 are interpreted as flags marking if a row is hidden or if it is collapsed. #### Assembly Implementation Functions that parse bitfields typically test each bit sequentially: ```nasm title="x86_64 sample assembly with mnemonics" CASE_1c 41 0f ba e5 1c BT R13D,0x1c 73 69 JNC CASE_1d ;; .... Do some work here (bit offset 28) CASE_1d 41 0f ba e5 1d BT R13D,0x1d 73 69 JNC CASE_1e ;; .... Do some work here (bit offset 29) ``` :::note pass The assembly is approximated by the following TypeScript snippet: ```typescript title="Approximate TypeScript" /* R13 is a 64-bit register */ declare let R13: BigInt; /* NOTE: asm R13D is technically a live binding */ let R13D: number = Number(R13 & 0xFFFFFFFFn); if((R13D >> 28) & 1) { // .... Do some work here (bit offset 28) } if((R13D >> 29) & 1) { // .... Do some work here (bit offset 29) } ``` ::: #### Array of Objects A bitmask or bit offset can be paired with a description in a JavaScript object. For example, in the `BrtRowHdr` record, bit offset 92 indicates whether the row is hidden (if the bit is set) or visible (if the bit is not set). The offset and description can be stored as fields in an object: ```js title="Sample metadata for BrtRowHdr offset 92" const metadata_92 = { Offset: 92, Description: "Hidden flag" }; ``` Each object can be stored in an array: ```js title="Array of sample metadata for BrtRowHdr" const metadata = [ { Offset: 91, Description: "Collapsed flag" }, { Offset: 92, Description: "Hidden flag" }, // ... ]; ``` This is an ["Array of Objects"](/docs/api/utilities/array#arrays-of-objects). The SheetJS `json_to_sheet` method[^3] can generate a SheetJS worksheet object from the array: ```js title="Generating a worksheet from the metadata" const ws = XLSX.utils.json_to_sheet(metadata); ``` The SheetJS `book_new` method[^4] generates a SheetJS workbook object that can be written to the filesystem using the `writeFile` method[^5]: ```js title="Exporting the worksheet to file" const wb = XLSX.utils.book_new(ws, "Offsets"); XLSX.utils.writeFile(wb, "SheetJSGhidra.xlsx"); ``` ### Java Binding Ghidra.js exposes a number of globals for interacting with Ghidra, including: - `currentProgram`: information about the loaded program. - `JavaHelper`: Java helper to load classes. Ghidra.js automatically bridges instance methods to Java method calls. It also handles the plugin and file extension details. #### Launching the Decompiler `ghidra.app.decompiler.DecompInterface` is the primary Java interface to the decompiler. In Ghidra.js, `JavaHelper.getClass` will load the class. _Java_ ```java title="Launch decompiler process in Java (snippet)" import ghidra.app.script.GhidraScript; import ghidra.app.decompiler.DecompInterface; import ghidra.program.model.listing.Program; public class SheetZilla extends GhidraScript { @Override public void run() throws Exception { DecompInterface ifc = new DecompInterface(); boolean success = ifc.openProgram(currentProgram); /* ... do work here ... */ } } ``` _Ghidra.js_ ```js title="Launch decompiler process in Ghidra.js" const DecompInterface = JavaHelper.getClass('ghidra.app.decompiler.DecompInterface'); const decompiler = new DecompInterface(); decompiler.openProgram(currentProgram); ``` #### Identifying a Function The `getGlobalSymbols` method of a symbol table instance will return an array of symbols matching the given name: ```js /* name of function to find */ const fname = 'MyMethod'; /* find symbols matching the name */ // highlight-next-line const fsymbs = currentProgram.getSymbolTable().getGlobalSymbols(fname); /* get first result */ const fsymb = fsymbs[0]; ``` The `getFunctionAt` method of a function manager instance will take an address and return a reference to a function: ```js /* get address */ const faddr = fsymb.getAddress(); /* find function */ // highlight-next-line const fn = currentProgram.getFunctionManager().getFunctionAt(faddr); ``` #### Decompiling a Function The `decompileFunction` method attempts to decompile the referenced function: ```js /* decompile function */ // highlight-next-line const decomp = decompiler.decompileFunction(fn, 10000, null); ``` Once decompiled, it is possible to retrieve the decompiled C code: ```js /* get generated C code */ const src = decomp.getDecompiledFunction().getC(); ``` ## Complete Demo In this demo, we will inspect the `_TSTCellToCellStorage` method within the `TSTables` framework of Apple Numbers 14.2. This particular method handles serialization of cells to the NUMBERS file format. The implementation has a number of blocks which look like the following script: ```js if(flags >> 0x0d & 1) { const field = "numberFormatID"; const current_value = cell[field]; // ... check if current_value is set, do other stuff } ``` Based on the bit offset and the field name, we will generate the following row: ```js const mask = 1 << 0x0d; // = 8192 = 0x2000 const name = "number format ID"; const row = { Mask: "0x" + mask.toString(16), "Internal Name": name }; ``` Rows will be generated for each block and the final dataset will be exported. ### System Setup 0) Install Ghidra, Xcode, and Apple Numbers.
Installation Notes (click to show) On macOS, Ghidra was installed using Homebrew: ```bash brew install --cask ghidra ```
1) Add the base Ghidra folder to the PATH variable. The following shell command adds to the path for the current `zsh` or `bash` session: ```bash export PATH="$PATH":$(dirname $(realpath `which ghidraRun`)) ``` 2) Install `ghidra.js` globally: ```bash npm install -g ghidra.js ``` :::note pass If the install fails with a permissions issue, install with the root user: ```bash sudo npm install -g ghidra.js ``` ::: ### Program Preparation 3) Create a temporary folder to hold the Ghidra project: ```bash mkdir -p /tmp/sheetjs-ghidra ``` 4) Copy the `TSTables` framework to the current directory: ```bash cp /Applications/Numbers.app/Contents/Frameworks/TSTables.framework/Versions/Current/TSTables . ``` 5) Create a "thin" binary by extracting the `x86_64` part of the framework: ```bash lipo TSTables -thin x86_64 -output TSTables.macho ``` :::info pass When this demo was last tested, the headless analyzer did not support Mach-O fat binaries. `lipo` creates a new binary with support for one architecture. ::: 6) Analyze the program: ```bash $(dirname $(realpath `which ghidraRun`))/support/analyzeHeadless /tmp/sheetjs-ghidra Numbers -import TSTables.macho ``` :::note pass This process may take a while and print a number of Java stacktraces. The errors can be ignored. ::: ### SheetJS Integration 7) Download [`sheetjs-ghidra.js`](pathname:///ghidra/sheetjs-ghidra.js): ```bash curl -LO https://docs.sheetjs.com/ghidra/sheetjs-ghidra.js ``` 8) Install the [SheetJS NodeJS module](/docs/getting-started/installation/nodejs): {`\ npm i --save https://cdn.sheetjs.com/xlsx-${current}/xlsx-${current}.tgz`} 9) Run the script: ```bash $(dirname $(realpath `which ghidraRun`))/support/analyzeHeadless /tmp/sheetjs-ghidra Numbers -process TSTables.macho -noanalysis -scriptPath `pwd` -postScript sheetjs-ghidra.js ``` 10) Open the generated `SheetJSGhidraTSTCell.xlsx` spreadsheet. [^1]: The project does not have a website. The [source repository](https://github.com/vaguue/Ghidra.js) is publicly available. [^2]: `BrtRowHdr` is defined in the [`MS-XLSB` specification](/docs/miscellany/references) [^3]: See [`json_to_sheet` in "Utilities"](/docs/api/utilities/array#array-of-objects-input) [^4]: See [`book_new` in "Utilities"](/docs/api/utilities/wb) [^5]: See [`writeFile` in "Writing Files"](/docs/api/write-options)