Need a minimal secure library file #3070

Closed
opened 2024-02-08 18:35:22 +00:00 by vkandadai · 1 comment

I plan to use SheetJS in a secure environment. When I move my xlsx.js file to a more secure environment, it gets stripped because (I'm guessing) it contains references to several URLs as well as foreign characters. I can get it to work on lower networks in DoD. I need a vanilla version that works in secure environments. I have no issues moving, for example, the office.js file or the MDB or Bootstrap .js files.

Owner

tl;dr: The library must store plaintext or encrypted URLs. It is unavoidable. If a security audit is needed, please ask relevant parties to reach out to security@sheetjs.com for more information.

Under the hood, a number of supported spreadsheet formats (including XLSX and SpreadsheetML2003) use XML.

XML namespaces (https://en.wikipedia.org/wiki/XML_namespace) are specified as URLs. For example, in the linked Wikipedia article, the traditional XHTML namespace is identified as http://www.w3.org/1999/xhtml

For XLSX, there are a number of required namespaces. Some of the URLs for the namespaces are in bits/31_rels.js (https://git.sheetjs.com/sheetjs/sheetjs/src/branch/master/bits/31_rels.js#L1).

The namespace for the workbook metadata is http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument and that string must be added to generated XLSX workbooks, which means it must exist in some capacity within the library.
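
To illustrate why the string cannot simply be dropped, here is a minimal sketch of how the top-level `_rels/.rels` part of an XLSX package might be generated. The helper name `buildRootRels` and the constant names are hypothetical and are not taken from the SheetJS source; only the two URLs are fixed by the file format.

```javascript
// Relationship type quoted in the comment above.
const WORKBOOK_REL_TYPE =
  "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";

// XML namespace of the package relationships part.
const RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships";

// Hypothetical helper (not a SheetJS function): returns the text of `_rels/.rels`,
// which points a reader at the main workbook part.
function buildRootRels(workbookPath) {
  return [
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>',
    `<Relationships xmlns="${RELS_NS}">`,
    `<Relationship Id="rId1" Type="${WORKBOOK_REL_TYPE}" Target="${workbookPath}"/>`,
    "</Relationships>",
  ].join("\n");
}

console.log(buildRootRels("xl/workbook.xml"));
```

Both URLs end up verbatim in the written file, so any writer has to be able to produce them at runtime, whether or not they appear as literals in the source.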

The "workaround" is to programmatically generate the strings. For example, the base64 representation can be stored instead of the raw content. However, the scanners may detect base64-encoded URLs. Without knowing the scanner and its ruleset, we cannot know if a particular strategy will work.
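
A rough sketch of the base64 idea follows. It is an illustration under assumptions, not how SheetJS is built: the function names are hypothetical, and the encoding step is assumed to run in a separate build script so that only the encoded form ships.

```javascript
// Base64 helpers that work in browsers (btoa/atob) and fall back to Buffer in Node.js.
function toBase64(text) {
  return typeof btoa === "function"
    ? btoa(text)
    : Buffer.from(text, "latin1").toString("base64");
}

function fromBase64(encoded) {
  return typeof atob === "function"
    ? atob(encoded)
    : Buffer.from(encoded, "base64").toString("latin1");
}

// Build-time step (hypothetical): in a real setup this would run outside the
// shipped bundle, so the plaintext URL never appears in the distributed file.
const plaintext =
  "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";
const stored = toBase64(plaintext);

// Runtime step: the library would hold only `stored` and decode it on demand.
const recovered = fromBase64(stored);

console.log(recovered === plaintext); // the round trip restores the original URL

// Caveat: every base64-encoded string that begins with "http://" starts with
// "aHR0cDovL", so a scanner could plausibly match the encoded form as well.
console.log(stored.startsWith("aHR0cDovL"));
```

The fixed prefix is one concrete reason the encoded form may still be flagged. Other generation strategies, such as assembling the URL from fragments, are equally speculative without knowing the scanner's rules.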

PS: On the general matter of DoD, teams within a number of defense agencies use SheetJS CE and SheetJS Pro builds as-is. It is possible that they needed to seek some sort of approval.

Reference: sheetjs/sheetjs#3070