Possible to add a worksheet with a name Excel doesn't like #376

Closed
opened 2016-03-01 14:17:36 +00:00 by pietersv · 5 comments
pietersv commented 2016-03-01 14:17:36 +00:00 (Migrated from github.com)

This isn't a bug, more of a safety guard that's absent and thus a caution for API users. It is possible to create a workbook with a seemingly-valid sheet name that Excel cannot open.

According to this source there are seven characters not allowable as tab names: \ / * [ ] : ? whereas ampersands & should be allowable.

However, if a sheet name contains an ampersand, the workbook generates an error on opening.

Excel could not open wb.xlsx because some content is unreadable. Do you want to open and repair this workbook?
image

Internally, Excel expects the character & to be escaped as &amp. So right now it's up to the API user to escape special characters in worksheet names.

 <sheet name="Overal &amp; Total" sheetId="1" r:id="rId1"/>
workbook = {
  "SheetNames": [
    "Plough &amp; Stars"
  ],
  "Sheets": {
    "Plough &amp; Stars": {
      "A1": {  "v": "hello" }
    },
    "!ref": "A1:A1"
}
This isn't a bug, more of a safety guard that's absent and thus a caution for API users. It is possible to create a workbook with a seemingly-valid sheet name that Excel cannot open. According to [this source](http://www.accountingweb.com/technology/excel/seven-characters-you-cant-use-in-worksheet-names) there are seven characters not allowable as tab names: `\ / * [ ] : ?` whereas ampersands `&` should be allowable. However, if a sheet name contains an ampersand, the workbook generates an error on opening. `Excel could not open wb.xlsx because some content is unreadable. Do you want to open and repair this workbook?` ![image](https://cloud.githubusercontent.com/assets/10464727/13428960/ce267e12-df8b-11e5-9044-2fb42431b654.png) Internally, Excel expects the character `&` to be escaped as `&amp.` So right now it's up to the API user to escape special characters in worksheet names. ``` <sheet name="Overal &amp; Total" sheetId="1" r:id="rId1"/> ``` ``` js workbook = { "SheetNames": [ "Plough &amp; Stars" ], "Sheets": { "Plough &amp; Stars": { "A1": { "v": "hello" } }, "!ref": "A1:A1" } ```
mattbasta commented 2016-03-29 20:13:22 +00:00 (Migrated from github.com)

I ran into this in production yesterday, so this isn't an unusual edge case.

I ran into this in production yesterday, so this isn't an unusual edge case.
iantocristian commented 2016-04-11 14:18:37 +00:00 (Migrated from github.com)

It looks like a bug to me, the library should properly escape the worksheet names before writing them to workbook.xml (ampersand and other characters that require escaping in xml).

It looks like a bug to me, the library should properly escape the worksheet names before writing them to workbook.xml (ampersand and other characters that require escaping in xml).
nickpalmer commented 2016-06-15 21:27:29 +00:00 (Migrated from github.com)

Excel has a whole host of issues with sheet names. It cannot handle & : [ ] \ / ? and has a maximum length of 31 characters.

Excel has a whole host of issues with sheet names. It cannot handle & : [ ] \ / ? and has a maximum length of 31 characters.
SheetJSDev commented 2017-03-06 17:15:40 +00:00 (Migrated from github.com)

Here are the "rules":

  • bad characters: : \ / ? * [ ]
  • length: 0 < length < 32
  • Characters apparently can be encoded with the _xHHHH_ form, so it's possible to make a sheet with a newline character! However, the length only applies to the decoded characters, so a name like
1234567890
1234567890
123456789

can be encoded as 1234567890_x000d_1234567890_x000d_123456789

So definitely js-xlsx should do something. It's unclear to me how far it should go.

js-xlsx definitely should encode/decode the tab names, so the JS interface works with JS strings and the XLSX writer/reader interface with the encoded values.

I am not sure how js-xlsx should handle invalid names. Should it

A) throw an error (something about an invalid sheet name)

B) try to "fix" the name by replacing bad characters

Related to this is duplicated SheetNames entries. If a name shows up twice in the SheetNames array (like ["Sheet1", "Sheet2", "Sheet2"]) how should it be handled?

Here are the "rules": - bad characters: `: \ / ? * [ ]` - length: `0 < length < 32` - Characters apparently can be encoded with the `_xHHHH_` form, so it's possible to make a sheet with a newline character! However, the length only applies to the decoded characters, so a name like ``` 1234567890 1234567890 123456789 ``` can be encoded as `1234567890_x000d_1234567890_x000d_123456789` So definitely js-xlsx should do something. It's unclear to me how far it should go. js-xlsx definitely should encode/decode the tab names, so the JS interface works with JS strings and the XLSX writer/reader interface with the encoded values. I am not sure how js-xlsx should handle invalid names. Should it A) throw an error (something about an invalid sheet name) B) try to "fix" the name by replacing bad characters Related to this is duplicated SheetNames entries. If a name shows up twice in the `SheetNames` array (like `["Sheet1", "Sheet2", "Sheet2"]`) how should it be handled?
SheetJSDev commented 2017-03-25 04:57:00 +00:00 (Migrated from github.com)

We started the process by adding a function check_wb that is called at the start of the write function. We will slowly add the recommendations here. Currently it throws an error, but that can change if you think we should pass errors in a different way

We started the process by adding a function [`check_wb`](https://github.com/SheetJS/js-xlsx/blob/master/bits/71_wbcommon.js#L102) that is called at the start of the `write` function. We will slowly add the recommendations here. Currently it throws an error, but that can change if you think we should pass errors in a different way
Sign in to join this conversation.
No Milestone
No Assignees
1 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: sheetjs/sheetjs#376
No description provided.