Returning wrong sheet if the workbook contains a very hidden sheet #123

Closed
opened 2014-09-30 15:08:25 +00:00 by rla-dev · 11 comments
rla-dev commented 2014-09-30 15:08:25 +00:00 (Migrated from github.com)

We are processing .xlsx files generated by SAP which contain sheets marked as 'xlsSheetVeryHidden'. In that case, accessing a visible sheet by name returns the content of the non-visible sheet (which even has a different name). It seems that the index of a sheet is calculated based on the list of 'visible' and 'normally hidden' sheets instead of all available sheets.

This behavior can easily been verified on a simple workbook containing two sheets in which the first had been marked as 'very hidden' in the 'View Code' window.

veryhidden

We are processing .xlsx files generated by SAP which contain sheets marked as 'xlsSheetVeryHidden'. In that case, accessing a visible sheet by name returns the content of the non-visible sheet (which even has a different name). It seems that the index of a sheet is calculated based on the list of 'visible' and 'normally hidden' sheets instead of all available sheets. This behavior can easily been verified on a simple workbook containing two sheets in which the first had been marked as 'very hidden' in the 'View Code' window. ![veryhidden](https://cloud.githubusercontent.com/assets/8556384/4460463/9e5072d6-48b2-11e4-813e-0c94e702be97.png)
SheetJSDev commented 2014-09-30 15:41:42 +00:00 (Migrated from github.com)

@rla-dev very hidden sheets are odd. The list of sheets is stored in two locations:

  1. in the subfile docProps/app.xml (the document properties) -- this list does not include very hidden sheets

  2. in the subfile xl/workbook.xml (the workbook manifest) -- this list includes very hidden sheets

We currently use the former list when parsing, which is why the very hidden sheets are lost. That's an easy fix. What isn't completely clear is whether very hidden sheets should be included in the list of sheet names.

@elad @johnyesberg Picking up the conversation from #111 , should the sheet list include very hidden sheets (even though it isn't clickable) or does it make sense to store the complete list elsewhere?

@rla-dev very hidden sheets are odd. The list of sheets is stored in two locations: 1) in the subfile docProps/app.xml (the document properties) -- this list does not include very hidden sheets 2) in the subfile xl/workbook.xml (the workbook manifest) -- this list includes very hidden sheets We currently use the former list when parsing, which is why the very hidden sheets are lost. That's an easy fix. What isn't completely clear is whether very hidden sheets should be included in the list of sheet names. @elad @johnyesberg Picking up the conversation from #111 , should the sheet list include very hidden sheets (even though it isn't clickable) or does it make sense to store the complete list elsewhere?
elad commented 2014-10-01 00:56:06 +00:00 (Migrated from github.com)

In my opinion it shouldn't include very hidden sheets unless an option explicitly asked for them.

In my opinion it shouldn't include very hidden sheets unless an option explicitly asked for them.
rla-dev commented 2014-10-01 07:09:20 +00:00 (Migrated from github.com)

@SheetJSDev - thanks for analyzing the issue.

I think, because we are accessing the workbook on an API level, the list should contain all sheets of the workbook. The visibility of sheets is meant for interactive use. If the very hidden sheets are included in the list, it should be possible to determine the level of visibility.

If this introduces an unacceptable API break, I can live with the solution mentioned by @elad. Either way - accessing a sheet by name must under any circumstances return the requested sheet (API consistency).

@SheetJSDev - thanks for analyzing the issue. I think, because we are accessing the workbook on an API level, the list should contain all sheets of the workbook. The visibility of sheets is meant for interactive use. If the very hidden sheets are included in the list, it should be possible to determine the level of visibility. If this introduces an unacceptable API break, I can live with the solution mentioned by @elad. Either way - accessing a sheet by name must under any circumstances return the requested sheet (API consistency).
elad commented 2014-10-02 19:20:46 +00:00 (Migrated from github.com)

I don't think it's an unacceptable API break, but I do think that since Excel is a GUI spreadsheet and not a database the API's default functionality should reflect what a human would get working with the UI directly. Cases where such a default might slow down parsing and/or is unnecessary unless explicitly required (for example, styles) are reasonable for a default-off. I don't see why "very hidden sheets" should be parsed without the user asking them to if the UI itself doesn't display them by default either.

I don't think it's an unacceptable API break, but I do think that since Excel is a GUI spreadsheet and not a database the API's default functionality should reflect what a human would get working with the UI directly. Cases where such a default might slow down parsing and/or is unnecessary unless explicitly required (for example, styles) are reasonable for a default-off. I don't see why "very hidden sheets" should be parsed without the user asking them to if the UI itself doesn't display them by default either.
SheetJSDev commented 2014-10-02 20:01:50 +00:00 (Migrated from github.com)

@rla-dev @elad Correct me if I'm wrong, but AFAICT the only real functional difference between hidden and very hidden sheets is that very hidden sheets are not shown in the unhide popup (right-click a tab and select "unhide") Visible sheets can use formulae that reference very hidden sheets.

As for the intention, [MS-XLS] describes very hidden as:

the sheet is hidden and cannot be displayed using the user interface.

ECMA-376 describes very hidden as:

Indicates the sheet is hidden and cannot be shown in the user interface (UI). This state is only available programmatically.

The real question, then, is whether we should be considering SheetJS as a programmatic interface or a proxy for the GUI. Does it make sense to introduce another worksheet list, like TabNames, which reflects the visible tabs in order? (This would not include any hidden tabs, very hidden or otherwise) This order is well-defined and accessible in the workbook data.

@rla-dev @elad Correct me if I'm wrong, but AFAICT the only real functional difference between hidden and very hidden sheets is that very hidden sheets are not shown in the unhide popup (right-click a tab and select "unhide") Visible sheets can use formulae that reference very hidden sheets. As for the intention, [MS-XLS] describes very hidden as: > the sheet is hidden and cannot be displayed using the user interface. ECMA-376 describes very hidden as: > Indicates the sheet is hidden and cannot be shown in the user interface (UI). _This state is only available programmatically._ The real question, then, is whether we should be considering SheetJS as a programmatic interface or a proxy for the GUI. Does it make sense to introduce another worksheet list, like `TabNames`, which reflects the visible tabs in order? (This would not include any hidden tabs, very hidden or otherwise) This order is well-defined and accessible in the workbook data.
elad commented 2014-10-04 00:00:50 +00:00 (Migrated from github.com)

I think a property on the worksheet is a better approach than separate lists. As long as this is well documented I don't mind if by default js-xlsx also parses the hidden/very hidden sheets. I don't know what most users of js-xlsx would expect, so I can't make any arguments for or against a default value. :)

I think a property on the worksheet is a better approach than separate lists. As long as this is well documented I don't mind if by default js-xlsx also parses the hidden/very hidden sheets. I don't know what most users of js-xlsx would expect, so I can't make any arguments for or against a default value. :)
rla-dev commented 2014-10-13 21:07:05 +00:00 (Migrated from github.com)

I think it's important to build js-xlsx as a programmatic interface, because the library is dealing with the persisted state. It should allow to read and manipulate underlying model no matter that the model also includes visual aspects (or aspects that could be interpreted in a visual way). This is in line with the concept of libraries like Apache POI (http://poi.apache.org/spreadsheet).

In my opinion it would make sense to provide a method to retrieve all the sheets within the workbook and put the state on the sheet (as @elad suggested). This unified access should also help to provide the correct implementation of the retrieval of sheets by name (indexed access is often critical and may lead to inconsistencies - as the current implement shows).

@SheetJSDev - thanks again for your effort and the great work

I think it's important to build js-xlsx as a programmatic interface, because the library is dealing with the persisted state. It should allow to read and manipulate underlying model no matter that the model also includes visual aspects (or aspects that could be interpreted in a visual way). This is in line with the concept of libraries like Apache POI (http://poi.apache.org/spreadsheet). In my opinion it would make sense to provide a method to retrieve all the sheets within the workbook and put the state on the sheet (as @elad suggested). This unified access should also help to provide the correct implementation of the retrieval of sheets by name (indexed access is often critical and may lead to inconsistencies - as the current implement shows). @SheetJSDev - thanks again for your effort and the great work
Mior commented 2016-04-07 14:34:08 +00:00 (Migrated from github.com)

This seems to be still an issue.
I have prepared fix so that in Sheets property are all available sheets but in SheetNames are only visible tabs. 0160aaadbb
Users could iterate trough Sheets to get all sheets or base on SheetNames for only visible once.

It may be kind a confusing to the user that in SheetNames are not all sheets, only visible but in the other hand what if he want to display only visible sheets.
We probably need some property type but what could be values for that? Is there any such sheet type in specification for our purpose?

@SheetJSDev I can see 36 Pull requests pending, is it even worth contributing? This project is one of few with xls/xlsx js projects and I love to see it live.

This seems to be still an issue. I have prepared fix so that in Sheets property are all available sheets but in SheetNames are only visible tabs. https://github.com/Mior/js-xlsx/commit/0160aaadbb601a9bcae0e9a2bed6e5c75c132d84 Users could iterate trough Sheets to get all sheets or base on SheetNames for only visible once. It may be kind a confusing to the user that in SheetNames are not all sheets, only visible but in the other hand what if he want to display only visible sheets. We probably need some property `type` but what could be values for that? Is there any such sheet type in specification for our purpose? @SheetJSDev I can see 36 Pull requests pending, is it even worth contributing? This project is one of few with xls/xlsx js projects and I love to see it live.
JohnBicknell commented 2016-06-06 17:45:46 +00:00 (Migrated from github.com)

@Mior Thanks! this lost issue lost me the best part of a day, just tried out your fix and works great.

@Mior Thanks! this lost issue lost me the best part of a day, just tried out your fix and works great.
darrinholst commented 2016-08-17 01:11:29 +00:00 (Migrated from github.com)

Please consider pulling in @Mior's pr. It solved my problem that I lost a few hours on today.

Please consider pulling in @Mior's pr. It solved my problem that I lost a few hours on today.
SheetJSDev commented 2017-03-31 15:47:01 +00:00 (Migrated from github.com)

We're going in the opposite direction: including very hidden sheets in the list. This also lets us put in chartsheets and other features.

We're going in the opposite direction: including very hidden sheets in the list. This also lets us put in chartsheets and other features.
Sign in to join this conversation.
No Milestone
No Assignees
1 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: sheetjs/sheetjs#123
No description provided.