Returning wrong sheet if the workbook contains a very hidden sheet #123
Labels
No Label
DBF
Dates
Defined Names
Features
Formula
HTML
Images
Infrastructure
Integration
International
ODS
Operations
Performance
PivotTables
Pro
Protection
Read Bug
SSF
SYLK
Style
Write Bug
good first issue
No Milestone
No Assignees
1 Participants
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: sheetjs/sheetjs#123
Loading…
Reference in New Issue
Block a user
No description provided.
Delete Branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
We are processing .xlsx files generated by SAP which contain sheets marked as 'xlsSheetVeryHidden'. In that case, accessing a visible sheet by name returns the content of the non-visible sheet (which even has a different name). It seems that the index of a sheet is calculated based on the list of 'visible' and 'normally hidden' sheets instead of all available sheets.
This behavior can easily been verified on a simple workbook containing two sheets in which the first had been marked as 'very hidden' in the 'View Code' window.
@rla-dev very hidden sheets are odd. The list of sheets is stored in two locations:
in the subfile docProps/app.xml (the document properties) -- this list does not include very hidden sheets
in the subfile xl/workbook.xml (the workbook manifest) -- this list includes very hidden sheets
We currently use the former list when parsing, which is why the very hidden sheets are lost. That's an easy fix. What isn't completely clear is whether very hidden sheets should be included in the list of sheet names.
@elad @johnyesberg Picking up the conversation from #111 , should the sheet list include very hidden sheets (even though it isn't clickable) or does it make sense to store the complete list elsewhere?
In my opinion it shouldn't include very hidden sheets unless an option explicitly asked for them.
@SheetJSDev - thanks for analyzing the issue.
I think, because we are accessing the workbook on an API level, the list should contain all sheets of the workbook. The visibility of sheets is meant for interactive use. If the very hidden sheets are included in the list, it should be possible to determine the level of visibility.
If this introduces an unacceptable API break, I can live with the solution mentioned by @elad. Either way - accessing a sheet by name must under any circumstances return the requested sheet (API consistency).
I don't think it's an unacceptable API break, but I do think that since Excel is a GUI spreadsheet and not a database the API's default functionality should reflect what a human would get working with the UI directly. Cases where such a default might slow down parsing and/or is unnecessary unless explicitly required (for example, styles) are reasonable for a default-off. I don't see why "very hidden sheets" should be parsed without the user asking them to if the UI itself doesn't display them by default either.
@rla-dev @elad Correct me if I'm wrong, but AFAICT the only real functional difference between hidden and very hidden sheets is that very hidden sheets are not shown in the unhide popup (right-click a tab and select "unhide") Visible sheets can use formulae that reference very hidden sheets.
As for the intention, [MS-XLS] describes very hidden as:
ECMA-376 describes very hidden as:
The real question, then, is whether we should be considering SheetJS as a programmatic interface or a proxy for the GUI. Does it make sense to introduce another worksheet list, like
TabNames
, which reflects the visible tabs in order? (This would not include any hidden tabs, very hidden or otherwise) This order is well-defined and accessible in the workbook data.I think a property on the worksheet is a better approach than separate lists. As long as this is well documented I don't mind if by default js-xlsx also parses the hidden/very hidden sheets. I don't know what most users of js-xlsx would expect, so I can't make any arguments for or against a default value. :)
I think it's important to build js-xlsx as a programmatic interface, because the library is dealing with the persisted state. It should allow to read and manipulate underlying model no matter that the model also includes visual aspects (or aspects that could be interpreted in a visual way). This is in line with the concept of libraries like Apache POI (http://poi.apache.org/spreadsheet).
In my opinion it would make sense to provide a method to retrieve all the sheets within the workbook and put the state on the sheet (as @elad suggested). This unified access should also help to provide the correct implementation of the retrieval of sheets by name (indexed access is often critical and may lead to inconsistencies - as the current implement shows).
@SheetJSDev - thanks again for your effort and the great work
This seems to be still an issue.
I have prepared fix so that in Sheets property are all available sheets but in SheetNames are only visible tabs.
0160aaadbb
Users could iterate trough Sheets to get all sheets or base on SheetNames for only visible once.
It may be kind a confusing to the user that in SheetNames are not all sheets, only visible but in the other hand what if he want to display only visible sheets.
We probably need some property
type
but what could be values for that? Is there any such sheet type in specification for our purpose?@SheetJSDev I can see 36 Pull requests pending, is it even worth contributing? This project is one of few with xls/xlsx js projects and I love to see it live.
@Mior Thanks! this lost issue lost me the best part of a day, just tried out your fix and works great.
Please consider pulling in @Mior's pr. It solved my problem that I lost a few hours on today.
We're going in the opposite direction: including very hidden sheets in the list. This also lets us put in chartsheets and other features.