Feature request: ignore hidden cells in table_to_sheet/book #1115

Closed
opened 2018-05-23 04:14:05 +00:00 by Finesse · 8 comments
Finesse commented 2018-05-23 04:14:05 +00:00 (Migrated from github.com)

It would be great if SheetJS ignores hidden rows and cells when parsing a DOM table using the table_to_sheet and table_to_book functions.

For example, the following table:

<table>
  <tr>
    <td>One</td>
    <td style="display: none;">Two</td>
    <td>Three</td>
  </tr>
  <tr>
    <td>1</td>
    <td class="hidden">2</td>
    <td>3</td>
  </tr>
  <tr style="display: none;">
    <td>Раз</td>
    <td>Два</td>
    <td>Три</td>
  </tr>
</table>

<style>
  .hidden {
    display: 'none';
  }
</style>

Would be parsed as:

One Three
1 3

Detecting whether the cell is hidden can be done by detecting a display: none style:

if (getComputedStyle(cell).getPropertyValue('display') === 'none') {
    continue;
}

Ignoring hidden cells and rows can be optional:

XLSX.utils.table_to_sheet(table, {
    ignoreHiddenRows: true,
    ignoreHiddenCells: true
});

If you support this feature, I can make a pull request.

It would be great if SheetJS ignores hidden rows and cells when parsing a DOM table using the `table_to_sheet` and `table_to_book` functions. For example, the following table: ```html <table> <tr> <td>One</td> <td style="display: none;">Two</td> <td>Three</td> </tr> <tr> <td>1</td> <td class="hidden">2</td> <td>3</td> </tr> <tr style="display: none;"> <td>Раз</td> <td>Два</td> <td>Три</td> </tr> </table> <style> .hidden { display: 'none'; } </style> ``` Would be parsed as: | One | Three | |-----|-----| | 1 | 3 | Detecting whether the cell is hidden can be done by detecting a `display: none` style: ```js if (getComputedStyle(cell).getPropertyValue('display') === 'none') { continue; } ``` Ignoring hidden cells and rows can be optional: ```js XLSX.utils.table_to_sheet(table, { ignoreHiddenRows: true, ignoreHiddenCells: true }); ``` If you support this feature, I can make a pull request.
SheetJSDev commented 2018-05-23 06:25:16 +00:00 (Migrated from github.com)

There are funny corner cases surrounding the interaction of merge cells (TD with rowspan >1 and colspan > 1). For example, suppose you had a table where B2:D3 were merged:

<table border=1>
<tr><td>A1</td><td>B1</td><td>C1</td><td>D1</td></tr>
<tr><td>A2</td><td rowspan=2 colspan=2>B2:C3</td><td>D2</td></tr>
<tr><td>A3</td><td>D3</td></tr>
<tr><td>A4</td><td>B4</td><td>C4</td><td>D4</td></tr>
</table>
A1B1C1D1
A2B2:C3D2
A3D3
A4B4C4D4

If you hide A2 (setting style="display:none;, most HTML rendering engines will display as if you specified:

<table border=1>
<tr><td>A1</td><td>B1</td><td>C1</td><td>D1</td></tr>
<tr><td rowspan=2 colspan=2>B2:C3 (now A2:B3)</td><td>D2 (now C2)</td></tr>
<tr><td>A3 (now C3)</td><td>D3</td></tr>
<tr><td>A4</td><td>B4</td><td>C4</td><td>D4</td></tr>
</table>

Which would be drawn like

A1B1C1D1
B2:C3 (now A2:B3)D2 (now C2)
A3 (now C3)D3
A4B4C4D4

Is that the right answer? Or should non-display TD elements be treated as if they had no contents (generate a stub cell or nothing at all)?

There are funny corner cases surrounding the interaction of merge cells (TD with rowspan >1 and colspan > 1). For example, suppose you had a table where B2:D3 were merged: ```html <table border=1> <tr><td>A1</td><td>B1</td><td>C1</td><td>D1</td></tr> <tr><td>A2</td><td rowspan=2 colspan=2>B2:C3</td><td>D2</td></tr> <tr><td>A3</td><td>D3</td></tr> <tr><td>A4</td><td>B4</td><td>C4</td><td>D4</td></tr> </table> ``` <table border=1> <tr><td>A1</td><td>B1</td><td>C1</td><td>D1</td></tr> <tr><td>A2</td><td rowspan=2 colspan=2>B2:C3</td><td>D2</td></tr> <tr><td>A3</td><td>D3</td></tr> <tr><td>A4</td><td>B4</td><td>C4</td><td>D4</td></tr> </table> If you hide A2 (setting `style="display:none;`, most HTML rendering engines will display as if you specified: ``` <table border=1> <tr><td>A1</td><td>B1</td><td>C1</td><td>D1</td></tr> <tr><td rowspan=2 colspan=2>B2:C3 (now A2:B3)</td><td>D2 (now C2)</td></tr> <tr><td>A3 (now C3)</td><td>D3</td></tr> <tr><td>A4</td><td>B4</td><td>C4</td><td>D4</td></tr> </table> ``` Which would be drawn like <table border=1> <tr><td>A1</td><td>B1</td><td>C1</td><td>D1</td></tr> <tr><td rowspan=2 colspan=2>B2:C3 (now A2:B3)</td><td>D2 (now C2)</td></tr> <tr><td>A3 (now C3)</td><td>D3</td></tr> <tr><td>A4</td><td>B4</td><td>C4</td><td>D4</td></tr> </table> Is that the right answer? Or should non-display TD elements be treated as if they had no contents (generate a stub cell or nothing at all)?
Finesse commented 2018-05-23 06:53:49 +00:00 (Migrated from github.com)

@SheetJSDev I think that a user expects to see in a worksheet exactly what he/she sees in the browser, therefore SheetJS should behave the same as browsers (completely ignore hidden cells).

@SheetJSDev I think that a user expects to see in a worksheet exactly what he/she sees in the browser, therefore SheetJS should behave the same as browsers (completely ignore hidden cells).
SheetJSDev commented 2018-05-24 21:56:32 +00:00 (Migrated from github.com)

@Finesse It's probably best to follow Excel's example. If you take the first example, save as an HTML file, and open in Excel, you will find that it hides the row but does not hide the column:

So the specific approach that we'd be willing to accept is: if available, use getComputedStyle on the TR element to determine whether the TR is visible. If it is not visible, mark the row as hidden. The feature is controlled in the worksheet !row array

@Finesse It's probably best to follow Excel's example. If you take the first example, save as an HTML file, and open in Excel, you will find that it hides the row but does not hide the column: <img width="397" alt="" src="https://user-images.githubusercontent.com/6070939/40515562-4d89e808-5f7b-11e8-8b2f-d4c5235f3054.png"> So the specific approach that we'd be willing to accept is: if available, use `getComputedStyle` on the TR element to determine whether the TR is visible. If it is not visible, mark the row as hidden. [The feature is controlled in the worksheet `!row` array](https://docs.sheetjs.com/#row-properties)
Finesse commented 2018-05-25 01:12:56 +00:00 (Migrated from github.com)

@SheetJSDev What about cells (<td/>s and <th/>s)? Must hidden cells be ignored? I insist to ignore them to conform the HTML behaviour. This is the key feature I request.

Should the described behaviour be the default behaviour? Should there be a possibility to enable/disable it using the ignoreHiddenRows and ignoreHiddenCells parameters?

@SheetJSDev What about cells (`<td/>`s and `<th/>`s)? Must hidden cells be ignored? I insist to ignore them to conform the HTML behaviour. This is the key feature I request. Should the described behaviour be the default behaviour? Should there be a possibility to enable/disable it using the `ignoreHiddenRows` and `ignoreHiddenCells` parameters?
SheetJSDev commented 2018-05-25 03:06:42 +00:00 (Migrated from github.com)

Excel treats the singleton <td/> and <th/> as if they were empty text (stub cells). It probably makes sense to generate stub cells if sheetStubs is true (to be consistent with the other formats).

The original objection was based on how Excel handles the underlying HTML, but upon reflection your argument makes sense. Let's try:

  • The default behavior should include every cell and only hide rows if the TR element is hidden.

  • Create a new display option (default false) that omits every hidden cell and every hidden row if true (as if the hidden cells were not included in the first place).

Excel treats the singleton `<td/>` and `<th/>` as if they were empty text (stub cells). It probably makes sense to generate stub cells if `sheetStubs` is true (to be consistent with the other formats). The original objection was based on how Excel handles the underlying HTML, but upon reflection your argument makes sense. Let's try: - The default behavior should include every cell and only hide rows if the TR element is hidden. - Create a new `display` option (default false) that omits every hidden cell and every hidden row if true (as if the hidden cells were not included in the first place).
Finesse commented 2018-05-25 06:32:07 +00:00 (Migrated from github.com)

@SheetJSDev Let's recap it.

There will be a new option called display in the table_to_sheet and table_to_book methods. The default value of this option is false.

If display value is false, the result table (created from the HTML from my first example) looks this way:

image

If display value is true, the result table looks this way:

image

Is it correct?

@SheetJSDev Let's recap it. There will be a new option called `display` in the `table_to_sheet` and `table_to_book` methods. The default value of this option is `false`. If `display` value is `false`, the result table (created from the HTML from my first example) looks this way: ![image](https://user-images.githubusercontent.com/9006227/40529559-d20a2b48-6038-11e8-9179-70e3494b0913.png) If `display` value is `true`, the result table looks this way: ![image](https://user-images.githubusercontent.com/9006227/40529590-f2b991c6-6038-11e8-8296-4a52f816bbd4.png) Is it correct?
SheetJSDev commented 2018-05-25 06:40:26 +00:00 (Migrated from github.com)

@Finesse correct. That makes the most sense since the default behavior lines up with how Excel handles HTML tables while the new behavior lines up with what people would expect.

https://www.caniuse.com/#feat=getcomputedstyle suggests IE9+ support getComputedStyle, so it should not be assumed to exist (IE8 is still a target in the browser test matrix). If getComputedStyle is not available, just proceed as if nothing was hidden.

@Finesse correct. That makes the most sense since the default behavior lines up with how Excel handles HTML tables while the new behavior lines up with what people would expect. https://www.caniuse.com/#feat=getcomputedstyle suggests IE9+ support getComputedStyle, so it should not be assumed to exist (IE8 is still a target in the browser test matrix). If getComputedStyle is not available, just proceed as if nothing was hidden.
Finesse commented 2018-05-25 06:45:47 +00:00 (Migrated from github.com)

@SheetJSDev Ok! I will make a pull request in a few days.

@SheetJSDev Ok! I will make a pull request in a few days.
Sign in to join this conversation.
No Milestone
No Assignees
1 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: sheetjs/sheetjs#1115
No description provided.