Excel HTML files with tags inside cells result in extra whitespace #1622

Closed
opened 2019-09-12 10:45:58 +00:00 by KurtMar · 2 comments
KurtMar commented 2019-09-12 10:45:58 +00:00 (Migrated from github.com)

When parsing Excel HTML files, the htmldecode function strips newlines only from the very beginning and end of the cell. If the cell contains HTML elements such as DIV or SPAN, the newlines before and after the content are instead converted into spaces. Here is a jsbin example showing the problem and a possible solution:

https://jsbin.com/muzucuyigo/edit?js,console

Checking for <\s and \s> is not an optimal solution, but in my opinion it is better than the current implementation.
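
For readers who cannot open the jsbin, here is a minimal sketch of the behavior described above. Both functions are hypothetical, simplified stand-ins and not the actual SheetJS source; `decodeNaive` and `decodeTagAware` are names invented for illustration, and the tag-aware variant is only one possible reading of the "<\s and \s>" check (dropping whitespace that directly touches a tag boundary).

```javascript
// Hypothetical, simplified stand-ins for illustration; not the SheetJS code.

// Trims only the outer edges of the cell, then collapses every remaining
// whitespace run to a single space before stripping the tags.
function decodeNaive(str) {
  return str
    .replace(/^[\t\n\r ]+/, "")   // trim leading whitespace of the cell
    .replace(/[\t\n\r ]+$/, "")   // trim trailing whitespace of the cell
    .replace(/[\t\n\r ]+/g, " ")  // collapse inner runs to one space
    .replace(/<[^>]*>/g, "");     // strip tags
}

// Assumed interpretation of the proposed check: also drop whitespace that
// sits directly after ">" or directly before "<".
function decodeTagAware(str) {
  return str
    .replace(/^[\t\n\r ]+/, "")
    .replace(/[\t\n\r ]+$/, "")
    .replace(/>[\t\n\r ]+/g, ">") // whitespace right after a tag
    .replace(/[\t\n\r ]+</g, "<") // whitespace right before a tag
    .replace(/[\t\n\r ]+/g, " ")
    .replace(/<[^>]*>/g, "");
}

const cell = "<div>\n  Hello world\n</div>";
console.log(JSON.stringify(decodeNaive(cell)));    // " Hello world "
console.log(JSON.stringify(decodeTagAware(cell))); // "Hello world"
```

One trade-off of this reading is that it also removes meaningful spaces between adjacent inline elements (for example, `<b>a</b> <i>b</i>` becomes `ab`), which may be part of why the check is not optimal.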
SheetJSDev commented 2019-09-12 21:11:11 +00:00 (Migrated from github.com)

Agreed that the current behavior is incorrect, and your solution looks like it addresses the problem. Feel free to submit a PR. The line is in [`bits/22_xmlutils.js`](https://github.com/SheetJS/js-xlsx/blob/master/bits/22_xmlutils.js#L181), and feel free to split that one long line into smaller parts!
SheetJSDev commented 2019-10-05 17:28:15 +00:00 (Migrated from github.com)

#1650 closes this issue

Reference: sheetjs/sheetjs#1622