Error: Invalid HTML: could not find <table> #2543

Closed
opened 2022-03-05 17:15:35 +00:00 by arladmin · 4 comments
arladmin commented 2022-03-05 17:15:35 +00:00 (Migrated from github.com)
```
// index.js
const XLSX = require("xlsx");
const workbook = XLSX.readFile("test.html");
```

I am running into this error unexpectedly.
Can't figure out the reason why.

The html does have proper `<table>` tags for the only table in the file.
reviewher commented 2022-03-05 18:48:00 +00:00 (Migrated from github.com)

Can you share the contents of test.html?

arladmin commented 2022-03-06 02:20:21 +00:00 (Migrated from github.com)

> Can you share the contents of test.html?

PFA [test.zip](https://github.com/SheetJS/sheetjs/files/8191842/test.zip)
SheetJSDev commented 2022-03-06 02:27:12 +00:00 (Migrated from github.com)

Feel free to send a PR. The change applies to [`bits/79_html.js`](https://github.com/SheetJS/sheetjs/blob/master/bits/79_html.js#L62):

```diff
diff --git a/bits/79_html.js b/bits/79_html.js
@@ -59,7 +59,7 @@ var HTML_ = (function() {
                return ws;
        }
        function html_to_book(str/*:string*/, opts)/*:Workbook*/ {
-               var mtch = str.match(/<table.*?>[\s\S]*?<\/table>/gi);
+               var mtch = str.match(/<table[\s\S]*?>[\s\S]*?<\/table>/gi);
                if(!mtch || mtch.length == 0) throw new Error("Invalid HTML: could not find <table>");
                if(mtch.length == 1) return sheet_to_workbook(html_to_sheet(mtch[0], opts), opts);
                var wb = utils.book_new();
```

The `<table>` open tag was split across multiple lines in the example file. Since `.` does not match line terminators in a JavaScript regular expression, `<table.*?>` never reached the closing `>` of the tag.
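For anyone who wants to see the difference between the two patterns without the library, here is a minimal sketch. The `html` string is a made-up sample (not the contents of the attached test.zip) and only assumes that the `<table>` open tag spans more than one line, as described above. The two regular expressions are copied from the diff.

```javascript
// Hypothetical sample input: the <table> open tag spans two lines.
const html = [
  "<html><body>",
  "<table border=\"1\"",
  "       class=\"data\">",
  "<tr><td>a</td><td>b</td></tr>",
  "</table>",
  "</body></html>"
].join("\n");

// Old pattern: `.` stops at the newline inside the open tag, so no match is found.
const oldPattern = /<table.*?>[\s\S]*?<\/table>/gi;
// Patched pattern: [\s\S] matches any character, including newlines.
const newPattern = /<table[\s\S]*?>[\s\S]*?<\/table>/gi;

const oldMatch = html.match(oldPattern);
const newMatch = html.match(newPattern);

console.log(oldMatch);        // null, which is what triggers "could not find <table>"
console.log(newMatch.length); // 1
```

With the old pattern the `!mtch` check in `html_to_book` is true and the error is thrown, even though the file contains a valid table.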
reviewher commented 2022-03-08 05:50:22 +00:00 (Migrated from github.com)
a32b30414b3a1da61ba6d6b847a473f54481815a
Reference: sheetjs/sheetjs#2543