Cannot read property '3' of undefined #569

Closed
opened 2017-02-23 09:01:54 +00:00 by e12009 · 4 comments
e12009 commented 2017-02-23 09:01:54 +00:00 (Migrated from github.com)

I am using node xlsx module to parse an xlsx file, with the following code snippet:

'use strict';
var xlsx = require('xlsx');
var result = xlsx.readFile(process.argv[2]);

and I got the following errors:

node_modules/xlsx/xlsx.js:1572
var __readInt32LE = function(b, idx) { return (b[idx+3]<<24)|(b[idx+2]<<16)|(b[idx+1]<<8)|b[idx]; };
^

TypeError: Cannot read property '3' of undefined
at __readInt32LE (/Users/cswei/crawler/crawler/node_modules/xlsx/xlsx.js:1572:49)
at make_sector_list (/Users/cswei/crawler/crawler/node_modules/xlsx/xlsx.js:1136:8)
at parse (/Users/cswei/crawler/crawler/node_modules/xlsx/xlsx.js:954:19)
at readFileSync (/Users/cswei/crawler/crawler/node_modules/xlsx/xlsx.js:1205:9)
at Object.readSync [as read] (/Users/cswei/crawler/crawler/node_modules/xlsx/xlsx.js:1210:23)
at readSync (/Users/cswei/crawler/crawler/node_modules/xlsx/xlsx.js:11391:28)
at Object.readFileSync (/Users/cswei/crawler/crawler/node_modules/xlsx/xlsx.js:11403:9)

Attached is t he file I am using

billExport.xlsx

This is a valid excel file and can be opened by other apps like OpenOffice, Number on Mac.

I am using node xlsx module to parse an xlsx file, with the following code snippet: ``` 'use strict'; var xlsx = require('xlsx'); var result = xlsx.readFile(process.argv[2]); ``` and I got the following errors: node_modules/xlsx/xlsx.js:1572 var __readInt32LE = function(b, idx) { return (b[idx+3]<<24)|(b[idx+2]<<16)|(b[idx+1]<<8)|b[idx]; }; ^ TypeError: Cannot read property '3' of undefined at __readInt32LE (/Users/cswei/crawler/crawler/node_modules/xlsx/xlsx.js:1572:49) at make_sector_list (/Users/cswei/crawler/crawler/node_modules/xlsx/xlsx.js:1136:8) at parse (/Users/cswei/crawler/crawler/node_modules/xlsx/xlsx.js:954:19) at readFileSync (/Users/cswei/crawler/crawler/node_modules/xlsx/xlsx.js:1205:9) at Object.readSync [as read] (/Users/cswei/crawler/crawler/node_modules/xlsx/xlsx.js:1210:23) at readSync (/Users/cswei/crawler/crawler/node_modules/xlsx/xlsx.js:11391:28) at Object.readFileSync (/Users/cswei/crawler/crawler/node_modules/xlsx/xlsx.js:11403:9) Attached is t he file I am using [billExport.xlsx](https://github.com/SheetJS/js-xlsx/files/796003/billExport.xlsx) This is a valid excel file and can be opened by other apps like OpenOffice, Number on Mac.
reviewher commented 2017-02-23 09:03:24 +00:00 (Migrated from github.com)

A little bit of context would help :) were you trying to read or write a file? IF you were reading a file, can you share the file you were using?

A little bit of context would help :) were you trying to read or write a file? IF you were reading a file, can you share the file you were using?
reviewher commented 2017-02-23 09:57:59 +00:00 (Migrated from github.com)

@e12009 thanks for sharing a file! The source fix is to add a check in the make_sector_list function:

                        if(ssz < 4 + jj) throw "FAT boundary crossed: " + j + " 4 "+ssz;
+                       if(!sectors[addr]) break;
                        j = __readInt32LE(sectors[addr], jj);

Whatever tool wrote that file made three mistakes:

  1. it's really a BIFF8 XLS file but the tool claimed it was an XLSX. Immediately Excel 2016 failed on the file, but renaming to XLS made it work.

  2. The file is 80001 bytes. Quick summary: BIFF8 XLS uses a container format called CFB, described in the MS-CFB Spec. It requires that the file size should be a multiple of 512 bytes, and 80001 is very clearly not a multiple of 512.

  3. there is only one FAT sector. The CFB file mandates that files be broken up into 512 byte blocks and FAT sectors describe how those blocks should be organized. At most 128 sectors can be covered by a FAT block. Since the file header is 512 bytes, the maximum file size supported by one FAT sector is 66048. Something doesn't make sense. I inspected the file and noticed a bunch of blank sectors! My operating theory is that the writer decided not to include a free sector chain because Excel technically doesn't require it. The JS parser actually expects a complete FAT table.

Summary is that I think the writer cut a few corners, but fortunately the one-line fix works.

@SheetJSDev the fix works for the specific file but I have not run full tests yet. It is also worth updating JS-CFB

@e12009 thanks for sharing a file! The source fix is to add a check in the `make_sector_list` function: ```diff if(ssz < 4 + jj) throw "FAT boundary crossed: " + j + " 4 "+ssz; + if(!sectors[addr]) break; j = __readInt32LE(sectors[addr], jj); ``` Whatever tool wrote that file made three mistakes: 1) it's really a BIFF8 XLS file but the tool claimed it was an XLSX. Immediately Excel 2016 failed on the file, but renaming to XLS made it work. 2) The file is 80001 bytes. Quick summary: BIFF8 XLS uses a container format called CFB, described in the [MS-CFB Spec](https://msdn.microsoft.com/en-us/library/dd942138.aspx). It requires that the file size should be a multiple of 512 bytes, and 80001 is very clearly not a multiple of 512. 3) there is only one FAT sector. The CFB file mandates that files be broken up into 512 byte blocks and FAT sectors describe how those blocks should be organized. At most 128 sectors can be covered by a FAT block. Since the file header is 512 bytes, the maximum file size supported by one FAT sector is 66048. Something doesn't make sense. I inspected the file and noticed a bunch of blank sectors! My operating theory is that the writer decided not to include a free sector chain because Excel technically doesn't require it. The JS parser actually expects a complete FAT table. Summary is that I think the writer cut a few corners, but fortunately the one-line fix works. @SheetJSDev the fix works for the specific file but I have not run full tests yet. It is also worth updating [JS-CFB](https://github.com/sheetjs/js-cfb)
e12009 commented 2017-02-23 10:24:56 +00:00 (Migrated from github.com)

@reviewher Thanks for the great comments. it seems fixes the problem. some other comments:

  1. The original file IS XLS file, but to upload here, it seems github only takes XLSX file, so I just renamed it from XLS to XLSX file;

  2. Since we are using npm to manage all the dependencies, I am not supposed to make the change locally, because even if I made the change locally, when we deploy it to the production environment, it still gets the wrong code from the github. what is the best way to get this fix quickly to the production environment before the fix is officially available ?

Thanks

@reviewher Thanks for the great comments. it seems fixes the problem. some other comments: 1. The original file IS XLS file, but to upload here, it seems github only takes XLSX file, so I just renamed it from XLS to XLSX file; 2. Since we are using npm to manage all the dependencies, I am not supposed to make the change locally, because even if I made the change locally, when we deploy it to the production environment, it still gets the wrong code from the github. what is the best way to get this fix quickly to the production environment before the fix is officially available ? Thanks
SheetJSDev commented 2017-02-23 17:18:18 +00:00 (Migrated from github.com)

@e12009 thanks for the report and @reviewher thanks for looking into it!

We hope to stand up a new version today or tomorrow, but in the meanwhile you can make a local copy of xlsx.js, add the dependencies from here to your package.json dependencies, and modify your xlsx dependency to point to a local file. See https://docs.npmjs.com/files/package.json#local-paths for an example,

@e12009 thanks for the report and @reviewher thanks for looking into it! We hope to stand up a new version today or tomorrow, but in the meanwhile you can make a local copy of xlsx.js, add the dependencies from here to your package.json dependencies, and modify your xlsx dependency to point to a local file. See https://docs.npmjs.com/files/package.json#local-paths for an example,
Sign in to join this conversation.
No Milestone
No Assignees
1 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: sheetjs/sheetjs#569
No description provided.