Cannot Read xlsx File Generated from java apache.poi.xssf library #2011

Closed
opened 2020-06-18 02:29:05 +00:00 by Yocoms · 7 comments
Yocoms commented 2020-06-18 02:29:05 +00:00 (Migrated from github.com)

Hi, Thanks for your work, first.

In my project, the business logic is like below.


In Java(Spring) Project : Read an Excel File that is Filled with values and fomulars --> Create a Row and Cell below the last row --> A Client(In my case Chrome Browser) Download the File --> Run Code XLSX.read() --> Run Code XLSX.utils.sheet_to_html() --> Draw a Table from the excel file

The Problems are ...

  1. SheetJS could read the origin values from the excel file, but skipped the values that generated from the java apache poi. Weird thing is that the SheetJS can get the values as data including java generated but the ref range like ("A1:F5") does not include the created area.
    (Here, If I do something(like putting a value, saving file, ...) on the excel file with office-excel program, then it can read all of the values)

  2. The problem above applied to the formula cell as well.
    When I edit a value with java apache poi, SheetJS could not show the calculated value on the table. Yes, it could calculate and show the origin values by the way.

I'm sure someone experiencing same thing like me. And I expect that someone reply for those things.
Thanks for reading.

Hi, Thanks for your work, first. In my project, the business logic is like below. ------------------------------- In Java(Spring) Project : Read an Excel File that is Filled with values and fomulars --> Create a Row and Cell below the last row --> A Client(In my case Chrome Browser) Download the File --> Run Code XLSX.read() --> Run Code XLSX.utils.sheet_to_html() --> Draw a Table from the excel file ------------------------------- The Problems are ... 1. SheetJS could read the origin values from the excel file, but skipped the values that generated from the java apache poi. Weird thing is that the SheetJS can get the values as data including java generated but the ref range like ("A1:F5") does not include the created area. (Here, If I do something(like putting a value, saving file, ...) on the excel file with office-excel program, then it can read all of the values) 2. The problem above applied to the formula cell as well. When I edit a value with java apache poi, SheetJS could not show the calculated value on the table. Yes, it could calculate and show the origin values by the way. I'm sure someone experiencing same thing like me. And I expect that someone reply for those things. Thanks for reading.
SheetJSDev commented 2020-06-18 16:51:30 +00:00 (Migrated from github.com)

Can you share a sample file so we can take a closer look?

Can you share a sample file so we can take a closer look?
Yocoms commented 2020-06-19 02:29:35 +00:00 (Migrated from github.com)

Sure, below is the source.
Thanks for your comment.

////// 1. A Client request get an excel file

////// 2. Server send response with Java file
// Java file has a process logic like below

    FileInputStream fileFis = new FileInputStream(file);
    XSSFWorkbook fileWb = new XSSFWorkbook(fileFis);
    XSSFSheet fileSh = fileWb.getSheetAt(0);

    // Create Row and Cell, Then Set a Value
    // 5, 2, 4 is just random number for test
    Row fileRow = fileSh.createRow(5);
    Cell fileCell = fileRow.createCell(2, 4);
    fileCell.setCellValue("TEST VALUE");

    fileFis.close();

    // Write the output to a file
    File resultFile = new File("1.xlsx");
    FileOutputStream resultFos = new FileOutputStream(resultFile);
    fileWb.write(resultFos);
    resultFos.close();

    return resultFile;

////// 3. Client get a response from the server and SheetJS works like below

    req.onload = function(evt) {
        var tags = "No Data";
        var data = new Uint8Array(req.response);

        if (data.length > 0) {
            var workbook = XLSX.read(data, {type:"array"});
            var sheetName = workbook.SheetNames[0];
            tags = XLSX.utils.sheet_to_html(workbook.Sheets[sheetName], { header: '' });
        }

        jQuery("#divBodyFileViewer").html(tags);
    }

I would be pleased if I could attach my sample files.
I found a demo page you made ---- https://oss.sheetjs.com/
The problem I mentioned is occurring there as well when I drop the sample files.

Since, I cannot attach the files, I attach images that I tested on your demo page.

======= CASE:: Cannot Read the Cells I Added =======
image

======= CASE:: Can Read the Cells I Added After Open the File with Excel Program and save with doing nothing =======
image

Sure, below is the source. Thanks for your comment. ////// 1. A Client request get an excel file ////// 2. Server send response with Java file // Java file has a process logic like below FileInputStream fileFis = new FileInputStream(file); XSSFWorkbook fileWb = new XSSFWorkbook(fileFis); XSSFSheet fileSh = fileWb.getSheetAt(0); // Create Row and Cell, Then Set a Value // 5, 2, 4 is just random number for test Row fileRow = fileSh.createRow(5); Cell fileCell = fileRow.createCell(2, 4); fileCell.setCellValue("TEST VALUE"); fileFis.close(); // Write the output to a file File resultFile = new File("1.xlsx"); FileOutputStream resultFos = new FileOutputStream(resultFile); fileWb.write(resultFos); resultFos.close(); return resultFile; ////// 3. Client get a response from the server and SheetJS works like below req.onload = function(evt) { var tags = "No Data"; var data = new Uint8Array(req.response); if (data.length > 0) { var workbook = XLSX.read(data, {type:"array"}); var sheetName = workbook.SheetNames[0]; tags = XLSX.utils.sheet_to_html(workbook.Sheets[sheetName], { header: '' }); } jQuery("#divBodyFileViewer").html(tags); } I would be pleased if I could attach my sample files. I found a demo page you made ---- https://oss.sheetjs.com/ The problem I mentioned is occurring there as well when I drop the sample files. Since, I cannot attach the files, I attach images that I tested on your demo page. **======= CASE:: Cannot Read the Cells I Added =======** ![image](https://user-images.githubusercontent.com/47406856/85090162-67988300-b21f-11ea-9844-0a7f56fa2ec9.png) **======= CASE:: Can Read the Cells I Added After Open the File with Excel Program and save with doing nothing =======** ![image](https://user-images.githubusercontent.com/47406856/85090252-9e6e9900-b21f-11ea-8162-02e954374a8e.png)
SheetJSDev commented 2020-06-19 03:33:02 +00:00 (Migrated from github.com)

You should be able to add a file here on github, just click and drag the file into the reply window. If you can't, please email it to hello@sheetjs.com.

In any case, If I had to guess it's a POI issue not updating the worksheet range

You should be able to add a file here on github, just click and drag the file into the reply window. If you can't, please email it to hello@sheetjs.com. In any case, If I had to guess it's a POI issue not updating the worksheet range
Yocoms commented 2020-06-19 07:00:31 +00:00 (Migrated from github.com)

Wow, I did not know that. What a fool.

The attacked file 1_NoIden.xlsx is the one from POI.
Drag it into the demo page. And then open with excel program. Then save it doing anything.
Re-drag it into the demo page. And then you can see the difference.

1_NoIden.xlsx

Wow, I did not know that. What a fool. The attacked file **1_NoIden.xlsx** is the one from POI. Drag it into the demo page. And then open with excel program. Then save it doing anything. Re-drag it into the demo page. And then you can see the difference. [1_NoIden.xlsx](https://github.com/SheetJS/sheetjs/files/4802892/1_NoIden.xlsx)
SheetJSDev commented 2020-06-19 07:23:23 +00:00 (Migrated from github.com)

It's not obvious that you can drag-drop files. It's written below the box, but there really should have been an upload link or something above the input box.

That said, this is a mixed issue. To explain, go to https://oss.sheetjs.com/cfb-editor/#/ and drag-drop the file. Then click xl/worksheets/sheet1.xml and click (View as Text). You will see

sheet1 range

That is saying the file range is B1:D9, which is what SheetJS use to determine the sheet range. If you scroll further you'll see references to E8/F8, which are both outside the reported range.

If you have patience, we can open a discussion with the POI project and see if you have to do something different to get the library to update the worksheet range.

As for the other side, the reason we've tried to trust the dimension record is based on the actual worksheet representation. Generating millions of small strings is a massive performance killer in V8 / chrome :(

It's not obvious that you can drag-drop files. It's written below the box, but there really should have been an upload link or something above the input box. That said, this is a mixed issue. To explain, go to https://oss.sheetjs.com/cfb-editor/#/ and drag-drop the file. Then click `xl/worksheets/sheet1.xml` and click `(View as Text)`. You will see <img width="215" alt="sheet1 range" src="https://user-images.githubusercontent.com/6070939/85107206-94c34200-b1db-11ea-953d-ec004fa8d1d7.png"> That is saying the file range is B1:D9, which is what SheetJS use to determine the sheet range. If you scroll further you'll see references to E8/F8, which are both outside the reported range. If you have patience, we can open a discussion with the POI project and see if you have to do something different to get the library to update the worksheet range. As for the other side, the reason we've tried to trust the `dimension` record is based on the actual worksheet representation. Generating millions of small strings is a massive performance killer in V8 / chrome :(
SheetJSDev commented 2020-06-19 07:31:24 +00:00 (Migrated from github.com)

@Yocoms we raised POI bug 64536.

@Yocoms we raised [POI bug 64536](https://bz.apache.org/bugzilla/show_bug.cgi?id=64536).
reviewher commented 2021-09-09 19:16:18 +00:00 (Migrated from github.com)

No update on that bug. Use this to update the range programmatically:

function update_sheet_range(ws) {
  var range = {s:{r:Infinity, c:Infinity},e:{r:0,c:0}};
  Object.keys(ws).filter(function(x) { return x.charAt(0) != "!"; }).map(XLSX.utils.decode_cell).forEach(function(x) {
    range.s.c = Math.min(range.s.c, x.c); range.s.r = Math.min(range.s.r, x.r);
    range.e.c = Math.max(range.e.c, x.c); range.e.r = Math.max(range.e.r, x.r);
  });
  ws['!ref'] = XLSX.utils.encode_range(range);
}
No update on that bug. Use this to update the range programmatically: ```js function update_sheet_range(ws) { var range = {s:{r:Infinity, c:Infinity},e:{r:0,c:0}}; Object.keys(ws).filter(function(x) { return x.charAt(0) != "!"; }).map(XLSX.utils.decode_cell).forEach(function(x) { range.s.c = Math.min(range.s.c, x.c); range.s.r = Math.min(range.s.r, x.r); range.e.c = Math.max(range.e.c, x.c); range.e.r = Math.max(range.e.r, x.r); }); ws['!ref'] = XLSX.utils.encode_range(range); } ```
Sign in to join this conversation.
No Milestone
No Assignees
1 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: sheetjs/sheetjs#2011
No description provided.