When converting json to text it is converting characters weirdly #2624

Open
opened 2022-03-21 15:01:46 +00:00 by Godrules500 · 12 comments
Godrules500 commented 2022-03-21 15:01:46 +00:00 (Migrated from github.com)

When converting json data to a text or csv it is converting characters like ’ to ’

I see that changing it to use UTF will supposedly fix it according to other documents, but I only see the use case when reading the file and not writing/creating the document.

In your documentation it says that 'txt' is UTF-16, but "string" is UTF8. How would I go about changing that, or can I?

        var workSheet = XLSX.utils.json_to_sheet(results);
        var workBook = XLSX.utils.book_new();

        XLSX.utils.book_append_sheet(workBook, workSheet);
        var content = XLSX.write(workBook, { bookType: 'txt', type: 'string' });
When converting json data to a text or csv it is converting characters like ’ to ’ I see that changing it to use UTF will supposedly fix it according to other documents, but I only see the use case when reading the file and not writing/creating the document. In your documentation it says that 'txt' is UTF-16, but "string" is UTF8. How would I go about changing that, or can I? ``` var workSheet = XLSX.utils.json_to_sheet(results); var workBook = XLSX.utils.book_new(); XLSX.utils.book_append_sheet(workBook, workSheet); var content = XLSX.write(workBook, { bookType: 'txt', type: 'string' }); ```
SheetJSDev commented 2022-03-21 17:39:25 +00:00 (Migrated from github.com)
XLSX.write(workBook, {bookType: "csv", type: "string"});   // CSV as a JS string
XLSX.utils.sheet_to_csv(workSheet);                        // CSV from a worksheet (JS string)

binary will encode as binary strings.

When using writeFile, CSV will generate the binary output and add the UTF8 BOM. TXT aligns with "UTF-16 Text" in Excel. These choices optimize for the common case of generating files that will be read by Excel.

```js XLSX.write(workBook, {bookType: "csv", type: "string"}); // CSV as a JS string XLSX.utils.sheet_to_csv(workSheet); // CSV from a worksheet (JS string) ``` `binary` will encode as binary strings. When using `writeFile`, CSV will generate the binary output and add the UTF8 BOM. TXT aligns with "UTF-16 Text" in Excel. These choices optimize for the common case of generating files that will be read by Excel.
Godrules500 commented 2022-03-21 18:13:06 +00:00 (Migrated from github.com)

Ok, so I've tried sheet_to_txt and write{bookType:'txt'}) and both are rendering the same way. Now I will say, that when I filter and convert one line, it is rendering the ’ character correctly. HOWEVER, when I return all 10,000 lines of data, it is then returning ’. Is there a reason why a small subset of data renders correctly, but the full set of data replaces certain characters?

Ok, so I've tried sheet_to_txt and write{bookType:'txt'}) and both are rendering the same way. Now I will say, that when I filter and convert one line, it is rendering the ’ character correctly. HOWEVER, when I return all 10,000 lines of data, it is then returning ’. Is there a reason why a small subset of data renders correctly, but the full set of data replaces certain characters?
SheetJSDev commented 2022-03-21 18:20:59 +00:00 (Migrated from github.com)

Does the same thing happen with CSV (using sheet_to_csv or write({bookType: "csv", type: "string"})?

In either case, can you share a sample? JSON.stringify(results) should be sufficient to test. If you can't share it publicly, email oss@sheetjs.com

Does the same thing happen with CSV (using `sheet_to_csv` or `write({bookType: "csv", type: "string"})`? In either case, can you share a sample? `JSON.stringify(results)` should be sufficient to test. If you can't share it publicly, email oss@sheetjs.com
Godrules500 commented 2022-03-21 19:05:28 +00:00 (Migrated from github.com)

test.txt

Here is the file. So weirdly I have an array of 10,000 lines. In trying to help get a smaller set of data I found that if I include 978-1232 it didn't work. If I took 979-5000 it worked.

So this file contains lines 978-1232. So in translation, the file on here is 1-255. So if I include line 1 in the sheet_to_txt it messes up the values. If I remove the first line, it works correctly.

Code:

var workSheet = XLSX.utils.json_to_sheet(results);
results = XLSX.utils.sheet_to_txt(workSheet);
return results;
[test.txt](https://github.com/SheetJS/sheetjs/files/8318534/test.txt) Here is the file. So weirdly I have an array of 10,000 lines. In trying to help get a smaller set of data I found that if I include 978-1232 it didn't work. If I took 979-5000 it worked. So this file contains lines 978-1232. So in translation, the file on here is 1-255. So if I include line 1 in the sheet_to_txt it messes up the values. If I remove the first line, it works correctly. Code: ``` var workSheet = XLSX.utils.json_to_sheet(results); results = XLSX.utils.sheet_to_txt(workSheet); return results; ```
SheetJSDev commented 2022-03-21 19:13:39 +00:00 (Migrated from github.com)

Please test the following:

results = XLSX.utils.sheet_to_txt(workSheet, {type: "string"});

If this works, the issue can be resolved with a small patch to https://github.com/SheetJS/sheetjs/blob/master/bits/90_utils.js#L153 (feel free to submit a PR):

	if(typeof $cptable == 'undefined' || opts.type == 'string' || !opts.type) return s;
Please test the following: ```js results = XLSX.utils.sheet_to_txt(workSheet, {type: "string"}); ``` If this works, the issue can be resolved with a small patch to https://github.com/SheetJS/sheetjs/blob/master/bits/90_utils.js#L153 (feel free to submit a PR): ```js if(typeof $cptable == 'undefined' || opts.type == 'string' || !opts.type) return s; ```
Godrules500 commented 2022-03-21 19:21:40 +00:00 (Migrated from github.com)

Sadly it did not work and returned the same results :. Now I have pulled out the file since I'm working in a cloud system called Netsuite, so all I'm importing a file from either node_modules/xlsx/xlsx.js or node_modules/xlsx/dist/xlsx.extendedscript.js.

Sadly it did not work and returned the same results :\. Now I have pulled out the file since I'm working in a cloud system called Netsuite, so all I'm importing a file from either node_modules/xlsx/xlsx.js or node_modules/xlsx/dist/xlsx.extendedscript.js.
Godrules500 commented 2022-03-21 19:27:48 +00:00 (Migrated from github.com)

Is there a way to read the content of json_to_sheet? That way I could determine if the file is being incorrectly manipulated there?

Is there a way to read the content of json_to_sheet? That way I could determine if the file is being incorrectly manipulated there?
SheetJSDev commented 2022-03-21 19:32:34 +00:00 (Migrated from github.com)

The result is a plain JS object, you can directly inspect it. To get a specific cell, you can index with an Excel address:

var A1 = workSheet["A1"];

Please try using node_modules/xlsx/dist/xlsx.full.min.js or node_modules/xlsx/dist/xlsx.core.min.js (and be sure to pull the latest version from npm!). node_modules/xlsx/xlsx.js is designed for use in NodeJS.

Some small test https://jsfiddle.net/sheetjs/vmez8u1g/ shows the effect of the parameters. FF FE 19 20 is a binary encoded version with the BOM while 2019 corresponds to "\u2019" ()

The result is a plain JS object, you can directly inspect it. To get a specific cell, you can index with an Excel address: ```js var A1 = workSheet["A1"]; ``` Please try using node_modules/xlsx/dist/xlsx.full.min.js or node_modules/xlsx/dist/xlsx.core.min.js (and be sure to pull the latest version from npm!). node_modules/xlsx/xlsx.js is designed for use in NodeJS. Some small test https://jsfiddle.net/sheetjs/vmez8u1g/ shows the effect of the parameters. `FF FE 19 20` is a binary encoded version with the BOM while `2019` corresponds to `"\u2019"` (`’`)
Godrules500 commented 2022-03-21 19:58:17 +00:00 (Migrated from github.com)

Continuing with the weird, in sheet_to_csv I added this return out.slice(2, 265).join("") and it works, but with slice(0 or 1, 265) it didn't work.

Ok, I'll try that and see. When I tried importing it in requirejs it was causing some issues and I couldn't run it.

Continuing with the weird, in sheet_to_csv I added this `return out.slice(2, 265).join("")` and it works, but with slice(0 or 1, 265) it didn't work. Ok, I'll try that and see. When I tried importing it in requirejs it was causing some issues and I couldn't run it.
Godrules500 commented 2022-03-21 20:27:30 +00:00 (Migrated from github.com)

For whatever reason I cannot get it to load the xlsx.min.js nor xlsx.core.min.js to load using requirejs inside of Netsuite.

For whatever reason I cannot get it to load the xlsx.min.js nor xlsx.core.min.js to load using requirejs inside of Netsuite.
SheetJSDev commented 2022-03-21 20:33:35 +00:00 (Migrated from github.com)

NetSuite support was verified working in 0.16.1 and there was no change to the RequireJS logic since then. Is there some reported error?

NetSuite support was verified working in 0.16.1 and there was no change to the RequireJS logic since then. Is there some reported error?
Godrules500 commented 2022-03-22 13:23:24 +00:00 (Migrated from github.com)

Ok, I have the xlsx.full.min.js working and I'm still getting the same issue. So I'm not sure if the code is converting it to a different character or if it is the javascript version causing the issue. since the issue seems to be when converting out to a string.

Ok, I have the xlsx.full.min.js working and I'm still getting the same issue. So I'm not sure if the code is converting it to a different character or if it is the javascript version causing the issue. since the issue seems to be when converting out to a string.
Sign in to join this conversation.
No Milestone
No Assignees
1 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: sheetjs/sheetjs#2624
No description provided.