CSV misinterpreted #3083

Open
opened 2024-03-07 08:59:19 +00:00 by enboig · 3 comments

I have a CSV file that uses , as the field separator and " as the text delimiter. Some lines contain tabs (\t) inside quoted text, and this causes errors because \t is taken as the separator.

I think this is a bug, because text inside " should be completely ignored when detecting the separator.
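To illustrate the expected behavior, here is a minimal sketch of delimiter guessing that skips everything inside double quotes. It is not the library's actual detection logic; the function names `countDelimitersOutsideQuotes` and `guessDelimiter`, the candidate list, and the sample string are all hypothetical and exist only for this example.

```javascript
// Hypothetical helper (not part of SheetJS): count candidate delimiters
// that appear OUTSIDE double-quoted text.
function countDelimitersOutsideQuotes(text, candidates = [",", "\t", ";", "|"]) {
  const counts = new Map(candidates.map((c) => [c, 0]));
  let inQuotes = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (ch === '"') {
      // A doubled quote inside a quoted field is an escaped literal quote.
      if (inQuotes && text[i + 1] === '"') { i++; continue; }
      inQuotes = !inQuotes;
    } else if (!inQuotes && counts.has(ch)) {
      counts.set(ch, counts.get(ch) + 1);
    }
  }
  return counts;
}

// Pick the most frequent candidate; ties resolve to the earliest candidate.
function guessDelimiter(text, candidates) {
  const counts = countDelimitersOutsideQuotes(text, candidates);
  let best = null;
  for (const [delim, n] of counts) {
    if (best === null || n > counts.get(best)) best = delim;
  }
  return best;
}

// Made-up sample: 4 commas, and 5 tabs that all sit inside a quoted field.
const sample = 'a,b,c\n1,"\t\t\tx\t\ty",3\n';
console.log(JSON.stringify(guessDelimiter(sample))); // ","
```

A raw character count over the same sample would see more tabs than commas, which is the kind of misdetection described above.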

Author

I have uploaded a simple example to reproduce it.

Owner

Cannot reproduce. Are you using the latest version of the library?

To be sure, the file in question is 53 bytes and has an MD5 checksum of a5cc2a2517cf85c99bdc05bf12f9856c:

$ wc -c example.csv 
      53 example.csv
$ md5 example.csv 
MD5 (example.csv) = a5cc2a2517cf85c99bdc05bf12f9856c
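
The `md5` command is not available in every environment (many Linux systems ship `md5sum` instead), so the same check can be sketched in NodeJS using only the built-in `crypto` module. The helper name `describeBuffer` is hypothetical and used only for this example.

```javascript
const crypto = require("crypto");

// Hypothetical helper: report the byte length and MD5 hex digest of a Buffer.
function describeBuffer(buf) {
  return {
    bytes: buf.length,
    md5: crypto.createHash("md5").update(buf).digest("hex"),
  };
}

// Usage, assuming example.csv is in the current directory:
//   const fs = require("fs");
//   console.log(describeBuffer(fs.readFileSync("example.csv")));
```

If the local copy matches the file discussed here, `bytes` should be 53 and `md5` should equal the checksum shown above.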

You can test in NodeJS by running the following commands in WSL or in a macOS/Linux terminal:

mkdir i3083
cd i3083/
npm i --save https://cdn.sheetjs.com/xlsx-0.20.1/xlsx-0.20.1.tgz
node -pe 'var XLSX = require("xlsx"); var wb = XLSX.readFile("../example.csv"); XLSX.utils.sheet_to_json(wb.Sheets.Sheet1, {header:1})'

The output is consistent with a comma delimiter:

[
  [ 'field1', 'field2', 'field3' ],
  [ 'value1', ' "\t\tvalue2"', 'value3\t\t' ]
]
Author

I use dropsheet. This is a capture of the JSON sent for example.csv.

Reference: sheetjs/sheetjs#3083