CSV euro symbol format error #3185
Labels
No Label
DBF
Dates
Defined Names
Features
Formula
HTML
Images
Infrastructure
Integration
International
ODS
Operations
Performance
PivotTables
Pro
Protection
Read Bug
SSF
SYLK
Style
Write Bug
good first issue
No Milestone
No Assignees
2 Participants
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: sheetjs/sheetjs#3185
Loading…
Reference in New Issue
Block a user
No description provided.
Delete Branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Hello,
I'm trying to export a csv file and I noticed that the
€
symbol is not properly displayed within a cell. It seems to be a miss encoding configuration somewhere.If you take a look at the column
Importo
, you see that all the cells contain€
instead of€
Please review this response. If it makes sense, we will add this to the docs. The issue is obliquely mentioned in "Troubleshooting" but it deserves a longer explanation
SheetJS is mirroring Excel's behavior. Opening the CSV in en-US Excel for Mac shows incorrect text in column C:
This is because the file uses UTF-8 encoding but does not include the BOM. Excel parses the file using a system default codepage (typically codepage 1252 in en-US Windows and 10000 on en-US Mac).
To see the raw bytes in the file, inspect the output from
xxd deadlines_t.csv | head
:The sequence
e2 82 ac
in the 8th line corresponds to the UTF8 encoding of€
.To force a UTF8 encoding on a CSV file without a proper BOM, pass the option
codepage: 65001
toread
orreadFile
.If you are using third party software to generate CSV, ask them to add the BOM.
To verify the behavior in Excel, the BOM
ef bb bf
was manually added to the attached CSV file:The new CSV file opens in Excel with the correct text.
Thanks for your answer, but why I would force UTF8 encoding without a proper BOM?
Is not possible to add the BOM when writing a csv file?
If you are exporting CSVs using SheetJS:
writeFile
always adds the BOM:Small demo: https://jsfiddle.net/kv59zog1/
sheet_to_csv
returns a normal JavaScript string. If you are writing the file using DOM or NodeJS APIs, you will most likely have to add the BOM programmatically before exporting.It's clear, my only issues now is that I have to return a string representation of the genereted csv with the BOM encoding.
Once the file is read, I have an object of type
WorkBook
. As you said, if I useXLSX.utils.sheet_to_csv(file)
it will return a string without BOM encoding.How can I read a csv file and returning its string value keeping the BOM encoding?
If your API requires you to return a string, you can manually prepend
\uFEFF
.To keep it simple, suppose the worksheet represents the following array of arrays:
Most modern JS functions that generate files (e.g. HTML5
download
attribute) will run text data throughTextEncoder
:You will see 7 bytes:
Manually prepending the BOM character will add the required bytes to the output:
You will see the BOM bytes 0xEF 0xBB 0xBF:
If you are using SheetJS to read CSV files, you can force a UTF8 interpretation by passing the option
codepage: 65001
.If you are receiving a raw byte string from
FileReader#readAsBinaryString
or a similar API, pass the optiontype: "binary"