Weird Characters parsed #2081
Reference: sheetjs/sheetjs#2081
I have this example CSV file. When I parse it using:
const workbook = XLSX.read(data, { type: 'array' })
the output contains characters like
<p>Â </p>
where the original text is just a space. Attached file: example_error_character.csv.zip
It's a UTF-8 CSV, but it is missing the BOM. You can see this using xxd.
To force a UTF-8 interpretation, pass the option codepage: 65001:
const workbook = XLSX.read(data, { type: 'array', codepage: 65001 })
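For readers who want to see why a missing BOM leads to this output, here is a small sketch that does not use SheetJS at all. It assumes the "space" in the file is a non-breaking space stored as the UTF-8 bytes C2 A0 (the thread does not show the xxd dump, so this is a guess based on the Â in the output). The helper names hasUtf8Bom and decodeBytewise and the sample bytes are made up for illustration.

```javascript
// Returns true when the data starts with the UTF-8 BOM (bytes EF BB BF).
function hasUtf8Bom(bytes) {
  return bytes.length >= 3 &&
    bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf;
}

// Treat every byte as its own character (a Latin-1 style interpretation).
function decodeBytewise(bytes) {
  return Array.from(bytes, (b) => String.fromCharCode(b)).join("");
}

// Assumed sample: "<p>", then the bytes C2 A0, then "</p>", with no BOM.
const sample = new Uint8Array([
  0x3c, 0x70, 0x3e, // <p>
  0xc2, 0xa0,       // assumed UTF-8 encoding of U+00A0
  0x3c, 0x2f, 0x70, 0x3e, // </p>
]);

const asUtf8 = new TextDecoder("utf-8").decode(sample);
const bytewise = decodeBytewise(sample);

console.log(hasUtf8Bom(sample)); // false
console.log(asUtf8.length);      // 8: the two bytes become one character
console.log(bytewise.length);    // 9: the two bytes become "Â" plus U+00A0
```

Without a BOM or an explicit codepage, a reader has no signal that the two bytes belong together, which matches the extra Â seen in the parsed output.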
Thanks for the response.
I tried adding the codepage option, but it still does not work. It still outputs:
<p>Â </p>
You're right, the array case in https://github.com/SheetJS/sheetjs/blob/master/bits/40_harb.js#L888 does not handle the codepage argument. As a temporary workaround, convert the data to a binary string as shown in https://jsfiddle.net/7Lrmxb8c/
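A rough sketch of that workaround, not copied from the fiddle: convert the bytes to a binary string (one character per byte) and read it with type: 'binary' plus codepage: 65001. The helper names toBinaryString and readUtf8Csv are hypothetical, and the XLSX object is passed in as a parameter only so the snippet stays self-contained. It assumes, per the comment above, that the binary string path honors the codepage option.

```javascript
// Convert a Uint8Array to a "binary string": one character per byte.
// Works in chunks so large files do not exceed the argument limit of apply().
function toBinaryString(bytes) {
  const CHUNK = 0x8000;
  let out = "";
  for (let i = 0; i < bytes.length; i += CHUNK) {
    out += String.fromCharCode.apply(null, bytes.subarray(i, i + CHUNK));
  }
  return out;
}

// Hypothetical wrapper: `XLSX` is the SheetJS module, `data` is an
// ArrayBuffer or Uint8Array holding the BOM-less UTF-8 CSV.
function readUtf8Csv(XLSX, data) {
  const bytes = data instanceof Uint8Array ? data : new Uint8Array(data);
  return XLSX.read(toBinaryString(bytes), { type: "binary", codepage: 65001 });
}
```

Usage would look like const workbook = readUtf8Csv(XLSX, data), in place of the original XLSX.read(data, { type: 'array' }) call.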