About Chinese garbled code with DBF files #2781
Reference: sheetjs/sheetjs#2781
When I try to create a DBF file that includes Chinese words, the words are converted to underscores, like this: '__'
Here is the code and result:
1 | 2020-01-04 | English
2 | 2020-01-04 | __
Thanks for raising the issue!
Attached is a ZIP file containing 4 DBF files for 4 separate encodings. Please open each one in your application and confirm all four display the correct characters.
issue2781.zip
As for the fix, there are two parts:
This can be patched as follows:
This is not the full fix: the DBF writer needs to use the full encoded lengths when calculating field widths (long Chinese strings will overflow), and the change will affect how some of the other legacy writers work. It is, however, enough to verify encoding correctness.
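To illustrate the overflow concern, here is a minimal sketch of why a field width counted in characters can be too small once text is encoded in a double-byte codepage. The helpers `dbcsByteLength` and `fieldWidth` are hypothetical names, the sample text "中文" is only a placeholder, and the "1 byte for ASCII, 2 bytes otherwise" rule is a simplifying assumption. This is not how SheetJS computes widths.

```javascript
// Sketch only: estimate the encoded size of a string in a double-byte codepage.
// Assumption: ASCII code points take 1 byte and every other character takes
// 2 bytes. Real codepages can differ (some characters may need more bytes).
function dbcsByteLength(str) {
  let bytes = 0;
  for (const ch of str) {
    bytes += ch.codePointAt(0) < 0x80 ? 1 : 2;
  }
  return bytes;
}

// Hypothetical helper: widest value in a column, measured in bytes
// rather than in characters.
function fieldWidth(values) {
  return values.reduce(
    (max, v) => Math.max(max, dbcsByteLength(String(v))),
    0
  );
}

const column = ["English", "中文"];
for (const v of column) {
  // Character count and estimated byte count diverge for Chinese text.
  console.log(v, v.length, dbcsByteLength(v));
}
console.log("field width in bytes:", fieldWidth(column));
```

A writer that sizes the field from `String.length` would reserve 2 bytes for a 2-character Chinese string that, under the assumption above, needs 4.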
The main supported codepages for Chinese characters are:
There are two other codepages with support for the two characters in the example:
Thanks for replying, it works for me now.
Testing this against the latest version appears to work. The web version at https://jsfiddle.net/bg10f526/ automatically generates and downloads test.dbf. The web file is identical to the file generated in NodeJS. For version 0.18.11, the MD5 of the generated file should be ec756d220aa7e6ce5e7d810406617842.
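One way to check a generated file against that digest in NodeJS is sketched below, using only the built-in `crypto` and `fs` modules. The helper names `md5Hex` and `matchesMd5` are hypothetical, and the usage line assumes `test.dbf` was saved to the current working directory.

```javascript
const crypto = require("crypto");
const fs = require("fs");

// Return the lowercase hex MD5 digest of a Buffer or string.
function md5Hex(data) {
  return crypto.createHash("md5").update(data).digest("hex");
}

// Hypothetical helper: compare a file's digest against an expected value.
function matchesMd5(path, expected) {
  return md5Hex(fs.readFileSync(path)) === expected.toLowerCase();
}

// Usage (assumes test.dbf exists in the current directory):
// console.log(matchesMd5("test.dbf", "ec756d220aa7e6ce5e7d810406617842"));
```

A mismatch does not by itself mean the encoding is wrong. It only shows that the bytes differ from the file described above, for example because a different library version was used.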