how to resolve the encode problem? #739
Reference: sheetjs/sheetjs#739
I have an XML file that includes some Chinese characters. When I use the XLSX.write API, the Chinese characters are changed.
Here is my code
(content is an XML string):
var workbook = XLSX.read(content, {type: 'binary'});
workbook.Sheets['ag-grid']['A24'].v
The cell value comes out changed; the original value is '中文测试'. I apologize for my bad English.
Can you share the original file and a screenshot of what Excel shows?
I have the same question. I'm working with Japanese and Korean text. The library cannot read these characters, but English text shows no problem.
The target Excel file was generated by a Python workbook library.
Strangely, the data is read correctly after I export a new XLSX file from the original one without making any changes.
@kwangin Can you upload or send us:
If you are running in the browser, are you using the full version or the core version?
I also have the same question. @kwangin did you solve the problem?
@SheetJSDev My files are as follows
预算_月度累计预算_20170920_1216.xlsx
@PWDream thanks for sharing! This was very helpful. It looks like a file from Apache POI.
The tool that wrote the file used xml entities rather than the UTF8 string. Despite the fact that the subfile clearly has a UTF-8 encoding, the writer opted to use XML entities. This is the text for A1:
In a good file generated by Excel, the stored text is written directly:
Looking at other files in the test suite, those values should have been encoded using hexadecimal notation:
The XLSX reader accepts UTF8-encoded Chinese characters (second example) and hexadecimal digit entities (third example),
but the relevant entity checker does not accept the decimal digit case (problematic case). Since Excel accepts it, we probably should do the same. Fortunately the fix is straightforward.
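To illustrate the difference between the decimal and hexadecimal cases, here is a minimal sketch of a decoder that accepts both forms of numeric XML entity. It is not the SheetJS implementation: decodeNumericEntities is a hypothetical helper, and the sample strings are illustrative (they encode the first two characters of the reporter's '中文测试'), not the actual contents of A1.

```javascript
// Hypothetical helper, NOT the library's code: decodes numeric XML
// character references in both decimal (&#20013;) and hexadecimal
// (&#x4E2D;) notation.
function decodeNumericEntities(text) {
  return text.replace(/&#(x[0-9a-fA-F]+|[0-9]+);/g, function (match, body) {
    var isHex = body.charAt(0) === "x";
    var codePoint = parseInt(isHex ? body.slice(1) : body, isHex ? 16 : 10);
    // Leave references outside the Unicode range untouched.
    if (codePoint > 0x10ffff) return match;
    return String.fromCodePoint(codePoint);
  });
}

// Illustrative inputs (assumed, not taken from the shared file).
var decimalForm = "&#20013;&#25991;";   // decimal entities, the problematic case
var hexForm = "&#x4E2D;&#x6587;";       // hexadecimal entities
var directForm = "中文";                // characters stored directly as UTF-8

console.log(decodeNumericEntities(decimalForm) === directForm); // true
console.log(decodeNumericEntities(hexForm) === directForm);     // true
```

A reader that only matches the hexadecimal branch of that pattern would leave the decimal references in the cell text as-is, which matches the symptom reported here.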
Thank you very much for your reply. When do you expect to fix this problem?
A small local change seemed to have fixed the immediate problem but we need to do some more tests to be sure. Expect a release tomorrow or Friday
Thank you very much
@SheetJSDev Hello, is this problem solved?
@SheetJSDev Hello, is this problem solved?
@SheetJSDev Hello, is this problem solved? Or tell me what to change and I will make the change myself. Thanks.
@PWDream it's available now in version 0.11.4
@SheetJSDev Thank you very much for your changes. I have upgraded to the latest version and the encoding problem is solved, but why does the output now have three empty lines after the last line?
My files are as follows
预算_+——)(-&……%¥#@!-_20170925_0715.xlsx
The file has a weird record near the end of the worksheet:
If you want to hide those blank rows, pass the option blankrows:false when converting to CSV. For example, in node:
Thank you very much!
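The blankrows option mentioned above can be sketched roughly as follows. This is an assumption-laden example rather than the snippet originally posted: csvWithoutBlankRows is a hypothetical helper, the SheetJS module is passed in as a parameter so the same function can be used in node or the browser, and the file name in the usage comment is a placeholder.

```javascript
// Hypothetical helper: converts a worksheet to CSV and skips blank rows.
// `XLSX` is the SheetJS module, supplied by the caller.
function csvWithoutBlankRows(XLSX, worksheet) {
  // blankrows:false asks sheet_to_csv to omit rows with no cell values.
  return XLSX.utils.sheet_to_csv(worksheet, { blankrows: false });
}

// Usage in node (assumes the xlsx package is installed; "input.xlsx" is a placeholder):
// var XLSX = require("xlsx");
// var workbook = XLSX.readFile("input.xlsx");
// var firstSheet = workbook.Sheets[workbook.SheetNames[0]];
// console.log(csvWithoutBlankRows(XLSX, firstSheet));
```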
Hello
I have the same encoding issue.
The file is XLS BIFF8 (Office 97-2004 document) and contains Japanese characters.
I used codepage:932 for Japanese, and it works well for CSV files.
Unfortunately, it does not work for the Office 97 XLS file.
When I manually open the Office 97 XLS file and save it as XLSX or CSV, an alert like this appears:
"Some features in your workbook must be lost if you save it as Microsoft Excel 5.0/95 Workbook.
Do you want to keep using that format?"
After I click "Yes" and save, it works without any problem (of course, we still use codepage: 932).
Could you please tell me how I can solve this issue?
Thanks.
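For readers following along, passing a codepage when reading might look like the sketch below. readBinaryWithCodepage is a hypothetical helper, the SheetJS module is passed in as a parameter, and the file name in the usage comment is a placeholder. Whether the option takes precedence over a CodePage record stored inside the file is not established in this thread, so treat this as an illustration of the call shape only.

```javascript
// Hypothetical helper: reads a binary string with an explicit codepage.
// `XLSX` is the SheetJS module, supplied by the caller.
function readBinaryWithCodepage(XLSX, binaryString, codepage) {
  // codepage 932 is the Japanese (Shift-JIS) code page used in this thread.
  return XLSX.read(binaryString, { type: "binary", codepage: codepage });
}

// Usage in node (assumes the xlsx package is installed; "test.xls" is a placeholder):
// var XLSX = require("xlsx");
// var fs = require("fs");
// var data = fs.readFileSync("test.xls", "binary");
// var workbook = readBinaryWithCodepage(XLSX, data, 932);
```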
@WangHwaKok is this happening when you read in a file or make one from scratch? Can you share the bad file?
Hello @SheetJSDev
Yes, I am trying to read an XLS file downloaded from the site.
I am sharing the Office 97 XLS file (it is zipped).
test.zip
Please take a look. I hope we can solve the issue easily.
Does this actually work for you in Excel @WangHwaKok ? The file has a CodePage record but the specified codepage is 1252.
Yes, it works well, and I can see the Japanese characters in Excel.
Normally, the original Japanese characters are not visible.
I am using Windows 10 and changed the region/language setting to Japanese.
As I said before, after I save the file again and click "Yes" on the alert message, I can parse the file with codepage: 932 and see the original characters.
I hope this is helpful.
This is very strange. I will raise a new issue