Wrong order for undefined cells in the first row in case of date types #2460
Labels
No Label
DBF
Dates
Defined Names
Features
Formula
HTML
Images
Infrastructure
Integration
International
ODS
Operations
Performance
PivotTables
Pro
Protection
Read Bug
SSF
SYLK
Style
Write Bug
good first issue
No Milestone
No Assignees
1 Participants
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: sheetjs/sheetjs#2460
Loading…
Reference in New Issue
No description provided.
Delete Branch "%!s(<nil>)"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Afaics, date columns are shifted to the end (when they are empty on the first row under the headers) because the json order is the property inserts, but the undefined header value is skipped in
function make_json_row
here, hence the column is incorrectly reordered to the end when the value is finally inserted the first time in the next rows.I'm going to try this change, locally on my side and I'll create a PR after my test succeeds.
(notice that this change seems to break old tests about json output, because it introduces null fields)
Solved with native option
defval: null
insheet_to_json
(no PR needed)Confirmed/no pr/R-D EEngineer is choice/no money for spending my entire
life instructing others positively for Lightning Locker Service will
require you motherfuckers to go to bed at night with a clear conscience. My
conscience is clear, I will NOT change who I am for anybody.
-Bryan Bellipanni
November 24,2021
On Wed, Nov 24, 2021 at 09:03 giuliohome @.***> wrote:
Actually, if this is only for ordering purpose, it could make sense to introduce a different option, e.g.
keeporder: true
, that somehow corresponds todefval: null
but only for the first row. In this sense, the issue can be reopened and a PR could be made accordingly.(btw, I've blocked the above spammer)
Merged the two commits into one and PR reopenedAll rewritten into a new PR with only one commit.The final code is
as described in my last comment.
Can you actually share a case when this happens? To be clear, the ordering when calling
XLSX.utils.sheet_to_json
is determined before scanning the first data row: https://github.com/SheetJS/sheetjs/blob/master/bits/90_utils.js#L69Suppose you start with:
The actual behavior depends on the
header
parameter:If you pass
header: 1
, it will preserve the order and create arrays:If you pass
header: "A"
it will preserve the order and create objects:If you pass an actual array, it will only grab those fields (in the order specified in the array):
The default behavior is to scan the first column and use a de-duplicated
__EMPTY
key:When multiple keys are empty, other columns are assigned
__EMPTY_##
:Sir,
you replied after many months, you closed my PR and this issue.
Still, you seem to be asking something to me:
I'm doing another job, at the moment, and it would take effort from me to come back here, assess the possible changes to this repo during these months, have a look again at my issue, my code of almost one year ago... and so on.
I would be happy to do that and to carefully answer your question, if and only if this is really appreciated, which means that now you're going to read and follow the conversation, my example, and so on... Given the fact that, after your question to me, you immediately added an explanation and closed the issue, I tend to think that you're not really open to a dialog with me.
So, in conclusion, I'm happy to work for this repository, to verify again this issue and write down an example or any other checks I could do, if yours is a genuine question , not a rhetorical one.
Thank you
Sorry for delay. The open source work is unpaid volunteer work, so we have to balance this with our actual work and lives.
The loop across the cells of a given row is always performed in the same order (from lowest to highest column index). The body loop does mutate the iteration variable so the loop should always start from the first column and end on the last. The assignments to the row are of the form
row[hdr[C]] =
, so values in the same column will always map to the same header and assignments will be in the same order. (row[hdr[3]]
will always be assigned afterrow[hdr[1]]
). It is possible for the object order to be different if the header had a duplicate (for example, ifhdr = ["a","b","c","a"]
), but that case is prevented bysheet_to_json
.make_json_row
is called repeatedly bysheet_to_json
in a loop after the headers are determined. The relevant loop assigns headers based on theheader
parameter. In the default case, it de-duplicates if there are multiple columns with the same header..If you can give an example, we can revisit the original patch.
.
Upon closer inspection, there are actually two corner cases that could use a fix. Since they are in a different place, it would be easiest to just re-fork and submit a new PR:
patch
bits/90_utils.js
:This behavior should be noted in the documentation. Patch
docbits/82_util.md
:Thank you very much.
Today I'll review my fork and search my old example and I'll let you know as soon as possible (hopefully within 24 hours) if there is anything still relevant nowadays.
Thank you again very much for your availability and for sharing your precious work.
Ok, here we go. This is what I'm trying to say with an example.
So we have
Then we can do
where my usage scenario's output is
I hope the example clarifies what I meant in the first place.
Thank you again
It would be the same proposal about bits/90_utils.js replacement at line 19 for a new
keeporder
optionThe above example, with the new option, would result in
If we're following the "Zen of SheetJS", it would be better to post-process the array of arrays instead of adding a new option:
Object.fromEntries
is a neat function added in Chrome 73 and Node 12Of course that was a minimal, reproducible example, I'm already doing a lot of postprocessing and I have many other options already activated, obviously I'm not writing an app to copy the xlsx as it is ;-)
The good thing of open source is indeed that I can keep my private fork, for me it would be better to follow the general programming principle called "separation of concerns": I have a fork of SheetJS which is responsible to read and write fom/to xlsx/json (in what is supposed to be the expected format) and a post-processing (or, well, the business logic of the app) which is responsible to implement the business domain logic rules: IMO the code is much more readable with this SoC (also because I could be interested in doing other changes, inspired by the same guidelines, to the json_to_sheet and so on, but at this point I will keep all them private).
Thank you anyway very much for your code! It is always very interesting and important to get feedback from peer reviews and suggestions from other developers.