Wrong order for undefined cells in the first row in case of date types #2460

Closed
opened 2021-11-24 11:53:23 +00:00 by giuliohome · 12 comments
giuliohome commented 2021-11-24 11:53:23 +00:00 (Migrated from github.com)

Afaics, date columns are shifted to the end (when they are empty on the first row under the headers) because the json order is the property inserts, but the undefined header value is skipped in function make_json_row here, hence the column is incorrectly reordered to the end when the value is finally inserted the first time in the next rows.

I'm going to try this change, locally on my side and I'll create a PR after my test succeeds.

		if(val === undefined || val.t === undefined) {
			if(defval === undefined) {
				row[hdr[C]] = null; // <-- this is my proposed solution to the issue
				continue;
			}
			if(hdr[C] != null) { row[hdr[C]] = defval; }
			continue;
		}

(notice that this change seems to break old tests about json output, because it introduces null fields)

Afaics, date columns are shifted to the end (when they are empty on the first row under the headers) because the json order is the property inserts, but the undefined header value is skipped in `function make_json_row` [here](https://github.com/SheetJS/sheetjs/blob/master/bits/90_utils.js#L19), hence the column is incorrectly reordered to the end when the value is finally inserted the first time in the next rows. I'm going to **try** this change, locally on my side and I'll create a PR after my test succeeds. ```javascript if(val === undefined || val.t === undefined) { if(defval === undefined) { row[hdr[C]] = null; // <-- this is my proposed solution to the issue continue; } if(hdr[C] != null) { row[hdr[C]] = defval; } continue; } ``` (notice that this change seems to break old tests about json output, because it introduces null fields)
giuliohome commented 2021-11-24 15:02:09 +00:00 (Migrated from github.com)

Solved with native option defval: null in sheet_to_json (no PR needed)

Solved with native option `defval: null` in `sheet_to_json` (no PR needed)
ghost commented 2021-11-24 15:40:50 +00:00 (Migrated from github.com)

Confirmed/no pr/R-D EEngineer is choice/no money for spending my entire
life instructing others positively for Lightning Locker Service will
require you motherfuckers to go to bed at night with a clear conscience. My
conscience is clear, I will NOT change who I am for anybody.
-Bryan Bellipanni
November 24,2021

On Wed, Nov 24, 2021 at 09:03 giuliohome @.***> wrote:

Closed #2460 https://github.com/SheetJS/sheetjs/issues/2460.


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
https://github.com/SheetJS/sheetjs/issues/2460#event-5667542034, or
unsubscribe
https://github.com/notifications/unsubscribe-auth/AVQWQIAWPLZP5EZYE7OA52TUNT5CRANCNFSM5IV32GDA
.

Confirmed/no pr/R-D EEngineer is choice/no money for spending my entire life instructing others positively for Lightning Locker Service will require you motherfuckers to go to bed at night with a clear conscience. My conscience is clear, I will NOT change who I am for anybody. -Bryan Bellipanni November 24,2021 On Wed, Nov 24, 2021 at 09:03 giuliohome ***@***.***> wrote: > Closed #2460 <https://github.com/SheetJS/sheetjs/issues/2460>. > > — > You are receiving this because you are subscribed to this thread. > Reply to this email directly, view it on GitHub > <https://github.com/SheetJS/sheetjs/issues/2460#event-5667542034>, or > unsubscribe > <https://github.com/notifications/unsubscribe-auth/AVQWQIAWPLZP5EZYE7OA52TUNT5CRANCNFSM5IV32GDA> > . >
giuliohome commented 2021-11-24 16:13:00 +00:00 (Migrated from github.com)

Actually, if this is only for ordering purpose, it could make sense to introduce a different option, e.g. keeporder: true, that somehow corresponds to defval: null but only for the first row. In this sense, the issue can be reopened and a PR could be made accordingly.

(btw, I've blocked the above spammer)

Actually, if this is only for ordering purpose, it could make sense to introduce a different option, e.g. `keeporder: true`, that _somehow_ corresponds to `defval: null` but only for the **first row**. In this sense, the issue can be reopened and a PR could be made accordingly. (btw, I've blocked the above spammer)
giuliohome commented 2021-11-24 17:03:10 +00:00 (Migrated from github.com)

Merged the two commits into one and PR reopened All rewritten into a new PR with only one commit.
The final code is

if(defval === undefined) { 
    if (R == 1 && o.keeporder) row[hdr[C]] = null; 
    continue; 
}

as described in my last comment.

<strike>Merged the two commits into [one](https://github.com/giuliohome/sheetjs/commit/06001899e56d3a7844658656e97846f8b5ae483f) and PR reopened</strike> All rewritten into a [new PR](https://github.com/SheetJS/sheetjs/pull/2462) with only one commit. The final code is ```javascript if(defval === undefined) { if (R == 1 && o.keeporder) row[hdr[C]] = null; continue; } ``` as described in my last comment.
SheetJSDev commented 2022-03-08 07:50:48 +00:00 (Migrated from github.com)

Can you actually share a case when this happens? To be clear, the ordering when calling XLSX.utils.sheet_to_json is determined before scanning the first data row: https://github.com/SheetJS/sheetjs/blob/master/bits/90_utils.js#L69

Suppose you start with:

var ws = XLSX.utils.aoa_to_sheet([
  ["a", ,"c"], // second header missing
  [1, 2, 3],
  [4, 5, 6]
]);

The actual behavior depends on the header parameter:

If you pass header: 1, it will preserve the order and create arrays:

// XLSX.utils.sheet_to_json(ws, {header:1})
[
  [ 'a', <1 empty item>, 'c' ],
  [ 1, 2, 3 ],
  [ 4, 5, 6 ]
]

If you pass header: "A" it will preserve the order and create objects:

// XLSX.utils.sheet_to_json(ws, {header:"A"})
[
  { A: 'a', C: 'c' },
  { A: 1, B: 2, C: 3 },
  { A: 4, B: 5, C: 6 }
]

If you pass an actual array, it will only grab those fields (in the order specified in the array):

// XLSX.utils.sheet_to_json(ws, {header:["c"]})
[
  { c: 'a' },
  { c: 1 },
  { c: 4 }
]

The default behavior is to scan the first column and use a de-duplicated __EMPTY key:

// XLSX.utils.sheet_to_json(ws)
[
  { a: 1, __EMPTY: 2, c: 3 },
  { a: 4, __EMPTY: 5, c: 6 }
]

When multiple keys are empty, other columns are assigned __EMPTY_##:

var ws = XLSX.utils.aoa_to_sheet([
  [ , , , "d"], // first three columns blank
  [1,2,3],
  [4,5,6]
]);
XLSX.utils.sheet_to_json(ws)
[
  { __EMPTY: 1, __EMPTY_1: 2, __EMPTY_2: 3 },
  { __EMPTY: 4, __EMPTY_1: 5, __EMPTY_2: 6 }
]
Can you actually share a case when this happens? To be clear, the ordering when calling `XLSX.utils.sheet_to_json` is determined before scanning the first data row: https://github.com/SheetJS/sheetjs/blob/master/bits/90_utils.js#L69 Suppose you start with: ```js var ws = XLSX.utils.aoa_to_sheet([ ["a", ,"c"], // second header missing [1, 2, 3], [4, 5, 6] ]); ``` The actual behavior depends on the `header` parameter: If you pass `header: 1`, it will preserve the order and create arrays: ```js // XLSX.utils.sheet_to_json(ws, {header:1}) [ [ 'a', <1 empty item>, 'c' ], [ 1, 2, 3 ], [ 4, 5, 6 ] ] ``` If you pass `header: "A"` it will preserve the order and create objects: ```js // XLSX.utils.sheet_to_json(ws, {header:"A"}) [ { A: 'a', C: 'c' }, { A: 1, B: 2, C: 3 }, { A: 4, B: 5, C: 6 } ] ``` If you pass an actual array, it will only grab those fields (in the order specified in the array): ```js // XLSX.utils.sheet_to_json(ws, {header:["c"]}) [ { c: 'a' }, { c: 1 }, { c: 4 } ] ``` The default behavior is to scan the first column and use a de-duplicated `__EMPTY` key: ```js // XLSX.utils.sheet_to_json(ws) [ { a: 1, __EMPTY: 2, c: 3 }, { a: 4, __EMPTY: 5, c: 6 } ] ``` When multiple keys are empty, other columns are assigned `__EMPTY_##`: ```js var ws = XLSX.utils.aoa_to_sheet([ [ , , , "d"], // first three columns blank [1,2,3], [4,5,6] ]); XLSX.utils.sheet_to_json(ws) [ { __EMPTY: 1, __EMPTY_1: 2, __EMPTY_2: 3 }, { __EMPTY: 4, __EMPTY_1: 5, __EMPTY_2: 6 } ] ```
giuliohome commented 2022-03-08 08:23:00 +00:00 (Migrated from github.com)

Sir,
you replied after many months, you closed my PR and this issue.
Still, you seem to be asking something to me:

Can you actually share a case when this happens?

I'm doing another job, at the moment, and it would take effort from me to come back here, assess the possible changes to this repo during these months, have a look again at my issue, my code of almost one year ago... and so on.
I would be happy to do that and to carefully answer your question, if and only if this is really appreciated, which means that now you're going to read and follow the conversation, my example, and so on... Given the fact that, after your question to me, you immediately added an explanation and closed the issue, I tend to think that you're not really open to a dialog with me.

So, in conclusion, I'm happy to work for this repository, to verify again this issue and write down an example or any other checks I could do, if yours is a genuine question , not a rhetorical one.
Thank you

Sir, you replied after many months, you closed my PR and this issue. Still, you seem to be asking something to me: > Can you actually share a case when this happens? I'm doing another job, at the moment, and it would take effort from me to come back here, assess the possible changes to this repo during these months, have a look again at my issue, my code of almost one year ago... and so on. I would be happy to do that and to carefully answer your question, if and only if this is really appreciated, which means that now you're going to read and follow the conversation, my example, and so on... Given the fact that, after your question to me, you immediately added an explanation and closed the issue, I tend to think that you're not really open to a dialog with me. So, in conclusion, I'm happy to work for this repository, to verify again this issue and write down an example or any other checks I could do, if yours is a genuine question , not a rhetorical one. Thank you
SheetJSDev commented 2022-03-08 09:36:08 +00:00 (Migrated from github.com)

Sorry for delay. The open source work is unpaid volunteer work, so we have to balance this with our actual work and lives.

The loop across the cells of a given row is always performed in the same order (from lowest to highest column index). The body loop does mutate the iteration variable so the loop should always start from the first column and end on the last. The assignments to the row are of the form row[hdr[C]] = , so values in the same column will always map to the same header and assignments will be in the same order. (row[hdr[3]] will always be assigned after row[hdr[1]]). It is possible for the object order to be different if the header had a duplicate (for example, if hdr = ["a","b","c","a"]), but that case is prevented by sheet_to_json.

make_json_row is called repeatedly by sheet_to_json in a loop after the headers are determined. The relevant loop assigns headers based on the header parameter. In the default case, it de-duplicates if there are multiple columns with the same header..

If you can give an example, we can revisit the original patch.

.

Upon closer inspection, there are actually two corner cases that could use a fix. Since they are in a different place, it would be easiest to just re-fork and submit a new PR:

  1. The dedupe scanner must restart with each collision. Test case:
var ws = XLSX.utils.aoa_to_sheet([
  ["a_1","a","a"],
  [1,2,3],
  [4,5,6]
]);
XLSX.utils.sheet_to_json(ws);

patch bits/90_utils.js:

diff --git a/bits/90_utils.js b/bits/90_utils.js
@@ -77,7 +77,7 @@ function sheet_to_json(sheet/*:Worksheet*/, opts/*:?Sheet2JSONOpts*/) {
                                if(val == null) val = {w: "__EMPTY", t: "s"};
                                vv = v = format_cell(val, null, o);
                                counter = 0;
-                               for(CC = 0; CC < hdr.length; ++CC) if(hdr[CC] == vv) vv = v + "_" + (++counter);
+                               for(CC = 0; CC < hdr.length; ++CC) if(hdr[CC] == vv) { vv = v + "_" + (++counter); CC = -1; }
                                hdr[C] = vv;
                }
        }
  1. A collision in a manually specified header array:
XLSX.utils.sheet_to_json(ws, { header: ["a", "b", "a"] });

This behavior should be noted in the documentation. Patch docbits/82_util.md:

diff --git a/docbits/82_util.md b/docbits/82_util.md
@@ -438,8 +438,12 @@ takes an options argument:
 | array of strings | Use specified strings as keys in row objects              |
 | (default)        | Read and disambiguate first row as keys                   |
 
-If header is not `1`, the row object will contain the non-enumerable property
-`__rowNum__` that represents the row of the sheet corresponding to the entry.
+- If header is not `1`, the row object will contain the non-enumerable property
+  `__rowNum__` that represents the row of the sheet corresponding to the entry.
+
+- If header is an array, the keys will not be disambiguated.  This can lead to
+  unexpected results if the array values are not unique!
+
 
 <details>
   <summary><b>Examples</b> (click to show)</summary>
Sorry for delay. The open source work is unpaid volunteer work, so we have to balance this with our actual work and lives. [The loop across the cells of a given row](https://github.com/SheetJS/sheetjs/blob/master/bits/90_utils.js#L16-L41) is always performed in the same order (from lowest to highest column index). The body loop does mutate the iteration variable so the loop should always start from the first column and end on the last. The assignments to the row are of the form `row[hdr[C]] =` , so values in the same column will always map to the same header and assignments will be in the same order. (`row[hdr[3]]` will always be assigned after `row[hdr[1]]`). It is possible for the object order to be different if the header had a duplicate (for example, if `hdr = ["a","b","c","a"]`), but that case is prevented by `sheet_to_json`. `make_json_row` is called repeatedly by `sheet_to_json` [in a loop](https://github.com/SheetJS/sheetjs/blob/master/bits/90_utils.js#L84-L87) after the headers are determined. The relevant loop [assigns headers based on the `header` parameter](https://github.com/SheetJS/sheetjs/blob/master/bits/90_utils.js#L69-L83). In the default case, it [de-duplicates if there are multiple columns with the same header.](https://github.com/SheetJS/sheetjs/blob/master/bits/90_utils.js#L78-L80). If you can give an example, we can revisit the original patch. . Upon closer inspection, there are actually two corner cases that could use a fix. Since they are in a different place, it would be easiest to just re-fork and submit a new PR: 1) The [dedupe scanner must restart with each collision](https://github.com/SheetJS/sheetjs/blob/master/bits/90_utils.js#L80). Test case: ```js var ws = XLSX.utils.aoa_to_sheet([ ["a_1","a","a"], [1,2,3], [4,5,6] ]); XLSX.utils.sheet_to_json(ws); ``` patch `bits/90_utils.js`: ```diff diff --git a/bits/90_utils.js b/bits/90_utils.js @@ -77,7 +77,7 @@ function sheet_to_json(sheet/*:Worksheet*/, opts/*:?Sheet2JSONOpts*/) { if(val == null) val = {w: "__EMPTY", t: "s"}; vv = v = format_cell(val, null, o); counter = 0; - for(CC = 0; CC < hdr.length; ++CC) if(hdr[CC] == vv) vv = v + "_" + (++counter); + for(CC = 0; CC < hdr.length; ++CC) if(hdr[CC] == vv) { vv = v + "_" + (++counter); CC = -1; } hdr[C] = vv; } } ``` 2) A collision in a manually specified header array: ```js XLSX.utils.sheet_to_json(ws, { header: ["a", "b", "a"] }); ``` This behavior should be noted in the documentation. Patch `docbits/82_util.md`: ```diff diff --git a/docbits/82_util.md b/docbits/82_util.md @@ -438,8 +438,12 @@ takes an options argument: | array of strings | Use specified strings as keys in row objects | | (default) | Read and disambiguate first row as keys | -If header is not `1`, the row object will contain the non-enumerable property -`__rowNum__` that represents the row of the sheet corresponding to the entry. +- If header is not `1`, the row object will contain the non-enumerable property + `__rowNum__` that represents the row of the sheet corresponding to the entry. + +- If header is an array, the keys will not be disambiguated. This can lead to + unexpected results if the array values are not unique! + <details> <summary><b>Examples</b> (click to show)</summary> ```
giuliohome commented 2022-03-08 09:44:32 +00:00 (Migrated from github.com)

Thank you very much.
Today I'll review my fork and search my old example and I'll let you know as soon as possible (hopefully within 24 hours) if there is anything still relevant nowadays.
Thank you again very much for your availability and for sharing your precious work.

Thank you very much. Today I'll review my fork and search my old example and I'll let you know as soon as possible (hopefully within 24 hours) if there is anything still relevant nowadays. Thank you again very much for your availability and for sharing your precious work.
giuliohome commented 2022-03-08 11:39:02 +00:00 (Migrated from github.com)

Ok, here we go. This is what I'm trying to say with an example.

So we have

var ws = XLSX.utils.aoa_to_sheet([
   ["a","b","c"],
   [1, , 3], // second value missing
   [1, , 4], // second value missing
   [4, 5, 6] // second value present
 ]);

Then we can do

json_ko_for_my_scenario =  XLSX.utils.sheet_to_json(ws)
[ { a: 1, c: 3 }, { a: 1, c: 4 }, { a: 4, b: 5, c: 6 } ]
json_OK_but_not_perfect_for_my_scenario =  XLSX.utils.sheet_to_json(ws, {defval:null} )
[
  { a: 1, b: null, c: 3 }, // OK for the following scenario
  { a: 1, b: null, c: 4 }, // not perfect/needed but ok
  // my invented option keeporder: true would output { a: 1, c: 4 } from the second row afterwards on
  { a: 4, b: 5, c: 6 }
]

where my usage scenario's output is

var output_ko = XLSX.utils.json_to_sheet(json_ko_for_my_scenario)
undefined
output_ko
{
  A2: { t: 'n', v: 1 },
  B2: { t: 'n', v: 3 },
  A3: { t: 'n', v: 1 },
  B3: { t: 'n', v: 4 },
  A4: { t: 'n', v: 4 },
  C4: { t: 'n', v: 5 },
  B4: { t: 'n', v: 6 },
  A1: { t: 's', v: 'a' },
  B1: { t: 's', v: 'c' }, 
  C1: { t: 's', v: 'b' },  // you see column 'b' is shifted to the end 
  '!ref': 'A1:C4'
}

var output_OK = XLSX.utils.json_to_sheet(json_OK_but_not_perfect_for_my_scenario)
undefined
output_OK
{
  A2: { t: 'n', v: 1 },
  B2: { t: 'z', v: null },
  C2: { t: 'n', v: 3 },
  A3: { t: 'n', v: 1 },
  B3: { t: 'z', v: null },
  C3: { t: 'n', v: 4 },
  A4: { t: 'n', v: 4 },
  B4: { t: 'n', v: 5 },
  C4: { t: 'n', v: 6 },
  A1: { t: 's', v: 'a' },
  B1: { t: 's', v: 'b' }, // you see column 'b' is not shifted, perfect!
  C1: { t: 's', v: 'c' },
  '!ref': 'A1:C4'
}

I hope the example clarifies what I meant in the first place.
Thank you again

Ok, here we go. This is what I'm trying to say with an example. So we have ```js var ws = XLSX.utils.aoa_to_sheet([ ["a","b","c"], [1, , 3], // second value missing [1, , 4], // second value missing [4, 5, 6] // second value present ]); ``` Then we can do ```js json_ko_for_my_scenario = XLSX.utils.sheet_to_json(ws) [ { a: 1, c: 3 }, { a: 1, c: 4 }, { a: 4, b: 5, c: 6 } ] json_OK_but_not_perfect_for_my_scenario = XLSX.utils.sheet_to_json(ws, {defval:null} ) [ { a: 1, b: null, c: 3 }, // OK for the following scenario { a: 1, b: null, c: 4 }, // not perfect/needed but ok // my invented option keeporder: true would output { a: 1, c: 4 } from the second row afterwards on { a: 4, b: 5, c: 6 } ] ``` where my usage scenario's output is ```js var output_ko = XLSX.utils.json_to_sheet(json_ko_for_my_scenario) undefined output_ko { A2: { t: 'n', v: 1 }, B2: { t: 'n', v: 3 }, A3: { t: 'n', v: 1 }, B3: { t: 'n', v: 4 }, A4: { t: 'n', v: 4 }, C4: { t: 'n', v: 5 }, B4: { t: 'n', v: 6 }, A1: { t: 's', v: 'a' }, B1: { t: 's', v: 'c' }, C1: { t: 's', v: 'b' }, // you see column 'b' is shifted to the end '!ref': 'A1:C4' } var output_OK = XLSX.utils.json_to_sheet(json_OK_but_not_perfect_for_my_scenario) undefined output_OK { A2: { t: 'n', v: 1 }, B2: { t: 'z', v: null }, C2: { t: 'n', v: 3 }, A3: { t: 'n', v: 1 }, B3: { t: 'z', v: null }, C3: { t: 'n', v: 4 }, A4: { t: 'n', v: 4 }, B4: { t: 'n', v: 5 }, C4: { t: 'n', v: 6 }, A1: { t: 's', v: 'a' }, B1: { t: 's', v: 'b' }, // you see column 'b' is not shifted, perfect! C1: { t: 's', v: 'c' }, '!ref': 'A1:C4' } ``` I hope the example clarifies what I meant in the first place. Thank you again
giuliohome commented 2022-03-08 13:15:19 +00:00 (Migrated from github.com)

It would be the same proposal about bits/90_utils.js replacement at line 19 for a new keeporder option

... The final code is

if(defval === undefined) { 
    if (R == 1 && o.keeporder) row[hdr[C]] = null; 
    continue; 
}

...

The above example, with the new option, would result in

var XLSX = require("./dist/xlsx.full.min.js")
undefined
var ws = XLSX.utils.aoa_to_sheet([
    ["a","b","c"],
    [1, , 3], // second value missing
    [1, , 4], // second value missing
    [4, 5, 6] // second value present
  ]);
undefined
json_OK = XLSX.utils.sheet_to_json(ws, {keeporder: true} )
[ { a: 1, b: null, c: 3 }, { a: 1, c: 4 }, { a: 4, b: 5, c: 6 } ] // Perfect! Only one needed null ! 
var output_OK = XLSX.utils.json_to_sheet(json_OK)
undefined
output_OK
{
  A2: { t: 'n', v: 1 },
  B2: { t: 'z', v: null },
  C2: { t: 'n', v: 3 },
  A3: { t: 'n', v: 1 },
  C3: { t: 'n', v: 4 },
  A4: { t: 'n', v: 4 },
  B4: { t: 'n', v: 5 },
  C4: { t: 'n', v: 6 },
  A1: { t: 's', v: 'a' },
  B1: { t: 's', v: 'b' }, // Tested: OK
  C1: { t: 's', v: 'c' },
  '!ref': 'A1:C4'
}
It would be the same proposal about [bits/90_utils.js](https://github.com/giuliohome/sheetjs/commit/fc20be2ed64c750b32b408ee6342b593c9bb4300#diff-2e8ac75fa7a43ab72e9e37592563da4e0f10ab377ee3c3f7fef42cfbf726b902) replacement at line 19 for a new `keeporder` option > ... The final code is > > ```js > if(defval === undefined) { > if (R == 1 && o.keeporder) row[hdr[C]] = null; > continue; > } > ``` > ... The above example, with the new option, would result in ```js var XLSX = require("./dist/xlsx.full.min.js") undefined var ws = XLSX.utils.aoa_to_sheet([ ["a","b","c"], [1, , 3], // second value missing [1, , 4], // second value missing [4, 5, 6] // second value present ]); undefined json_OK = XLSX.utils.sheet_to_json(ws, {keeporder: true} ) [ { a: 1, b: null, c: 3 }, { a: 1, c: 4 }, { a: 4, b: 5, c: 6 } ] // Perfect! Only one needed null ! var output_OK = XLSX.utils.json_to_sheet(json_OK) undefined output_OK { A2: { t: 'n', v: 1 }, B2: { t: 'z', v: null }, C2: { t: 'n', v: 3 }, A3: { t: 'n', v: 1 }, C3: { t: 'n', v: 4 }, A4: { t: 'n', v: 4 }, B4: { t: 'n', v: 5 }, C4: { t: 'n', v: 6 }, A1: { t: 's', v: 'a' }, B1: { t: 's', v: 'b' }, // Tested: OK C1: { t: 's', v: 'c' }, '!ref': 'A1:C4' } ```
SheetJSDev commented 2022-03-08 19:43:48 +00:00 (Migrated from github.com)

If we're following the "Zen of SheetJS", it would be better to post-process the array of arrays instead of adding a new option:

XLSX.utils.sheet_to_json(ws, {header:1}).reduce((acc, x, i) => {
  // capture header row
  if(i == 0) acc.header = x;

  // the second row is the first data row -- ensure each header is present in the object
  else if(i == 1) acc.data.push(Object.fromEntries(acc.header.map((h, j) => ([h, x[j]]))));

  // later rows can look for the assigned values in the data
  else acc.data.push(Object.fromEntries(x.map((v, i) => ([acc.header[i], v])).filter(r => r[0] != null)));

  return acc;
}, {header:[], data:[]}).data

Object.fromEntries is a neat function added in Chrome 73 and Node 12

If we're following [the "Zen of SheetJS"](https://github.com/sheetjs/sheetjs#the-zen-of-sheetjs), it would be better to post-process the array of arrays instead of adding a new option: ```js XLSX.utils.sheet_to_json(ws, {header:1}).reduce((acc, x, i) => { // capture header row if(i == 0) acc.header = x; // the second row is the first data row -- ensure each header is present in the object else if(i == 1) acc.data.push(Object.fromEntries(acc.header.map((h, j) => ([h, x[j]])))); // later rows can look for the assigned values in the data else acc.data.push(Object.fromEntries(x.map((v, i) => ([acc.header[i], v])).filter(r => r[0] != null))); return acc; }, {header:[], data:[]}).data ``` [`Object.fromEntries`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Object/fromEntries) is a neat function added in Chrome 73 and Node 12
giuliohome commented 2022-03-09 08:38:46 +00:00 (Migrated from github.com)

Of course that was a minimal, reproducible example, I'm already doing a lot of postprocessing and I have many other options already activated, obviously I'm not writing an app to copy the xlsx as it is ;-)
The good thing of open source is indeed that I can keep my private fork, for me it would be better to follow the general programming principle called "separation of concerns": I have a fork of SheetJS which is responsible to read and write fom/to xlsx/json (in what is supposed to be the expected format) and a post-processing (or, well, the business logic of the app) which is responsible to implement the business domain logic rules: IMO the code is much more readable with this SoC (also because I could be interested in doing other changes, inspired by the same guidelines, to the json_to_sheet and so on, but at this point I will keep all them private).
Thank you anyway very much for your code! It is always very interesting and important to get feedback from peer reviews and suggestions from other developers.

Of course that was a minimal, reproducible example, I'm already doing a lot of postprocessing and I have many other options already activated, obviously I'm not writing an app to copy the xlsx as it is ;-) The good thing of open source is indeed that I can keep my private fork, for me it would be better to follow the general programming principle called "separation of concerns": I have a fork of SheetJS which is responsible to read and write fom/to xlsx/json (in what is supposed to be the expected format) and a post-processing (or, well, the business logic of the app) which is responsible to implement the business domain logic rules: IMO the code is much more readable with this SoC (also because I could be interested in doing other changes, inspired by the same guidelines, to the json_to_sheet and so on, but at this point I will keep all them private). Thank you anyway very much for your code! It is always very interesting and important to get feedback from peer reviews and suggestions from other developers.
Sign in to join this conversation.
No Milestone
No Assignees
1 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: sheetjs/sheetjs#2460
No description provided.