Possible to skip first n rows when parsing? #215

Closed
opened 2015-04-27 10:38:09 +00:00 by stevek-pro · 4 comments
stevek-pro commented 2015-04-27 10:38:09 +00:00 (Migrated from github.com)

Hi, is it possible to skip the first n rows when parsing, I am parsing like this (in coffee)

sheet: (@file) ->
  filename          = 'uploads/' + @file
  workbook          = J.readFile(filename)
  workbookJSON      = J.utils.to_json(workbook)
  workbookProps     = J.readFile(filename, {bookProps:true})
  sheetname         = workbookProps[1].SheetNames[0]

  sheetArray        = []
  for k,v of workbookJSON
      sheetArray.push k

  sheetone = sheetArray[0]
  sheetoneJSON      = workbookJSON[sheetone]
  return sheetoneJSON

  slicedSheet.forEach (item) ->
      # do stuff
Hi, is it possible to skip the first n rows when parsing, I am parsing like this (in coffee) ``` sheet: (@file) -> filename = 'uploads/' + @file workbook = J.readFile(filename) workbookJSON = J.utils.to_json(workbook) workbookProps = J.readFile(filename, {bookProps:true}) sheetname = workbookProps[1].SheetNames[0] sheetArray = [] for k,v of workbookJSON sheetArray.push k sheetone = sheetArray[0] sheetoneJSON = workbookJSON[sheetone] return sheetoneJSON slicedSheet.forEach (item) -> # do stuff ```
SheetJSDev commented 2015-04-27 23:42:23 +00:00 (Migrated from github.com)

@nottinhill There's no option at the moment to skip the first n rows. There are a few easy ways to work with the existing utilities:

A) If you using standard utility functions, you can change the worksheet !ref property to restrict to a subset of the worksheet:

/* get the range */
var range = J.utils.decode_range(worksheet['!ref']);
/* skip n rows */
range.s.r+= n;
if(range.s.r >= range.e.r) range.s.r = range.e.r;
/* update range */
worksheet['!ref'] = J.utils.encode_range(range);

B) You can directly slice the output of to_json (since it's a normal array):

function get_json_sliced(filename, rows) {
    rows = rows || 0;
    var wb = J.readFile(filename);
    var sheetnames = wb[1].SheetNames;
    var json = J.utils.to_json(wb);
    sheetnames.forEach(function(sheetname) {
        if(Array.isArray(json[sheetname])) json[sheetname] = json[sheetname].slice(rows)
    });
    return json;
}

The difference between the two approaches is that the latter will still interpret the first row as a header row, whereas the former will completely ignore the fact that the previous rows exist.

@nottinhill There's no option at the moment to skip the first `n` rows. There are a few easy ways to work with the existing utilities: A) If you using standard utility functions, you can change the worksheet `!ref` property to restrict to a subset of the worksheet: ``` /* get the range */ var range = J.utils.decode_range(worksheet['!ref']); /* skip n rows */ range.s.r+= n; if(range.s.r >= range.e.r) range.s.r = range.e.r; /* update range */ worksheet['!ref'] = J.utils.encode_range(range); ``` B) You can directly slice the output of `to_json` (since it's a normal array): ``` function get_json_sliced(filename, rows) { rows = rows || 0; var wb = J.readFile(filename); var sheetnames = wb[1].SheetNames; var json = J.utils.to_json(wb); sheetnames.forEach(function(sheetname) { if(Array.isArray(json[sheetname])) json[sheetname] = json[sheetname].slice(rows) }); return json; } ``` The difference between the two approaches is that the latter will still interpret the first row as a header row, whereas the former will completely ignore the fact that the previous rows exist.
0o-de-lally commented 2016-03-24 23:01:09 +00:00 (Migrated from github.com)

@SheetJSDev Where is decode_range documented? I'm getting a: "TypeError: Cannot call method 'decode_range' of undefined"

It would be nice to add this example to the github docs.

A { skipToHeader: 4} would be a nice helper to have.

@SheetJSDev Where is decode_range documented? I'm getting a: "TypeError: Cannot call method 'decode_range' of undefined" It would be nice to add this example to the github docs. A { skipToHeader: 4} would be a nice helper to have.
0o-de-lally commented 2016-03-24 23:49:54 +00:00 (Migrated from github.com)

Correction, I was using a previous version of SheetJS, I don't get the same error. Still would be nice to have this util documented in the sheetjs docs

Correction, I was using a previous version of SheetJS, I don't get the same error. Still would be nice to have this util documented in the sheetjs docs
SheetJSDev commented 2017-05-16 18:05:54 +00:00 (Migrated from github.com)

Even though it could be cleaned up a bit, https://github.com/SheetJS/js-xlsx#json covers how to control the range of the JSON output.

Even though it could be cleaned up a bit, https://github.com/SheetJS/js-xlsx#json covers how to control the range of the JSON output.
Sign in to join this conversation.
No Milestone
No Assignees
1 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: sheetjs/sheetjs#215
No description provided.