data.gov-new-url

SheetJS 2023-08-16 14:54:32 -04:00
parent fecf228fed
commit 10ed191020
4 changed files with 53 additions and 12 deletions

@ -999,8 +999,7 @@ When the app is loaded, the data will be displayed in rows.
</TabItem>
</Tabs>
[^1]: <https://catalog.data.gov/dataset/national-student-loan-data-system/resource/d5907707-a273-415d-ae6b-e0bbc73eb529>
is the original location of the CC0-licensed dataset.
[^1]: <https://catalog.data.gov/dataset/national-student-loan-data-system-aa85f> is the current location for the CC0-licensed dataset. `PortfolioSummary.xls` is the file name.
[^2]: See [`read` in "Reading Files"](/docs/api/parse-options)
[^3]: See ["SheetJS Data Model"](/docs/csf/)
[^4]: See ["Workbook Object"](/docs/csf/book)

@ -66,7 +66,7 @@ sequenceDiagram
This demo exports data from <https://sheetjs.com/demos/table>.
:::note
:::note pass
It is also possible to parse files from the browser context, but parsing from
the automation context is more efficient and strongly recommended.
@ -148,7 +148,7 @@ This file can be opened with Excel.
</TabItem>
<TabItem value="deno" label="Deno">
:::caution
:::caution pass
Deno Puppeteer is a fork. It is not officially supported by the Puppeteer team.
@ -293,7 +293,7 @@ This file can be opened with Excel.
PhantomJS is a headless web browser powered by WebKit.
:::warning
:::warning pass
This information is provided for legacy deployments. PhantomJS development has
been suspended and there are known vulnerabilities, so new projects should use
@ -340,7 +340,7 @@ page.open('https://sheetjs.com/demos/table', function() {
});`}
</CodeBlock>
:::caution
:::caution pass
PhantomJS is very finicky and will hang if there are script errors. It is
strongly recommended to add verbose logging and to lint scripts before use.
@ -351,11 +351,11 @@ strongly recommended to add verbose logging and to lint scripts before use.
:::note
This demo was last tested on 2023 April 29 against PhantomJS 2.1.1
This demo was last tested on 2023 August 16 against PhantomJS 2.1.1
:::
1) Download and unzip the PhantomJS release from the official website.
1) Download and unzip the PhantomJS release from the official website[^1].
2) Save the `SheetJSPhantom.js` code snippet to `SheetJSPhantom.js`.
@ -370,4 +370,6 @@ In macOS:
When the script finishes, the file `SheetJSPhantomJS.xlsb` will be created.
This file can be opened with Excel.
</details>
</details>
[^1]: Downloads available at <https://phantomjs.org/download.html>

@ -137,8 +137,8 @@ There are three steps to reading files:
3) Parse the data with the SheetJS `read` method[^9]. This method returns a
SheetJS workbook object.
`file.load` expects an `id` property, which can be be the internal ID (displayed
in the File Cabinet web interface) or an absolute or relative path string.
`file.load` expects an `id` property, which can be the internal ID (displayed in
the File Cabinet web interface) or an absolute or relative path string.
```js
/* file ID or path */

@ -82,6 +82,46 @@ function ExportSimpleLink(props) { return ( <button onClick={() => {
</details>
<details><summary><b>Extract all links from a file</b> (click to show)</summary>
The following example iterates through each worksheet and each cell to find all
links. The table shows sheet name, cell address, and target for each link.
```jsx live
function SheetJSParseLinks(props) {
const [rows, setRows] = React.useState([]);
return ( <>
<input type="file" onChange={async(e) => {
let rows = [];
/* parse workbook */
const file = e.target.files[0];
const data = await file.arrayBuffer();
const wb = XLSX.read(data);
const html = [];
wb.SheetNames.forEach(n => {
var ws = wb.Sheets[n]; if(!ws || !ws["!ref"]) return; // skip missing or empty sheets
var ref = XLSX.utils.decode_range(ws["!ref"]);
for(var R = ref.s.r; R <= ref.e.r; ++R) for(var C = ref.s.c; C <= ref.e.c; ++C) {
var addr = XLSX.utils.encode_cell({r:R,c:C});
if(!ws[addr] || !ws[addr].l) continue;
var link = ws[addr].l;
rows.push({ws:n, addr, Target: link.Target});
}
});
setRows(rows);
}}/>
<table><tr><th>Sheet</th><th>Address</th><th>Link Target</th></tr>
{rows.map(r => (<tr><td>{r.ws}</td><td>{r.addr}</td><td>{r.Target}</td></tr>))}
</table>
</> );
}
```
</details>
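The same traversal can be sketched without React or a file input. The following is a minimal sketch, not part of the commit. It assumes the default sparse worksheet layout, where cells are stored under A1-style keys and metadata keys such as `!ref` start with `!`. The `collectLinks` helper and the `sampleWorkbook` data are hypothetical names introduced for illustration. The sample object stands in for the result of `XLSX.read`.

```js
/* Sketch: collect hyperlinks from a workbook-shaped object.
   Assumption: sparse worksheets (cells keyed by A1-style address). */
function collectLinks(wb) {
  const rows = [];
  for (const name of wb.SheetNames) {
    const ws = wb.Sheets[name];
    if (!ws) continue;
    for (const addr of Object.keys(ws)) {
      /* skip metadata keys like "!ref" and "!merges" */
      if (addr.charAt(0) === "!") continue;
      const cell = ws[addr];
      /* the `l` property holds the link; skip cells without one */
      if (!cell || !cell.l) continue;
      rows.push({ ws: name, addr: addr, Target: cell.l.Target });
    }
  }
  return rows;
}

/* hypothetical sample data shaped like a parsed workbook */
const sampleWorkbook = {
  SheetNames: ["Sheet1", "Sheet2"],
  Sheets: {
    Sheet1: {
      "!ref": "A1:B2",
      A1: { t: "s", v: "SheetJS", l: { Target: "https://sheetjs.com" } },
      B2: { t: "n", v: 1 }
    },
    Sheet2: {
      "!ref": "A1:A1",
      A1: { t: "s", v: "mail", l: { Target: "mailto:user@example.com" } }
    }
  }
};

const links = collectLinks(sampleWorkbook);
console.log(links);
```

Iterating over keys visits only cells that exist, so large, mostly empty ranges are not scanned. The trade-off is that links appear in key order rather than in row-major order.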
## Remote Links
HTTP / HTTPS links can be used directly:
@ -199,7 +239,7 @@ function ExportInternalLink(props) { return ( <button onClick={() => {
</details>
:::caution
:::caution pass
Some third-party tools like Google Sheets do not correctly parse hyperlinks in
XLSX documents. A workaround was added in library version 0.18.12.