Incorrect CRC-32 number returned when String or BinaryString contains an integer #14

Closed
opened 2020-02-20 13:45:59 +00:00 by jminnihan · 1 comment
jminnihan commented 2020-02-20 13:45:59 +00:00 (Migrated from github.com)

If you pass "123456789" (without the double quotes) into CRC32.bstr() or CRC32.str(), the returned value should be 3421780262 (number) and CBF43926 (hex). For reference, see https://www.lammertbies.nl/comm/info/crc-calculation and https://www.libcrc.org.

Furthermore, if any part of the string contains an integer, the CRC32 value returned is incorrect.

Please look into this. I have attached a Jest-based unit test that shows the results:
crc32.spec.ts.zip (https://github.com/SheetJS/js-crc32/files/4231006/crc32.spec.ts.zip)

SheetJSDev commented 2020-02-25 22:59:49 +00:00 (Migrated from github.com)

To be clear:

The return value is a signed 32-bit integer.

You can verify this against https://oss.sheetjs.com/js-crc32/ by entering the text in the box.

The returned value is -873187034. To recover the values you expect:

var crc = CRC32.str("123456789"); // -873187034
var unsigned_crc = crc >>> 0; // 3421780262
var crc_hex = (crc >>> 0).toString(16).toUpperCase(); // 'CBF43926'
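As an independent cross-check, here is a minimal bitwise CRC-32 sketch. It is not the library's implementation (the function name `crc32Bitwise` is made up for this example, and it is written for clarity rather than speed), but it uses the standard reflected polynomial 0xEDB88320 and should land on the same bit pattern for "123456789":

```js
// Hypothetical helper for illustration only; not part of the crc-32 package.
// Bitwise CRC-32 with the reflected polynomial 0xEDB88320.
function crc32Bitwise(bytes) {
  let crc = -1; // all 32 bits set (0xFFFFFFFF as a signed 32-bit integer)
  for (const byte of bytes) {
    crc ^= byte;
    for (let i = 0; i < 8; i++) {
      // If the low bit is set, shift right and XOR in the polynomial;
      // otherwise just shift right.
      crc = (crc >>> 1) ^ (0xEDB88320 & -(crc & 1));
    }
  }
  return ~crc; // final inversion; the result is a signed 32-bit integer
}

// "123456789" is pure ASCII, so its UTF-8 bytes match its character codes.
const checkBytes = new TextEncoder().encode("123456789");
const checkSigned = crc32Bitwise(checkBytes);   // -873187034
const checkUnsigned = checkSigned >>> 0;        // 3421780262
```

Both numbers describe the same 32 bits; only the interpretation (signed vs. unsigned) differs.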

Why is it done this way? For performance reasons. V8 (the engine behind Chrome and Node.js) and other engines generally treat Numbers as IEEE 754 doubles but optimize for the case of 32-bit signed integers. There's no special case for 32-bit unsigned integers, so jumping between 32-bit signed values and integers >= 2**31 is expensive.
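If you need to convert at the edges of your code (for display or for comparing against a tool that prints unsigned values), small helpers like the ones below are one way to do it. The names `toUnsigned`, `toSigned` and `toHex` are made up for this sketch and are not part of the library; the zero padding in `toHex` is an addition so that checksums with leading zero nibbles still print as 8 hex digits:

```js
// Hypothetical helpers for illustration; not part of the crc-32 package.

// Reinterpret a signed 32-bit value as unsigned (0 .. 2**32 - 1).
function toUnsigned(crc) {
  return crc >>> 0;
}

// Reinterpret an unsigned 32-bit value as signed (-2**31 .. 2**31 - 1).
function toSigned(crc) {
  return crc | 0;
}

// Format a checksum (signed or unsigned) as 8 uppercase hex digits.
function toHex(crc) {
  return (crc >>> 0).toString(16).toUpperCase().padStart(8, "0");
}

const signedCrc = -873187034;                 // the value reported above for "123456789"
const unsignedCrc = toUnsigned(signedCrc);    // 3421780262
const roundTrip = toSigned(unsignedCrc);      // -873187034
const hexCrc = toHex(signedCrc);              // 'CBF43926'
```

Keeping the signed value internally and converting only when printing avoids the signed/unsigned back-and-forth described above.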

In your code, if at all possible, try to stick to 32-bit signed integers. If you're reading a checksum from an ArrayBuffer, use Int32Array or DataView#getInt32.
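A rough sketch of that last point, assuming the checksum is stored as 4 little-endian bytes (an assumption for this example; use whatever byte order your format specifies). The buffer here is built by hand just to have something to read from:

```js
// Build a 4-byte buffer holding the checksum 0xCBF43926 in little-endian order.
// In real code this buffer would come from a file or network payload.
const buf = new ArrayBuffer(4);
const view = new DataView(buf);
view.setUint32(0, 0xCBF43926, true);

// Read it back as a signed 32-bit integer so it compares directly
// with the signed value the library returns.
const storedSigned = view.getInt32(0, true);     // -873187034
const storedUnsigned = view.getUint32(0, true);  // 3421780262

const computedCrc = -873187034; // value reported above for CRC32.str("123456789")

const signedMatches = storedSigned === computedCrc;     // true
const unsignedMatches = storedUnsigned === computedCrc; // false: same bits, different Number
```

Note that Int32Array reads in the platform's byte order, while DataView#getInt32 lets you state the byte order explicitly.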

Reference: sheetjs/js-crc32#14