CRC32 slower than MD5 #13

Closed
opened 2019-07-10 20:24:22 +00:00 by je4550 · 1 comment
je4550 commented 2019-07-10 20:24:22 +00:00 (Migrated from github.com)

What am I missing here?

    var crc32 = require("crc-32");
    var crypto = require("crypto");

    // `data` is defined elsewhere
    var buf = Buffer.from(data);

    // Time the pure JS CRC-32
    let start = Date.now();
    var checksum = crc32.buf(buf);
    console.log(checksum);
    console.log(Date.now() - start);

    // Time the native MD5
    var hash = crypto.createHash("md5");
    hash.setEncoding("hex");

    start = Date.now();
    hash.write(buf);
    hash.end();
    checksum = hash.read();
    console.log(checksum);
    console.log(Date.now() - start);

CRC32 consistently takes over 10ms, while MD5 consistently stays under 5ms.

SheetJSDev commented 2019-07-10 20:57:58 +00:00 (Migrated from github.com)

You're seeing the tradeoff between a pure JS implementation and a C implementation. The Node.js `crypto` module ultimately delegates to the OpenSSL library.

Calling a native function carries a fixed overhead, which is why the JS approach is faster for smaller buffers.

For larger buffers, the speed of the native C library outweighs the cost of crossing from JS into native code, so the native approach is faster even though MD5 is a slower algorithm. Starting from a Buffer gives the C library an additional advantage: it is more efficient to read from a Buffer in C++ APIs than through the public JS surface.

If you want a real apples-to-apples comparison, try comparing against a pure JS MD5 implementation like https://www.npmjs.com/package/md5.js. In local tests, MD5 is 10x slower than CRC-32.

Reference: sheetjs/js-crc32#13