Optimize loops for V8 Smi #6
This PR modifies the number of iterations that the Adler32 algorithm runs before taking a modulus. The magic numbers are the maximum iteration counts that will always ensure B < 2 ** 30 - 1, since that's the largest value that fits in a V8 small integer (Smi). This yields N = 2654 for binary and N = 2918 for UTF-8.
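For readers who want to see the idea in code, here is a minimal sketch of an Adler-32 loop that defers the modulus to the end of each chunk. It is not the library's actual implementation: `adler32Sketch`, `worstCaseB`, and the constant names are hypothetical, and the sketch assumes a full `% 65521` reduction after every chunk. The library's own reduction step may differ, and that would change how the exact limits are derived.

```javascript
const MOD = 65521;            // Adler-32 modulus (largest prime below 2 ** 16)
const SMI_MAX = 2 ** 30 - 1;  // Smi limit as stated in the PR description
const NMAX_BINARY = 2654;     // chunk length for arbitrary bytes, from the PR

// Assumed input: an array-like of integers in the range 0..255.
function adler32Sketch(bytes, nmax = NMAX_BINARY) {
  let a = 1;
  let b = 0;
  for (let i = 0; i < bytes.length;) {
    // Process up to nmax bytes with plain additions only.
    const end = Math.min(bytes.length, i + nmax);
    while (i < end) {
      a += bytes[i++];
      b += a;
    }
    // Reduce once per chunk instead of once per byte.
    a %= MOD;
    b %= MOD;
  }
  // Pack b into the high 16 bits and a into the low 16 bits, unsigned.
  return ((b << 16) | a) >>> 0;
}

// Simplified worst case for b after n bytes that all equal maxByte,
// starting a chunk with a = b = MOD - 1 (the largest fully reduced values).
function worstCaseB(n, maxByte) {
  return (MOD - 1) * (n + 1) + maxByte * n * (n + 1) / 2;
}
```

Under this simplified model, `worstCaseB(2654, 255)` stays below `SMI_MAX`, so neither accumulator leaves the Smi range within a chunk. The model only checks that the PR's value is safe; it does not reproduce the exact derivation of the limits.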
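If you're wondering why UTF-8 has a maximum average byte value of 207, consider that every code point is encoded using exactly one of the following four byte structures:
0bbbbbbb
110bbbbb 10bbbbbb
1110bbbb 10bbbbbb 10bbbbbb
11110bbb 10bbbbbb 10bbbbbb 10bbbbbb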
If we check the maximum value for each code point structure (i.e. set all the b placeholders to 1), we find a byte sum of 127 for the 1-byte variant, 414 for the 2-byte, 621 for the 3-byte, and 820 for the 4-byte. This yields a maximum average of 127/byte, 207/byte, 207/byte, and 205/byte for 1-, 2-, 3-, and 4-byte codes respectively. Taking the maximum of these, the average value of all bytes in a UTF-8 sequence maxes out at 207, and will almost always be in the low to mid 100s in practice.
I also updated the benchmarks, but on my computer they don't reveal any major differences between any of the magic numbers (though at least now the values are semantically correct).
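The per-byte maxima quoted for UTF-8 can be reproduced with a few lines of arithmetic. This is only an illustrative sketch: `utf8MaxByteSums`, `LEAD_MAX`, and `CONT_MAX` are hypothetical names, and the bit patterns assume the standard UTF-8 lead and continuation byte layouts with every payload bit set to 1.

```javascript
// Lead bytes for 1-, 2-, 3-, and 4-byte sequences with all payload bits set.
const LEAD_MAX = [0b01111111, 0b11011111, 0b11101111, 0b11110111];
// Continuation byte 10bbbbbb with all payload bits set.
const CONT_MAX = 0b10111111;

function utf8MaxByteSums() {
  return LEAD_MAX.map((lead, i) => {
    const length = i + 1;             // total bytes in the sequence
    const sum = lead + i * CONT_MAX;  // one lead byte plus i continuation bytes
    return { length, sum, perByte: sum / length };
  });
}

console.log(utf8MaxByteSums());
```

The sums come out to 127, 414, 621, and 820, and the per-byte averages to 127, 207, 207, and 205. These are upper bounds in the same sense as in the description above: setting every payload bit in the 4-byte form gives a value beyond the valid Unicode range, so real text can only come in lower.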
Hope you find these optimizations useful!
Coverage remained the same at 88.71% when pulling 00ae70e9dd on 101arrowz:master into b40011c6f7 on SheetJS:master.
@SheetJSDev Could you take a look?