Description
When computing a file's checksum, we mmap() it into memory and then pass it into xxhash in one run.
Surprisingly, read() might be faster. For starters, read() is a single system call, whereas mmap() needs a matching munmap(), so we save the cost of one syscall.
For small sizes (I've tested 4kB and 64kB), a read() into a static buffer and a read() into a malloc()ed and then free()d array are about equally fast, and both are indeed significantly faster than an mmap() + munmap() pair.
Note that in real-life scenarios malloc() will be more expensive than in this benchmark loop, where it can always hand back the chunk that was just freed.
We should probably read() into a stack buffer for small sizes (less than 4kB? 64kB?), and fall back to mmap() or malloc() for large file sizes. Which of the two is better needs to be measured within firebuild itself.
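To make the size-based idea concrete, here is a minimal sketch, not firebuild's actual code. `SMALL_FILE_MAX`, `hash_bytes()` and `hash_file()` are hypothetical names, the 64kB cutoff is a placeholder that still has to be measured, and a simple FNV-1a hash stands in for the one-shot xxhash call so that the sketch builds without external dependencies.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical cutoff; the right value (4kB? 64kB?) has to be measured. */
#define SMALL_FILE_MAX (64 * 1024)

/* Stand-in for the one-shot xxhash call: 64-bit FNV-1a. */
static uint64_t hash_bytes(const void *data, size_t len) {
  const unsigned char *p = data;
  uint64_t h = 0xcbf29ce484222325ULL;
  for (size_t i = 0; i < len; i++) {
    h ^= p[i];
    h *= 0x100000001b3ULL;
  }
  return h;
}

/* Hashes a regular file. Returns 0 and stores the hash in *out, or -1 on error. */
static int hash_file(const char *path, uint64_t *out) {
  int fd = open(path, O_RDONLY);
  if (fd < 0) return -1;

  struct stat st;
  if (fstat(fd, &st) < 0 || !S_ISREG(st.st_mode)) {
    close(fd);
    return -1;
  }
  size_t size = (size_t)st.st_size;
  int ret = -1;

  if (size <= SMALL_FILE_MAX) {
    /* Small file: read() into a stack buffer, normally a single syscall. */
    unsigned char buf[SMALL_FILE_MAX];
    size_t got = 0;
    while (got < size) {
      ssize_t n = read(fd, buf + got, size - got);
      if (n <= 0) break;  /* error, or the file shrank underneath us */
      got += (size_t)n;
    }
    if (got == size) {
      *out = hash_bytes(buf, size);
      ret = 0;
    }
  } else {
    /* Large file: mmap() it and hash the mapping in one run. */
    void *map = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (map != MAP_FAILED) {
      *out = hash_bytes(map, size);
      munmap(map, size);
      ret = 0;
    }
  }
  close(fd);
  return ret;
}
```

Empty files go through the read() branch, which also avoids calling mmap() with a zero length. Swapping the large-file branch for malloc() + read() would be a local change, which should make it easy to measure both variants.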
Alternatively, measure whether using xxhash's state management with a fixed-size buffer on the stack and calling read() in a loop is even faster. Probably not, because of the loop overhead and the potentially far too many read() calls.
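The streaming variant could look roughly like the sketch below. It rests on assumptions: `CHUNK_SIZE`, `hash_state`, `hash_reset()`, `hash_update()`, `hash_digest()` and `hash_fd_streaming()` are hypothetical names, and the state is again a stand-in FNV-1a hash so the example builds on its own. With xxhash the three state functions would presumably map to `XXH64_reset()`, `XXH64_update()` and `XXH64_digest()`, or their counterparts for whichever xxhash variant is in use.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical chunk size; a larger buffer means fewer read() calls. */
#define CHUNK_SIZE 4096

/* Stand-in for an xxhash state object: running 64-bit FNV-1a value. */
typedef struct {
  uint64_t h;
} hash_state;

static void hash_reset(hash_state *s) {
  s->h = 0xcbf29ce484222325ULL;
}

static void hash_update(hash_state *s, const void *data, size_t len) {
  const unsigned char *p = data;
  for (size_t i = 0; i < len; i++) {
    s->h ^= p[i];
    s->h *= 0x100000001b3ULL;
  }
}

static uint64_t hash_digest(const hash_state *s) {
  return s->h;
}

/* Hashes everything readable from fd until EOF.
 * Returns 0 and stores the hash in *out, or -1 on a read error. */
static int hash_fd_streaming(int fd, uint64_t *out) {
  unsigned char buf[CHUNK_SIZE];  /* fixed-size buffer on the stack */
  hash_state st;
  hash_reset(&st);
  for (;;) {
    ssize_t n = read(fd, buf, sizeof buf);
    if (n < 0) return -1;
    if (n == 0) break;  /* EOF */
    hash_update(&st, buf, (size_t)n);
  }
  *out = hash_digest(&st);
  return 0;
}
```

This version needs at least two read() calls per file, since the final one is what detects EOF, and that is part of why it may lose to a single read() sized from fstat() for small files.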