
speedup: Maybe read() is faster than mmap() #419

@egmontkob-work

When computing a file's checksum, we mmap() it into memory and then pass it into xxhash in one run.

Surprisingly, read() might be faster. For starters, read() is a single system call, whereas mmap() needs a matching munmap(), so read() saves the cost of one syscall per file.

For small sizes (I've tested 4 kB and 64 kB), a read() into a static buffer and a read() into a malloc()ed and then free()d array are about equally fast, and both are indeed significantly faster than an mmap() + munmap() pair.
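A rough sketch of how such a comparison could be timed, not the benchmark actually used for the numbers above. The names `bench_read`, `bench_mmap` and `toy_hash` are hypothetical, and `toy_hash` (FNV-1a) is only a stand-in for xxhash so that the snippet compiles on its own. Each round includes the open() and close() that both variants need anyway.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/* Stand-in for the real hash (an assumption; the issue is about xxhash). */
static uint64_t toy_hash(const unsigned char *p, size_t n) {
    uint64_t h = 14695981039346656037ULL;  /* FNV-1a offset basis */
    for (size_t i = 0; i < n; i++) { h ^= p[i]; h *= 1099511628211ULL; }
    return h;
}

static double now_sec(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* iters rounds of open + read() into a static buffer + close.
 * Returns elapsed seconds, or -1.0 on error. size must be 1..64 kB
 * and must not exceed the file's length. */
double bench_read(const char *path, size_t size, int iters, uint64_t *sum) {
    static unsigned char buf[64 * 1024];
    assert(sum != NULL);
    if (size == 0 || size > sizeof buf) return -1.0;
    double t0 = now_sec();
    for (int i = 0; i < iters; i++) {
        int fd = open(path, O_RDONLY);
        if (fd < 0) return -1.0;
        ssize_t n = read(fd, buf, size);
        close(fd);
        if (n != (ssize_t)size) return -1.0;  /* short read treated as failure */
        *sum = toy_hash(buf, size);
    }
    return now_sec() - t0;
}

/* Same loop, but with mmap() + munmap() instead of read(). */
double bench_mmap(const char *path, size_t size, int iters, uint64_t *sum) {
    assert(sum != NULL);
    if (size == 0) return -1.0;
    double t0 = now_sec();
    for (int i = 0; i < iters; i++) {
        int fd = open(path, O_RDONLY);
        if (fd < 0) return -1.0;
        void *p = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
        close(fd);
        if (p == MAP_FAILED) return -1.0;
        *sum = toy_hash(p, size);
        munmap(p, size);
    }
    return now_sec() - t0;
}
```

Calling both with the same file, size and iteration count and comparing the returned times gives a first impression. The results will depend on the kernel, the page cache state and the hash in use.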

Note that in real-life scenarios malloc() will be more expensive than in this benchmark loop, where the allocator can always hand back the chunk that was just freed.

We should probably read() into a stack buffer for small sizes (less than 4 kB? 64 kB?) and fall back to mmap() or malloc() for large files (which of the two is to be measured within firebuild itself).
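A minimal sketch of that size-based split, under stated assumptions: `SMALL_FILE_MAX` is a made-up 64 kB cutoff that would still have to be measured, `checksum_path` and `read_full` are hypothetical names rather than firebuild functions, the large-file path arbitrarily picks mmap(), and `toy_hash` (FNV-1a) again stands in for xxhash.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define SMALL_FILE_MAX (64 * 1024)  /* hypothetical cutoff, to be measured */

/* Stand-in for the real hash (an assumption; the issue is about xxhash). */
static uint64_t toy_hash(const unsigned char *p, size_t n) {
    uint64_t h = 14695981039346656037ULL;  /* FNV-1a offset basis */
    for (size_t i = 0; i < n; i++) { h ^= p[i]; h *= 1099511628211ULL; }
    return h;
}

/* Read exactly n bytes; normally one read() call for a regular file. */
static bool read_full(int fd, unsigned char *buf, size_t n) {
    size_t off = 0;
    while (off < n) {
        ssize_t r = read(fd, buf + off, n - off);
        if (r <= 0) return false;  /* error or unexpected EOF */
        off += (size_t)r;
    }
    return true;
}

/* Checksum a regular file: stack buffer + read() for small files,
 * mmap() + munmap() for large ones. Returns false on any failure. */
bool checksum_path(const char *path, uint64_t *out) {
    assert(out != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0) return false;

    struct stat st;
    bool ok = false;
    if (fstat(fd, &st) == 0 && S_ISREG(st.st_mode)) {
        size_t size = (size_t)st.st_size;
        if (size <= SMALL_FILE_MAX) {
            unsigned char buf[SMALL_FILE_MAX];  /* lives on the stack */
            if (read_full(fd, buf, size)) {
                *out = toy_hash(buf, size);
                ok = true;
            }
        } else {
            void *p = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
            if (p != MAP_FAILED) {
                *out = toy_hash(p, size);
                munmap(p, size);
                ok = true;
            }
        }
    }
    close(fd);
    return ok;
}
```

The fstat() call is an extra syscall in this sketch. If the caller already knows the file size, it could be passed in instead so that the small-file path keeps its advantage.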

Alternatively, measure whether using xxhash's streaming state management with a fixed-size buffer on the stack and reading in a loop is even faster. Probably not, because of the loop overhead and the potentially very large number of read() calls.
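For reference, the streaming variant would look roughly like the sketch below. To keep it compilable without the library, a tiny FNV-1a state (`toy_state`, `toy_reset`, `toy_update`, `toy_digest`, all hypothetical names) takes the place of xxhash's state object and its reset/update/digest calls. `CHUNK_SIZE` and `checksum_streaming` are assumptions too. The function also counts its read() calls, since that count is the suspected cost.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

#define CHUNK_SIZE 4096  /* hypothetical buffer size, to be tuned */

/* Stand-in streaming hash state (an assumption; replaces xxhash's state). */
typedef struct { uint64_t h; } toy_state;

static void toy_reset(toy_state *s) { s->h = 14695981039346656037ULL; }

static void toy_update(toy_state *s, const unsigned char *p, size_t n) {
    for (size_t i = 0; i < n; i++) { s->h ^= p[i]; s->h *= 1099511628211ULL; }
}

static uint64_t toy_digest(const toy_state *s) { return s->h; }

/* Hash a file chunk by chunk through a fixed-size stack buffer.
 * If read_calls is non-NULL, it receives the number of completed read()
 * calls, including the final one that returns 0 at end of file. */
bool checksum_streaming(const char *path, uint64_t *out, unsigned *read_calls) {
    assert(out != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0) return false;

    unsigned char buf[CHUNK_SIZE];
    toy_state st;
    toy_reset(&st);
    unsigned calls = 0;

    for (;;) {
        ssize_t r = read(fd, buf, sizeof buf);
        if (r < 0 && errno == EINTR) continue;  /* interrupted, retry */
        calls++;
        if (r < 0) { close(fd); return false; }
        if (r == 0) break;                      /* end of file */
        toy_update(&st, buf, (size_t)r);
    }

    close(fd);
    *out = toy_digest(&st);
    if (read_calls != NULL) *read_calls = calls;
    return true;
}
```

With a 4 kB buffer, a 64 kB file already needs at least 17 read() calls (16 full chunks plus the one that reports end of file), compared with a single call when the whole file fits into one buffer.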
