Conversation
|
Testing on empty file as in #2307 (comment): |
Signed-off-by: Paweł Pałucha <pawel.palucha@chainguard.dev>
|
It might be good to do some benchmark on hashing vs returning constants. If memory serves, we actually end up hashing quite a few empty files in some scenarios; it might be best to just have some constants |
These are the results of benchmarking as prepared by Claude: So I guess I will propose a version with constants for the empty hashes. |
Signed-off-by: Paweł Pałucha <pawel.palucha@chainguard.dev>
|
Updated to use constants for the hashes of empty files. |
|
What's the point of hashing an empty file? Does it serve a purpose? |
The same purpose as hashing any other file - verifying if the content matches. |
Currently Syft is producing incorrect checksums for empty files. |
|
What is the way forward for this? |
kzantow left a comment
LGTM -- I was hoping to find a simple way to detect zero-sized input earlier, but I had a look and there are some *os.File inputs and other inputs that don't have length functions, so there wasn't an obvious way to optimize this further. This is still an improvement over hashing the zero-size files, since it prevents a bunch of duplicate strings from being created, etc. Thanks for the contribution @ppalucha!
Description
Calculate digest hashes for empty files as well. Even empty files should have proper checksums.
Issue references
Fixes #2307