The code looks fine, but "SizeInRepo" currently feels a bit misleading. It will systematically underreport how much data was added to the repository, as the pack header overhead is missing. My suggestion would be to either immediately report the pack entry header size along with the blob length, or to report that overhead together with the blob that triggers a pack file upload. The downside of the second variant is that it would either require modifications to …
Ah, right. I'm in favor of the second option, even if it's not exact and will not report the last few header sizes. Since the header size depends on the repo version, I think it's less complex this way.
e8590ec to 8980f97
7d00f61 to f46a441
I've implemented a mix of both variants. We already have a method to calculate the size of the pack header entry for a given blob, which makes accounting for the per-blob overhead a one-line change. When finalizing a pack file, the pack header crypto overhead is now reported and added to the last blob. Adding the pack header entry overhead to each blob individually also has the benefit that the "stored size" of each file won't show large random spikes caused by charging the full pack overhead to a single blob.
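To illustrate the accounting described in this comment, here is a minimal sketch rather than the actual code from the PR. The names `blob`, `entrySize` and `sizeInRepo` are hypothetical, and the constants are assumed values for illustration only, since the real header entry size depends on the repository format version.

```go
package main

import "fmt"

// Assumed sizes for illustration only; the real values depend on the
// repository format version.
const (
	plainEntrySize       = 37 // assumed header entry size of an uncompressed blob
	compressedEntrySize  = 41 // assumed header entry size of a compressed blob
	headerCryptoOverhead = 32 // assumed nonce + MAC of the encrypted pack header
)

// blob is a hypothetical stand-in for a blob that was added to a pack file.
type blob struct {
	storedLen  int  // bytes the encrypted (and possibly compressed) blob occupies
	compressed bool // whether the blob was compressed before encryption
}

// entrySize returns the pack header entry overhead for a single blob.
func entrySize(b blob) int {
	if b.compressed {
		return compressedEntrySize
	}
	return plainEntrySize
}

// sizeInRepo returns the size charged to each blob of one pack file:
// every blob pays for its own header entry, and the crypto overhead of
// the pack header is added to the last blob when the pack is finalized.
func sizeInRepo(blobs []blob) []int {
	sizes := make([]int, len(blobs))
	for i, b := range blobs {
		sizes[i] = b.storedLen + entrySize(b)
	}
	if len(sizes) > 0 {
		sizes[len(sizes)-1] += headerCryptoOverhead
	}
	return sizes
}

func main() {
	blobs := []blob{{storedLen: 1000}, {storedLen: 500, compressed: true}}
	fmt.Println(sizeInRepo(blobs))
}
```

Only the small, fixed crypto overhead lands on a single blob here, so the per-file numbers stay smooth instead of spiking whenever a blob happens to trigger a pack upload.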
f46a441 to 43343bb
MichaelEischer left a comment
LGTM. Let's merge this now and wait for feedback on whether the stats need further improvements.
43343bb to b7f5de7
This includes optional compression and crypto overhead.
The raw-data mode summed up the size of the blob plaintexts. However, with compression this makes little sense, as the storage size in the repository is lower due to compression. Thus, sum up the actual size each blob takes in the repository.
This will miss the pack header crypto overhead and the length field, which only amount to a few bytes per pack file.
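The difference between the two ways of counting can be sketched as follows. This is only an illustration under assumptions: `indexEntry` and `totals` are hypothetical names, not the repository's real index types, and the numbers are made up.

```go
package main

import "fmt"

// indexEntry is a hypothetical stand-in for a repository index entry.
type indexEntry struct {
	plaintextLen uint64 // blob size before compression and encryption
	storedLen    uint64 // bytes the blob actually occupies in its pack file
}

// totals sums both views: the plaintext sizes and the sizes the blobs
// take up in the repository.
func totals(entries []indexEntry) (plaintext, stored uint64) {
	for _, e := range entries {
		plaintext += e.plaintextLen
		stored += e.storedLen
	}
	return plaintext, stored
}

func main() {
	entries := []indexEntry{
		{plaintextLen: 4096, storedLen: 1200}, // compresses well
		{plaintextLen: 100, storedLen: 132},   // grows due to crypto overhead
	}
	p, s := totals(entries)
	fmt.Printf("plaintext: %d bytes, in repo: %d bytes\n", p, s)
}
```

With compression enabled, the two totals can differ substantially, which is why reporting the stored size is the more meaningful number here.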
b7f5de7 to 00d7fcf
What does this PR change? What problem does it solve?
Print the number of bytes added to the repo (including optional compression and the crypto overhead).
Checklist
stats command
changelog/unreleased/ that describes the changes for our users (see template).
gofmt on the code in all commits.