os/bluestore/compression: Fix Estimator::split_and_compress#63373
os/bluestore/compression: Fix Estimator::split_and_compress#63373
Conversation
src/os/bluestore/Compression.cc
Outdated
| uint32_t blob_size = p2roundup((size + blobs - 1) / blobs, au_size); | ||
| std::vector<uint32_t> blob_sizes(blobs); | ||
| for (auto& i: blob_sizes) { | ||
| i = std::min(size, blob_size); |
There was a problem hiding this comment.
the std::min() should probably be moved out of the loop?
There was a problem hiding this comment.
A matter of taste, but I would have preferred a more descriptive version of lines 164,165.
Possibly something along the lines of:
double blobs_as_double = std::ceil(static_cast(size) / max_blob_size);
uint32_t blobs = blobs_as_double;
uint32_t blob_size = p2roundup(std::ceil( size / blobs_as_double), au_size);
There was a problem hiding this comment.
... or, instead of std::ceil() (that requires a non-integer in this case), just use the
function already there: p2roundup()
There was a problem hiding this comment.
std::min() cannot be outside loop.
It assigns values to subsequent elements, so the sum of all elements will be size.
But you missing it suggest to me that maybe I should rename "i" to "iterated_blob_size"?
p2roundup() cannot be used, blobs is not p2 value.
What I would like to do, but it is not if it exist would be: std::divide_roundup().
I would prefer to stay in integer arithmetic and do not use floats here.
There was a problem hiding this comment.
- yes. I missed that. Shame on me.
- (removed)
- I understand the reluctance to move to floats. Was looking for such a divide_roundup(), but did not find one yet.
|
Other than the nits and my prefs - LGTM |
src/os/bluestore/Compression.cc
Outdated
| uint32_t blob_size = p2roundup((size + blobs - 1) / blobs, au_size); | ||
| std::vector<uint32_t> blob_sizes(blobs); | ||
| for (auto& i: blob_sizes) { | ||
| i = std::min(size, blob_size); |
There was a problem hiding this comment.
- yes. I missed that. Shame on me.
- (removed)
- I understand the reluctance to move to floats. Was looking for such a divide_roundup(), but did not find one yet.
c7823e0 to
980b1d5
Compare
|
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
Fixed calculation on effective blob size. When fully non-compressible data is passed, it could cause losing few bytes in the end. Example: -107> 2025-05-17T20:40:50.468+0000 7f267a42f640 15 bluestore(/var/lib/ceph/osd/ceph-4) _do_write_v2_compressed 200000~78002 -> 200000~78002 -106> 2025-05-17T20:40:50.468+0000 7f267a42f640 20 blobs to put: 200000~f000(4d61) 20f000~f000(b51) 21e000~f000(b51) 22d000~f000(b51) 23c000~f000(b51) 24b000~f000(b51) 25a000~f000(b51) 269000~f000(b51) In result we split 0x78002 into 8 * 0xf000, losing 0x2 in the process. Calculations for original: >>> size=0x78002 >>> blobs=(size+0xffff) / 0x10000 >>> blob_size = size / blobs >>> print hex(size), blobs, hex(blob_size) 0x78002 8 0xf000 <-this means roundup is 0xf000 Calculations for fixed: >>> size=0x78002 >>> blobs=(size+0xffff) / 0x10000 >>> blob_size = (size+blobs-1) / blobs >>> print hex(size), blobs, hex(blob_size) 0x78002 8 0xf001 <-this meand roundup is 0x10000 Fixes: https://tracker.ceph.com/issues/71353 Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
980b1d5 to
80b7d68
Compare
|
Verified by https://tracker.ceph.com/issues/71503. |
Fixed calculation on effective blob size.
When fully non-compressible data is passed,
it could cause losing few bytes in the end.
Example:
Calculations for original:
Calculations for fixed:
Fixes: https://tracker.ceph.com/issues/71353
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an
xbetween the brackets:[x]. Spaces and capitalization matter when checking off items this way.Checklist
Show available Jenkins commands
jenkins test classic perfJenkins Job | Jenkins Job Definitionjenkins test crimson perfJenkins Job | Jenkins Job Definitionjenkins test signedJenkins Job | Jenkins Job Definitionjenkins test make checkJenkins Job | Jenkins Job Definitionjenkins test make check arm64Jenkins Job | Jenkins Job Definitionjenkins test submodulesJenkins Job | Jenkins Job Definitionjenkins test dashboardJenkins Job | Jenkins Job Definitionjenkins test dashboard cephadmJenkins Job | Jenkins Job Definitionjenkins test apiJenkins Job | Jenkins Job Definitionjenkins test docsReadTheDocs | Github Workflow Definitionjenkins test ceph-volume allJenkins Jobs | Jenkins Jobs Definitionjenkins test windowsJenkins Job | Jenkins Job Definitionjenkins test rook e2eJenkins Job | Jenkins Job Definition