os/bluestore: Fix problem with EC + elastic shared blob #62816
      << std::dec << " hit new spanning blob " << *p << dendl;
    request_reshard(p->blob_start(), p->blob_end());
    must_reshard = true;
    } else if (p->blob->is_spanning() && p->logical_end() > end) {
I have a feeling that resharding doesn't 100% guarantee that the spanning blob goes away. It may or may not happen, depending on the shard layout.
Am I missing something?
Resharding does not guarantee that the spanning blob goes away.
But resharding does guarantee that an extent will not spill over from one shard to another.
Well, I formulated the statement above badly. My concern was rather that there is no guarantee that resharding eliminates "spanning" extents, i.e. ones that span a shard boundary.
And indeed, IMO the reshard UT from a7f0829 proves my concern.
Hence I extended your PR with some additional changes (both to implement the above UT and to fix the issue); please see #65065, which IMO is a full fix for the issue.
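The distinction drawn in this thread, between a blob being spanning and an extent crossing a shard boundary, can be illustrated with a small standalone sketch. This is not BlueStore code: `Extent`, `shard_index`, and `crosses_shard_boundary` are hypothetical names, and shards are modeled only as a sorted list of starting logical offsets.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical stand-in for a logical extent covering [offset, offset + length).
struct Extent {
  uint64_t offset;
  uint64_t length;
  uint64_t end() const { return offset + length; }
};

// shard_starts holds the first logical offset of each shard, sorted ascending,
// with shard_starts[0] == 0 (an assumption of this sketch).
// Returns the index of the shard that contains `off`.
inline size_t shard_index(const std::vector<uint64_t>& shard_starts,
                          uint64_t off) {
  assert(!shard_starts.empty() && shard_starts.front() == 0);
  auto it = std::upper_bound(shard_starts.begin(), shard_starts.end(), off);
  return static_cast<size_t>(std::distance(shard_starts.begin(), it)) - 1;
}

// An extent crosses a shard boundary when its first and last bytes
// fall into different shards.
inline bool crosses_shard_boundary(const std::vector<uint64_t>& shard_starts,
                                   const Extent& e) {
  assert(e.length > 0);
  return shard_index(shard_starts, e.offset) !=
         shard_index(shard_starts, e.end() - 1);
}
```

With shards starting at 0, 0x10000 and 0x20000, an extent at offset 0xf000 with length 0x2000 crosses from the first shard into the second, while an extent at 0x8000 with length 0x8000 stays inside the first shard. Whether a reshard removes such a crossing depends on where the new shard boundaries land, which is the concern raised above.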
@aclamk it would be great if we have a reproducer for the issue in store_test or something.
@ifed01 @Jayaprakash-ibm is working on a teuthology job that will do such a thing.
jenkins test make check arm64
Make sure that spanning blobs are not allowed to have extents crossing a shard boundary.
Partially fixes: https://tracker.ceph.com/issues/70390
Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>

dirty_range used to have length = 1 byte. This is fine if the whole extent is inside a shard, but that has proven not to be the case. dirty_range(offset, length) is slower only when the range crosses a shard boundary.
Partially fixes: https://tracker.ceph.com/issues/70390
Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
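A toy model may help show why a 1-byte dirty range is not enough once an extent can reach into a neighboring shard. The sketch below is an assumption-laden illustration, not the real ExtentMap::dirty_range: `ToyShardMap` is a hypothetical type, and it uses fixed-size shards purely to keep the arithmetic simple.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model (hypothetical, not the BlueStore ExtentMap): fixed-size shards,
// each carrying a dirty flag.
struct ToyShardMap {
  uint64_t shard_size;
  std::vector<bool> dirty;

  ToyShardMap(uint64_t size, size_t count)
      : shard_size(size), dirty(count, false) {
    assert(size > 0 && count > 0);
  }

  // Mark every shard overlapped by [offset, offset + length) as dirty.
  // The loop runs more than once only when the range crosses a shard boundary.
  void dirty_range(uint64_t offset, uint64_t length) {
    assert(length > 0);
    size_t first = static_cast<size_t>(offset / shard_size);
    size_t last = static_cast<size_t>((offset + length - 1) / shard_size);
    assert(last < dirty.size());
    for (size_t i = first; i <= last; ++i) {
      dirty[i] = true;
    }
  }
};
```

In this model, `dirty_range(0xf000, 1)` with 0x10000-byte shards marks only the first shard, while `dirty_range(0xf000, 0x2000)` marks the first two. An extent of that size that is dirtied with a length of 1 would leave the second shard flagged clean even though part of the extent lives there.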
f4d361c to 4f566ea
jenkins test windows
jenkins test make check arm64
@ifed01: ping.
I made an extended fix for the issue, see #65065
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved.
It looks like ESB (elastic shared blob) exposed problems with the _do_remove and ExtentMap::update() functions.
Fixes: https://tracker.ceph.com/issues/70390