mds/quiesce: agent: avoid a race condition with rapid db updates#56956
Merged
leonid-s-usov merged 1 commit into main on Apr 18, 2024
c3ff08e to ebc02e0
When new roots begin processing but don't yet make it into the currently tracked set, there is a window for the next update with the same roots to treat them as new.

We fix it by simplifying the agent model, getting rid of the intermediate `working` set. Since we never remove or add items into the current roots collection, it's safe to update the current set directly from the pending set.

The race was due to the fact that `db_update()` relied on the `current` to deduce new roots into `pending`, while the same new root could have already been seen and posted into the `working` set. This would lead to submitting the same new root twice.

Without the `working` set such race isn't possible.

Fixes: https://tracker.ceph.com/issues/65545
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
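The interleaving described above can be illustrated with a small single-threaded toy model. This is only a sketch under stated assumptions, not the QuiesceAgent code: the struct names, `take()`, `commit()`, and the `/a` root are hypothetical, and the real agent runs `db_update()` and the processing step on different threads. Only the set names `pending`, `working`, `current` and the function name `db_update()` come from the description.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a quiesce root.
using Root = std::string;
using RootSet = std::set<Root>;

// Toy model of the old scheme: pending -> working -> current.
struct OldAgentModel {
  RootSet pending, working, current;
  std::vector<Root> submitted;  // every root handed off for processing

  // New roots are deduced by comparing against `current` only.
  void db_update(const RootSet& roots) {
    for (const auto& r : roots)
      if (current.count(r) == 0) pending.insert(r);
  }
  // Agent picks up pending roots and starts working on them.
  void take() {
    for (const auto& r : pending) {
      working.insert(r);
      submitted.push_back(r);
    }
    pending.clear();
  }
  // Only at this point do the roots become visible in `current`.
  void commit() {
    current.insert(working.begin(), working.end());
    working.clear();
  }
};

// Toy model of the simplified scheme: pending -> current, no `working`.
struct NewAgentModel {
  RootSet pending, current;
  std::vector<Root> submitted;

  void db_update(const RootSet& roots) {
    for (const auto& r : roots)
      if (current.count(r) == 0) pending.insert(r);
  }
  // Roots enter `current` in the same step that submits them.
  void take() {
    for (const auto& r : pending)
      if (current.insert(r).second) submitted.push_back(r);
    pending.clear();
  }
};

// Two rapid updates with the same root; the second lands before commit().
std::size_t old_model_submissions() {
  OldAgentModel a;
  a.db_update({"/a"});
  a.take();             // "/a" is in `working`, not yet in `current`
  a.db_update({"/a"});  // still looks new
  a.commit();
  a.take();             // "/a" is submitted a second time
  return a.submitted.size();
}

std::size_t new_model_submissions() {
  NewAgentModel a;
  a.db_update({"/a"});
  a.take();             // "/a" is already in `current`
  a.db_update({"/a"});  // recognized as known
  a.take();
  return a.submitted.size();
}

// Usage: the old model submits the root twice, the new one once.
void demo() {
  assert(old_model_submissions() == 2);
  assert(new_model_submissions() == 1);
}
```

In this model the duplicate submission disappears because there is no step between picking up a root and recording it in `current`, which is the property the description relies on.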
ebc02e0 to 2a3faf1
batrick approved these changes on Apr 17, 2024
Contributor, Author:
Passed 8 tests on Ubuntu machines here: https://pulpito.ceph.com/leonidus-2024-04-18_07:15:14-fs-wip-lusov-quiesce-agent-race-distro-default-smithi/
The other tests were dead due to an infra issue.
Contributor, Author:
jenkins test api
Contributor, Author:
The downstream backport: https://gitlab.cee.redhat.com/ceph/ceph/-/merge_requests/597
leonid-s-usov added a commit that referenced this pull request on Apr 18, 2024
When new roots begin processing but don't yet make it into the currently tracked set, there is a window for the next update with the same roots to treat them as new.

We fix it by simplifying the agent model, getting rid of the intermediate `working` set. Since we never remove or add items into the current roots collection, it's safe to update the current set directly from the pending set.

The race was due to the fact that `db_update()` relied on the `current` to deduce new roots into `pending`, while the same new root could have already been seen and posted into the `working` set. This would lead to submitting the same new root twice.

Without the `working` set such race isn't possible.

Fixes: https://tracker.ceph.com/issues/65570
Original-Issue: https://tracker.ceph.com/issues/65545
Original-PR: #56956
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit 2a3faf1)
mkogan1 pushed a commit to mkogan1/ceph that referenced this pull request on Aug 7, 2024
When new roots begin processing but don't yet make it into the currently tracked set, there is a window for the next update with the same roots to treat them as new.

We fix it by simplifying the agent model, getting rid of the intermediate `working` set. Since we never remove or add items into the current roots collection, it's safe to update the current set directly from the pending set.

The race was due to the fact that `db_update()` relied on the `current` to deduce new roots into `pending`, while the same new root could have already been seen and posted into the `working` set. This would lead to submitting the same new root twice.

Without the `working` set such race isn't possible.

Fixes: https://tracker.ceph.com/issues/65570
Original-Issue: https://tracker.ceph.com/issues/65545
Original-PR: ceph#56956
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit 2a3faf1)