Bug #67330
mgr/volumes: resilience to failed non-atomic operations
0%
Description
Take the example of removing a subvolume, which involves moving the subvolume uuid directory to staging (trash) and then updating the subvolume metadata to reflect its state (e.g., snapshot-retained). In a cluster-full case, the move to the staging directory, which is essentially a rename operation, has been seen to succeed while the metadata update fails (the cluster lacks the free space to store the update). The result is an invalid subvolume state: the uuid directory is missing (it was moved away), but the path component in the subvolume metadata still points to it, causing subsequent subvolume operations to fail.
In theory, we could simply reverse the order of operations (update the metadata, then remove the subvolume), thereby catching the -ENOSPC failure first and erroring out. However, we should also assess whether some sort of multi-phase commit could mitigate or recover from the intermediate state.
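The failure mode described above can be reproduced outside of Ceph with a small simulation. This is only a sketch under stated assumptions: `FullClusterMeta`, `remove_rename_first`, the `trash` directory name and the `uuid-1234` directory are hypothetical stand-ins, not mgr/volumes code, and a local temporary directory plays the role of the file system.

```python
import errno
import os
import tempfile


class FullClusterMeta:
    """Hypothetical stand-in for subvolume metadata on a full cluster."""

    def __init__(self, path):
        self.path = path        # path component recorded in the metadata
        self.state = "complete"

    def update(self, state, path):
        # Every metadata write fails, as it would with no free space left.
        raise OSError(errno.ENOSPC, os.strerror(errno.ENOSPC))


def remove_rename_first(meta, trash_dir):
    """Problematic order: rename to trash first, update metadata second."""
    trashed = os.path.join(trash_dir, os.path.basename(meta.path))
    os.rename(meta.path, trashed)                # step 1 succeeds
    meta.update("snapshot-retained", path=None)  # step 2 raises ENOSPC


def demo():
    root = tempfile.mkdtemp()
    uuid_dir = os.path.join(root, "subvol", "uuid-1234")  # assumed layout
    trash_dir = os.path.join(root, "trash")
    os.makedirs(uuid_dir)
    os.makedirs(trash_dir)

    meta = FullClusterMeta(uuid_dir)
    try:
        remove_rename_first(meta, trash_dir)
    except OSError as e:
        assert e.errno == errno.ENOSPC

    # The metadata was never updated, yet the directory it names is gone.
    return meta.state, os.path.exists(meta.path)
```

Calling `demo()` leaves the metadata in its old state while the path it records no longer exists, which is the inconsistent state reported here.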
Updated by Venky Shankar over 1 year ago
- Status changed from New to Triaged
- Assignee set to Neeraj Pratap Singh
Updated by Raimund Sacherer over 1 year ago
Hello,
I see backport for reef and squid, can we please have a backport for Quincy as well?
Thank you,
Raimund
Updated by Venky Shankar over 1 year ago
- Backport changed from reef,squid to quincy,reef,squid
Raimund Sacherer wrote in #note-2:
Hello,
I see backport for reef and squid, can we please have a backport for Quincy as well?
Of course.
Updated by Venky Shankar over 1 year ago
- Related to Feature #62715: mgr/volumes: switch to storing subvolume metadata in libcephsqlite added
Updated by Venky Shankar over 1 year ago
Update on the proposal: it would suffice to change the order of operations for subvolume removal: update the metadata first, then rename the uuid directory. If the metadata update fails, the remove operation fails, keeping the subvolume metadata consistent with the uuid path. The case where the rename fails is very unlikely. This is the only place in mgr/volumes where the non-atomic nature of the operations leaves inconsistent metadata behind.
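A minimal sketch of the reordered removal follows, to illustrate why a failed metadata update leaves nothing to clean up. The names `Meta`, `remove_metadata_first`, `make_layout` and the directory layout are assumptions made for illustration; this is not the code from the pull request.

```python
import errno
import os
import tempfile


class Meta:
    """Hypothetical in-memory stand-in for subvolume metadata."""

    def __init__(self, path, fail_with=None):
        self.path = path
        self.state = "complete"
        self.fail_with = fail_with  # errno to raise on update, or None

    def update(self, state, path):
        if self.fail_with is not None:
            raise OSError(self.fail_with, os.strerror(self.fail_with))
        self.state, self.path = state, path


def remove_metadata_first(meta, trash_dir):
    """Proposed order: update metadata first, rename to trash second."""
    uuid_dir = meta.path
    # If this raises (e.g. ENOSPC), nothing on disk has changed yet.
    meta.update("snapshot-retained", path=None)
    trashed = os.path.join(trash_dir, os.path.basename(uuid_dir))
    os.rename(uuid_dir, trashed)
    return trashed


def make_layout():
    root = tempfile.mkdtemp()
    uuid_dir = os.path.join(root, "subvol", "uuid-1234")  # assumed layout
    trash_dir = os.path.join(root, "trash")
    os.makedirs(uuid_dir)
    os.makedirs(trash_dir)
    return uuid_dir, trash_dir


def demo_cluster_full():
    uuid_dir, trash_dir = make_layout()
    meta = Meta(uuid_dir, fail_with=errno.ENOSPC)
    try:
        remove_metadata_first(meta, trash_dir)
    except OSError as e:
        assert e.errno == errno.ENOSPC
    # Removal failed early: metadata and directory still agree.
    return meta.state, os.path.isdir(meta.path)


def demo_success():
    uuid_dir, trash_dir = make_layout()
    meta = Meta(uuid_dir)
    trashed = remove_metadata_first(meta, trash_dir)
    return meta.state, meta.path, os.path.isdir(uuid_dir), os.path.isdir(trashed)
```

In the cluster-full case the error surfaces before the rename, so the metadata still names a directory that exists. The sketch does not handle a failing rename, which the proposal treats as very unlikely.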
Updated by Venky Shankar over 1 year ago
@Neeraj Pratap Singh - any update on this?
Updated by Neeraj Pratap Singh over 1 year ago
- Status changed from Triaged to Fix Under Review
- Pull request ID set to 59676
Updated by Neeraj Pratap Singh over 1 year ago
Venky Shankar wrote in #note-7:
@Neeraj Pratap Singh - any update on this?
PR has been opened for it now.
Updated by Konstantin Shalygin about 1 year ago
- Backport changed from quincy,reef,squid to reef,squid
Updated by Venky Shankar 6 months ago
- Status changed from Fix Under Review to Pending Backport
- Target version changed from v20.0.0 to v21.0.0
- Backport changed from reef,squid to tentacle,squid
Updated by Upkeep Bot 6 months ago
- Merge Commit set to 629e419edf3f07a122332935ee425ff542d3eda2
- Fixed In set to v20.3.0-3333-g629e419edf
- Upkeep Timestamp set to 2025-09-29T15:58:29+00:00
Updated by Upkeep Bot 6 months ago
- Copied to Backport #73308: tentacle: mgr/volumes: resilience to failed non-atomic operations added
Updated by Upkeep Bot 6 months ago
- Copied to Backport #73309: squid: mgr/volumes: resilience to failed non-atomic operations added