Bug #67330

open

mgr/volumes: resilience to failed non-atomic operations

Added by Venky Shankar over 1 year ago. Updated 6 months ago.

Status: Pending Backport
Priority: Urgent
Category: Correctness/Safety
Target version:
% Done: 0%
Source: other
Backport: tentacle,squid
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): mgr/volumes
Labels (FS):
Pull request ID:
Tags (freeform): backport_processed
Fixed In: v20.3.0-3333-g629e419edf
Released In:
Upkeep Timestamp: 2025-09-29T15:58:29+00:00

Description

Take the example of removing a subvolume, which involves moving the subvolume uuid directory to staging (trash) and then updating the subvolume metadata to reflect its state (e.g. snapshot-retained). In a cluster-full case, the move to the staging directory, which is essentially a rename operation, has been seen to succeed while the metadata update fails (the cluster lacks the free space to store the update). The result is an invalid subvolume state: the uuid directory is missing (it was moved away), but the path component in the subvolume metadata still points to it, causing subsequent subvolume operations to fail.
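The failure sequence can be reproduced in a small sketch. Everything in it is hypothetical and for illustration only: `remove_rename_first`, the dict-based metadata, and the `failing_update` callback are stand-ins, not mgr/volumes code, and local temporary directories stand in for CephFS paths.

```python
import errno
import os
import tempfile


def remove_rename_first(meta, trash_dir, update_meta):
    """Hypothetical rename-first removal: move the uuid dir, then update metadata."""
    src = meta["path"]
    dst = os.path.join(trash_dir, os.path.basename(src))
    # Step 1: the rename, which the report saw succeed on a full cluster.
    os.rename(src, dst)
    # Step 2: the metadata update, which may fail with ENOSPC.
    update_meta(meta, "snapshot-retained")


def failing_update(meta, state):
    """Simulates a metadata write that fails because the cluster is full."""
    raise OSError(errno.ENOSPC, os.strerror(errno.ENOSPC))


def demo():
    """Runs the removal against a throwaway directory tree (assumed layout)."""
    root = tempfile.mkdtemp()
    uuid_dir = os.path.join(root, "subvol", "uuid-1234")
    trash_dir = os.path.join(root, "_deleting")
    os.makedirs(uuid_dir)
    os.makedirs(trash_dir)
    meta = {"path": uuid_dir, "state": "complete"}
    try:
        remove_rename_first(meta, trash_dir, failing_update)
    except OSError as e:
        assert e.errno == errno.ENOSPC
    # The metadata is unchanged, but the directory it names is gone.
    return meta, os.path.exists(meta["path"])
```

After `demo()` returns, the metadata still names the original uuid path while the directory itself sits in the staging directory, which is the inconsistent state described in this report.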

In theory, we could simply reverse the order of operations (update the metadata, then remove the subvolume), thereby catching the -ENOSPC failure first and erroring out. However, we should also assess whether some sort of multi-phase commit could mitigate or recover from the intermediate state.


Related issues 3 (2 open, 1 closed)

Related to CephFS - Feature #62715: mgr/volumes: switch to storing subvolume metadata in libcephsqlite (New, Neeraj Pratap Singh)
Copied to CephFS - Backport #73308: tentacle: mgr/volumes: resilience to failed non-atomic operations (Resolved, Neeraj Pratap Singh)
Copied to CephFS - Backport #73309: squid: mgr/volumes: resilience to failed non-atomic operations (QA Testing, Neeraj Pratap Singh)

Actions #1

Updated by Venky Shankar over 1 year ago

  • Status changed from New to Triaged
  • Assignee set to Neeraj Pratap Singh
Actions #2

Updated by Raimund Sacherer over 1 year ago

Hello,

I see backport for reef and squid, can we please have a backport for Quincy as well?

Thank you,

Raimund

Actions #3

Updated by Venky Shankar over 1 year ago

  • Backport changed from reef,squid to quincy,reef,squid

Raimund Sacherer wrote in #note-2:

Hello,

I see backport for reef and squid, can we please have a backport for Quincy as well?

Of course.

Actions #4

Updated by Venky Shankar over 1 year ago

  • Related to Feature #62715: mgr/volumes: switch to storing subvolume metadata in libcephsqlite added
Actions #5

Updated by Venky Shankar over 1 year ago

Update on the proposal: it would suffice to change the order of operations for subvolume removal, updating the metadata first and then renaming the uuid directory. If the metadata update fails, the remove operation fails, keeping the subvolume metadata consistent with the uuid path. The case where the rename fails is very unlikely. This is the only place in mgr/volumes where the non-atomic nature of operations leaves inconsistent metadata behind.
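A minimal sketch of the proposed metadata-first ordering follows. It is not the code from the pull request: `remove_metadata_first`, the dict-based metadata, the `update_ok` and `update_enospc` callbacks, and the directory layout are all assumptions made for illustration, with local temporary directories standing in for CephFS paths.

```python
import errno
import os
import tempfile


def remove_metadata_first(meta, trash_dir, update_meta):
    """Hypothetical metadata-first removal: update metadata, then rename."""
    src = meta["path"]
    dst = os.path.join(trash_dir, os.path.basename(src))
    # If this raises (e.g. ENOSPC), nothing has been moved yet.
    update_meta(meta, "snapshot-retained")
    # Only reached once the metadata update has succeeded.
    os.rename(src, dst)
    return dst


def update_ok(meta, state):
    """Stand-in for a metadata write that succeeds."""
    meta["state"] = state


def update_enospc(meta, state):
    """Stand-in for a metadata write that fails on a full cluster."""
    raise OSError(errno.ENOSPC, os.strerror(errno.ENOSPC))


def make_subvolume():
    """Builds a throwaway uuid directory and staging directory (assumed layout)."""
    root = tempfile.mkdtemp()
    uuid_dir = os.path.join(root, "subvol", "uuid-1234")
    trash_dir = os.path.join(root, "_deleting")
    os.makedirs(uuid_dir)
    os.makedirs(trash_dir)
    return {"path": uuid_dir, "state": "complete"}, trash_dir
```

With `update_enospc`, the call raises before the rename, so the uuid directory stays where the metadata says it is. With `update_ok`, the directory ends up in the staging directory and the state is recorded. The remaining window is a rename failure after a successful metadata update, which the note above considers very unlikely.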

Actions #6

Updated by Venky Shankar over 1 year ago

  • Priority changed from Normal to Urgent
Actions #7

Updated by Venky Shankar over 1 year ago

@Neeraj Pratap Singh - any update on this?

Actions #8

Updated by Neeraj Pratap Singh over 1 year ago

  • Status changed from Triaged to Fix Under Review
  • Pull request ID set to 59676
Actions #9

Updated by Neeraj Pratap Singh over 1 year ago

Venky Shankar wrote in #note-7:

@Neeraj Pratap Singh - any update on this?

PR has been opened for it now.

Actions #10

Updated by Konstantin Shalygin about 1 year ago

  • Backport changed from quincy,reef,squid to reef,squid
Actions #11

Updated by Venky Shankar 6 months ago

  • Status changed from Fix Under Review to Pending Backport
  • Target version changed from v20.0.0 to v21.0.0
  • Backport changed from reef,squid to tentacle,squid
Actions #12

Updated by Upkeep Bot 6 months ago

  • Merge Commit set to 629e419edf3f07a122332935ee425ff542d3eda2
  • Fixed In set to v20.3.0-3333-g629e419edf
  • Upkeep Timestamp set to 2025-09-29T15:58:29+00:00
Actions #13

Updated by Upkeep Bot 6 months ago

  • Copied to Backport #73308: tentacle: mgr/volumes: resilience to failed non-atomic operations added
Actions #14

Updated by Upkeep Bot 6 months ago

  • Copied to Backport #73309: squid: mgr/volumes: resilience to failed non-atomic operations added
Actions #15

Updated by Upkeep Bot 6 months ago

  • Tags (freeform) set to backport_processed