Skip to content

mon: disable snap id allocation for fsmap pools#47753

Merged
rishabh-d-dave merged 4 commits intoceph:mainfrom
mchangir:mon-disable-snap-id-allocation-for-fsmap-pools
Jan 30, 2023
Merged

mon: disable snap id allocation for fsmap pools#47753
rishabh-d-dave merged 4 commits intoceph:mainfrom
mchangir:mon-disable-snap-id-allocation-for-fsmap-pools

Conversation

@mchangir
Copy link
Contributor

@mchangir mchangir commented Aug 23, 2022

  • disallow monitor managed snaps to be created for pools attached to cephfs filesystems
  • disallow pools with snaps to be used in dir and file layouts
  • disallow pools with snaps to be used for fs creation

Fixes: https://tracker.ceph.com/issues/16745
Signed-off-by: Milind Changire mchangir@redhat.com

Contribution Guidelines

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows

@mchangir mchangir requested a review from a team as a code owner August 23, 2022 12:05
@github-actions github-actions bot added cephfs Ceph File System core mon labels Aug 23, 2022
@mchangir mchangir requested a review from a team August 23, 2022 12:06
@mchangir mchangir force-pushed the mon-disable-snap-id-allocation-for-fsmap-pools branch 2 times, most recently from 320fa67 to afac3a1 Compare August 23, 2022 12:25
Copy link
Contributor

@ajarr ajarr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this PR is WIP/draft , and you'll add tests later.


const pg_pool_t *pool = osdmap.get_pg_pool(m->pool);

if (m->op == POOL_OP_CREATE_SNAP ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please explain in the commit message why we disallow pool level snaps for FS pools?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thinking about it more ... since pool level snap IDs are assigned by mons and FS level snap IDs are assigned by the MDS itself, I wonder how the snap IDs may clash causing problems ... which was the main driving factor for this PR
since MDS is never going to talk to the mon for manging FS snaps ... I think I may be lacking a bit of context

@gregsfortytwo could you elaborate on this please

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The snapids are a shared resource and when you delete a snapshot, you insert it into the osdmap. Any overlap is bad since it would delete both...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gregsfortytwo could you point me to the piece of code on the mds side which initiates the propagation of the deleted snapid to the osdmap

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so @vshankar mentions that the reason for creating MDS managed snapshots was to implement a finer granularity (dir level) snaps rather than using the pool level snaps for the data pool in the fs

the collision happens since each write op to the data pool has to be handed a snap context (snapid) to write the data to and when pool level snaps are allowed along side MDS level snaps for the same pool, a "remove snap" from the mds could encounter a collision with pool level snap IDs and cause the unintentional removal of pool level data rather then dir level data as the MDS intended to remove

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mchangir You'd need to look at SnapServer.cc

{
/* Since we don't allow to create pool level snaps for all pools associated
* with a CephFS, we also block a pool with pre-existing snaps to be attached
* to a CephFS.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does that mean we want to prevent a pool that has pre-existing snaps to be used as a FS data pool using 'add_data_pool' or 'fs new' command? So we will want this check in "FileSystemCommandHandler::_check_pool() as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes; we'd need a check there as well. But I want others to comment on this specific issue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would be an ideal thing to do.

@mchangir mchangir marked this pull request as draft August 24, 2022 03:07
* to a CephFS.
*/
if (const auto *p = osdmap.get_pg_pool(pool_id); p->snaps.size() > 0) {
dout(20) << "pool(" << pool_id << ") already has snaps; can't attach to fs" << dendl;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be derr ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not a fatal error, so its not a good idea to tag it as ERR in the logs
in fact we are avoiding errors by blocking the addition of such pools to the fs

}

// XATTRS
int Server::check_if_pool_is_empty(const OSDMap& osdmap, int64_t pool_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it better to rename the function as check_if_pool_has_snaps instead ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed function to check_for_pool_snaps()

@mchangir mchangir force-pushed the mon-disable-snap-id-allocation-for-fsmap-pools branch from afac3a1 to 043c7a3 Compare August 24, 2022 06:05
{
/* Since we don't allow to create pool level snaps for all pools associated
* with a CephFS, we also block a pool with pre-existing snaps to be attached
* to a CephFS.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would be an ideal thing to do.

@mchangir mchangir force-pushed the mon-disable-snap-id-allocation-for-fsmap-pools branch from 043c7a3 to eb0f530 Compare August 24, 2022 11:51
@mchangir mchangir force-pushed the mon-disable-snap-id-allocation-for-fsmap-pools branch 3 times, most recently from 4cb8673 to 86e6af3 Compare August 30, 2022 03:14
@github-actions github-actions bot added the tests label Aug 30, 2022
@mchangir mchangir marked this pull request as ready for review August 30, 2022 03:15
@vshankar
Copy link
Contributor

@mchangir I guess this is ready for QA?

@mchangir
Copy link
Contributor Author

@mchangir I guess this is ready for QA?

yes, its ready for QA

Copy link
Member

@gregsfortytwo gregsfortytwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed-by: Greg Farnum gfarnum@redhat.com

Copy link
Member

@batrick batrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good work!

@vshankar
Copy link
Contributor

FYI, I will pick this for fs suite runs soon -- I am currently trying to merge pending PRs which are running tests, however, I'm frequently running into teuthology related issues.

@vshankar vshankar added the wip-rishabh-testing Rishabh's testing label label Oct 17, 2022
Copy link
Contributor

@rishabh-d-dave rishabh-d-dave left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO it'll be great to add a note about this in CephFS docs.

pass
self.delete_dir_and_snaps("accounts", new_limit + 1)

def test_disallow_monitor_managed_snaps_for_fs_pools(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving this test to a new class, say TestMonSnapsAndFsPools (subclass of TestSnapshots), would allow us to quickly test how monitor-managed snapshots are treated on FS pools conveniently.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added new class to test specifically for mon-managed snaps

}

if (pool->has_snaps()) {
*ss << "pool(" << pool_id << ") already has snaps; can't attach to fs";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be worth it to write "mon-managed snaps" instead of just "snaps" here for clarity of CephFS users?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed debug log to mention "mon-managed snaps"

@rishabh-d-dave
Copy link
Contributor

Ceph testing infra is down for last few weeks, I'll put this PR through testing as soon as it's fine again.

self.assertEqual(ret, errno.EOPNOTSUPP)

# cleanup
self.fs.rados(["rmsnap", "snap3"], pool=test_pool_name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: delete {test_pool_name}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kotreshhr do you want me to replace {test_pool_name} with snap-test-pool ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, I meant cleanup is missing. The pool needs to deleted.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't cleaning up redundant for since all FSs and related pools are deleted after every test case unconditionally?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added pool cleanup

Signed-off-by: Milind Changire <mchangir@redhat.com>
Signed-off-by: Milind Changire <mchangir@redhat.com>
Signed-off-by: Milind Changire <mchangir@redhat.com>
@mchangir mchangir force-pushed the mon-disable-snap-id-allocation-for-fsmap-pools branch from 86e6af3 to d8c55ed Compare November 21, 2022 10:45
@mchangir mchangir requested a review from a team as a code owner November 21, 2022 10:58
invisible, not throwing errors to the user.) If each FS gets its own
pool things probably work, but this isn't tested and may not be true.

.. Note:: To avoid snap id collision between mon-managed snapshots and file-system
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There has been some contention about "filesystem" vs "file system" vs "file-system". The powers that be prefer "file system", so I suggest taking out the - in these three instances.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

corrected to file system

Copy link
Contributor

@anthonyeleven anthonyeleven left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hyphens

Signed-off-by: Milind Changire <mchangir@redhat.com>
@mchangir mchangir force-pushed the mon-disable-snap-id-allocation-for-fsmap-pools branch from 96915f1 to a81c0b3 Compare November 22, 2022 00:49
Copy link
Contributor

@rishabh-d-dave rishabh-d-dave left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rishabh-d-dave
Copy link
Contributor

jenkins test make check arm64

@rishabh-d-dave
Copy link
Contributor

jenkins test windows

@rishabh-d-dave
Copy link
Contributor

rishabh-d-dave commented Jan 30, 2023

Since PRs from this testing batch are waiting on required CI jobs to be green, I'll wait on this one too for a couple of hours.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants