Bug #70441
Status: Closed
qa: The following counters failed to be set on mds daemons: {'mds_server.req_rmsnap_latency.avgcount'}
Description
/a/vshankar-2025-03-12_09:22:14-fs-wip-vshankar-testing-20250306.043526-debug-testing-default-smithi/8183606
This job has check-counters for verifying that some perf counters get updated during the course of the job execution. The failed check here is for the mds_server.req_rmsnap_latency.avgcount counter. qa/suites/fs/workload/tasks/3-snaps/yes.yaml adds snap-schedule and retention. The retention part is of interest here, since that's what the failed perf counter relates to. The yaml sets the snap retention to 6m3h, implying that snapshots older than 6 minutes or older than 3 hours are purged by the snap-schedule plugin. The ffsb workunit finished under 6 minutes; however, the yaml has an added guard that induces a wait (sleep) so that the snapshots are purged even if the workunit finishes quickly. However, even after this guard is executed, the perf counter mds_server.req_rmsnap_latency.avgcount isn't updated.
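The guard described above can be sketched as follows (a hypothetical helper, not the actual yaml/teuthology code): if the workunit finishes before the shortest retention window (the 6 minutes from the 6m3h spec) has elapsed, sleep for the remainder so snap-schedule has a chance to purge a snapshot.

```python
# Hypothetical sketch of the qa guard described above (assumed helper,
# not the actual test code): sleep out the remainder of the shortest
# retention window so snap-schedule can purge at least one snapshot.
RETENTION_WINDOW_SECS = 6 * 60  # the '6m' part of the 6m3h retention spec

def wait_for_retention_window(workunit_started_at: float, now: float) -> float:
    """Return how long to sleep so at least one snapshot ages past 6m."""
    elapsed = now - workunit_started_at
    return max(0.0, RETENTION_WINDOW_SECS - elapsed)

# A workunit that ran for 4 minutes still needs ~2 minutes of waiting.
print(wait_for_retention_window(0.0, 4 * 60))  # 120.0
```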
Updated by Venky Shankar 12 months ago
- Status changed from New to Triaged
- Assignee set to Jos Collin
Updated by Jos Collin 12 months ago
- Status changed from Triaged to Fix Under Review
- Pull request ID set to 62247
Updated by Venky Shankar 9 months ago
- Target version changed from v20.0.0 to v21.0.0
- Backport changed from reef,squid to tentacle,squid,reef
Updated by Patrick Donnelly 9 months ago
- Backport changed from tentacle,squid,reef to tentacle,squid
Updated by Jos Collin 9 months ago
- Status changed from Fix Under Review to In Progress
- Pull request ID deleted (62247)
It never hits here https://github.com/ceph/ceph/blob/main/src/pybind/mgr/snap_schedule/fs/schedule_client.py#L407, there's no such logging in ceph-mgr logs, which means the prune set is empty. So all those 5 snapshots are kept (in keep) and not deleted.
2025-03-12T10:33:00.463+0000 7f3b739b2640 0 [snap_schedule DEBUG snap_schedule.fs.schedule_client] compiling keep set for period n
2025-03-12T10:33:00.463+0000 7f3b739b2640 0 [snap_schedule DEBUG snap_schedule.fs.schedule_client] compiling keep set for period m
2025-03-12T10:33:00.463+0000 7f3b739b2640 0 [snap_schedule DEBUG snap_schedule.fs.schedule_client] keeping b'scheduled-2025-03-12-10_33_00_UTC' due to 6m
2025-03-12T10:33:00.463+0000 7f3b739b2640 0 [snap_schedule DEBUG snap_schedule.fs.schedule_client] keeping b'scheduled-2025-03-12-10_32_01_UTC' due to 6m
2025-03-12T10:33:00.463+0000 7f3b739b2640 0 [snap_schedule DEBUG snap_schedule.fs.schedule_client] keeping b'scheduled-2025-03-12-10_31_00_UTC' due to 6m
2025-03-12T10:33:00.463+0000 7f3b739b2640 0 [snap_schedule DEBUG snap_schedule.fs.schedule_client] keeping b'scheduled-2025-03-12-10_25_04_UTC' due to 6m
2025-03-12T10:33:00.463+0000 7f3b739b2640 0 [snap_schedule DEBUG snap_schedule.fs.schedule_client] keeping b'scheduled-2025-03-12-10_24_00_UTC' due to 6m
2025-03-12T10:33:00.463+0000 7f3b739b2640 0 [snap_schedule DEBUG snap_schedule.fs.schedule_client] compiling keep set for period h
2025-03-12T10:33:00.463+0000 7f3b739b2640 0 [snap_schedule DEBUG snap_schedule.fs.schedule_client] compiling keep set for period d
2025-03-12T10:33:00.463+0000 7f3b739b2640 0 [snap_schedule DEBUG snap_schedule.fs.schedule_client] compiling keep set for period w
2025-03-12T10:33:00.463+0000 7f3b739b2640 0 [snap_schedule DEBUG snap_schedule.fs.schedule_client] compiling keep set for period M
2025-03-12T10:33:00.463+0000 7f3b739b2640 0 [snap_schedule DEBUG snap_schedule.fs.schedule_client] compiling keep set for period y
The ceph-mgr logs above show that all 5 snapshots are kept.
In the get_prune_set code (https://github.com/ceph/ceph/blob/main/src/pybind/mgr/snap_schedule/fs/schedule_client.py#L118), `candidates` and `keep` end up being the same. Thus there's nothing to delete.
That's why the counter stays at 0 and check_counters failed, as the counter is never `seen`: https://github.com/ceph/ceph/blob/main/qa/tasks/check_counter.py#L141.
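The behaviour described in this note can be modeled with a minimal sketch (simplified, not the actual snap_schedule code): retention builds a `keep` set out of the `candidates`, and only the set difference is pruned. With 5 candidates and a '6m' rule that keeps up to 6 minutely snapshots, the prune set is empty, so no rmsnap request ever reaches the MDS and the latency counter stays at 0.

```python
# Simplified model of get_prune_set (not the real mgr code): the prune
# set is `candidates - keep`; if every candidate is retained, nothing
# is deleted and mds_server.req_rmsnap_latency.avgcount never moves.
def get_prune_set(candidates: set, keep_per_period: dict) -> set:
    keep = set()
    # Keep the newest N snapshots for each retention period, e.g. {'m': 6}.
    for period, count in keep_per_period.items():
        for snap in sorted(candidates, reverse=True)[:count]:
            keep.add(snap)
    return candidates - keep

# The 5 snapshots from the log above, identified by their timestamps.
snaps = {"10_24_00", "10_25_04", "10_31_00", "10_32_01", "10_33_00"}
print(get_prune_set(snaps, {"m": 6}))  # all 5 kept under '6m': empty prune set
```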
Updated by Venky Shankar 9 months ago
Jos Collin wrote in #note-5:
It never hits here https://github.com/ceph/ceph/blob/main/src/pybind/mgr/snap_schedule/fs/schedule_client.py#L407, there's no such logging in ceph-mgr logs, which means the prune set is empty. So all those 5 snapshots are kept (in keep) and not deleted.
[...]ceph-mgr logs^ shows all the 5 snapshots are kept.
In the get_prune_set code: https://github.com/ceph/ceph/blob/main/src/pybind/mgr/snap_schedule/fs/schedule_client.py#L118, `candidates` and `keep` should be the same. Thus there's nothing to delete.
That's why the counter stays at 0 and the check_counters failed as it's not `seen` https://github.com/ceph/ceph/blob/main/qa/tasks/check_counter.py#L141.
OK, that makes sense. So, my next question would be - is that expected? Did the snapshots not exceed the retention schedule and didn't get deleted or did we hit a corner case (or worst, a bug)?
Updated by Jos Collin 9 months ago
Venky Shankar wrote in #note-6:
[...] OK, that makes sense. So, my next question would be - is that expected? Did the snapshots not exceed the retention schedule and not get deleted, or did we hit a corner case (or worse, a bug)?
https://github.com/ceph/ceph/blob/main/src/pybind/mgr/snap_schedule/fs/schedule_client.py#L404
It's likely down to the way the qa test is configured, which isn't enough to produce a prune set, provided the pybind/mgr/snap_schedule code is good.
I don't see the test https://github.com/ceph/ceph/blob/main/qa/suites/fs/workload/tasks/3-snaps/yes.yaml setting `mds_max_snaps_per_dir`. So `mds_max_snaps_per_dir` is set to 100 by default and the snapshots are retained (6m3h). But from the ceph-mgr logs above, the prune candidates are just 5, and all of them are kept.
One way to fix this is to set `mds_max_snaps_per_dir` to a smaller value, say 3, so that https://github.com/ceph/ceph/blob/main/src/pybind/mgr/snap_schedule/fs/schedule_client.py#L114 kicks in and we have some snapshots in the prune set.
Another way is to make this check https://github.com/ceph/ceph/blob/main/src/pybind/mgr/snap_schedule/fs/schedule_client.py#L110 pass and break out, so that we have some snapshots in the prune set.
Another thing I don't understand is why the retention is configured as 6m3h. It should be higher than what we need to get a prune set.
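A retention spec like 6m3h packs count/period pairs; a hypothetical parser for it might look like this (the period letters follow the snap_schedule convention visible in the ceph-mgr log above: n, m, h, d, w, M, y; this is an illustration, not the module's actual code):

```python
# Hypothetical parser for a snap_schedule retention spec such as '6m3h'
# (keep 6 minutely and 3 hourly snapshots). Not the actual module code.
import re

def parse_retention(spec: str) -> dict:
    # Each pair is a count followed by a period letter, e.g. '6m', '3h'.
    return {period: int(count)
            for count, period in re.findall(r"(\d+)([nmhdwMy])", spec)}

print(parse_retention("6m3h"))  # {'m': 6, 'h': 3}
```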
Updated by Venky Shankar 9 months ago
Jos Collin wrote in #note-7:
[...] Another thing I don't understand is why the retention is configured as 6m3h. It should be higher than what we need to get a prune set.
Right. But doesn't the sleep in the yaml ensure that at least one of the snapshots gets deleted?
Updated by Jos Collin 9 months ago
Venky Shankar wrote in #note-8:
[...] Right. But doesn't the sleep in the yaml ensure that at least one of the snapshots gets deleted?
It's just a sleep, and the check_counter still failed. I don't understand how it ensures the snaps get deleted. Maybe it just waits?
I think mds_max_snaps_per_dir should be lowered instead, so that we get a difference between the candidates and keep sets.
Updated by Venky Shankar 9 months ago
Jos Collin wrote in #note-9:
[...] It's just a sleep, and the check_counter still failed. I don't understand how it ensures the snaps get deleted. Maybe it just waits?
Yes. The intention is to wait till the configured retention time is reached (6 minutes), so that the snap_schedule plugin will delete a snapshot and that perf counter will get incremented. Maybe the check is racy, in the sense that it waits till the 6-minute mark, but snap_schedule has still not purged some snapshots, since it does the purge periodically.
I think mds_max_snaps_per_dir should be lowered instead, so that we get a difference between the candidates and keep sets.
That's 100 by default and the number of snapshots is nowhere near that, so this isn't the problem afaict.
Updated by Jos Collin 9 months ago
Venky Shankar wrote in #note-10:
[...] That's 100 by default and the number of snapshots is nowhere near that, so this isn't the problem afaict.
See this check https://github.com/ceph/ceph/blob/main/src/pybind/mgr/snap_schedule/fs/schedule_client.py#L114
Updated by Venky Shankar 9 months ago
Jos Collin wrote in #note-11:
[...] See this check https://github.com/ceph/ceph/blob/main/src/pybind/mgr/snap_schedule/fs/schedule_client.py#L114
Right. And next to that is `candidates - keep`, which is the set to prune. In the failed test, the entries in `keep` are the same as the entries in `candidates`, so the prune set is empty. Reducing `mds_max_snaps_per_dir` would force the snap_schedule plugin to remove snapshots. But the sleep induced in the test (via the yaml) should have made the oldest snapshot a candidate for pruning. ISTM that it's a racy check, isn't it?
Updated by Jos Collin 9 months ago
Venky Shankar wrote in #note-12:
Jos Collin wrote in #note-11:
Venky Shankar wrote in #note-10:
Jos Collin wrote in #note-9:
Venky Shankar wrote in #note-8:
Jos Collin wrote in #note-7:
Venky Shankar wrote in #note-6:
Jos Collin wrote in #note-5:
It never hits here https://github.com/ceph/ceph/blob/main/src/pybind/mgr/snap_schedule/fs/schedule_client.py#L407, there's no such logging in ceph-mgr logs, which means the prune set is empty. So all those 5 snapshots are kept (in keep) and not deleted.
[...]ceph-mgr logs^ shows all the 5 snapshots are kept.
In the get_prune_set code: https://github.com/ceph/ceph/blob/main/src/pybind/mgr/snap_schedule/fs/schedule_client.py#L118, `candidates` and `keep` should be the same. Thus there's nothing to delete.
That's why the counter stays at 0 and the check_counters failed as it's not `seen` https://github.com/ceph/ceph/blob/main/qa/tasks/check_counter.py#L141.OK, that makes sense. So, my next question would be - is that expected? Did the snapshots not exceed the retention schedule and didn't get deleted or did we hit a corner case (or worst, a bug)?
https://github.com/ceph/ceph/blob/main/src/pybind/mgr/snap_schedule/fs/schedule_client.py#L404
It should be the way the qa test is configured, which is not enough to get a prune set, provided pybind/mgr/snap_schedule code is good.
I don't see the test https://github.com/ceph/ceph/blob/main/qa/suites/fs/workload/tasks/3-snaps/yes.yaml is setting `mds_max_snaps_per_dir`. So `mds_max_snaps_per_dir" is set to 100 by default and they are retained (6m3h). But from the ceph-mgr logs above, the prune candidates are just 5, which are kept entirely.
One way to fix this is to set mds_max_snaps_per_dir to a smaller value say 3, so that https://github.com/ceph/ceph/blob/main/src/pybind/mgr/snap_schedule/fs/schedule_client.py#L114 works and we have some snapshots in the prune set.
Another way is to get this check https://github.com/ceph/ceph/blob/main/src/pybind/mgr/snap_schedule/fs/schedule_client.py#L110 passed and breaks out, so that we have some snapshots in the prune set.Another thing I don't understand is: why retention is configured 6m3h. It should be higher than what we need to get a prune set.
Right. But doesn't the sleep in the yaml ensure that at least one of the snapshot gets deleted?
It's just a sleep and the check_counter still failed. I don't understand how does it ensure the snaps get deleted. May be it just waits?
Yes. The intention is to wait till the confirmed retention time is reached (6 minutes), so that snap_schedule plugin will delete and snap and that perf counter will get incremented. Maybe the check is racy, in the sense, it waits till the 6 minute mark, but snap schedule has still not purged some snapshots since it does the purge periodically.
I think mds_max_snaps_per_dir should be lowered instead, so that we get a difference between the candidates and keep sets.
That's by default 100 and the number of snapshots is nowhere near that, so this isn't the problem afaict.
See this check https://github.com/ceph/ceph/blob/main/src/pybind/mgr/snap_schedule/fs/schedule_client.py#L114
Right. And next to that is `candidates - keep` which is the set to prune. In the failed test, entries in `keep` is same as the entries in `candidate`, so the prune set is empty. Reducing `mds_max_snaps_per_dir` is forcing the snap schedule plugin to remove snapshots. But, the sleep induced in the test (via the yaml) should have considered the oldest snapshot to be a candidate for pruning. ISTM, that its a racy check, isn't it?
Yeah, the sleep in the yaml could be racy. Instead, it should ensure that the oldest snapshot is considered for deletion.
But how could that be done? Is there a command the yaml could use to delete the old snapshots?
Updated by Venky Shankar 9 months ago
Jos Collin wrote in #note-13:
[...] But how could that be done? Is there a command the yaml could use to delete the old snapshots?
You could add a delta to that sleep to ensure that snap_schedule gets enough time to purge the oldest snapshots. Do we know how often snap_schedule checks for snapshots to purge?
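The suggested delta could be sketched like this (all numbers are assumptions for illustration; the actual purge period of snap_schedule isn't established in this thread):

```python
# Sketch of the suggested fix, with assumed numbers: pad the qa sleep
# with a delta covering the snap_schedule purge period, so the
# check_counter verification isn't racy against the periodic purge.
RETENTION_WINDOW_SECS = 6 * 60   # the '6m' retention window
PURGE_PERIOD_SECS = 60           # assumed worst-case gap between prune runs
SAFETY_DELTA_SECS = 30           # extra margin

def qa_sleep_secs() -> int:
    return RETENTION_WINDOW_SECS + PURGE_PERIOD_SECS + SAFETY_DELTA_SECS

print(qa_sleep_secs())  # 450
```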
Updated by Jos Collin 9 months ago
Venky Shankar wrote in #note-14:
Jos Collin wrote in #note-13:
Venky Shankar wrote in #note-12:
Jos Collin wrote in #note-11:
Venky Shankar wrote in #note-10:
Jos Collin wrote in #note-9:
Venky Shankar wrote in #note-8:
Jos Collin wrote in #note-7:
Venky Shankar wrote in #note-6:
Jos Collin wrote in #note-5:
It never hits here https://github.com/ceph/ceph/blob/main/src/pybind/mgr/snap_schedule/fs/schedule_client.py#L407, there's no such logging in ceph-mgr logs, which means the prune set is empty. So all those 5 snapshots are kept (in keep) and not deleted.
[...]ceph-mgr logs^ shows all the 5 snapshots are kept.
In the get_prune_set code: https://github.com/ceph/ceph/blob/main/src/pybind/mgr/snap_schedule/fs/schedule_client.py#L118, `candidates` and `keep` should be the same. Thus there's nothing to delete.
That's why the counter stays at 0 and the check_counters failed as it's not `seen` https://github.com/ceph/ceph/blob/main/qa/tasks/check_counter.py#L141.OK, that makes sense. So, my next question would be - is that expected? Did the snapshots not exceed the retention schedule and didn't get deleted or did we hit a corner case (or worst, a bug)?
https://github.com/ceph/ceph/blob/main/src/pybind/mgr/snap_schedule/fs/schedule_client.py#L404
It should be the way the qa test is configured, which is not enough to get a prune set, provided pybind/mgr/snap_schedule code is good.
I don't see the test https://github.com/ceph/ceph/blob/main/qa/suites/fs/workload/tasks/3-snaps/yes.yaml is setting `mds_max_snaps_per_dir`. So `mds_max_snaps_per_dir" is set to 100 by default and they are retained (6m3h). But from the ceph-mgr logs above, the prune candidates are just 5, which are kept entirely.
One way to fix this is to set mds_max_snaps_per_dir to a smaller value say 3, so that https://github.com/ceph/ceph/blob/main/src/pybind/mgr/snap_schedule/fs/schedule_client.py#L114 works and we have some snapshots in the prune set.
Another way is to get this check https://github.com/ceph/ceph/blob/main/src/pybind/mgr/snap_schedule/fs/schedule_client.py#L110 passed and breaks out, so that we have some snapshots in the prune set.Another thing I don't understand is: why retention is configured 6m3h. It should be higher than what we need to get a prune set.
Right. But doesn't the sleep in the yaml ensure that at least one of the snapshot gets deleted?
It's just a sleep and the check_counter still failed. I don't understand how does it ensure the snaps get deleted. May be it just waits?
Yes. The intention is to wait till the confirmed retention time is reached (6 minutes), so that snap_schedule plugin will delete and snap and that perf counter will get incremented. Maybe the check is racy, in the sense, it waits till the 6 minute mark, but snap schedule has still not purged some snapshots since it does the purge periodically.
I think mds_max_snaps_per_dir should be lowered instead, so that we get a difference between the candidates and keep sets.
That's by default 100 and the number of snapshots is nowhere near that, so this isn't the problem afaict.
See this check https://github.com/ceph/ceph/blob/main/src/pybind/mgr/snap_schedule/fs/schedule_client.py#L114
Right. And next to that is `candidates - keep` which is the set to prune. In the failed test, entries in `keep` is same as the entries in `candidate`, so the prune set is empty. Reducing `mds_max_snaps_per_dir` is forcing the snap schedule plugin to remove snapshots. But, the sleep induced in the test (via the yaml) should have considered the oldest snapshot to be a candidate for pruning. ISTM, that its a racy check, isn't it?
Yeah, the sleep in the yaml could be racy. Instead it should've consider the oldest snapshot for deleting, which is ensuring.
But how that could be done? Is there a command that yaml could use to delete the old snapshots?You could add a delta to that sleep to ensure that snap schedule gets enough time to purge oldest snapshots. Do we know how often does the snap schedule checks for snapshots to purge?
Not sure. It's queried from the 'schedules' table. I'll add a delta to increase the sleep.
Updated by Venky Shankar 8 months ago
- Status changed from In Progress to Pending Backport
Updated by Upkeep Bot 8 months ago
- Copied to Backport #72084: squid: qa: The following counters failed to be set on mds daemons: {'mds_server.req_rmsnap_latency.avgcount'} added
Updated by Upkeep Bot 8 months ago
- Copied to Backport #72085: tentacle: qa: The following counters failed to be set on mds daemons: {'mds_server.req_rmsnap_latency.avgcount'} added
Updated by Upkeep Bot 8 months ago
- Merge Commit set to 9aedc5178a51e9a425824a8901fd7cf5a21fc0f6
- Fixed In set to v20.3.0-1548-g9aedc5178a5
- Upkeep Timestamp set to 2025-07-11T08:11:50+00:00
Updated by Upkeep Bot 8 months ago
- Fixed In changed from v20.3.0-1548-g9aedc5178a5 to v20.3.0-1548-g9aedc5178a
- Upkeep Timestamp changed from 2025-07-11T08:11:50+00:00 to 2025-07-14T20:44:18+00:00
Updated by Jos Collin 8 months ago
- Status changed from Pending Backport to Resolved