Bug #71631
Commands using Mgr Modules fail if run immediately post a mgr failover/ restart
% Done:
0%
Source:
Backport:
squid,tentacle
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Tags (freeform):
backport_processed
Merge Commit:
Fixed In:
v20.3.0-6266-gbbfedafcf5
Released In:
Upkeep Timestamp:
2026-03-20T21:42:02+00:00
Description
Reproduced by:
$ ceph mgr fail; ceph fs volume ls
Error ENOTSUP: Warning: due to ceph-mgr restart, some PG states may not be up to date
Module 'volumes' is not enabled/loaded (required by command 'fs volume ls'): use `ceph mgr module enable volumes` to enable it
Related BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2314146
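Until a fix is in place, a caller can tolerate the window shown in the reproducer by retrying while the mgr still reports the module as not loaded. The following is only a minimal sketch of such a client-side workaround, not the change proposed for this bug: `run_with_retry`, `run_cli`, the 60-second deadline, and the 2-second interval are hypothetical names and values chosen for illustration, and the match string is copied from the error text above.

```python
import subprocess
import time

# Substring copied from the error text in the reproducer (assumption: the
# message wording is stable across the versions being tested).
NOT_LOADED = "is not enabled/loaded"


def run_cli(argv):
    """Run a command and return (returncode, stdout + stderr)."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    return proc.returncode, proc.stdout + proc.stderr


def run_with_retry(argv, run=run_cli, deadline_s=60.0, interval_s=2.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Retry argv while the output says the mgr module is not loaded.

    Success, or any failure that does not contain NOT_LOADED, is returned
    immediately. Once deadline_s has elapsed, the last result is returned.
    """
    give_up_at = clock() + deadline_s
    while True:
        rc, out = run(argv)
        if rc == 0 or NOT_LOADED not in out:
            return rc, out
        if clock() >= give_up_at:
            return rc, out
        sleep(interval_s)
```

For example, `run_with_retry(["ceph", "fs", "volume", "ls"])` would be called right after `ceph mgr fail`. A retry of this kind only covers the case where the command fails quickly; it does not help when the command hangs.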
Updated by Laura Flores 9 months ago
- Status changed from New to Fix Under Review
- Pull request ID set to 63859
Updated by Laura Flores 9 months ago
- Related to Bug #67230: mgr: should be declared available only after all python modules have been loaded added
Updated by Laura Flores 9 months ago
- Related to deleted (Bug #67230: mgr: should be declared available only after all python modules have been loaded)
Updated by Laura Flores 9 months ago
- Related to Bug #67230: mgr: should be declared available only after all python modules have been loaded added
Updated by Laura Flores 9 months ago
- Related to Bug #68657: squid: mgr/balancer preventing orchestrator and dashboard functionality added
Updated by Venky Shankar 9 months ago
- Related to Bug #70456: qa: Command failed on smithi012 with status 124: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph fs volume ls' added
Updated by Venky Shankar 9 months ago
https://pulpito.ceph.com/vshankar-2025-06-13_17:03:06-fs-wip-vshankar-testing-20250613.134551-debug-testing-default-smithi/8327080/ is likely another instance of this issue.
$ zgrep -v "client\." ./remote/smithi159/log/ceph-mgr.x.log.gz | egrep "_handle_command|ceph-mgr, pid"
...
...
...
2025-06-15T01:22:33.693+0000 7f07749cd100 0 ceph version 20.3.0-896-g1a8c963f (1a8c963f6e5d0aa68a79fd7c4ea3e0bb861d7d90) tentacle (dev - Debug), process ceph-mgr, pid 62269
2025-06-15T01:22:38.107+0000 7f07132f2640 10 mgr.server _handle_command decoded-size=4 prefix=fs subvolumegroup create
2025-06-15T01:22:38.108+0000 7f07132f2640 10 mgr.server _handle_command passing through command 'fs subvolumegroup create' size 4
2025-06-15T01:23:36.228+0000 7f07132f2640 10 mgr.server _handle_command decoded-size=7 prefix=fs snap-schedule add
2025-06-15T01:23:36.229+0000 7f07132f2640 10 mgr.server _handle_command passing through command 'fs snap-schedule add' size 7
2025-06-15T01:23:36.697+0000 7f07132f2640 10 mgr.server _handle_command decoded-size=6 prefix=fs snap-schedule retention add
2025-06-15T01:23:36.698+0000 7f07132f2640 10 mgr.server _handle_command passing through command 'fs snap-schedule retention add' size 6
2025-06-15T01:23:37.069+0000 7f07132f2640 10 mgr.server _handle_command decoded-size=7 prefix=fs snap-schedule remove
2025-06-15T01:23:37.069+0000 7f07132f2640 10 mgr.server _handle_command passing through command 'fs snap-schedule remove' size 7
2025-06-15T01:23:37.536+0000 7f07132f2640 10 mgr.server _handle_command decoded-size=5 prefix=fs subvolume getpath
2025-06-15T01:23:37.536+0000 7f07132f2640 10 mgr.server _handle_command passing through command 'fs subvolume getpath' size 5
2025-06-15T01:23:40.762+0000 7f07132f2640 10 mgr.server _handle_command decoded-size=5 prefix=fs subvolume rm
2025-06-15T01:23:40.763+0000 7f07132f2640 10 mgr.server _handle_command passing through command 'fs subvolume rm' size 5
2025-06-15T01:23:41.153+0000 7f07132f2640 10 mgr.server _handle_command decoded-size=4 prefix=fs subvolumegroup rm
2025-06-15T01:23:41.154+0000 7f07132f2640 10 mgr.server _handle_command passing through command 'fs subvolumegroup rm' size 4

2025-06-15T01:23:42.365+0000 7f16e2a0d100 0 ceph version 20.3.0-896-g1a8c963f (1a8c963f6e5d0aa68a79fd7c4ea3e0bb861d7d90) tentacle (dev - Debug), process ceph-mgr, pid 62269
2025-06-15T01:24:02.460+0000 7f16811a4640 10 mgr.server _handle_command decoded-size=3 prefix=pg dump
2025-06-15T01:24:02.842+0000 7f16811a4640 10 mgr.server _handle_command decoded-size=3 prefix=pg dump
2025-06-15T01:24:03.222+0000 7f16811a4640 10 mgr.server _handle_command decoded-size=3 prefix=pg dump
2025-06-15T01:24:04.030+0000 7f16811a4640 10 mgr.server _handle_command decoded-size=3 prefix=pg dump
2025-06-15T01:24:07.390+0000 7f16811a4640 10 mgr.server _handle_command decoded-size=3 prefix=pg dump
2025-06-15T01:24:10.816+0000 7f16811a4640 10 mgr.server _handle_command decoded-size=2 prefix=fs volume ls
2025-06-15T01:24:10.816+0000 7f16811a4640 10 mgr.server _handle_command passing through command 'fs volume ls' size 2
In this case, the `fs volume ls` command made no progress after ceph-mgr was restarted. The command timed out (120 seconds), which failed the test.
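The per-restart view produced by the zgrep pipeline above can also be built programmatically, which makes it easier to see which commands each ceph-mgr start received. This is a rough sketch under stated assumptions: `commands_per_start` is a hypothetical helper, and the two regular expressions are derived only from the log lines quoted in this comment, so other log formats may not match.

```python
import re

# Patterns derived from the log excerpt above (assumption: same log format).
START = re.compile(r"process ceph-mgr, pid (\d+)")
PREFIX = re.compile(r"_handle_command decoded-size=\d+ prefix=(.+)$")


def commands_per_start(lines):
    """Group command prefixes by ceph-mgr start line, in log order.

    Returns a list of (pid, [prefix, ...]) tuples, one per start line.
    Lines containing "client." are skipped, mirroring `zgrep -v`.
    Commands logged before the first start line are ignored.
    """
    runs = []
    for line in lines:
        line = line.rstrip("\n")
        if "client." in line:
            continue
        started = START.search(line)
        if started:
            runs.append((started.group(1), []))
            continue
        command = PREFIX.search(line)
        if command and runs:
            runs[-1][1].append(command.group(1))
    return runs
```

To run it against the compressed log, the file can be opened with `gzip.open(path, "rt")` and passed in directly. In the excerpt above, the last entry of the final group would be `fs volume ls`, the command that never completed.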
Updated by Venky Shankar 9 months ago
@Laura Flores I see that a command run just after a ceph-mgr restart could fail. However, as I mentioned in note-10, the command was blocked rather than failing. Is that also a possibility?
Updated by Laura Flores 12 days ago
- Related to Bug #71830: Upgrade tests stuck when upgrading ceph-mgr daemon added
Updated by Laura Flores 10 days ago
Commits were cleaned up, and the PR is ready for final reviews.
Updated by Laura Flores 10 days ago
- Related to Bug #75422: Rocky10 - Module 'orchestrator' is not enabled/loaded added
Updated by Laura Flores 4 days ago
- Copied to Backport #75564: tentacle: Commands using Mgr Modules fail if run immediately post a mgr failover/ restart added
Updated by Laura Flores 4 days ago
- Copied to Backport #75565: squid: Commands using Mgr Modules fail if run immediately post a mgr failover/ restart added
Updated by Upkeep Bot about 15 hours ago
- Status changed from Fix Under Review to Pending Backport
- Merge Commit set to bbfedafcf532f649edc771d5d03fcc8207b806f4
- Fixed In set to v20.3.0-6266-gbbfedafcf5
- Upkeep Timestamp set to 2026-03-20T21:42:02+00:00