mgr: declare availability only after module loading #59089
Conversation
This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.

This pull request has been automatically closed because there has been no activity for 90 days. Please feel free to reopen this pull request (or open a new one) if the proposed change is still appropriate. Thank you for your contribution!

reopening for merge

Bumping this PR up to seek reviews.

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.

@vshankar fyi - this PR helps with the …

I've replied here: #61325 (comment)

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
Force-pushed from 69bdc83 to 8c461ff
trivial rebase to take PR out of stale status
@mchangir I have been evaluating this solution against the slow-module-loading issue I've been working on here (https://tracker.ceph.com/issues/71631), and I think some pieces are missing from this solution.
PTAL at #63859, which has some similar components to your solution, but I've added some extra logic to ensure that all modules have started up before we declare availability. Let me know if you have any thoughts, suggestions, etc. I'm curious about your opinion on how this solution interacts with yours, i.e. whether these solutions are solving the same problem and one can be closed, or whether we somehow need both.
When I tested your solution, I used the following reproducer, which injects a time-consuming bottleneck into the balancer module so that its startup takes longer:
```diff
diff --git a/src/pybind/mgr/balancer/module.py b/src/pybind/mgr/balancer/module.py
index 476304275c1..d2dfa01e99a 100644
--- a/src/pybind/mgr/balancer/module.py
+++ b/src/pybind/mgr/balancer/module.py
@@ -771,6 +771,11 @@ class Module(MgrModule):
             if pg_upmap_activity:
                 self.update_pg_upmap_activity(plan)  # update pg activity in `balancer status detail`
             self.optimizing = False
+            # causing a bottleneck
+            for i in range(0, 1000):
+                for j in range(0, 1000):
+                    x = i + j
+                    self.log.debug("hitting the bottleneck in the balancer module")
             self.log.debug('Sleeping for %d', sleep_interval)
             self.event.wait(sleep_interval)
             self.event.clear()
```

After injecting the bottleneck, I ran this command on a vstart cluster, which shows that the mgr still declared availability too early with this patch:
```
$ ./bin/ceph mgr fail; ./bin/ceph telemetry show --debug_mgrc=20
2025-06-11T18:44:49.771+0000 7f2ed6ffd640 20 mgrc handle_mgr_map mgrmap(e 6)
2025-06-11T18:44:49.772+0000 7f2ed6ffd640 4 mgrc handle_mgr_map Got map version 6
2025-06-11T18:44:49.772+0000 7f2ed6ffd640 4 mgrc handle_mgr_map Active mgr is now
2025-06-11T18:44:49.772+0000 7f2ed6ffd640 4 mgrc reconnect No active mgr available yet
2025-06-11T18:44:50.117+0000 7f2ef4805640 20 mgrc start_command cmd: [{"prefix": "telemetry show", "target": ["mon-mgr", ""]}]
2025-06-11T18:44:50.117+0000 7f2ef4805640 5 mgrc start_command no mgr session (no running mgr daemon?), waiting
2025-06-11T18:44:54.116+0000 7f2ed6ffd640 20 mgrc handle_mgr_map mgrmap(e 7)
2025-06-11T18:44:54.116+0000 7f2ed6ffd640 4 mgrc handle_mgr_map Got map version 7
2025-06-11T18:44:54.116+0000 7f2ed6ffd640 4 mgrc handle_mgr_map Active mgr is now
2025-06-11T18:44:54.116+0000 7f2ed6ffd640 4 mgrc reconnect No active mgr available yet
2025-06-11T18:44:56.132+0000 7f2ed6ffd640 20 mgrc handle_mgr_map mgrmap(e 8)
2025-06-11T18:44:56.132+0000 7f2ed6ffd640 4 mgrc handle_mgr_map Got map version 8
2025-06-11T18:44:56.132+0000 7f2ed6ffd640 4 mgrc handle_mgr_map Active mgr is now [v2:127.0.0.1:6800/3720421852,v1:127.0.0.1:6801/3720421852]
2025-06-11T18:44:56.132+0000 7f2ed6ffd640 4 mgrc reconnect Starting new session with [v2:127.0.0.1:6800/3720421852,v1:127.0.0.1:6801/3720421852]
2025-06-11T18:44:56.132+0000 7f2ed6ffd640 10 mgrc reconnect resending 0 (cli)
2025-06-11T18:44:56.135+0000 7f2ed6ffd640 20 mgrc handle_command_reply tid 0 r -95
Error ENOTSUP: Warning: due to ceph-mgr restart, some PG states may not be up to date
Module 'telemetry' is not enabled/loaded (required by command 'telemetry show'): use `ceph mgr module enable telemetry` to enable it
```
In the mgr log, we can see that `mgr send_beacon active` occurred before the telemetry module had finished loading, which resulted in the ENOTSUP error:
```
$ cat out/mgr.x.log | grep "send_beacon active\|I am now activating\|telemetry show"
2025-06-11T18:46:11.817+0000 7f8f5fc20200 20 mgr[py] loaded command telemetry show name=channels,req=false,n=N,type=CephString
2025-06-11T18:46:11.817+0000 7f8f5fc20200 20 mgr[py] loaded command telemetry show-device
2025-06-11T18:46:11.817+0000 7f8f5fc20200 20 mgr[py] loaded command telemetry show-all
2025-06-11T18:46:15.905+0000 7f8f5adb8640 1 mgr handle_mgr_map I am now activating
2025-06-11T18:46:17.845+0000 7f8f57db2640 20 mgr send_beacon active
2025-06-11T18:46:17.921+0000 7f8cbf60d640 1 -- [v2:127.0.0.1:6800/976509187,v1:127.0.0.1:6801/976509187] <== client.4402 127.0.0.1:0/273425155 1 ==== mgr_command(tid 0: {"prefix": "telemetry show", "target": ["mon-mgr", ""]}) ==== 79+0+0 (secure 0 0 0) 0x55cd1423a820 con 0x55cd14711400
2025-06-11T18:46:17.921+0000 7f8cbf60d640 10 mgr.server _handle_command decoded-size=2 prefix=telemetry show
2025-06-11T18:46:17.922+0000 7f8cbf60d640 20 is_capable service=py module=telemetry command=telemetry show read addr - on cap allow *
2025-06-11T18:46:17.922+0000 7f8cbf60d640 0 log_channel(audit) log [DBG] : from='client.4402 -' entity='client.admin' cmd=[{"prefix": "telemetry show", "target": ["mon-mgr", ""]}]: dispatch
Module 'telemetry' is not enabled/loaded (required by command 'telemetry show'): use `ceph mgr module enable telemetry` to enable it
Module 'telemetry' is not enabled/loaded (required by command 'telemetry show'): use `ceph mgr module enable telemetry` to enable it
Module 'telemetry' is not enabled/loaded (required by command 'telemetry show'): use `ceph mgr module enable telemetry` to enable it) -- 0x55cd14311380 con 0x55cd14711400
2025-06-11T18:46:18.503+0000 7f8f58db4640 10 log_client will send 2025-06-11T18:46:17.923865+0000 mgr.x (mgr.4396) 1 : audit [DBG] from='client.4402 -' entity='client.admin' cmd=[{"prefix": "telemetry show", "target": ["mon-mgr", ""]}]: dispatch
2025-06-11T18:46:19.847+0000 7f8f57db2640 20 mgr send_beacon active
2025-06-11T18:46:21.847+0000 7f8f57db2640 20 mgr send_beacon active
```
In my patch, I adjusted the solution so the mgr declares availability only after `pending_modules` is empty. Here, the mgr waits to declare availability properly, and the command succeeds:
```
$ ./bin/ceph mgr fail; ./bin/ceph telemetry show --debug_mgrc=20
2025-06-11T18:48:32.969+0000 7f67277fe640 20 mgrc handle_mgr_map mgrmap(e 19)
2025-06-11T18:48:32.969+0000 7f67277fe640 4 mgrc handle_mgr_map Got map version 19
2025-06-11T18:48:32.969+0000 7f67277fe640 4 mgrc handle_mgr_map Active mgr is now
2025-06-11T18:48:32.969+0000 7f67277fe640 4 mgrc reconnect No active mgr available yet
2025-06-11T18:48:33.317+0000 7f674d6b9640 20 mgrc start_command cmd: [{"prefix": "telemetry show", "target": ["mon-mgr", ""]}]
2025-06-11T18:48:33.317+0000 7f674d6b9640 5 mgrc start_command no mgr session (no running mgr daemon?), waiting
2025-06-11T18:48:37.368+0000 7f67277fe640 20 mgrc handle_mgr_map mgrmap(e 20)
2025-06-11T18:48:37.368+0000 7f67277fe640 4 mgrc handle_mgr_map Got map version 20
2025-06-11T18:48:37.368+0000 7f67277fe640 4 mgrc handle_mgr_map Active mgr is now
2025-06-11T18:48:37.368+0000 7f67277fe640 4 mgrc reconnect No active mgr available yet
2025-06-11T18:48:39.378+0000 7f67277fe640 20 mgrc handle_mgr_map mgrmap(e 21)
2025-06-11T18:48:39.378+0000 7f67277fe640 4 mgrc handle_mgr_map Got map version 21
2025-06-11T18:48:39.378+0000 7f67277fe640 4 mgrc handle_mgr_map Active mgr is now
2025-06-11T18:48:39.378+0000 7f67277fe640 4 mgrc reconnect No active mgr available yet
2025-06-11T18:48:50.818+0000 7f67277fe640 20 mgrc handle_mgr_map mgrmap(e 22)
2025-06-11T18:48:50.818+0000 7f67277fe640 4 mgrc handle_mgr_map Got map version 22
2025-06-11T18:48:50.818+0000 7f67277fe640 4 mgrc handle_mgr_map Active mgr is now [v2:127.0.0.1:6800/3826094627,v1:127.0.0.1:6801/3826094627]
2025-06-11T18:48:50.818+0000 7f67277fe640 4 mgrc reconnect Starting new session with [v2:127.0.0.1:6800/3826094627,v1:127.0.0.1:6801/3826094627]
2025-06-11T18:48:50.818+0000 7f67277fe640 10 mgrc reconnect resending 0 (cli)
2025-06-11T18:48:50.821+0000 7f67277fe640 20 mgrc handle_command_reply tid 0 r 0
Telemetry is off. Please consider opting-in with `ceph telemetry on`.
Preview sample reports with `ceph telemetry preview`.
```
In the mgr log, we can see that the mgr sends a "starting" beacon to indicate that it is still starting, and it only sends the plain "active" beacon after `pending_modules` is empty. By then, the command is supported:
```
$ cat out/mgr.x.log | grep "send_beacon active\|I am now activating\|telemetry show"
2025-06-11T18:48:33.278+0000 7f8f6f753200 20 mgr[py] loaded command telemetry show name=channels,req=false,n=N,type=CephString
2025-06-11T18:48:33.278+0000 7f8f6f753200 20 mgr[py] loaded command telemetry show-device
2025-06-11T18:48:33.278+0000 7f8f6f753200 20 mgr[py] loaded command telemetry show-all
2025-06-11T18:48:37.368+0000 7f8f6aed3640 1 mgr handle_mgr_map I am now activating
2025-06-11T18:48:39.312+0000 7f8f67ecd640 20 mgr send_beacon active (starting)
2025-06-11T18:48:41.313+0000 7f8f67ecd640 20 mgr send_beacon active (starting)
2025-06-11T18:48:43.314+0000 7f8f67ecd640 20 mgr send_beacon active (starting)
2025-06-11T18:48:45.315+0000 7f8f67ecd640 20 mgr send_beacon active (starting)
2025-06-11T18:48:47.316+0000 7f8f67ecd640 20 mgr send_beacon active (starting)
2025-06-11T18:48:49.316+0000 7f8f67ecd640 20 mgr send_beacon active (starting)
2025-06-11T18:48:50.756+0000 7f8ccff3c640 20 mgr send_beacon active
2025-06-11T18:48:50.819+0000 7f8ccf73b640 1 -- [v2:127.0.0.1:6800/3826094627,v1:127.0.0.1:6801/3826094627] <== client.4469 127.0.0.1:0/3847426830 1 ==== mgr_command(tid 0: {"prefix": "telemetry show", "target": ["mon-mgr", ""]}) ==== 79+0+0 (secure 0 0 0) 0x5645b30ecb60 con 0x5645b3936400
2025-06-11T18:48:50.819+0000 7f8ccf73b640 10 mgr.server _handle_command decoded-size=2 prefix=telemetry show
2025-06-11T18:48:50.821+0000 7f8ccf73b640 20 is_capable service=py module=telemetry command=telemetry show read addr - on cap allow *
2025-06-11T18:48:50.821+0000 7f8ccf73b640 0 log_channel(audit) log [DBG] : from='client.4469 -' entity='client.admin' cmd=[{"prefix": "telemetry show", "target": ["mon-mgr", ""]}]: dispatch
2025-06-11T18:48:50.821+0000 7f8ccf73b640 10 mgr.server _handle_command passing through command 'telemetry show' size 2
2025-06-11T18:48:50.821+0000 7f8cba611640 10 mgr.server operator() dispatching command 'telemetry show' size 2
2025-06-11T18:48:50.964+0000 7f8f68ecf640 10 log_client will send 2025-06-11T18:48:50.822638+0000 mgr.x (mgr.4461) 1 : audit [DBG] from='client.4469 -' entity='client.admin' cmd=[{"prefix": "telemetry show", "target": ["mon-mgr", ""]}]: dispatch
2025-06-11T18:48:51.317+0000 7f8f67ecd640 20 mgr send_beacon active
2025-06-11T18:48:51.821+0000 7f8f6aed3640 10 log_client logged 2025-06-11T18:48:50.822638+0000 mgr.x (mgr.4461) 1 : audit [DBG] from='client.4469 -' entity='client.admin' cmd=[{"prefix": "telemetry show", "target": ["mon-mgr", ""]}]: dispatch
2025-06-11T18:48:53.318+0000 7f8f67ecd640 20 mgr send_beacon active
```
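The gating behavior visible in the beacons above (report "active (starting)" until the set of pending modules drains, then plain "active") can be sketched in Python. This is a toy model with illustrative names only; the real ceph-mgr beacon logic is implemented in C++:

```python
import threading

class MgrAvailability:
    """Toy model: declare plain 'active' only once every module has
    finished loading. All names here are illustrative, not the actual
    ceph-mgr API."""

    def __init__(self, modules):
        self.pending_modules = set(modules)  # modules still starting up
        self.lock = threading.Lock()

    def module_finished_loading(self, name):
        # Called (conceptually) when a module's startup completes.
        with self.lock:
            self.pending_modules.discard(name)

    def beacon_state(self):
        # Mirrors the log lines above: "active (starting)" while any
        # module is still pending, plain "active" afterwards.
        with self.lock:
            return "active" if not self.pending_modules else "active (starting)"

mgr = MgrAvailability({"balancer", "telemetry"})
print(mgr.beacon_state())                 # active (starting)
mgr.module_finished_loading("balancer")
print(mgr.beacon_state())                 # active (starting)
mgr.module_finished_loading("telemetry")
print(mgr.beacon_state())                 # active
```

With this gating, a client command like `telemetry show` only ever reaches a mgr whose command table already includes the module's commands.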
@ljflores Thanks for validating my PR. From what I can remember, I came across a situation where a module may fail to load for some reason, and we have no guarantee over that. A module may also be deadlocked, but I guess we haven't seen something like that so far. So, we could drop my PR in favor of yours.
@ljflores I'm also not aware of the sync/async nature of module loading. I'd presumed that module loading is synchronous, with the list of active modules being iterated over during loading, i.e. the modules are loaded one after the other, each only after the result of loading the previous one is known to the loader.
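The sync/async distinction matters here: if each module's startup runs on its own thread, the loader loop returning says nothing about whether slow modules are ready. A hedged sketch (hypothetical code, not the actual mgr implementation) of why the synchronous assumption can fail:

```python
import threading

def load_modules_concurrently(module_inits):
    """Hypothetical sketch: start each module on its own thread. The
    loader loop finishes immediately, while slow modules keep starting
    in the background."""
    ready = set()
    threads = []
    for name, startup in module_inits.items():
        t = threading.Thread(target=lambda n=name, s=startup: (s(), ready.add(n)))
        t.start()
        threads.append(t)
    return ready, threads

# An Event stands in for a slow startup (e.g. the balancer bottleneck above).
gate = threading.Event()
inits = {"fast": lambda: None, "slow": gate.wait}
ready, threads = load_modules_concurrently(inits)

threads[0].join()           # the fast module is done
print("slow" in ready)      # False: loader loop returned, slow module still starting
gate.set()                  # slow startup finishes
for t in threads:
    t.join()
print(sorted(ready))        # ['fast', 'slow']
```

This is why a solution that tracks per-module completion (an explicit pending set) is more robust than inferring readiness from the loader loop having iterated over all modules.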
Fixes: https://tracker.ceph.com/issues/67230
Signed-off-by: Milind Changire <mchangir@redhat.com>
Force-pushed from 8c461ff to a97cd6f
Moved …
@ljflores I have now moved setting … The only other major difference in my PR is that it doesn't confirm if … If you are up to it, you could give it a try.
This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.

This pull request has been automatically closed because there has been no activity for 90 days. Please feel free to reopen this pull request (or open a new one) if the proposed change is still appropriate. Thank you for your contribution!