mgr: add retry logic for module loading#61325
Conversation
The current "active and enabled" check order can result in an ENOTSUP being returned when a module is enabled, but needs more time to load. In this case, the command should be supported, but the module timed out loading. With the internal retry logic that is added (the mgr retries loading the module several times before giving up), ETIMEDOUT is more appropriate. The default max time that a module can take to load is now 5 seconds. To adjust this time, the following command can be run: `ceph config set mgr mgr_module_load_timeout <uint>` The default cadence at which the mgr retries is every 1 second. To adjust this time, the following command can be run: `ceph config set mgr mgr_module_load_interval <uint>` Fixes: https://tracker.ceph.com/issues/69012 Signed-off-by: Laura Flores <lflores@ibm.com> Co-authored-by: Brad Hubbard <bhubbard@redhat.com>
This commit adds a dev-only config that can inject a longer loading time into the mgr module loading sequence so we can simulate this scenario in a test. The config is 0 secs by default since we do not add any delay outside of testing scenarios. The config can be adjusted with the following command: `ceph config set mgr mgr_module_load_delay <uint>` A second dev-only config also allows you to specify which module you want to be delayed in loading time. You may change this with the following command: `ceph config set mgr mgr_module_load_delay_name <module name>` The workunit added here tests a simulated slow loading module scenario, as well as checks for any problems in the actual module loading time. Signed-off-by: Laura Flores <lflores@ibm.com>
Make check errors seem unrelated. |
|
jenkins test make check |
| << prefix << "'): retrying in " << interval << " secs (" << retries | ||
| << " retries left)." << dendl; | ||
| // Sleep for the retry interval | ||
| std::this_thread::sleep_for(std::chrono::seconds(interval)); |
There was a problem hiding this comment.
OK, so mgr starts taking care about delaying requests till the module gets initialized. Seems sane.
There was a problem hiding this comment.
Yeah, we basically give the mgr a little more time to retry loading the module. It's not a guarantee that the module loads in time, but if it doesn't load in the 5 seconds we give it, there is something bigger going on, and we will issue an ETIMEDOUT. But this will help account for slower hardware and any kind of latency in the cluster.
There was a problem hiding this comment.
This blocks the messenger thread. I don't think we can structure it this way. Note also
which also has checks for whether the module is loaded properly. That should be consolidated.
Only the mod_finisher should block waiting for the module to load and there should be a condition variable that wakes the finisher thread rather than unconditionally wait interval seconds.
|
@batrick: ? |
badone
left a comment
There was a problem hiding this comment.
LGTM but I wrote some of it of course so an additional approval would be advisable I believe :)
| echo "Testing with module load delay of 6 seconds..." | ||
| ceph config set mgr mgr_module_load_delay 6 | ||
|
|
||
| output=$(ceph mgr fail; ceph orch status 2>&1) |
|
@rzarzynski yeah, I will check with Patrick though. |
|
Going to add this in a QA batch since we have several approvals. However, if @batrick has any feedback, still feel free to add it. |
| - cluster_create | ||
| # retry every N seconds | ||
| - name: mgr_module_load_interval | ||
| type: uint |
There was a problem hiding this comment.
| type: uint | |
| type: millisecs |
Would be better. See for example mds_log_trim_upkeep_interval.
| with_legacy: true | ||
| # fail if the module fails to load in time | ||
| - name: mgr_module_load_timeout | ||
| type: uint |
| - mgr | ||
| with_legacy: true | ||
| - name: mgr_module_load_delay | ||
| type: uint |
| desc: Choose which mgr module to inject a load delay. For testing purposes only. | ||
| flags: | ||
| - runtime | ||
| with_legacy: true |
There was a problem hiding this comment.
Do you mean just for this config option, or for each new config option?
There was a problem hiding this comment.
any new config should not have with_legacy: true.
| auto& mod_name = py_command.module_name; | ||
| uint64_t interval = cct->_conf->mgr_module_load_interval; | ||
| uint64_t timeout = cct->_conf->mgr_module_load_timeout; | ||
| uint64_t retries = floor(timeout / interval); |
There was a problem hiding this comment.
floor is unnecessary for integers.
| "command '" << prefix << "')."; | ||
| dout(4) << ss.str() << dendl; | ||
| cmdctx->reply(-EOPNOTSUPP, ss); | ||
| cmdctx->reply(-ETIMEDOUT, ss); |
There was a problem hiding this comment.
| cmdctx->reply(-ETIMEDOUT, ss); | |
| cmdctx->reply(-EAGAIN, ss); |
For applications using the ceph-mgr provided CLI, we want retries prompted by returning EAGAIN. See for example:
ceph/src/mgr/PyModuleRegistry.cc
Lines 363 to 365 in 55982ea
There is also a PR relating to that: #60194
Since this code has a built-in expectation that a module may eventually load, we should tell the caller to try again later.
There was a problem hiding this comment.
Suggest also issuing a cluster log error that a module is failing to load to satisfy commands.
| << py_handler_name << "` to enable it"; | ||
| dout(4) << ss.str() << dendl; | ||
| cmdctx->reply(-EOPNOTSUPP, ss); | ||
| return true; |
There was a problem hiding this comment.
It may also be good to issue a cluster log warning that API calls are being made to the ceph-mgr which require a module to be loaded.
| << prefix << "'): retrying in " << interval << " secs (" << retries | ||
| << " retries left)." << dendl; | ||
| // Sleep for the retry interval | ||
| std::this_thread::sleep_for(std::chrono::seconds(interval)); |
There was a problem hiding this comment.
This blocks the messenger thread. I don't think we can structure it this way. Note also
which also has checks for whether the module is loaded properly. That should be consolidated.
Only the mod_finisher should block waiting for the module to load and there should be a condition variable that wakes the finisher thread rather than unconditionally wait interval seconds.
|
@ronen-fr no, dropping it from the batch |
|
Here are my comments as per Brad's request: My PR #59089 just defers the declaration of MGR availability only after an attempt to load all active modules. FYI - My PR neither reattempts to load modules that have failed loading once nor does it add any timeout logic to declare eventual failure. |
|
This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days. |
|
This pull request has been automatically closed because there has been no activity for 90 days. Please feel free to reopen this pull request (or open a new one) if the proposed change is still appropriate. Thank you for your contribution! |
This PR accomplishes several things:
Reorder "active and enabled" checks & add retry logic for module loading
The current "active and enabled" check order can result in an ENOTSUP being
returned when a module is enabled, but needs more time to load. In this case,
the command should be supported, but the module timed out loading. We added
internal retry logic to the mgr, so now the mgr retries loading the module several
times before giving up. With that implemented, ETIMEDOUT is more appropriate.
The default max time that a module can take to load is now 5 seconds. To
adjust this time, the following command can be run:
ceph config set mgr mgr_module_load_timeout <uint>The default cadence at which the mgr retries is every 1 second. To
adjust this time, the following command can be run:
ceph config set mgr mgr_module_load_interval <uint>Add test coverage for slow loading module
To test the above modifications, this PR adds a dev-only config
that can inject a longer loading time into the mgr module loading
sequence so we can simulate this scenario in a test.
The config is 0 secs by default since we do not add any delay
outside of testing scenarios. The config can be adjusted
with the following command:
ceph config set mgr mgr_module_load_delay <uint>A second dev-only config also allows you to specify which
module you want to be delayed in loading time. You may change
this with the following command:
ceph config set mgr mgr_module_load_delay_name <module name>The workunit added here tests a simulated slow loading module
scenario, as well as checks for any problems in the actual
module loading time.
Latest teuthology results for the new workunit are green:
https://pulpito.ceph.com/lflores-2025-01-10_16:53:51-rados:mgr-wip-_handle_command-returns-ENOTSUP-prematurely-distro-default-smithi/
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an
xbetween the brackets:[x]. Spaces and capitalization matter when checking off items this way.Checklist
Show available Jenkins commands
jenkins retest this pleasejenkins test classic perfjenkins test crimson perfjenkins test signedjenkins test make checkjenkins test make check arm64jenkins test submodulesjenkins test dashboardjenkins test dashboard cephadmjenkins test apijenkins test docsjenkins render docsjenkins test ceph-volume alljenkins test ceph-volume toxjenkins test windowsjenkins test rook e2e