pybind/mgr/mgr_module: isolate logging per mgr module#67327
pybind/mgr/mgr_module: isolate logging per mgr module#67327NitzanMordhai wants to merge 3 commits intoceph:mainfrom
Conversation
d9c9e68 to
8583df5
Compare
There was a problem hiding this comment.
The larger problem is that there's a ton of code across multiple modules that doesn't actually use MgrModule.getLogger() to get a logger instance. They directly call logger.getLogger(name) to get a logger named based on the file in which they are defined. This worked before because each module init set a filter on the root logger which in turn applied to every logger.
To make matters worse, those callers don't necessarily have references to the module itself, so there's really no way to call MgrModule.getLogger. I think the solution needs to be generating a logger attached to each MgrModule subclass (type, not instance), which can then be referenced at import time in the other files in the same package.
See below, this should work.
src/pybind/mgr/mgr_module.py
Outdated
| self._root_logger.addHandler(self._mgr_log_handler) | ||
| root = logging.getLogger() | ||
| if not any(isinstance(h, CPlusPlusHandler) for h in root.handlers): | ||
| root.addHandler(CPlusPlusHandler(self)) |
There was a problem hiding this comment.
This will cause anything logged to the root logger to be formatted with the name of the first module to start up.
There was a problem hiding this comment.
@athanatos, now I'm using MgrRootHandler, so the first module name won't show. It will show instead 'mgr' at the beginning, for example:
2026-02-15T12:29:35.770+0000 7eff29af1640 0 [prometheus ERROR prometheus] Failed to get SMB metadata: No orchestrator configured (try `ceph orch set backend`)
2026-02-15T12:29:35.770+0000 7eff29af1640 0 [prometheus ERROR prometheus] Failed to collect cephadm daemon status: No orchestrator configured (try `ceph orch set backend`)
2026-02-15T12:29:35.781+0000 7eff1820e640 0 [mgr INFO mgr] scanning for idle connections..
2026-02-15T12:29:35.781+0000 7eff1820e640 0 [mgr INFO mgr] cleaning up connections: []
2026-02-15T12:29:35.781+0000 7eff17a0d640 0 [mgr INFO mgr] scanning for idle connections..
src/pybind/mgr/mgr_module.py
Outdated
| def getLogger(self, name: Optional[str] = None) -> logging.Logger: | ||
| if name is None: | ||
| return logging.getLogger(self.module_name) | ||
| return logging.getLogger(name) |
There was a problem hiding this comment.
Here, if name is not None, getLogger(name) will return a logger nested under the root logger, but not necessarily under this module's logger. I'd suggest returning either self._module_logger or self._module_logger.getChild(name).
|
@athanatos i'm using propagate=false and logging.getLogger(name) should propagate to the correct logger that was created during _configure_logging for the specific module. |
Oh! |
8583df5 to
fd8b2b7
Compare
|
jenkins test make check |
|
jenkins test make check arm64 |
|
@NitzanMordhai i was testing the latest image and i seee which is wrong. its a dashboard log and not a DEBUG one either (probably INFO) but its coming as mgr.. |
We are going to have an issue with getlogger like: Since we are now depending on a logger per module, we need to know from which module that logger was requested, and we didn't specify any, so we will route to the root logger (which is mgr logger) |
fd8b2b7 to
ac02bbb
Compare
we do like |
|
@nizamial09 since we are using propagate=true, you can add the module name to the beginning of the name, it will then route to the correct logger https://docs.python.org/3/library/logging.html#logging.Logger.propagate |
|
okay, so you mean something like |
name already have dashboard if i understand correctly, but if you want to keep: then something like: |
After PR ceph#66244, all mgr modules run inside the same Python interpreter. That means they also share the same logging subsystem. Previously, each module attached its handlers to the root logger. In practice, whichever module initialized logging last effectively “owned” the root logger, and log messages from other modules could end up attributed incorrectly. This change scopes logging per module. Each module now registers its handlers on a dedicated logger named after the module itself, with propagate=False to avoid leaking messages into the root logger. Now, the getLogger() default (no args) returns the module's named logger rather than the root logger. This ensures self.log routes correctly. Fixes: https://tracker.ceph.com/issues/74848 Signed-off-by: Nitzan Mordechai <nmordech@ibm.com>
Fixes: https://tracker.ceph.com/issues/74848 Signed-off-by: Nitzan Mordechai <nmordech@ibm.com>
Previously, using a hard-coded logger name like 'rgw_client' created a top-level logger that bypassed the 'mgr.dashboard' hierarchy. By switching to __name__, we ensure the logger identity follows the package structure (e.g., 'mgr.dashboard.services.rgw_client'). Since propagate=True is enabled, this allows log records to flow upward through the 'mgr' parent loggers, ensuring they are correctly captured, formatted, and attributed to the dashboard module rather than falling back to the root logger. Fixes: https://tracker.ceph.com/issues/74848 Signed-off-by: Nitzan Mordechai <nmordech@ibm.com>
ac02bbb to
8c0f0d5
Compare
|
@nizamial09 please check the new commit for changing the dashboard logging |
nizamial09
left a comment
There was a problem hiding this comment.
Woah! I honestly thought we were using __name__ across the dashboard getLogger() but i was wrong then. Thanks for taking care of that @NitzanMordhai
|
This PR is under test in https://tracker.ceph.com/issues/75377. |
After PR #66244, all mgr modules run inside the same Python interpreter.
That means they also share the same logging subsystem.
Previously, each module attached its handlers to the root logger. In practice,
whichever module initialized logging last effectively “owned” the root logger,
and log messages from other modules could end up attributed incorrectly.
This change scopes logging per module. Each module now registers its handlers
on a dedicated logger named after the module itself, with propagate=False to avoid
leaking messages into the root logger.
Now, the getLogger() default (no args) returns the module's named logger
rather than the root logger. This ensures self.log routes correctly.
Fixes: https://tracker.ceph.com/issues/74848
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an
xbetween the brackets:[x]. Spaces and capitalization matter when checking off items this way.Checklist
Show available Jenkins commands
jenkins test classic perfJenkins Job | Jenkins Job Definitionjenkins test crimson perfJenkins Job | Jenkins Job Definitionjenkins test signedJenkins Job | Jenkins Job Definitionjenkins test make checkJenkins Job | Jenkins Job Definitionjenkins test make check arm64Jenkins Job | Jenkins Job Definitionjenkins test submodulesJenkins Job | Jenkins Job Definitionjenkins test dashboardJenkins Job | Jenkins Job Definitionjenkins test dashboard cephadmJenkins Job | Jenkins Job Definitionjenkins test apiJenkins Job | Jenkins Job Definitionjenkins test docsReadTheDocs | Github Workflow Definitionjenkins test ceph-volume allJenkins Jobs | Jenkins Jobs Definitionjenkins test windowsJenkins Job | Jenkins Job Definitionjenkins test rook e2eJenkins Job | Jenkins Job DefinitionYou must only issue one Jenkins command per-comment. Jenkins does not understand
comments with more than one command.