Bug #74643
open
cherrypy.process.wspbus.ChannelFailures: TypeError('certfile should be a valid filesystem path')
Description
/a/nmordech-2026-01-28_16:11:20-rados-wip-rocky10-branch-of-the-day-2026-01-23-1769128778-distro-default-trial/23483/teuthology.log
2026-01-28T17:59:12.296 INFO:journalctl@ceph.mgr.y.trial152.stdout:Jan 28 17:59:11 trial152 bash[67743]: debug 2026-01-28T17:59:11.930+0000 7f6f70d48640 -1 log_channel(cluster) log [ERR] : Unhandled exception from module 'dashboard' while running on mgr.y: TypeError('certfile should be a valid filesystem path')
2026-01-28T17:59:12.296 INFO:journalctl@ceph.mgr.y.trial152.stdout:Jan 28 17:59:11 trial152 bash[67743]: debug 2026-01-28T17:59:11.930+0000 7f6f70d48640 -1 dashboard.serve:
2026-01-28T17:59:12.296 INFO:journalctl@ceph.mgr.y.trial152.stdout:Jan 28 17:59:11 trial152 bash[67743]: debug 2026-01-28T17:59:11.930+0000 7f6f70d48640 -1 Traceback (most recent call last):
2026-01-28T17:59:12.296 INFO:journalctl@ceph.mgr.y.trial152.stdout:Jan 28 17:59:11 trial152 bash[67743]: File "/usr/share/ceph/mgr/dashboard/module.py", line 367, in serve
2026-01-28T17:59:12.296 INFO:journalctl@ceph.mgr.y.trial152.stdout:Jan 28 17:59:11 trial152 bash[67743]: cherrypy.engine.start()
2026-01-28T17:59:12.296 INFO:journalctl@ceph.mgr.y.trial152.stdout:Jan 28 17:59:11 trial152 bash[67743]: File "/lib/python3.9/site-packages/cherrypy/process/wspbus.py", line 283, in start
2026-01-28T17:59:12.296 INFO:journalctl@ceph.mgr.y.trial152.stdout:Jan 28 17:59:11 trial152 bash[67743]: raise e_info
2026-01-28T17:59:12.296 INFO:journalctl@ceph.mgr.y.trial152.stdout:Jan 28 17:59:11 trial152 bash[67743]: File "/lib/python3.9/site-packages/cherrypy/process/wspbus.py", line 268, in start
2026-01-28T17:59:12.296 INFO:journalctl@ceph.mgr.y.trial152.stdout:Jan 28 17:59:11 trial152 bash[67743]: self.publish('start')
2026-01-28T17:59:12.296 INFO:journalctl@ceph.mgr.y.trial152.stdout:Jan 28 17:59:11 trial152 bash[67743]: File "/lib/python3.9/site-packages/cherrypy/process/wspbus.py", line 248, in publish
2026-01-28T17:59:12.296 INFO:journalctl@ceph.mgr.y.trial152.stdout:Jan 28 17:59:11 trial152 bash[67743]: raise exc
2026-01-28T17:59:12.296 INFO:journalctl@ceph.mgr.y.trial152.stdout:Jan 28 17:59:11 trial152 bash[67743]: cherrypy.process.wspbus.ChannelFailures: TypeError('certfile should be a valid filesystem path')
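For reference, the message text in the traceback matches what Python's standard `ssl` module raises when `SSLContext.load_cert_chain` receives a certfile that is not a path-like value (for example `None`). The snippet below is only a minimal sketch of that behaviour in isolation. The helper name `certfile_error` is hypothetical, and whether a `None` certfile is what actually reached cherrypy in this run is an assumption, not something the log confirms.

```python
import ssl


def certfile_error(certfile):
    """Return a description of the error load_cert_chain raises, or None on success.

    Hypothetical helper for illustration; it is not part of the dashboard module.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    try:
        ctx.load_cert_chain(certfile)
    except TypeError as exc:
        # Raised when certfile is not a str, bytes or os.PathLike value.
        return f"TypeError: {exc}"
    except OSError as exc:
        # Raised for path-like values that cannot be loaded (missing or invalid file).
        # ssl.SSLError is a subclass of OSError, so it is covered here too.
        return f"{type(exc).__name__}: {exc}"
    return None


# A non-path value takes the TypeError branch.
print(certfile_error(None))
# A path that does not exist fails differently, with an OSError subclass.
print(certfile_error("/nonexistent/dashboard.crt"))
```

If this is the failure path, the certificate setting was unset or of the wrong type when the engine started, rather than pointing at a missing file.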
Updated by Nitzan Mordechai about 2 months ago
- Related to QA Run #74540: wip-rocky10-branch-of-the-day-2026-01-23-1769128778 added
Updated by Nizamudeen A about 2 months ago
I couldn't fully understand the logs in the first place tbh.
The lines below say they originate from the `volumes` module, but in fact some of them are dashboard logs, and I am not sure why they are getting logged as volumes.
2026-01-28T17:58:47.037+0000 7f49c5982640 0 [volumes INFO dashboard.module] Engine started.
2026-01-28T17:58:47.037+0000 7f49c5982640 20 mgr get_config key: mgr/dashboard/GRAFANA_UPDATE_DASHBOARDS
2026-01-28T17:58:47.037+0000 7f49c5982640 10 mgr get_typed_config GRAFANA_UPDATE_DASHBOARDS not found
2026-01-28T17:58:47.153+0000 7f49c998a640 0 [volumes DEBUG root] Cephadm agent endpoint using 7151
2026-01-28T17:58:47.189+0000 7f49b8968640 0 [volumes ERROR root] Failed to start engine: TypeError('certfile should be a valid filesystem path')
2026-01-28T17:58:47.189+0000 7f49b8968640 20 mgr ~Gil Destroying new thread state 0x25cb7560
2026-01-28T17:58:47.261+0000 7f49c998a640 0 [volumes DEBUG root] Cherrypy engine started.
2026-01-28T17:58:47.261+0000 7f49c998a640 0 [volumes DEBUG root] _kick_serve_loop
That makes me worry that some shared global state is messing things up. To make things worse, cherrypy shares a global config. cephadm and dashboard both use cherrypy, and I think both modify cherrypy's global config. Earlier each module was isolated, but now it looks like they are not, so they are interfering with each other. But I can't understand why it's interfering globally now. @Nitzan Mordechai was this run against any PRs?
One fix that would not depend on the isolation might be to start separate servers using adapters, configure each server on its own, and start cherrypy globally, as in https://docs.cherrypy.dev/en/latest/pkg/cherrypy.process.servers.html#multiple-servers-ports. But first I would like to understand the situation better.
cc: @Ernesto Puerta
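The hypothesis in the comment above can be sketched with a toy model. This is not CherryPy code: `GLOBAL_CONFIG`, `start_server`, `run_shared`, `run_isolated`, and the certificate paths are all hypothetical stand-ins, and the assumption that the second module overwrites the dashboard's certificate key is exactly that, an assumption used to show how a last-writer-wins global config could produce this TypeError while per-server configs would not.

```python
from typing import Dict, List, Optional

# Stand-in for a process-wide config shared by every module (hypothetical model).
GLOBAL_CONFIG: Dict[str, Optional[str]] = {}


def start_server(name: str, config: Dict[str, Optional[str]]) -> str:
    """Pretend to start an HTTPS server; reject a non-path certfile like the ssl module does."""
    certfile = config.get("ssl_certificate")
    if not isinstance(certfile, str):
        raise TypeError("certfile should be a valid filesystem path")
    return f"{name} serving with {certfile}"


def run_shared() -> str:
    """Both modules write to one dict, so the last writer wins."""
    GLOBAL_CONFIG.clear()
    GLOBAL_CONFIG.update({"ssl_certificate": "/tmp/dashboard.crt"})  # dashboard's setting
    GLOBAL_CONFIG.update({"ssl_certificate": None})  # assumed overwrite by another module
    try:
        return start_server("dashboard", GLOBAL_CONFIG)
    except TypeError as exc:
        return f"dashboard failed: {exc}"


def run_isolated() -> List[str]:
    """Each server keeps its own config, so neither can clobber the other."""
    dashboard_cfg = {"ssl_certificate": "/tmp/dashboard.crt"}
    agent_cfg = {"ssl_certificate": "/tmp/agent.crt"}
    return [
        start_server("dashboard", dashboard_cfg),
        start_server("agent", agent_cfg),
    ]


print(run_shared())
print(run_isolated())
```

In CherryPy terms, the shared dict plays the role of the global config, and the isolated variant corresponds to the per-server configuration described at the linked documentation page.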
Updated by Samuel Just about 2 months ago
See my comment on http://localhost:3000/issues/74543#note-19, log prefixes are messed up due to global state.
I agree that two cherrypys with a shared global configuration is likely to be the problem.
Updated by Samuel Just about 2 months ago
There might be a way to work around this issue, but I'd really prefer that mgr modules no longer rely on sub-interpreter isolation -- there are dependencies that don't work in sub-interpreters at all. I think it's worth fixing properly.
Updated by Samuel Just about 2 months ago
@Afreen Misbah @Nizamudeen A @Ernesto Puerta Is the plan for you to address the cherrypy and logging issues? I can take a look if not, but I suspect you'll have a much easier time of it. You'll want to base your work on https://github.com/ceph/ceph/pull/66467 (CLICommand refactor) and https://github.com/ceph/ceph/pull/66244 (Run modules in a single interpreter). Please let me know either way, it's a blocker for rocky10. @Yaarit Hatuka
Updated by Nizamudeen A about 2 months ago
@Samuel Just Yup, we are having a look at the issue and will try to resolve it. I was trying to reproduce it using the image that we get from the shaman runs, but somehow it works without any issues. Do you have any suggestions for starting a similar environment where things break the way they do in teuthology?
Updated by Samuel Just about 2 months ago
Apologies, I really don't have much experience with the manager python module code. I'd guess the important part will be starting the cluster with cephadm to get that module actually running. Probably worth reaching out to that team?
Updated by Samuel Just about 2 months ago
To clarify, my first guess would be that you need to start up a local cluster with cephadm to get both that and the dashboard running at the same time.
Updated by Nitzan Mordechai about 1 month ago
/a/nmordech-2026-02-04_08:35:27-rados:cephadm-wip-rocky10-branch-of-the-day-2026-02-03-1770151121-distro-default-trial/34065
Updated by Yaarit Hatuka about 1 month ago
- Backport set to tentacle
- Pull request ID set to 67227
Updated by Nizamudeen A about 1 month ago
- Status changed from New to Fix Under Review
- Assignee changed from Afreen Misbah to Nizamudeen A
Updated by Yaarit Hatuka about 1 month ago
- Has duplicate Bug #74543: Rocky10 - AttributeError in dashboard module added
Updated by Nizamudeen A 21 days ago
- Related to Bug #73930: ceph-mgr modules rely on deprecated python subinterpreters added
Updated by Upkeep Bot 17 days ago
- Status changed from Fix Under Review to Pending Backport
- Merge Commit set to 805aa0b1ae57911f2f25d96d5d10a6fe5d583ce3
- Fixed In set to v20.3.0-5760-g805aa0b1ae
- Upkeep Timestamp set to 2026-03-03T07:17:04+00:00
Updated by Upkeep Bot 17 days ago
- Copied to Backport #75286: tentacle: cherrypy.process.wspbus.ChannelFailures: TypeError('certfile should be a valid filesystem path') added
Updated by Naveen Naidu 14 days ago
/a/yuriw-2026-03-02_18:34:01-rados-wip-yuri3-testing-2026-03-02-1622-distro-default-trial
2 jobs: ['76554', '76792']