mgr: isolated CherryPy to prevent global state sharing #67227
nizamial09 merged 1 commit into ceph:main
I am opening this for some early reviews. Meanwhile, I am testing this locally in different environments to make sure it works without errors. It touches some code that I am not entirely familiar with (the cephadm agent and node-proxy agent servers, along with service_discovery), but I was able to verify that they work in my local kcli environment at least. I'll refactor that code a bit more (cc: @adk3798 @rkachach). @NitzanMordhai could we trigger a teuthology run along with the other rocky10 PRs to see if this works in teuthology? (Who knows what other issues it can catch with the change 😅)
We need to include this PR in the next build of branch-of-the-day; the list of PRs needs to be updated to add it.
Pull request overview
This PR aims to isolate CherryPy instances across different Ceph manager modules to prevent global state conflicts when modules are loaded in the main interpreter. It introduces a new CherryPyMgr utility class that creates independent WSGI server instances for each module.
Changes:
- Introduces cherrypy_module.py with the CherryPyMgr class to manage isolated CherryPy server instances
- Refactors prometheus, dashboard, and cephadm modules to use the new isolated server approach
- Updates configuration handling to use application-level configs instead of global CherryPy configs
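The idea of one independent WSGI server per module can be sketched with the standard library alone. This is only an analogy under stated assumptions: it uses `wsgiref` as a stand-in for CherryPy's WSGI server, and the names `make_app`, `start_server`, `prometheus_like`, and `dashboard_like` are hypothetical, not taken from the PR.

```python
import threading
import urllib.request
from wsgiref.simple_server import WSGIRequestHandler, make_server


def make_app(body):
    """Build a tiny WSGI app that always returns the given body."""
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [body]
    return app


class QuietHandler(WSGIRequestHandler):
    def log_message(self, format, *args):
        pass  # suppress per-request logging in this sketch


def start_server(app):
    """Start one server for one app on an ephemeral loopback port."""
    server = make_server("127.0.0.1", 0, app, handler_class=QuietHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server


def fetch(server):
    """Issue a GET against the given server and return the body."""
    url = "http://127.0.0.1:%d/" % server.server_port
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read()


# Two hypothetical modules, each with its own server and its own app.
prometheus_like = start_server(make_app(b"metrics"))
dashboard_like = start_server(make_app(b"dashboard"))

metrics_body = fetch(prometheus_like)
dashboard_body = fetch(dashboard_like)

# Stopping one server leaves the other one serving.
prometheus_like.shutdown()
prometheus_like.server_close()
still_up = fetch(dashboard_like)

dashboard_like.shutdown()
dashboard_like.server_close()
```

Because neither server shares state with the other, shutting down or reconfiguring one has no effect on the second.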
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 13 comments.
| File | Description |
|---|---|
| src/pybind/mgr/cherrypy_module.py | New utility class to create isolated CherryPy WSGI server instances |
| src/pybind/mgr/prometheus/module.py | Refactored to use CherryPyMgr for isolated server instances; config moved from global to app-level |
| src/pybind/mgr/dashboard/module.py | Updated to use CherryPyMgr; config handling moved to app-level |
| src/pybind/mgr/dashboard/tools.py | Updated CORS configuration to work with app-level configs |
| src/pybind/mgr/dashboard/services/auth/auth.py | Removed global CherryPy config updates |
| src/pybind/mgr/dashboard/tests/__init__.py | Updated test setup to use isolated tree |
| src/pybind/mgr/cephadm/module.py | Added debug log message for HTTP server startup |
| src/pybind/mgr/cephadm/http_server.py | Refactored to start isolated servers for service discovery and agent |
| src/pybind/mgr/cephadm/agent.py | Removed Server inheritance; updated to work with isolated servers |
| src/pybind/mgr/cephadm/services/service_discovery.py | Removed Server inheritance; updated to work with isolated servers |
| debian/ceph-mgr.install | Added cherrypy_module to installation |
| ceph.spec.in | Added cherrypy_module to RPM spec |
Comments suppressed due to low confidence (3)
src/pybind/mgr/cephadm/http_server.py:55
- The configure method at line 52 calls self.agent.configure() without any arguments, but the agent.configure method signature was changed to require a tree parameter. This will cause a TypeError when configure() is called. The configure method should be removed or updated to pass the tree argument.
    def configure(self) -> None:
        self.agent.configure()
        self.service_discovery.configure(self.mgr.service_discovery_port,
                                         self.mgr.get_mgr_ip(),
                                         self.security_enabled)
src/pybind/mgr/dashboard/module.py:520
- Call to function configure_cors with too few arguments; should be no fewer than 1.
configure_cors()
src/pybind/mgr/dashboard/module.py:538
- Call to function configure_cors with too few arguments; should be no fewer than 1.
configure_cors()
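The three suppressed comments above describe the same failure mode: a function gained a required parameter, but an existing call site still passes nothing. A minimal sketch, using hypothetical stand-in classes rather than the real cephadm or dashboard code, shows how such a call fails at run time and one way the caller could be updated:

```python
class Agent:
    """Hypothetical stand-in for the refactored agent."""

    def __init__(self) -> None:
        self.tree = None

    def configure(self, tree: object) -> None:
        # After the refactor, the caller must supply the tree to mount on.
        self.tree = tree


class HttpServer:
    """Hypothetical stand-in for the wrapper that owns the agent."""

    def __init__(self) -> None:
        self.agent = Agent()
        self.tree = object()  # placeholder for an isolated tree

    def configure_stale(self) -> None:
        # Old call site: raises TypeError because 'tree' is missing.
        self.agent.configure()

    def configure(self) -> None:
        # Updated call site: passes the required argument.
        self.agent.configure(self.tree)


server = HttpServer()
try:
    server.configure_stale()
    stale_call_failed = False
except TypeError:
    stale_call_failed = True

server.configure()
```

The stale call only fails when it is actually executed, which is why a static reviewer flags it even if local testing never reached that path.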
athanatos left a comment
Seems broadly reasonable from my understanding. Someone more familiar with these users should give an actual review.
https://jenkins.ceph.com/job/ceph-dev-pipeline/2821/pipeline-overview/?selected-node=1087 Build failure ^
Fixed that. I'll retrigger an isolated build to make sure builds are passing. Once it's passing, I'll let you know and we can add it along with the rocky10 patches.
Pull request overview
Copilot reviewed 15 out of 15 changed files in this pull request and generated 11 comments.
Added unit tests as well.
jenkins test make check arm64
As the modules are now loaded into the main interpreter (see ceph#66244), CherryPy is hitting an issue where its global state affects all the modules, which update the CherryPy config simultaneously in the same tree.
So I am adding a CherryPyMgr which manages all the independent servers that will be created across all modules. CherryPyMgr creates its own server instances using CherryPy's WSGI server, which eliminates the global state sharing. Each module or app can create its own tree and start an adapter, which opens an independent server for that app.
- Also added a method to update the config in place, so CORS URLs can be configured without restarting servers.
Fixes: https://tracker.ceph.com/issues/74643, https://tracker.ceph.com/issues/74543, https://tracker.ceph.com/issues/74980
Signed-off-by: Nizamudeen A <nia@redhat.com>
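Updating CORS settings in place can be illustrated with a small WSGI middleware. This is not the PR's implementation and does not use CherryPy: `CorsMiddleware`, `update_origins`, and the example origins are hypothetical names chosen for the sketch. It only shows that a handler which reads its settings on every request picks up changes without a server restart.

```python
import threading


def hello_app(environ, start_response):
    """Trivial WSGI app used as the wrapped application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


class CorsMiddleware:
    """Hypothetical middleware whose allowed origins can change at run time."""

    def __init__(self, app, origins):
        self._app = app
        self._origins = set(origins)
        self._lock = threading.Lock()

    def update_origins(self, origins):
        # Swap the allowed origins in place; no server restart is involved.
        with self._lock:
            self._origins = set(origins)

    def __call__(self, environ, start_response):
        origin = environ.get("HTTP_ORIGIN")
        with self._lock:
            allowed = origin in self._origins

        def cors_start_response(status, headers, exc_info=None):
            if allowed:
                headers = headers + [("Access-Control-Allow-Origin", origin)]
            return start_response(status, headers, exc_info)

        return self._app(environ, cors_start_response)


def response_headers(app, origin):
    """Call a WSGI app directly and return its response headers as a dict."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured.update(headers)

    list(app({"HTTP_ORIGIN": origin}, start_response))
    return captured


app = CorsMiddleware(hello_app, ["https://a.example"])
before = response_headers(app, "https://b.example")
app.update_origins(["https://a.example", "https://b.example"])
after = response_headers(app, "https://b.example")
```

The second request sees the new origin list even though the application object was never rebuilt.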
@nizamial09 PTAL at this ticket that came up in the Rocky10 testing: https://tracker.ceph.com/issues/75213
Thanks @ljflores, I also saw that in a different rados suite and fixed the issue. I checked the latest rados result, which included an updated commit, and I don't see the issue there anymore.
Thanks for confirming!
As the modules are now loaded into the main interpreter (see #66244), CherryPy is hitting an issue where its global state affects all the modules, which update the CherryPy config simultaneously in the same tree.
So I am adding a CherryPyMgr which manages all the independent servers that will be created across all modules. CherryPyMgr creates its own server instances using CherryPy's WSGI server, which eliminates the global state sharing. Each module or app can create its own tree and start an adapter, which opens an independent server for that app.
Based on https://docs.cherrypy.dev/en/latest/pkg/cherrypy.process.servers.html#multiple-servers-ports
Fixes: https://tracker.ceph.com/issues/74643, https://tracker.ceph.com/issues/74543