Skip to content

mgr: isolated CherryPy to prevent global state sharing#67227

Merged
nizamial09 merged 1 commit intoceph:mainfrom
rhcs-dashboard:isolate-cherrypy
Mar 3, 2026
Merged

mgr: isolated CherryPy to prevent global state sharing#67227
nizamial09 merged 1 commit intoceph:mainfrom
rhcs-dashboard:isolate-cherrypy

Conversation

@nizamial09
Copy link
Member

@nizamial09 nizamial09 commented Feb 5, 2026

as the modules are now being loaded onto the main interpreter (see #66244), the
cherrypy is getting hit with an issue where its global state is being affecting all the modules updating the cherrypy config simultaneously in the same tree.

So i am adding a CherryPyMgr which manages all the independent servers that will be created across all modules. This CherryPyMgr will create its own server instances by utilizing cherrypy's WSGI Server and eliminates the global state sharing. Each module or app can create their own tree and start an adapter which will open an independent server for that app.

based on https://docs.cherrypy.dev/en/latest/pkg/cherrypy.process.servers.html#multiple-servers-ports

Fixes: https://tracker.ceph.com/issues/74643, https://tracker.ceph.com/issues/74543

Contribution Guidelines

  • To sign and title your commits, please refer to Submitting Patches to Ceph.

  • If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.

  • When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands

You must only issue one Jenkins command per-comment. Jenkins does not understand
comments with more than one command.

@nizamial09
Copy link
Member Author

nizamial09 commented Feb 5, 2026

I am opening this for some early reviews. Meanwhile I am testing this locally in different environments to see it works without any error.

And it touches some code that I am not entirely familiar (cephadm agent and node-proxy agent servers along with service_discovery) but I was able to verify that they were working in my local kcli environment atleast. I'll refactor that code a bit more (cc: @adk3798 @rkachach )

@NitzanMordhai could we trigger a teuthology along with the other rocky10 PRs to see if this works in teuthology (who knows what issues it can catch other issues with the change 😅

@nizamial09 nizamial09 force-pushed the isolate-cherrypy branch 2 times, most recently from e9c1f37 to e8c450d Compare February 5, 2026 11:01
@NitzanMordhai
Copy link
Contributor

@NitzanMordhai could we trigger a teuthology along with the other rocky10 PRs to see if this works in teuthology (who knows it can catch other issues with the change 😅

we need to include that PR on next build of branch-of-the-day, need to update the list of PRs and add that one

@nizamial09 nizamial09 changed the title [wip]: mgr: isolated CherryPy to prevent global state sharing [wip] mgr: isolated CherryPy to prevent global state sharing Feb 5, 2026
@epuertat epuertat requested a review from Copilot February 5, 2026 12:01
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to isolate CherryPy instances across different Ceph manager modules to prevent global state conflicts when modules are loaded in the main interpreter. It introduces a new CherryPyMgr utility class that creates independent WSGI server instances for each module.

Changes:

  • Introduces cherrypy_module.py with the CherryPyMgr class to manage isolated CherryPy server instances
  • Refactors prometheus, dashboard, and cephadm modules to use the new isolated server approach
  • Updates configuration handling to use application-level configs instead of global CherryPy configs

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 13 comments.

Show a summary per file
File Description
src/pybind/mgr/cherrypy_module.py New utility class to create isolated CherryPy WSGI server instances
src/pybind/mgr/prometheus/module.py Refactored to use CherryPyMgr for isolated server instances; config moved from global to app-level
src/pybind/mgr/dashboard/module.py Updated to use CherryPyMgr; config handling moved to app-level
src/pybind/mgr/dashboard/tools.py Updated CORS configuration to work with app-level configs
src/pybind/mgr/dashboard/services/auth/auth.py Removed global CherryPy config updates
src/pybind/mgr/dashboard/tests/init.py Updated test setup to use isolated tree
src/pybind/mgr/cephadm/module.py Added debug log message for HTTP server startup
src/pybind/mgr/cephadm/http_server.py Refactored to start isolated servers for service discovery and agent
src/pybind/mgr/cephadm/agent.py Removed Server inheritance; updated to work with isolated servers
src/pybind/mgr/cephadm/services/service_discovery.py Removed Server inheritance; updated to work with isolated servers
debian/ceph-mgr.install Added cherrypy_module to installation
ceph.spec.in Added cherrypy_module to RPM spec
Comments suppressed due to low confidence (3)

src/pybind/mgr/cephadm/http_server.py:55

  • The configure method at line 52 calls self.agent.configure() without any arguments, but the agent.configure method signature was changed to require a tree parameter. This will cause a TypeError when configure() is called. The configure method should be removed or updated to pass the tree argument.
    def configure(self) -> None:
        self.agent.configure()
        self.service_discovery.configure(self.mgr.service_discovery_port,
                                         self.mgr.get_mgr_ip(),
                                         self.security_enabled)

src/pybind/mgr/dashboard/module.py:520

        configure_cors()

src/pybind/mgr/dashboard/module.py:538

        configure_cors()

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

@athanatos athanatos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems broadly reasonable from my understanding. Someone more familiar with these users should give an actual review.

@athanatos
Copy link
Contributor

https://jenkins.ceph.com/job/ceph-dev-pipeline/2821/pipeline-overview/?selected-node=1087

libtcmalloc.so.4()(64bit) libudev.so.1()(64bit) libupb.so.24()(64bit) libz.so.1()(64bit) rtld(GNU_HASH)
Processing files: ceph-mgr-20.3.0-5168.gc852aade.el9.x86_64
error: File not found: /ceph/rpmbuild/BUILDROOT/ceph-20.3.0-5168.gc852aade.el9.x86_64/usr/share/ceph/mgr/cherrypy_module.*
RPM build errors:
    File not found: /ceph/rpmbuild/BUILDROOT/ceph-20.3.0-5168.gc852aade.el9.x86_64/usr/share/ceph/mgr/cherrypy_module.*
2026-02-05 19:13:35,046: INFO: step done: rpm failed in 00:23:30
2026-02-05 19:13:35,046: ERROR: Command failed: Command '['podman', 'run', '--name=ceph_build', '--rm', '--pids-limit=-1', '--user=0', '--env-file=/home/jenkins-build/build/workspace/ceph-dev-pipeline/.env', '--volume=/home/jenkins-build/build/workspace/ceph-dev-pipeline/dist/ceph:/ceph:Z', '-eHOMEDIR=/ceph', 'quay.ceph.io/ceph-ci/ceph-build:c852aad.wip-rocky10-branch-of-the-day-2026-02-05-1770316030-with-cherrypy-pr.centos9.x86_64.default', 'bash', '-c', "set -x; mkdir -p /ceph/rpmbuild && rpmbuild --rebuild '-D_topdir /ceph/rpmbuild' --with=sccache --without=dwz --with=tcmalloc /ceph/ceph-20.3.0-5168.gc852aade.el9.src.rpm"]' returned non-zero exit status 1.
2026-02-05 19:13:35,046: WARNING: 🚧 the command may have faild due to circumstances beyond the influence of this build script. For example: a complier error caused by a source code change. Pay careful attention to the output generated by the command before reporting this as a problem with the build-with-container.py script. 🚧
script returned exit code 1
error: File not found: /ceph/rpmbuild/BUILDROOT/ceph-20.3.0-5168.gc852aade.el9.x86_64/usr/share/ceph/mgr/cherrypy_module.*

Build failure ^

@github-actions github-actions bot added CI Continuous Integration mgr labels Feb 6, 2026
@nizamial09
Copy link
Member Author


Build failure ^

fixed that. I'll retrigger an isolated build to make sure builds are passing. once its passing I'll let you guys know and we can add it along with the rocky10 patches.

@nizamial09 nizamial09 force-pushed the isolate-cherrypy branch 3 times, most recently from a301de0 to 9c5bf62 Compare February 6, 2026 06:37
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 11 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@nizamial09 nizamial09 force-pushed the isolate-cherrypy branch 2 times, most recently from c4ae8e9 to c0f77e4 Compare February 11, 2026 15:22
@nizamial09
Copy link
Member Author

added unit tests as well.

@nizamial09
Copy link
Member Author

jenkins test make check arm64

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 8 comments.

Copy link
Contributor

@rkachach rkachach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

as the modules are now being loaded onto the main interpreter (see
ceph#66244), the
cherrypy is getting hit with an issue where its global state is being
affecting all the modules updating the cherrypy config simultaneously in
the same tree.

So i am adding a CherryPyMgr which manages all the independent servers
that will be created across all modules. This CherryPyMgr will create
its own server instances by utilizing cherrypy's WSGI Server and
eliminates the global state sharing. Each module or app can create their
own tree and start an adapter which will open an independent server for
that app.

- also added a method to update the config in place so CORS urls can be
  configured without restarting servers.

Fixes: https://tracker.ceph.com/issues/74643, https://tracker.ceph.com/issues/74543, https://tracker.ceph.com/issues/74980
Signed-off-by: Nizamudeen A <nia@redhat.com>
@ljflores
Copy link
Member

@nizamial09 PTAL at this ticket that came up in the Rocky10 testing: https://tracker.ceph.com/issues/75213

@nizamial09
Copy link
Member Author

thanks @ljflores, I also saw that in a different rados suite here and fixed that issue. I was checking the latest rados result which included an updated commit and I don't see the issue there anymore.

@nizamial09 nizamial09 merged commit 805aa0b into ceph:main Mar 3, 2026
13 of 15 checks passed
@nizamial09 nizamial09 deleted the isolate-cherrypy branch March 3, 2026 07:14
@github-project-automation github-project-automation bot moved this from Reviewer approved to Done in Ceph-Dashboard Mar 3, 2026
@ljflores
Copy link
Member

ljflores commented Mar 3, 2026

thanks @ljflores, I also saw that in a different rados suite here and fixed that issue. I was checking the latest rados result which included an updated commit and I don't see the issue there anymore.

Thanks for confirming!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

10 participants