squid: mgr: fix PyObject* refcounting in TTLCache and cleanup logic#66483
Open
NitzanMordhai wants to merge 5 commits intoceph:squidfrom
Open
squid: mgr: fix PyObject* refcounting in TTLCache and cleanup logic#66483NitzanMordhai wants to merge 5 commits intoceph:squidfrom
NitzanMordhai wants to merge 5 commits intoceph:squidfrom
Conversation
rzarzynski
reviewed
Dec 4, 2025
Contributor
rzarzynski
left a comment
There was a problem hiding this comment.
LGTM with a nit: the differences from cherry-picking need to be explicitly mentioned the commit description (Conflicts).
src/pybind/mgr/prometheus/module.py
Outdated
| import enum | ||
| from collections import namedtuple | ||
| from collections import OrderedDict | ||
| from tempfile import NamedTemporaryFile |
Contributor
There was a problem hiding this comment.
I think it's a difference from main which was including tempfile already.
64ef34f to
defd6ac
Compare
rzarzynski
approved these changes
Dec 4, 2025
| import time | ||
| import enum | ||
| from collections import namedtuple | ||
| from collections import OrderedDict |
Contributor
There was a problem hiding this comment.
Oh, so the discrepancy between main and the backport has been tackled not by adding the Conflict section but rather by squeezing the diff. Fine!
defd6ac to
c094b0c
Compare
Member
Contributor
Author
|
don't merge until #66571 merged |
Member
|
Converting to draft based on #66483 (comment) |
c094b0c to
0cea9aa
Compare
|
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
0cea9aa to
49c0001
Compare
This patch introduces several improvements to the Prometheus module:
- Introduces `HealthHistory._prune()` to drop stale and inactive health checks.
Limits the in-memory healthcheck dict to a configurable max_entries (default 1000).
TTL for stale entries is configurable via `healthcheck_history_stale_ttl` (default 3600s).
- Refactors HealthHistory.check() to use a unified iteration over known and current checks,
improving concurrency and minimizing redundant updates.
- Use cherrypy.tools.gzip instead of manual gzip.compress() for cleaner
HTTP compression with proper header handling and client negotiation.
- Introduces new module options:
- `healthcheck_history_max_entries`
- Add proper error handling for CherryPy engine startup failures
- Remove os._exit monkey patch in favor of proper exception handling
- Remove manual Content-Type header setting (CherryPy handles automatically)
Fixes: https://tracker.ceph.com/issues/68989
Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>
(cherry picked from commit be28901)
Fix incorrect reference counting and memory retention behavior in TTLCache when storing PyObject* values. Previously, TTLCache::insert did not increment the reference count, and `erase` / `clear` did not correctly decref the values, leading to use-after-free or leaks depending on usage. Changes: - Move Py_INCREF from cacheable_get_python() to TTLCache::insert() - Add `TTLCache::clear()` method for proper memory cleanup - Ensure TTLCache::get() returns a new reference - Fix misuse of std::move on c_str() in PyJSONFormatter These changes prevent both memory leaks and use-after-free errors when mgr modules use cached Python objects logic. Fixes: https://tracker.ceph.com/issues/68989 Signed-off-by: Nitzan Mordechai <nmordech@redhat.com> (cherry picked from commit 7fadf6a)
49c0001 to
f4b4236
Compare
The HealthHistory.check() method acquires the lock and then calls HealthHistory.save(), which also tries to acquire the same lock. With a regular Lock(), the same thread blocks trying to re-acquire it (deadlock). Switch to RLock to allow nested acquisition by the same thread. PR ceph#65245 added the locks. Fixes: https://tracker.ceph.com/issues/74148 Signed-off-by: Nitzan Mordechai <nmordech@ibm.com> (cherry picked from commit 26394ca)
PR ceph#65245 drop the header set for standby module, we should still have it. Fixes: https://tracker.ceph.com/issues/74149 Signed-off-by: Nitzan Mordechai <nmordech@ibm.com> (cherry picked from commit 2ef12b2)
Fixes: https://tracker.ceph.com/issues/74149 Signed-off-by: Nitzan Mordechai <nmordech@ibm.com> (cherry picked from commit e3de8c9)
f4b4236 to
2d050a7
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
backport tracker: https://tracker.ceph.com/issues/74057
backport of #65245
parent tracker: https://tracker.ceph.com/issues/68989
this backport was staged using ceph-backport.sh version 16.0.0.6848
find the latest version at https://github.com/ceph/ceph/blob/main/src/script/ceph-backport.sh