test/osd: Fix unittest_peeringstate message leaks #67544
Merged
Contributor
Author
See https://jenkins.ceph.com/job/ceph-pull-requests/175374/console for the full log.
5dacb0a to d0afe9d
The test creates OSD messages (MOSDPGLease, MOSDPGNotify2,
MBackfillReserve, etc.) via ceph::make_message but the mock
PGListener accumulates them in a messages map without clearing them
at test teardown. This caused ~100 leak reports in CI.
Fix by explicitly clearing undispatched messages in TearDown() before
destroying the listeners. This ensures all MessageRef objects are
properly released even if not all messages were dispatched during the
test.
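The teardown pattern described above can be sketched in isolation. This is a minimal illustration, not the actual patch: `FakeMessage`, `MockListener`, `PeeringFixture`, and `run_fixture_once` are hypothetical names, `std::shared_ptr` stands in for Ceph's `MessageRef`, and the fixture does not depend on GoogleTest. It assumes C++17.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a Ceph message. It counts live instances so that
// an unreleased message is observable without a sanitizer.
struct FakeMessage {
  static inline int live = 0;
  FakeMessage() { ++live; }
  ~FakeMessage() { --live; }
  FakeMessage(const FakeMessage&) = delete;
  FakeMessage& operator=(const FakeMessage&) = delete;
};

// Assumption: shared_ptr replaces the real MessageRef type for this sketch.
using FakeMessageRef = std::shared_ptr<FakeMessage>;

// Hypothetical mock listener: queues outgoing messages per target OSD and
// never dispatches them on its own.
struct MockListener {
  std::map<int, std::vector<FakeMessageRef>> messages;
  void send(int osd, FakeMessageRef m) {
    messages[osd].push_back(std::move(m));
  }
};

// Hypothetical fixture mirroring the SetUp()/TearDown() shape of a test.
struct PeeringFixture {
  std::map<int, std::unique_ptr<MockListener>> listeners;

  void SetUp() {
    for (int osd = 0; osd < 3; ++osd) {
      listeners[osd] = std::make_unique<MockListener>();
    }
  }

  void TearDown() {
    // Drop every undispatched message first, then destroy the listeners.
    for (auto& entry : listeners) {
      entry.second->messages.clear();
    }
    listeners.clear();
  }
};

// Queues three messages that are never dispatched, tears the fixture down,
// and returns the number of messages still alive afterwards.
int run_fixture_once() {
  PeeringFixture f;
  f.SetUp();
  f.listeners[0]->send(1, std::make_shared<FakeMessage>());
  f.listeners[0]->send(2, std::make_shared<FakeMessage>());
  f.listeners[1]->send(0, std::make_shared<FakeMessage>());
  assert(FakeMessage::live == 3);
  f.TearDown();
  return FakeMessage::live;
}
```

Calling `run_fixture_once()` from a test body or `main` and comparing the result with zero checks that no message outlives teardown. The live-instance counter is only a cheap substitute for what LeakSanitizer reports in CI.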
Part of the leak report:
```
==77847==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 86400 byte(s) in 48 object(s) allocated from:
#0 0x61806af7c2bd in operator new(unsigned long) (/ceph/build/bin/unittest_peeringstate+0x9092bd) (BuildId: 9a900986804eedf4e9290ec705e6444f4bf0ba94)
#1 0x61806b4bc9b9 in boost::intrusive_ptr<MOSDPGInfo2> ceph::make_message<MOSDPGInfo2, spg_t&, pg_info_t const&, unsigned int&, unsigned int&, std::optional<pg_lease_t>&, std::optional<pg_lease_ack_t>&>(spg_t&, pg_info_t const&, unsigned int&, unsigned int&, std::optional<pg_lease_t>&, std::optional<pg_lease_ack_t>&) /ceph/src/msg/Message.h:597:11
#2 0x61806b3e182a in BufferedRecoveryMessages::send_info(int, spg_t, unsigned int, unsigned int, pg_info_t const&, std::optional<pg_lease_t>, std::optional<pg_lease_ack_t>) /ceph/src/osd/PeeringState.cc:86:5
#3 0x61806b4c86c9 in PeeringCtxWrapper::send_info(int, spg_t, unsigned int, unsigned int, pg_info_t const&, std::optional<pg_lease_t>, std::optional<pg_lease_ack_t>) /ceph/src/osd/PeeringState.h:269:10
#4 0x61806b48a56c in PeeringState::ReplicaActive::react(PeeringState::ActivateCommitted const&) /ceph/src/osd/PeeringState.cc:7011:8
#5 0x61806b5f7cb6 in boost::statechart::detail::reaction_result boost::statechart::custom_reaction<PeeringState::ActivateCommitted>::react<PeeringState::ReplicaActive, boost::statechart::event_base, void const*>(PeeringState::ReplicaActive&, boost::statechart::event_base const&, void const* const&) /opt/ceph/include/boost/statechart/custom_reaction.hpp:42:15
#6 0x61806b5f7b1d in boost::statechart::detail::reaction_result boost::statechart::simple_state<PeeringState::ReplicaActive, PeeringState::Started, PeeringState::RepNotRecovering, (boost::statechart::history_mode)0>::local_react_impl_non_empty::local_react_impl<boost::mpl::list10<boost::statechart::custom_reaction<PeeringState::ActivateCommitted>, boost::statechart::custom_reaction<DeferRecovery>, boost::statechart::custom_reaction<DeferBackfill>, boost::statechart::custom_reaction<PeeringState::UnfoundRecovery>, boost::statechart::custom_reaction<PeeringState::UnfoundBackfill>, boost::statechart::custom_reaction<PeeringState::RemoteBackfillPreempted>, boost::statechart::custom_reaction<PeeringState::RemoteRecoveryPreempted>, boost::statechart::custom_reaction<RecoveryDone>, boost::statechart::transition<PeeringState::DeleteStart, PeeringState::ToDelete, boost::statechart::detail::no_context<PeeringState::DeleteStart>, &boost::statechart::detail::no_context<PeeringState::DeleteStart>::no_function(PeeringState::DeleteStart const&)>, boost::statechart::custom_reaction<MLease>>, boost::statechart::simple_state<PeeringState::ReplicaActive, PeeringState::Started, PeeringState::RepNotRecovering, (boost::statechart::history_mode)0>>(boost::statechart::simple_state<PeeringState::ReplicaActive, PeeringState::Started, PeeringState::RepNotRecovering, (boost::statechart::history_mode)0>&, boost::statechart::event_base const&, void const*) /opt/ceph/include/boost/statechart/simple_state.hpp:814:11
#7 0x61806b5f79b4 in boost::statechart::detail::reaction_result boost::statechart::simple_state<PeeringState::ReplicaActive, PeeringState::Started, PeeringState::RepNotRecovering, (boost::statechart::history_mode)0>::local_react<boost::mpl::list10<boost::statechart::custom_reaction<PeeringState::ActivateCommitted>, boost::statechart::custom_reaction<DeferRecovery>, boost::statechart::custom_reaction<DeferBackfill>, boost::statechart::custom_reaction<PeeringState::UnfoundRecovery>, boost::statechart::custom_reaction<PeeringState::UnfoundBackfill>, boost::statechart::custom_reaction<PeeringState::RemoteBackfillPreempted>, boost::statechart::custom_reaction<PeeringState::RemoteRecoveryPreempted>, boost::statechart::custom_reaction<RecoveryDone>, boost::statechart::transition<PeeringState::DeleteStart, PeeringState::ToDelete, boost::statechart::detail::no_context<PeeringState::DeleteStart>, &boost::statechart::detail::no_context<PeeringState::DeleteStart>::no_function(PeeringState::DeleteStart const&)>, boost::statechart::custom_reaction<MLease>>>(boost::statechart::event_base const&, void const*) /opt/ceph/include/boost/statechart/simple_state.hpp:850:14
```
Signed-off-by: Kefu Chai <k.chai@proxmox.com>
d0afe9d to 5a5dbc7
Contributor
Author
@bill-scales Hi Bill, would you be able to review this change? This PR fixes memory leaks in the gating tests as part of a broader effort to enable AddressSanitizer (ASan) in our CI workflow (see #56537) -- catching these issues early is a key step toward getting that enabled.
Contributor
Author
@bill-scales hi Bill, ping?
bill-scales approved these changes on Mar 18, 2026