Skip to content

Conversation

@BewareMyPower
Copy link
Contributor

@BewareMyPower BewareMyPower commented Sep 9, 2025

Motivation

See #24714 (comment)

Currently this issue rarely happens because most metrics updates happen in Netty I/O threads when handling requests. However, DataSketchesOpStatsLogger and DataSketchesSummaryLogger are public and can be used by downstream plugins (e.g. protocol handlers).

Modifications

  • Refactor the code by adding a record method to LocalData and ThreadLocalAccessor responsible for updating the metrics.
  • Then, add ThreadLocalAccessorTest to verify when the metrics are updated in a normal thread (not FastThreadLocalRunnable), the LocalData objects won't be removed from the map in ThreadLocalAccessor
  • Fix this issue by introducing a weak reference that holds the current thread when creating a LocalData in a non-Netty thread and remove it from the map if it's garbage collected or not alive when iterating the whole map called by rotateLatencyCollection.

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository:

@BewareMyPower BewareMyPower self-assigned this Sep 9, 2025
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Sep 9, 2025
@BewareMyPower BewareMyPower added type/bug The PR fixed a bug or issue reported a bug area/broker release/4.0.7 release/3.0.14 release/4.1.1 and removed doc-not-needed Your PR changes do not impact docs labels Sep 9, 2025
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Sep 9, 2025
Copy link
Member

@lhotari lhotari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@BewareMyPower
Copy link
Contributor Author

cc @oneby-wang

BTW, the archived open source KoP fixes this issue by remembering the Netty I/O executor everywhere and updating the metrics in that executor. See https://github.com/streamnative/kop/pull/2006/files

However, this solution is error-prone because it's easy to miss such pattern.

@BewareMyPower BewareMyPower marked this pull request as draft September 9, 2025 11:57
@BewareMyPower BewareMyPower marked this pull request as ready for review September 9, 2025 12:32
@codecov-commenter
Copy link

codecov-commenter commented Sep 9, 2025

Codecov Report

❌ Patch coverage is 87.27273% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.18%. Comparing base (90b2aac) to head (98ff897).
⚠️ Report is 159 commits behind head on master.

Files with missing lines Patch % Lines
.../stats/prometheus/metrics/ThreadLocalAccessor.java 91.30% 4 Missing ⚠️
.../prometheus/metrics/DataSketchesOpStatsLogger.java 57.14% 3 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #24719      +/-   ##
============================================
+ Coverage     74.17%   74.18%   +0.01%     
- Complexity    33473    33494      +21     
============================================
  Files          1895     1896       +1     
  Lines        148023   148025       +2     
  Branches      17140    17148       +8     
============================================
+ Hits         109790   109813      +23     
+ Misses        29471    29456      -15     
+ Partials       8762     8756       -6     
Flag Coverage Δ
inttests 26.43% <56.36%> (+0.19%) ⬆️
systests 22.63% <52.72%> (-0.04%) ⬇️
unittests 73.71% <87.27%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
.../prometheus/metrics/DataSketchesSummaryLogger.java 100.00% <100.00%> (ø)
.../prometheus/metrics/DataSketchesOpStatsLogger.java 63.41% <57.14%> (-3.26%) ⬇️
.../stats/prometheus/metrics/ThreadLocalAccessor.java 91.30% <91.30%> (ø)

... and 88 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@lhotari lhotari merged commit 9715653 into apache:master Sep 9, 2025
52 checks passed
@BewareMyPower BewareMyPower deleted the bewaremypower/local-data-leak branch September 9, 2025 16:42
@Technoboy- Technoboy- added this to the 4.2.0 milestone Sep 10, 2025
Technoboy- pushed a commit that referenced this pull request Sep 10, 2025
Technoboy- pushed a commit to Technoboy-/pulsar that referenced this pull request Sep 10, 2025
Technoboy- pushed a commit that referenced this pull request Sep 10, 2025
manas-ctds pushed a commit to datastax/pulsar that referenced this pull request Sep 11, 2025
…her than FastThreadLocalThread (apache#24719)

(cherry picked from commit e241b0a)
lhotari pushed a commit that referenced this pull request Sep 11, 2025
…her than FastThreadLocalThread (#24719)

(cherry picked from commit 9715653)
lhotari pushed a commit that referenced this pull request Sep 11, 2025
…her than FastThreadLocalThread (#24719)

(cherry picked from commit 9715653)
srinath-ctds pushed a commit to datastax/pulsar that referenced this pull request Sep 12, 2025
…her than FastThreadLocalThread (apache#24719)

(cherry picked from commit e241b0a)
nborisov pushed a commit to nborisov/pulsar that referenced this pull request Sep 12, 2025
ganesh-ctds pushed a commit to datastax/pulsar that referenced this pull request Sep 15, 2025
…her than FastThreadLocalThread (apache#24719)

(cherry picked from commit 9715653)
(cherry picked from commit cb5a3ab)
ganesh-ctds pushed a commit to datastax/pulsar that referenced this pull request Sep 15, 2025
…her than FastThreadLocalThread (apache#24719)

(cherry picked from commit 9715653)
(cherry picked from commit cb5a3ab)
srinath-ctds pushed a commit to datastax/pulsar that referenced this pull request Sep 15, 2025
…her than FastThreadLocalThread (apache#24719)

(cherry picked from commit 9715653)
(cherry picked from commit cb5a3ab)
nodece pushed a commit to nodece/pulsar that referenced this pull request Sep 16, 2025
…her than FastThreadLocalThread (apache#24719)

(cherry picked from commit 9715653)
KannarFr pushed a commit to CleverCloud/pulsar that referenced this pull request Sep 22, 2025
walkinggo pushed a commit to walkinggo/pulsar that referenced this pull request Oct 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants