-
Notifications
You must be signed in to change notification settings - Fork 3.7k
[improve][broker] Cache last publish timestamp for idle topics to reduce storage reads #24825
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[improve][broker] Cache last publish timestamp for idle topics to reduce storage reads #24825
Conversation
…uce storage reads ### Motivation When topics are idle (no recent write activity), stats collection needs to read from storage to get the last publish timestamp since the managed ledger doesn't cache it. If stats are polled frequently, this causes unnecessary repeated storage reads for the same value. ### Modifications - Added cachedLastPublishTimestamp field to cache the last publish timestamp for idle topics - The cache is only used when ledger.getLastAddEntryTime() <= 0 (topic is idle) - When the topic has recent activity (ledger.getLastAddEntryTime() > 0), we use the ledger's value directly and clear the cache - This optimization specifically targets idle topics where stats are polled repeatedly but nothing is being written ### Verifying this change - Added unit test testLastPublishTimestampCaching that verifies: 1. First stats call on idle topic reads from storage and caches 2. Second stats call uses cached value (no storage read) 3. When topic becomes active, uses ledger value and clears cache 4. When topic becomes idle again, reads from storage (cache was properly cleared) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #24825 +/- ##
============================================
- Coverage 74.33% 74.29% -0.04%
+ Complexity 33438 33423 -15
============================================
Files 1912 1912
Lines 149077 149100 +23
Branches 17300 17301 +1
============================================
- Hits 110815 110777 -38
- Misses 29442 29501 +59
- Partials 8820 8822 +2
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
lhotari
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
@codelipenghui There's a failing test, AdminApi2Test.testTopicCreationAndLastPublishTimestamps |
|
Cherry-picked to branch-4.1, and there is no need to cherry-pick to branch-4.0 since the issue is only for 4.1.0 |
…uce storage reads (apache#24825) (cherry picked from commit d5e1d7e) (cherry picked from commit f0c9b05)
(cherry picked from commit 9c70771)
…uce storage reads (apache#24825) (cherry picked from commit d5e1d7e) (cherry picked from commit f0c9b05)
(cherry picked from commit 9c70771)
…uce storage reads (apache#24825) (cherry picked from commit d5e1d7e) (cherry picked from commit f0c9b05)
(cherry picked from commit 9c70771)
Motivation
When topics are idle (no recent write activity), stats collection needs to read from storage to get the last publish timestamp since the managed ledger doesn't cache it. If stats are polled frequently, this causes unnecessary repeated storage reads for the same value.
Modifications
cachedLastPublishTimestampfield to cache the last publish timestamp for idle topicsledger.getLastAddEntryTime() <= 0(topic is idle)ledger.getLastAddEntryTime() > 0), we use the ledger's value directly and clear the cacheVerifying this change
testLastPublishTimestampCachingthat verifies:Documentation
doc-required(Your PR needs to update docs and you will update later)doc-not-needed(Please explain why)doc(Your PR contains doc changes)doc-complete(Docs have been already added)🤖 Generated with Claude Code