kvserver: deflake TestReadLoadMetricAccounting#141843
Merged
craig[bot] merged 2 commits into cockroachdb:master on Feb 21, 2025
Member
622c900 to 42e78a2
Occasionally, a leader lease upgrade request interferes with the metrics measured by this test. This commit makes it wait for the upgrade first, before checking metrics.

Epic: none
Release note: none
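The "wait for the upgrade first" idea in this commit message can be illustrated with a small polling loop. This is only a sketch under stated assumptions, not the code from the PR: `leaseType`, `leaseOther`, `waitForLeaseType`, and the poll budget are hypothetical names invented for the illustration, and the real test talks to a test cluster rather than to a closure.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// leaseType is a hypothetical stand-in for the lease types a range can hold.
type leaseType int

const (
	leaseOther  leaseType = iota // any lease type before the upgrade (assumption)
	leaseLeader                  // the upgraded leader lease
)

// waitForLeaseType polls getLease until it reports want, giving up after
// maxPolls attempts. The caller starts measuring only after this returns nil.
func waitForLeaseType(getLease func() leaseType, want leaseType, maxPolls int) error {
	for i := 0; i < maxPolls; i++ {
		if getLease() == want {
			return nil
		}
		time.Sleep(time.Millisecond)
	}
	return errors.New("lease did not reach the wanted type in time")
}

func main() {
	// Simulated range whose lease is upgraded on the third poll.
	polls := 0
	getLease := func() leaseType {
		polls++
		if polls < 3 {
			return leaseOther
		}
		return leaseLeader
	}
	if err := waitForLeaseType(getLease, leaseLeader, 100); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("lease upgraded after", polls, "polls; metrics can be checked now")
}
```

Note that a loop like this only establishes that the lease state changed. It says nothing about whether side effects of the upgrade request, such as its contribution to load statistics, have already been recorded.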
42e78a2 to 55d1505
tbg approved these changes Feb 21, 2025
Epic: none Release note: none
Collaborator
Author
bors r=tbg
arulajmani approved these changes Feb 21, 2025
Collaborator
arulajmani left a comment
Reviewed 3 of 3 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @pav-kv)
Contributor
Build succeeded
Collaborator
Author
blathers backport 25.1
Based on the specified backports for this PR, I applied new labels to the following linked issue(s). Please adjust the labels as needed to match the branches actually affected by the issue(s), including adding any known older branches.

Issue #141716: branch-release-25.1.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
craig bot pushed a commit that referenced this pull request Dec 19, 2025
159877: kvserver: deflake TestReadLoadMetricAccounting r=tbg a=tbg

`TestReadLoadMetricAccounting` has a history of flaking due to lease-related writes interfering with load metric measurements.

Issue #141716 (and #141586) identified the same failure signature:

```
Error: Max difference between 0 and 85 allowed is 4, but difference was -85
```

The root cause was identified by `@pav-kv`: an "unexpected" leader lease upgrade write was interfering with the test's write bytes measurements. PR #141843 added `tc.MaybeWaitForLeaseUpgrade()` to wait for lease upgrades before starting measurements.

**The fix from #141843 IS present** in the failing SHA. However, the test still flaked with the same error signature (85 write bytes when expecting 0). The logs show:

1. AddSSTableRequest evaluated (test setup)
2. Many LeaseInfoRequest polls (from MaybeWaitForLeaseUpgrade)
3. RequestLeaseRequest (the lease upgrade write)
4. More LeaseInfoRequest polls
5. "lease is now of type: LeaseLeader" - **upgrade complete**
6. "test #1" - test loop begins
7. GetRequest evaluated (the actual test request)
8. **Assertion fails** - 85 write bytes observed

The race condition is subtle: `MaybeWaitForLeaseUpgrade()` waits until `FindRangeLeaseEx()` reports the lease is upgraded, but it does **not** guarantee that the write bytes have been recorded to load stats. This is because stats are recorded "awkwardly late" on the client goroutine (`SendWithWriteBytes`).

The fix:

1. Wraps each test case iteration in `SucceedsSoon`
2. Resets load stats, sends the request, checks results
3. If any stat doesn't match (due to background activity like lease upgrades), returns an error to trigger retry
4. Adds a comment noting that test cases must be idempotent (they are, since all of them are reads)

## Related Issues/PRs

| Issue/PR | Status | Relevance |
|----------|--------|-----------|
| #159719 | OPEN | Current failure |
| #141716 | CLOSED | Duplicate, Feb 2025 |
| #141586 | CLOSED | Original issue, Feb 2025 |
| #141843 | MERGED | Deflake attempt (wait for lease upgrade) |
| #141599 | MERGED | Added logging to help debug |
| #141905 | CLOSED | Duplicate |
| #134799 | CLOSED | Older occurrence |

This is more robust than trying to synchronize with specific background operations because it handles **any** source of interference, not just lease upgrades.

Epic: none

Closes #159719.

Co-authored-by: Tobias Grieger <tobias.b.grieger@gmail.com>
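The reset, send, check, retry loop described in the commit message above can be sketched in a few lines of self-contained Go. Everything here is an assumption made for illustration: `loadStats`, `runCase`, and the local `succeedsSoon` helper are hypothetical stand-ins, the expected values are invented, and the real test uses CockroachDB's own test utilities and replica load stats rather than these types.

```go
package main

import (
	"fmt"
	"time"
)

// loadStats is a hypothetical stand-in for the per-replica load statistics.
type loadStats struct {
	readKeys   int
	writeBytes int
}

func (s *loadStats) reset() { *s = loadStats{} }

// succeedsSoon is a simplified local retry helper (not the CockroachDB one):
// it calls fn until it returns nil or maxAttempts is exhausted.
func succeedsSoon(maxAttempts int, fn func() error) error {
	var err error
	for i := 0; i < maxAttempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		time.Sleep(time.Millisecond)
	}
	return fmt.Errorf("condition not met after %d attempts: %v", maxAttempts, err)
}

// runCase performs one iteration: reset the stats, send the (idempotent)
// read, then compare. Any mismatch is returned as an error so that the
// caller retries instead of failing the test outright.
func runCase(stats *loadStats, send func(*loadStats)) error {
	stats.reset()
	send(stats)
	if stats.writeBytes != 0 {
		return fmt.Errorf("expected 0 write bytes, got %d", stats.writeBytes)
	}
	if stats.readKeys != 1 {
		return fmt.Errorf("expected 1 read key, got %d", stats.readKeys)
	}
	return nil
}

func main() {
	stats := &loadStats{}
	attempt := 0
	send := func(s *loadStats) {
		attempt++
		s.readKeys++
		if attempt == 1 {
			// Simulated interference: write bytes from a background
			// request get recorded late, during the first measurement.
			s.writeBytes += 85
		}
	}
	err := succeedsSoon(10, func() error { return runCase(stats, send) })
	fmt.Println("attempts:", attempt, "err:", err)
}
```

In this simulation the first attempt observes the stray 85 write bytes and is retried, and the second attempt passes. Retrying is safe only because the request is a read: resending it does not change what the next attempt should observe, which is why the commit message stresses idempotent test cases.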
This was referenced Feb 11, 2026
Occasionally, a leader lease upgrade request interferes with the metrics measured by this test. This commit makes it wait for the upgrade first, before checking metrics.
Fixes #141716