raft: optimize storage access for term when leader tries to commit#139609
Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom Jan 30, 2025
Merged
Conversation
Member
pav-kv
reviewed
Jan 23, 2025
bd7dc5b to
3c697a7
Compare
e172a3c to
d51426d
Compare
pav-kv
reviewed
Jan 29, 2025
Collaborator
pav-kv
left a comment
There was a problem hiding this comment.
LGTM % nits. Also, please squash the PR into one commit.
ac9be23 to
2f92236
Compare
Collaborator
|
The code LGTM, but the release note seems off:
Please either remove it, or note something like: (both in the commit and PR description) |
pav-kv
approved these changes
Jan 29, 2025
Collaborator
pav-kv
left a comment
There was a problem hiding this comment.
Good to go once release notes and nits are fixed.
When the raft leader wants to commit an entry(advance commit index), it must check the term of the entry it wants to commit and ensure the term number is the same as the current leader term(term of itself). Previously, this process may involve a storage access for the term number if the entry has been persisted on stable storage and no longer in the unstable log of the leader node. In fact this above scenario happens quite often. Which makes an optimization improving leader commit latency fruitful. This PR introduces an optimization that utilizes an invariant that is logically equivalent to checking whether the want-to-be-committed entry has the same term as the leader doing the commit. This is done by keeping an entry index in leader node's memory that signifies the point in time where the current leader became leader. When the leader wants to commit an entry, the leader compares the want-to-be-committed entry's index(available in memory) with the newly added index in this PR to determine whether the term matches. Thus completely eliminating storage access on this code path. A corresponding unit test is also modified to account for the elimination of storage access when leader tries to commit. Jira issue: CRDB-45763 Fixes: cockroachdb#137826 Release note (performance improvement): Removed a potential storage read from raft commit pipeline. This reduces the worst-case KV write latency.
2f92236 to
465e772
Compare
Contributor
Author
|
TYFTR! bors r=pav-kv |
Contributor
|
Build failed: |
Contributor
Author
|
Retry bors r=pav-kv |
Contributor
Author
|
bors retry |
Contributor
|
Already running a review |
Contributor
tbg
added a commit
to tbg/cockroach
that referenced
this pull request
Feb 13, 2025
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
When the raft leader wants to commit an entry(advance commit index), it must check the term of the entry it wants to commit and ensure the term number is the same as the current leader term(term of itself).
Previously, this process may involve a storage access for the term number if the entry has been persisted on stable storage and no longer in the unstable log of the leader node.
In fact this above scenario happens quite often. Which makes an optimization improving leader commit latency fruitful.
This PR introduces an optimization that utilizes an invariant that is logically equivalent to checking whether the want-to-be-committed entry has the same term as the leader doing the commit.
This is done by keeping an entry index in leader node's memory that signifies the point in time where the current leader became leader. When the leader wants to commit an entry, the leader compares the want-to-be-committed entry's index(available in memory) with the newly added index in this PR to determine whether the term matches.
Thus completely eliminating storage access on this code path.
A corresponding unit test is also modified to account for the elimination of storage access when leader tries to commit.
A previous related PR(merged): #139907
Jira issue: CRDB-45763
Fixes: #137826
Release note (performance improvement): removed a potential storage read from raft commit pipeline. This reduces the worst-case KV write latency.