etcdserver: separate maybeCompactRaftLog function to compact raft log independently #18635
clement2026 wants to merge 9 commits into etcd-io:main
Conversation
…ly from snapshots Signed-off-by: Clement <gh.2lgqz@aleeas.com>
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull request has been approved by: clement2026. The full list of commands accepted by this bot can be found here. Needs approval from an approver in each of these files. Approvers can indicate their approval by writing …

Hi @clement2026. Thanks for your PR. I'm waiting for an etcd-io member to verify that this patch is reasonable to test. Once the patch is verified, the new status will be reflected. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
@serathius Please take a look. This PR focuses on one task only. |
Codecov Report

❌ Patch coverage is … Please upload reports for the commit f34bee5 to get more accurate results. ... and 13 files with indirect coverage changes.

@@ Coverage Diff @@
##             main   #18635      +/-   ##
==========================================
+ Coverage   68.79%   68.88%   +0.08%
==========================================
  Files         420      420
  Lines       35522    35545      +23
==========================================
+ Hits        24438    24485      +47
+ Misses       9657     9639      -18
+ Partials     1427     1421       -6

Continue to review the full report in Codecov by Sentry.
Signed-off-by: Clement <gh.2lgqz@aleeas.com>
I can't test …, so I added log statements:

```go
ep := etcdProgress{
	confState: sn.Metadata.ConfState,
	snapi:     sn.Metadata.Index,
	appliedt:  sn.Metadata.Term,
	appliedi:  sn.Metadata.Index,
	compacti:  fi - 1,
}
s.Logger().Info("initial ep.compacti", zap.Uint64("ep.compacti", ep.compacti), zap.Uint64("ep.snapi", ep.snapi))
```

```go
compacti := ep.snapi - s.Cfg.SnapshotCatchUpEntries
if compacti <= ep.compacti {
	return
}
s.Logger().Info("compacti > ep.compacti", zap.Uint64("compacti", compacti), zap.Uint64("ep.compacti", ep.compacti))
```

Start a new etcd instance and make 110 put requests:

```shell
rm -rf default.etcd; bin/etcd --experimental-snapshot-catchup-entries=5 --snapshot-count=10 2>&1 | grep 'ep.compacti'
benchmark put --key-space-size=9999999 --val-size=100 --total=110
```

Logs: …

Restart the etcd instance and make 50 put requests:

```shell
bin/etcd --experimental-snapshot-catchup-entries=5 --snapshot-count=10 2>&1 | grep 'ep.compacti'
benchmark put --key-space-size=9999999 --val-size=100 --total=50
```

Logs: …
```go
	snapi:    sn.Metadata.Index,
	appliedt: sn.Metadata.Term,
	appliedi: sn.Metadata.Index,
	compacti: fi - 1,
```
It would be good to add a comment explaining why we subtract one. The compact index should be the last index at which raftStorage.Compact was called. Whether we should subtract depends on whether the Compact operation is inclusive or exclusive; we need to confirm this.
```diff
-err = s.r.raftStorage.Compact(compacti)
+err := s.r.raftStorage.Compact(compacti)
 ep.compacti = compacti
```
I think we should update the compact index only if there was no error.
Signed-off-by: Clement <gh.2lgqz@aleeas.com>
```go
// TestMemoryStorageCompaction tests that after calling raftStorage.Compact(compacti)
// without errors, the dummy entry becomes {Index: compacti} and
// raftStorage.FirstIndex() returns (compacti+1, nil).
func TestMemoryStorageCompaction(t *testing.T) {
```
Thanks for the test; it's great to protect our assumption with a test.
Signed-off-by: Clement <gh.2lgqz@aleeas.com>
force-pushed from 99db0bb to bc9a3fa
force-pushed from 7a5bfd6 to f34bee5
```go
	}
}

// The first snapshot and compaction shouldn't happen because applied index is less than 11
logOccurredAtMostNTimes(t, mem, 5*time.Second, "saved snapshot", 0)
```
Those timeouts are very high; can we use a smaller timeout? Snapshot and compaction happen synchronously with operations, so the logs should be available immediately.
Alright, I'll reduce the timeout to 1 or 2 seconds and see how that works out.
```go
}

// The first snapshot and compaction shouldn't happen because applied index is less than 11
logOccurredAtMostNTimes(t, mem, 5*time.Second, "saved snapshot", 0)
logOccurredAtMostNTimes(t, mem, time.Second, "compacted Raft logs", 0)
```
Is there a reason we sometimes wait for 1 second and sometimes for 5 seconds?
The first logOccurredAtMostNTimes waits for 5 seconds. Once it's done, we can assume the log is synced up, so the second logOccurredAtMostNTimes doesn't need to wait as long.
Logs should appear in a matter of tens of milliseconds, not multiple seconds. The whole test should take seconds.
Suggested change:

```diff
-// logOccurredAtMostNTimes ensures that the log has exactly `count` occurrences of `s` before timing out, no more, no less.
-func logOccurredAtMostNTimes(t *testing.T, m *integration.Member, timeout time.Duration, s string, count int) {
+func logOccurredExactlyNTimes(t *testing.T, m *integration.Member, timeout time.Duration, s string, count int) {
```
```go
// Retain all log entries up to the latest snapshot index to ensure any member can recover from that snapshot.
// Beyond the snapshot index, preserve the most recent s.Cfg.SnapshotCatchUpEntries entries in memory.
// This allows slow followers to catch up by synchronizing entries instead of requiring a full snapshot transfer.
if ep.snapi <= s.Cfg.SnapshotCatchUpEntries {
```
Suggested change:

```diff
-if ep.snapi <= s.Cfg.SnapshotCatchUpEntries {
+if ep.appliedi <= s.Cfg.SnapshotCatchUpEntries {
```
snapi is more correct here; the reason is described in the comment.
```go
	return
}
```
Suggested change:

```diff
-compacti := ep.snapi - s.Cfg.SnapshotCatchUpEntries
+compacti := ep.appliedi - s.Cfg.SnapshotCatchUpEntries
```
What's the purpose of the change? If you use ep.snapi, the behaviour is exactly the same as the existing behaviour, because ep.snapi only gets updated after creating the (v2) snapshot.
Please see #17098 (comment). Can we get that confirmed first?
This change doesn't alter when compaction runs or how many times it executes. I was aware of the code at https://github.com/etcd-io/raft/blob/5d6eb55c4e6929e461997c9113aba99a5148e921/storage.go#L266-L269; that's why I was proposing compacting only every X applies.
```go
<-apply.notifyc

s.triggerSnapshot(ep)
s.maybeCompactRaftLog(ep)
```
If you really want to get this small change merged first, please ensure it's as independent as possible. Currently s.snapshot performs both the snapshot and the compaction operations. It makes sense to extract the compaction operation into an independent function/method, but let's call the method inside s.triggerSnapshot:

```go
func (s *EtcdServer) triggerSnapshot(ep *etcdProgress) {
	if !s.shouldSnapshot(ep) {
		return
	}
	lg := s.Logger()
	lg.Info(
		"triggering snapshot",
		zap.String("local-member-id", s.MemberID().String()),
		zap.Uint64("local-member-applied-index", ep.appliedi),
		zap.Uint64("local-member-snapshot-index", ep.snapi),
		zap.Uint64("local-member-snapshot-count", s.Cfg.SnapshotCount),
		zap.Bool("snapshot-forced", s.forceSnapshot),
	)
	s.forceSnapshot = false

	s.snapshot(ep.appliedi, ep.confState)
	ep.snapi = ep.appliedi
	s.compact(xxxxx) // call the new method here, so we still do it each time after creating a snapshot
}
```
No rush on merging this PR. If we do merge it, we need to ensure etcd actually benefits from it. Let's resolve #17098 (comment) first.
The goal of the PR is to make compaction independent from snapshots, not just to refactor it into a function.
> Not just to refactor it into a function.

Just refactoring the function (extracting the compaction into a separate method) is an independent and safe change, and accordingly can be merged soon.

> The goal of the PR is to make compaction independent from snapshots.

That modifies the logic/semantics, so it's no longer an independent change.
"Refactoring it into a function" is a subset of "refactoring".
To be clearer about #18635 (comment), I am proposing an independent and safe minor refactoring below as the very first step.
```go
// The first snapshot and compaction should happen because applied index is 11
logOccurredAtMostNTimes(t, mem, 5*time.Second, "saved snapshot", 1)
logOccurredAtMostNTimes(t, mem, time.Second, "compacted Raft logs", 1)
expectMemberLog(t, mem, time.Second, "\"compact-index\": 6", 1)
```
Do we really need to check that "compacted Raft logs" occurred at most N times? This is hard to check; why isn't checking for "compact-index": X enough?
PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions. |
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions. |
Closed with #18825 |
Part of #17098
Goal: separate maybeCompactRaftLog function to compact raft log independently from snapshots.
TODO