[LI-HOTFIX] Transfer leadership for UnderMinISR partitions#406
Conversation
groelofs
left a comment
There was a problem hiding this comment.
Looks good, though admittedly I was getting a little tired at the end, so I hope I didn't miss anything. :-/ Sorry for the delay!
| leaderLogIfLocal.flatMap { log => | ||
| val outOfSyncReplicaIds = getOutOfSyncReplicas(replicaLagTimeMaxMs) | ||
| if ((inSyncReplicaIds -- outOfSyncReplicaIds).size < log.config.minInSyncReplicas) { | ||
| inSyncReplicaIds.find(_ != localBrokerId) |
There was a problem hiding this comment.
Maybe add a TODO to consider a smarter handoff algorithm in the future? (For example, based on some metric.)
There was a problem hiding this comment.
I thought more about this. Suppose we have the following condition with 3 replicas in the ISR:
leader A has 5 messages (offsets 1-5);
follower B has 4 messages (offsets 1-4);
follower C has 3 messages (offsets 1-3).
One might think that transferring the new leader to B is the preferable option. However, because the high watermark is the lowest end offset of the ISR, i.e. determined by follower C in this case,
regardless of whether we transfer the leader to B or C, the produce request for msg at offset 4 and 5 will receive a "NotLeaderForPartition" error and retry the produce requests.
Thus transferring the leadership to B does not actually help.
Besides this line of thinking, I'm not sure how else the handoff algorithm can be smarter.
The async/event-based replica fetcher series (TransferLeaderManager, AbstractAsyncFetcher, AsyncReplicaFetcher, FetcherEventBus, FetcherEventManager — PRs #121/#123/#124/#143/#144/#403/#406) was removed in the PR #538 squash. The li.async.fetcher.enable config key was left behind in KafkaConfig as dead surface area: it parsed and validated, but had no consumers. Going with audit P1 #6 option (a): retire the feature, remove the dead config key. Operators with li.async.fetcher.enable=<anything> in their server.properties may see an 'Unknown configuration' warning at broker startup; this is harmless — the broker still starts and the value would have been a no-op anyway. Removes: - Defaults.LiAsyncFetcherEnabled - KafkaConfig.LiAsyncFetcherEnableProp - The brokerConfigDef.define(...) registration - KafkaConfig.liAsyncFetcherEnable accessor Audit option (b) — porting the entire async fetcher series forward into 3.6-li — was deferred. If the optimization is needed in the future, it would need to be reimplemented from scratch on top of upstream's current ReplicaFetcherThread / ReplicaFetcherManager.
TICKET = LIKAFKA-47596
LI_DESCRIPTION =
This PR adds the following change:
When a partition's ISR size is about to fall under the MinISR setting,
instead of letting it go under MinISR, the leader host would try to initiate a leadership transfer.
Currently, we chose the first replica in the ISR to be the new leader.
EXIT_CRITERIA = If and when this change is accepted in upstream Kafka.
Committer Checklist (excluded from commit message)