KAFKA-19567: Added the check for underlying partition being the leader in delayedShareFetch tryComplete method by chirag-wadhwa5 · Pull Request #20280 · apache/kafka

chirag-wadhwa5 · 2025-07-31T17:09:42Z

In the current implementation, some delayed share fetch operations get
trapped in the delayed share fetch purgatory when partition leadership
changes during share consumption. This is because there is no check in
the code to make sure the current broker is still the leader of the
partitions corresponding to the share partitions. So, when leadership
changes, the share partitions cannot be acquired, because they have
already been fenced, and tryComplete returns false. The operation does
get completed when its timer expires, but by then it is too late: the
operation stays stuck in the watchers list, waiting to be purged once
the estimated number of operations grows beyond 1000. This PR resolves
this by adding the required check so that, if partition leadership
changes, the delayed share fetches waiting on that partition are
completed immediately.

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield
<aschofield@confluent.io>
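
To make the description above concrete, here is a minimal, self-contained sketch of the idea, not the actual Kafka code. Every type in it (`SketchPartition`, `SketchNotLeaderException`, `SketchDelayedFetch`) is a hypothetical stand-in invented for illustration. It assumes that no data ever becomes available, so the only way the delayed operation can complete early is through the leadership check.

```java
import java.util.List;

// Hypothetical stand-ins, not Kafka classes: a simplified model of a
// leadership check inside a delayed operation's tryComplete.
final class LeaderCheckSketch {

    /** Stand-in for a broker-side partition; it only tracks leadership. */
    static final class SketchPartition {
        private volatile boolean leader;

        SketchPartition(boolean leader) { this.leader = leader; }

        boolean isLeader() { return leader; }

        void setLeader(boolean leader) { this.leader = leader; }
    }

    /** Local stand-in for the exception raised when leadership is lost. */
    static final class SketchNotLeaderException extends RuntimeException {
        SketchNotLeaderException(String message) { super(message); }
    }

    /** Simplified delayed fetch that waits on a fixed list of partitions. */
    static final class SketchDelayedFetch {
        private final List<SketchPartition> partitions;
        private boolean completed = false;

        SketchDelayedFetch(List<SketchPartition> partitions) {
            this.partitions = partitions;
        }

        boolean isCompleted() { return completed; }

        /** Throws if this broker no longer leads the given partition. */
        private static void ensureLeader(SketchPartition partition) {
            if (!partition.isLeader()) {
                throw new SketchNotLeaderException("Broker is no longer the leader");
            }
        }

        /**
         * Returns true once the operation is completed. In this sketch no
         * data ever arrives, so only a leadership change completes it.
         */
        boolean tryComplete() {
            if (completed) {
                return true;
            }
            try {
                for (SketchPartition partition : partitions) {
                    ensureLeader(partition);
                }
            } catch (SketchNotLeaderException e) {
                // Leadership moved away: complete right away instead of
                // waiting for the timeout.
                completed = true;
                return true;
            }
            return false;
        }
    }

    public static void main(String[] args) {
        SketchPartition partition = new SketchPartition(true);
        SketchDelayedFetch fetch = new SketchDelayedFetch(List.of(partition));

        System.out.println(fetch.tryComplete()); // false: still the leader, nothing to complete
        partition.setLeader(false);
        System.out.println(fetch.tryComplete()); // true: leadership lost, completed at once
    }
}
```

Without the `ensureLeader` call, `tryComplete` in this model would keep returning false after the leadership change, which is the stuck behaviour the description refers to.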

AndrewJSchofield

Thanks for the PR. A few initial comments.

AndrewJSchofield · 2025-07-31T17:35:03Z

                canComplete = true;
            } catch (NotLeaderOrFollowerException e) { // Case c
-                log.debug("Broker is no longer the leader or follower of topicPartition {}, satisfy {} immediately", topicIdPartition, shareFetch.fetchParams());
+                log.debug("Broker is no longer the leader or follower of topicIdPartition {}, satisfy {} immediately", topicIdPartition, shareFetch.fetchParams());


I'd revert this line because all of the other log lines just call it a topicPartition, in spite of technically having the topic ID. That's really fine because it is just an identifier for a topic partition.

AndrewJSchofield · 2025-07-31T17:35:38Z

-                replicaManager.getPartitionOrException(topicIdPartition.topicPartition());
+                Partition partition = replicaManager.getPartitionOrException(topicIdPartition.topicPartition());
+                if (!partition.isLeader()) {
+                    throw new NotLeaderOrFollowerException("Broker is no longer the leader of topicIdPartition: " + topicIdPartition);


Maybe "topicPartition" is more aligned with the rest of this code.

AndrewJSchofield · 2025-07-31T17:35:48Z

+                Partition partition = replicaManager.getPartitionOrException(topicIdPartition.topicPartition());
+                if (!partition.isLeader()) {
+                    log.error("Broker is no longer the leader of topicIdPartition {}", topicIdPartition);
+                    throw new NotLeaderOrFollowerException("Broker is no longer the leader of topicIdPartition: " +  topicIdPartition);


"topicPartition" :)

DL1231

Thanks for the patch. Please fix the failing unit test ReplicaManagerTest#testDelayedShareFetchPurgatoryOperationExpiration().

apoorvmittal10

Looks good, some minor comments.

apoorvmittal10 · 2025-08-06T12:33:59Z

+            // In that case, such partitions would not be able to get acquired, and the tryComplete will keep on returning false.
+            // Eventually the operation will get timed out and completed, but it might not get removed from the purgatory.
+            // This has been eventually left it like this because the purge interval will make sure that the remaining operations
+            // in the purgatory do not grow indefinitely and are purged time to time.


Is it from time to time, or when the number of keys or operations in the purgatory exceeds a configured threshold? Please be explicit in the comment.
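
The distinction being asked about can be illustrated with a small sketch of threshold-based purging. This is a simplified model, not Kafka's purgatory implementation: `SketchPurgatory`, `SketchOperation`, and the `purgeInterval` parameter are hypothetical names, and the rule modelled here is an assumption (completed operations are removed from the watcher list only once the estimated operation count exceeds the threshold).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of threshold-based purging; not Kafka's implementation.
final class PurgeThresholdSketch {

    /** Minimal operation that can be marked completed. */
    static final class SketchOperation {
        private boolean completed;

        void complete() { completed = true; }

        boolean isCompleted() { return completed; }
    }

    /** Watcher list that drops completed operations only past a threshold. */
    static final class SketchPurgatory {
        private final int purgeInterval;
        private final List<SketchOperation> watchers = new ArrayList<>();
        private int estimatedOperations = 0;

        SketchPurgatory(int purgeInterval) {
            this.purgeInterval = purgeInterval;
        }

        /** Registers an operation; it stays listed even after it completes. */
        void watch(SketchOperation operation) {
            watchers.add(operation);
            estimatedOperations++;
        }

        /**
         * Purges completed operations only when the estimate exceeds the
         * threshold. Returns how many operations were removed.
         */
        int maybePurge() {
            if (estimatedOperations <= purgeInterval) {
                return 0;
            }
            int before = watchers.size();
            watchers.removeIf(SketchOperation::isCompleted);
            estimatedOperations = watchers.size();
            return before - watchers.size();
        }

        int watched() { return watchers.size(); }
    }

    public static void main(String[] args) {
        SketchPurgatory purgatory = new SketchPurgatory(3);
        for (int i = 0; i < 3; i++) {
            SketchOperation operation = new SketchOperation();
            purgatory.watch(operation);
            operation.complete(); // e.g. completed by its timeout
        }
        System.out.println(purgatory.maybePurge()); // 0: estimate (3) does not exceed the threshold
        purgatory.watch(new SketchOperation());     // a fourth, still-pending operation
        System.out.println(purgatory.maybePurge()); // 3: threshold exceeded, completed entries removed
        System.out.println(purgatory.watched());    // 1: only the pending operation remains
    }
}
```

In this model a completed operation can linger in the watcher list indefinitely if the count never crosses the threshold, which is why the wording of the code comment matters.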

apoorvmittal10 · 2025-08-06T12:35:07Z

-                replicaManager.getPartitionOrException(topicIdPartition.topicPartition());
+                Partition partition = replicaManager.getPartitionOrException(topicIdPartition.topicPartition());
+                if (!partition.isLeader()) {
+                    throw new NotLeaderOrFollowerException("Broker is no longer the leader of topicPartition: " + topicIdPartition);


Are you certain the partition is also not a follower? If not, shall we use NotLeaderException?

apoorvmittal10 · 2025-08-06T12:38:09Z

+        when(p1.isLeader()).thenReturn(true);
+
+        when(replicaManager.getPartitionOrException(tp0.topicPartition())).thenReturn(p0);
+        when(replicaManager.getPartitionOrException(tp1.topicPartition())).thenReturn(p1);


Did we add a new test case that validates the added functionality?

Thanks for the review. I have made the required updates and added a test case for the remote fetch throwing a NotLeaderException as well.

apoorvmittal10

Left some comments.

apoorvmittal10 · 2025-08-07T18:57:41Z

+        // All the topic partitions are acquirable.
+        when(sp0.maybeAcquireFetchLock(fetchId)).thenReturn(true);
+
+        assertFalse(delayedShareFetch.isCompleted());


Shouldn't the delayed share fetch complete?

Thanks for the review. The DelayedShareFetch should complete, but only after tryComplete is called. I have added an assertTrue for this after tryComplete is called to make the test better.

apoorvmittal10 · 2025-08-07T18:58:06Z

-                replicaManager.getPartitionOrException(topicIdPartition.topicPartition());
+                Partition partition = replicaManager.getPartitionOrException(topicIdPartition.topicPartition());
+                if (!partition.isLeader()) {
+                    throw new NotLeaderException("Broker is no longer the leader of topicPartition: " + topicIdPartition);


Don't you need to handle the exception below?

Thanks for the review. Yep, I have added another catch block here for the new NotLeaderException thrown.

apoorvmittal10 · 2025-08-08T13:05:36Z

                canComplete = true;
-            } catch (NotLeaderOrFollowerException e) { // Case c
+            } catch (NotLeaderException e) { // Case c
                log.debug("Broker is no longer the leader or follower of topicPartition {}, satisfy {} immediately", topicIdPartition, shareFetch.fetchParams());


The text is incorrect. It should be for NotLeaderOrFollowerException.

apoorvmittal10

LGTM. Thanks for the PR.

Added the check for underlying partition being the leader in delayedS…

a3a2f79

…hareFetch tryComplete

github-actions Bot added the triage (PRs from the community), core (Kafka Broker), and KIP-932 (Queues for Kafka) labels Jul 31, 2025

AndrewJSchofield requested changes Jul 31, 2025

AndrewJSchofield added the ci-approved label and removed the triage (PRs from the community) label Jul 31, 2025

DL1231 reviewed Aug 3, 2025

chirag-wadhwa5 changed the title ~~Added the check for underlying partition being the leader in delayedShareFetch tryComplete methhod~~ KAFKA-19567: Added the check for underlying partition being the leader in delayedShareFetch tryComplete methhod Aug 4, 2025

chirag-wadhwa5 added 2 commits August 4, 2025 11:27

KAFKA-19567: Removed the change for checking leadership change and ad…

594d79a

…ded comment

changed a few comment lines and resolved a test failure

fcd4a79

apoorvmittal10 reviewed Aug 6, 2025

Minor changes and added a test case

4d6c123

chirag-wadhwa5 requested a review from apoorvmittal10 August 6, 2025 16:12

apoorvmittal10 changed the title ~~KAFKA-19567: Added the check for underlying partition being the leader in delayedShareFetch tryComplete methhod~~ KAFKA-19567: Added the check for underlying partition being the leader in delayedShareFetch tryComplete method Aug 7, 2025

apoorvmittal10 reviewed Aug 7, 2025

KAFKA-19567: Minor changes and improvements

3b0faae

apoorvmittal10 reviewed Aug 8, 2025

KAFKA-19567: Changed one small log line

b6716e8

apoorvmittal10 approved these changes Aug 8, 2025

Merge remote-tracking branch 'upstream/trunk' into KAFKA-19567

de22d8f

apoorvmittal10 merged commit 43a2504 into apache:trunk Aug 10, 2025
22 checks passed
