KAFKA-9212; Ensure LeaderAndIsr state updated in controller context during reassignment (#7795) #7800

Merged
hachikuji merged 1 commit into apache:2.4 from ijuma:kafka-9212-2.4
Dec 9, 2019

Conversation

@ijuma ijuma commented Dec 9, 2019

Member

KIP-320 improved fetch semantics by adding leader epoch validation. This relies on
reliable propagation of leader epoch information from the controller. Unfortunately, we
have encountered a bug during partition reassignment in which the leader epoch in the
controller context does not get properly updated. This causes UpdateMetadata requests
to be sent with stale epoch information which results in the metadata caches on the
brokers falling out of sync.

This bug has existed for a long time, but it only became a problem with the new epoch
validation done by the client. Because the client includes the stale leader epoch in its
requests, the leader rejects them, and the stale metadata cache on the brokers prevents
the consumer from learning the latest epoch. Hence the consumer cannot make progress
while a reassignment is ongoing.
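
The stall described above can be sketched as a toy model. This is only an illustration under stated assumptions, not the broker or consumer code: the class `EpochStallSketch`, its methods, and the retry loop are hypothetical, and the result names mirror the KIP-320 error codes for a fetch whose epoch is behind or ahead of the leader's.

```java
import java.util.Optional;

// Hypothetical toy model of the stall; not Kafka's actual classes.
final class EpochStallSketch {
    enum FetchResult { OK, FENCED_LEADER_EPOCH, UNKNOWN_LEADER_EPOCH }

    // Leader-side check in the spirit of KIP-320. A request that carries
    // no epoch is assumed to skip validation entirely.
    static FetchResult validate(Optional<Integer> requestEpoch, int leaderEpoch) {
        if (!requestEpoch.isPresent()) {
            return FetchResult.OK;
        }
        int epoch = requestEpoch.get();
        if (epoch < leaderEpoch) {
            return FetchResult.FENCED_LEADER_EPOCH;   // client is behind the leader
        }
        if (epoch > leaderEpoch) {
            return FetchResult.UNKNOWN_LEADER_EPOCH;  // client is ahead of the leader
        }
        return FetchResult.OK;
    }

    // Returns the attempt number of the first successful fetch, or -1 if
    // every attempt fails. Each retry "refreshes" metadata, but the broker
    // cache keeps returning the same cachedEpoch, so nothing changes.
    static int attemptsUntilSuccess(int cachedEpoch, int leaderEpoch, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Optional<Integer> epochFromMetadata = Optional.of(cachedEpoch);
            if (validate(epochFromMetadata, leaderEpoch) == FetchResult.OK) {
                return attempt;
            }
        }
        return -1;
    }
}
```

With a cached epoch of 4 and a leader epoch of 5, `attemptsUntilSuccess(4, 5, 10)` never succeeds, which is the "cannot make progress" case. Once the cache matches the leader, the first attempt succeeds.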

Although it is straightforward to fix this problem in the controller for the new releases
(which this patch does), it is not so easy to fix older brokers, which means new clients
could still encounter brokers with this bug. To address this problem, this patch also
modifies the client to treat the leader epoch returned from the Metadata response as
"unreliable" if it comes from an older version of the protocol. In that case, the client
discards the returned epoch and does not include it in any requests.
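
A minimal sketch of the client-side rule, assuming the decision depends only on the Metadata response version. The class `MetadataEpochSketch`, the method `usableLeaderEpoch`, and the constant `MIN_RELIABLE_VERSION` are hypothetical names for illustration, and the cutoff value 9 is a placeholder assumption, since the description above does not state which version is the first to be trusted.

```java
import java.util.Optional;

// Hypothetical sketch of the "unreliable epoch" rule; not the real client code.
final class MetadataEpochSketch {
    // Placeholder cutoff (assumption): responses below this version are
    // treated as coming from brokers that may have stale epochs cached.
    static final short MIN_RELIABLE_VERSION = 9;

    // Returns the epoch the client may attach to later requests, or empty
    // if the epoch must be discarded because the response is too old.
    static Optional<Integer> usableLeaderEpoch(short responseVersion, int epochFromResponse) {
        if (responseVersion < MIN_RELIABLE_VERSION) {
            return Optional.empty();
        }
        return Optional.of(epochFromResponse);
    }
}
```

Because an empty epoch is simply omitted from requests, a client talking to an older broker falls back to the behavior it had before epoch validation existed, rather than repeatedly sending an epoch the leader will reject.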

Also, note that the correct epoch is still forwarded to replicas in the
LeaderAndIsr request, so this bug does not affect replication.

This is a cherry-pick of 5d0cb14 to the 2.4 branch with the changes necessary to
make it work. Since the changes were not trivial, I submitted a pull request. Jason
remains the author of the change.

@ijuma ijuma requested a review from hachikuji December 9, 2019 02:23

@ijuma ijuma Dec 9, 2019

Member Author

This part needs to be re-reviewed. The rest applied cleanly, although I had to add some TestUtils methods for the code to compile.

ijuma commented Dec 9, 2019

Member Author

cc @omkreddy

…uring reassignment (apache#7795)

Reviewers: Jun Rao <junrao@gmail.com>, Stanislav Kozlovski <stanislav_kozlovski@outlook.com>, Ismael Juma <ismael@juma.me.uk>
resource.close()
}
}

Member Author

All the methods below were added so that the new test would compile. They were added to trunk via 6c0aed2#diff-db06f61adab5c7665b1cc0369eaac846R1630, but it made more sense to copy the relevant TestUtils methods than to cherry-pick the commit.

ijuma commented Dec 9, 2019

Member Author

Build is green, so we just need a review before merging this.

@hachikuji hachikuji left a comment

Contributor

LGTM

@hachikuji hachikuji merged commit 522b517 into apache:2.4 Dec 9, 2019