KAFKA-17912: Align string representations of SharePartitionKey#17656
Conversation
mumrah
left a comment
There was a problem hiding this comment.
Thanks for the quick fix on this @AndrewJSchofield 👍
|
|
||
| private TopicPartition topicPartitionFor(SharePartitionKey key) { | ||
| return new TopicPartition(Topic.SHARE_GROUP_STATE_TOPIC_NAME, partitionFor(key.toString())); | ||
| return new TopicPartition(Topic.SHARE_GROUP_STATE_TOPIC_NAME, partitionFor(key.asCoordinatorKey())); |
There was a problem hiding this comment.
@AndrewJSchofield could we move away from using String keys to avoid issues like this in the future? IOW, could we refactor partitionFor to accept a SharePartitionKey?
There was a problem hiding this comment.
I created https://issues.apache.org/jira/browse/KAFKA-17914 to cover this. Unfortunately, it's a little more work than just changing this class, and I think it would be better to prioritise getting this merged.
There was a problem hiding this comment.
@mumrah Changing partitionFor would be counter productive, IMO. This is because it will be invoked by the KafkaApis FIND_COORD API handler.
The find coordinator request comprises of the coordinator keys as Strings (as its a JSON defined RPC spec). The find coordinator handler will simply invoke partitionFor with the key specified in the request argument.
If we change partitionFor signature to accept the SharePartitionKey, we would not have enough information in FIND_COORD API handler to create a SharePartitionKey object and call partitionFor unless we resort to splitting and formatting string in the request and creating the required object and pass to partitionFor.
There was a problem hiding this comment.
@smjn I don't quite agree. I understand that the FindCoordinator RPC includes a string group-id:topic-id:partition. We could transform that into a SharePartitionKey on receipt, and then ShareCoordinator.partitionFor could take a SharePartitionKey as its parameter.
One other thing jumped out at me when I was debugging this. SharePartitionKey is a combination of a group ID and a TopicIdPartition. The topic name in the TopicIdPartition was null because it was constructed when no topic name was available. While a partition can be identified by just a topic ID and a partition, it does strike me that we need to be super-careful that we don't have a topic name sometimes, and no topic name at others, and introduce a bug due to this difference. One thing we could do for instance is ensure that the TopicIdPartition in a SharePartitionKey always has a null topic name. Maybe it doesn't make sense to use TopicIdPartition.toString() as part of the SharePartitionKey.toString() because the topic name is always null. Just wondering whether there's an easy win here in terms of defensive coding.
There was a problem hiding this comment.
We could transform that into a SharePartitionKey on receipt
Which would mean writing code to split and format strings - was trying to avoid it.
is a combination of a group ID and a TopicIdPartition
I did not want to do this. The initial impl of SPK clearly defined 3 fields -String groupId, Uuid topicId, int partition. In fact, the entire share-coordinator flow does not use the TopicIdPartition argument to the constructor at all. Not sure whether it is necessary for the SharePartition flow or was it done for aesthetic reasons.
There was a problem hiding this comment.
the entire share-coordinator flow does not use the TopicIdPartition argument to the constructor at all
IMO, we should use TopicIdPartition class itself as used elsewhere, not beacause of aesthetic reasons but it's a logical encapsulated entity.
There was a problem hiding this comment.
but it's a logical encapsulated entity.
Yes thats true but it might not be a good fit since topicName is always (or mostly) null.
There was a problem hiding this comment.
Yes thats true but it might not be a good fit since topicName is always (or mostly) null.
And why is that? For the calls from SharePartition we can fill up topicName as well. We do a metadata read to verify first.
There was a problem hiding this comment.
If you plan to use it then it should be fine. Otherwise, serves no purpose.
It can be argued that doing this causes additional problems, both with FIND_COORD RPC as well as well as persistence RPCs.
If we forgo the topicName from both the SharePartition and the ShareCoordinator side, we could directly invoke SPK.hashCode() method in the partitionFor method thus completely abstracting out the reliance on groupId:topicId:partitionId.
And as pointed out above we do not have access to the topic name in the above RPCs.
| @@ -539,7 +539,7 @@ public void onNewMetadataImage(MetadataImage newImage, MetadataDelta delta) { | |||
| } | |||
|
|
|||
| private TopicPartition topicPartitionFor(SharePartitionKey key) { | |||
There was a problem hiding this comment.
Can we add a unit test for this method?
There was a problem hiding this comment.
Is the string representation String.format("%s:%s:%d", groupId, topicId, partition) included by KIP-932? if not, maybe we should add it to the KIP and json file (FindCoordinatorRequest).
There was a problem hiding this comment.
I have added it to the JSON file. I'll make sure the KIP contains this too.
There was a problem hiding this comment.
|
Thanks @AndrewJSchofield for the quick fix. |
|
@AndrewJSchofield nice find. It seems like it could pass sometimes if we're lucky, as both string representations might end up mapping to the same partition. |
apoorvmittal10
left a comment
There was a problem hiding this comment.
Shall we also enable back the failing testShareGroups test?
chia7712
left a comment
There was a problem hiding this comment.
LGTM
I'll merge this to unblock the CI. Please feel free to leave more comments or open a JIRA for further improvements.
|
…#17656) Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
…#17656) Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
…#17656) Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
The recent test failure documented by #17649 seems to be related to the string representation of SharePartitionKey. This is a combination of group-id:topic-id:partition used by share groups to refer to a "share-partition" which is a particular share group's view of a topic-partition. The SharePartitionKey is used to find the share coordinator responsible for a share-partition.
It looks like some code was using a simple concatenation of the fields, while others was using toString which is human-readable. This mismatch seems to have caused requests to be sent to the wrong share coordinator, resulting in tests timing out.
This PR aligns all lookup-based uses of SharePartitionKey on the simple concatenation of its constituent parts.
Committer Checklist (excluded from commit message)