KAFKA-2373: Add Kafka-backed offset storage for Copycat.#202
Conversation
|
@gwenshap Sorry, this turned out a bit bigger than intended because some of the MockConsumer stuff was incomplete, but much more useful than trying to use an EasyMock object. One issue with the current patch is that it is only unit tested. Two additional JIRAs (2374, 2375) address other components necessary for the full distributed version. Until then, it doesn't make much sense to have more extensive tests. I did, however, manually verify by changing the Worker to use this implementation then running the system tests using that version. Not sure if we want something more intermediate until the other two patches are in place or if we should just leave it to the last one to integrate them all and test them end-to-end. |
|
kafka-trunk-git-pr #386 SUCCESS |
|
kafka-trunk-git-pr #388 SUCCESS |
There was a problem hiding this comment.
Does this mean you manually assign this consumer to read all partitions? So the consumer group id doesn't matter or is not used? I couldn't see where a consumer group was being set.
That means that each instance of this code consumes the entire topic, right? Which is exactly what you want.
I ask because I have many use cases where I want a consumer to get all partitions. We currently do it by trying to create unique consumer group ids, but that is kind of annoying.
There was a problem hiding this comment.
Yes, we are using this in simple consumer mode. I initially started with the approach you're describing. I didn't actually want a consumer group since each consumer reads the entire topics, we don't need offset commits, etc. However, one drawback with this approach is that it doesn't automatically pick up changes in the # of partitions in the topic.
However, I think this shouldn't be a problem anyway because if you wan to use a compacted topic for this (which should be reasonable), you can't just change the # of partitions since it'll break the key -> partition mapping.
There was a problem hiding this comment.
I agree that this seems much nicer.
You said "simple consumer mode". So this is the New Consumer in simple consumer mode, correct?
Will the New Consumer automatically handle rebalances due to leader failover (broker failure where the partition leader changes), even when in simple consumer mode? Because that was a downside to the previous Simple Consumer -- you had to handle leadership changes on your own.
There was a problem hiding this comment.
In the new consumer, if you specify what to consume by using assign(), it consumes from those topic partitions and doesn't use any of the group management features. This only makes sense if you want a single consumer instance to see everything. Everything related to which broker you need to be fetching data for should be handled for you. If you use subscribe(), then you join your consumer group and are assigned a subset of the topic partitions by the consumer coordinator. All the consumer group rebalancing is handled automatically in that case.
|
@gwenshap Updated with some issues after some updates to trunk. Also added some basic system tests. They are very close to the standalone tests for now. As we add the other components we can improve them to do a better job of sanity checking the distributed mode. Also, if you can keep a careful eye on the MockConsumer/MockConsumerTest that'd be helpful. I noticed what appeared to be an error with how it was reporting offsets, and I want to make sure we get those semantics right. MockConsumer isn't much use if it doesn't actually match KafkaConsumer behavior.... |
There was a problem hiding this comment.
I think your IDE did this automatically and we normally stick to the no wildcard rule?
…fkaOffsetBackingStore.
b3eacbe to
04bcc1c
Compare
|
kafka-trunk-git-pr #496 FAILURE |
|
Looks like there are some conflicts now :( Mind rebasing again? LGTM otherwise. |
No description provided.