Describe the problem
With an auto-rebalancing Kafka cluster (e.g. Redpanda 21.8.1), the following retryable changefeed errors show up quite often, which makes the changefeeds hard to monitor:
date,Pod Name,Service,message
2021-08-30T10:13:50.419Z,syd-crdb-cockroachdb-1,cockroach,"[n17,job=‹688920172251709457›] 1321287 CHANGEFEED job ‹688920172251709457› encountered retryable error: ‹retryable changefeed error›: ‹kafka server: Tried to send a message to a replica that is not the leader for some partition. Your metadata is out of date.›"
2021-08-30T10:09:25.361Z,syd-crdb-cockroachdb-1,cockroach,"[n17,job=‹688920172251709457›] 1320407 CHANGEFEED job ‹688920172251709457› encountered retryable error: ‹retryable changefeed error›: ‹kafka server: Tried to send a message to a replica that is not the leader for some partition. Your metadata is out of date.›"
2021-08-30T10:07:18.331Z,syd-crdb-cockroachdb-1,cockroach,"[n17,job=‹688920172251709457›] 1319933 CHANGEFEED job ‹688920172251709457› encountered retryable error: ‹retryable changefeed error›: ‹kafka server: Tried to send a message to a replica that is not the leader for some partition. Your metadata is out of date.›"
The situation is the following:
- Kafka clusters with leadership auto rebalancing automatically even out the number of partition leaders on each node
- That causes clients to periodically hold stale metadata, which the cluster nudges them to update
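To illustrate the two points above, here is a toy simulation of a leadership move followed by a client-side metadata refresh. It is only a sketch: `Cluster`, `Producer`, and `NotLeaderError` are hypothetical names made up for this example, and it does not reflect how CockroachDB or any real Kafka client is implemented.

```python
class NotLeaderError(Exception):
    """Raised by a broker that is not the current leader for the partition."""


class Cluster:
    """Toy cluster (hypothetical): tracks which broker leads each partition."""

    def __init__(self, leaders):
        self.leaders = dict(leaders)  # partition -> broker id

    def rebalance(self, partition, new_leader):
        # Stands in for automatic leadership rebalancing.
        self.leaders[partition] = new_leader

    def send(self, broker, partition, message):
        if self.leaders[partition] != broker:
            raise NotLeaderError(
                f"broker {broker} is not the leader for partition {partition}"
            )
        return f"partition {partition} <- {message!r} via broker {broker}"


class Producer:
    """Toy client (hypothetical) that caches leader metadata."""

    def __init__(self, cluster):
        self.cluster = cluster
        self.metadata = dict(cluster.leaders)  # cached snapshot, can go stale
        self.refreshes = 0

    def refresh_metadata(self):
        self.metadata = dict(self.cluster.leaders)
        self.refreshes += 1

    def send(self, partition, message, max_retries=3):
        for _ in range(max_retries + 1):
            try:
                broker = self.metadata[partition]
                return self.cluster.send(broker, partition, message)
            except NotLeaderError:
                # Stale metadata: refresh and try again.
                self.refresh_metadata()
        raise RuntimeError("leader still unknown after retries")


cluster = Cluster({0: 1})
producer = Producer(cluster)
cluster.rebalance(0, 2)             # leadership moves; the client's cache is now stale
print(producer.send(0, "hello"))    # first attempt fails, refresh, second attempt succeeds
```

In this sketch the send succeeds after a single refresh, so from the user's point of view nothing went wrong.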
This is expected behaviour that is not really actionable from the user's end. As such, it shouldn't really be an error; ideally it would be relegated to something akin to a warning.
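As a stopgap on the monitoring side, something like the following could demote these particular messages to warnings. This is a rough sketch, not a CockroachDB feature: `severity_for` and `TRANSIENT_KAFKA_MARKERS` are hypothetical names, and the matching relies only on the message text from the logs above.

```python
import logging

# Substrings taken from the log lines in this report; extend as needed.
TRANSIENT_KAFKA_MARKERS = (
    "Tried to send a message to a replica that is not the leader for some partition",
)


def severity_for(message: str) -> int:
    """Return logging.WARNING for known-transient retryable errors, else logging.ERROR."""
    is_retryable = "encountered retryable error" in message
    is_transient = any(marker in message for marker in TRANSIENT_KAFKA_MARKERS)
    if is_retryable and is_transient:
        return logging.WARNING
    return logging.ERROR


line = (
    "CHANGEFEED job 688920172251709457 encountered retryable error: "
    "retryable changefeed error: kafka server: Tried to send a message to a "
    "replica that is not the leader for some partition. Your metadata is out of date."
)
print(logging.getLevelName(severity_for(line)))
```

Anything that does not match both conditions stays at error level, so unrelated changefeed failures would still alert.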
Environment:
- CockroachDB 21.1.1
- Redpanda 21.8.1
Thanks for a great product ❤️