Description
Search before asking
- I searched in the issues and found nothing similar.
Version
Pulsar broker: 2.8.4
Java Pulsar client: 2.8.4
Minimal reproduce step
Non-partitioned topic. Batching is disabled on both producer and consumer. No acknowledge timeout. 5 subscriptions, each with 12 consumers.
One consumer of one subscription fails to process a message and doesn't ack it.
After a failure, I give the consumer one more minute to process other messages and ack those it processes successfully. After that minute, I recreate the consumer and try to reprocess the unacked messages, which would help if the error were transient.
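As a rough sketch of the retry timing in the last step: the class below models only the "one more minute, then recreate" decision and does not call the Pulsar client. The name `ConsumerGracePolicy` and its methods are hypothetical, introduced for illustration, and are not taken from the application code.

```java
import java.time.Duration;

// Hypothetical model of the retry timing described above.
// It only tracks time; closing and re-subscribing the consumer is left to the caller.
final class ConsumerGracePolicy {
    private final long graceMillis;
    private long firstFailureAtMillis = -1; // -1 means no failure is pending

    ConsumerGracePolicy(Duration grace) {
        this.graceMillis = grace.toMillis();
    }

    // Record a processing failure; only the first failure starts the grace period.
    void onFailure(long nowMillis) {
        if (firstFailureAtMillis < 0) {
            firstFailureAtMillis = nowMillis;
        }
    }

    // True once the grace period has elapsed since the first recorded failure.
    boolean shouldRecreate(long nowMillis) {
        return firstFailureAtMillis >= 0
                && nowMillis - firstFailureAtMillis >= graceMillis;
    }

    // Call after the consumer has been closed and subscribed again.
    void onRecreated() {
        firstFailureAtMillis = -1;
    }

    public static void main(String[] args) {
        ConsumerGracePolicy policy = new ConsumerGracePolicy(Duration.ofMinutes(1));
        policy.onFailure(0L);
        System.out.println(policy.shouldRecreate(30_000L)); // prints false
        System.out.println(policy.shouldRecreate(60_000L)); // prints true
    }
}
```

During the grace period the consumer keeps receiving and acking other messages; only when `shouldRecreate` returns true is it closed and created again.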
What did you expect to see?
I expected to see the subscription backlog consumed further by the consumer with 1 failed message and by the other 11 consumers.
What did you see instead?
If a consumer fails to process one message, processing of all other messages with other keys also stalls, including on the other 11 consumers of the subscription.
All the other subscriptions and their consumers of the topic continue processing as expected.
As a symptom, the stuck subscription has "waitingReadOp" : false and "subscriptionHavePendingRead" : false, while the other subscriptions have both fields set to true.
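The symptom above could be checked mechanically. The sketch below is a heuristic under assumptions: it takes the two quoted fields per subscription, already extracted from the stats output, and flags subscriptions where both are false. `StuckSubscriptionCheck` and `CursorFlags` are hypothetical names and not part of Pulsar. Both flags being false may also occur briefly on a healthy subscription, so a real check would probably sample several times.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Pulsar: flags subscriptions whose
// cursor shows neither a waiting read nor a pending read.
final class StuckSubscriptionCheck {

    // Minimal stand-in for the two fields quoted in the report.
    static final class CursorFlags {
        final boolean waitingReadOp;
        final boolean subscriptionHavePendingRead;

        CursorFlags(boolean waitingReadOp, boolean subscriptionHavePendingRead) {
            this.waitingReadOp = waitingReadOp;
            this.subscriptionHavePendingRead = subscriptionHavePendingRead;
        }
    }

    // Returns the names of subscriptions where both flags are false.
    static List<String> findSuspects(Map<String, CursorFlags> cursors) {
        List<String> suspects = new ArrayList<>();
        for (Map.Entry<String, CursorFlags> entry : cursors.entrySet()) {
            CursorFlags flags = entry.getValue();
            if (!flags.waitingReadOp && !flags.subscriptionHavePendingRead) {
                suspects.add(entry.getKey());
            }
        }
        return suspects;
    }

    public static void main(String[] args) {
        Map<String, CursorFlags> cursors = new LinkedHashMap<>();
        cursors.put("sub-stuck", new CursorFlags(false, false));
        cursors.put("sub-healthy", new CursorFlags(true, true));
        System.out.println(findSuspects(cursors)); // prints [sub-stuck]
    }
}
```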
Anything else?
The message rate is about 50 messages per second. The same scenario at a few (1 to 5) messages per minute works as expected, so I suspect a race condition.
Are you willing to submit a PR?
- I'm willing to submit a PR!