Skip to content

KAFKA-19585: Avoid noisy NPE logs when closing consumer after constructor failures#20491

Merged
lianetm merged 3 commits into
apache:trunkfrom
gensericghiro:genseric/kafka-19585
Sep 8, 2025
Merged

KAFKA-19585: Avoid noisy NPE logs when closing consumer after constructor failures#20491
lianetm merged 3 commits into
apache:trunkfrom
gensericghiro:genseric/kafka-19585

Conversation

@gensericghiro

@gensericghiro gensericghiro commented Sep 5, 2025

Copy link
Copy Markdown
Contributor

If there's a failure in the kafka consumer constructor, we attempt to
close it
https://github.com/lianetm/kafka/blob/2329def2ff9ca4f7b9426af159b6fa19a839dc4d/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AsyncKafkaConsumer.java#L540
In that case, it could be the case that some components may have not
been created, so we should consider some null checks to avoid noisy logs
about NPE.

This noisy logs have been reported with the console share consumer in a
similar scenario, so this task is to review and do a similar fix for the
Async if needed.

The fix is to check if handlers/invokers are null before trying to close
them. Similar to what was done here
#20290

Reviewers: TengYao Chi frankvicky@apache.org, Lianet Magrans
lmagrans@confluent.io

@github-actions github-actions Bot added triage PRs from the community consumer clients small Small PRs labels Sep 5, 2025
@lianetm lianetm added ci-approved and removed triage PRs from the community labels Sep 5, 2025

@lianetm lianetm left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patch! Just a minor comment for consideration

@Test
public void testFailConstructor() {
final Properties props = requiredConsumerConfig();
props.put(ConsumerConfig.GROUP_ID_CONFIG, "group-id");

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, trying to understand why we needed this, I realized that it was probably to make sure there is a non-null groupMetadata, so you can ensure that the test pass because it hits they newly added null check on appEventHandler, and not because of the existing early return on groupMetadata.isEmpty, correct? But then I realized that the groupMetadata creation happens after the metrics anyways, so it will end up always being null for this test anyways, right? (meaning I expect this test would pass, no noisy NPE even without this PR)

That being said, I think the sanity checks make sense, and the test will surely help catch any regression (a simple change or order in the actions of the constructor would make the noisy NPEs appear I expect).

So no changes I would say, just thinking out loud here and sharing for the record.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

groupMetadata is always initialized (ref), so my understanding was what you said, i.e. "you can ensure that the test pass because it hits they newly added null check on appEventHandler, and not because of the existing early return on groupMetadata.isEmpty".

creation happens after the metrics anyways, so it will end up always being null for this test anyways, right?

Good point. The only place we could add something after the initialization of groupMetadata but before the initialization of the other handlers/invokers that are checked for nullity is in RequestManagers.supplier, but that doesn't throw any exception. Now I'm wondering how these NPEs were obtained in the first place 🤔? Only place groupMetadata isn't checked for emptiness is in awaitPendingAsyncCommitsAndExecuteCommitCallbacks, and lastPendingAsyncCommit is always null unless there's already been an asyncCommit, which I assume is impossible if the consumer isn't yet built?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have just dived into the constructor and found that reproducing the noisy close scenario is pretty tricky.
In the context of Kafka, most cases will return early due to the groupMetadata.isEmpty.
I suppose the scenario might only occur at the JVM level, such as an OOM during construction.

@lianetm lianetm left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @gensericghiro! LGTM

@lianetm lianetm merged commit 872647f into apache:trunk Sep 8, 2025
21 checks passed
props.put(ConsumerConfig.METRIC_REPORTER_CLASSES_CONFIG, "an.invalid.class");
final ConsumerConfig config = new ConsumerConfig(props);

try (LogCaptureAppender appender = LogCaptureAppender.createAndRegister()) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that code was copied from ShareConsumerImplTest#testFailConstructor with some improvements -- for example, the appender now gets closed :)

@gensericghiro Do you have a free cycle to file a MINOR patch to fix it in ShareConsumerImplTest#testFailConstructor?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, will do it

chia7712 pushed a commit that referenced this pull request Sep 9, 2025
…va::testFailConstructor (#20514)

Similarly to what was done for
AsyncKafkaConsumerTest::testFailConstructor,
[here](#20491)

Reviewers: Lianet Magrans <lmagrans@confluent.io>, Chia-Ping Tsai
 <chia7712@gmail.com>
eduwercamacaro pushed a commit to littlehorse-enterprises/kafka that referenced this pull request Nov 12, 2025
…ctor failures (apache#20491)

If there's a failure in the kafka consumer constructor, we attempt to
close it

https://github.com/lianetm/kafka/blob/2329def2ff9ca4f7b9426af159b6fa19a839dc4d/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AsyncKafkaConsumer.java#L540
In that case, it could be the case that some components may have not
been created, so we should consider some null checks to avoid noisy logs
about NPE.

This noisy logs have been reported with the console share consumer in a
similar scenario, so this task is to review and do a similar fix for the
Async if needed.

The fix is to check if handlers/invokers are null before trying to close
them. Similar to what was done here
apache#20290

Reviewers: TengYao Chi <frankvicky@apache.org>, Lianet Magrans
 <lmagrans@confluent.io>
eduwercamacaro pushed a commit to littlehorse-enterprises/kafka that referenced this pull request Nov 12, 2025
…va::testFailConstructor (apache#20514)

Similarly to what was done for
AsyncKafkaConsumerTest::testFailConstructor,
[here](apache#20491)

Reviewers: Lianet Magrans <lmagrans@confluent.io>, Chia-Ping Tsai
 <chia7712@gmail.com>
shashankhs11 pushed a commit to shashankhs11/kafka that referenced this pull request Dec 15, 2025
…ctor failures (apache#20491)

If there's a failure in the kafka consumer constructor, we attempt to
close it

https://github.com/lianetm/kafka/blob/2329def2ff9ca4f7b9426af159b6fa19a839dc4d/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AsyncKafkaConsumer.java#L540
In that case, it could be the case that some components may have not
been created, so we should consider some null checks to avoid noisy logs
about NPE.

This noisy logs have been reported with the console share consumer in a
similar scenario, so this task is to review and do a similar fix for the
Async if needed.

The fix is to check if handlers/invokers are null before trying to close
them. Similar to what was done here
apache#20290

Reviewers: TengYao Chi <frankvicky@apache.org>, Lianet Magrans
 <lmagrans@confluent.io>
shashankhs11 pushed a commit to shashankhs11/kafka that referenced this pull request Dec 15, 2025
…va::testFailConstructor (apache#20514)

Similarly to what was done for
AsyncKafkaConsumerTest::testFailConstructor,
[here](apache#20491)

Reviewers: Lianet Magrans <lmagrans@confluent.io>, Chia-Ping Tsai
 <chia7712@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants