KAFKA-19384: The passing of BrokerRegistrationRequestTest is a false positive#4
KAFKA-19384: The passing of BrokerRegistrationRequestTest is a false positive#4apalan60 wants to merge 13 commits into
Conversation
…ache#20315) The PR removes unnecessary updates for find next fetch offset. When the state is in transition and not yet completed then anyways respective offsets should not be considered for acquisition. The find next fetch offset is updated finally when transition is completed. Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Abhinav Dixit <adixit@confluent.io>
Now that Kafka support Java 17, this PR makes some changes in `trogdor` module. The changes mostly include: - Collections.emptyList(), Collections.singletonList() and Arrays.asList() are replaced with List.of() - Collections.emptyMap() and Collections.singletonMap() are replaced with Map.of() - Collections.singleton() is replaced with Set.of() Some minor cleanups around use of enhanced switch blocks and conversion of classes to record classes. Reviewers: Ken Huang <s7133700@gmail.com>, Vincent Jiang <vpotucek@me.com>, Chia-Ping Tsai <chia7712@gmail.com>
…timeout (apache#20310) Fixing max delivery check on acquisition lock timeout and write state RPC failure. When acquisition lock is already timed out and write state RPC failure occurs then we need to check if records need to be archived. However with the fix we do not persist the information, which is relevant as some records may be archived or delivery count is bumped. The information will be persisted eventually. The persister call has failed already hence issuing another persister call due to a failed persister call earlier is not correct. Rather let the data persist in future persister calls. Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Abhinav Dixit <adixit@confluent.io>
…_ENABLE_CONFIG (apache#20322) Document deprecation of PARTITIONER_ADPATIVE_PARTITIONING_ENABLE_CONFIG in `upgrade.html`, which was missed in apache#20317 Reviewers: Ken Huang <s7133700@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
…ose (apache#20290) *What* https://issues.apache.org/jira/browse/KAFKA-19572 - If a `ShareConsumer` constructor failed due to any exception, then we call `close()` in the catch block. - If there were uninitialized members accessed during `close()`, then it would throw a NPE. Currently there are no null checks, hence we were attempting to use these fields during `close()` execution. - To avoid this, PR adds null checks in the `close()` function before we access fields which possibly could be null. Reviewers: Apoorv Mittal <amittal@confluent.io>, Lianet Magrans <lmagrans@confluent.io>
Improve RLMM doc: 1. Distinguish RLMM configs from other tiered storage configs, all RLMM configs need to start with a specific prefix, but the original documentation miss description. 2. Added description of additional configs for client, which is required when configuring authentication information. This can confuse users, for example: Aiven-Open/tiered-storage-for-apache-kafka#681 Reviewers: Luke Chen <showuon@gmail.com>, TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
The previous URL http://lambda-architecture.net/ seems to now be controlled by spammers Co-authored-by: Shashank <hsshashank.grad@gmail.com> Reviewers: Mickael Maison <mickael.maison@gmail.com>
…r in delayedShareFetch tryComplete method (apache#20280) In the current implementation, some delayed share fetch operations get trapped in the delayed share fetch purgatory when the partition leaderships change during share consumption. This is because there is no check in code to make sure the current broker is still the partition leader corresponding to the share partitions. So, when leadership changes, the share partitions cannot be acquired, because they have already been fenced, and tryComplete returns false. Although the operatio does get completed when the timer expires for it, but it is too late by then, and the operation get stuck in the watchers list waiting for it to get purged when estimated operations increase to more than 1000. This Pr resolves this by adding the required check so that if partition leadership changes, then the delayed share fetches waiting on it gets completed instantaneously. Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield <aschofield@confluent.io>
Use Java 24 for the spotbugs checks, now that Spotbugs works on Java 24. Added some more warning exclusions for warnings that are new to 4.9.4. Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
…N format (apache#20336) This PR do following: 1. Use correct json format to test. 2. make `PartitionReassignmentState` and `VerifyAssignmentResult.java` become record Reviewers: TengYao Chi <frankvicky@apache.org>, Ken Huang <s7133700@gmail.com>, PoAn Yang <payang@apache.org>
There was a problem hiding this comment.
Summary of Changes
Hello @apalan60, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request primarily addresses a false positive in the BrokerRegistrationRequestTest by migrating the test from Scala to Java and enhancing its validation logic. It also includes several other improvements across the Kafka codebase, focusing on null-safety in consumer operations, refining share partition state management, and modernizing various utility and test classes by adopting newer Java language features.
Highlights
- Fixing False Positive Test: The
BrokerRegistrationRequestTestwas refactored from Scala to Java, introducing more robust validation for broker registration scenarios, particularly concerning metadata version compatibility and ZK broker migrations, which previously led to a false positive test result. - Enhanced Consumer Stability: Added comprehensive null-safety checks within
ShareConsumerImplto preventNullPointerExceptions during consumer close operations and acknowledgement handling, improving the consumer's robustness. - Improved Share Partition State Management: Refined the logic for acquiring and acknowledging records in
SharePartition, ensuring correct state transitions and offset management, especially when dealing with acquisition lock timeouts and write state RPC failures. - Codebase Modernization: Updated various utility and test classes across the project to leverage modern Java features, such as records and enhanced collection factory methods, contributing to cleaner and more concise code.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
There was a problem hiding this comment.
Code Review
This pull request introduces a variety of improvements across the codebase. It fixes a false positive in BrokerRegistrationRequestTest by replacing the Scala test with a more accurate Java version. It also enhances robustness in the ShareConsumerImpl by adding defensive null checks, which are validated by a new test case. Several bug fixes and logical improvements have been made to the share consumer and delayed fetch handling, particularly around state transitions and leadership changes. Additionally, a significant portion of the trogdor and tools modules has been modernized, adopting Java records, switch expressions, and immutable collections, which improves code clarity and safety. The upgrade of the SpotBugs tool and removal of related workarounds is also a welcome maintenance update. Overall, these are solid changes that improve test quality, code robustness, and maintainability.
| public void testPartitionNumbers() { | ||
| List<Integer> partsANumbers = PARTSA.partitionNumbers(); | ||
| assertEquals(Integer.valueOf(0), partsANumbers.get(0)); | ||
| assertEquals(Integer.valueOf(1), partsANumbers.get(1)); | ||
| assertEquals(Integer.valueOf(2), partsANumbers.get(2)); | ||
| assertTrue(partsANumbers.contains(0)); | ||
| assertTrue(partsANumbers.contains(1)); | ||
| assertTrue(partsANumbers.contains(2)); | ||
| assertEquals(3, partsANumbers.size()); | ||
|
|
||
| List<Integer> partsBNumbers = PARTSB.partitionNumbers(); | ||
| assertEquals(Integer.valueOf(0), partsBNumbers.get(0)); | ||
| assertEquals(Integer.valueOf(1), partsBNumbers.get(1)); | ||
| assertTrue(partsBNumbers.contains(0)); | ||
| assertTrue(partsBNumbers.contains(1)); | ||
| assertEquals(2, partsBNumbers.size()); | ||
| } |
There was a problem hiding this comment.
The change from assertEquals(Integer.valueOf(0), partsANumbers.get(0)) to assertTrue(partsANumbers.contains(0)) is a good improvement. The original test was depending on the iteration order of a HashMap's key set, which is not guaranteed. The new assertion correctly checks for the presence of the partition numbers without being sensitive to their order, making the test more robust.
| return switch (state) { | ||
| case PENDING -> new TaskPending(spec); | ||
| case RUNNING -> new TaskRunning(spec, startedMs, getCombinedStatus()); | ||
| case STOPPING -> new TaskStopping(spec, startedMs, getCombinedStatus()); | ||
| case DONE -> new TaskDone(spec, startedMs, doneMs, error, cancelled, getCombinedStatus()); | ||
| }; |
| return switch (spec.action()) { | ||
| case CONNECT -> new ConnectStressor(spec); | ||
| case FETCH_METADATA -> new FetchMetadataStressor(spec); | ||
| }; |
No description provided.