KAFKA-8275; Take throttling into account when choosing least loaded node#6619

Merged: hachikuji merged 2 commits into apache:trunk from hachikuji:KAFKA-8275 on May 7, 2019

@hachikuji

Contributor

If a node is currently throttled, we should take it out of the running for leastLoadedNode. Additionally, current logic seems to favor connecting to new nodes rather than using existing connections which have one or more in flight requests. The javadoc is slightly vague about whether this is expected, but it seems not.
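The selection rule described above can be sketched roughly as follows. This is a simplified model, not the actual `NetworkClient` code: the `NodeState` class, its fields, and the `choose` method are hypothetical names introduced only for illustration. It assumes a throttled node is skipped entirely, an idle ready connection wins immediately, a ready connection with the fewest in-flight requests comes next, and a node that still needs a new connection is only a fallback.

```java
import java.util.Arrays;
import java.util.List;

// Simplified model of node selection; NOT the actual NetworkClient code.
class LeastLoadedSketch {

    // Hypothetical per-node view that a client might keep.
    static final class NodeState {
        final String id;
        final boolean connected;      // an established, usable connection exists
        final int inFlight;           // requests sent but not yet answered
        final long throttledUntilMs;  // no requests may be sent before this time

        NodeState(String id, boolean connected, int inFlight, long throttledUntilMs) {
            this.id = id;
            this.connected = connected;
            this.inFlight = inFlight;
            this.throttledUntilMs = throttledUntilMs;
        }

        boolean canSendAt(long nowMs) {
            return connected && nowMs >= throttledUntilMs;
        }
    }

    // Returns the id of the chosen node, or null if no node is usable.
    static String choose(List<NodeState> nodes, long nowMs, int offset) {
        NodeState bestReady = null;
        NodeState connectable = null;
        for (int i = 0; i < nodes.size(); i++) {
            NodeState node = nodes.get((offset + i) % nodes.size());
            if (node.canSendAt(nowMs)) {
                if (node.inFlight == 0) {
                    return node.id; // idle, unthrottled connection: best possible
                }
                if (bestReady == null || node.inFlight < bestReady.inFlight) {
                    bestReady = node; // busy but usable: keep the least loaded
                }
            } else if (!node.connected && connectable == null) {
                connectable = node; // would need a new connection: fallback only
            }
            // A connected but throttled node falls through and is never chosen.
        }
        if (bestReady != null) {
            return bestReady.id;
        }
        return connectable != null ? connectable.id : null;
    }

    public static void main(String[] args) {
        List<NodeState> nodes = Arrays.asList(
                new NodeState("0", true, 0, 5_000L),  // idle, but throttled until t=5000
                new NodeState("1", true, 3, 0L),      // busy, not throttled
                new NodeState("2", false, 0, 0L));    // no connection yet
        System.out.println(choose(nodes, 1_000L, 0)); // prints 1
        System.out.println(choose(nodes, 6_000L, 0)); // prints 0
    }
}
```

In this model, node 1 is preferred over node 2 at t=1000 even though it has three requests in flight, which matches the intent of reusing existing connections before opening new ones.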

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@hachikuji hachikuji requested a review from rajinisivaram April 22, 2019 22:22

@rajinisivaram rajinisivaram left a comment

Contributor

@hachikuji Thanks for the PR, LGTM

@hachikuji

Contributor Author

retest this please

@hachikuji hachikuji merged commit 8895122 into apache:trunk May 7, 2019
ijuma added a commit to ijuma/kafka that referenced this pull request May 8, 2019
…s-hashcode

* apache-github/trunk:
  KAFKA-8158: Add EntityType for Kafka RPC fields (apache#6503)
  MINOR: correctly parse version OffsetCommitResponse version < 3
  KAFKA-8284: enable static membership on KStream (apache#6673)
  KAFKA-8304: Fix registration of Connect REST extensions (apache#6651)
  KAFKA-8275; Take throttling into account when choosing least loaded node (apache#6619)
  KAFKA-3522: Interactive Queries must return timestamped stores (apache#6661)
  MINOR: MetricsIntegrationTest should set StreamsConfig.STATE_DIR_CONFIG (apache#6687)
  MINOR: Remove unused field in `ListenerConnectionQuota`
  KAFKA-8131; Move --version implementation into CommandLineUtils (apache#6481)
  KAFKA-8056; Use automatic RPC generation for FindCoordinator (apache#6408)
  MINOR: Remove workarounds for lz4-java bug affecting byte buffers (apache#6679)
  KAFKA-7455: Support JmxTool to connect to a secured RMI port. (apache#5968)
  MINOR: Document improvement (apache#6682)
  MINOR: Fix ThrottledReplicaListValidator doc error. (apache#6537)
  KAFKA-8306; Initialize log end offset accurately when start offset is non-zero (apache#6652)
pengxiaolong pushed a commit to pengxiaolong/kafka that referenced this pull request Jun 14, 2019
…ode (apache#6619)

If a node is currently throttled, we should take it out of the running for `leastLoadedNode`. Additionally, current logic seems to favor connecting to new nodes rather than using existing connections which have one or more in flight requests. The javadoc is slightly vague about whether this is expected, but it seems not.

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
@goalfull

goalfull commented Jun 13, 2024

The leastLoadedNode() function has a bug while a consumer process is starting up. sendMetadataRequest(), called by getTopicMetadataRequest(), uses a random node, which may be faulty because the client thread has not yet recorded the state of any node. This happened in my production environment when my consumer thread was restarting while one of the Kafka server nodes was dead.
@hachikuji What do you think?

I'm using the kafka-client-2.0.1.jar. I have checked the source code of higher versions and the issue still exists.
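The scenario reported here can be illustrated with a toy model. This is only a sketch of the situation as described in this comment, not code from the Kafka client and not a verified reproduction: `ColdStartSketch`, `pickAtColdStart`, and the broker names are hypothetical. It assumes that at startup the client has no connection state at all, so every node looks equally connectable and the starting offset alone decides which one is picked.

```java
import java.util.Arrays;
import java.util.List;

// Toy model of the reported cold-start scenario; not taken from the Kafka source.
class ColdStartSketch {

    // With no connection state recorded, every node looks equally usable,
    // so the starting offset alone determines the chosen node.
    static String pickAtColdStart(List<String> nodeIds, int offset) {
        return nodeIds.get(offset % nodeIds.size());
    }

    // Counts how many of the possible starting offsets land on the given node.
    static int offsetsSelecting(List<String> nodeIds, String target) {
        int hits = 0;
        for (int offset = 0; offset < nodeIds.size(); offset++) {
            if (pickAtColdStart(nodeIds, offset).equals(target)) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> brokers = Arrays.asList("broker-0", "broker-1", "broker-2");
        // Suppose broker-1 is down, but the client has not learned that yet.
        int hits = offsetsSelecting(brokers, "broker-1");
        System.out.println(hits + " of " + brokers.size() + " offsets pick the dead broker");
    }
}
```

Under these assumptions, one starting offset in three selects the unreachable broker for the first metadata request, which is the kind of outcome the comment describes.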

3 participants