[Bug] /metrics endpoint processes requests one-by-one and seems to queue up waiting requests infinitely, ignoring the request timeout #22477

@lhotari

Search before asking

  • I searched in the issues and found nothing similar.

Read release policy

  • I understand that unsupported versions don't get bug fixes. I will attempt to reproduce the issue on a supported version of Pulsar client and Pulsar broker.

Version

Applies to at least 3.0.x and 3.2.x versions of the Broker.

Minimal reproduce step

TBD. I believe this could be done with an HTTP performance testing tool such as k6 or wrk2. Will follow up later. Reproducing this might also require a large number of topics so that the metrics output is large.

The problem also reproduces with the metricsBufferResponse=true setting, which is expected to cache the results and allow concurrent requests.
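Until a k6 or wrk2 script is available, a rough probe along the following lines might help show whether concurrent requests are being serialized. This is only a sketch under assumptions: the class name MetricsProbe, the burst size of 50, the 30 second timeout, and the default URL http://localhost:8080/metrics are illustrative and not taken from a verified reproduction. If requests are handled one by one, the maximum latency should grow roughly in proportion to the burst size.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Pulsar: fires a burst of concurrent requests.
public class MetricsProbe {

    /** Outcome counts for one burst of concurrent requests. */
    public static final class Result {
        public final int ok;
        public final int failed;
        public final long maxLatencyMillis;

        Result(int ok, int failed, long maxLatencyMillis) {
            this.ok = ok;
            this.failed = failed;
            this.maxLatencyMillis = maxLatencyMillis;
        }

        @Override
        public String toString() {
            return "ok=" + ok + " failed=" + failed + " maxLatencyMillis=" + maxLatencyMillis;
        }
    }

    /** Builds a request that GETs the URI, discards the body, and returns the status code. */
    public static Callable<Integer> httpGet(HttpClient client, URI uri, Duration timeout) {
        HttpRequest request = HttpRequest.newBuilder(uri).timeout(timeout).GET().build();
        return () -> client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
    }

    /** Runs `concurrency` copies of the request in parallel and tallies the results. */
    public static Result burst(int concurrency, Callable<Integer> request) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(concurrency);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int i = 0; i < concurrency; i++) {
                tasks.add(() -> {
                    long start = System.nanoTime();
                    int status = request.call();
                    if (status != 200) {
                        throw new IllegalStateException("HTTP " + status);
                    }
                    return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
                });
            }
            int ok = 0;
            int failed = 0;
            long max = 0;
            // invokeAll blocks until every task has finished or thrown.
            for (Future<Long> f : pool.invokeAll(tasks)) {
                try {
                    max = Math.max(max, f.get());
                    ok++;
                } catch (ExecutionException e) {
                    failed++; // timeout, connection refused, or non-200 status
                }
            }
            return new Result(ok, failed, max);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumed default; pass the broker's /metrics URL as the first argument.
        URI uri = URI.create(args.length > 0 ? args[0] : "http://localhost:8080/metrics");
        HttpClient client = HttpClient.newHttpClient();
        System.out.println(burst(50, httpGet(client, uri, Duration.ofSeconds(30))));
    }
}
```

Running it with increasing burst sizes and comparing maxLatencyMillis against the latency of a single request would give a first indication. It is not a substitute for a proper load test.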

What did you expect to see?

It should be possible to call the /metrics endpoint concurrently. The broker should generate the metrics once and share the result across the concurrent requests. The expiration of the cached result should be configurable.
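To make the expected behavior concrete, here is a minimal sketch of a "generate once, share, expire after a configurable time" cache. It is not Pulsar's implementation; the class name SharedMetricsCache, the Supplier-based generator, and the injectable clock are all assumptions made for illustration. The first caller after expiry generates the payload on its own thread, callers that arrive while generation is in flight receive the same future, and later callers reuse the completed result until the TTL elapses.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical illustration only; names and structure are not taken from Pulsar.
public class SharedMetricsCache {
    private final Supplier<byte[]> generator; // produces the full metrics payload
    private final long ttlNanos;              // configurable expiration of the cached result
    private final LongSupplier nanoClock;     // injectable so expiry can be tested

    private CompletableFuture<byte[]> current; // guarded by "this"
    private long completedAtNanos;             // guarded by "this"

    public SharedMetricsCache(Supplier<byte[]> generator, Duration ttl, LongSupplier nanoClock) {
        this.generator = generator;
        this.ttlNanos = ttl.toNanos();
        this.nanoClock = nanoClock;
    }

    public CompletableFuture<byte[]> get() {
        CompletableFuture<byte[]> mine;
        synchronized (this) {
            if (current != null) {
                boolean inFlight = !current.isDone();
                boolean fresh = !current.isCompletedExceptionally()
                        && nanoClock.getAsLong() - completedAtNanos < ttlNanos;
                if (inFlight || fresh) {
                    return current; // share the in-flight or still-valid result
                }
            }
            mine = new CompletableFuture<>();
            current = mine;
        }
        // Generate outside the lock so concurrent callers are not blocked on it;
        // they obtain the same future and wait on that instead.
        try {
            byte[] result = generator.get();
            synchronized (this) {
                completedAtNanos = nanoClock.getAsLong();
            }
            mine.complete(result);
        } catch (Throwable t) {
            mine.completeExceptionally(t); // a failed result is never treated as fresh
        }
        return mine;
    }
}
```

In a servlet, the returned future would typically be consumed asynchronously with a per-request timeout, so that a slow generation does not pin request threads or let waiting requests pile up without bound.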

What did you see instead?

Requests queue up until a failure occurs because the connection limit is reached:
INFO org.eclipse.jetty.server.ConnectionLimit - Connection Limit(2048) reached for [ServerConnector@.........{HTTP/1.1, (http/1.1)}{0.0.0.0:8080}]

Anything else?

The problem is partially mitigated by enabling Gzip compression with #21667, but that doesn't address the root cause.
There has been a previous attempt to address performance problems in the /metrics endpoint with #14453.

Are you willing to submit a PR?

  • I'm willing to submit a PR!

Metadata

Labels

type/bug: The PR fixed a bug or issue reported a bug