[feat] Add topic stats and metrics for observing message replay behavior and Key_Shared filtering/blocking behavior #23205

@lhotari

Search before asking

  • I searched in the issues and found nothing similar.

Motivation

Currently, it's very challenging to investigate issues related to message replay ("message redelivery controller"). Some examples of this include:

Solution

Add topic stats and metrics for observing message replay and related Key_Shared filtering (hash blocking) behavior.

Specific Metrics to Consider

  1. Number of messages in redelivery (replay)
  2. For Key_Shared subscriptions: Ways to observe internal state related to blocked hashes
  3. Counter for delayed delivery messages being added to delivery (replay)
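To make the three candidate metrics above concrete, here is a minimal sketch of how a per-subscription stats holder might track them. Everything in it is an assumption made for illustration: the class `ReplayStatsSketch` and its method names are hypothetical and do not exist in Apache Pulsar, and the choice of gauges versus counters is only one option, since the issue leaves that to the detailed design phase.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch only: none of these names are part of Apache Pulsar.
final class ReplayStatsSketch {
    // Gauge-like value: messages currently held for redelivery (replay).
    private final AtomicLong messagesInReplay = new AtomicLong();
    // Gauge-like value: hashes currently blocked on a Key_Shared subscription.
    private final AtomicLong blockedHashes = new AtomicLong();
    // Monotonic counter: delayed-delivery messages that were added to replay.
    private final LongAdder delayedAddedToReplay = new LongAdder();

    // Called when messages enter the replay queue.
    void onAddedToReplay(int count, boolean fromDelayedDelivery) {
        messagesInReplay.addAndGet(count);
        if (fromDelayedDelivery) {
            delayedAddedToReplay.add(count);
        }
    }

    // Called when messages leave the replay queue (dispatched or acknowledged).
    void onRemovedFromReplay(int count) {
        messagesInReplay.addAndGet(-count);
    }

    void onHashBlocked() {
        blockedHashes.incrementAndGet();
    }

    void onHashUnblocked() {
        blockedHashes.decrementAndGet();
    }

    long messagesInReplay() {
        return messagesInReplay.get();
    }

    long blockedHashes() {
        return blockedHashes.get();
    }

    long delayedAddedToReplay() {
        return delayedAddedToReplay.sum();
    }
}
```

In this sketch the two gauge-like values can go up and down, while the delayed-delivery counter only increases, which would make it suitable for rate calculations in a monitoring tool.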

Implementation Requirements

  • It should be possible to detect replays in topic stats (or internal stats) and also in aggregated metrics
  • The aggregated metrics should be usable in monitoring tools (e.g., Grafana dashboards)
  • The specific types of metrics (counters, gauges) to be used will be determined in the detailed design phase

Expected Benefits

  • Improved observability for message replay and Key_Shared behavior
  • Easier troubleshooting of related issues
  • Enhanced monitoring capabilities for Pulsar clusters

Alternatives

No response

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!

Metadata

Labels

  • release/blocker: Indicate the PR or issue that should block the release until it gets resolved
  • type/enhancement: The enhancements for the existing features or docs. e.g. reduce memory usage of the delayed messages
