Adjust configmap to not scrape proxies from failed pods and completed jobs #14916

Merged
alpeb merged 3 commits into linkerd:main from bkittinger:fix/viz-scraping-ip-reuse on Mar 9, 2026

@bkittinger
Contributor

Problem

When a pod is assigned an IP address that was previously used by a pod now in a failed or completed state (for example, a pod that failed or a cronjob that ran to completion), viz attempts to scrape both pods.

Since both pods have the same IP address, this leads to duplicate metrics with different Kubernetes metadata.

Solution

The proposed solution is to add a relabel rule to the scraping config that keeps only pods whose phase is Running or Pending.
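
A minimal sketch of such a relabel rule, assuming the scrape job uses Prometheus' Kubernetes pod service discovery, which exposes the pod phase as the `__meta_kubernetes_pod_phase` label. The job name is hypothetical, and this is not necessarily the exact rule merged in this PR:

```yaml
scrape_configs:
- job_name: example-proxy-job   # hypothetical job name, for illustration only
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  # Keep only targets whose pod phase is Running or Pending.
  # Pods in the Succeeded, Failed or Unknown phase are dropped before scraping.
  - source_labels: [__meta_kubernetes_pod_phase]
    action: keep
    regex: Running|Pending
```

Because this relabeling is applied to discovered targets, a completed job pod that still lists a reused IP would no longer be scraped, so its metadata could not be attached to the running pod's metrics.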

Validation

  • Run a large number of jobs to completion in a cluster, aiming to exhaust the pod IP address space.
  • Run a pod exposing a port and ensure it gets assigned an IP address that is also held by one of the completed jobs.
  • Create a Server resource selecting the pod/port and ensure the default policy is set to deny.
  • Generate traffic from a non-meshed pod to the meshed pod hosting the service and observe the inbound_http_authz_deny_total metric in Prometheus.

Before the fix, Prometheus shows multiple metric entries for the same IP with different pod metadata. After the fix is applied, only the running pod is referenced.
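
For the Server step in the validation list, the resource might look like the sketch below. The name, namespace, label, and port are hypothetical, and the apiVersion is an assumption that depends on the installed Linkerd version:

```yaml
apiVersion: policy.linkerd.io/v1beta3   # assumption: use the version served by your cluster
kind: Server
metadata:
  name: demo-http        # hypothetical name
  namespace: demo        # hypothetical namespace
spec:
  podSelector:
    matchLabels:
      app: demo          # hypothetical label on the meshed pod
  port: 8080             # hypothetical port exposed by the pod
  proxyProtocol: HTTP/1
```

With a deny default policy and no authorization attached to this Server, requests from the non-meshed pod should be rejected, which is what increments inbound_http_authz_deny_total.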

Fixes #10562


Signed-off-by: Benedikt Kittinger <benedikt.kittinger@post.at>
@bkittinger requested a review from a team as a code owner February 10, 2026 10:13
@alpeb self-requested a review March 5, 2026 17:25
Member

@alpeb left a comment

Thanks @bkittinger, this looks good to me. I've pushed an additional commit to take care of the test fixture changes.

@alpeb merged commit a913c05 into linkerd:main Mar 9, 2026
20 checks passed
@bkittinger deleted the fix/viz-scraping-ip-reuse branch March 10, 2026 11:07
@bkittinger restored the fix/viz-scraping-ip-reuse branch March 10, 2026 11:10

Development

Successfully merging this pull request may close these issues.

linkerd-proxy metrics annotated with the wrong pod information

2 participants