node: metrics for alignment failures #129950

Merged
k8s-ci-robot merged 1 commit into kubernetes:master from ffromani:alignment-error-detail-metrics
Mar 13, 2025

@ffromani
Contributor

@ffromani ffromani commented Feb 3, 2025

What type of PR is this?

/kind cleanup
/kind feature

What this PR does / why we need it:

Add a metric that reports detailed alignment errors.

Which issue(s) this PR fixes:

Related to kubernetes/enhancements#5108

Special notes for your reviewer:

N/A

Does this PR introduce a user-facing change?

Add metrics to expose the main known reasons for resource alignment errors.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress, release-note, size/M, kind/cleanup, kind/feature, cncf-cla: yes, do-not-merge/needs-sig, needs-triage, and needs-priority labels Feb 3, 2025
@k8s-ci-robot k8s-ci-robot added the area/kubelet and sig/node labels and removed the do-not-merge/needs-sig label Feb 3, 2025
@ffromani ffromani force-pushed the alignment-error-detail-metrics branch from 5b3d643 to 224918a February 25, 2025 15:45
@ffromani ffromani changed the title from "WIP: node: metrics for alignment failures" to "node: metrics for alignment failures" Feb 25, 2025
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress label Feb 25, 2025
@ffromani
Contributor Author

This still lacks e2e test coverage; everything else is ready for review.

Contributor

@swatisehgal swatisehgal left a comment

Overall this looks good. Thanks for your work on this!

Can we add some e2e tests to ensure that the added metric is populated and updated as expected?

Add metrics to report alignment allocation failures
See: kubernetes/enhancements#5108

Signed-off-by: Francesco Romani <fromani@redhat.com>
@ffromani ffromani force-pushed the alignment-error-detail-metrics branch from 224918a to 04129d1 March 4, 2025 18:50
@k8s-ci-robot k8s-ci-robot added the area/test and sig/testing labels Mar 4, 2025
@ffromani
Contributor Author

ffromani commented Mar 4, 2025

/test pull-kubernetes-node-kubelet-serial-containerd
/test pull-kubernetes-node-kubelet-serial-containerd-sidecar-containers
/test pull-kubernetes-node-kubelet-serial-cpu-manager
/test pull-kubernetes-node-kubelet-serial-hugepages
/test pull-kubernetes-node-kubelet-serial-memory-manager
/test pull-kubernetes-node-kubelet-serial-topology-manager

@ffromani
Contributor Author

ffromani commented Mar 5, 2025

/retest

@SergeyKanzhelev SergeyKanzhelev moved this from Triage to Archive-it in SIG Node CI/Test Board Mar 5, 2025
Contributor

@swatisehgal swatisehgal left a comment

/lgtm

/hold
to prevent an inadvertent merge and to allow time in case other reviewers want to add their input

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold label Mar 7, 2025
@k8s-ci-robot k8s-ci-robot added the lgtm label Mar 7, 2025
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 0da312b304768623ab0d7e16c094d97f606baf76

@swatisehgal
Contributor

/unhold
Ah, I just realized that we still need approval on the metrics, so I'm removing the hold.

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold label Mar 7, 2025
@swatisehgal
Contributor

/triage accepted
/priority important-longterm

@k8s-ci-robot k8s-ci-robot added the triage/accepted and priority/important-longterm labels and removed the needs-triage and needs-priority labels Mar 10, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ffromani, mrunalp

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved label Mar 12, 2025
@k8s-ci-robot k8s-ci-robot merged commit 05bfdbc into kubernetes:master Mar 13, 2025
@k8s-ci-robot k8s-ci-robot added this to the v1.33 milestone Mar 13, 2025
@github-project-automation github-project-automation Bot moved this from Work in progress to Done in SIG Node: code and documentation PRs Mar 13, 2025
@github-project-automation github-project-automation Bot moved this from Archive-it to Done in SIG Node CI/Test Board Mar 13, 2025
Labels

approved, area/kubelet, area/test, cncf-cla: yes, kind/cleanup, kind/feature, lgtm, priority/important-longterm, release-note, sig/node, sig/testing, size/M, triage/accepted

Projects

Archived in project

5 participants