Fix potential double-locking of RWMutex in device manager by gzb1128 · Pull Request #136660 · kubernetes/kubernetes

gzb1128 · 2026-01-31T07:29:46Z

The podDevices() function was calling containerDevices() while
holding the read lock, but containerDevices() also attempts to
acquire the same lock. This could cause a deadlock when a writer
tries to acquire the lock between the two RLock() calls.

Introduce containerDevicesLocked() that expects the caller to
already hold the lock, and use it from podDevices() to avoid
nested locking.

Related to #127826

NONE

k8s-ci-robot · 2026-01-31T07:29:55Z

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot · 2026-01-31T07:29:55Z

Hi @gzb1128. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot · 2026-01-31T07:30:01Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: gzb1128
Once this PR has been reviewed and has the lgtm label, please assign ffromani for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

pkg/kubelet/cm/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

gzb1128 · 2026-01-31T07:32:19Z

/kind bug

gzb1128 · 2026-01-31T07:32:35Z

/priority important-soon

The podDevices() function was calling containerDevices() while holding the read lock, but containerDevices() also attempts to acquire the same lock. This could cause a deadlock when a writer tries to acquire the lock between the two RLock() calls.

Introduce containerDevicesLocked() that expects the caller to already hold the lock, and use it from podDevices() to avoid nested locking.

Signed-off-by: gzb1128 <591605936@qq.com>

gzb1128 · 2026-01-31T07:36:15Z

/kind bug

gzb1128 · 2026-01-31T07:42:48Z

@ffromani from the original PR #136235. We noticed the original PR has been stagnant for 2 weeks, so we created this new PR to keep the fix moving forward. Would appreciate your review.

Add tests to verify the correctness of the RWMutex double-locking fix:

- TestPodDevices: Verify multi-container device aggregation
- TestContainerDevices: Verify basic functionality
- TestPodDevicesConcurrentAccess: Verify concurrent read-write safety

The concurrent test uses 10 readers and 5 writers competing for the lock to ensure no deadlock occurs under the fixed implementation.

Run with -race flag to detect data races: go test -race ./...

ffromani · 2026-01-31T08:18:23Z

> @ffromani from the original PR #136235. We noticed the original PR has been stagnant for 2 weeks, so we created this new PR to keep the fix moving forward. Would appreciate your review.

Thanks, but 2 weeks is pretty normal (trending toward the lower end) for high-churn open-source projects. Let's see how we can move forward.

gzb1128 · 2026-02-12T07:40:14Z

@ffromani

Thank you for the response. You're right - 2 weeks is indeed normal for such a large project, and I apologize for being impatient. I should have waited for the original PR #136235 to progress.

The code and tests are ready in this PR. However, I fully respect your decision as the reviewer and the original PR author's work. If you prefer to continue with the original PR, I'm happy to close this one. Please let me know how you'd like to proceed.

k8s-ci-robot added the needs-priority label Jan 31, 2026

k8s-ci-robot added the cncf-cla: yes, area/kubelet, and sig/node labels Jan 31, 2026

github-project-automation bot added this to SIG Node: code and documentation PRs Jan 31, 2026

github-project-automation bot moved this to Triage in SIG Node: code and documentation PRs Jan 31, 2026

k8s-ci-robot removed the do-not-merge/needs-sig label Jan 31, 2026

k8s-ci-robot requested review from klueska and yujuhong January 31, 2026 07:30

k8s-ci-robot added the kind/bug label and removed the do-not-merge/needs-kind label Jan 31, 2026

k8s-ci-robot added the priority/important-soon label and removed the needs-priority label Jan 31, 2026

gzb1128 force-pushed the fix-device-manager-double-locking branch from ca5149f to e43bd9c January 31, 2026 07:33

gzb1128 mentioned this pull request Jan 31, 2026

Fix potential double-locking of RWMutex in device manager #136235

Open

k8s-ci-robot added the size/L label and removed the size/S label Jan 31, 2026

gzb1128 wants to merge 2 commits into kubernetes:master from gzb1128:fix-device-manager-double-locking


3 participants
