Fix potential double-locking of RWMutex in device manager#136235
ValeryCherneykin wants to merge 1 commit into kubernetes:master from
Conversation
Welcome @ValeryCherneykin!
This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Hi @ValeryCherneykin. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with

Once the patch is verified, the new status will be reflected by the

I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ValeryCherneykin

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files. Approvers can indicate their approval by writing
The podDevices() function was calling containerDevices() while holding the read lock, but containerDevices() also attempts to acquire the same lock. This could cause a deadlock when a writer tries to acquire the lock between the two RLock() calls. Introduce containerDevicesLocked() that expects the caller to already hold the lock, and use it from podDevices() to avoid nested locking.
1bd48b4 to fde201b
```go
type (
	resourceAllocateInfo map[string]deviceAllocateInfo   // Keyed by resourceName.
	containerDevices     map[string]resourceAllocateInfo // Keyed by containerName.

	podDevices struct {
		sync.RWMutex
		devs map[string]containerDevices // Keyed by podUID.
	}
)
```
unnecessary change, let's please keep the change minimal and focused (no style + functional changes bundled together)
@ValeryCherneykin: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
Since this PR has been inactive for 2 weeks, I've created PR #136660 to continue moving this fix forward. The changes are identical to the approved fix here, just rebased onto the latest master. |
What type of PR is this?
/kind bug
What this PR does / why we need it:
The `podDevices()` function was calling `containerDevices()` while holding the read lock, but `containerDevices()` also attempts to acquire the same lock. This could cause a deadlock when a writer tries to acquire the lock between the two `RLock()` calls.

Introduce `containerDevicesLocked()` that expects the caller to already hold the lock, and use it from `podDevices()` to avoid nested locking.

Which issue(s) this PR is related to:
Fixes #127826
Special notes for your reviewer:
The fix follows the standard Go pattern of having a `*Locked` variant of a function that expects the caller to hold the lock.

Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: