Fix potential double-locking of RWMutex in device manager (#136660)

gzb1128 wants to merge 2 commits into kubernetes:master
Conversation
This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the appropriate triage label.
Hi @gzb1128. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. Once the patch is verified, the new status will be reflected.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: gzb1128. Needs approval from an approver in each of these files.
/kind bug

/priority important-soon
The podDevices() function was calling containerDevices() while holding the read lock, but containerDevices() also attempts to acquire the same lock. This could cause a deadlock when a writer tries to acquire the lock between the two RLock() calls.

Introduce containerDevicesLocked() that expects the caller to already hold the lock, and use it from podDevices() to avoid nested locking.

Signed-off-by: gzb1128 <591605936@qq.com>
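The commit message above can be sketched in miniature. This is an illustrative reconstruction, not the actual kubelet device manager code: the type and method names here are hypothetical stand-ins for the pattern the PR describes. The hazard comes from Go's sync.RWMutex, which forbids recursive read locking when a writer may be waiting: the second RLock() queues behind the pending Lock(), which is itself waiting on the first read lock.

```go
package main

import (
	"fmt"
	"sync"
)

// deviceState is a hypothetical stand-in for the device manager's state.
type deviceState struct {
	sync.RWMutex
	devs map[string]map[string][]string // pod -> container -> device IDs
}

// containerDevices takes the read lock itself. Calling it from a caller
// that already holds the read lock risks deadlock: if a writer calls
// Lock() between the two RLock() calls, the second RLock() blocks
// behind the writer, which in turn blocks behind the first read lock.
func (d *deviceState) containerDevices(pod, ctr string) []string {
	d.RLock()
	defer d.RUnlock()
	return d.devs[pod][ctr]
}

// containerDevicesLocked assumes the caller already holds the lock.
func (d *deviceState) containerDevicesLocked(pod, ctr string) []string {
	return d.devs[pod][ctr]
}

// podDevices takes the read lock once and uses the *Locked variant,
// avoiding the nested RLock().
func (d *deviceState) podDevices(pod string) []string {
	d.RLock()
	defer d.RUnlock()
	var out []string
	for ctr := range d.devs[pod] {
		out = append(out, d.containerDevicesLocked(pod, ctr)...)
	}
	return out
}

func main() {
	d := &deviceState{devs: map[string]map[string][]string{
		"pod1": {"c1": {"dev0"}, "c2": {"dev1"}},
	}}
	fmt.Println(len(d.podDevices("pod1"))) // prints 2
}
```

The "Locked" suffix convention signals at the call site which methods expect the lock to already be held, so reviewers can audit lock discipline without tracing every call chain.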
Force-pushed from ca5149f to e43bd9c.
/kind bug
Add tests to verify the correctness of the RWMutex double-locking fix:

- TestPodDevices: verify multi-container device aggregation
- TestContainerDevices: verify basic functionality
- TestPodDevicesConcurrentAccess: verify concurrent read-write safety

The concurrent test uses 10 readers and 5 writers competing for the lock to ensure no deadlock occurs under the fixed implementation. Run with the -race flag to detect data races: go test -race ./...
Thanks, but 2 weeks is pretty normal (trending toward the lower end) for high-churn open-source projects. Let's see how we can move forward.
Thank you for the response. You're right - 2 weeks is indeed normal for such a large project, and I apologize for being impatient. I should have waited for the original PR #136235 to progress. The code and tests are ready in this PR. However, I fully respect your decision as the reviewer and the original PR author's work. If you prefer to continue with the original PR, I'm happy to close this one. Please let me know how you'd like to proceed.
Related to #127826