-
Notifications
You must be signed in to change notification settings - Fork 136
[linstor] Refactor node-level RWX validation #1856
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
📝 WalkthroughWalkthroughExtracts RWX block-attachment validation from the CSI driver into a new utils package, adds public APIs for validation and VM-name resolution, wires a CLI flag to disable validation, and updates ControllerPublishVolume to call the utils validation path. Tests for the new utils were added. Changes
Sequence DiagramsequenceDiagram
participant Client as ControllerPublishVolume
participant Driver as CSI Driver
participant Utils as utils.ValidateRWX...
participant KubeAPI as Kubernetes API
Client->>Driver: ControllerPublishVolume(volumeID, rwxBlock)
alt validation enabled & rwxBlock
Driver->>Utils: ValidateRWXBlockAttachment(ctx, kubeClient, log, volumeID)
Utils->>KubeAPI: Get PersistentVolume for volumeID (PV)
KubeAPI-->>Utils: PV (may reference pod)
alt PV references pod
Utils->>KubeAPI: Get Pod(namespace, name)
KubeAPI-->>Utils: Pod (unstructured)
Utils->>Utils: GetVMNameFromPod(pod)
Utils-->>Driver: VM name / validation result
else PV missing or not pod-backed
Utils-->>Driver: validation error or empty VM name
end
else validation disabled or not rwxBlock
Driver-->>Client: skip RWX validation
end
Driver-->>Client: continue or return error
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello @kvaps, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly refactors the node-level validation for ReadWriteMany (RWX) block volumes within the LINSTOR CSI driver. The core validation logic, which ensures that multiple pods using the same RWX block volume belong to the same KubeVirt VM, has been moved to a new utility package. This change enhances code organization and maintainability. Additionally, a new command-line flag is introduced, allowing administrators to disable this validation if needed, offering greater control over deployment configurations. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
c94e596 to
e6942d3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request effectively refactors the node-level RWX validation logic by moving it into a dedicated utils package, which improves code organization and modularity. The introduction of the disable-rwx-block-validation flag also adds useful flexibility. The overall changes are solid, and I have one suggestion to make the new validation logic even more robust.
packages/system/linstor/images/linstor-csi/patches/001-rwx-validation.diff
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
packages/system/linstor/images/linstor-csi/patches/001-rwx-validation.diff (2)
158-158: Consider using Debug level for this log statement.
Infolevel for every validation call will be noisy in production. This diagnostic message is better suited toDebuglevel, consistent with the similar log at line 303.Suggested change
- log.WithField("volumeID", volumeID).Info("validateRWXBlockAttachment called") + log.WithField("volumeID", volumeID).Debug("validateRWXBlockAttachment called")
236-247: Original error context is lost when GetVMNameFromPod fails.When
GetVMNameFromPodreturns an error (e.g., "failed to get owner pod"), the error is logged as a warning but the pod continues withvmName = "". If this later triggers the strict mode check (line 280-285), the user sees a generic "does not have the vm.kubevirt.io/name label" error, not the actual root cause.Consider propagating the underlying error for better debuggability:
Alternative: Propagate the actual error
vmName, err := GetVMNameFromPod(ctx, kubeClient, log, &item) if err != nil { - log.WithError(err).WithField("pod", item.GetName()).Warn("failed to get VM name from pod") - // Continue with empty vmName - will be caught by strict mode check - vmName = "" + return "", fmt.Errorf("failed to determine VM for pod %s: %w", item.GetName(), err) }Alternatively, keep the current soft failure behavior but enhance the strict mode error message to include more context.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
packages/system/linstor/images/linstor-csi/patches/001-rwx-validation.diff
🧰 Additional context used
📓 Path-based instructions (1)
packages/system/**
📄 CodeRabbit inference engine (AGENTS.md)
Use Helm Chart umbrella pattern with vendored upstream charts in
charts/directory
Files:
packages/system/linstor/images/linstor-csi/patches/001-rwx-validation.diff
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build
🔇 Additional comments (5)
packages/system/linstor/images/linstor-csi/patches/001-rwx-validation.diff (5)
41-51: LGTM!The new CLI flag follows the existing pattern and has a clear description. The conditional wiring to the driver options is correct.
73-97: LGTM!The option function follows the functional options pattern correctly with clear documentation. The validation integration in
ControllerPublishVolumeis well-placed - checking before the actual attach and returning an appropriateFailedPreconditionstatus on failure.
308-356: GetVMNameFromPod logic is well-structured.Clean handling of both virt-launcher pods (direct label lookup) and hotplug disk pods (owner reference traversal). The short-circuit on nil labels and the guard against nil/non-controller owner references are correct.
Minor observation: The function returns
("", nil)for pods without KubeVirt labels (line 356), which allows non-KubeVirt pods through when used in isolation. The caller handles this correctly via strict mode checks.
396-572: Comprehensive test coverage.Excellent table-driven tests covering:
- No pods / single pod / multiple pods scenarios
- Live migration (same VM, multiple pods)
- Cross-VM conflicts
- Strict mode (missing KubeVirt label)
- Pod phase filtering (Succeeded/Failed ignored)
- Namespace and PVC isolation
- Hotplug disk pod resolution
667-705: Hotplug disk pod test helper is well-constructed.The
ownerReferencesstructure correctly setscontroller: truewhich will be properly unmarshaled byGetOwnerReferences(). The helper matches the expected structure in production hotplug disk pods created by KubeVirt.
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
e6942d3 to
3c4f0cd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
packages/system/linstor/images/linstor-csi/patches/001-rwx-validation.diff (2)
196-209: Consider consistent error handling for claimRef extraction.Multiple errors from
NestedStringandNestedMapare silently ignored. While the fail-open design is intentional, suppressing errors without even logging them could make debugging harder.♻️ Suggested improvement
- claimRef, found, _ := unstructured.NestedMap(pv.Object, "spec", "claimRef") + claimRef, found, err := unstructured.NestedMap(pv.Object, "spec", "claimRef") + if err != nil { + log.WithError(err).Warn("cannot validate RWX attachment: failed to read claimRef") + return "", nil + } if !found { log.Warn("cannot validate RWX attachment: PV has no claimRef") return "", nil } - pvcName, _, _ := unstructured.NestedString(claimRef, "name") - pvcNamespace, _, _ := unstructured.NestedString(claimRef, "namespace") + pvcName, _, err := unstructured.NestedString(claimRef, "name") + if err != nil { + log.WithError(err).Warn("cannot validate RWX attachment: failed to read PVC name") + return "", nil + } + pvcNamespace, _, err := unstructured.NestedString(claimRef, "namespace") + if err != nil { + log.WithError(err).Warn("cannot validate RWX attachment: failed to read PVC namespace") + return "", nil + }
576-582: Test assertion could be more precise.The assertion
if len(tc.pods) > 0 { assert.NotEmpty(t, vmName) }checks if test pods were defined, not if matching pods were found. This works for current test cases but could give false positives if a test case defines pods that all fail to match (different namespace + different PVC).Consider adding an
expectedVMNamefield to the test cases for explicit verification.♻️ Suggested improvement
testCases := []struct { name string pods []*unstructured.Unstructured pvcName string namespace string expectError bool errorMsg string + expectedVM string // expected VM name, empty if no VM should be found }{ { name: "no pods using PVC", pods: []*unstructured.Unstructured{}, pvcName: "test-pvc", namespace: "default", expectError: false, + expectedVM: "", }, { name: "single pod using PVC", pods: []*unstructured.Unstructured{ createUnstructuredPod("pod1", "default", "test-pvc", map[string]string{KubeVirtVMLabel: "vm1"}, "Running"), }, pvcName: "test-pvc", namespace: "default", expectError: false, + expectedVM: "vm1", }, // ... add expectedVM to other casesThen in the assertion:
} else { assert.NoError(t, err) - // VM name is returned when there are pods using the volume - if len(tc.pods) > 0 { - assert.NotEmpty(t, vmName) - } + assert.Equal(t, tc.expectedVM, vmName) }
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
packages/system/linstor/images/linstor-csi/patches/001-rwx-validation.diff
🧰 Additional context used
📓 Path-based instructions (1)
packages/system/**
📄 CodeRabbit inference engine (AGENTS.md)
Use Helm Chart umbrella pattern with vendored upstream charts in
charts/directory
Files:
packages/system/linstor/images/linstor-csi/patches/001-rwx-validation.diff
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build
🔇 Additional comments (5)
packages/system/linstor/images/linstor-csi/patches/001-rwx-validation.diff (5)
41-51: LGTM!The new
disable-rwx-block-validationflag is well-integrated. The naming convention, default value (false = validation enabled), and conditional wiring follow the existing patterns in the codebase.
87-97: LGTM!The validation integration point is well-placed before the
Attachcall. The guard conditionrwxBlock && !d.disableRWXBlockValidationcorrectly ensures validation only runs for RWX block volumes when not explicitly disabled. The discarded VM name return value is acceptable since the function's purpose here is validation only.
321-370: LGTM!The
GetVMNameFromPodfunction correctly handles three distinct cases: direct virt-launcher pods, hotplug disk pods (via owner reference resolution), and non-KubeVirt pods. The differentiated error handling between hotplug pods (which must resolve to a VM) and regular pods (which can legitimately have no VM label) is appropriate.
679-718: LGTM!The
createHotplugDiskPodhelper correctly sets up the owner reference structure that allows the validation logic to resolve the VM name from the virt-launcher pod. The test cases for hotplug scenarios (both success and failure) provide good coverage.
151-157: Note: TOCTOU limitation is acceptable for this use case.The validation inherently has a time-of-check-to-time-of-use gap since pod state could change between validation and the actual
Attachcall. This is acceptable given CSI's eventually-consistent nature - the validation serves as a best-effort guard against obvious misuse rather than a strict enforcement mechanism. The fail-open design appropriately prioritizes availability.
|
Successfully created backport PR for |
What this PR does
upstream PR piraeusdatastore/linstor-csi#403
Release note
Summary by CodeRabbit
New Features
Improvements
Tests
✏️ Tip: You can customize this high-level summary in your review settings.