
Conversation


@kvaps kvaps commented Jan 13, 2026

What this PR does

upstream PR piraeusdatastore/linstor-csi#403

Release note

[linstor] Refactor node-level RWX validation

Summary by CodeRabbit

  • New Features

    • Added a command-line option to disable strict RWX block volume validation for specialized deployments.
    • Centralized RWX validation into a shared utility used by the driver.
  • Improvements

    • Improved VM identification and pod/PV handling for RWX block attachments, including hotplug scenarios.
  • Tests

    • Added comprehensive unit tests covering RWX validation and related pod/PV scenarios.


@dosubot dosubot bot added the size:L label (This PR changes 100-499 lines, ignoring generated files) on Jan 13, 2026

coderabbitai bot commented Jan 13, 2026

📝 Walkthrough

Walkthrough

Extracts RWX block-attachment validation from the CSI driver into a new utils package, adds public APIs for validation and VM-name resolution, wires a CLI flag to disable validation, and updates ControllerPublishVolume to call the utils validation path. Tests for the new utils were added.

Changes

  • Driver (pkg/driver/driver.go): Added a disableRWXBlockValidation field and a DisableRWXBlockValidation() option; replaced the in-driver RWX logic with calls to utils.ValidateRWXBlockAttachment / GetVMNameFromPod.
  • Utils, new (pkg/utils/rwx_validation.go): New public APIs ValidateRWXBlockAttachment(ctx, kubeClient, log, volumeID) (string, error) and GetVMNameFromPod(ctx, kubeClient, log, pod) (string, error); added KubeVirtVMLabel, KubeVirtHotplugDiskLabel, PodGVR, and PVGVR. A sketch of this surface follows the list.
  • Command / CLI wiring (cmd/linstor-csi/linstor-csi.go): Added a --disable-rwx-block-validation flag and applied driver.DisableRWXBlockValidation() when it is set.
  • Tests (pkg/utils/rwx_validation_test.go): New tests covering ValidateRWXBlockAttachment and VM-name resolution across scenarios (no kube client, missing PV, regular pods, hotplug pods), plus test helpers for building unstructured Pod/PV objects.
  • Patches (packages/system/linstor/images/linstor-csi/patches/001-rwx-validation.diff): Patch file containing the aggregated changes described above (+246/-126).
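
Based on the file summary above, the new package surface looks roughly like the sketch below. This is a hedged outline only: the dynamic client and logrus.FieldLogger parameter types, the hotplug label value, and the stub bodies are assumptions; the real implementation lives in the patch.

package utils

import (
    "context"

    "github.com/sirupsen/logrus"
    "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
    "k8s.io/apimachinery/pkg/runtime/schema"
    "k8s.io/client-go/dynamic"
)

const (
    // KubeVirtVMLabel is the label KubeVirt puts on virt-launcher pods.
    KubeVirtVMLabel = "vm.kubevirt.io/name"
    // KubeVirtHotplugDiskLabel marks hotplug disk attachment pods (value assumed).
    KubeVirtHotplugDiskLabel = "kubevirt.io/hotplug-disk"
)

var (
    // PodGVR and PVGVR identify the resources read through the dynamic client.
    PodGVR = schema.GroupVersionResource{Version: "v1", Resource: "pods"}
    PVGVR  = schema.GroupVersionResource{Version: "v1", Resource: "persistentvolumes"}
)

// ValidateRWXBlockAttachment verifies that all running pods using the RWX block
// volume belong to a single KubeVirt VM and returns that VM's name.
func ValidateRWXBlockAttachment(ctx context.Context, kubeClient dynamic.Interface, log logrus.FieldLogger, volumeID string) (string, error) {
    // Stub: the real logic resolves the PV, its claimRef, the pods using the
    // PVC, and the VM name per pod (see the patch).
    return "", nil
}

// GetVMNameFromPod resolves the owning VM name for a virt-launcher or hotplug disk pod.
func GetVMNameFromPod(ctx context.Context, kubeClient dynamic.Interface, log logrus.FieldLogger, pod *unstructured.Unstructured) (string, error) {
    // Stub: see the patch, and the fuller sketch later in this thread.
    return "", nil
}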

Sequence Diagram

sequenceDiagram
    participant Client as ControllerPublishVolume
    participant Driver as CSI Driver
    participant Utils as utils.ValidateRWX...
    participant KubeAPI as Kubernetes API

    Client->>Driver: ControllerPublishVolume(volumeID, rwxBlock)
    alt validation enabled & rwxBlock
        Driver->>Utils: ValidateRWXBlockAttachment(ctx, kubeClient, log, volumeID)
        Utils->>KubeAPI: Get PersistentVolume for volumeID (PV)
        KubeAPI-->>Utils: PV (may reference pod)
        alt PV references pod
            Utils->>KubeAPI: Get Pod(namespace, name)
            KubeAPI-->>Utils: Pod (unstructured)
            Utils->>Utils: GetVMNameFromPod(pod)
            Utils-->>Driver: VM name / validation result
        else PV missing or not pod-backed
            Utils-->>Driver: validation error or empty VM name
        end
    else validation disabled or not rwxBlock
        Driver-->>Client: skip RWX validation
    end
    Driver-->>Client: continue or return error
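
In code, the "validation enabled & rwxBlock" branch of this diagram corresponds roughly to a guard of the following shape inside ControllerPublishVolume. This is a hedged sketch, not the exact patch: rwxBlock, d.disableRWXBlockValidation, d.kubeClient, and log are assumed local/field names, the FailedPrecondition status follows the reviewer notes further down, and status/codes come from google.golang.org/grpc.

// Inside (*Driver).ControllerPublishVolume, before the actual Attach call.
if rwxBlock && !d.disableRWXBlockValidation {
    // Only the error matters here; the resolved VM name is discarded.
    if _, err := utils.ValidateRWXBlockAttachment(ctx, d.kubeClient, log, req.GetVolumeId()); err != nil {
        return nil, status.Errorf(codes.FailedPrecondition,
            "RWX block attachment validation failed for volume %s: %v", req.GetVolumeId(), err)
    }
}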

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I hopped through code with nimble paws,

Pulled RWX checks out of driver laws.
Labels and pods now sing in tune,
A flag to hush validation soon.
Hooray — a tidy, tested bloom!

🚥 Pre-merge checks: ✅ 3 passed

  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title 'Refactor node-level RWX validation' accurately describes the main change: migrating RWX validation logic from the driver to a utils module with new public APIs and control-flow updates.
  • Docstring Coverage: ✅ Passed. No functions found in the changed files to evaluate docstring coverage; check skipped.




@gemini-code-assist
Contributor

Summary of Changes

Hello @kvaps, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the node-level validation for ReadWriteMany (RWX) block volumes within the LINSTOR CSI driver. The core validation logic, which ensures that multiple pods using the same RWX block volume belong to the same KubeVirt VM, has been moved to a new utility package. This change enhances code organization and maintainability. Additionally, a new command-line flag is introduced, allowing administrators to disable this validation if needed, offering greater control over deployment configurations.
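The invariant behind that check, namely that every live pod sharing the RWX block volume must resolve to one and the same KubeVirt VM, can be sketched as a small helper. This is illustrative only: it assumes the caller has already resolved a VM name per pod, and the helper name and input shape are not taken from the patch.

package utils

import "fmt"

// checkSingleVM (illustrative): vmNamesByPod maps each running pod using the
// PVC to its resolved VM name. Two different VM names mean an invalid RWX
// block attachment; two pods with the same name are fine (live migration).
func checkSingleVM(vmNamesByPod map[string]string) (string, error) {
    current := ""
    for pod, vm := range vmNamesByPod {
        if vm == "" {
            continue // strict-mode handling of unlabeled pods is up to the caller
        }
        if current == "" {
            current = vm
            continue
        }
        if vm != current {
            return "", fmt.Errorf("pod %s belongs to VM %q, but the volume is already attached for VM %q", pod, vm, current)
        }
    }
    return current, nil
}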

Highlights

  • Refactored RWX Block Volume Validation: The logic for validating ReadWriteMany (RWX) block volume attachments has been extracted from the main driver file into a new, dedicated utility package (pkg/utils/rwx_validation.go). This improves modularity and testability.
  • New Configuration Option: A new command-line flag, --disable-rwx-block-validation, has been added to the linstor-csi executable. This flag allows users to optionally disable the KubeVirt VM ownership validation for RWX block volumes, providing flexibility for different environments.
  • Enhanced Validation Logic: The ValidateRWXBlockAttachment function now includes an additional check to ensure that the Persistent Volume's volumeHandle matches the provided volumeID, adding robustness to the validation process (a sketch of this check follows this list).
  • Improved Test Structure: The unit tests for RWX validation have been moved alongside the refactored logic into pkg/utils/rwx_validation_test.go, ensuring that tests are co-located with the code they validate and are updated to reflect the new function signatures.
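
For the volumeHandle comparison in the Enhanced Validation Logic highlight above, reading the field from an unstructured PV could look roughly like this. Hedged sketch: the helper name is illustrative, and it assumes the fmt and k8s.io/apimachinery/pkg/apis/meta/v1/unstructured imports plus the standard spec.csi.volumeHandle layout of a CSI PV.

// volumeHandleMatches (illustrative): compares the PV's CSI volumeHandle with
// the volumeID passed to ControllerPublishVolume.
func volumeHandleMatches(pv *unstructured.Unstructured, volumeID string) (bool, error) {
    handle, found, err := unstructured.NestedString(pv.Object, "spec", "csi", "volumeHandle")
    if err != nil {
        return false, fmt.Errorf("failed to read spec.csi.volumeHandle: %w", err)
    }
    if !found {
        return false, fmt.Errorf("PV %s has no spec.csi.volumeHandle", pv.GetName())
    }
    return handle == volumeID, nil
}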


@kvaps kvaps force-pushed the refactor-rwx-validation branch from c94e596 to e6942d3 on January 13, 2026 at 12:20

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request effectively refactors the node-level RWX validation logic by moving it into a dedicated utils package, which improves code organization and modularity. The introduction of the disable-rwx-block-validation flag also adds useful flexibility. The overall changes are solid, and I have one suggestion to make the new validation logic even more robust.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
packages/system/linstor/images/linstor-csi/patches/001-rwx-validation.diff (2)

158-158: Consider using Debug level for this log statement.

Info level for every validation call will be noisy in production. This diagnostic message is better suited to Debug level, consistent with the similar log at line 303.

Suggested change
-	log.WithField("volumeID", volumeID).Info("validateRWXBlockAttachment called")
+	log.WithField("volumeID", volumeID).Debug("validateRWXBlockAttachment called")

236-247: Original error context is lost when GetVMNameFromPod fails.

When GetVMNameFromPod returns an error (e.g., "failed to get owner pod"), the error is logged as a warning but the pod continues with vmName = "". If this later triggers the strict mode check (line 280-285), the user sees a generic "does not have the vm.kubevirt.io/name label" error, not the actual root cause.

Consider propagating the underlying error for better debuggability:

Alternative: Propagate the actual error
 			vmName, err := GetVMNameFromPod(ctx, kubeClient, log, &item)
 			if err != nil {
-				log.WithError(err).WithField("pod", item.GetName()).Warn("failed to get VM name from pod")
-				// Continue with empty vmName - will be caught by strict mode check
-				vmName = ""
+				return "", fmt.Errorf("failed to determine VM for pod %s: %w", item.GetName(), err)
 			}

Alternatively, keep the current soft failure behavior but enhance the strict mode error message to include more context.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dececc8 and c94e596.

📒 Files selected for processing (1)
  • packages/system/linstor/images/linstor-csi/patches/001-rwx-validation.diff
🧰 Additional context used
📓 Path-based instructions (1)
packages/system/**

📄 CodeRabbit inference engine (AGENTS.md)

Use Helm Chart umbrella pattern with vendored upstream charts in charts/ directory

Files:

  • packages/system/linstor/images/linstor-csi/patches/001-rwx-validation.diff
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build
🔇 Additional comments (5)
packages/system/linstor/images/linstor-csi/patches/001-rwx-validation.diff (5)

41-51: LGTM!

The new CLI flag follows the existing pattern and has a clear description. The conditional wiring to the driver options is correct.
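
Concretely, the wiring could look like this in cmd/linstor-csi/linstor-csi.go. Hedged sketch: the standard library flag package, the driver.Option type name, and the import path are assumptions; only the flag name and the DisableRWXBlockValidation() option come from the change summary.

package main

import (
    "flag"

    "github.com/piraeusdatastore/linstor-csi/pkg/driver" // import path assumed
)

func main() {
    // New flag next to the existing linstor-csi flags; validation stays enabled by default.
    disableRWXBlockValidation := flag.Bool("disable-rwx-block-validation", false,
        "disable strict RWX block volume validation (KubeVirt VM ownership check)")
    flag.Parse()

    // Conditional wiring: the option is only appended when the flag is set.
    var opts []driver.Option // Option type name is an assumption
    if *disableRWXBlockValidation {
        opts = append(opts, driver.DisableRWXBlockValidation())
    }
    _ = opts // handed to the driver constructor in the real code
}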


73-97: LGTM!

The option function follows the functional options pattern correctly with clear documentation. The validation integration in ControllerPublishVolume is well-placed - checking before the actual attach and returning an appropriate FailedPrecondition status on failure.
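
The option itself, in the functional options style noted here, could be as small as the following. Sketch only: an Option type with an error return is an assumption about the driver package's existing pattern.

// In pkg/driver: DisableRWXBlockValidation switches off the KubeVirt VM
// ownership check for RWX block volumes (sketch of the functional option).
func DisableRWXBlockValidation() Option {
    return func(d *Driver) error {
        d.disableRWXBlockValidation = true
        return nil
    }
}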


308-356: GetVMNameFromPod logic is well-structured.

Clean handling of both virt-launcher pods (direct label lookup) and hotplug disk pods (owner reference traversal). The short-circuit on nil labels and the guard against nil/non-controller owner references are correct.

Minor observation: The function returns ("", nil) for pods without KubeVirt labels (line 356), which allows non-KubeVirt pods through when used in isolation. The caller handles this correctly via strict mode checks.
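
A rough shape of that logic, continuing the package sketch shown earlier in this thread (hedged; it adds fmt and metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" to the imports, and details deliberately differ from the patch):

// GetVMNameFromPod (sketch): virt-launcher pods carry the VM name label directly;
// hotplug disk pods are resolved through their controller owner reference.
func GetVMNameFromPod(ctx context.Context, kubeClient dynamic.Interface, log logrus.FieldLogger, pod *unstructured.Unstructured) (string, error) {
    labels := pod.GetLabels()
    if labels == nil {
        return "", nil // no labels at all: not a KubeVirt pod
    }

    // Case 1: virt-launcher pod, the VM name is on the pod itself.
    if vm, ok := labels[KubeVirtVMLabel]; ok {
        return vm, nil
    }

    // Case 2: hotplug disk pod, follow the controller owner reference to the
    // virt-launcher pod and read the label there.
    if _, ok := labels[KubeVirtHotplugDiskLabel]; ok {
        for _, ref := range pod.GetOwnerReferences() {
            if ref.Controller == nil || !*ref.Controller || ref.Kind != "Pod" {
                continue
            }
            owner, err := kubeClient.Resource(PodGVR).Namespace(pod.GetNamespace()).Get(ctx, ref.Name, metav1.GetOptions{})
            if err != nil {
                return "", fmt.Errorf("failed to get owner pod %s: %w", ref.Name, err)
            }
            return owner.GetLabels()[KubeVirtVMLabel], nil
        }
        return "", fmt.Errorf("hotplug disk pod %s has no controller owner pod", pod.GetName())
    }

    // Case 3: not a KubeVirt pod; the caller's strict-mode check decides.
    return "", nil
}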


396-572: Comprehensive test coverage.

Excellent table-driven tests covering:

  • No pods / single pod / multiple pods scenarios
  • Live migration (same VM, multiple pods)
  • Cross-VM conflicts
  • Strict mode (missing KubeVirt label)
  • Pod phase filtering (Succeeded/Failed ignored)
  • Namespace and PVC isolation
  • Hotplug disk pod resolution

667-705: Hotplug disk pod test helper is well-constructed.

The ownerReferences structure correctly sets controller: true which will be properly unmarshaled by GetOwnerReferences(). The helper matches the expected structure in production hotplug disk pods created by KubeVirt.
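
Such a helper could be shaped roughly like this (hedged sketch; the hotplug label value and all field values beyond ownerReferences.controller: true are illustrative):

// createHotplugDiskPod (sketch): builds an unstructured hotplug disk pod owned
// by a virt-launcher pod, with controller: true so GetOwnerReferences() sees it.
func createHotplugDiskPod(name, namespace, pvcName, ownerPodName string) *unstructured.Unstructured {
    return &unstructured.Unstructured{Object: map[string]interface{}{
        "apiVersion": "v1",
        "kind":       "Pod",
        "metadata": map[string]interface{}{
            "name":      name,
            "namespace": namespace,
            "labels": map[string]interface{}{
                KubeVirtHotplugDiskLabel: "true", // label value assumed
            },
            "ownerReferences": []interface{}{
                map[string]interface{}{
                    "apiVersion": "v1",
                    "kind":       "Pod",
                    "name":       ownerPodName,
                    "uid":        "owner-uid",
                    "controller": true,
                },
            },
        },
        "spec": map[string]interface{}{
            "volumes": []interface{}{
                map[string]interface{}{
                    "name": "hotplug-disk",
                    "persistentVolumeClaim": map[string]interface{}{
                        "claimName": pvcName,
                    },
                },
            },
        },
        "status": map[string]interface{}{"phase": "Running"},
    }}
}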

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
@kvaps kvaps force-pushed the refactor-rwx-validation branch from e6942d3 to 3c4f0cd on January 13, 2026 at 12:23
@kvaps kvaps added the backport label (Should change be backported on previous release) on Jan 13, 2026

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
packages/system/linstor/images/linstor-csi/patches/001-rwx-validation.diff (2)

196-209: Consider consistent error handling for claimRef extraction.

Multiple errors from NestedString and NestedMap are silently ignored. While the fail-open design is intentional, suppressing errors without even logging them could make debugging harder.

♻️ Suggested improvement
-	claimRef, found, _ := unstructured.NestedMap(pv.Object, "spec", "claimRef")
+	claimRef, found, err := unstructured.NestedMap(pv.Object, "spec", "claimRef")
+	if err != nil {
+		log.WithError(err).Warn("cannot validate RWX attachment: failed to read claimRef")
+		return "", nil
+	}
 	if !found {
 		log.Warn("cannot validate RWX attachment: PV has no claimRef")
 		return "", nil
 	}

-	pvcName, _, _ := unstructured.NestedString(claimRef, "name")
-	pvcNamespace, _, _ := unstructured.NestedString(claimRef, "namespace")
+	pvcName, _, err := unstructured.NestedString(claimRef, "name")
+	if err != nil {
+		log.WithError(err).Warn("cannot validate RWX attachment: failed to read PVC name")
+		return "", nil
+	}
+	pvcNamespace, _, err := unstructured.NestedString(claimRef, "namespace")
+	if err != nil {
+		log.WithError(err).Warn("cannot validate RWX attachment: failed to read PVC namespace")
+		return "", nil
+	}

576-582: Test assertion could be more precise.

The assertion if len(tc.pods) > 0 { assert.NotEmpty(t, vmName) } checks if test pods were defined, not if matching pods were found. This works for current test cases but could give false positives if a test case defines pods that all fail to match (different namespace + different PVC).

Consider adding an expectedVMName field to the test cases for explicit verification.

♻️ Suggested improvement
 	testCases := []struct {
 		name        string
 		pods        []*unstructured.Unstructured
 		pvcName     string
 		namespace   string
 		expectError bool
 		errorMsg    string
+		expectedVM  string // expected VM name, empty if no VM should be found
 	}{
 		{
 			name:        "no pods using PVC",
 			pods:        []*unstructured.Unstructured{},
 			pvcName:     "test-pvc",
 			namespace:   "default",
 			expectError: false,
+			expectedVM:  "",
 		},
 		{
 			name: "single pod using PVC",
 			pods: []*unstructured.Unstructured{
 				createUnstructuredPod("pod1", "default", "test-pvc", map[string]string{KubeVirtVMLabel: "vm1"}, "Running"),
 			},
 			pvcName:     "test-pvc",
 			namespace:   "default",
 			expectError: false,
+			expectedVM:  "vm1",
 		},
 		// ... add expectedVM to other cases

Then in the assertion:

 		} else {
 			assert.NoError(t, err)
-			// VM name is returned when there are pods using the volume
-			if len(tc.pods) > 0 {
-				assert.NotEmpty(t, vmName)
-			}
+			assert.Equal(t, tc.expectedVM, vmName)
 		}
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c94e596 and 3c4f0cd.

📒 Files selected for processing (1)
  • packages/system/linstor/images/linstor-csi/patches/001-rwx-validation.diff
🧰 Additional context used
📓 Path-based instructions (1)
packages/system/**

📄 CodeRabbit inference engine (AGENTS.md)

Use Helm Chart umbrella pattern with vendored upstream charts in charts/ directory

Files:

  • packages/system/linstor/images/linstor-csi/patches/001-rwx-validation.diff
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build
🔇 Additional comments (5)
packages/system/linstor/images/linstor-csi/patches/001-rwx-validation.diff (5)

41-51: LGTM!

The new disable-rwx-block-validation flag is well-integrated. The naming convention, default value (false = validation enabled), and conditional wiring follow the existing patterns in the codebase.


87-97: LGTM!

The validation integration point is well-placed before the Attach call. The guard condition rwxBlock && !d.disableRWXBlockValidation correctly ensures validation only runs for RWX block volumes when not explicitly disabled. The discarded VM name return value is acceptable since the function's purpose here is validation only.


321-370: LGTM!

The GetVMNameFromPod function correctly handles three distinct cases: direct virt-launcher pods, hotplug disk pods (via owner reference resolution), and non-KubeVirt pods. The differentiated error handling between hotplug pods (which must resolve to a VM) and regular pods (which can legitimately have no VM label) is appropriate.


679-718: LGTM!

The createHotplugDiskPod helper correctly sets up the owner reference structure that allows the validation logic to resolve the VM name from the virt-launcher pod. The test cases for hotplug scenarios (both success and failure) provide good coverage.


151-157: Note: TOCTOU limitation is acceptable for this use case.

The validation inherently has a time-of-check-to-time-of-use gap since pod state could change between validation and the actual Attach call. This is acceptable given CSI's eventually-consistent nature - the validation serves as a best-effort guard against obvious misuse rather than a strict enforcement mechanism. The fail-open design appropriately prioritizes availability.

@kvaps kvaps merged commit 746e58d into main Jan 13, 2026
6 of 8 checks passed
@kvaps kvaps deleted the refactor-rwx-validation branch on January 13, 2026 at 12:27
@github-actions

Successfully created backport PR for release-0.40:
