nvmeof: add dhchap feature into nvmeof csi driver #5913

Merged: mergify[bot] merged 8 commits into ceph:devel from gadididi:nvmeof/add_dhchap on Feb 13, 2026

@gadididi gadididi commented Jan 12, 2026

Describe what this PR does

Adds DH-CHAP (Diffie-Hellman Challenge Handshake Authentication Protocol)
authentication support for NVMe-oF connections with pluggable KMS backend
for secure key management.

Key components:

  • SecurityKeyNVMEOFManager: Manages authentication keys using KMS encryption
    (similar to volume encryption pattern)
  • dhchapMode parameter: none (default), unidirectional,
    bidirectional
  • RBD metadata DEKStore: Stores encrypted keys in RBD image metadata (testing)
  • Controller: Generates and stores keys during ControllerPublishVolume
  • Node: Retrieves keys during NodeStageVolume and passes them to nvme-cli
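
As an illustration of the `dhchapMode` parameter, here is a minimal Python sketch (not the driver's Go code) of how the three values could map to the keys a node needs. The names `DhchapMode`, `parse_dhchap_mode` and `required_keys` are hypothetical. The mapping of `unidirectional` to a host key only is an assumption. The host and subsystem keys for `bidirectional` match the controller logs further down.

```python
from enum import Enum


class DhchapMode(Enum):
    NONE = "none"
    UNIDIRECTIONAL = "unidirectional"
    BIDIRECTIONAL = "bidirectional"


def parse_dhchap_mode(parameters: dict) -> DhchapMode:
    """Read dhchapMode from volume parameters; a missing value means no auth."""
    raw = parameters.get("dhchapMode", "")
    if raw == "":
        return DhchapMode.NONE
    try:
        return DhchapMode(raw)
    except ValueError:
        raise ValueError(f"unsupported dhchapMode {raw!r}") from None


def required_keys(mode: DhchapMode) -> list:
    """Keys the node would need for each mode (assumed mapping)."""
    if mode is DhchapMode.NONE:
        return []
    if mode is DhchapMode.UNIDIRECTIONAL:
        return ["host"]  # assumption: only the host proves its identity
    return ["host", "subsystem"]  # both sides authenticate each other
```

Omitting the parameter yields `DhchapMode.NONE`, which mirrors the backward-compatible default described in this PR.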

Authentication flow:

  1. Controller generates a unique DH-CHAP key per node-subsystem pair
  2. The key is encrypted with a KEK from the Secret and stored via DEKStore
  3. Node retrieves and decrypts the key during connection setup
  4. nvme-cli performs in-band authentication with the gateway
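
For step 1, the secrets passed to nvme-cli in the logs below use the `DHHC-1:<hh>:<base64>:` textual form. The following Python sketch shows how such a string could be built and checked. It assumes the common convention that the base64 payload is the raw key followed by its CRC-32 in little-endian order, and that `01` denotes a 32-byte key used with SHA-256. The function names are hypothetical and this is not the driver's implementation.

```python
import base64
import secrets
import zlib


def format_dhchap_secret(key: bytes, hash_id: int = 1) -> str:
    """Encode a raw key as DHHC-1:<hh>:<base64(key + crc32)>: (assumed layout)."""
    crc = zlib.crc32(key).to_bytes(4, "little")
    payload = base64.b64encode(key + crc).decode("ascii")
    return f"DHHC-1:{hash_id:02x}:{payload}:"


def parse_dhchap_secret(secret: str):
    """Return (hash_id, key) after verifying the trailing CRC-32."""
    parts = secret.split(":")
    if len(parts) != 4 or parts[0] != "DHHC-1" or parts[3] != "":
        raise ValueError("not a DHHC-1 secret")
    hash_id = int(parts[1], 16)
    raw = base64.b64decode(parts[2], validate=True)
    if len(raw) <= 4:
        raise ValueError("payload too short")
    key, crc = raw[:-4], raw[-4:]
    if zlib.crc32(key).to_bytes(4, "little") != crc:
        raise ValueError("CRC mismatch")
    return hash_id, key


def generate_dhchap_secret() -> str:
    """Generate a fresh random 32-byte key in DHHC-1 form."""
    return format_dhchap_secret(secrets.token_bytes(32))
```

A 32-byte key plus the 4-byte checksum encodes to 48 base64 characters, which is the payload length of the secrets visible in the node server log.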

Is there anything that requires special attention

Testing limitations:

  • RBD metadata DEKStore is for POC/testing only - stores encrypted keys
    in volume metadata
  • Not tested with Vault KMS (recommended for production)
  • For production, use Vault with integrated DEK storage.

Implementation notes:

  • Backward compatible: omitting dhchapMode defaults to none (no auth)
  • Keys persist per node-subsystem pair, not per volume
  • Supports RWX volumes (multiple nodes mounting same namespace)
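
To make the "per node-subsystem pair, not per volume" note concrete, here is a small Python sketch of a store keyed by that pair. `PairKeyStore` is a hypothetical name and the in-memory dict stands in for the DEKStore. KMS encryption is deliberately left out, so this only illustrates the keying and idempotence, not the PR's storage code.

```python
import secrets


class PairKeyStore:
    """Toy store: one key per (node, subsystem) pair, shared by all volumes."""

    def __init__(self):
        self._keys = {}

    def get_or_create(self, node_id: str, subsystem_nqn: str) -> bytes:
        """Return the existing key for the pair, or create one on first use."""
        pair = (node_id, subsystem_nqn)
        if pair not in self._keys:
            self._keys[pair] = secrets.token_bytes(32)
        return self._keys[pair]

    def delete(self, node_id: str, subsystem_nqn: str) -> None:
        """Forget the key for a pair; a no-op if none exists."""
        self._keys.pop((node_id, subsystem_nqn), None)
```

Because the volume ID is not part of the lookup, a second volume published to the same node and subsystem reuses the same key, while a second node mounting an RWX volume gets a key of its own.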

Related issues

More details here:
#5723

Testing

  1. Create kms-config.yml (with the RBD metadata KMS type) for the NVMe-oF CSI driver:
apiVersion: v1
kind: ConfigMap
metadata:
  name: csi-kms-connection-details
  namespace: openshift-storage
data:
  metadata: |
    {
      "encryptionKMSType": "metadata"
    }
  2. Create PVC
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephcsi-nvmeof-pvc
spec:
  volumeAttributesClassName: default
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 64Mi
  storageClassName: ocs-storagecluster-ceph-nvmeof

logs:

I0122 12:38:24.049751       1 utils.go:350] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 GRPC call: /csi.v1.Controller/CreateVolume
I0122 12:38:24.049860       1 utils.go:351] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 GRPC request: {"capacity_range":{"required_bytes":67108864},"mutable_parameters":{"rwIosPerSecond":"5000"},"name":"pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0","parameters":{"clusterID":"openshift-storage","csi.storage.k8s.io/pv/name":"pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0","csi.storage.k8s.io/pvc/name":"cephcsi-nvmeof-pvc","csi.storage.k8s.io/pvc/namespace":"default","dhchapMode":"bidirectional","imageFeatures":"layering","listeners":"[\n  {\n    \"address\": \"10.131.0.130\",\n    \"port\": 4420,\n    \"hostname\": \"ceph-nvmeof-gateway-67468f76d7-49v8c\"\n  },\n  {\n    \"address\": \"10.128.2.52\",\n    \"port\": 4420,\n    \"hostname\": \"ceph-nvmeof-gateway-67468f76d7-hkv98\"\n  }\n]\n","nvmeofGatewayAddress":"172.30.192.133","nvmeofGatewayPort":"5500","pool":"ocs-storagecluster-cephblockpool","subsystemNQN":"nqn.2016-06.io.ceph:subsystem.test-integration"},"secrets":"***stripped***","volume_capabilities":[{"access_mode":{"mode":"SINGLE_NODE_WRITER"},"mount":{"fs_type":"ext4"}}]}
I0122 12:38:24.050271       1 rbd_util.go:1424] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 setting disableInUseChecks: false image features: [layering] mounter: rbd
I0122 12:38:24.084068       1 omap.go:89] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 got omap values: (pool="ocs-storagecluster-cephblockpool", namespace="", name="csi.volumes.default"): map[]
I0122 12:38:24.110441       1 omap.go:159] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 set omap keys (pool="ocs-storagecluster-cephblockpool", namespace="", name="csi.volumes.default"): map[csi.volume.pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0:a617ce6d-9918-4451-98e9-a2096c57a456])
I0122 12:38:24.115872       1 omap.go:159] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 set omap keys (pool="ocs-storagecluster-cephblockpool", namespace="", name="csi.volume.a617ce6d-9918-4451-98e9-a2096c57a456"): map[csi.imagename:csi-vol-a617ce6d-9918-4451-98e9-a2096c57a456 csi.volname:pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 csi.volume.owner:default])
I0122 12:38:24.115902       1 rbd_journal.go:521] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 generated Volume ID (0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456) and image name (csi-vol-a617ce6d-9918-4451-98e9-a2096c57a456) for request name (pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0)
I0122 12:38:24.115980       1 rbd_util.go:462] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 rbd: create ocs-storagecluster-cephblockpool/csi-vol-a617ce6d-9918-4451-98e9-a2096c57a456 size 64M (features: [layering]) using mon 172.30.188.242:3300,172.30.64.86:3300,172.30.84.79:3300
I0122 12:38:24.116036       1 rbd_util.go:1688] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 setting image options on ocs-storagecluster-cephblockpool/csi-vol-a617ce6d-9918-4451-98e9-a2096c57a456
I0122 12:38:24.141228       1 controllerserver.go:809] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 created image ocs-storagecluster-cephblockpool/csi-vol-a617ce6d-9918-4451-98e9-a2096c57a456 backed for request name pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0
I0122 12:38:24.165260       1 omap.go:159] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 set omap keys (pool="ocs-storagecluster-cephblockpool", namespace="", name="csi.volume.a617ce6d-9918-4451-98e9-a2096c57a456"): map[csi.imageid:9b12e674b06a])
I0122 12:38:24.165521       1 controllerserver.go:1232] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 Connected to the gateway 172.30.192.133:5500
I0122 12:38:24.165575       1 nvmeof.go:323] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 Checking if subsystem nqn.2016-06.io.ceph:subsystem.test-integration exists on gateway 172.30.192.133:5500
I0122 12:38:24.176463       1 nvmeof.go:344] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 Subsystem nqn.2016-06.io.ceph:subsystem.test-integration does not exist
I0122 12:38:24.176503       1 nvmeof.go:255] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 Creating NVMe subsystem: nqn.2016-06.io.ceph:subsystem.test-integration on gateway 172.30.192.133:5500
I0122 12:38:24.209949       1 nvmeof.go:293] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 Subsystem created successfully: nqn.2016-06.io.ceph:subsystem.test-integration
I0122 12:38:24.209978       1 controllerserver.go:616] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 Creating listener 0: 10.131.0.130:4420 (ceph-nvmeof-gateway-67468f76d7-49v8c)
I0122 12:38:24.209985       1 nvmeof.go:394] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 Adding listener 10.131.0.130 to subsystem nqn.2016-06.io.ceph:subsystem.test-integration
I0122 12:38:24.254837       1 nvmeof.go:425] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 Listener added successfully: 10.131.0.130 to subsystem nqn.2016-06.io.ceph:subsystem.test-integration
I0122 12:38:24.254866       1 controllerserver.go:616] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 Creating listener 1: 10.128.2.52:4420 (ceph-nvmeof-gateway-67468f76d7-hkv98)
I0122 12:38:24.254871       1 nvmeof.go:394] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 Adding listener 10.128.2.52 to subsystem nqn.2016-06.io.ceph:subsystem.test-integration
I0122 12:38:24.281339       1 nvmeof.go:416] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 Listener 10.128.2.52 stashed for subsystem nqn.2016-06.io.ceph:subsystem.test-integration (will be active when ceph-nvmeof-gateway-67468f76d7-hkv98 gateway comes up)
I0122 12:38:24.281393       1 controllerserver.go:763] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 subsystem nqn.2016-06.io.ceph:subsystem.test-integration and Listener [10.131.0.130:4420 (ceph-nvmeof-gateway-67468f76d7-49v8c) 10.128.2.52:4420 (ceph-nvmeof-gateway-67468f76d7-hkv98)] for the subsystem were created
I0122 12:38:24.281405       1 nvmeof.go:112] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 Creating namespace for RBD ocs-storagecluster-cephblockpool/csi-vol-a617ce6d-9918-4451-98e9-a2096c57a456 in subsystem nqn.2016-06.io.ceph:subsystem.test-integration
I0122 12:38:25.210734       1 nvmeof.go:143] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 Namespace created with NSID: 1
I0122 12:38:25.210765       1 controllerserver.go:771] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 Namespace created: ocs-storagecluster-cephblockpool/csi-vol-a617ce6d-9918-4451-98e9-a2096c57a456 with NSID: 1
I0122 12:38:25.210790       1 controllerserver.go:776] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 Setting QoS limits: RwIops=5000
I0122 12:38:25.210800       1 nvmeof.go:190] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 Setting QoS limits on namespace 1 in subsystem nqn.2016-06.io.ceph:subsystem.test-integration
I0122 12:38:25.312336       1 nvmeof.go:207] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 QoS limits set successfully on namespace 1
I0122 12:38:25.393869       1 omap.go:89] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 got omap values: (pool="ocs-storagecluster-cephblockpool", namespace="", name="csi.volume.a617ce6d-9918-4451-98e9-a2096c57a456"): map[csi.imageid:9b12e674b06a csi.imagename:csi-vol-a617ce6d-9918-4451-98e9-a2096c57a456 csi.volname:pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 csi.volume.owner:default]
I0122 12:38:25.636432       1 controllerserver.go:1106] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 All NVMe-oF metadata stored successfully for volume: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456
I0122 12:38:25.636731       1 utils.go:357] ID: 15 Req-ID: pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 GRPC response: {"volume":{"capacity_bytes":67108864,"volume_context":{"GatewayAddress":"172.30.192.133","GatewayPort":"5500","NamespaceID":"1","NamespaceUUID":"c33919e8-d6ba-49e3-8670-8e3346b9318d","SubsystemNQN":"nqn.2016-06.io.ceph:subsystem.test-integration","authenticationKMSID":"metadata","clusterID":"openshift-storage","dhchapMode":"bidirectional","imageFeatures":"layering","imageName":"csi-vol-a617ce6d-9918-4451-98e9-a2096c57a456","journalPool":"ocs-storagecluster-cephblockpool","listeners":"[{\"address\":\"10.131.0.130\",\"port\":4420,\"hostname\":\"ceph-nvmeof-gateway-67468f76d7-49v8c\"},{\"address\":\"10.128.2.52\",\"port\":4420,\"hostname\":\"ceph-nvmeof-gateway-67468f76d7-hkv98\"}]","nvmeofGatewayAddress":"172.30.192.133","nvmeofGatewayPort":"5500","pool":"ocs-storagecluster-cephblockpool","subsystemNQN":"nqn.2016-06.io.ceph:subsystem.test-integration"},"volume_id":"0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456"}}
  3. Create test-pod
apiVersion: v1
kind: Pod
metadata:
  name: nvmeof-test-pod
spec:
  containers:
  - name: test-container
    image: busybox
    command: ["/bin/sh"]
    args: ["-c", "while true; do echo 'Testing NVMe-oF mount'; df -h /mnt/nvmeof; sleep 45; done"]
    volumeMounts:
    - name: nvmeof-volume
      mountPath: /mnt/nvmeof
  volumes:
  - name: nvmeof-volume
    persistentVolumeClaim:
      claimName: cephcsi-nvmeof-pvc
  restartPolicy: Never

logs from provisioner:

I0122 12:38:56.204437       1 utils.go:350] ID: 16 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 GRPC call: /csi.v1.Controller/ControllerPublishVolume
I0122 12:38:56.204660       1 utils.go:351] ID: 16 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 GRPC request: {"node_id":"gdidi-gwfsm-worker-3-ngbpx","secrets":"***stripped***","volume_capability":{"access_mode":{"mode":"SINGLE_NODE_WRITER"},"mount":{"fs_type":"ext4"}},"volume_context":{"GatewayAddress":"172.30.192.133","GatewayPort":"5500","NamespaceID":"1","NamespaceUUID":"c33919e8-d6ba-49e3-8670-8e3346b9318d","SubsystemNQN":"nqn.2016-06.io.ceph:subsystem.test-integration","authenticationKMSID":"metadata","clusterID":"openshift-storage","dhchapMode":"bidirectional","imageFeatures":"layering","imageName":"csi-vol-a617ce6d-9918-4451-98e9-a2096c57a456","journalPool":"ocs-storagecluster-cephblockpool","listeners":"[{\"address\":\"10.131.0.130\",\"port\":4420,\"hostname\":\"ceph-nvmeof-gateway-67468f76d7-49v8c\"},{\"address\":\"10.128.2.52\",\"port\":4420,\"hostname\":\"ceph-nvmeof-gateway-67468f76d7-hkv98\"}]","nvmeofGatewayAddress":"172.30.192.133","nvmeofGatewayPort":"5500","pool":"ocs-storagecluster-cephblockpool","storage.kubernetes.io/csiProvisionerIdentity":"1769085470831-2724-nvmeof.csi.ceph.com","subsystemNQN":"nqn.2016-06.io.ceph:subsystem.test-integration"},"volume_id":"0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456"}
I0122 12:38:56.204822       1 controllerserver.go:1232] ID: 16 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 Connected to the gateway 172.30.192.133:5500
I0122 12:38:56.204838       1 controllerserver.go:1273] ID: 16 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 DH-CHAP mode: bidirectional - setting up authentication for node gdidi-gwfsm-worker-3-ngbpx
I0122 12:38:56.230277       1 omap.go:89] ID: 16 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 got omap values: (pool="ocs-storagecluster-cephblockpool", namespace="", name="csi.volume.a617ce6d-9918-4451-98e9-a2096c57a456"): map[csi.imageid:9b12e674b06a csi.imagename:csi-vol-a617ce6d-9918-4451-98e9-a2096c57a456 csi.volname:pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 csi.volume.owner:default]
I0122 12:38:56.406653       1 controllerserver.go:1307] ID: 16 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 DH-CHAP host key retrieved for node gdidi-gwfsm-worker-3-ngbpx, subsystem nqn.2016-06.io.ceph:subsystem.test-integration
I0122 12:38:56.568944       1 controllerserver.go:1315] ID: 16 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 DH-CHAP subsystem key retrieved for node gdidi-gwfsm-worker-3-ngbpx, subsystem nqn.2016-06.io.ceph:subsystem.test-integration
I0122 12:38:56.569042       1 nvmeof.go:351] ID: 16 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 Adding host nqn.2014-08.org.nvmexpress:gdidi-gwfsm-worker-3-ngbpx to subsystem nqn.2016-06.io.ceph:subsystem.test-integration on gateway 172.30.192.133:5500
I0122 12:38:56.640341       1 nvmeof.go:379] ID: 16 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 Host added successfully: nqn.2014-08.org.nvmexpress:gdidi-gwfsm-worker-3-ngbpx to subsystem nqn.2016-06.io.ceph:subsystem.test-integration
I0122 12:38:56.640375       1 controllerserver.go:900] ID: 16 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 Host nqn.2014-08.org.nvmexpress:gdidi-gwfsm-worker-3-ngbpx successfully added to subsystem nqn.2016-06.io.ceph:subsystem.test-integration
I0122 12:38:56.640846       1 utils.go:357] ID: 16 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 GRPC response: {"publish_context":{"GatewayAddress":"172.30.192.133","GatewayPort":"5500","HostNQN":"nqn.2014-08.org.nvmexpress:gdidi-gwfsm-worker-3-ngbpx","NamespaceID":"1","NamespaceUUID":"c33919e8-d6ba-49e3-8670-8e3346b9318d","SubsystemNQN":"nqn.2016-06.io.ceph:subsystem.test-integration","authenticationKMSID":"metadata","clusterID":"openshift-storage","dhchapMode":"bidirectional","imageFeatures":"layering","imageName":"csi-vol-a617ce6d-9918-4451-98e9-a2096c57a456","journalPool":"ocs-storagecluster-cephblockpool","listeners":"[{\"address\":\"10.131.0.130\",\"port\":4420,\"hostname\":\"ceph-nvmeof-gateway-67468f76d7-49v8c\"},{\"address\":\"10.128.2.52\",\"port\":4420,\"hostname\":\"ceph-nvmeof-gateway-67468f76d7-hkv98\"}]","nvmeofGatewayAddress":"172.30.192.133","nvmeofGatewayPort":"5500","pool":"ocs-storagecluster-cephblockpool","storage.kubernetes.io/csiProvisionerIdentity":"1769085470831-2724-nvmeof.csi.ceph.com","subsystemNQN":"nqn.2016-06.io.ceph:subsystem.test-integration"}}

logs from nodeserver:

I0122 12:39:01.823458  105266 utils.go:350] ID: 33 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 GRPC call: /csi.v1.Node/NodeStageVolume
I0122 12:39:01.823658  105266 utils.go:351] ID: 33 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 GRPC request: {"publish_context":{"GatewayAddress":"172.30.192.133","GatewayPort":"5500","HostNQN":"nqn.2014-08.org.nvmexpress:gdidi-gwfsm-worker-3-ngbpx","NamespaceID":"1","NamespaceUUID":"c33919e8-d6ba-49e3-8670-8e3346b9318d","SubsystemNQN":"nqn.2016-06.io.ceph:subsystem.test-integration","authenticationKMSID":"metadata","clusterID":"openshift-storage","dhchapMode":"bidirectional","imageFeatures":"layering","imageName":"csi-vol-a617ce6d-9918-4451-98e9-a2096c57a456","journalPool":"ocs-storagecluster-cephblockpool","listeners":"[{\"address\":\"10.131.0.130\",\"port\":4420,\"hostname\":\"ceph-nvmeof-gateway-67468f76d7-49v8c\"},{\"address\":\"10.128.2.52\",\"port\":4420,\"hostname\":\"ceph-nvmeof-gateway-67468f76d7-hkv98\"}]","nvmeofGatewayAddress":"172.30.192.133","nvmeofGatewayPort":"5500","pool":"ocs-storagecluster-cephblockpool","storage.kubernetes.io/csiProvisionerIdentity":"1769085470831-2724-nvmeof.csi.ceph.com","subsystemNQN":"nqn.2016-06.io.ceph:subsystem.test-integration"},"secrets":"***stripped***","staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/nvmeof.csi.ceph.com/bf75892c9c560a63f15aa55cfd4701da521cf3feb7135fd6bf4276e1da92be88/globalmount","volume_capability":{"access_mode":{"mode":"SINGLE_NODE_MULTI_WRITER"},"mount":{"fs_type":"ext4"}},"volume_context":{"GatewayAddress":"172.30.192.133","GatewayPort":"5500","NamespaceID":"1","NamespaceUUID":"c33919e8-d6ba-49e3-8670-8e3346b9318d","SubsystemNQN":"nqn.2016-06.io.ceph:subsystem.test-integration","authenticationKMSID":"metadata","clusterID":"openshift-storage","dhchapMode":"bidirectional","imageFeatures":"layering","imageName":"csi-vol-a617ce6d-9918-4451-98e9-a2096c57a456","journalPool":"ocs-storagecluster-cephblockpool","listeners":"[{\"address\":\"10.131.0.130\",\"port\":4420,\"hostname\":\"ceph-nvmeof-gateway-67468f76d7-49v8c\"},{\"address\":\"10.128.2.52\",\"port\":4420,\"hostname\":\"ceph-nvmeof-gateway-67468f76d7-hkv98\"}]","nvmeofGatewayAddress":"172.30.192.133","nvmeofGatewayPort":"5500","pool":"ocs-storagecluster-cephblockpool","storage.kubernetes.io/csiProvisionerIdentity":"1769085470831-2724-nvmeof.csi.ceph.com","subsystemNQN":"nqn.2016-06.io.ceph:subsystem.test-integration"},"volume_id":"0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456"}
I0122 12:39:01.839793  105266 omap.go:89] ID: 33 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 got omap values: (pool="ocs-storagecluster-cephblockpool", namespace="", name="csi.volume.a617ce6d-9918-4451-98e9-a2096c57a456"): map[csi.imageid:9b12e674b06a csi.imagename:csi-vol-a617ce6d-9918-4451-98e9-a2096c57a456 csi.volname:pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 csi.volume.owner:default]
I0122 12:39:02.099316  105266 cephcmds.go:164] ID: 33 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 command succeeded: nvme [list-subsys -o json]
I0122 12:39:02.099386  105266 nvmeof_initiator.go:175] ID: 33 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 Connecting to NVMe-oF subsystem nqn.2016-06.io.ceph:subsystem.test-integration at 10.131.0.130:4420
I0122 12:39:03.274222  105266 cephcmds.go:164] ID: 33 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 command succeeded: nvme [connect -t tcp -n nqn.2016-06.io.ceph:subsystem.test-integration -a 10.131.0.130 -s 4420 -l 1800 --hostnqn nqn.2014-08.org.nvmexpress:gdidi-gwfsm-worker-3-ngbpx --dhchap-secret DHHC-1:01:WcZe7dl7Rf+kr2jL8Yw1oR5M0Sb0mtl/8o45JJzWgGhaKarA: --dhchap-ctrl-secret DHHC-1:01:v9V4Ig5zLlI8+y1r4P/11czWD/OfcEwdZnCy0wHhpY6tw1a4:]
I0122 12:39:03.274291  105266 nvmeof_initiator.go:208] ID: 33 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 Successfully connected to subsystem nqn.2016-06.io.ceph:subsystem.test-integration via 10.131.0.130:4420
I0122 12:39:03.274303  105266 nvmeof_initiator.go:175] ID: 33 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 Connecting to NVMe-oF subsystem nqn.2016-06.io.ceph:subsystem.test-integration at 10.128.2.52:4420
E0122 12:39:03.281079  105266 cephcmds.go:157] ID: 33 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 an error (exit status 1) and stderror (already connected
) occurred while running nvme args: [connect -t tcp -n nqn.2016-06.io.ceph:subsystem.test-integration -a 10.128.2.52 -s 4420 -l 1800 --hostnqn nqn.2014-08.org.nvmexpress:gdidi-gwfsm-worker-3-ngbpx --dhchap-secret DHHC-1:01:WcZe7dl7Rf+kr2jL8Yw1oR5M0Sb0mtl/8o45JJzWgGhaKarA: --dhchap-ctrl-secret DHHC-1:01:v9V4Ig5zLlI8+y1r4P/11czWD/OfcEwdZnCy0wHhpY6tw1a4:]
W0122 12:39:03.281121  105266 nvmeof_initiator.go:203] ID: 33 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 Failed to connect to 10.128.2.52:4420 - stdout: , stderr: already connected
I0122 12:39:03.404608  105266 nodeserver.go:727] ID: 33 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 nvmeof: mounting device /dev/disk/by-id/nvme-uuid.c33919e8-d6ba-49e3-8670-8e3346b9318d to /var/lib/kubelet/plugins/kubernetes.io/csi/nvmeof.csi.ceph.com/bf75892c9c560a63f15aa55cfd4701da521cf3feb7135fd6bf4276e1da92be88/globalmount/0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 with fsType ext4
I0122 12:39:03.404638  105266 mount_linux.go:670] Attempting to determine if disk "/dev/disk/by-id/nvme-uuid.c33919e8-d6ba-49e3-8670-8e3346b9318d" is formatted using blkid with args: ([-p -s TYPE -s PTTYPE -o export /dev/disk/by-id/nvme-uuid.c33919e8-d6ba-49e3-8670-8e3346b9318d])
I0122 12:39:03.477409  105266 mount_linux.go:673] Output: ""
I0122 12:39:03.477450  105266 mount_linux.go:608] Disk "/dev/disk/by-id/nvme-uuid.c33919e8-d6ba-49e3-8670-8e3346b9318d" appears to be unformatted, attempting to format as type: "ext4" with options: [-F -m0 /dev/disk/by-id/nvme-uuid.c33919e8-d6ba-49e3-8670-8e3346b9318d]
I0122 12:39:03.524091  105266 mount_linux.go:619] Disk successfully formatted (mkfs): ext4 - /dev/disk/by-id/nvme-uuid.c33919e8-d6ba-49e3-8670-8e3346b9318d /var/lib/kubelet/plugins/kubernetes.io/csi/nvmeof.csi.ceph.com/bf75892c9c560a63f15aa55cfd4701da521cf3feb7135fd6bf4276e1da92be88/globalmount/0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456
I0122 12:39:03.524131  105266 mount_linux.go:637] Attempting to mount disk /dev/disk/by-id/nvme-uuid.c33919e8-d6ba-49e3-8670-8e3346b9318d in ext4 format at /var/lib/kubelet/plugins/kubernetes.io/csi/nvmeof.csi.ceph.com/bf75892c9c560a63f15aa55cfd4701da521cf3feb7135fd6bf4276e1da92be88/globalmount/0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456
I0122 12:39:03.524158  105266 mount_linux.go:260] Mounting cmd (mount) with arguments (-t ext4 -o _netdev,defaults /dev/disk/by-id/nvme-uuid.c33919e8-d6ba-49e3-8670-8e3346b9318d /var/lib/kubelet/plugins/kubernetes.io/csi/nvmeof.csi.ceph.com/bf75892c9c560a63f15aa55cfd4701da521cf3feb7135fd6bf4276e1da92be88/globalmount/0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456)
I0122 12:39:03.536261  105266 nodeserver.go:204] ID: 33 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 nvmeof: successfully staged volume 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 to stagingTargetPath /var/lib/kubelet/plugins/kubernetes.io/csi/nvmeof.csi.ceph.com/bf75892c9c560a63f15aa55cfd4701da521cf3feb7135fd6bf4276e1da92be88/globalmount/0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456
I0122 12:39:03.536410  105266 utils.go:357] ID: 33 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 GRPC response: {}
I0122 12:39:03.538050  105266 utils.go:350] ID: 34 GRPC call: /csi.v1.Node/NodeGetCapabilities
I0122 12:39:03.538072  105266 utils.go:351] ID: 34 GRPC request: {}
I0122 12:39:03.538144  105266 utils.go:357] ID: 34 GRPC response: {"capabilities":[{"rpc":{"type":"STAGE_UNSTAGE_VOLUME"}},{"rpc":{"type":"SINGLE_NODE_MULTI_WRITER"}},{"rpc":{"type":"EXPAND_VOLUME"}}]}
I0122 12:39:03.551101  105266 utils.go:350] ID: 35 GRPC call: /csi.v1.Node/NodeGetCapabilities
I0122 12:39:03.551121  105266 utils.go:351] ID: 35 GRPC request: {}
I0122 12:39:03.551178  105266 utils.go:357] ID: 35 GRPC response: {"capabilities":[{"rpc":{"type":"STAGE_UNSTAGE_VOLUME"}},{"rpc":{"type":"SINGLE_NODE_MULTI_WRITER"}},{"rpc":{"type":"EXPAND_VOLUME"}}]}
I0122 12:39:03.552245  105266 utils.go:350] ID: 36 GRPC call: /csi.v1.Node/NodeGetCapabilities
I0122 12:39:03.552265  105266 utils.go:351] ID: 36 GRPC request: {}
I0122 12:39:03.552323  105266 utils.go:357] ID: 36 GRPC response: {"capabilities":[{"rpc":{"type":"STAGE_UNSTAGE_VOLUME"}},{"rpc":{"type":"SINGLE_NODE_MULTI_WRITER"}},{"rpc":{"type":"EXPAND_VOLUME"}}]}
I0122 12:39:03.553324  105266 utils.go:350] ID: 37 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 GRPC call: /csi.v1.Node/NodePublishVolume
I0122 12:39:03.553435  105266 utils.go:351] ID: 37 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 GRPC request: {"publish_context":{"GatewayAddress":"172.30.192.133","GatewayPort":"5500","HostNQN":"nqn.2014-08.org.nvmexpress:gdidi-gwfsm-worker-3-ngbpx","NamespaceID":"1","NamespaceUUID":"c33919e8-d6ba-49e3-8670-8e3346b9318d","SubsystemNQN":"nqn.2016-06.io.ceph:subsystem.test-integration","authenticationKMSID":"metadata","clusterID":"openshift-storage","dhchapMode":"bidirectional","imageFeatures":"layering","imageName":"csi-vol-a617ce6d-9918-4451-98e9-a2096c57a456","journalPool":"ocs-storagecluster-cephblockpool","listeners":"[{\"address\":\"10.131.0.130\",\"port\":4420,\"hostname\":\"ceph-nvmeof-gateway-67468f76d7-49v8c\"},{\"address\":\"10.128.2.52\",\"port\":4420,\"hostname\":\"ceph-nvmeof-gateway-67468f76d7-hkv98\"}]","nvmeofGatewayAddress":"172.30.192.133","nvmeofGatewayPort":"5500","pool":"ocs-storagecluster-cephblockpool","storage.kubernetes.io/csiProvisionerIdentity":"1769085470831-2724-nvmeof.csi.ceph.com","subsystemNQN":"nqn.2016-06.io.ceph:subsystem.test-integration"},"staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/nvmeof.csi.ceph.com/bf75892c9c560a63f15aa55cfd4701da521cf3feb7135fd6bf4276e1da92be88/globalmount","target_path":"/var/lib/kubelet/pods/c25d681a-9ad9-42aa-8977-14d7a40054be/volumes/kubernetes.io~csi/pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0/mount","volume_capability":{"access_mode":{"mode":"SINGLE_NODE_MULTI_WRITER"},"mount":{"fs_type":"ext4"}},"volume_context":{"GatewayAddress":"172.30.192.133","GatewayPort":"5500","NamespaceID":"1","NamespaceUUID":"c33919e8-d6ba-49e3-8670-8e3346b9318d","SubsystemNQN":"nqn.2016-06.io.ceph:subsystem.test-integration","authenticationKMSID":"metadata","clusterID":"openshift-storage","dhchapMode":"bidirectional","imageFeatures":"layering","imageName":"csi-vol-a617ce6d-9918-4451-98e9-a2096c57a456","journalPool":"ocs-storagecluster-cephblockpool","listeners":"[{\"address\":\"10.131.0.130\",\"port\":4420,\"hostname\":\"ceph-nvmeof-gateway-67468f76d7-49v8c\"},{\"address\":\"10.128.2.52\",\"port\":4420,\"hostname\":\"ceph-nvmeof-gateway-67468f76d7-hkv98\"}]","nvmeofGatewayAddress":"172.30.192.133","nvmeofGatewayPort":"5500","pool":"ocs-storagecluster-cephblockpool","storage.kubernetes.io/csiProvisionerIdentity":"1769085470831-2724-nvmeof.csi.ceph.com","subsystemNQN":"nqn.2016-06.io.ceph:subsystem.test-integration"},"volume_id":"0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456"}
I0122 12:39:03.553647  105266 nodeserver.go:426] ID: 37 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 target /var/lib/kubelet/pods/c25d681a-9ad9-42aa-8977-14d7a40054be/volumes/kubernetes.io~csi/pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0/mount
isBlock false
fstype ext4
stagingPath /var/lib/kubelet/plugins/kubernetes.io/csi/nvmeof.csi.ceph.com/bf75892c9c560a63f15aa55cfd4701da521cf3feb7135fd6bf4276e1da92be88/globalmount/0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456
readonly false
mountflags [bind]
I0122 12:39:03.553675  105266 mount_linux.go:260] Mounting cmd (mount) with arguments (-t ext4 -o bind /var/lib/kubelet/plugins/kubernetes.io/csi/nvmeof.csi.ceph.com/bf75892c9c560a63f15aa55cfd4701da521cf3feb7135fd6bf4276e1da92be88/globalmount/0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 /var/lib/kubelet/pods/c25d681a-9ad9-42aa-8977-14d7a40054be/volumes/kubernetes.io~csi/pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0/mount)
I0122 12:39:03.555523  105266 mount_linux.go:260] Mounting cmd (mount) with arguments (-t ext4 -o bind,remount /var/lib/kubelet/plugins/kubernetes.io/csi/nvmeof.csi.ceph.com/bf75892c9c560a63f15aa55cfd4701da521cf3feb7135fd6bf4276e1da92be88/globalmount/0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 /var/lib/kubelet/pods/c25d681a-9ad9-42aa-8977-14d7a40054be/volumes/kubernetes.io~csi/pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0/mount)
I0122 12:39:03.557109  105266 nodeserver.go:252] ID: 37 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 nvmeof: successfully mounted stagingPath /var/lib/kubelet/plugins/kubernetes.io/csi/nvmeof.csi.ceph.com/bf75892c9c560a63f15aa55cfd4701da521cf3feb7135fd6bf4276e1da92be88/globalmount/0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 to targetPath /var/lib/kubelet/pods/c25d681a-9ad9-42aa-8977-14d7a40054be/volumes/kubernetes.io~csi/pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0/mount
I0122 12:39:03.557166  105266 utils.go:357] ID: 37 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 GRPC response: {}
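
The node server log above shows one `nvme connect` call per listener, with staging proceeding once at least one listener is connected. A minimal Python sketch of that pattern follows. The flags are the ones visible in the log. The function names, the `run` callback and the "any success is enough" rule are assumptions inferred from the log output, not taken from the driver's Go code.

```python
import json


def build_connect_args(subsystem_nqn, host_nqn, address, port,
                       host_secret=None, ctrl_secret=None, ctrl_loss_tmo=1800):
    """Assemble nvme-cli arguments in the order seen in the log above."""
    args = ["connect", "-t", "tcp", "-n", subsystem_nqn,
            "-a", address, "-s", str(port), "-l", str(ctrl_loss_tmo),
            "--hostnqn", host_nqn]
    if host_secret:
        args += ["--dhchap-secret", host_secret]
    if ctrl_secret:
        # Only supplied for bidirectional mode.
        args += ["--dhchap-ctrl-secret", ctrl_secret]
    return args


def connect_any(listeners_json, run, **connect_kwargs):
    """Try every listener; succeed if at least one connection works.

    `run` is a caller-supplied function that executes `nvme <args>` and
    returns True on success (hypothetical hook, e.g. a subprocess wrapper).
    """
    connected = []
    for listener in json.loads(listeners_json):
        args = build_connect_args(address=listener["address"],
                                  port=listener["port"], **connect_kwargs)
        if run(args):
            connected.append(f'{listener["address"]}:{listener["port"]}')
    if not connected:
        raise RuntimeError("failed to connect to any gateway address")
    return connected
```

Passing the runner as a callback keeps the sketch testable without nvme-cli installed. A real caller would also want to keep the secrets out of any logged command line.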



test pod event description:

Events:
  Type     Reason                  Age                From                     Message
  ----     ------                  ----               ----                     -------
  Normal   Scheduled               56s                default-scheduler        Successfully assigned default/nvmeof-test-pod to gdidi-gwfsm-worker-3-ngbpx
  Normal   SuccessfulAttachVolume  56s                attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0"
  Warning  FailedMount             53s (x3 over 55s)  kubelet                  MountVolume.MountDevice failed for volume "pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0" : rpc error: code = Internal desc = failed to connect to subsystem: failed to connect to any gateway address for subsystem nqn.2016-06.io.ceph:subsystem.test-integration
  Normal   AddedInterface          49s                multus                   Add eth0 [10.131.0.145/23] from ovn-kubernetes
  Normal   Pulling                 49s                kubelet                  Pulling image "busybox"
  Normal   Pulled                  48s                kubelet                  Successfully pulled image "busybox" in 639ms (639ms including waiting). Image size: 4670414 bytes.
  Normal   Created                 48s                kubelet                  Created container: test-container
  Normal   Started                 48s                kubelet                  Started container test-container

Verify with the NVMe-oF gateway:

[root@cephnvme-devel-server-gadi nvmeof]# kubectl -n openshift-storage exec -it nvmeof-cli -- /tmp/nvmeof-cli host list -n nqn.2016-06.io.ceph:subsystem.test-integration
Hosts allowed to access nqn.2016-06.io.ceph:subsystem.test-integration:
╒═══════════════════════════════════════════════════════╤════════════╤═══════════════╕
│                       Host NQN                        │  Uses PSK  │  Uses DHCHAP  │
╞═══════════════════════════════════════════════════════╪════════════╪═══════════════╡
│ nqn.2014-08.org.nvmexpress:gdidi-gwfsm-worker-3-ngbpx │     No     │      Yes      │
╘═══════════════════════════════════════════════════════╧════════════╧═══════════════╛
[root@cephnvme-devel-server-gadi nvmeof]#
[root@cephnvme-devel-server-gadi nvmeof]# kubectl -n openshift-storage exec -it nvmeof-cli -- /tmp/nvmeof-cli ns list -n nqn.2016-06.io.ceph:subsystem.test-integration
Namespaces in subsystem nqn.2016-06.io.ceph:subsystem.test-integration:
╒════════════════════════════════════════════════╤════════╤════════════════════════╤═══════════════════════════════════════════════════════════════════════════════╤════════════╤═════════╤═════════╤═════════════════════╤═════════════╤════════════╤══════════════╤═══════════╤═════════════════╕
│ NQN                                            │   NSID │ Bdev                   │ RBD                                                                           │ Mode       │ Image   │ Block   │ UUID                │        Load │ Location   │ Visibility   │   IOs per │ R/W, R, W MBs   │
│                                                │        │ Name                   │ Image                                                                         │            │ Size    │ Size    │                     │   Balancing │            │              │    second │ per second      │
│                                                │        │                        │                                                                               │            │         │         │                     │       Group │            │              │           │                 │
╞════════════════════════════════════════════════╪════════╪════════════════════════╪═══════════════════════════════════════════════════════════════════════════════╪════════════╪═════════╪═════════╪═════════════════════╪═════════════╪════════════╪══════════════╪═══════════╪═════════════════╡
│ nqn.2016-06.io.ceph:subsystem.test-integration │      1 │ bdev_c33919e8-d6ba-    │ ocs-storagecluster-cephblockpool/csi-vol-a617ce6d-9918-4451-98e9-a2096c57a456 │ Read-Write │ 64 MiB  │ 4 KiB   │ c33919e8-d6ba-49e3- │           2 │ <N/A>      │ All Hosts    │      5000 │ unset           │
│                                                │        │ 49e3-8670-8e3346b9318d │                                                                               │            │         │         │ 8670-8e3346b9318d   │             │            │              │           │ unset           │
│                                                │        │                        │                                                                               │            │         │         │                     │             │            │              │           │ unset           │
╘════════════════════════════════════════════════╧════════╧════════════════════════╧═══════════════════════════════════════════════════════════════════════════════╧════════════╧═════════╧═════════╧═════════════════════╧═════════════╧════════════╧══════════════╧═══════════╧═════════════════╛
[root@cephnvme-devel-server-gadi nvmeof]#

Delete test-pod:

I0122 12:41:10.905952       1 utils.go:350] ID: 17 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 GRPC call: /csi.v1.Controller/ControllerUnpublishVolume
I0122 12:41:10.906039       1 utils.go:351] ID: 17 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 GRPC request: {"node_id":"gdidi-gwfsm-worker-3-ngbpx","secrets":"***stripped***","volume_id":"0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456"}
I0122 12:41:10.913464       1 omap.go:89] ID: 17 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 got omap values: (pool="ocs-storagecluster-cephblockpool", namespace="", name="csi.volume.a617ce6d-9918-4451-98e9-a2096c57a456"): map[csi.imageid:9b12e674b06a csi.imagename:csi-vol-a617ce6d-9918-4451-98e9-a2096c57a456 csi.volname:pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 csi.volume.owner:default]
I0122 12:41:11.065162       1 controllerserver.go:1232] ID: 17 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 Connected to the gateway 172.30.192.133:5500
I0122 12:41:11.065191       1 nvmeof.go:494] ID: 17 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 Listing namespaces in subsystem nqn.2016-06.io.ceph:subsystem.test-integration
I0122 12:41:11.182261       1 nvmeof.go:508] ID: 17 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 Listed namespaces in subsystem nqn.2016-06.io.ceph:subsystem.test-integration successfully
I0122 12:41:11.182292       1 nvmeof.go:467] ID: 17 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 Removing host nqn.2014-08.org.nvmexpress:gdidi-gwfsm-worker-3-ngbpx from subsystem nqn.2016-06.io.ceph:subsystem.test-integration
I0122 12:41:11.280930       1 nvmeof.go:479] ID: 17 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 Host nqn.2014-08.org.nvmexpress:gdidi-gwfsm-worker-3-ngbpx removed successfully from subsystem nqn.2016-06.io.ceph:subsystem.test-integration
I0122 12:41:11.280959       1 controllerserver.go:952] ID: 17 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 Host nqn.2014-08.org.nvmexpress:gdidi-gwfsm-worker-3-ngbpx removed from subsystem nqn.2016-06.io.ceph:subsystem.test-integration
I0122 12:41:11.280969       1 controllerserver.go:1339] ID: 17 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 Cleaning up DH-CHAP keys for node gdidi-gwfsm-worker-3-ngbpx, subsystem nqn.2016-06.io.ceph:subsystem.test-integration
I0122 12:41:11.280974       1 controllerserver.go:1341] ID: 17 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 DH-CHAP mode: bidirectional - setting up authentication for node gdidi-gwfsm-worker-3-ngbpx
I0122 12:41:11.302981       1 omap.go:89] ID: 17 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 got omap values: (pool="ocs-storagecluster-cephblockpool", namespace="", name="csi.volume.a617ce6d-9918-4451-98e9-a2096c57a456"): map[csi.imageid:9b12e674b06a csi.imagename:csi-vol-a617ce6d-9918-4451-98e9-a2096c57a456 csi.volname:pvc-aed85ef3-9023-4587-aa71-05ef188f8ce0 csi.volume.owner:default]
I0122 12:41:11.320743       1 controllerserver.go:1370] ID: 17 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 DH-CHAP host key removed for node gdidi-gwfsm-worker-3-ngbpx
I0122 12:41:11.320771       1 controllerserver.go:1378] ID: 17 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 DH-CHAP subsystem key removed for node gdidi-gwfsm-worker-3-ngbpx
I0122 12:41:11.321152       1 utils.go:357] ID: 17 Req-ID: 0001-0011-openshift-storage-0000000000000002-a617ce6d-9918-4451-98e9-a2096c57a456 GRPC response: {}

You can see the nvme connect command issued with the DH-CHAP secrets:

nvme connect -t tcp -n nqn.2016-06.io.ceph:subsystem.test-integration -a 10.128.2.52 -s 4420 -l 1800 --hostnqn nqn.2014-08.org.nvmexpress:gdidi-gwfsm-worker-3-ngbpx --dhchap-secret DHHC-1:01:WcZe7dl7Rf+kr2jL8Yw1oR5M0Sb0mtl/8o45JJzWgGhaKarA: --dhchap-ctrl-secret DHHC-1:01:v9V4Ig5zLlI8+y1r4P/11czWD/OfcEwdZnCy0wHhpY6tw1a4:]

Future concerns

  • Vault integration testing: Validate with production Vault KMS
  • Key rotation: Add support for periodic key rotation

Checklist:

  • Commit Message Formatting: Commit titles and messages follow
    guidelines in the developer guide.
  • Reviewed the developer guide on Submitting a Pull Request
  • Pending release notes updated with breaking and/or notable changes
    for the next major release.
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.

Show available bot commands

These commands are normally not required, but in case of issues, leave any of
the following bot commands in an otherwise empty comment in this PR:

  • /retest ci/centos/<job-name>: retest the <job-name> after unrelated
    failure (please report the failure too!)

@gadididi gadididi self-assigned this Jan 12, 2026
@gadididi gadididi added the component/nvme-of Issues and PRs related to NVMe-oF. label Jan 12, 2026
@gadididi gadididi force-pushed the nvmeof/add_dhchap branch 14 times, most recently from 4a104fb to 803fb10 Compare January 19, 2026 14:51
@gadididi gadididi changed the title [WIP] nvmeof: add dhchap feature into nvmeof csi driver nvmeof: add dhchap feature into nvmeof csi driver Jan 20, 2026
@gadididi gadididi marked this pull request as ready for review January 20, 2026 11:59
@gadididi gadididi requested a review from nixpanic January 20, 2026 12:02
@gadididi gadididi force-pushed the nvmeof/add_dhchap branch 2 times, most recently from 0302f7b to 985a022 Compare January 21, 2026 09:47
@gadididi gadididi requested a review from Madhu-1 January 21, 2026 09:47

@nixpanic nixpanic left a comment

Member

Looks reasonable, but some of the design is missing, which makes it hard to understand well. Can you add DH-CHAP support to the docs/proposals/nvmeof.md file?

Comment thread internal/nvmeof/dhchap.go
return fmt.Errorf("invalid NVMe-oF QoS parameters: %w", err)
}
dhchapMode := params["dhchapMode"]
if dhchapMode != nvmeof.DHCHAPEmpty && dhchapMode != nvmeof.DHCHAPModeNone &&

Member

There are quite a few large if-statements like this. Can you place this in some helper function in a new internal/nvmeof/controller/dhchap.go file?

Contributor Author

fixed


// setupDHCHAPKeys configures DH-CHAP authentication and returns the host key.
// Returns empty string if DH-CHAP is disabled.
func (cs *Server) setupDHCHAPKeys(

Member

move these functions to a new dhchap.go file, please

Contributor Author

But it is very specific to the controller. Some steps here (like setting up the DEKStore if needed) do not belong to DH-CHAP.

Member

Another filename is fine too. My preference is to have the main controllerserver.go only implement the functions for the CSI gRPC protocol, and a few minimal helper functions. Other specialized functions should be placed in other (new) files. This file is well over 1000 lines long, and that contributes to maintenance overhead.

Contributor Author

done

Comment thread internal/nvmeof/dhchap.go
Comment thread internal/nvmeof/dhchap.go Outdated
// DH-CHAP modes.
const (
DHCHAPEmpty = ""
DHCHAPModeNone = "none"

Member

what is the difference between empty/none? Comments for the consts might be helpful.

Contributor Author

I will add a comment.
DHCHAPEmpty means the field was not provided in the StorageClass.
DHCHAPModeNone means the field was provided, but the chosen option is none, i.e. no DH-CHAP.
These two values behave the same.
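
To make the empty/none distinction concrete, here is a minimal sketch of documented constants plus a validation helper. Only DHCHAPEmpty, DHCHAPModeNone and the three mode strings come from this PR; the other constant names and both helper functions are hypothetical and may not match the merged code.

```go
package main

import "fmt"

// DH-CHAP modes accepted in the "dhchapMode" StorageClass parameter.
const (
	// DHCHAPEmpty: the parameter was not set at all. Treated like "none".
	DHCHAPEmpty = ""
	// DHCHAPModeNone: the parameter was set explicitly to disable DH-CHAP.
	DHCHAPModeNone = "none"
	// Assumed names for the two enabled modes (values are from the PR text).
	DHCHAPModeUnidirectional = "unidirectional"
	DHCHAPModeBidirectional  = "bidirectional"
)

// validateDHCHAPMode is a hypothetical helper that rejects unknown values.
func validateDHCHAPMode(mode string) error {
	switch mode {
	case DHCHAPEmpty, DHCHAPModeNone, DHCHAPModeUnidirectional, DHCHAPModeBidirectional:
		return nil
	default:
		return fmt.Errorf("invalid dhchapMode %q", mode)
	}
}

// dhchapEnabled reports whether any authentication has to be set up.
func dhchapEnabled(mode string) bool {
	return mode == DHCHAPModeUnidirectional || mode == DHCHAPModeBidirectional
}

func main() {
	for _, m := range []string{"", "none", "unidirectional", "bidirectional", "mutual"} {
		fmt.Printf("%q valid=%v enabled=%v\n", m, validateDHCHAPMode(m) == nil, dhchapEnabled(m))
	}
}
```

Keeping the large if-statement in one helper like this also addresses the earlier request to move such checks out of the gRPC handlers.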

Comment thread internal/nvmeof/dhchap.go Outdated
return dhchapKey, nil
}

// DH-CHAP specific key generator.

Member

is there a standard or specification to refer to?

Contributor Author

Yes, in the NVM Express® Base Specification, Revision 2.3.

image

Member

Nice. But better to use nvme gen-dhchap-key so ceph-csi does not need to implement this.

Contributor Author

resolved

Comment thread internal/nvmeof/crypto.go Outdated

// SecurityKeyNVMEOFManager manages NVMe-oF authentication keys
// Similar to VolumeEncryption, but for authentication instead of disk encryption.
type SecurityKeyNVMEOFManager struct {

Member

there is no need to have NVMeoF prefixes for the const or inside the struct name. This is included in the nvmeof package already, no need for the duplication.

Have you considered making this a type SecurityKeyManager interface? An internal type securityKeyManager struct can have the implementation. The func InitSecurityKeyManager(..) (SecurityKeyManager, error) returns a pointer to the securityKeyManager struct, and all attributes inside the struct are protected from abuse/misuse. This results in a cleaner API that eventually leads to fewer errors while using it.

Contributor Author

done

Comment thread internal/nvmeof/nodeserver/nodeserver.go
func (cs *NodeServer) getOrInitSecurityKeys(
ctx context.Context,
kmsID string,
credentials map[string]string,

Member

What kind of credentials are these? Not the usual util.Credentials... It might be good to give it another name, or explain it in a comment above the function.

Contributor Author

Actually, this is the secret. I gave it the wrong parameter name. Fixed.

Comment thread internal/nvmeof/nodeserver/nodeserver.go
@mergify

mergify Bot commented Jan 25, 2026

Contributor

This pull request now has conflicts with the target branch. Could you please resolve conflicts and force push the corrected changes? 🙏

@gadididi gadididi force-pushed the nvmeof/add_dhchap branch 2 times, most recently from 07c835e to 24d43f1 Compare January 25, 2026 11:41
nixpanic previously approved these changes Feb 5, 2026

@nixpanic nixpanic left a comment

Member

Looks good, thanks for all the improvements!

Left a question about the nvme command execution for key generation. That is something to consider for a follow-up PR.

Comment thread internal/nvmeof/dhchap.go
@mergify mergify Bot dismissed nixpanic’s stale review February 8, 2026 10:24

Pull request has been modified.

@gadididi gadididi requested a review from nixpanic February 8, 2026 10:26
Comment thread internal/nvmeof/controller/controllerserver.go Outdated
Comment thread internal/nvmeof/controller/security.go Outdated
Comment thread internal/nvmeof/controller/security.go
Comment thread internal/nvmeof/controller/security.go
Comment thread internal/nvmeof/nodeserver/security.go
Comment thread internal/nvmeof/crypto.go Outdated
Comment thread internal/nvmeof/dhchap.go
Comment thread internal/nvmeof/dhchap.go
) (string, error) {
// Try to get existing key
hostKey, err := getDHCHAPHostKey(ctx, skm, nodeID, subsystemNQN)
if err == nil {

Collaborator

What about internal error cases? It looks like we would generate and store a new key.

Contributor Author

Great catch, I will fix it.

Contributor Author

@Madhu-1, I am wondering how to solve this issue.
I can add a check on the returned error, like:

func GetOrCreateDHCHAPHostKey(
	ctx context.Context,
	skm SecurityKeyManager,
	nodeID,
	subsystemNQN,
	hostNQN string,
) (string, error) {
	// Try to get existing key
	hostKey, err := getDHCHAPHostKey(ctx, skm, nodeID, subsystemNQN)
	if err == nil {
		// Key exists, return it
		return hostKey, nil
	}
	// Only create if truly not found - not on any other error.
	if !errors.Is(err, ErrKeyNotFound) {
		// Real error (KMS down, network issue etc) - don't generate new key
		return "", fmt.Errorf("failed to check existing host key: %w", err)
	}
...

ErrKeyNotFound would come from the callee, getDHCHAPHostKey(), when it detects that the key does not exist.
But each DEKStore has its own error code (Azure, RBD metadata for testing, etc.).
The RBD metadata DEKStore I created (internal/nvmeof/rbd_dekstore.go) calls GetMetadata(), which calls into librbd.
librbd returns -ENOENT when the metadata key doesn't exist.

Another KMS (with a different DEKStore) will probably return a different error,
and I am not sure how to handle that.

Maybe each existing KMS implementation in internal/kms that also implements the DEKStore interface needs a specific check for the key-not-found case (and not just err != nil).

What do you think?

Collaborator

If we don't get a common error, it would be tricky. Let's add this, and we need to add a comment to update the error list when we add support for a new KMS, as an error might differ.

Contributor Author

fixed

@gadididi gadididi requested a review from Madhu-1 February 10, 2026 11:22
Implement SecurityKeyNVMEOFManager to manage NVMe-oF DH-CHAP authentication
keys using pluggable KMS backends (Vault, metadata KMS, etc).
In the future it will also be used for TLS/PSK keys.

- Encrypts/decrypts keys using KMS
- Supports both integrated KMS (Vault) and external DEKStore (RBD metadata)
- Provides StoreKey/GetKey/RemoveKey for key lifecycle management

Defaults to "metadata" KMS for **testing**,
which stores encrypted keys
in RBD image metadata via external DEKStore.

Signed-off-by: gadi-didi <gadi.didi@ibm.com>
Add DH-CHAP authentication key operations for NVMe-oF connections:

- generateDHCHAPKey(): Creates NVMe spec-compliant keys in
  DHHC-1:hash:base64:crc32 format (supports SHA-256/384/512)
- GetOrCreateDHCHAPHostKey/SubsystemKey(): Manages per-connection keys
- buildDHCHAPKeyID(): Generates unique key IDs using nodeID and
  hashed subsystemNQN (e.g., nvmeof-dhchap-host-node1-abc123)

Keys are stored/retrieved via SecurityKeyNVMEOFManager using KMS
encryption. Each node-subsystem pair gets a unique authentication key.

Signed-off-by: gadi-didi <gadi.didi@ibm.com>
- Get the DH-CHAP mode; it can be
"none", "unidirectional", "bidirectional", or empty.
Store it in volumeContext (for ControllerPublishVolume()).
- Get the authenticationKMSID variable, which can also
be empty, and store it in volumeContext.

Signed-off-by: gadi-didi <gadi.didi@ibm.com>
Add these variables because they are going to be used
when removing DH-CHAP keys in ControllerUnpublishVolume().

Signed-off-by: gadi-didi <gadi.didi@ibm.com>
Implement kms.DEKStore using RBD image metadata for testing/development.
Stores encrypted DH-CHAP keys as metadata entries in the backing RBD image.

- Keys stored with prefix "nvmeof.csi.ceph.com/" to distinguish from
  volume encryption and NVMe-oF resource metadata
- Used with "metadata" KMS type (secrets-metadata) for POC/testing
- Production deployments should use Vault KMS with integrated storage

Signed-off-by: gadi-didi <gadi.didi@ibm.com>
Add an instance of SecurityKeyNVMEOFManager to the
controller server struct.
Also add getOrInitSecurityKeys(), a
lazy initializer for the SecurityKeyNVMEOFManager struct.
This field manages security material such as DH-CHAP
and PSK/TLS keys.

Signed-off-by: gadi-didi <gadi.didi@ibm.com>
Implement DH-CHAP authentication for NVMe-oF host connections with
pluggable KMS backend support for secure key management.

Changes:
- setupDHCHAPKeys(): Initialize KMS, retrieve/generate host authentication
  keys, and configure DEKStore (RBD metadata for testing, Vault for prod)
- cleanupDHCHAPKeys(): Remove DH-CHAP keys during ControllerUnpublishVolume
- Update AddHost() to accept optional DH-CHAP host key parameter
- Add dhchapMode volume context parameter (none/unidirectional/bidirectional)

Authentication flow:
1. Parse dhchapMode from volume context (defaults to "metadata" KMS)
2. Initialize SecurityKeyNVMEOFManager with KMS credentials
3. For metadata KMS: Set RBD volume as DEKStore (test mode only)
4. GetOrCreateDHCHAPHostKey() retrieves existing or generates new key
5. Pass encrypted key to gateway AddHost() call

Keys are stored per node-subsystem pair, encrypted with KMS (KEK from
K8s Secret), and persist across volume operations. Production deployments
should use Vault KMS with integrated DEK storage.

Signed-off-by: gadi-didi <gadi.didi@ibm.com>
Implement DH-CHAP authentication for NVMe-oF initiator connections
with support for both unidirectional and bidirectional modes.

Changes:
- setupDHCHAPAuth(): Retrieve host/subsystem keys from KMS and configure
  ConnectRequest with authentication parameters
- Pass dhchapMode from volume context through connection flow
- Add --dhchap-secret and --dhchap-ctrl-secret to nvme connect command
- Support RBD metadata DEKStore (testing) and Vault KMS (production)

Authentication flow:
1. NodeStageVolume receives dhchapMode from volume context
2. Initialize SecurityKeyNVMEOFManager with same KMS as controller
3. Retrieve existing DH-CHAP keys (generated during ControllerPublishVolume)
4. For unidirectional: Add host key to nvme connect
5. For bidirectional: Add both host and subsystem keys

Signed-off-by: gadi-didi <gadi.didi@ibm.com>
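
Steps 4 and 5 of the node-side flow above can be sketched as follows. The `--dhchap-secret` and `--dhchap-ctrl-secret` flags and the other connect options appear in the nvme connect command shown earlier in this PR; the connectRequest struct and buildConnectArgs function are hypothetical and only illustrate how the mode selects which secrets are passed.

```go
package main

import (
	"errors"
	"fmt"
)

// connectRequest is a hypothetical carrier for the connection parameters.
type connectRequest struct {
	SubsystemNQN string
	Address      string
	Port         string
	HostNQN      string
	HostKey      string // host secret, used in both enabled modes
	CtrlKey      string // subsystem (controller) secret, bidirectional only
}

// buildConnectArgs returns the nvme-cli arguments for the given dhchapMode.
func buildConnectArgs(req connectRequest, mode string) ([]string, error) {
	args := []string{
		"connect", "-t", "tcp",
		"-n", req.SubsystemNQN,
		"-a", req.Address,
		"-s", req.Port,
		"--hostnqn", req.HostNQN,
	}
	switch mode {
	case "", "none":
		// No authentication: nothing to add.
	case "unidirectional":
		if req.HostKey == "" {
			return nil, errors.New("host key required for unidirectional mode")
		}
		args = append(args, "--dhchap-secret", req.HostKey)
	case "bidirectional":
		if req.HostKey == "" || req.CtrlKey == "" {
			return nil, errors.New("host and subsystem keys required for bidirectional mode")
		}
		args = append(args,
			"--dhchap-secret", req.HostKey,
			"--dhchap-ctrl-secret", req.CtrlKey)
	default:
		return nil, fmt.Errorf("invalid dhchapMode %q", mode)
	}
	return args, nil
}

func main() {
	req := connectRequest{
		SubsystemNQN: "nqn.2016-06.io.ceph:subsystem.test-integration",
		Address:      "10.128.2.52",
		Port:         "4420",
		HostNQN:      "nqn.2014-08.org.nvmexpress:example-node",
		HostKey:      "host-secret-placeholder",
		CtrlKey:      "ctrl-secret-placeholder",
	}
	args, err := buildConnectArgs(req, "bidirectional")
	// Print only the argument count so secrets never reach the log.
	fmt.Println(len(args), err)
}
```

Since the secrets end up on the command line, a real implementation would likely also want to keep them out of debug logs.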
@gadididi

Contributor Author

Hi @nixpanic, @Madhu-1
I pushed the fixes: https://github.com/ceph/ceph-csi/compare/51efa66c5463a73a9b142965930657a5a7214df7..549b5157ef5c87ecbcbcf881c08f88c6517ee1f6
I also tested and verified that everything works.
There are no DH-CHAP tests yet (they will be added to the e2e tests in the next PR).
Can you re-review my PR please? 🙂

@nixpanic

Member

@Mergifyio queue

@mergify

mergify Bot commented Feb 13, 2026

Contributor

Merge Queue Status

Rule: default


This pull request spent 37 minutes 19 seconds in the queue, including 37 minutes running CI.

Required conditions to merge
  • #approved-reviews-by >= 2 [🛡 GitHub branch protection]
  • #changes-requested-reviews-by = 0 [🛡 GitHub branch protection]
  • any of:
    • all of:
      • base=devel
      • status-success=codespell
      • status-success=go-test
      • status-success=golangci-lint
      • status-success=lint-extras
      • status-success=mod-check
      • status-success=multi-arch-build
      • status-success=uncommitted-code-check
      • any of:
        • label=ci/skip/e2e
        • all of:
          • status-success=ci/centos/k8s-e2e-external-storage/1.33
          • status-success=ci/centos/k8s-e2e-external-storage/1.34
          • status-success=ci/centos/k8s-e2e-external-storage/1.35
          • status-success=ci/centos/mini-e2e-helm/k8s-1.33
          • status-success=ci/centos/mini-e2e-helm/k8s-1.34
          • status-success=ci/centos/mini-e2e-helm/k8s-1.35
          • status-success=ci/centos/mini-e2e/k8s-1.33
          • status-success=ci/centos/mini-e2e/k8s-1.34
          • status-success=ci/centos/mini-e2e/k8s-1.35
          • status-success=ci/centos/upgrade-tests-cephfs
          • status-success=ci/centos/upgrade-tests-rbd
    • all of:
      • base~=^(release-.+)$
      • status-success=codespell
      • status-success=go-test
      • status-success=golangci-lint
      • status-success=lint-extras
      • status-success=mod-check
      • status-success=multi-arch-build
      • status-success=uncommitted-code-check
      • any of:
        • label=ci/skip/e2e
        • all of:
          • status-success=ci/centos/k8s-e2e-external-storage/1.32
          • status-success=ci/centos/mini-e2e-helm/k8s-1.32
          • status-success=ci/centos/mini-e2e/k8s-1.32
          • status-success=ci/centos/k8s-e2e-external-storage/1.33
          • status-success=ci/centos/k8s-e2e-external-storage/1.34
          • status-success=ci/centos/mini-e2e-helm/k8s-1.33
          • status-success=ci/centos/mini-e2e-helm/k8s-1.34
          • status-success=ci/centos/mini-e2e/k8s-1.33
          • status-success=ci/centos/mini-e2e/k8s-1.34
          • status-success=ci/centos/upgrade-tests-cephfs
          • status-success=ci/centos/upgrade-tests-rbd
    • all of:
      • base=release-v3.15
      • status-success=codespell
      • status-success=go-test
      • status-success=golangci-lint
      • status-success=lint-extras
      • status-success=mod-check
      • status-success=multi-arch-build
      • status-success=uncommitted-code-check
      • any of:
        • label=ci/skip/e2e
        • all of:
          • status-success=ci/centos/k8s-e2e-external-storage/1.31
          • status-success=ci/centos/k8s-e2e-external-storage/1.32
          • status-success=ci/centos/mini-e2e-helm/k8s-1.31
          • status-success=ci/centos/mini-e2e-helm/k8s-1.32
          • status-success=ci/centos/mini-e2e/k8s-1.31
          • status-success=ci/centos/mini-e2e/k8s-1.32
          • status-success=ci/centos/k8s-e2e-external-storage/1.33
          • status-success=ci/centos/mini-e2e-helm/k8s-1.33
          • status-success=ci/centos/mini-e2e/k8s-1.33
          • status-success=ci/centos/upgrade-tests-cephfs
          • status-success=ci/centos/upgrade-tests-rbd
    • all of:
      • base=ci/centos
      • status-success=ci/centos/jjb-validate
      • status-success=ci/centos/job-validation

@mergify mergify Bot added the queued label Feb 13, 2026
mergify Bot added a commit that referenced this pull request Feb 13, 2026
@mergify mergify Bot merged commit 9747644 into ceph:devel Feb 13, 2026
18 checks passed
@mergify mergify Bot removed the queued label Feb 13, 2026
Labels

  • ci/skip/e2e: skip running e2e CI jobs
  • component/nvme-of: Issues and PRs related to NVMe-oF.

3 participants