nvmeof: fix bug in cleanup NVMe-oF resources #5936
Conversation
6b69715 to
52cbada
Compare
add another case, if the subsystem of the rquested ns (that we want to remove) does not exist, dont raise an error. Signed-off-by: gadi-didi <gadi.didi@ibm.com>
52cbada to
3c27975
Compare
When CreateVolume fails during NVMe-oF resource setup (e.g., listener creation failure), the cleanup defer was not executing properly because createNVMeoFResources() returned nil for nvmeofData on error. This left orphaned subsystems on the gateway that prevented subsequent volume creation attempts. Changes: - Modified createNVMeoFResources() to return nvmeofData even on error (instead of nil), allowing cleanup to access gateway connection info and subsystem details - Added NamespaceID > 0 check in cleanupNVMeoFResources() to skip namespace deletion when it was never created (NamespaceID defaults to 0) Signed-off-by: gadi-didi <gadi.didi@ibm.com>
when there is some failure to remove some listener, it is ok to just print warning, because delete the subsystem will delete the listener. Signed-off-by: gadi-didi <gadi.didi@ibm.com>
before this change,if the subsystem exists,it would not create listeners. but maybe another pvc on same subsystem but with different listener\s was requested, it would not create the listener\s (in manual mode of adding the listeners) now, even the subsystem exists, if we are in manual listener mode try to add them (if they exist, it is ok) Signed-off-by: gadi-didi <gadi.didi@ibm.com>
the defer in CreateVolume() for cleanup the NVMe-oF resoureces was in incorrect place, so in case of createNVMeoFResources() failure it was not called at all. Signed-off-by: gadi-didi <gadi.didi@ibm.com>
3c27975 to
8ebc1c2
Compare
| // Step 3: Ensure subsystem exists (and listener) | ||
| if err := ensureSubsystem(ctx, gateway, nvmeofData.SubsystemNQN, networkMask, nvmeofData.ListenerInfo); err != nil { | ||
| return nil, fmt.Errorf("subsystem setup failed: %w", err) | ||
| return nvmeofData, fmt.Errorf("subsystem setup failed: %w", err) |
There was a problem hiding this comment.
I think it is nicer to not return nvmeofData in case of a failure. That would mean a cleanup needs to be done in this function too. It should make the code simpler to understand, but it also adds a little more complexity in this function 🤔
There was a problem hiding this comment.
@nixpanic , I wanted to reuse the cleanupNVMeoFResources() from DeleteVolume().
So, if I return nil it means here:
defer func() {
// Cleanup NVMe-oF resources on subsequent errors.
// only if there was an error and nvmeofData is not nil, it means resources were created.
if err != nil && nvmeofData != nil {
log.DebugLog(ctx, "Cleaning up NVMe-oF resources for volume %q due to error: %v", volumeID, err)
cleanupErr := cs.cleanupNVMeoFResources(ctx, nvmeofData)
if cleanupErr != nil {
log.ErrorLog(ctx, "failed to cleanup NVMe-oF resources for volume %q: %v", volumeID, cleanupErr)
}
}
}()I cannot to send to cs.cleanupNVMeoFResources() the nvmeofData because it will be empty. So, I will need to change the cleanupNVMeoFResources() to get subsystemNqn, NSid , and ListenerInfo..
(and the cleanupNVMeoFResources() that is called in the DeleteVolume() as well)
what do you think is the best way to clean in case of partial initialization?
There was a problem hiding this comment.
My preference is to have each function cleanup the resources it allocated itself. That also means there is less re-use possible of functions like cleanupNVMeoFResources(). To structure it the way I would prefer, each function that allocates resources would probably get a cleanup-helper function, which then can get called from cleanupNVMeoFResources().
But I'm not sure that it really results in a simpler/cleaner approach... The current way is good enough for now, we can try to make it better maintainable later on.
|
@Mergifyio queue |
✅ The pull request has been merged automaticallyDetailsThe pull request has been merged automatically at 6b688a9 |
Merge Queue Status✅ The pull request has been merged at 8ebc1c2 This pull request spent 35 minutes 5 seconds in the queue, including 34 minutes 47 seconds running CI. Required conditions to merge
|
Describe what this PR does
Problem
CreateVolumefails during NVMe-oF resource setup, the cleanup defer does not execute properly, leaving orphaned subsystems (instead of removing it) on the gateway that cause subsequent volume creation attempts to success (but should not success).Root Cause
The issue occurred in the following sequence:
CreateVolumecreates an RBD volume successfullycreateNVMeoFResources()is called to set up NVMe-oF resources:createNVMeoFResources()returnsnil, error<-- BugCreateVolumewas not called at all! <-- Another BugExample from logs:
you can see from the log that in the second try (by CSI mechanism) it passed and subsystem existed already, but it should not!!
Solution
1. Return nvmeofData on Error (Main Fix)
Modified
createNVMeoFResources()to returnnvmeofDatainstead ofnilwhen errors occur after the struct is initialized:Why this fixes it: The cleanup defer now receives valid
nvmeofDatacontaining gateway connection info and subsystem details, allowing it to properly clean up the orphaned subsystem.2. Add NamespaceID Guard
Added check in
cleanupNVMeoFResources()to skip namespace deletion when it was never created:Why this is needed: Since we now return
nvmeofDataeven when namespace creation fails,NamespaceIDmay still be 0 (the default value). Attempting to delete namespace 0 would fail.3. Fix ensureSubsystem to Create Listeners Even if Subsystem Exists
Modified
ensureSubsystem()to attempt listener creation even when the subsystem already exists:4. The defer was declared AFTER the call that could fail
Moved the cleanup defer to be declared BEFORE the call to
createNVMeoFResources():Note: The RBD cleanup defer remains in its current position (after
CreateVolume) because:volumeIDfrom the response for cleanupChecklist:
guidelines in the developer
guide.
Request
notes
updated with breaking and/or notable changes for the next major release.
Show available bot commands
These commands are normally not required, but in case of issues, leave any of
the following bot commands in an otherwise empty comment in this PR:
/retest ci/centos/<job-name>: retest the<job-name>after unrelatedfailure (please report the failure too!)