Cleanup Crucible resources with volume delete saga#1706
Conversation
Add a volume delete saga, and call that whenever a volume needs to be deleted. Crucible resources now have accounting for their use, and when the number of references reaches zero then cleanup occurs. Adds a table to store created region snapshots as part of this accounting, and make a record of when a Crucible snapshot is taken of a region. Region snapshots are uniquely identified by (dataset_id, region_id, snapshot_id), and additionally contain a snapshot address for identification as part of a volume construction request. All DELETE calls to Crucible Agent(s) are now only performed by the volume delete saga. Add tests for cases where if Nexus is properly accounting for snapshots. It should not delete the disk's corresponding region when the disk is deleted because a snapshot exists, or if multiple disks reference a running snapshot. A small change to the simulated Crucible agent is also included in this commit - logic was moved out of the HTTP endpoint function into the Crucible struct. Logic doesn't belong in HTTP endpoint functions :) Closes oxidecomputer#1632
|
|
||
| let now = Utc::now(); | ||
| diesel::update(dsl::volume) | ||
| diesel::delete(dsl::volume) |
There was a problem hiding this comment.
If we are no longer soft-deleting, is it even possible to have a Not Found error returned? I see we're handling it below, but wouldn't that just look like "no rows updated" now?
There was a problem hiding this comment.
We're doing both - volumes are either soft deleted (if there are associated regions) and hard deleted (when those regions are cleaned up). I don't think I understand this comment though - volume_hard_delete doesn't filter on anything except the id, so it should find it every time.
There was a problem hiding this comment.
I think my only beef here was with the error handling, where we handle the NotFoundByLookup case below. If we issue this delete request when the volume doesn't exist, will a NotFound error actually be returned?
https://docs.diesel.rs/master/diesel/result/enum.Error.html#variant.NotFound mentions this:
No rows were returned by a query expected to return at least one row.
This variant is only returned by get_result and first. load does not treat 0 rows as an error. If you would like to allow either 0 or 1 rows, call optional on the result.
I'm not sure what happens when we make this call with execute - it doesn't seem like an error for no rows to be updated.
|
I believe I've addressed all the comments, let me know. |
smklein
left a comment
There was a problem hiding this comment.
Ship it! Thanks for the patches.
Add a volume delete saga, and call that whenever a volume needs to be deleted. Crucible resources now have accounting for their use, and when the number of references reaches zero then cleanup occurs.
Adds a table to store created region snapshots as part of this accounting, and make a record of when a Crucible snapshot is taken of a region. Region snapshots are uniquely identified by (dataset_id, region_id, snapshot_id), and additionally contain a snapshot address for identification as part of a volume construction request.
All DELETE calls to Crucible Agent(s) are now only performed by the volume delete saga.
Add tests for cases where if Nexus is properly accounting for snapshots. It should not delete the disk's corresponding region when the disk is deleted because a snapshot exists, or if multiple disks reference a running snapshot.
A small change to the simulated Crucible agent is also included in this commit - logic was moved out of the HTTP endpoint function into the Crucible struct. Logic doesn't belong in HTTP endpoint functions :)
Closes #1632