[nexus] Instance Deletion is now a saga by smklein · Pull Request #2060 · oxidecomputer/omicron

smklein · 2022-12-16T17:17:36Z

Fixes #2034

just-be-dev · 2022-12-21T21:31:07Z

+async fn sid_delete_instance_record(
+    sagactx: NexusActionContext,
+) -> Result<(), ActionError> {
+    let osagactx = sagactx.user_data();
+    let params = sagactx.saga_params::<Params>()?;
+    let opctx = OpContext::for_saga_action(&sagactx, &params.serialized_authn);
+    osagactx
+        .datastore()
+        .project_delete_instance(&opctx, &params.authz_instance)
+        .await
+        .map_err(ActionError::action_failed)?;
+    Ok(())
+}


It looks like part of what project_delete_instance does is go through and detach disks. I guess that's really just updating the DB records for the disks to disassociate them from the instance (via Instance::detach_resource). Looking at the detach disk call, it uses the same Instance::detach_resource. I was wondering whether that action, detaching disks, should be hoisted into the saga. Given that it's basically using the same thing under the hood, I suppose not?

I do not think it should be.

When I wrote DatastoreDetachManyTarget::detach_resources - which Instance objects implement, so they could detach disks - the whole point of it was that the detach operation (on disks) could be issued as part of the same statement that deletes instances, within a common table expression being issued to CRDB.

Because everything lives in a single CTE, the operation is transactional even though it modifies multiple tables, and it needs only a single request to CRDB. This follows the advice of https://www.cockroachlabs.com/docs/v22.2/performance-best-practices-overview#reduce-transaction-contention

Using CTEs, where possible, is generally our "best performance" way to combine multiple SQL statements. My usage of sagas elsewhere is a worse-performance attempt to get similar properties, in a non-atomic way.

Related: Ideally, each function exposed from the datastore() would only issue a single DB statement, so that it either "fully completes" or "does nothing at all".

project_delete_instance, by virtue of being a CTE, already abides by this principle.
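To illustrate the "fully completes or does nothing at all" property, here is a minimal in-memory sketch. It is not the Nexus datastore or the actual CTE: `ToyDb`, `Disk`, and `delete_instance_and_detach_disks` are hypothetical names, and the stage-then-commit copy only stands in for the atomicity that CRDB provides for a single statement.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical stand-in for a row in a "disk" table.
#[derive(Clone, Debug, PartialEq)]
struct Disk {
    attached_to: Option<u32>, // instance id, if attached
}

// Hypothetical stand-in for two tables: instances and disks.
#[derive(Clone, Debug, PartialEq, Default)]
struct ToyDb {
    instances: HashSet<u32>,
    disks: HashMap<u32, Disk>,
}

impl ToyDb {
    // Delete an instance and detach its disks as one all-or-nothing step.
    // Changes are staged on a copy and committed only if every check passes,
    // mimicking what a single transactional statement guarantees.
    fn delete_instance_and_detach_disks(&mut self, instance: u32) -> Result<usize, String> {
        let mut staged = self.clone();
        if !staged.instances.remove(&instance) {
            return Err(format!("instance {instance} not found"));
        }
        let mut detached = 0;
        for disk in staged.disks.values_mut() {
            if disk.attached_to == Some(instance) {
                disk.attached_to = None;
                detached += 1;
            }
        }
        *self = staged; // commit both "tables" together
        Ok(detached)
    }
}

fn main() {
    let mut db = ToyDb::default();
    db.instances.insert(1);
    db.disks.insert(10, Disk { attached_to: Some(1) });
    db.disks.insert(11, Disk { attached_to: None });

    // A failed call leaves the state untouched.
    let before = db.clone();
    assert!(db.delete_instance_and_detach_disks(99).is_err());
    assert_eq!(db, before);

    // A successful call removes the instance and detaches its disk together.
    assert_eq!(db.delete_instance_and_detach_disks(1), Ok(1));
    assert!(db.instances.is_empty());
    assert_eq!(db.disks[&10].attached_to, None);
}
```

Hoisting the detach into its own saga step would split this into two separate writes, which is the non-atomic trade-off described above.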

just-be-dev · 2022-12-21T21:32:23Z

+declare_saga_actions! {
+    instance_delete;
+    INSTANCE_DELETE_RECORD -> "no_result1" {
+        + sid_delete_instance_record
+    }
+    DELETE_NETWORK_INTERFACES -> "no_result2" {
+        + sid_delete_network_interfaces
+    }
+    DEALLOCATE_EXTERNAL_IP -> "no_result3" {
+        + sid_deallocate_external_ip
+    }
+}


Should there be any rollback cases represented here? What happens if a step fails?

There could be; I'm not 100% sure there should be? If we're able to delete the instance record, I think we should be "best-effort" rolling forward deletion, rather than attempting to undo the deletion.

For example, suppose we cannot re-allocate the deleted resource: either the resource itself was exhausted, user-data was deleted, or a sled cooperating in the allocation is no longer around. What should we do?

By having deletion be a saga, it's at least durable - even if Nexus crashes, it'll finish the work when it comes back online - but I'm not sure about the value of "making deletion undoable" vs "making sure it rolls forward, whatever that means".

TL;DR: Maybe? But there is a lot of complexity down the road of undoing deletions that I'm not really sure how to handle yet.
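As a rough illustration of the question "what happens if a step fails?", here is a toy saga runner. It is not the steno API or the `declare_saga_actions!` macro: `Step`, `run_saga`, and the action functions are hypothetical, and the sketch ignores durability entirely. It only contrasts a step list with no undo actions (like the one in this PR) against one where the first step has an undo.

```rust
// A toy saga runner: NOT the steno API, only an illustration.
type Action = fn(&mut Vec<String>) -> Result<(), String>;

struct Step {
    name: &'static str,
    action: Action,
    undo: Option<Action>, // None means "no rollback for this step"
}

// Runs steps in order. On failure, runs the undo actions of the steps that
// already completed, in reverse order, and returns the original error.
fn run_saga(steps: &[Step], log: &mut Vec<String>) -> Result<(), String> {
    for (i, step) in steps.iter().enumerate() {
        if let Err(e) = (step.action)(log) {
            log.push(format!("{} failed: {e}", step.name));
            for done in steps[..i].iter().rev() {
                if let Some(undo) = done.undo {
                    undo(log)?;
                }
            }
            return Err(e);
        }
    }
    Ok(())
}

// Hypothetical actions that only record what they did.
fn delete_record(log: &mut Vec<String>) -> Result<(), String> {
    log.push("record deleted".to_string());
    Ok(())
}

fn undelete_record(log: &mut Vec<String>) -> Result<(), String> {
    log.push("record restored".to_string());
    Ok(())
}

fn delete_nics_fails(_log: &mut Vec<String>) -> Result<(), String> {
    Err("sled unreachable".to_string())
}

fn main() {
    // No undo actions: after a failure the record simply stays deleted.
    let no_undo = [
        Step { name: "instance_delete_record", action: delete_record, undo: None },
        Step { name: "delete_network_interfaces", action: delete_nics_fails, undo: None },
    ];
    let mut log: Vec<String> = Vec::new();
    assert!(run_saga(&no_undo, &mut log).is_err());
    assert_eq!(
        log,
        vec!["record deleted", "delete_network_interfaces failed: sled unreachable"]
    );

    // With an undo on the first step, the failure unwinds the deletion.
    let with_undo = [
        Step { name: "instance_delete_record", action: delete_record, undo: Some(undelete_record as Action) },
        Step { name: "delete_network_interfaces", action: delete_nics_fails, undo: None },
    ];
    let mut log: Vec<String> = Vec::new();
    assert!(run_saga(&with_undo, &mut log).is_err());
    assert_eq!(log.last().map(String::as_str), Some("record restored"));
}
```

In the toy, `undelete_record` always succeeds; the concern raised above is that a real "undo deletion" may not be able to (exhausted resources, deleted user data, a missing sled), which is part of why rolling forward may be the better default.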

just-be-dev

Overall this looks good. The new saga macro really helps make this easier to follow. Had a question about rollbacks and somewhat of a question about disk detachment but otherwise I'm 👍

davepacheco

Thanks for fixing this.

smklein added 3 commits December 15, 2022 22:38

[sagas] Make a macro to simplify declaring saga actions

b2a2ba1

ignore rustdoc

f399ff7

[nexus] Instance Deletion is now a saga

66cf0c0

smklein mentioned this pull request Dec 16, 2022

Tracking issue for More Sagas #2059

Open

16 tasks

smklein added 5 commits December 19, 2022 13:43

Merge branch 'main' into saga-macro

2e0a199

Merge branch 'main' into saga-macro

533ee76

Move some lazy_static to once_cell

6abce08

Merge branch 'main' into saga-macro

169ed10

Extend docs

29a3ab5

Base automatically changed from saga-macro to main December 20, 2022 15:37

smklein added 2 commits December 20, 2022 15:58

Merge branch 'main' into saga-macro

3b62a6a

Merge branch 'saga-macro' into more-sagas

05cbd9d

smklein requested review from davepacheco and just-be-dev December 21, 2022 20:14

just-be-dev reviewed Dec 21, 2022

Comment thread nexus/src/app/sagas/instance_delete.rs

just-be-dev reviewed Dec 21, 2022

just-be-dev approved these changes Dec 21, 2022

davepacheco reviewed Dec 21, 2022

Comment thread nexus/authz-macros/src/lib.rs

Comment thread nexus/db-model/src/update_artifact.rs

Comment thread nexus/src/authz/api_resources.rs

Comment thread nexus/src/app/sagas/instance_delete.rs

smklein mentioned this pull request Dec 22, 2022

Resource Utilization #1782

Merged

14 tasks

davepacheco approved these changes Dec 22, 2022

smklein merged commit b5c0d3f into main Dec 22, 2022

smklein deleted the more-sagas branch December 22, 2022 18:14

smklein mentioned this pull request Dec 28, 2022

[nexus] Make project creation unwind safe, add tests #2087

Merged

smklein merged 10 commits into main from more-sagas
