Workspace resource is stuck with external-create-pending annotation if provider-terraform pod is deleted during Create #340

@bobh66

What happened?

The provider-terraform Create() implementation runs terraform apply, which can take a while: certainly tens of seconds and often several minutes.

If the provider pod is terminated while terraform apply is running, the external-create-pending annotation can get stuck on the Workspace resource and the managed Reconciler will not process it without manual intervention to remove the annotation.

This is most easily demonstrated when provider-terraform has code that detects the context cancellation and returns a failure from Create():

crossplane-contrib/provider-terraform#76

This should cause the Reconciler to set the ExternalCreateFailed annotation, which would allow the resource to be retried on the next cycle. However, the update of the resource to set that annotation fails because the context has already been cancelled, so the resource is stuck with the pending annotation.

The logs indicate that the critical annotations cannot be updated because the context has been canceled:

2022-07-14T01:33:06.401Z        DEBUG   provider-terraform      Cannot create external resource {"controller": "managed/workspace.tf.crossplane.io", "request": "/test7", "uid": "9b803b6d-2380-47dc-91e2-99f94d0dd0b5", "version": "64480800", "external-name": "", "error": "cannot apply Terraform configuration: context canceled shutdown while running terraform command: error waiting for child process to terminate: context canceled", "errorVerbose": "context canceled\nerror waiting for child process to terminate\ngithub.com/crossplane-contrib/provider-terraform/internal/terraform.runCommand\n\t/home/bobh/git/bobh66/provider-terraform/internal/terraform/terraform.go:528\ngithub.com/crossplane-contrib/provider-terraform/internal/terraform.Harness.Apply\n\t/home/bobh/git/bobh66/provider-terraform/internal/terraform/terraform.go:474\ngithub.com/crossplane-contrib/provider-terraform/internal/controller/workspace.(*external).Update\n\t/home/bobh/git/bobh66/provider-terraform/internal/controller/workspace/workspace.go:331\ngithub.com/crossplane-contrib/provider-terraform/internal/controller/workspace.(*external).Create\n\t/home/bobh/git/bobh66/provider-terraform/internal/controller/workspace/workspace.go:315\ngithub.com/crossplane/crossplane-runtime/pkg/reconciler/managed.(*Reconciler).Reconcile\n\t/home/bobh/git/bobh66/provider-terraform/vendor/github.com/crossplane/crossplane-runtime/pkg/reconciler/managed/reconciler.go:886\ngithub.com/crossplane/crossplane-runtime/pkg/ratelimiter.(*Reconciler).Reconcile\n\t/home/bobh/git/bobh66/provider-terraform/vendor/github.com/crossplane/crossplane-runtime/pkg/ratelimiter/reconciler.go:54\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/c
ontroller-runtime/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1581\ncontext canceled shutdown while running terraform command\ngithub.com/crossplane-contrib/provider-terraform/internal/terraform.runCommand\n\t/home/bobh/git/bobh66/provider-terraform/internal/terraform/terraform.go:528\ngithub.com/crossplane-contrib/provider-terraform/internal/terraform.Harness.Apply\n\t/home/bobh/git/bobh66/provider-terraform/internal/terraform/terraform.go:474\ngithub.com/crossplane-contrib/provider-terraform/internal/controller/workspace.(*external).Update\n\t/home/bobh/git/bobh66/provider-terraform/internal/controller/workspace/workspace.go:331\ngithub.com/crossplane-contrib/provider-terraform/internal/controller/workspace.(*external).Create\n\t/home/bobh/git/bobh66/provider-terraform/internal/controller/workspace/workspace.go:315\ngithub.com/crossplane/crossplane-runtime/pkg/reconciler/managed.(*Reconciler).Reconcile\n\t/home/bobh/git/bobh66/provider-terraform/vendor/github.com/crossplane/crossplane-runtime/pkg/reconciler/managed/reconciler.go:886\ngithub.com/crossplane/crossplane-runtime/pkg/ratelimiter.(*Reconciler).Reconcile\n\t/home/bobh/git/bobh66/provider-terraform/vendor/github.com/crossplane/crossplane-runtime/pkg/ratelimiter/reconciler.go:54\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Con
troller).reconcileHandler\n\t/home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1581\ncannot apply Terraform configuration\ngithub.com/crossplane-contrib/provider-terraform/internal/controller/workspace.(*external).Update\n\t/home/bobh/git/bobh66/provider-terraform/internal/controller/workspace/workspace.go:332\ngithub.com/crossplane-contrib/provider-terraform/internal/controller/workspace.(*external).Create\n\t/home/bobh/git/bobh66/provider-terraform/internal/controller/workspace/workspace.go:315\ngithub.com/crossplane/crossplane-runtime/pkg/reconciler/managed.(*Reconciler).Reconcile\n\t/home/bobh/git/bobh66/provider-terraform/vendor/github.com/crossplane/crossplane-runtime/pkg/reconciler/managed/reconciler.go:886\ngithub.com/crossplane/crossplane-runtime/pkg/ratelimiter.(*Reconciler).Reconcile\n\t/home/bobh/git/bobh66/provider-terraform/vendor/github.com/crossplane/crossplane-runtime/pkg/ratelimiter/reconciler.go:54\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/bobh/
git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1581"}
2022-07-14T01:33:06.401Z        DEBUG   provider-terraform      cannot update managed resource annotations      {"controller": "managed/workspace.tf.crossplane.io", "request": "/test7", "uid": "9b803b6d-2380-47dc-91e2-99f94d0dd0b5", "version": "64480800", "external-name": "", "error": "cannot update critical annotations: context canceled"}
2022-07-14T01:33:06.401Z        ERROR   controller.managed/workspace.tf.crossplane.io   Reconciler error        {"reconciler group": "tf.crossplane.io", "reconciler kind": "Workspace", "name": "test7", "namespace": "", "error": "cannot update managed resource status: context canceled"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
        /home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
        /home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227
2022-07-14T01:33:06.401Z        DEBUG   provider-terraform      Reconciling     {"controller": "managed/workspace.tf.crossplane.io", "request": "/test7"}
2022-07-14T01:33:06.401Z        DEBUG   provider-terraform      cannot determine creation result - remove the crossplane.io/external-create-pending annotation if it is safe to proceed {"controller": "managed/workspace.tf.crossplane.io", "request": "/test7", "uid": "9b803b6d-2380-47dc-91e2-99f94d0dd0b5", "version": "64480909", "external-name": "test7"}
2022-07-14T01:33:06.401Z        ERROR   controller.managed/workspace.tf.crossplane.io   Reconciler error        {"reconciler group": "tf.crossplane.io", "reconciler kind": "Workspace", "name": "test7", "namespace": "", "error": "cannot update managed resource status: context canceled"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
        /home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
        /home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227
2022-07-14T01:33:06.401Z        DEBUG   events  Warning {"object": {"kind":"Workspace","name":"test7","uid":"9b803b6d-2380-47dc-91e2-99f94d0dd0b5","apiVersion":"tf.crossplane.io/v1alpha1","resourceVersion":"64480909"}, "reason": "CannotCreateExternalResource", "message": "cannot apply Terraform configuration: context canceled shutdown while running terraform command: error waiting for child process to terminate: context canceled"}
2022-07-14T01:33:06.401Z        DEBUG   events  Warning {"object": {"kind":"Workspace","name":"test7","uid":"9b803b6d-2380-47dc-91e2-99f94d0dd0b5","apiVersion":"tf.crossplane.io/v1alpha1","resourceVersion":"64480909"}, "reason": "CannotUpdateManagedResource", "message": "cannot update managed resource annotations: cannot update critical annotations: context canceled"}

How can we reproduce it?

Create a Terraform Workspace object that uses the kubernetes backend and contains a time_sleep resource with a create_duration of 5 minutes.

After applying the Workspace object, delete the provider-terraform pod within 5 minutes.

Observe the above log messages, and that the Reconciler can no longer manage the resource because of the orphaned external-create-pending annotation.

What environment did it happen in?

Crossplane version: 1.8.1

Question - should the Create() process be considered to be idempotent? Should it always be possible to rerun Create() even when the external-create-pending annotation is set?

The existing implementation of the external-create annotations has the "incomplete" check immediately at the start of reconciliation. I'm wondering if it would make sense to allow the Observe to happen and only check for incomplete external creation if the Observe indicates that the resource exists? If the resource does not exist then maybe Create should be responsible for creating it, even if the pending annotation is set?
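To make the question concrete, the two orderings can be compared as plain decision functions. This is a simplified model for discussion only: `incomplete` collapses the annotation timestamp comparison into a single boolean, `exists` stands for the result of Observe, and none of the names below are taken from crossplane-runtime.

```go
package main

import "fmt"

type action string

const (
	actionCreate  action = "call Create"
	actionBlock   action = "stop: creation result unknown"
	actionProceed action = "continue with normal reconcile"
)

// decideCurrent models the existing ordering: the incomplete-creation
// check runs first, so the result of Observe is never consulted.
func decideCurrent(incomplete, exists bool) action {
	if incomplete {
		return actionBlock
	}
	if !exists {
		return actionCreate
	}
	return actionProceed
}

// decideProposed models the ordering suggested above: Observe runs first,
// and the incomplete check only applies when the resource exists.
func decideProposed(incomplete, exists bool) action {
	if !exists {
		return actionCreate
	}
	if incomplete {
		return actionBlock
	}
	return actionProceed
}

func main() {
	for _, incomplete := range []bool{false, true} {
		for _, exists := range []bool{false, true} {
			fmt.Printf("incomplete=%-5v exists=%-5v current=%q proposed=%q\n",
				incomplete, exists,
				decideCurrent(incomplete, exists),
				decideProposed(incomplete, exists))
		}
	}
}
```

The only case where the two orderings differ is a pending annotation on a resource that Observe reports as missing, which is the stuck Workspace described in this issue.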

@ytsarev @negz FYI

Labels: bug, stale
