Workspace resource is stuck with external-create-pending annotation if provider-terraform pod is deleted during Create #340
Description
What happened?
The provider-terraform Create() implementation runs terraform apply, which can take a long time: certainly tens of seconds and often several minutes.
If the provider pod is terminated while terraform apply is running, the external-create-pending annotation can get stuck on the Workspace resource and the managed Reconciler will not process it without manual intervention to remove the annotation.
This is most easily demonstrated when provider-terraform includes code that detects the context cancellation and returns a failure from Create():
crossplane-contrib/provider-terraform#76
This should cause the Reconciler to set the ExternalCreateFailed annotation, which would allow the resource to be retried on the next cycle. However, the update that sets that annotation uses the same context, which has already been cancelled, so the update fails and the resource is left stuck with the pending annotation.
The logs indicate that the critical annotations cannot be updated because the context has been canceled:
2022-07-14T01:33:06.401Z DEBUG provider-terraform Cannot create external resource {"controller": "managed/workspace.tf.crossplane.io", "request": "/test7", "uid": "9b803b6d-2380-47dc-91e2-99f94d0dd0b5", "version": "64480800", "external-name": "", "error": "cannot apply Terraform configuration: context canceled shutdown while running terraform command: error waiting for child process to terminate: context canceled", "errorVerbose": "context canceled\nerror waiting for child process to terminate\ngithub.com/crossplane-contrib/provider-terraform/internal/terraform.runCommand\n\t/home/bobh/git/bobh66/provider-terraform/internal/terraform/terraform.go:528\ngithub.com/crossplane-contrib/provider-terraform/internal/terraform.Harness.Apply\n\t/home/bobh/git/bobh66/provider-terraform/internal/terraform/terraform.go:474\ngithub.com/crossplane-contrib/provider-terraform/internal/controller/workspace.(*external).Update\n\t/home/bobh/git/bobh66/provider-terraform/internal/controller/workspace/workspace.go:331\ngithub.com/crossplane-contrib/provider-terraform/internal/controller/workspace.(*external).Create\n\t/home/bobh/git/bobh66/provider-terraform/internal/controller/workspace/workspace.go:315\ngithub.com/crossplane/crossplane-runtime/pkg/reconciler/managed.(*Reconciler).Reconcile\n\t/home/bobh/git/bobh66/provider-terraform/vendor/github.com/crossplane/crossplane-runtime/pkg/reconciler/managed/reconciler.go:886\ngithub.com/crossplane/crossplane-runtime/pkg/ratelimiter.(*Reconciler).Reconcile\n\t/home/bobh/git/bobh66/provider-terraform/vendor/github.com/crossplane/crossplane-runtime/pkg/ratelimiter/reconciler.go:54\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runt
ime/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1581\ncontext canceled shutdown while running terraform command\ngithub.com/crossplane-contrib/provider-terraform/internal/terraform.runCommand\n\t/home/bobh/git/bobh66/provider-terraform/internal/terraform/terraform.go:528\ngithub.com/crossplane-contrib/provider-terraform/internal/terraform.Harness.Apply\n\t/home/bobh/git/bobh66/provider-terraform/internal/terraform/terraform.go:474\ngithub.com/crossplane-contrib/provider-terraform/internal/controller/workspace.(*external).Update\n\t/home/bobh/git/bobh66/provider-terraform/internal/controller/workspace/workspace.go:331\ngithub.com/crossplane-contrib/provider-terraform/internal/controller/workspace.(*external).Create\n\t/home/bobh/git/bobh66/provider-terraform/internal/controller/workspace/workspace.go:315\ngithub.com/crossplane/crossplane-runtime/pkg/reconciler/managed.(*Reconciler).Reconcile\n\t/home/bobh/git/bobh66/provider-terraform/vendor/github.com/crossplane/crossplane-runtime/pkg/reconciler/managed/reconciler.go:886\ngithub.com/crossplane/crossplane-runtime/pkg/ratelimiter.(*Reconciler).Reconcile\n\t/home/bobh/git/bobh66/provider-terraform/vendor/github.com/crossplane/crossplane-runtime/pkg/ratelimiter/reconciler.go:54\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).recon
cileHandler\n\t/home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1581\ncannot apply Terraform configuration\ngithub.com/crossplane-contrib/provider-terraform/internal/controller/workspace.(*external).Update\n\t/home/bobh/git/bobh66/provider-terraform/internal/controller/workspace/workspace.go:332\ngithub.com/crossplane-contrib/provider-terraform/internal/controller/workspace.(*external).Create\n\t/home/bobh/git/bobh66/provider-terraform/internal/controller/workspace/workspace.go:315\ngithub.com/crossplane/crossplane-runtime/pkg/reconciler/managed.(*Reconciler).Reconcile\n\t/home/bobh/git/bobh66/provider-terraform/vendor/github.com/crossplane/crossplane-runtime/pkg/reconciler/managed/reconciler.go:886\ngithub.com/crossplane/crossplane-runtime/pkg/ratelimiter.(*Reconciler).Reconcile\n\t/home/bobh/git/bobh66/provider-terraform/vendor/github.com/crossplane/crossplane-runtime/pkg/ratelimiter/reconciler.go:54\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/bobh/git/bobh66/pro
vider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1581"}
2022-07-14T01:33:06.401Z DEBUG provider-terraform cannot update managed resource annotations {"controller": "managed/workspace.tf.crossplane.io", "request": "/test7", "uid": "9b803b6d-2380-47dc-91e2-99f94d0dd0b5", "version": "64480800", "external-name": "", "error": "cannot update critical annotations: context canceled"}
2022-07-14T01:33:06.401Z ERROR controller.managed/workspace.tf.crossplane.io Reconciler error {"reconciler group": "tf.crossplane.io", "reconciler kind": "Workspace", "name": "test7", "namespace": "", "error": "cannot update managed resource status: context canceled"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227
2022-07-14T01:33:06.401Z DEBUG provider-terraform Reconciling {"controller": "managed/workspace.tf.crossplane.io", "request": "/test7"}
2022-07-14T01:33:06.401Z DEBUG provider-terraform cannot determine creation result - remove the crossplane.io/external-create-pending annotation if it is safe to proceed {"controller": "managed/workspace.tf.crossplane.io", "request": "/test7", "uid": "9b803b6d-2380-47dc-91e2-99f94d0dd0b5", "version": "64480909", "external-name": "test7"}
2022-07-14T01:33:06.401Z ERROR controller.managed/workspace.tf.crossplane.io Reconciler error {"reconciler group": "tf.crossplane.io", "reconciler kind": "Workspace", "name": "test7", "namespace": "", "error": "cannot update managed resource status: context canceled"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/home/bobh/git/bobh66/provider-terraform/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227
2022-07-14T01:33:06.401Z DEBUG events Warning {"object": {"kind":"Workspace","name":"test7","uid":"9b803b6d-2380-47dc-91e2-99f94d0dd0b5","apiVersion":"tf.crossplane.io/v1alpha1","resourceVersion":"64480909"}, "reason": "CannotCreateExternalResource", "message": "cannot apply Terraform configuration: context canceled shutdown while running terraform command: error waiting for child process to terminate: context canceled"}
2022-07-14T01:33:06.401Z DEBUG events Warning {"object": {"kind":"Workspace","name":"test7","uid":"9b803b6d-2380-47dc-91e2-99f94d0dd0b5","apiVersion":"tf.crossplane.io/v1alpha1","resourceVersion":"64480909"}, "reason": "CannotUpdateManagedResource", "message": "cannot update managed resource annotations: cannot update critical annotations: context canceled"}
How can we reproduce it?
Create a Terraform Workspace object that uses the kubernetes backend and contains a time_sleep resource with a create_duration of 5 minutes.
After applying the Workspace object, delete the provider-terraform pod within those 5 minutes.
Observe the log messages above, and that the Reconciler can no longer manage the resource because of the orphaned external-create-pending annotation.
What environment did it happen in?
Crossplane version: 1.8.1
Question: should Create() be considered idempotent? Should it always be possible to rerun Create() even when the external-create-pending annotation is set?
The existing implementation of the external-create annotations has the "incomplete" check immediately at the start of reconciliation. I'm wondering if it would make sense to allow the Observe to happen and only check for incomplete external creation if the Observe indicates that the resource exists? If the resource does not exist then maybe Create should be responsible for creating it, even if the pending annotation is set?
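The proposed ordering can be sketched as a small decision function. This is only an illustration of the idea in the paragraph above, not existing crossplane-runtime code: the `action` type, its values, and `nextAction` are hypothetical names, and `createPending` stands for "the pending annotation is set and no later succeeded or failed annotation has superseded it".

```go
package main

import "fmt"

// action is a hypothetical label for what the reconciler would do next.
type action string

const (
	actionCreate  action = "create"  // call Create(), even if the pending annotation is set
	actionBlock   action = "block"   // creation result unknown: wait for manual intervention
	actionProceed action = "proceed" // continue with the normal update / up-to-date checks
)

// nextAction applies the proposed ordering: Observe runs first, and the
// incomplete-creation check only applies when Observe reports that the
// external resource exists.
func nextAction(createPending, exists bool) action {
	switch {
	case !exists:
		// Nothing was found, so Create is responsible for creating it.
		return actionCreate
	case createPending:
		// Something exists but we cannot tell whether our Create finished.
		return actionBlock
	default:
		return actionProceed
	}
}

func main() {
	for _, pending := range []bool{false, true} {
		for _, exists := range []bool{false, true} {
			fmt.Printf("pending=%v exists=%v -> %s\n", pending, exists, nextAction(pending, exists))
		}
	}
}
```

Under this ordering the stuck Workspace from this issue (pending annotation set, nothing created) would fall into the `create` branch instead of blocking. It relies on Observe being able to detect the resource reliably, which may not hold for external APIs that are eventually consistent or that generate non-deterministic names.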