Description
Describe the bug
We recently upgraded from ESO 0.10.5 to 0.16.2. Since then, whenever certain SecretStores are in an errored state, ESO flips out and starts spamming errors nonstop, so fast that it fills my screen and hangs my terminal. The only way to fix it is to scale down ESO, fix the SecretStore, and start it back up.
As we support 1000+ developers who often don't get their SecretStore object working 100% right out of the gate, this pretty much renders the controller non-functional, as it won't do anything else while it's "hung up" on this one bad SecretStore. When I say bad, I mean anything: wrong IAM role, wrong AWS region, anything about it that is incorrect.
In 0.10.5, it would simply log the error and move on, retrying every 1m.
Example of logs (but repeated thousands of times a minute)
external-secrets-operator-85d7bff98f-6k5bb external-secrets {"level":"error","ts":"2025-08-28T23:08:06.377Z","msg":"Reconciler error","controller":"secretstore","controllerGroup":"external-secrets.io","controllerKind":"SecretStore","SecretStore":{"name":"beta-cq-workflow-scheduler-aws-systemmanager-secret-store","namespace":"beta-workflow-scheduler"},"namespace":"beta-workflow-scheduler","name":"beta-cq-workflow-scheduler-aws-systemmanager-secret-store","reconcileID":"d6c72ba5-778e-404c-8227-b947a68600b9","error":"could not validate provider: AccessDenied: User: arn:aws:sts::049306942178:assumed-role/cqeks-nonprod-049306942178-us-east-2-external-secrets/token-file-web-identity is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::049306942178:role/es/role-eks-beta-222262-cq-workflow-scheduler-us-east-2\n\tstatus code: 403, request id: 4eaf8694-474d-48c5-b4f5-44d124780182","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.20.4/pkg/internal/controller/controller.go:347\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.20.4/pkg/internal/controller/controller.go:294\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.2\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.20.4/pkg/internal/controller/controller.go:255"}
external-secrets-operator-85d7bff98f-6k5bb external-secrets {"level":"error","ts":"2025-08-28T23:08:06.402Z","logger":"controllers.SecretStore","msg":"unable to validate store","secretstore":{"name":"beta-cq-workflow-scheduler-aws-systemmanager-secret-store","namespace":"beta-workflow-scheduler"},"error":"could not validate provider: AccessDenied: User: arn:aws:sts::049306942178:assumed-role/cqeks-nonprod-049306942178-us-east-2-external-secrets/token-file-web-identity is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::049306942178:role/es/role-eks-beta-222262-cq-workflow-scheduler-us-east-2\n\tstatus code: 403, request id: 1c02a194-c2d2-4065-bc6a-9425a376ee85","stacktrace":"github.com/external-secrets/external-secrets/pkg/controllers/secretstore.reconcile\n\t/home/runner/work/external-secrets/external-secrets/pkg/controllers/secretstore/common.go:76\ngithub.com/external-secrets/external-secrets/pkg/controllers/secretstore.(*StoreReconciler).Reconcile\n\t/home/runner/work/external-secrets/external-secrets/pkg/controllers/secretstore/secretstore_controller.go:66\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Reconcile\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.20.4/pkg/internal/controller/controller.go:119\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.20.4/pkg/internal/controller/controller.go:334\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.20.4/pkg/internal/controller/controller.go:294\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.2\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.20.4/pkg/internal/controller/controller.go:255"}
external-secrets-operator-85d7bff98f-6k5bb external-secrets {"level":"error","ts":"2025-08-28T23:08:06.413Z","msg":"Reconciler error","controller":"secretstore","controllerGroup":"external-secrets.io","controllerKind":"SecretStore","SecretStore":{"name":"beta-cq-workflow-scheduler-aws-systemmanager-secret-store","namespace":"beta-workflow-scheduler"},"namespace":"beta-workflow-scheduler","name":"beta-cq-workflow-scheduler-aws-systemmanager-secret-store","reconcileID":"b0f5ee44-6f7e-4abd-b6ed-af4a42ea6133","error":"could not validate provider: AccessDenied: User: arn:aws:sts::049306942178:assumed-role/cqeks-nonprod-049306942178-us-east-2-external-secrets/token-file-web-identity is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::049306942178:role/es/role-eks-beta-222262-cq-workflow-scheduler-us-east-2\n\tstatus code: 403, request id: 1c02a194-c2d2-4065-bc6a-9425a376ee85","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.20.4/pkg/internal/controller/controller.go:347\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.20.4/pkg/internal/controller/controller.go:294\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.2\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.20.4/pkg/internal/controller/controller.go:255"}
external-secrets-operator-85d7bff98f-6k5bb external-secrets {"level":"error","ts":"2025-08-28T23:08:06.440Z","logger":"controllers.SecretStore","msg":"unable to validate store","secretstore":{"name":"beta-cq-workflow-scheduler-aws-systemmanager-secret-store","namespace":"beta-workflow-scheduler"},"error":"could not validate provider: AccessDenied: User: arn:aws:sts::049306942178:assumed-role/cqeks-nonprod-049306942178-us-east-2-external-secrets/token-file-web-identity is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::049306942178:role/es/role-eks-beta-222262-cq-workflow-scheduler-us-east-2\n\tstatus code: 403, request id: 2c5d0935-1b1e-4812-9745-0a9fb2e04be3","stacktrace":"github.com/external-secrets/external-secrets/pkg/controllers/secretstore.reconcile\n\t/home/runner/work/external-secrets/external-secrets/pkg/controllers/secretstore/common.go:76\ngithub.com/external-secrets/external-secrets/pkg/controllers/secretstore.(*StoreReconciler).Reconcile\n\t/home/runner/work/external-secrets/external-secrets/pkg/controllers/secretstore/secretstore_controller.go:66\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Reconcile\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.20.4/pkg/internal/controller/controller.go:119\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.20.4/pkg/internal/controller/controller.go:334\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.20.4/pkg/internal/controller/controller.go:294\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.2\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.20.4/pkg/internal/controller/controller.go:255"}
To Reproduce
Reproducing is intermittent. If I create a brand new SecretStore incorrectly, there is no issue; it behaves like 0.10.5. If I modify certain existing SecretStores, same thing. But other ones cause this issue, even though they look identical to me (we create all of our SecretStores with an in-house Helm chart, so they are all identical from a YAML schema perspective). Here is one of the affected SecretStores:
apiVersion: external-secrets.io/v1
kind: SecretStore
metadata:
  managedFields:
    - apiVersion: external-secrets.io/v1beta1
      fieldsType: FieldsV1
      fieldsV1:
        f:status:
          .: {}
          f:capabilities: {}
      manager: external-secrets
      operation: Update
      subresource: status
      time: '2024-07-29T02:44:17Z'
    - apiVersion: external-secrets.io/v1beta1
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:meta.helm.sh/release-name: {}
            f:meta.helm.sh/release-namespace: {}
          f:labels:
            .: {}
            f:app: {}
            f:app-id: {}
            f:app.kubernetes.io/managed-by: {}
            f:deploy-date: {}
            f:development-team-email: {}
            f:environment: {}
            f:helm-chart-release: {}
            f:release: {}
        f:spec:
          .: {}
          f:provider:
            .: {}
            f:aws:
              .: {}
              f:region: {}
              f:service: {}
      manager: helm
      operation: Update
      time: '2025-06-05T14:15:04Z'
    - apiVersion: external-secrets.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:spec:
          f:provider:
            f:aws:
              f:role: {}
      manager: agent
      operation: Update
      time: '2025-08-28T22:59:51Z'
    - apiVersion: external-secrets.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:status:
          f:conditions: {}
      manager: external-secrets
      operation: Update
      subresource: status
      time: '2025-08-28T23:00:00Z'
  name: beta-ci-analyzer-aws-systemmanager-secret-store
  namespace: backend
spec:
  provider:
    aws:
      region: us-east-2
      role: arn:aws:iam::999999999:role/eks/role-eks-beta-219787-ci-analyzer-us-east-2
      service: ParameterStore
Expected behavior
Simply log the problem and retry in 1m.