eks: renaming the cluster would trigger rollback due to not authorized to delete the cluster #29282

@pahud

Description

Describe the bug

Given the code:

const cluster = new eks.Cluster(this, 'demo-eks-cluster', {
  vpc,
  clusterName: 'foo',
  defaultCapacity: 0,
  version: eks.KubernetesVersion.V1_29,
  kubectlLayer: new KubectlLayer(this, 'kubectlLayer'),
});

If we rename the clusterName, it triggers a replacement due to this, which creates a new cluster and then deletes the existing one. However, the delete fails with a not-authorized error, causing a rollback:

11:12:03 AM | DELETE_FAILED | AWS::CloudFormation::CustomResource | demo-eks-cluster/R...e/Resource/Default
Received response status [FAILED] from custom resource. Message returned: User: arn:aws:sts::<deducted>:assumed-role/dummy-stack1-demoeksclusterCreationRoleD556FC0C-eSVJAmlypMdd/AWSCDK.EKSCluster.Delete.ec88927b-3c8e-4b8f-bd7b-94445b11de48 is not authorized to perform: eks:DeleteCluster on resource: arn:aws:eks:us-east-1:<deducted>:cluster/foo

Logs: /aws/lambda/dummy-stack1-awscdkawseksCl-OnEventHandler42BEBAE0-F7RpMPmuSPA5

at throwDefaultError (/var/runtime/node_modules/@aws-sdk/node_modules/@smithy/smithy-client/dist-cjs/default-error-handler.js:8:22)
at /var/runtime/node_modules/@aws-sdk/node_modules/@smithy/smithy-client/dist-cjs/default-error-handler.js:18:39
at de_DeleteClusterCommandError (/var/runtime/node_modules/@aws-sdk/client-eks/dist-cjs/protocols/Aws_restJson1.js:1526:20)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async /var/runtime/node_modules/@aws-sdk/node_modules/@smithy/middleware-serde/dist-cjs/deserializerMiddleware.js:7:24

I am not sure whether updating clusterName should trigger a replacement, but either way we probably need to add the relevant permissions to the cluster resource handler.

Expected Behavior

Updating the clusterName should not fail. Preferably it would be an in-place update, but if a replacement is necessary, the replacement should complete without failing and rolling back.

Current Behavior

The update fails and rolls back.

Reproduction Steps

As described above.

Possible Solution

  1. Test whether we can simply trigger an in-place update rather than a replacement.
  2. If a replacement is necessary, grant eks:DeleteCluster on the cluster resource to the custom resource handler role.
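
As a sketch of option 2, the handler role would need an IAM policy statement along these lines. The resource scope shown here is an assumption on my part (it could be narrowed to the specific cluster ARNs involved in the replacement rather than a wildcard):

```json
{
  "Effect": "Allow",
  "Action": "eks:DeleteCluster",
  "Resource": "arn:aws:eks:*:*:cluster/*"
}
```

Since the error shows the delete is attempted against the old cluster's ARN (`cluster/foo` above), scoping the statement to the clusters the handler manages should be sufficient.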

Additional Information/Context

No response

CDK CLI Version

v2.130.0

Framework Version

No response

Node.js Version

v18.16.0

OS

macOS

Language

TypeScript

Language Version

No response

Other information

No response

Metadata

Labels

@aws-cdk/aws-eks (Related to Amazon Elastic Kubernetes Service), bug (This issue is a bug.), effort/medium (Medium work item – several days of effort), p1
