Incorrect region in cached image names #1460

@dbaumgarten

Description

What happened: We tried to update our cluster to use AMI version 1.26.8-20230919 for the nodes, and the newly launched EC2 instances repeatedly failed to join the cluster.

What you expected to happen: The new nodes should successfully join the cluster.

How to reproduce it (as minimally and precisely as possible): Start a cluster with an older AMI and then update to v1.26.8-20230919.

Anything else we need to know?:
I have looked into it and here is what I found:

  • The node does not start because it fails to pull the pause-image
  • In the containerd logs we see messages like: The image 602401143452.dkr.ecr.ap-south-2\nap-south-1\neu-south-1\neu-south-2\nme-central-1\nil-central-1\nca-central-1\neu-central-1\neu-central-2\nus-west-1\nus-west-2\naf-south[...]amazonaws.com/eks/pause:3.5 is not unpacked.
  • That image name does not look correct to me. Instead of a single region, it contains all the regions, separated by newlines.
  • Looking into the git-commits of this repo we find: be7bc10
  • In the script (from the commit mentioned above) the variable REGIONS contains a newline-separated list of all regions
  • The loop in line 469 does not work. Instead of iterating over the regions, it runs only once, placing the whole (newline-separated) list of regions into the region variable
  • This broken region variable is then used in the next few lines to build the image names, resulting in broken, un-pullable images.
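
To illustrate the symptom, here is a minimal sketch. It is not the code from the commit; the three-region REGIONS value and the variable names other than REGIONS and region are made up for the example. It assumes the list is not being word-split, for instance because the expansion is quoted, which is one way a for loop ends up running once with the whole list in the loop variable. Reading the list line by line is one possible way to avoid depending on word splitting or the current IFS.

```shell
#!/bin/sh
# Hypothetical stand-in for the script's REGIONS variable:
# a newline-separated list of region names.
REGIONS="eu-west-1
us-west-2
ap-south-1"

# Failure mode (assumed cause): the quoted expansion is not word-split,
# so the loop body runs once and $region holds the entire list.
broken_count=0
for region in "${REGIONS}"; do
  broken_image="602401143452.dkr.ecr.${region}.amazonaws.com/eks/pause:3.5"
  broken_count=$((broken_count + 1))
done

# Possible fix: read the list one line at a time.
fixed_count=0
first_fixed_image=""
while IFS= read -r region; do
  # Skip empty lines so they do not produce malformed image names.
  [ -n "${region}" ] || continue
  image="602401143452.dkr.ecr.${region}.amazonaws.com/eks/pause:3.5"
  [ -n "${first_fixed_image}" ] || first_fixed_image="${image}"
  fixed_count=$((fixed_count + 1))
  echo "${image}"
done <<EOF
${REGIONS}
EOF

echo "broken loop iterations: ${broken_count}"
echo "fixed loop iterations: ${fixed_count}"
```

In this sketch the first loop builds a single image name with all regions embedded in it, similar to the name seen in the containerd logs, while the second loop builds one image name per region.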

I think that these broken image names prevent the node from booting up properly.

Environment:

  • AWS Region: eu-west-1
  • Instance Type(s): t3a.large
  • EKS Platform version (use aws eks describe-cluster --name <name> --query cluster.platformVersion): eks.6
  • Kubernetes version (use aws eks describe-cluster --name <name> --query cluster.version):
  • AMI Version: v1.26.8-20230919
  • Kernel (e.g. uname -a): 5.10.192-183.736.amzn2.x86_64
  • Release information (run cat /etc/eks/release on a node):
BASE_AMI_ID="ami-0bac1825e471f5042"
BUILD_TIME="Mon Oct  2 20:39:23 UTC 2023"
BUILD_KERNEL="5.10.192-183.736.amzn2.x86_64"
ARCH="x86_64"
