Skip to content

(aws_eks): New isolated subnets (CDK 1.140) leads to error in cdk's EKS logic #19122

@ThomasSteinbach

Description

@ThomasSteinbach

UPDATE 2022-03-04: This is definately a bug and I've got a solution. See the last chapter Workaround / Solution at the very end of my bug report.


What is the problem?

The problem is that the VPC and EKS construct doesn't work properly with the newly introduced aws_ec2.SubnetType.ISOLATED in CDK 1.140 ( see #18638) as well with the aws_ec2.SubnetType.PRIVATE_ISOLATED and aws_ec2.SubnetType.PRIVATE_WITH_NAT.

Problems especially occur on imported VPCs which just have "isolated" (formerly "private") subnets.

CDK v1

With CDK 1.140 former private networks were recognized as isolated networks - see #18638

As a result of this we had to change the parameter for aws_eks.Cluster(vpc_subnets) to [aws_ec2.SubnetSelection(subnet_type=aws_ec2.SubnetType.ISOLATED)].

However the new isolated subnets seem to collide with the aws_eks.Cluster(endpoint_access) value of aws_eks.EndpointAccess.PRIVATE:

File "/usr/local/lib/python3.7/site-packages/aws_cdk/aws_eks/__init__.py", line 11745, in __init__ jsii.create(self.__class__, self, [scope, id, props])
File "/usr/local/lib/python3.7/site-packages/jsii/_kernel/__init__.py", line 297, in create for iface in getattr(klass, "__jsii_ifaces__", [])
File "/usr/local/lib/python3.7/site-packages/jsii/_kernel/providers/process.py", line 344, in create return self._process.send(request, CreateResponse)
File "/usr/local/lib/python3.7/site-packages/jsii/_kernel/providers/process.py", line 326, in send raise JSIIError(resp.error) from JavaScriptError(resp.stack)
jsii.errors.JSIIError: Vpc must contain private subnets when public endpoint access is disabled

CDK v2

After migrating to CDK v2 we have changed the aws_ec2.SubnetType.ISOLATED (v1) to aws_ec2.SubnetType.PRIVATE_ISOLATED (v2). This is because the AWS cdk-cli will store our VPC subnets within the cdk.context.json as "isolated":

    "subnetGroups": [
      {
        "name": "Isolated",
        "type": "Isolated",
        "subnets": [
          {
            "subnetId": "subnet-00xxxxxxxxxxxxxxx",

Synthesizing the app leads to the exact same error as with the previous CDK v1 example:

jsii.errors.JavaScriptError:
  Error: Vpc must contain private subnets when public endpoint access is disabled
      at new Cluster (/private/var/folders/tv/9zml9td56ssfq30x0c39z1240000gn/T/jsii-kernel-OBsjs4/node_modules/aws-cdk-lib/aws-eks/lib/cluster.js:1:9002)

Reproduction Steps

  1. mkdir /tmp/bug && cd /tmp/bug
  2. cdk init --language=python
  3. Add a real environment to your app, so that you app will look like following:
app = cdk.App()
BugStack(app, "BugStack", env=cdk.Environment(region="eu-central-1", account="123456789000"))
  1. Add the imported vpc and the EKS cluster to the Stack ( bug/bug_stack.py):
class BugStack(Stack):

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        vpc = aws_ec2.Vpc.from_lookup(self, "vpc", vpc_id="vpc-xxxxxxxxxxxxxxxx")

        print(f"#####--{id}")
        for sn in vpc.public_subnets:
            print(f"##### PUBLIC {sn.subnet_id}")
        for sn in vpc.private_subnets:
            print(f"##### PRIVATE {sn.subnet_id}")
        for sn in vpc.isolated_subnets:
            print(f"##### ISOLATED {sn.subnet_id}")

        aws_eks.Cluster(
                self,
                "cluster",
                version=aws_eks.KubernetesVersion.V1_20,
                endpoint_access=aws_eks.EndpointAccess.PRIVATE,
                vpc=vpc,
                vpc_subnets=[aws_ec2.SubnetSelection(subnet_type=aws_ec2.SubnetType.PRIVATE_ISOLATED)],
                )
  1. Doing a cdk synth should lead to the following error:
#####--<built-in function id>
##### PUBLIC s-12345
##### PUBLIC s-67890
##### PRIVATE p-12345
##### PRIVATE p-67890
jsii.errors.JavaScriptError:
  Error: Vpc must contain private subnets when public endpoint access is disabled
      at new Cluster (/private/var/folders/tv/9zml9td56ssfq30x0c39z1240000gn/T/jsii-kernel-cN6IVu/node_modules/aws-cdk-lib/aws-eks/lib/cluster.js:1:9002)
...

Oddness / "Workaround"

When changing the subnet type to PRIVATE_WITH_NAT...

vpc_subnets=[aws_ec2.SubnetSelection(subnet_type=aws_ec2.SubnetType.PRIVATE_WITH_NAT)],

... and running cdk synth again, the cdk-cli is doing a real lookup into our AWS account and will create a cdk.context.json with following content:

{
  "vpc-provider:account=xxxxxxxxxxxx:filter.vpc-id=vpc-xxxxxxxxxxxxxxxxx:region=eu-central-1:returnAsymmetricSubnets=true": {
    "vpcId": "vpc-xxxxxxxxxxxxxxxxx",
    "vpcCidrBlock": "10.105.28.0/22",
    "availabilityZones": [],
    "subnetGroups": [
      {
        "name": "Isolated",
        "type": "Isolated",
        "subnets": [
          {
            "subnetId": "subnet-xxxxxxxxxxxxxxxxx",
            "cidr": "10.105.28.0/24",
            "availabilityZone": "eu-central-1a",
            "routeTableId": "rtb-xxxxxxxxxxxxxxxxx"
          },
          {
            "subnetId": "subnet-xxxxxxxxxxxxxxxxx",
            "cidr": "10.105.29.0/24",
            "availabilityZone": "eu-central-1b",
            "routeTableId": "rtb-xxxxxxxxxxxxxxxxx"
          },
          {
            "subnetId": "subnet-xxxxxxxxxxxxxxxxx",
            "cidr": "10.105.30.0/24",
            "availabilityZone": "eu-central-1c",
            "routeTableId": "rtb-xxxxxxxxxxxxxxxxx"
          }
        ]
      }
    ]
  }
}

The synthsization itself will exit with following error:

(#####--<built-in function id>
##### PUBLIC s-12345
##### PUBLIC s-67890
##### PRIVATE p-12345
##### PRIVATE p-67890
#####--<built-in function id>
##### ISOLATED subnet-00e691f3dxxxxxxxx
##### ISOLATED subnet-0b8a1a22fxxxxxxxx
##### ISOLATED subnet-0b5a836b4xxxxxxxx
jsii.errors.JavaScriptError:
  Error: There are no 'Private' subnet groups in this VPC. Available types: Isolated
      at LookedUpVpc.selectSubnetObjectsByType (/private/var/folders/tv/9zml9td56ssfq30x0c39z1240000gn/T/jsii-kernel-0s3DjG/node_modules/aws-cdk-lib/aws-ec2/lib/vpc.js:1:4824)

Now an important behavior to mention: With this cdk.context.json present, when changing the subnet selection back to PRIVATE_ISOLATED the cdk synth will work!

That mean on the one hand CDK will fail to pass the subnet type checks of the EKS construct before doing the lookups, because CDK just passes "fake" public/private (private_with_nat) subnets to the EKS construct. If CDK reads the sunbnets from the cdk.context.json, cdk passes the proper isolated (private_isolated) subnets to EKS. On the other hand CDK fails to accept the isolated (private_isolated) subnets as "private" (private_with_nat) subnets in the VPC construct. That wrong logic definately needs to be fixed.

What did you expect to happen?

I'd expect to happen that the code will synthesize with ISOLATED(v1)/ PRIVATE_ISOLATED(v1) subnets as before with private subnets.
We changed nothing but the subnet selection from private to isolated networks, as the cdk lookup detects our networks as 'isolated' since cdk 1.140.

What actually happened?

cdk synth complains about

jsii.errors.JSIIError: Vpc must contain private subnets when public endpoint access is disabled

CDK CLI Version

3.13.0 (build b0b744d)

Framework Version

1.144 / 2.13.0

Node.js Version

v16.13.0

OS

Linux/Mac/Win

Language

Python

Language Version

Python 3.7, 3.8, 3.9, 3.10


UPDATES after new insights from 2022-03-04

Solution

The bug is, that on unresolved VPC lookups CDK uses a dummy VPC which doesn't contain isolated subnets.
So if you filter your subnets for any construct for only isolated subnets, the list of subnets returned is empy. This causes all the errors shown, as in my example of the EKS construct.
CDK returns this dummy VPC when either the cdk.context.json is not available or the lookup is not possible (e.g. you have no api access to your AWS account).

The solution for CDK developers should be to also add isolated subnets to the dummy VPC here (link for CDK v2.15.0):
https://github.com/aws/aws-cdk/blob/v2.15.0/packages/%40aws-cdk/aws-ec2/lib/vpc.ts#L2172

Please forgive me, that I didn't provide a merge request myself. I mainly use CDK in Python and never got into JS/TS.

Workaround (Python)

With the solution in mind we created a workaround, we created our custom VPC singleton whenever we got the dummy VPC. Our custom VPC includes isolated subnets. Because this dummy vpc is only used on unresolved VPC lookups, this VPC never gets into the real deployed stack, where the actual lookup will succeed.

    @property
    def imported_base_vpc(self) -> aws_ec2.IVpc:
        if self._imported_base_vpc is None:
            self._imported_base_vpc = aws_ec2.Vpc.from_lookup(
                self,
                "imported-base-vpc",
                vpc_id=dbv1_ssm.StringParameter.value_from_lookup(
                    self, (f"/{_account_application_name}/{dbv1_core.Stack.of(self).dbv1_environment}" f"/{_account_info_subsystem}/{_string_vpc_id}")
                ),
            )

        # AWS CDK has a severe bug. It introduced a new SubnetType.PRIVATE_ISOLATED, which is not included into the
        # temporary VPC cdk creates before doing lookups into the account, or when lookups not possible. Because we
        # use PRIVATE_ISOLATED subnets in our CDK constructs, we have to check for the presence/absence of this subnet
        # type and create a custom subnet, with PRIVATE_ISOLATED networks included.
        # see https://github.com/aws/aws-cdk/issues/19122
        if len(self._imported_base_vpc.isolated_subnets) == 0:
            singleton_id = "vpc_on_unresolved_lookup"
            stack = dbv1_core.Stack.of(self)
            retVal = stack.node.try_find_child(singleton_id)
            if not retVal:
                retVal = aws_ec2.Vpc(
                    stack,
                    singleton_id,
                    subnet_configuration=[
                        aws_ec2.SubnetConfiguration(name="public", subnet_type=aws_ec2.SubnetType.PUBLIC, cidr_mask=24),
                        aws_ec2.SubnetConfiguration(name="private_with_nat", subnet_type=aws_ec2.SubnetType.PRIVATE_WITH_NAT, cidr_mask=24),
                        aws_ec2.SubnetConfiguration(name="private_isolated", subnet_type=aws_ec2.SubnetType.PRIVATE_ISOLATED, cidr_mask=24),
                    ],
                )
            return cast(aws_ec2.Vpc, retVal)
        return self._imported_base_vpc

Metadata

Metadata

Assignees

No one assigned

    Labels

    @aws-cdk/aws-eksRelated to Amazon Elastic Kubernetes ServicebugThis issue is a bug.needs-triageThis issue or PR still needs to be triaged.p1

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions