(aws-batch): (Compute environments cannot be created with launch templates specifying network interface) #21577

@tcutts

Describe the bug

Many HPC applications require low latency, so it's desirable to use launch templates to configure EC2 instances with Elastic Fabric Adapters. Creating a compute environment with such a launch template currently fails at deployment time.

Expected Behavior

It should be possible to configure a Compute Environment with no security groups, relying instead on the network interfaces defined in the Launch Template.

Current Behavior

The L2 construct always sets a SecurityGroupIds property on the compute environment, so the deployment fails with:

Failed resources:
batch-stack | 09:35:08 | CREATE_FAILED        | AWS::Batch::ComputeEnvironment        | EFABatch (EFABatchXXXXXXX) Resource handler returned message: "Error executing request, Exception : Either compute environment Security Groups or Network Interfaces in Launch template are exclusively allowed, RequestId: nnnnnnn-nnnn-nnnn-nnnn-nnnnnnnnnnnn (Service: Batch, Status Code: 400, Request ID: nnnnnnn-nnnn-nnnn-nnnn-nnnnnnnnnnnn)" (RequestToken: nnnnnnn-nnnn-nnnn-nnnn-nnnnnnnnnnnn, HandlerErrorCode: InvalidRequest)
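
The rule implied by this error message can be sketched as a small check. This is only an illustration of the mutual exclusivity the message describes, not the Batch service's actual validation: the interfaces and the function name `checkSecurityGroupSource` are hypothetical, and the sketch only covers the case where both sources are set.

```typescript
// Hypothetical, simplified shapes for illustration; these are NOT CDK or Batch API types.
interface NetworkInterfaceSpec {
  deviceIndex: number;
  interfaceType?: string;
}

interface ComputeResourcesSketch {
  // Security groups set directly on the compute environment.
  securityGroupIds?: string[];
  // Network interfaces declared in the referenced launch template.
  launchTemplateNetworkInterfaces?: NetworkInterfaceSpec[];
}

// Returns an error message when both sources are set, otherwise undefined.
function checkSecurityGroupSource(cr: ComputeResourcesSketch): string | undefined {
  const hasSecurityGroups = (cr.securityGroupIds ?? []).length > 0;
  const hasNetworkInterfaces = (cr.launchTemplateNetworkInterfaces ?? []).length > 0;
  if (hasSecurityGroups && hasNetworkInterfaces) {
    return 'security groups and launch template network interfaces are mutually exclusive';
  }
  return undefined;
}
```

Because the L2 construct always emits SecurityGroupIds, any launch template that declares network interfaces ends up in the rejected case.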

Reproduction Steps

The following integration test fails, demonstrating the problem:

import * as ec2 from '@aws-cdk/aws-ec2';
import * as cdk from '@aws-cdk/core';
import * as integ from '@aws-cdk/integ-tests';
import * as batch from '../lib/';

export const app = new cdk.App();

const stack = new cdk.Stack(app, 'batch-stack');

const vpc = new ec2.Vpc(stack, 'vpc');

// While this test specifies EFA, the same behavior occurs with
// interfaceType: 'interface' as well
const launchTemplateEFA = new ec2.CfnLaunchTemplate(stack, 'ec2-launch-template-efa', {
  launchTemplateData: {
    networkInterfaces: [{
      deviceIndex: 0,
      subnetId: vpc.privateSubnets[0].subnetId,
      interfaceType: 'efa',
    }],
  },
});

new batch.ComputeEnvironment(stack, 'EFABatch', {
  managed: true,
  computeResources: {
    type: batch.ComputeResourceType.ON_DEMAND,
    instanceTypes: [new ec2.InstanceType('c5n')],
    vpc,
    launchTemplate: {
      launchTemplateName: launchTemplateEFA.launchTemplateName as string,
    },
  },
});

new integ.IntegTest(app, 'BatchWithEFATest', {
  testCases: [stack],
});

app.synth();

Possible Solution

The connected pull request proposes a solution: it lets the user explicitly exclude security groups from their ComputeEnvironment, so that they can set the security groups in their LaunchTemplate instead. It's hard to add a validation check for this, because the user might not have defined the launch template within the stack at all, in which case its contents cannot be checked before deployment.
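
A minimal sketch of how such an opt-out could behave, under stated assumptions: the property name `omitSecurityGroups`, the interface, and the function `renderSecurityGroupIds` are hypothetical and are not taken from the pull request. The fallback to a default group is also an assumption, standing in for the "always creates SecurityGroupIds" behavior described above.

```typescript
// Hypothetical props for illustration; `omitSecurityGroups` is an invented name,
// not the actual property introduced by the pull request.
interface ComputeResourcesPropsSketch {
  securityGroupIds?: string[];
  omitSecurityGroups?: boolean;
}

// Returns the value to emit for SecurityGroupIds,
// or undefined to leave the property out of the template entirely.
function renderSecurityGroupIds(
  props: ComputeResourcesPropsSketch,
  defaultSecurityGroupId: string,
): string[] | undefined {
  const explicit = props.securityGroupIds ?? [];
  if (props.omitSecurityGroups) {
    // The two settings contradict each other, so fail early instead of at deploy time.
    if (explicit.length > 0) {
      throw new Error('securityGroupIds cannot be combined with omitSecurityGroups');
    }
    return undefined;
  }
  // Without the opt-out, keep the existing behavior: always emit something.
  return explicit.length > 0 ? explicit : [defaultSecurityGroupId];
}
```

An explicit flag sidesteps the validation problem: the construct does not need to inspect the launch template, it only needs the user to state that the template carries the security groups.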

Additional Information/Context

Interestingly, Terraform's AWS provider has exactly the same problem, also recently reported: hashicorp/terraform-provider-aws#25801

CDK CLI Version

2.37.1

Framework Version

No response

Node.js Version

14.20

OS

macOS 12.5

Language

TypeScript

Language Version

No response

Other information

No response

Labels

@aws-cdk/aws-batch: Related to AWS Batch
bug: This issue is a bug.
effort/small: Small work item – less than a day of effort
in-progress: This issue is being actively worked on.
p2
