aws-stepfunctions-tasks: Add support for ExecutionRoleArn to EmrAddStep  #27691

@brandondahler


Describe the feature

On October 22, 2022, EMR launched the Runtime Roles feature to allow jobs to execute as a more specific role than the cluster. This additionally opened up the ability to utilize LakeFormation to access data shared to your job's execution role.

This feature added a new, optional parameter named ExecutionRoleArn to the AddJobFlowSteps action. Consequently, the matching Step Functions actions, addStep and addStep.sync, also accept this optional parameter.

Optionally, you can also specify the ExecutionRoleArn parameter while using this API.

I'd like to have EmrAddStep support this new ExecutionRoleArn field so that I can utilize the Runtime Roles feature on clusters which are managed by a StepFunctions state machine.

Use Case

I specifically intend to use this functionality to migrate an existing process to utilize LakeFormation's access delegation instead of having an instance role which has to have full access to the underlying S3 bucket.

Proposed Solution

To implement this, we only need to add a property to the step's props and pass its value through when rendering the task. I see two reasonable options:

Option 1 - Expose an executionRoleArn property as a string

In order to keep the solution as simple as possible and avoid the same issue as #21319, we can simply expose the parameter as an optional string.

  1. Add a new executionRoleArn property to the EmrAddStepProps interface:
    export interface EmrAddStepProps extends sfn.TaskStateBaseProps {
      ...
      readonly executionRoleArn?: string;
      ...
    }
    
  2. Update _renderTask() to emit the optional ExecutionRoleArn field when it is provided:
    protected _renderTask(): any {
      return {
        ...
        Parameters: sfn.FieldUtils.renderObject({
          ...
          ExecutionRoleArn: this.props.executionRoleArn,
          ...
        }),
      };
    }
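
To make Option 1 concrete, here is a minimal, dependency-free sketch of the pass-through. It is not the aws-cdk-lib implementation: AddStepPropsSketch and renderParametersSketch are hypothetical names, the ".$" key handling is only a rough stand-in for what sfn.FieldUtils.renderObject does for JsonPath values, and the ARN is a placeholder.

```typescript
// Hypothetical stand-in for the relevant slice of EmrAddStepProps.
interface AddStepPropsSketch {
  readonly clusterId: string;
  readonly name: string;
  readonly jar: string;
  // Either a literal role ARN or a JsonPath expression such as "$.RoleArn".
  readonly executionRoleArn?: string;
}

// Builds the Parameters object for the addStep integration (simplified).
function renderParametersSketch(props: AddStepPropsSketch): Record<string, unknown> {
  const params: Record<string, unknown> = {
    ClusterId: props.clusterId,
    Step: {
      Name: props.name,
      HadoopJarStep: { Jar: props.jar },
    },
  };
  if (props.executionRoleArn !== undefined) {
    // Assumption: approximate the ".$" suffix used for JsonPath-valued fields.
    const isJsonPath = props.executionRoleArn.charAt(0) === "$";
    params[isJsonPath ? "ExecutionRoleArn.$" : "ExecutionRoleArn"] = props.executionRoleArn;
  }
  return params;
}

// Usage: a literal ARN (placeholder value) and a JsonPath-provided ARN.
const literalRoleParams = renderParametersSketch({
  clusterId: "j-EXAMPLE",
  name: "my-step",
  jar: "command-runner.jar",
  executionRoleArn: "arn:aws:iam::111122223333:role/example-runtime-role",
});
const jsonPathRoleParams = renderParametersSketch({
  clusterId: "j-EXAMPLE",
  name: "my-step",
  jar: "command-runner.jar",
  executionRoleArn: "$.RoleArn",
});
```

Because the property is a plain optional string, existing callers that do not set it would render the same Parameters as before.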
    

Option 2 - Expose executionRole as an IRole and executionRoleArn as a string

In #21319, it appears that we originally exposed only an executionRole as an IRole and only later realized that this doesn't work for JsonPath-provided values. If we want to stay consistent with that pattern, we can do the same here.

  1. Add new executionRole and executionRoleArn properties to the EmrAddStepProps interface:
    export interface EmrAddStepProps extends sfn.TaskStateBaseProps {
      ...
      readonly executionRole?: iam.IRole;
      readonly executionRoleArn?: string;
      ...
    }
    
  2. Validate that at most one of executionRole and executionRoleArn is provided:
    constructor(scope: Construct, id: string, private readonly props: EmrAddStepProps) {
      ...
      if (props.executionRole !== undefined && props.executionRoleArn !== undefined) {
        throw new Error(...);
      }
      ...
    }
    
  3. Update _renderTask() to emit the optional ExecutionRoleArn field when either property is provided:
    protected _renderTask(): any {
      return {
        ...
        Parameters: sfn.FieldUtils.renderObject({
          ...
          ExecutionRoleArn: this.props.executionRoleArn ?? this.props.executionRole?.roleArn,
          ...
        }),
      };
    }
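
The validation and fallback in Option 2 could be sketched as a single helper, shown below without any CDK dependency. RoleLike is a hypothetical stand-in for iam.IRole, resolveExecutionRoleArnSketch is a made-up name, and the error message is only an illustration, not proposed wording.

```typescript
// Hypothetical stand-in for iam.IRole; only roleArn matters here.
interface RoleLike {
  readonly roleArn: string;
}

// Hypothetical slice of the props carrying the two mutually exclusive fields.
interface ExecutionRolePropsSketch {
  readonly executionRole?: RoleLike;
  readonly executionRoleArn?: string;
}

// Returns the ARN to render, or undefined when neither property is set.
function resolveExecutionRoleArnSketch(props: ExecutionRolePropsSketch): string | undefined {
  if (props.executionRole !== undefined && props.executionRoleArn !== undefined) {
    // Illustrative message only.
    throw new Error("Specify at most one of executionRole and executionRoleArn");
  }
  // Prefer the string form (which may be a JsonPath), else fall back to the role's ARN.
  return props.executionRoleArn ?? props.executionRole?.roleArn;
}

// Usage with placeholder values.
const arnFromRole = resolveExecutionRoleArnSketch({
  executionRole: { roleArn: "arn:aws:iam::111122223333:role/example-runtime-role" },
});
const arnFromJsonPath = resolveExecutionRoleArnSketch({ executionRoleArn: "$.RoleArn" });
```

Compared with Option 1, this adds a second code path and a construction-time error, in exchange for letting callers pass a role construct directly.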
    

Other Information

No response

Acknowledgements

  • I may be able to implement this feature request
  • This feature might incur a breaking change

CDK version used

2.95.1

Environment details (OS name and version, etc.)

Mac 13.5
