Description
Describe the bug
Using Python, I am trying to create a Step Functions state machine that runs an AWS SageMaker batch transform job using the .sync version of the API, like this:
from aws_cdk import aws_stepfunctions as sfn, aws_stepfunctions_tasks as sfn_tasks

batch_inference_job = sfn_tasks.SageMakerCreateTransformJob(
    self,
    "BatchInferenceTransformJob",
    integration_pattern=sfn.IntegrationPattern.RUN_JOB,
    transform_job_name=sfn.JsonPath.string_at("$.transform_job_name"),
    model_name="xxx",
    ...
)
state_machine = sfn.StateMachine(
    self,
    "BatchInferencePipeline",
    state_machine_name="xxx",
    definition=batch_inference_job,
)
Note that I am not passing anything for the role parameter when I instantiate the StateMachine. When I try to execute the state machine, I get the following error:
User: arn:aws:sts::xxx:assumed-role/xxx is not authorized to perform: sagemaker:AddTags on resource: arn:aws:sagemaker:eu-west-1:xxx:transform-job/xxx because no identity-based policy allows the sagemaker:AddTags action (Service: AmazonSageMaker; Status Code: 400; Error Code: AccessDeniedException; Request ID: xxx; Proxy: null)
When I take a closer look at what tags step functions is trying to set for the transform job (I am not setting any tags for the job myself), I see some AWS managed tags, which presumably the step functions service appends:
{
    "Key": "MANAGED_BY_AWS",
    "Value": "STARTED_BY_STEP_FUNCTIONS"
}
So it seems to me that the role CDK generates for the state machine should, by default, include a policy that allows the sagemaker:AddTags action. When I tried starting the batch transform job with sfn.IntegrationPattern.REQUEST_RESPONSE, Step Functions didn't try to set any tags, and submitting the job worked as expected.
Expected Behavior
The default role generated by CDK for the Step Functions state machine should have all the permissions needed to start a job when using integration_pattern=sfn.IntegrationPattern.RUN_JOB, including sagemaker:AddTags.
Current Behavior
I got an error when Step Functions tried to create the batch transform job:
User: arn:aws:sts::xxx:assumed-role/xxx is not authorized to perform: sagemaker:AddTags on resource: arn:aws:sagemaker:eu-west-1:xxx:transform-job/xxx because no identity-based policy allows the sagemaker:AddTags action (Service: AmazonSageMaker; Status Code: 400; Error Code: AccessDeniedException; Request ID: xxx; Proxy: null)
Reproduction Steps
from aws_cdk import aws_stepfunctions as sfn, aws_stepfunctions_tasks as sfn_tasks

batch_inference_job = sfn_tasks.SageMakerCreateTransformJob(
    self,
    "BatchInferenceTransformJob",
    integration_pattern=sfn.IntegrationPattern.RUN_JOB,
    transform_job_name=sfn.JsonPath.string_at("$.transform_job_name"),
    model_name="xxx",
    ...
)
state_machine = sfn.StateMachine(
    self,
    "BatchInferencePipeline",
    state_machine_name="xxx",
    definition=batch_inference_job,
)
Possible Solution
Not tested, but it seems that the policies for the role are added in: https://github.com/aws/aws-cdk/blob/main/packages/aws-cdk-lib/aws-stepfunctions-tasks/lib/sagemaker/create-transform-job.ts#L273. So simply adding a policy statement there that allows sagemaker:AddTags might be enough.
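For illustration, here is a minimal sketch of the kind of IAM statement that appears to be missing, built as a plain dictionary so it runs without CDK installed. The helper name add_tags_statement, the placeholder account ID, and the wildcard job name are assumptions made for this example; the resource ARN shape is copied from the error message above. This has not been checked against the CDK source, so treat it as a starting point rather than the actual fix.

```python
import json


def add_tags_statement(region: str, account_id: str, job_name: str = "*") -> dict:
    """Build an IAM statement allowing sagemaker:AddTags on transform jobs.

    Hypothetical helper for illustration only. The ARN layout mirrors the
    resource in the AccessDeniedException:
    arn:aws:sagemaker:<region>:<account>:transform-job/<name>
    """
    resource = f"arn:aws:sagemaker:{region}:{account_id}:transform-job/{job_name}"
    return {
        "Effect": "Allow",
        "Action": ["sagemaker:AddTags"],
        "Resource": [resource],
    }


# Placeholder region and account ID (assumptions, replace with real values).
statement = add_tags_statement("eu-west-1", "123456789012")
policy_document = {"Version": "2012-10-17", "Statement": [statement]}
print(json.dumps(policy_document, indent=2))
```

As a possible interim workaround (also untested here), an equivalent statement could presumably be attached to the generated role from the stack itself, for example with state_machine.add_to_role_policy(iam.PolicyStatement(actions=["sagemaker:AddTags"], resources=[...])), where iam is aws_cdk.aws_iam.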
Additional Information/Context
No response
CDK CLI Version
2.81.0
Framework Version
No response
Node.js Version
v18.16.0
OS
MacOS 12.5
Language
Python
Language Version
3.10.9
Other information
No response