aws_applicationautoscaling: Enable SageMaker serverless provisioned concurrency utilization predefined metric #29065

@mulhod

Description

Describe the feature

It currently appears to be impossible to use the provisioned concurrency utilization predefined metric with a SageMaker serverless endpoint. I know how to do this for a Lambda function, and I even tried using aws_cdk.aws_applicationautoscaling.PredefinedMetric.LAMBDA_PROVISIONED_CONCURRENCY_UTILIZATION, but I get the following error:

Resource handler returned message: "Scalable dimension sagemaker:variant:DesiredProvisionedConcurrency only supports the following predefined metric types: SageMakerVariantProvisionedConcurrencyUtilization (Service: ApplicationAutoScaling, Status Code: 400, Request ID: eec66ae6-1f8b-42b6-87b6-7ae4b08aeaf9)"

Use Case

I would like to use autoscaling along with provisioned concurrency for serverless SageMaker endpoints.

Proposed Solution

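I think an aws_cdk.aws_applicationautoscaling.PredefinedMetric.SAGEMAKER_VARIANT_PROVISIONED_CONCURRENCY_UTILIZATION enum value could be added, corresponding to the SageMakerVariantProvisionedConcurrencyUtilization metric type named in the error message.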
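
For illustration only, the shape of the proposed addition can be modeled with a stand-in Python enum. The `PredefinedMetricSketch` class is hypothetical and is not the CDK's `PredefinedMetric`. The proposed member's string value is the metric type named in the error message, and the Lambda member's value is an assumption that follows the same naming pattern.

```python
from enum import Enum


class PredefinedMetricSketch(Enum):
    """Hypothetical stand-in for PredefinedMetric; not the CDK class."""

    # Member tried earlier in this issue (string value assumed for illustration).
    LAMBDA_PROVISIONED_CONCURRENCY_UTILIZATION = "LambdaProvisionedConcurrencyUtilization"
    # Proposed member; the value is the metric type quoted in the error message.
    SAGEMAKER_VARIANT_PROVISIONED_CONCURRENCY_UTILIZATION = (
        "SageMakerVariantProvisionedConcurrencyUtilization"
    )


proposed = PredefinedMetricSketch.SAGEMAKER_VARIANT_PROVISIONED_CONCURRENCY_UTILIZATION
print(proposed.value)
```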

Full code example:

        # Create SageMaker endpoint
        self.endpoint = sagemaker.CfnEndpoint(
            self,
            endpoint_name,
            endpoint_name=endpoint_name,
            endpoint_config_name=self.endpoint_configuration.endpoint_config_name,
        )
        self.endpoint_arn = self.endpoint.ref
        self.endpoint_name = self.endpoint.endpoint_name

        # Enable autoscaling -- TEST
        target = appscaling.ScalableTarget(
            self,
            f"{construct_id}-scalable_target",
            service_namespace=appscaling.ServiceNamespace.SAGEMAKER,
            max_capacity=2,
            min_capacity=1,
            resource_id=f"endpoint/{endpoint_name}/variant/{model_name}",
            scalable_dimension="sagemaker:variant:DesiredProvisionedConcurrency",
        )
        target.scale_to_track_metric(
            "SageMakerVariantProvisionedConcurrencyUtilization",
            target_value=0.8,
            # Proposed enum value; it does not exist in CDK 2.127.0.
            predefined_metric=appscaling.PredefinedMetric.SAGEMAKER_VARIANT_PROVISIONED_CONCURRENCY_UTILIZATION,
        )
        target.node.add_dependency(self.endpoint)
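
For reference, the target-tracking configuration that the code above is meant to produce can be sketched without the CDK as a plain dictionary shaped like the parameters of the Application Auto Scaling `PutScalingPolicy` request. This is a minimal sketch, not a tested workaround: the helper name `build_scaling_policy_request`, the policy-name format, and the example endpoint and variant names are all hypothetical. The scalable dimension, resource ID format, and metric type are taken from the error message and code in this issue.

```python
def build_scaling_policy_request(
    endpoint_name: str, variant_name: str, target_value: float
) -> dict:
    """Build a dict shaped like a PutScalingPolicy request (hypothetical helper)."""
    return {
        # Policy name format is an arbitrary choice for this sketch.
        "PolicyName": f"{endpoint_name}-{variant_name}-pc-utilization",
        "ServiceNamespace": "sagemaker",
        # Same resource ID format as the ScalableTarget in this issue.
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredProvisionedConcurrency",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_value,
            "PredefinedMetricSpecification": {
                # The only predefined metric type the error message says is accepted.
                "PredefinedMetricType": "SageMakerVariantProvisionedConcurrencyUtilization",
            },
        },
    }


# Example endpoint and variant names are placeholders.
request = build_scaling_policy_request("my-endpoint", "my-variant", 0.8)
print(request["TargetTrackingScalingPolicyConfiguration"])
```

Assuming boto3 is installed and the scalable target is already registered, a dictionary like this could be passed as `boto3.client("application-autoscaling").put_scaling_policy(**request)`; that call is not exercised here.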

Other Information

No response

Acknowledgements

  • I may be able to implement this feature request
  • This feature might incur a breaking change

CDK version used

2.127.0

Environment details (OS name and version, etc.)

Darwin M-AI813838 23.3.0 Darwin Kernel Version 23.3.0: Wed Dec 20 21:30:44 PST 2023; root:xnu-10002.81.5~7/RELEASE_ARM64_T6000 arm64
