Describe the feature
It appears to be currently impossible to use the provisioned concurrency utilization predefined metric for a SageMaker serverless endpoint. I know how to do this for a Lambda function and have even tried to use aws_cdk.aws_applicationautoscaling.PredefinedMetric.LAMBDA_PROVISIONED_CONCURRENCY_UTILIZATION, but I get the following error:
Resource handler returned message: "Scalable dimension sagemaker:variant:DesiredProvisionedConcurrency only supports the following predefined metric types: SageMakerVariantProvisionedConcurrencyUtilization (Service: ApplicationAutoScaling, Status Code: 400, Request ID: eec66ae6-1f8b-42b6-87b6-7ae4b08aeaf9)"
Use Case
I would like to use autoscaling along with provisioned concurrency for serverless SageMaker endpoints.
Proposed Solution
I think an enum value such as aws_cdk.aws_applicationautoscaling.PredefinedMetric.SAGEMAKER_VARIANT_PROVISIONED_CONCURRENCY_UTILIZATION could be added to PredefinedMetric.
Full code example:
# Create SageMaker endpoint
self.endpoint = sagemaker.CfnEndpoint(
    self,
    endpoint_name,
    endpoint_name=endpoint_name,
    endpoint_config_name=self.endpoint_configuration.endpoint_config_name,
)
self.endpoint_arn = self.endpoint.ref
self.endpoint_name = self.endpoint.endpoint_name

# Enable autoscaling -- TEST
target = appscaling.ScalableTarget(
    self,
    f"{construct_id}-scalable_target",
    service_namespace=appscaling.ServiceNamespace.SAGEMAKER,
    max_capacity=2,
    min_capacity=1,
    resource_id=f"endpoint/{endpoint_name}/variant/{model_name}",
    scalable_dimension="sagemaker:variant:DesiredProvisionedConcurrency",
)
target.scale_to_track_metric(
    "SageMakerVariantProvisionedConcurrencyUtilization",
    target_value=0.8,
    # Proposed enum value; it does not exist in CDK 2.127.0
    predefined_metric=PredefinedMetric.SAGEMAKER_VARIANT_PROVISIONED_CONCURRENCY_UTILIZATION,
)
target.node.add_dependency(self.endpoint)
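For reference, here is a rough sketch of the CloudFormation properties I would expect the resulting scaling policy to carry. The metric type string comes from the error message above. The helper name `build_scaling_policy_properties` and the default policy name are hypothetical, and the property names are an assumption based on the AWS::ApplicationAutoScaling::ScalingPolicy resource type. I have not verified this against a deployment.

```python
import json

# Metric type string taken verbatim from the error message above.
SAGEMAKER_PC_UTILIZATION = "SageMakerVariantProvisionedConcurrencyUtilization"


def build_scaling_policy_properties(
    endpoint_name,
    variant_name,
    target_value,
    policy_name="pc-utilization-tracking",  # hypothetical name
):
    """Return the assumed Properties of a target-tracking scaling policy
    for a SageMaker serverless variant, as a plain dict."""
    return {
        "PolicyName": policy_name,
        "PolicyType": "TargetTrackingScaling",
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredProvisionedConcurrency",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_value,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": SAGEMAKER_PC_UTILIZATION,
            },
        },
    }


if __name__ == "__main__":
    # Placeholder endpoint and variant names, for illustration only.
    props = build_scaling_policy_properties("my-endpoint", "my-variant", 0.8)
    print(json.dumps(props, indent=2))
```

These are presumably the values a low-level construct such as CfnScalingPolicy would need in the meantime, with the metric type passed as a plain string instead of a PredefinedMetric enum member.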
Other Information
No response
Acknowledgements
CDK version used
2.127.0
Environment details (OS name and version, etc.)
Darwin M-AI813838 23.3.0 Darwin Kernel Version 23.3.0: Wed Dec 20 21:30:44 PST 2023; root:xnu-10002.81.5~7/RELEASE_ARM64_T6000 arm64