Description
Describe the bug
ECS hotswap deployments report success immediately, instead of waiting for the deployment to succeed (or fail).
Example showing complete deploy time of 16s:
> pnpm exec cdk deploy --hotswap --exclusively my-cool-stack
✨ Synthesis time: 4.43s
⚠️ The --hotswap and --hotswap-fallback flags deliberately introduce CloudFormation drift to speed up deployments
⚠️ They should only be used for development - never use them for your production Stacks!
my-cool-stack: start: Building 763ed553d17755524a692452e0dbdc4aac573b775b6003699d978e3a3c5d9297:current_account-current_region
my-cool-stack: success: Built 763ed553d17755524a692452e0dbdc4aac573b775b6003699d978e3a3c5d9297:current_account-current_region
my-cool-stack: start: Publishing 763ed553d17755524a692452e0dbdc4aac573b775b6003699d978e3a3c5d9297:current_account-current_region
my-cool-stack: success: Published 763ed553d17755524a692452e0dbdc4aac573b775b6003699d978e3a3c5d9297:current_account-current_region
my-cool-stack: deploying... [1/1]
✨ hotswapping resources:
✨ ECS Task Definition 'my-cool-stack-api'
✨ ECS Service 'my-cool-stack-backendServiceC9D5DD77-jJXtgE5oL9az'
✨ ECS Task Definition 'my-cool-stack-frontend'
✨ ECS Service 'my-cool-stack-frontendService12C63704-yOwzQjJgpvjX'
✨ ECS Task Definition 'my-cool-stack-frontend' hotswapped!
✨ ECS Service 'my-cool-stack-frontendService12C63704-yOwzQjJgpvjX' hotswapped!
✨ ECS Task Definition 'my-cool-stack-api' hotswapped!
✨ ECS Service 'my-cool-stack-backendServiceC9D5DD77-jJXtgE5oL9az' hotswapped!
✅ my-cool-stack
✨ Deployment time: 12.54s
Stack ARN:
xxx
✨ Total time: 16.96s
This behaviour appears to have been the same since hotswap was first implemented, so existing users of the feature may assume it is working as intended.
Expected Behavior
I expect the CDK hotswap deployment to monitor the state of the triggered deployment via the DescribeServices API and confirm that it completes successfully before continuing.
Current Behavior
Currently the CDK pushes the ECS hotswap deployment and then immediately reports it as a success and continues.
The CDK does set up a custom waiter to await the successful deployment, but its success acceptor is configured with the expression:
length(services[].deployments[? status == 'PRIMARY' && runningCount < desiredCount][]) == `0`
This doesn't wait correctly because the new PRIMARY deployment is first created in an intermediate state with runningCount: 0 and desiredCount: 0. It is only populated with the correct desired and pending counts once the scheduler gets to work. In that initial zero state, runningCount < desiredCount is false, so no deployment matches the filter, the length is 0, and the waiter treats this as success and continues.
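To make the failure mode concrete, here is a minimal sketch that restates the current acceptor as a plain TypeScript predicate. The `Deployment` interface and the `currentAcceptorMatches` helper are hypothetical names used only for illustration; they are not taken from the CDK source, and the sample data is reduced to the fields the expression reads.

```typescript
// Only the DescribeServices deployment fields that the expression inspects.
interface Deployment {
  status: string;
  desiredCount: number;
  runningCount: number;
}

// Hypothetical helper mirroring the current acceptor:
// length(services[].deployments[? status == 'PRIMARY' && runningCount < desiredCount][]) == `0`
function currentAcceptorMatches(deployments: Deployment[]): boolean {
  const lagging = deployments.filter(
    (d) => d.status === "PRIMARY" && d.runningCount < d.desiredCount,
  );
  return lagging.length === 0;
}

// Freshly created PRIMARY deployment, before the scheduler has set any counts.
const zeroState: Deployment[] = [
  { status: "PRIMARY", desiredCount: 0, runningCount: 0 },
  { status: "ACTIVE", desiredCount: 1, runningCount: 1 },
];

// The same deployment once the desired count has been populated.
const scheduled: Deployment[] = [
  { status: "PRIMARY", desiredCount: 1, runningCount: 0 },
  { status: "ACTIVE", desiredCount: 1, runningCount: 1 },
];

console.log(currentAcceptorMatches(zeroState)); // true: the waiter would stop here
console.log(currentAcceptorMatches(scheduled)); // false: this state would have kept it waiting
```

If the waiter's first poll lands on the zero state, it never gets to see the later state that would have made it wait.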
Reproduction Steps
Perform any ECS hotswap deployment, for example with the cdk deploy --hotswap command shown above.
Possible Solution
The following waiter acceptor expression should reflect the DescribeServices state more accurately. I can raise a PR if we agree this is an issue that needs to be fixed.
length(services[].deployments[? status == 'PRIMARY' && rolloutState == 'COMPLETED'][]) == `1`
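As a rough check of the proposed expression, the sketch below restates it as a TypeScript predicate. The `RolloutDeployment` interface and the `proposedAcceptorMatches` helper are hypothetical names for illustration, and the sketch assumes a single service is queried, so the flattened list contains at most one PRIMARY deployment.

```typescript
// Only the DescribeServices deployment fields that the proposed expression inspects.
interface RolloutDeployment {
  status: string;
  rolloutState?: string;
}

// Hypothetical helper mirroring the proposed acceptor:
// length(services[].deployments[? status == 'PRIMARY' && rolloutState == 'COMPLETED'][]) == `1`
function proposedAcceptorMatches(deployments: RolloutDeployment[]): boolean {
  const completed = deployments.filter(
    (d) => d.status === "PRIMARY" && d.rolloutState === "COMPLETED",
  );
  return completed.length === 1;
}

// New PRIMARY deployment still rolling out alongside the previous one.
const rollingOut: RolloutDeployment[] = [
  { status: "PRIMARY", rolloutState: "IN_PROGRESS" },
  { status: "ACTIVE", rolloutState: "COMPLETED" },
];

// Final state: only the PRIMARY deployment remains and its rollout has completed.
const finished: RolloutDeployment[] = [
  { status: "PRIMARY", rolloutState: "COMPLETED" },
];

console.log(proposedAcceptorMatches(rollingOut)); // false
console.log(proposedAcceptorMatches(finished)); // true
```

The previous deployment's COMPLETED rollout state does not trigger a match because the filter also requires the PRIMARY status.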
Additional Information/Context
Running this command I observed the following deployment state changes:
watch -n 1 aws ecs describe-services --cluster $cluster --services $service --query 'services[].deployments'
New deployment created in "zero" state
[
[
{
"status": "PRIMARY",
...
"desiredCount": 0,
"pendingCount": 0,
"runningCount": 0,
...
"rolloutState": "IN_PROGRESS",
"rolloutStateReason": "ECS deployment ecs-svc/9717487399336357090 in progress."
},
{
"status": "ACTIVE",
....
"desiredCount": 1,
"pendingCount": 0,
"runningCount": 1,
....
"rolloutState": "COMPLETED",
"rolloutStateReason": "ECS deployment ecs-svc/5831249761506821993 completed."
}
]
]
Deployment gets correct counts
[
[
{
"status": "PRIMARY",
...
"desiredCount": 1,
"pendingCount": 1,
"runningCount": 0,
...
"rolloutState": "IN_PROGRESS",
"rolloutStateReason": "ECS deployment ecs-svc/9717487399336357090 in progress."
},
{
"status": "ACTIVE",
....
"desiredCount": 1,
"pendingCount": 0,
"runningCount": 1,
....
"rolloutState": "COMPLETED",
"rolloutStateReason": "ECS deployment ecs-svc/5831249761506821993 completed."
}
]
]
Deployment launches new task successfully, previous deployment scaled down
[
[
{
"status": "PRIMARY",
...
"desiredCount": 1,
"pendingCount": 0,
"runningCount": 1,
...
"rolloutState": "IN_PROGRESS",
"rolloutStateReason": "ECS deployment ecs-svc/9717487399336357090 in progress."
},
{
"status": "ACTIVE",
....
"desiredCount": 0,
"pendingCount": 0,
"runningCount": 1,
....
"rolloutState": "COMPLETED",
"rolloutStateReason": "ECS deployment ecs-svc/5831249761506821993 completed."
}
]
]
Previous deployment scaled down, moves into DRAINING state
[
[
{
"status": "PRIMARY",
...
"desiredCount": 1,
"pendingCount": 0,
"runningCount": 1,
...
"rolloutState": "IN_PROGRESS",
"rolloutStateReason": "ECS deployment ecs-svc/9717487399336357090 in progress."
},
{
"status": "DRAINING",
....
"desiredCount": 0,
"pendingCount": 0,
"runningCount": 0,
....
"rolloutState": "COMPLETED",
"rolloutStateReason": "ECS deployment ecs-svc/5831249761506821993 completed."
}
]
]
Previous deployment removed
[
[
{
"status": "PRIMARY",
...
"desiredCount": 1,
"pendingCount": 0,
"runningCount": 1,
...
"rolloutState": "IN_PROGRESS",
"rolloutStateReason": "ECS deployment ecs-svc/9717487399336357090 in progress."
}
]
]
New deployment completed
[
[
{
"status": "PRIMARY",
...
"desiredCount": 1,
"pendingCount": 0,
"runningCount": 1,
...
"rolloutState": "COMPLETED",
"rolloutStateReason": "ECS deployment ecs-svc/9717487399336357090 completed."
}
]
]
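The six observed states above can be replayed against both expressions to see where each would let the waiter stop. This is an illustrative sketch only: the type and function names are hypothetical, and each snapshot keeps just the fields the two expressions read.

```typescript
interface ObservedDeployment {
  status: string;
  desiredCount: number;
  runningCount: number;
  rolloutState: string;
}

type Acceptor = (deployments: ObservedDeployment[]) => boolean;

// Current expression: no PRIMARY deployment with runningCount < desiredCount.
const countBased: Acceptor = (ds) =>
  ds.filter((d) => d.status === "PRIMARY" && d.runningCount < d.desiredCount).length === 0;

// Proposed expression: exactly one PRIMARY deployment with rolloutState COMPLETED.
const rolloutBased: Acceptor = (ds) =>
  ds.filter((d) => d.status === "PRIMARY" && d.rolloutState === "COMPLETED").length === 1;

// The observed snapshots, in order, reduced to the relevant fields.
const observed: ObservedDeployment[][] = [
  [
    { status: "PRIMARY", desiredCount: 0, runningCount: 0, rolloutState: "IN_PROGRESS" },
    { status: "ACTIVE", desiredCount: 1, runningCount: 1, rolloutState: "COMPLETED" },
  ],
  [
    { status: "PRIMARY", desiredCount: 1, runningCount: 0, rolloutState: "IN_PROGRESS" },
    { status: "ACTIVE", desiredCount: 1, runningCount: 1, rolloutState: "COMPLETED" },
  ],
  [
    { status: "PRIMARY", desiredCount: 1, runningCount: 1, rolloutState: "IN_PROGRESS" },
    { status: "ACTIVE", desiredCount: 0, runningCount: 1, rolloutState: "COMPLETED" },
  ],
  [
    { status: "PRIMARY", desiredCount: 1, runningCount: 1, rolloutState: "IN_PROGRESS" },
    { status: "DRAINING", desiredCount: 0, runningCount: 0, rolloutState: "COMPLETED" },
  ],
  [{ status: "PRIMARY", desiredCount: 1, runningCount: 1, rolloutState: "IN_PROGRESS" }],
  [{ status: "PRIMARY", desiredCount: 1, runningCount: 1, rolloutState: "COMPLETED" }],
];

// Index of the first snapshot an acceptor would treat as success (-1 if none).
function firstMatch(snapshots: ObservedDeployment[][], acceptor: Acceptor): number {
  return snapshots.findIndex((deployments) => acceptor(deployments));
}

console.log(firstMatch(observed, countBased)); // 0: the initial "zero" state
console.log(firstMatch(observed, rolloutBased)); // 5: the final completed state
```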
CDK CLI Version
2.103.0 (build d0d7547)
Framework Version
No response
Node.js Version
18.16.0
OS
macOS
Language
TypeScript
Language Version
4.9.5
Other information
No response