fix: hotswap is not discoverable via telemetry#1230

Merged
aws-cdk-automation merged 11 commits into main from conroy/htoswpa on Mar 19, 2026
Conversation

@kaizencc
Contributor

Adds a new HOTSWAP telemetry event type that is emitted whenever a hotswap deployment is attempted. The event records the success/failure state and the resource change counts. The motivation is to track how efficient hotswap is in practice.

For example,

{
  "event": {
    "state": "FAILED",
    "eventType": "HOTSWAP",
    "duration": 456,
    "error": {
      "name": "UnknownError"
    },
    "counters": {
      "hotswapped": 0,
      "hotswappableChanges": 1,
      "nonHotswappableChanges": 0
    }
  }
}
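
The shape of this payload can be written down as TypeScript types. This is only a sketch inferred from the single example above: the type names (HotswapEventState, HotswapCounters, HotswapTelemetryPayload) and the helper didHotswap are hypothetical, the unit of duration is not stated in this PR, and the real schema may carry more fields.

```typescript
// Hypothetical type names. Only the field names and sample values come from
// the example payload in this PR description.
type HotswapEventState = 'SUCCEEDED' | 'FAILED';

interface HotswapCounters {
  hotswapped: number;
  hotswappableChanges: number;
  nonHotswappableChanges: number;
}

interface HotswapTelemetryPayload {
  event: {
    state: HotswapEventState;
    eventType: 'HOTSWAP';
    duration: number; // unit is an open assumption; the PR does not state it
    error?: { name: string }; // only present when state is FAILED (assumed)
    counters: HotswapCounters;
  };
}

// The example payload from the description, now type-checked.
const sample: HotswapTelemetryPayload = {
  event: {
    state: 'FAILED',
    eventType: 'HOTSWAP',
    duration: 456,
    error: { name: 'UnknownError' },
    counters: { hotswapped: 0, hotswappableChanges: 1, nonHotswappableChanges: 0 },
  },
};

// Hypothetical helper: whether a hotswap actually happened is read from the
// counter, not from the state.
function didHotswap(payload: HotswapTelemetryPayload): boolean {
  return payload.event.counters.hotswapped > 0;
}
```

With the sample above, didHotswap(sample) is false: one change was hotswappable, but nothing was hotswapped.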

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

@aws-cdk-automation aws-cdk-automation requested a review from a team March 17, 2026 20:44
@kaizencc kaizencc changed the title from "chore: hotswap event emitted to telemetry" to "fix: hotswap is not discoverable via telemetry" Mar 17, 2026
@github-actions github-actions bot added the p2 label Mar 17, 2026
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@codecov-commenter
codecov-commenter commented Mar 17, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.97%. Comparing base (48e9b5d) to head (2903be5).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1230      +/-   ##
==========================================
+ Coverage   87.94%   87.97%   +0.02%     
==========================================
  Files          74       74              
  Lines       10339    10363      +24     
  Branches     1377     1384       +7     
==========================================
+ Hits         9093     9117      +24     
  Misses       1220     1220              
  Partials       26       26              
Flag Coverage Δ
suite.unit 87.97% <100.00%> (+0.02%) ⬆️

}
}

function hotswapToEventResult(result: HotswapResult): TelemetryEvent {
Contributor Author

Calling this out in case you disagree:

A hotswap event that results in hotswapped: false (that is, it either falls back to a regular deployment or does nothing) will still report state: SUCCEEDED. My opinion is that state should only reflect whether an error was reported. We will tell whether a hotswap actually happened from the hotswapped counter.

This sounds reasonable to me, since the deployment technically did not fail.
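
To make the agreed mapping concrete, here is a minimal sketch of a function in the spirit of hotswapToEventResult. It is not the code from this PR: the Sketch* types and sketchHotswapToEventResult are hypothetical, the error field on the result is assumed, and the way the hotswapped counter is derived (all hotswappable changes when the hotswap succeeded, otherwise 0) is a guess.

```typescript
// Hypothetical stand-ins for HotswapResult and the telemetry event result.
interface SketchHotswapResult {
  hotswapped: boolean;
  hotswappableChanges: unknown[];
  nonHotswappableChanges: unknown[];
  error?: Error; // assumed: set only when the hotswap attempt hit an error
}

interface SketchEventResult {
  state: 'SUCCEEDED' | 'FAILED';
  error?: { name: string };
  counters: {
    hotswapped: number;
    hotswappableChanges: number;
    nonHotswappableChanges: number;
  };
}

function sketchHotswapToEventResult(result: SketchHotswapResult): SketchEventResult {
  return {
    // state only reflects whether an error was reported, so a fallback
    // (hotswapped: false, no error) still maps to SUCCEEDED.
    state: result.error ? 'FAILED' : 'SUCCEEDED',
    ...(result.error ? { error: { name: result.error.name } } : {}),
    counters: {
      // Assumption: count every hotswappable change once the hotswap succeeded.
      hotswapped: result.hotswapped ? result.hotswappableChanges.length : 0,
      hotswappableChanges: result.hotswappableChanges.length,
      nonHotswappableChanges: result.nonHotswappableChanges.length,
    },
  };
}
```

Under these assumptions, a fallback reports SUCCEEDED with a hotswapped counter of 0, which is exactly the case called out above.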

@ShadowCat567 ShadowCat567 left a comment

I have a question about hotswap event registration to make sure I am understanding how it works.

  stack,
  mode: hotswapMode,
- hotswapped: true,
+ hotswapped: !error,

This is a bit confusing to me.
From what I understand, hotswapped: true means that we successfully executed the hotswap. hotswapped: false with error defined means we tried to hotswap but ran into some SDK issue, so the hotswap deployment failed. hotswapped: false with error: undefined means we did some fallback behavior.
So does that mean hotswapped is set to false by default? And the only way it can get set to true is if it passes through here?

Contributor Author

When nonHotswappableChanges are detected, hotswap does one of two things: 1) fall back to a full CFN deployment, or 2) ignore the nonHotswappableChanges and hotswap what can be hotswapped.

Behavior 1) happens before this line:

if (hotswapMode === 'fall-back') {
  if (nonHotswappableChanges.length > 0) {
    return {
      stack,
      mode: hotswapMode,
      hotswapped: false,
      hotswappableChanges,
      nonHotswappableChanges,
    };
  }
}

So this line is behavior 2). Previously, if an SDK error was encountered, we did not catch it, and the span never ended. This matters now because the end command is what sends telemetry. So I have decided that in this scenario we report hotswapped: false, because although we attempted a hotswap, it was unsuccessful. If there is no error, we return hotswapped: true like before.

All that is to say, I am preserving the old behavior, and your explanation of what is happening is also right.

So does that mean hotswapped is set to false by default? And the only way it can get set to true is if it passes through here?

Yes. The only way we set hotswapped: true is if applyAllHotswapOperations is invoked and does not return an error.
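
The control flow described in this reply can be sketched as follows. This is a simplified, synchronous illustration rather than the PR's code: sketchTryHotswap, SketchResult, and the applyAll callback are hypothetical stand-ins (applyAll plays the role of applyAllHotswapOperations, which is presumably asynchronous in the real code), and 'hotswap-only' is an assumed name for the non-fallback mode, since only 'fall-back' appears in this thread.

```typescript
// Only 'fall-back' appears in the thread; 'hotswap-only' is an assumed name.
type SketchMode = 'fall-back' | 'hotswap-only';

interface SketchResult {
  mode: SketchMode;
  hotswapped: boolean;
  hotswappableChanges: string[];
  nonHotswappableChanges: string[];
  error?: unknown;
}

function sketchTryHotswap(
  mode: SketchMode,
  hotswappableChanges: string[],
  nonHotswappableChanges: string[],
  applyAll: (changes: string[]) => void, // hypothetical stand-in
): SketchResult {
  // Behavior 1): in fall-back mode, any non-hotswappable change means the
  // hotswap is not attempted at all. No error is recorded.
  if (mode === 'fall-back' && nonHotswappableChanges.length > 0) {
    return { mode, hotswapped: false, hotswappableChanges, nonHotswappableChanges };
  }

  // Behavior 2): attempt the hotswap and capture any error instead of letting
  // it escape, so the caller can still finish the span and emit telemetry.
  let error: unknown;
  try {
    applyAll(hotswappableChanges);
  } catch (e) {
    error = e;
  }

  // hotswapped is true only if the operations ran without an error.
  return { mode, hotswapped: !error, hotswappableChanges, nonHotswappableChanges, error };
}
```

This yields the three outcomes discussed above: hotswapped: true with no error, hotswapped: false with an error (attempted but failed), and hotswapped: false with no error (fell back).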
