Skip to content

Fix WithHttpCommand flaky tests#13769

Merged
radical merged 6 commits intomainfrom
jamesnk/with-http-command-flaky-tests
Jan 6, 2026
Merged

Fix WithHttpCommand flaky tests#13769
radical merged 6 commits intomainfrom
jamesnk/with-http-command-flaky-tests

Conversation

@JamesNK
Copy link
Member

@JamesNK JamesNK commented Jan 6, 2026

Should fix #9670
Should fix #9725
Should fix #9800
Should fix #9818
Should fix #9789
Should fix #9772
Should fix #8101
Should fix #9811

Various issues:

  • The tests relied on DCP to build and launch projects. However I've found DCP to be a point of flaky failure in tests. I think this is the biggest problem of these tests. PR updates all tests to use a custom resource and a fake HttpHandler. No more building .NET project over and over, or sending HTTP. More reliable and faster.
  • The logic for UpdateState was odd. There was a background task looking at the resource state and then using that to update the command state the next time the resource was updated (maybe it would update immediately, may it would wait for the next update before the state was it. It was a race). The problem is if the resource was never updated again then the command state would never change. Caused one flaky test. The fix is to use the resource status from the context which is much simpler. @DamianEdwards is there a reason why UpdateState was written this way?
  • Resilency retry logic caused the test that returned a 500 status to retry over and over. It took forever. Fixed by disabling resilience retries in tests.

@JamesNK JamesNK requested a review from DamianEdwards January 6, 2026 09:09
@JamesNK JamesNK requested a review from mitchdenny as a code owner January 6, 2026 09:09
Copilot AI review requested due to automatic review settings January 6, 2026 09:09
@JamesNK JamesNK added the area-app-model Issues pertaining to the APIs in Aspire.Hosting, e.g. DistributedApplication label Jan 6, 2026
@github-actions
Copy link
Contributor

github-actions bot commented Jan 6, 2026

🚀 Dogfood this PR with:

⚠️ WARNING: Do not do this without first carefully reviewing the code of this PR to satisfy yourself it is safe.

curl -fsSL https://raw.githubusercontent.com/dotnet/aspire/main/eng/scripts/get-aspire-cli-pr.sh | bash -s -- 13769

Or

  • Run remotely in PowerShell:
iex "& { $(irm https://raw.githubusercontent.com/dotnet/aspire/main/eng/scripts/get-aspire-cli-pr.ps1) } 13769"

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes flaky tests in WithHttpCommandTests by addressing three core issues:

  1. Simplified UpdateState logic: Replaced a complex background task approach with direct access to context.ResourceSnapshot.State, eliminating race conditions
  2. Disabled resilience retries in tests: Prevents long-running retry loops that caused slow test execution (especially for 500 status code tests)
  3. Replaced DCP dependency with custom resources: Uses CustomResource with pre-allocated endpoints and FakeHttpMessageHandler instead of relying on DCP to build/launch projects, improving test reliability and speed

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
src/Aspire.Hosting/ResourceBuilderExtensions.cs Simplified UpdateState logic to directly read resource state from context instead of using a background task watching resource events
tests/Aspire.Hosting.Tests/WithHttpCommandTests.cs Refactored all tests to use CustomResource with FakeHttpMessageHandler instead of real projects, added helper methods for resource/endpoint creation, disabled HTTP resilience retries, and removed quarantine attributes

Copy link
Contributor

Copilot AI commented Jan 6, 2026

@JamesNK I've opened a new pull request, #13771, to work on those changes. Once the pull request is ready, I'll request review from you.

Copilot AI and others added 3 commits January 6, 2026 18:09
Co-authored-by: JamesNK <303201+JamesNK@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
@karolz-ms
Copy link
Member

The tests relied on DCP to build and launch projects. However I've found DCP to be a point of flaky failure in tests. I think this is the biggest problem of these tests. PR updates all tests to use a custom resource and a fake HttpHandler. No more building .NET project over and over, or sending HTTP. More reliable and faster.

I noticed that if I run the same .NET project multiple times, in parallel, from different tests, I get unpredictable errors from the build. I suspect this was what was going on here.

@radical
Copy link
Member

radical commented Jan 6, 2026

I changed all the Fixes .. lines to Should fix .. so the issues are not closed. The issues will get closed separately and the attributes removed.

@radical radical merged commit 1ad2af5 into main Jan 6, 2026
286 checks passed
@radical radical deleted the jamesnk/with-http-command-flaky-tests branch January 6, 2026 19:14
@dotnet-policy-service dotnet-policy-service bot added this to the 13.2 milestone Jan 6, 2026
@DamianEdwards
Copy link
Member

@JamesNK

The fix is to use the resource status from the context which is much simpler. @DamianEdwards is there a reason why UpdateState was written this way?

Yeah no idea, was too long ago.

@JamesNK
Copy link
Member Author

JamesNK commented Jan 7, 2026

Keep an eye on enabling/disabling HTTP commands for the upcoming release. If it isn't what you expect, this PR is the problem!

@github-actions github-actions bot locked and limited conversation to collaborators Feb 6, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.