🚀 Dogfood this PR with:
curl -fsSL https://raw.githubusercontent.com/dotnet/aspire/main/eng/scripts/get-aspire-cli-pr.sh | bash -s -- 14928
Or
iex "& { $(irm https://raw.githubusercontent.com/dotnet/aspire/main/eng/scripts/get-aspire-cli-pr.ps1) } 14928"
Pull request overview
Fixes flakiness in OpenAIFunctionalTests.DependentResourceWaitsForOpenAIModelResourceWithHealthCheckToBeHealthy by preventing the model resource from inheriting the parent OpenAI resource’s default health check that triggers external HTTP calls during test execution.
Changes:
- Refactors the test to keep a handle to the parent OpenAI resource builder.
- Removes the parent OpenAI resource’s default "resource_check" health check annotation before building the app to avoid external calls to status.openai.com.
// Remove the default status page health check from the parent OpenAI resource
// to avoid external HTTP calls to status.openai.com during tests.
var statusPageHealthCheck = Assert.Single(openai.Resource.Annotations, x => x is HealthCheckAnnotation hca && hca.Key == "resource_check");
openai.Resource.Annotations.Remove(statusPageHealthCheck);
How about removing everything other than blocking_check? That's the only one this test cares about.
What about DependentResourceWaitsForOpenAIResourceWithHealthCheckToBeHealthy?
@copilot respond to the feedback, and handle that.
> How about removing everything other than blocking_check? That's the only one this test cares about.
Is this statement even true?
If the idea of the tests is to run OAI health checks, and the OAI health checks then ping the OAI servers, then these tests are inherently flaky, depending on network access to OAI and OAI being up.
…HTTP calls

The OpenAI model resource inherits the parent resource's health check via TryGetAnnotationsIncludingAncestorsOfType. The parent's resource_check calls https://status.openai.com/api/v2/status.json, which can time out in CI, preventing the model from reaching healthy status.

Remove the parent's status page health check in the test, matching the pattern already used by the non-quarantined sibling test.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
826e6dc to 7a8a73c
🎬 CLI E2E Test Recordings
The following terminal recordings are available for commit
📹 Recordings uploaded automatically from CI run #22652814908
Flaky Test Fix
Test
Aspire.Hosting.OpenAI.Tests.OpenAIFunctionalTests.DependentResourceWaitsForOpenAIModelResourceWithHealthCheckToBeHealthy
Root Cause
The OpenAIModelResource implements IResourceWithParent<OpenAIResource>, so ResourceHealthCheckService discovers health checks from both the model and its parent via TryGetAnnotationsIncludingAncestorsOfType<HealthCheckAnnotation>(). The parent's resource_check health check makes an external HTTP call to https://status.openai.com/api/v2/status.json, which can time out or fail in CI, preventing the model resource from ever reaching healthy status.
Fix
Remove the parent's resource_check health check annotation in the test before building the app, matching the pattern already used by the non-quarantined sibling test (DependentResourceWaitsForOpenAIResourceWithHealthCheckToBeHealthy).
Verification
Local runs:
Verification Rationale
High confidence: the root cause is a clear pattern match (an external HTTP call in a health check inherited by the child resource). The sibling test already demonstrates the fix pattern. CI verification confirms 50/50 passes on Linux, the most affected OS (14% failure rate pre-fix).
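As a rough sanity check on the 50/50 result, suppose each pre-fix run failed independently with probability 0.14. Both the independence and the per-run reading of the 14% figure are assumptions; the source only reports the aggregate rate. Under those assumptions, the chance of seeing 50 consecutive passes without the fix would be approximately:

$$
(1 - 0.14)^{50} = 0.86^{50} \approx 5.3 \times 10^{-4}
$$

That is about 0.05%, so a clean 50-run streak would be unlikely if the fix had no effect.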
Notes
[QuarantinedTest] attribute kept; unquarantining happens separately after 21 days of zero failures.
This fix was generated using the fix-flaky-test skill.
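The inheritance described under Root Cause can be illustrated with a simplified, self-contained model. The types below (HealthCheck, Resource, HealthCheckDiscovery) and the resource names "openai" and "chat" are hypothetical stand-ins, not the Aspire types, and the ancestor walk only approximates the behavior this PR attributes to TryGetAnnotationsIncludingAncestorsOfType. It is a sketch of why the child picks up the parent's check and why removing the parent's annotation stops that.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a health check annotation; NOT the Aspire type.
public sealed record HealthCheck(string Key);

// Hypothetical stand-in for a resource that may have a parent resource.
public sealed class Resource
{
    public Resource(string name, Resource? parent = null)
    {
        Name = name;
        Parent = parent;
    }

    public string Name { get; }
    public Resource? Parent { get; }
    public List<HealthCheck> Annotations { get; } = new();
}

public static class HealthCheckDiscovery
{
    // Collects health check keys from the resource itself, then from each
    // ancestor in turn, so a child "inherits" its parent's checks.
    public static IReadOnlyList<string> KeysIncludingAncestors(Resource resource)
    {
        var keys = new List<string>();
        for (Resource? current = resource; current is not null; current = current.Parent)
        {
            keys.AddRange(current.Annotations.Select(a => a.Key));
        }
        return keys;
    }
}

public static class Demo
{
    public static void Main()
    {
        var openai = new Resource("openai");
        openai.Annotations.Add(new HealthCheck("resource_check"));

        var model = new Resource("chat", openai);
        model.Annotations.Add(new HealthCheck("blocking_check"));

        // Before the fix: the model also sees the parent's external check.
        Console.WriteLine(string.Join(", ", HealthCheckDiscovery.KeysIncludingAncestors(model)));
        // prints "blocking_check, resource_check"

        // The fix in this sketch: drop the parent's check before "building".
        openai.Annotations.RemoveAll(a => a.Key == "resource_check");

        Console.WriteLine(string.Join(", ", HealthCheckDiscovery.KeysIncludingAncestors(model)));
        // prints "blocking_check"
    }
}
```

In this model, only the check that the test controls remains on the child after the removal, which is the property the fix relies on.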