
Fix hang when trying to deploy single compute resource #12797

Merged
davidfowl merged 4 commits into main from copilot/fix-deployment-hang-issue
Nov 7, 2025
Contributor

Copilot AI commented Nov 7, 2025

Fix hang when deploying single Azure resource to App Service

Problem: When deploying a subset of Azure resources using commands like aspire do deploy-ai-agent, the deployment hangs because dependencies between Bicep resources are not expressed in the pipeline.

Root Cause: Azure Bicep resources didn't declare dependencies on other Azure resources referenced in their parameters, causing deployment deadlocks when running subsets of the dependency graph.

Solution:

  • Force evaluation of Bicep template to ensure parameters are expanded
  • Walk parameters to find IAzureResource references using IValueWithReferences
  • Depend on provision infrastructure steps of referenced Azure resources
  • Test the implementation to ensure it works correctly

Key Changes:

  1. AzureBicepResource.cs: Added PipelineConfigurationAnnotation that:

    • Calls GetBicepTemplateString() to materialize parameters
    • Uses ProcessAzureReferences() to recursively find IAzureResource dependencies via IValueWithReferences
    • Sets up provision step dependencies using context.GetSteps()
  2. Test: Added DeployAsync_WithAzureResourceDependencies_DoesNotHang test that:

    • Recreates the issue scenario (AppService + KeyVault + compute resource with secret reference)
    • Uses diagnostics mode to verify dependency graph structure
    • Confirms provision-api-website depends on provision-kv (preventing hangs)

Verification: The diagnostic output snapshot shows correct dependency: provision-api-website now depends on provision-kv (line 101 in snapshot).
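
The reference walk described under Key Changes can be sketched in isolation. This is a minimal illustration, not the code in AzureBicepResource.cs: the interfaces below are local stand-ins that only reuse the names IValueWithReferences and IAzureResource from this PR, and FakeAzureResource, FakeReference, and ReferenceWalker are hypothetical helpers invented for the example.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Local stand-ins that only mirror names used in this PR; NOT the Aspire definitions.
public interface IValueWithReferences { IEnumerable<object> References { get; } }
public interface IAzureResource { string Name { get; } }

// Hypothetical Azure resource, e.g. a Key Vault.
public sealed class FakeAzureResource : IAzureResource
{
    public FakeAzureResource(string name) => Name = name;
    public string Name { get; }
}

// Hypothetical parameter value that points at other objects, e.g. a secret reference.
public sealed class FakeReference : IValueWithReferences
{
    public FakeReference(params object[] references) => References = references;
    public IEnumerable<object> References { get; }
}

public static class ReferenceWalker
{
    // Collects every IAzureResource reachable from the given parameter values.
    public static HashSet<IAzureResource> FindAzureReferences(IEnumerable<object?> parameterValues)
    {
        var found = new HashSet<IAzureResource>();
        var visited = new HashSet<object>();
        foreach (var value in parameterValues)
        {
            Walk(value, found, visited);
        }
        return found;
    }

    private static void Walk(object? value, HashSet<IAzureResource> found, HashSet<object> visited)
    {
        // The visited set stops repeated values and reference cycles from being walked twice.
        if (value is null || !visited.Add(value))
        {
            return;
        }
        if (value is IAzureResource azure)
        {
            found.Add(azure);
        }
        if (value is IValueWithReferences vwr)
        {
            foreach (var reference in vwr.References)
            {
                Walk(reference, found, visited);
            }
        }
    }
}
```

In the PR, each resource found this way is what the provision step is made to depend on; the sketch stops at discovery and does not model the pipeline itself.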

Original prompt

This section details the original issue you should resolve

<issue_title>Hang when trying to deploy a single comptue resource to app service</issue_title>
<issue_description>App

#:package Aspire.Hosting.Python@13.1.0-pr.12787.g68c1262c
#:package Aspire.Hosting.OpenAI@13.1.0-pr.12787.g68c1262c
#:package Aspire.Hosting.Docker@13.1.0-pr.12787.g68c1262c
#:package Aspire.Hosting.Azure.AppService@13.1.0-pr.12787.g68c1262c
#:package Aspire.Hosting.Azure.KeyVault@13.1.0-pr.12787.g68c1262c
#:sdk Aspire.AppHost.Sdk@13.1.0-pr.12787.g68c1262c

var builder = DistributedApplication.CreateBuilder(args);

builder.AddAzureAppServiceEnvironment("env");

var model = builder.AddOpenAI("openai");

var oaiKey = builder.AddAzureKeyVault("kv")
               .AddSecret("oaikey", model.Resource.Key);

builder.AddPythonApp("ai-agent", "./agent", "main.py")
    .WithHttpEndpoint(env: "PORT")
    .WithExternalHttpEndpoints()
    .WithEnvironment("HOST", "0.0.0.0")
    .WithEnvironment("LOG_LEVEL", "INFO")
    .WaitForStart(model)
    .WithEnvironment("OPENAI_API_KEY", oaiKey.Resource);

builder.Build().Run();

Log

<snip>
23:35:51 (provision-ai-agent-website) → Starting provision-ai-agent-website...
23:35:51 (provision-ai-agent-website) → Deploying ai-agent-website
^C23:39:38 (provision-ai-agent-website) i [DBG] Checksum mismatch for resource ai-agent-website. Expected: (null), Actual: 214fd003

PS: aspire deploy works, aspire do deploy-ai-agent hangs</issue_description>

<agent_instructions>We need to set up dependencies between Azure resources. We do this by looking at parameters and finding Azure resource references, then depending on their provision steps. This will happen in

Annotations.Add(new PipelineStepAnnotation((factoryContext) =>
via a PipelineConfigurationAnnotation.

There are some examples of visiting parameters here

private static void Visit(object? value, Action<object> visitor) =>
    Visit(value, visitor, []);

private static void Visit(object? value, Action<object> visitor, HashSet<object> visited)
{
    if (value is null || !visited.Add(value))
    {
        return;
    }

    visitor(value);

    if (value is IValueWithReferences vwr)
    {
        foreach (var reference in vwr.References)
        {
            Visit(reference, visitor, visited);
        }
    }
}

private static Task VisitAsync(object? value, Func<object, CancellationToken, Task> visitor, CancellationToken cancellationToken = default) =>
    VisitAsync(value, visitor, [], cancellationToken);

private static async Task VisitAsync(object? value, Func<object, CancellationToken, Task> visitor, HashSet<object> visited, CancellationToken cancellationToken = default)
{
    if (value is null || !visited.Add(value))
    {
        return;
    }

    await visitor(value, cancellationToken).ConfigureAwait(false);

    if (value is IValueWithReferences vwr)
    {
        foreach (var reference in vwr.References)
        {
            await VisitAsync(reference, visitor, visited, cancellationToken).ConfigureAwait(false);
        }
    }
}
.

Here

void CaptureBicepOutputs(object value)
{
    if (value is BicepOutputReference bo)
    {
        outputs[bo.ValueExpression] = bo;
    }
}

void CaptureBicepOutputsFromParameters(IResourceWithParameters resource)
{
    foreach (var parameter in resource.Parameters)
    {
        Visit(parameter.Value, CaptureBicepOutputs);
    }
}

Another example of walking the graph to find azure references

private static void ProcessAzureReferences(HashSet<IAzureResource> azureReferences, object? value)
.

So the steps are:

  1. In the pipeline step annotation, call resource.GetBicepTemplateString() to force evaluation of the Bicep template so that parameters are expanded.
  2. Walk the parameters and find IAzureResource references.
  3. Call GetSteps(resource, WellKnownPipelineTags.ProvisionInfrastructure) and depend on those steps from the Azure resource's own provision infrastructure steps.</agent_instructions>

Comments on the Issue (you are @copilot in this section)

@davidfowl OK, figured it out; this is a sneaky one. We don't express the dependencies between Bicep resources today. This works well when all steps run concurrently and dependent resources unblock each other.

Where it DOES NOT WORK is when you run subsets of the graph, because we need to know which dependencies to run first.</comment_new>
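
To make the subset problem concrete, here is a toy model of step selection. It is an illustration only and does not claim to match how the Aspire pipeline is implemented: StepPlanner and its dictionary of edges are hypothetical, and the graph is assumed to be acyclic. The step names are borrowed from the log and description above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical planner: a toy model, not the Aspire pipeline.
public static class StepPlanner
{
    // Returns the target step plus everything it transitively depends on,
    // ordered so that each step appears after its dependencies.
    // Assumes the dependency graph has no cycles.
    public static List<string> Plan(string target, IReadOnlyDictionary<string, string[]> dependsOn)
    {
        var ordered = new List<string>();
        var visited = new HashSet<string>();
        Visit(target, dependsOn, visited, ordered);
        return ordered;
    }

    private static void Visit(
        string step,
        IReadOnlyDictionary<string, string[]> dependsOn,
        HashSet<string> visited,
        List<string> ordered)
    {
        if (!visited.Add(step))
        {
            return;
        }
        if (dependsOn.TryGetValue(step, out var dependencies))
        {
            foreach (var dependency in dependencies)
            {
                Visit(dependency, dependsOn, visited, ordered);
            }
        }
        // Post-order: a step is emitted only after all of its dependencies.
        ordered.Add(step);
    }
}
```

With no edge recorded, planning provision-ai-agent-website selects only that step, so nothing ever provisions the Key Vault it is waiting on. Once an edge from provision-ai-agent-website to provision-kv is declared, the same call returns provision-kv first and then the website step.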



Add PipelineConfigurationAnnotation to AzureBicepResource to establish dependencies between Azure resources based on parameter references. This fixes hangs when deploying subsets of resources (e.g., aspire do deploy-<resource>).

Changes:
- Added PipelineConfigurationAnnotation in AzureBicepResource constructor
- Walks parameters using IValueWithReferences to find Azure resource dependencies
- Makes provision steps depend on referenced Azure resources' provision steps
- Added test to verify dependency graph via diagnostics output

Co-authored-by: davidfowl <95136+davidfowl@users.noreply.github.com>
@github-actions
Contributor

github-actions bot commented Nov 7, 2025

🚀 Dogfood this PR with:

⚠️ WARNING: Do not do this without first carefully reviewing the code of this PR to satisfy yourself it is safe.

curl -fsSL https://raw.githubusercontent.com/dotnet/aspire/main/eng/scripts/get-aspire-cli-pr.sh | bash -s -- 12797

Or

  • Run remotely in PowerShell:
iex "& { $(irm https://raw.githubusercontent.com/dotnet/aspire/main/eng/scripts/get-aspire-cli-pr.ps1) } 12797"

@davidfowl davidfowl closed this Nov 7, 2025
@davidfowl davidfowl reopened this Nov 7, 2025
@dotnet-policy-service dotnet-policy-service bot added this to the 13.1 milestone Nov 7, 2025
Copilot AI requested a review from davidfowl November 7, 2025 09:06
@davidfowl davidfowl marked this pull request as ready for review November 7, 2025 09:06
Copilot AI review requested due to automatic review settings November 7, 2025 09:06
Contributor

Copilot AI left a comment


Pull Request Overview

This PR fixes a deployment hang issue that occurs when Azure Bicep resources depend on other Azure resources (e.g., when a compute resource references a KeyVault secret). The fix establishes proper pipeline dependencies between Azure resources by detecting resource references through the IValueWithReferences interface and ensuring provision steps execute in the correct order.

Key Changes

  • Added pipeline configuration logic to AzureBicepResource constructor that automatically detects and establishes dependencies on referenced Azure resources
  • Implemented ProcessAzureReferences helper method to recursively extract Azure resource references from parameter values
  • Added comprehensive test that verifies the fix prevents deployment hangs and includes snapshot verification of the dependency graph

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File and description:

  • src/Aspire.Hosting.Azure/AzureBicepResource.cs: Added PipelineConfigurationAnnotation to detect Azure resource references in parameters and establish provision step dependencies; added ProcessAzureReferences helper methods
  • tests/Aspire.Hosting.Azure.Tests/AzureDeployerTests.cs: Added parameterized test covering both deploy and diagnostics modes to verify deployment completes without hanging when resources have dependencies
  • tests/Aspire.Hosting.Azure.Tests/Snapshots/AzureDeployerTests.DeployAsync_WithAzureResourceDependencies_DoesNotHang_step=diagnostics.verified.txt: Snapshot file capturing expected dependency graph diagnostic output showing correct execution order and dependencies

Comment on lines +499 to +503
/// <summary>
/// Processes a value to extract Azure resource references and adds them to the collection.
/// Uses IValueWithReferences to recursively walk the reference graph.
/// </summary>
private static void ProcessAzureReferences(HashSet<IAzureResource> azureReferences, object? value)

Copilot AI Nov 7, 2025


The XML documentation for ProcessAzureReferences is missing <param> tags for the parameters. According to the coding guidelines, all parameters should be documented with <param> tags. Add: <param name="azureReferences">The collection to populate with discovered Azure resource references.</param> and <param name="value">The value to process for Azure resource references.</param>

Copilot generated this review using guidance from repository custom instructions.
ProcessAzureReferences(azureReferences, value, []);
}

private static void ProcessAzureReferences(HashSet<IAzureResource> azureReferences, object? value, HashSet<object> visited)

Copilot AI Nov 7, 2025


This private overload is missing XML documentation. Since it's a private method, it should have a brief <summary> tag explaining its purpose and parameters. The public coding guidelines indicate even internal/private APIs should have minimal documentation with brief <summary> tags.

Copilot generated this review using guidance from repository custom instructions.
@davidfowl davidfowl changed the title from "[WIP] Fix hang when trying to deploy single compute resource" to "Fix hang when trying to deploy single compute resource" Nov 7, 2025
@davidfowl
Member

/backport to release/13.0

@github-actions
Contributor

github-actions bot commented Nov 7, 2025

Started backporting to release/13.0: https://github.com/dotnet/aspire/actions/runs/19164786381

Copilot AI added a commit that referenced this pull request Nov 7, 2025
- Add PipelineConfigurationAnnotation to AzureBicepResource to set up dependencies
- Add ProcessAzureReferences helper methods to find Azure resource dependencies
- Add test DeployAsync_WithAzureResourceDependencies_DoesNotHang (adapted for release/13.0)
- Update snapshot for DeployAsync_WithMultipleComputeEnvironments_Works
- Tests adjusted to work without IContainerRuntime.LoginToRegistryAsync (not in release/13.0)

Co-authored-by: davidfowl <95136+davidfowl@users.noreply.github.com>
@davidfowl davidfowl merged commit 5cffa03 into main Nov 7, 2025
296 checks passed
@davidfowl davidfowl deleted the copilot/fix-deployment-hang-issue branch November 7, 2025 16:19
Comment on lines +64 to +65
// Force evaluation of the Bicep template to ensure parameters are expanded
_ = GetBicepTemplateString();
Member


We should make a better way to do this at some point. Initialize() or something.

joperezr pushed a commit that referenced this pull request Nov 7, 2025
…#12800)

* Initial plan

* Backport PR #12797 fix for deployment hang to release/13.0

- Add PipelineConfigurationAnnotation to AzureBicepResource to set up dependencies
- Add ProcessAzureReferences helper methods to find Azure resource dependencies
- Add test DeployAsync_WithAzureResourceDependencies_DoesNotHang (adapted for release/13.0)
- Update snapshot for DeployAsync_WithMultipleComputeEnvironments_Works
- Tests adjusted to work without IContainerRuntime.LoginToRegistryAsync (not in release/13.0)

Co-authored-by: davidfowl <95136+davidfowl@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: davidfowl <95136+davidfowl@users.noreply.github.com>
@github-actions github-actions bot locked and limited conversation to collaborators Dec 8, 2025

Development

Successfully merging this pull request may close these issues.

Hang when trying to deploy a single comptue resource to app service

5 participants