Skip to content

fix: apply --deploy-mode flags to adopted azure.yaml (#8923)#8926

Merged
trangevi merged 6 commits into
mainfrom
trangevi/fix-8923-adopt-azure-yaml-deploy-mode
Jul 2, 2026
Merged

fix: apply --deploy-mode flags to adopted azure.yaml (#8923)#8926
trangevi merged 6 commits into
mainfrom
trangevi/fix-8923-adopt-azure-yaml-deploy-mode

Conversation

@trangevi

@trangevi trangevi commented Jul 1, 2026

Copy link
Copy Markdown
Member

Why

When azd ai agent init -m <azure.yaml> routes to the unified Foundry adoption path, the --deploy-mode, --runtime, and --entry-point flags are silently ignored. The adopted azure.yaml is written as-is with no code_configuration applied to the agent service. This means users cannot select code deploy when adopting a sample.

Approach

After scaffoldProject succeeds, the new applyDeployModeToAdoptedProject function locates the azure.ai.agent service in the adopted project and applies deploy-mode configuration using the same promptDeployMode / promptCodeConfig functions the manifest path already uses.

Decision logic:

  • Explicit --deploy-mode flag -- always apply (user intent overrides sample)
  • No flag + service already has code_configuration or docker property -- no-op (respect pre-configured sample)
  • No flag + neither configured -- prompt interactively (or auto-select per noPrompt/userProvidedManifest rules)

For code deploy: writes code_configuration (runtime, entry_point, dependency_resolution) onto the service, updates language to python/csharp, clears any docker property.

For container deploy: sets the docker property (remoteBuild: true), updates language to docker, clears any code_configuration.

Testing

  • Added TestAdoptedServiceHasCodeConfig covering nil props, empty props, valid struct value, and null value cases
  • All existing adoption tests continue to pass
  • Full go test ./internal/cmd/... -short passes

Fixes: #8923

The unified azure.yaml adoption path (runInitFromAzureYaml) now processes
--deploy-mode, --runtime, and --entry-point flags after scaffolding the
project. For code deploy, code_configuration is written onto the agent
service and the language is updated. For container deploy, the docker
property is set.

When no flags are provided and the service already has code_configuration
or a docker property configured, the service is left unchanged (the
sample's pre-configured state is respected).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings July 1, 2026 17:18
@github-actions

github-actions Bot commented Jul 1, 2026

Copy link
Copy Markdown

📋 Prioritization Note

Thanks for the contribution! The linked issue isn't in the current milestone yet.
Thank you for logging this issue; our team is reviewing it. If you need urgent prioritization, tag @RickWinter and @kristenwomack to let us know.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes #8923, where azd ai agent init -m <azure.yaml> (the unified Foundry adoption path added in #8885) silently ignored --deploy-mode, --runtime, and --entry-point. After the sample's azure.yaml is adopted, a new applyDeployModeToAdoptedProject hook locates the azure.ai.agent service and applies code- or container-deploy configuration, reusing the same promptDeployMode/promptCodeConfig helpers as the manifest path. Configuration is written back to the service via SetServiceConfigValue.

Changes:

  • Add applyDeployModeToAdoptedProject and the applyCodeDeployToService/applyContainerDeployToService/adoptedServiceHasCodeConfig helpers, invoked after scaffolding/provider stamping in runInitFromAzureYaml.
  • Resolve deploy mode with explicit-flag precedence, no-op when the sample is already configured, and prompt/auto-select otherwise; write code_configuration/docker/language onto the agent service.
  • Add TestAdoptedServiceHasCodeConfig covering nil/empty/struct/null property cases.

The core issue with this PR is a key-casing mismatch: the adopted path reads and writes snake_case (code_configuration, entry_point, dependency_resolution), but the unified azure.yaml inline agent shape and the extension's deploy/read path use camelCase (codeConfiguration, entryPoint, dependencyResolution, per schemas/azure.ai.agent.json and AgentDefinitionInline). Because SetServiceConfigValue writes the path verbatim into azure.yaml, the code configuration this PR writes will not be recognized at deploy time, so --deploy-mode code still has no effect — the bug the PR intends to fix. Detection of already-configured samples and the code→container clear path share the same casing problem, and language/runtime detection runs against . instead of the agent's source subdirectory.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

File Description
cli/azd/extensions/azure.ai.agents/internal/cmd/init_adopt.go Adds the deploy-mode application hook and helpers for the adopted unified azure.yaml; uses snake_case config keys and . for language detection (both incorrect).
cli/azd/extensions/azure.ai.agents/internal/cmd/init_adopt_test.go Adds TestAdoptedServiceHasCodeConfig, whose fixtures encode the same snake_case key as the code under test.

Comment thread cli/azd/extensions/azure.ai.agents/internal/cmd/init_adopt.go Outdated
Comment thread cli/azd/extensions/azure.ai.agents/internal/cmd/init_adopt.go Outdated
Comment thread cli/azd/extensions/azure.ai.agents/internal/cmd/init_adopt.go Outdated
Comment thread cli/azd/extensions/azure.ai.agents/internal/cmd/init_adopt.go Outdated
Comment thread cli/azd/extensions/azure.ai.agents/internal/cmd/init_adopt.go
Comment thread cli/azd/extensions/azure.ai.agents/internal/cmd/init_adopt_test.go

@jongio jongio left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two verified issues that affect correctness, plus a robustness concern:

1. Language detection uses project root instead of service directory (HIGH)

\ argetDir := "."\ means \isPythonProject/\isDotnetProject\ scan the project root, but adopted samples declare agent services in subdirectories (e.g., \project: ./agents/assistant). The manifest path uses the downloaded service directory (\init.go:1388). Here, \�gentSvc.GetRelativePath()\ should be used so auto-detection finds the correct files.

2. Non-deterministic map iteration picks an arbitrary service (MEDIUM)

esp.GetProject().GetServices()\ is a map. With multi-agent samples (researcher, writer, triage), the \�reak\ on line 559 applies configuration to whichever service the runtime iterates first. Either apply to all matching services or collect them and let the user choose.

3. Silently discarded \structpb.NewValue\ errors (LOW)

Lines 665 and 697 discard the error from \structpb.NewValue(language)\ with _. While unlikely to fail for string literals, ignoring errors deviates from the extension's error handling conventions.

Note on the existing bot comments about snake_case vs camelCase: those comments cite \internal/project/agent_definition.go\ and \AgentDefinitionInline, neither of which exist in the codebase. The agent_yaml struct uses YAML tag \code_configuration\ and the service target detection (\service_target_agent.go:1122) also reads \code_configuration. The PR's use of snake_case appears internally consistent. I'd recommend verifying which key the deploy path actually reads from the azure.yaml inline properties to confirm end-to-end correctness.

Comment thread cli/azd/extensions/azure.ai.agents/internal/cmd/init_adopt.go
Comment thread cli/azd/extensions/azure.ai.agents/internal/cmd/init_adopt.go
Comment thread cli/azd/extensions/azure.ai.agents/internal/cmd/init_adopt.go
@jongio

jongio commented Jul 1, 2026

Copy link
Copy Markdown
Member

Corrected review summary (the review body above has formatting issues from a shell escape bug):

Two verified issues that affect correctness, plus a robustness concern:

1. Language detection uses project root instead of service directory (HIGH)

targetDir := "." means isPythonProject/isDotnetProject scan the project root, but adopted samples declare agent services in subdirectories (e.g., project: ./agents/assistant). The manifest path uses the downloaded service directory (init.go:1388). Here, agentSvc.GetRelativePath() should be used so auto-detection finds the correct files.

2. Non-deterministic map iteration picks an arbitrary service (MEDIUM)

resp.GetProject().GetServices() is a map. With multi-agent samples (researcher, writer, triage), the break on line 559 applies configuration to whichever service the runtime iterates first. Either apply to all matching services or collect them and let the user choose.

3. Silently discarded structpb.NewValue errors (LOW)

Lines 665 and 697 discard the error from structpb.NewValue(language) with _. While unlikely to fail for string literals, ignoring errors deviates from the extension's error handling conventions.

Note on the existing bot comments about snake_case vs camelCase: those comments cite internal/project/agent_definition.go and AgentDefinitionInline, neither of which exist in the codebase. The agent_yaml struct uses YAML tag code_configuration and the service target detection (service_target_agent.go:1122) also reads code_configuration. The PR's use of snake_case appears internally consistent. I'd recommend verifying which key the deploy path actually reads from the azure.yaml inline properties to confirm end-to-end correctness.

When --image is passed with -m <azure.yaml>, write it onto the adopted
agent service's image field and force container deploy mode. Validate
early that --image + --deploy-mode code is incompatible (same as the
manifest path).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

@jongio jongio left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incremental review of commit 511d165 (--image flag handling in adoption path).

The new �alidateImageFlag + image-override logic is well-structured and the test coverage for validation is solid. Two notes on this new commit:

Prior findings still open:

  1. argetDir := "." scans the project root instead of the service subdirectory for language detection (HIGH, line 605). This still causes isPythonProject/isDotnetProject to miss files in samples like ./agents/assistant.
  2. Non-deterministic map iteration picks an arbitrary service when multiple �zure.ai.agent services exist (MEDIUM, line 559).

New issue in this commit:

  1. imageValue, _ := structpb.NewValue(flags.image) (line 572) discards the error. While �alidateImageFlag already checks the format, structpb.NewValue can still fail for other reasons. This follows the same pattern I flagged in my previous review at line 697.

The --image feature itself looks correct: early validation prevents conflicts with --deploy-mode code, the image property is written before applying container config, and the tests cover the key scenarios. Please address findings 1 and 2 from the prior review before merging.

Comment thread cli/azd/extensions/azure.ai.agents/internal/cmd/init_adopt.go Outdated
@jongio

jongio commented Jul 1, 2026

Copy link
Copy Markdown
Member

Corrected review summary (the review body above has formatting issues from a shell escape bug):

Incremental review of commit 511d165 (--image flag handling in adoption path).

The new validateImageFlag + image-override logic is well-structured and the test coverage for validation is solid. Two notes on this new commit:

Prior findings still open:

  1. targetDir := "." scans the project root instead of the service subdirectory for language detection (HIGH, line 605). This still causes isPythonProject/isDotnetProject to miss files in samples like ./agents/assistant.
  2. Non-deterministic map iteration picks an arbitrary service when multiple azure.ai.agent services exist (MEDIUM, line 559).

New issue in this commit:

  1. imageValue, _ := structpb.NewValue(flags.image) (line 572) discards the error. While validateImageFlag already checks the format, structpb.NewValue can still fail for other reasons. This follows the same pattern I flagged in my previous review at line 697.

The --image feature itself looks correct: early validation prevents conflicts with --deploy-mode code, the image property is written before applying container config, and the tests cover the key scenarios. Please address findings 1 and 2 from the prior review before merging.

…rors

- Use camelCase keys (codeConfiguration, entryPoint, dependencyResolution)
  in SetServiceConfigValue paths to match the azure.yaml inline format
  that the deploy-time read path expects via JSON unmarshal.
- Apply deploy-mode configuration to ALL azure.ai.agent services in the
  adopted project, not just the first one found (fixes non-deterministic
  map iteration issue with multi-agent samples).
- Use svc.GetRelativePath() for language detection instead of '.' so
  isPythonProject/isDotnetProject scan the service subdirectory.
- Handle structpb.NewValue errors instead of discarding with _.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

@jongio jongio left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All three findings from my prior reviews are addressed:

  1. Language detection now uses the service's relative path instead of project root.
  2. Multi-service case handled correctly by collecting all agent services and iterating the full set.
  3. structpb.NewValue errors are no longer silently discarded.

The camelCase key rename (codeConfiguration, entryPoint, dependencyResolution) is consistent with how the deploy path reads these via JSON unmarshal.

Comment thread cli/azd/extensions/azure.ai.agents/internal/cmd/init_adopt.go Outdated

@jongio jongio left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@huimiu's observation is correct. I traced the full conversion path:

  1. Internal project.ServiceConfig declares Docker DockerProjectOptions as a value type (service_config.go:48), so it's always the zero-value struct when no docker: section exists in azure.yaml.

  2. The mapper at mapper_registry.go:165 registers a converter DockerProjectOptions -> *azdext.DockerProjectOptions that unconditionally returns a non-nil pointer (line 193).

  3. The ServiceConfig converter at mapper_registry.go:122 calls mapper.Convert(src.Docker, &docker) with that zero-value struct, getting back a non-nil *DockerProjectOptions with all empty fields.

  4. Result: svc.GetDocker() != nil is always true after the gRPC round-trip, so the early return on line 612 always fires when flags.deployMode == "".

Fix: mirror the adoptedServiceHasCodeConfig approach by checking additionalProperties for a docker key, or check whether the Docker struct has any non-zero fields (e.g. svc.GetDocker().GetPath() != "" || svc.GetDocker().GetRemoteBuild()). The additionalProperties approach is more consistent with how the deploy path reads these values.

Withdrawing my earlier approval since this means the no-flag adoption path silently skips deploy-mode configuration.

Comment thread cli/azd/extensions/azure.ai.agents/internal/cmd/init_adopt.go Outdated
svc.GetDocker() is always non-nil after the gRPC round-trip because the
mapper unconditionally converts the zero-value DockerProjectOptions to a
non-nil pointer. This caused the no-flag adoption path to always skip
deploy-mode configuration.

Replace with adoptedServiceHasDocker() that checks additionalProperties
for a 'docker' key, matching how codeConfiguration is detected.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@trangevi trangevi force-pushed the trangevi/fix-8923-adopt-azure-yaml-deploy-mode branch from 6cf65b0 to 1c8c9f1 Compare July 2, 2026 17:16

@jongio jongio left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All findings from my prior reviews are addressed in this commit. The adoptedServiceHasDocker function correctly checks additionalProperties instead of svc.GetDocker() (which always returns non-nil due to the gRPC mapper zero-value behavior). The implementation mirrors adoptedServiceHasCodeConfig and the test coverage is solid.

When switching deploy modes, use UnsetServiceConfig (which deletes the
map key) instead of SetServiceConfigValue with a null protobuf value.
The null approach left 'codeConfiguration: null' or 'docker: null' in
the azure.yaml because nil values in map[string]any are marshaled as
YAML null rather than being omitted.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

@jongio jongio left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using UnsetServiceConfig instead of writing null is the correct approach. It properly removes the key rather than leaving an explicit null in the YAML, and the guard conditions avoid unnecessary API calls when the property isn't set.

…pt-azure-yaml-deploy-mode

# Conflicts:
#	cli/azd/extensions/azure.ai.agents/internal/cmd/init_adopt.go
#	cli/azd/extensions/azure.ai.agents/internal/cmd/init_adopt_test.go
@trangevi trangevi merged commit 7b0d35a into main Jul 2, 2026
27 checks passed

@RickWinter RickWinter left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This applies --deploy-mode / --runtime / --entry-point to the adopted azure.yaml agent service, closing the gap where those flags were silently dropped on the Foundry adoption path. The approach is the right shape: it reuses promptDeployMode / promptCodeConfig, writes codeConfiguration with the same camelCase keys the schema and AgentDefinitionToServiceProperties already use (runtime, entryPoint, dependencyResolution all check out), and the has-code-config / has-docker detection correctly reads additionalProperties rather than trusting the always-non-nil Docker pointer. The added tests cover the detection helpers and image-flag validation well.

One thing worth resolving before merge: applyContainerDeployToService hardcodes remoteBuild: true, while the code-based init path turns it off for VNET network-injected Foundry projects, so adopting a container sample against such a project will fail the first deploy and require a manual edit. The map-iteration ordering is a minor nondeterminism nit for the multi-agent case. Nothing else blocks; the key handling and the respect-existing-sample logic are correct as written.

svc *azdext.ServiceConfig,
) error {
// Set docker property with remote build enabled.
dockerMap := map[string]any{"remoteBuild": true}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

applyContainerDeployToService always writes remoteBuild: true, but the code-based init path in init_from_code.go sets serviceConfig.Docker = &azdext.DockerProjectOptions{RemoteBuild: !networkInjected}, deliberately turning remote build off when the target Foundry project has VNET network injection, since the ACR Tasks worker can't reach a registry inside the VNET. The deploy path doesn't re-derive this; it surfaces guidance telling the user to flip docker.remoteBuild to false by hand. So a sample adopted with container deploy against a network-injected project will fail the first azd deploy and force a manual edit, while the same sample created via init_from_code would have worked. Can we mirror the networkInjected check here (thread the selected Foundry project through) so the two init paths agree?

svc *azdext.ServiceConfig
}
var agentServices []agentEntry
for name, svc := range resp.GetProject().GetServices() {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: ranging over resp.GetProject().GetServices() (a map) gives nondeterministic order, so with more than one azure.ai.agent service the prompt order and log output vary run to run. Single-agent samples won't notice, but sorting agentServices by name before the apply loop would make the behavior deterministic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ext-agents azure.ai.agents extension

Projects

None yet

Development

Successfully merging this pull request may close these issues.

azd ai agent init -m <azure.yaml> ignores --deploy-mode code and does not update the adopted azure.yaml

6 participants