fix: apply --deploy-mode flags to adopted azure.yaml (#8923)#8926
Conversation
The unified azure.yaml adoption path (runInitFromAzureYaml) now processes --deploy-mode, --runtime, and --entry-point flags after scaffolding the project. For code deploy, code_configuration is written onto the agent service and the language is updated. For container deploy, the docker property is set. When no flags are provided and the service already has code_configuration or a docker property configured, the service is left unchanged (the sample's pre-configured state is respected). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
📋 Prioritization NoteThanks for the contribution! The linked issue isn't in the current milestone yet. |
There was a problem hiding this comment.
Pull request overview
This PR fixes #8923, where azd ai agent init -m <azure.yaml> (the unified Foundry adoption path added in #8885) silently ignored --deploy-mode, --runtime, and --entry-point. After the sample's azure.yaml is adopted, a new applyDeployModeToAdoptedProject hook locates the azure.ai.agent service and applies code- or container-deploy configuration, reusing the same promptDeployMode/promptCodeConfig helpers as the manifest path. Configuration is written back to the service via SetServiceConfigValue.
Changes:
- Add
applyDeployModeToAdoptedProjectand theapplyCodeDeployToService/applyContainerDeployToService/adoptedServiceHasCodeConfighelpers, invoked after scaffolding/provider stamping inrunInitFromAzureYaml. - Resolve deploy mode with explicit-flag precedence, no-op when the sample is already configured, and prompt/auto-select otherwise; write
code_configuration/docker/languageonto the agent service. - Add
TestAdoptedServiceHasCodeConfigcovering nil/empty/struct/null property cases.
The core issue with this PR is a key-casing mismatch: the adopted path reads and writes snake_case (code_configuration, entry_point, dependency_resolution), but the unified azure.yaml inline agent shape and the extension's deploy/read path use camelCase (codeConfiguration, entryPoint, dependencyResolution, per schemas/azure.ai.agent.json and AgentDefinitionInline). Because SetServiceConfigValue writes the path verbatim into azure.yaml, the code configuration this PR writes will not be recognized at deploy time, so --deploy-mode code still has no effect — the bug the PR intends to fix. Detection of already-configured samples and the code→container clear path share the same casing problem, and language/runtime detection runs against . instead of the agent's source subdirectory.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
cli/azd/extensions/azure.ai.agents/internal/cmd/init_adopt.go |
Adds the deploy-mode application hook and helpers for the adopted unified azure.yaml; uses snake_case config keys and . for language detection (both incorrect). |
cli/azd/extensions/azure.ai.agents/internal/cmd/init_adopt_test.go |
Adds TestAdoptedServiceHasCodeConfig, whose fixtures encode the same snake_case key as the code under test. |
jongio
left a comment
There was a problem hiding this comment.
Two verified issues that affect correctness, plus a robustness concern:
1. Language detection uses project root instead of service directory (HIGH)
\ argetDir := "."\ means \isPythonProject/\isDotnetProject\ scan the project root, but adopted samples declare agent services in subdirectories (e.g., \project: ./agents/assistant). The manifest path uses the downloaded service directory (\init.go:1388). Here, \�gentSvc.GetRelativePath()\ should be used so auto-detection finds the correct files.
2. Non-deterministic map iteration picks an arbitrary service (MEDIUM)
esp.GetProject().GetServices()\ is a map. With multi-agent samples (researcher, writer, triage), the \�reak\ on line 559 applies configuration to whichever service the runtime iterates first. Either apply to all matching services or collect them and let the user choose.
3. Silently discarded \structpb.NewValue\ errors (LOW)
Lines 665 and 697 discard the error from \structpb.NewValue(language)\ with _. While unlikely to fail for string literals, ignoring errors deviates from the extension's error handling conventions.
Note on the existing bot comments about snake_case vs camelCase: those comments cite \internal/project/agent_definition.go\ and \AgentDefinitionInline, neither of which exist in the codebase. The agent_yaml struct uses YAML tag \code_configuration\ and the service target detection (\service_target_agent.go:1122) also reads \code_configuration. The PR's use of snake_case appears internally consistent. I'd recommend verifying which key the deploy path actually reads from the azure.yaml inline properties to confirm end-to-end correctness.
|
Corrected review summary (the review body above has formatting issues from a shell escape bug): Two verified issues that affect correctness, plus a robustness concern: 1. Language detection uses project root instead of service directory (HIGH)
2. Non-deterministic map iteration picks an arbitrary service (MEDIUM)
3. Silently discarded Lines 665 and 697 discard the error from Note on the existing bot comments about snake_case vs camelCase: those comments cite |
When --image is passed with -m <azure.yaml>, write it onto the adopted agent service's image field and force container deploy mode. Validate early that --image + --deploy-mode code is incompatible (same as the manifest path). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
jongio
left a comment
There was a problem hiding this comment.
Incremental review of commit 511d165 (--image flag handling in adoption path).
The new �alidateImageFlag + image-override logic is well-structured and the test coverage for validation is solid. Two notes on this new commit:
Prior findings still open:
- argetDir := "." scans the project root instead of the service subdirectory for language detection (HIGH, line 605). This still causes isPythonProject/isDotnetProject to miss files in samples like ./agents/assistant.
- Non-deterministic map iteration picks an arbitrary service when multiple �zure.ai.agent services exist (MEDIUM, line 559).
New issue in this commit:
- imageValue, _ := structpb.NewValue(flags.image) (line 572) discards the error. While �alidateImageFlag already checks the format, structpb.NewValue can still fail for other reasons. This follows the same pattern I flagged in my previous review at line 697.
The --image feature itself looks correct: early validation prevents conflicts with --deploy-mode code, the image property is written before applying container config, and the tests cover the key scenarios. Please address findings 1 and 2 from the prior review before merging.
|
Corrected review summary (the review body above has formatting issues from a shell escape bug): Incremental review of commit 511d165 (--image flag handling in adoption path). The new Prior findings still open:
New issue in this commit:
The --image feature itself looks correct: early validation prevents conflicts with |
…rors - Use camelCase keys (codeConfiguration, entryPoint, dependencyResolution) in SetServiceConfigValue paths to match the azure.yaml inline format that the deploy-time read path expects via JSON unmarshal. - Apply deploy-mode configuration to ALL azure.ai.agent services in the adopted project, not just the first one found (fixes non-deterministic map iteration issue with multi-agent samples). - Use svc.GetRelativePath() for language detection instead of '.' so isPythonProject/isDotnetProject scan the service subdirectory. - Handle structpb.NewValue errors instead of discarding with _. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
jongio
left a comment
There was a problem hiding this comment.
All three findings from my prior reviews are addressed:
- Language detection now uses the service's relative path instead of project root.
- Multi-service case handled correctly by collecting all agent services and iterating the full set.
- structpb.NewValue errors are no longer silently discarded.
The camelCase key rename (codeConfiguration, entryPoint, dependencyResolution) is consistent with how the deploy path reads these via JSON unmarshal.
jongio
left a comment
There was a problem hiding this comment.
@huimiu's observation is correct. I traced the full conversion path:
-
Internal
project.ServiceConfigdeclaresDocker DockerProjectOptionsas a value type (service_config.go:48), so it's always the zero-value struct when nodocker:section exists in azure.yaml. -
The mapper at
mapper_registry.go:165registers a converterDockerProjectOptions -> *azdext.DockerProjectOptionsthat unconditionally returns a non-nil pointer (line 193). -
The ServiceConfig converter at
mapper_registry.go:122callsmapper.Convert(src.Docker, &docker)with that zero-value struct, getting back a non-nil*DockerProjectOptionswith all empty fields. -
Result:
svc.GetDocker() != nilis always true after the gRPC round-trip, so the early return on line 612 always fires whenflags.deployMode == "".
Fix: mirror the adoptedServiceHasCodeConfig approach by checking additionalProperties for a docker key, or check whether the Docker struct has any non-zero fields (e.g. svc.GetDocker().GetPath() != "" || svc.GetDocker().GetRemoteBuild()). The additionalProperties approach is more consistent with how the deploy path reads these values.
Withdrawing my earlier approval since this means the no-flag adoption path silently skips deploy-mode configuration.
svc.GetDocker() is always non-nil after the gRPC round-trip because the mapper unconditionally converts the zero-value DockerProjectOptions to a non-nil pointer. This caused the no-flag adoption path to always skip deploy-mode configuration. Replace with adoptedServiceHasDocker() that checks additionalProperties for a 'docker' key, matching how codeConfiguration is detected. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
6cf65b0 to
1c8c9f1
Compare
jongio
left a comment
There was a problem hiding this comment.
All findings from my prior reviews are addressed in this commit. The adoptedServiceHasDocker function correctly checks additionalProperties instead of svc.GetDocker() (which always returns non-nil due to the gRPC mapper zero-value behavior). The implementation mirrors adoptedServiceHasCodeConfig and the test coverage is solid.
When switching deploy modes, use UnsetServiceConfig (which deletes the map key) instead of SetServiceConfigValue with a null protobuf value. The null approach left 'codeConfiguration: null' or 'docker: null' in the azure.yaml because nil values in map[string]any are marshaled as YAML null rather than being omitted. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
jongio
left a comment
There was a problem hiding this comment.
Using UnsetServiceConfig instead of writing null is the correct approach. It properly removes the key rather than leaving an explicit null in the YAML, and the guard conditions avoid unnecessary API calls when the property isn't set.
…pt-azure-yaml-deploy-mode # Conflicts: # cli/azd/extensions/azure.ai.agents/internal/cmd/init_adopt.go # cli/azd/extensions/azure.ai.agents/internal/cmd/init_adopt_test.go
RickWinter
left a comment
There was a problem hiding this comment.
This applies --deploy-mode / --runtime / --entry-point to the adopted azure.yaml agent service, closing the gap where those flags were silently dropped on the Foundry adoption path. The approach is the right shape: it reuses promptDeployMode / promptCodeConfig, writes codeConfiguration with the same camelCase keys the schema and AgentDefinitionToServiceProperties already use (runtime, entryPoint, dependencyResolution all check out), and the has-code-config / has-docker detection correctly reads additionalProperties rather than trusting the always-non-nil Docker pointer. The added tests cover the detection helpers and image-flag validation well.
One thing worth resolving before merge: applyContainerDeployToService hardcodes remoteBuild: true, while the code-based init path turns it off for VNET network-injected Foundry projects, so adopting a container sample against such a project will fail the first deploy and require a manual edit. The map-iteration ordering is a minor nondeterminism nit for the multi-agent case. Nothing else blocks; the key handling and the respect-existing-sample logic are correct as written.
| svc *azdext.ServiceConfig, | ||
| ) error { | ||
| // Set docker property with remote build enabled. | ||
| dockerMap := map[string]any{"remoteBuild": true} |
There was a problem hiding this comment.
applyContainerDeployToService always writes remoteBuild: true, but the code-based init path in init_from_code.go sets serviceConfig.Docker = &azdext.DockerProjectOptions{RemoteBuild: !networkInjected}, deliberately turning remote build off when the target Foundry project has VNET network injection, since the ACR Tasks worker can't reach a registry inside the VNET. The deploy path doesn't re-derive this; it surfaces guidance telling the user to flip docker.remoteBuild to false by hand. So a sample adopted with container deploy against a network-injected project will fail the first azd deploy and force a manual edit, while the same sample created via init_from_code would have worked. Can we mirror the networkInjected check here (thread the selected Foundry project through) so the two init paths agree?
| svc *azdext.ServiceConfig | ||
| } | ||
| var agentServices []agentEntry | ||
| for name, svc := range resp.GetProject().GetServices() { |
There was a problem hiding this comment.
nit: ranging over resp.GetProject().GetServices() (a map) gives nondeterministic order, so with more than one azure.ai.agent service the prompt order and log output vary run to run. Single-agent samples won't notice, but sorting agentServices by name before the apply loop would make the behavior deterministic.
Why
When
azd ai agent init -m <azure.yaml>routes to the unified Foundry adoption path, the--deploy-mode,--runtime, and--entry-pointflags are silently ignored. The adopted azure.yaml is written as-is with nocode_configurationapplied to the agent service. This means users cannot select code deploy when adopting a sample.Approach
After
scaffoldProjectsucceeds, the newapplyDeployModeToAdoptedProjectfunction locates theazure.ai.agentservice in the adopted project and applies deploy-mode configuration using the samepromptDeployMode/promptCodeConfigfunctions the manifest path already uses.Decision logic:
--deploy-modeflag -- always apply (user intent overrides sample)code_configurationordockerproperty -- no-op (respect pre-configured sample)For code deploy: writes
code_configuration(runtime, entry_point, dependency_resolution) onto the service, updates language topython/csharp, clears any docker property.For container deploy: sets the
dockerproperty (remoteBuild: true), updates language todocker, clears any code_configuration.Testing
TestAdoptedServiceHasCodeConfigcovering nil props, empty props, valid struct value, and null value casesgo test ./internal/cmd/... -shortpassesFixes: #8923