Skip to content

Add live golden-path (Tier 2) pipeline for azd ai agent extension#8758

Merged
huimiu merged 36 commits into
mainfrom
wujia/ai-agent-live-tests
Jun 30, 2026
Merged

Add live golden-path (Tier 2) pipeline for azd ai agent extension#8758
huimiu merged 36 commits into
mainfrom
wujia/ai-agent-live-tests

Conversation

@v1212

@v1212 v1212 commented Jun 22, 2026

Copy link
Copy Markdown
Collaborator

Fixes #8759

Summary

  • Adds eng/pipelines/ext-azure-ai-agents-live.yml: an on-demand / weekly Azure DevOps pipeline that runs the Tier 2 live golden path (init → provision → deploy → invoke → down) for the azure.ai.agents extension against a real Azure (TME) subscription, for both code and container deploy modes.
  • Adds the Tier 2 live driver as a Go test (tier2_live_test.go, console_test.go, assert.go) under cli/azd/extensions/azure.ai.agents/tests/e2e-live/, driving azd ai agent through a pseudo-terminal with go-expect + vt10x + creack/pty (replacing the original tmux/Python prototype from test: add static E2E tests for azure.ai.agents extension #8692), with CI auth detection for Azure DevOps (TF_BUILD / E2E_USE_AZ_CLI_AUTH).
  • Adds a README.md documenting how to run (CI + local) and the one-time ADO setup.

Together with the PR-gate tests in #8754 (Tier 0 offline + Tier 1 recording/playback), this covers Tier 0/1/2 from the original prototype (#8692).

At a glance

  1. What it tests — the full live golden path init → provision → deploy → invoke → down of azd ai agent, driving a real CLI over a pseudo-terminal (Go go-expect + vt10x + creack/pty), for both code (zip) and container (ACR build) deploy modes, run sequentially. The interactive init phase is an event-driven prompt loop (wait for the survey UI to settle, read the rendered screen, dispatch on the verbatim prompt strings) rather than a fixed script, so it tolerates the live model/region/project branches. The invoke step asserts a live model answers "what is 2+2?" with a standalone 4 / four.

  2. How to triggertrigger: none / pr: none, so it never runs on PR push. Run it via: a PR comment /azp run ext-azure-ai-agents-live (requires repo write), the weekly Mon 07:00 UTC schedule, or a manual ADO queue (pick deployModes = both / code / container).

  3. Identity / subscription / what it does — authenticates as the azure-sdk-tests ADO service connection (Workload Identity Federation / OIDC) into the shared TME test subscription, region eastus2. It creates a fresh Foundry project + gpt-4o-mini deployment (+ an ACR in container mode) and the agent service, exercises them, then tears them down. az logs in via OIDC and azd reuses that session (auth.useAzCliAuth + keepAzSessionActive: true, so the WIF token survives the multi-minute run).

  4. Leftover resources?No, by design. Each mode registers a t.Cleanup that runs azd down --force --purge, and a teardown failure fails the run (never green while leaking). A condition: always() AzureCLI@2 step is a second safety net: it force-purges any project dirs left under /tmp/e2e-tests/tier2-*/ after a crash or timeout. (Crash-path cleanup is best-effort, so the TME subscription is worth an occasional spot-check.)

  5. Runs on Azure DevOps, not GitHub Actions — the YAML lives in this GitHub repo (eng/pipelines/…) but is an ADO pipeline (extends: 1es-redirect, task: AzureCLI@2, $(Build.*), ##vso[...]), executed by the azure-sdk ADO project. GitHub CI for this PR stays green/inert; the live run only happens in ADO.

  6. Post-merge admin action — needs an Azure DevOps admin on the azure-sdk project (Azure SDK EngSys)not a GitHub admin — to: (a) register the pipeline as ext-azure-ai-agents-live (the exact name /azp run uses) against this repo + YAML path; (b) confirm the azure-sdk-tests service connection maps to the TME subscription with RBAC to create Foundry projects and deploy models (Contributor + Azure AI Developer + Cognitive Services Contributor, or equivalent); (c) kick the first /azp run to validate the keepAzSessionActive auth path. Merging the PR itself only needs a GitHub maintainer approval. The init GitHub token is already wired via the ambient azuresdk-github-pat org secret, so no extra secret setup is needed.

Testing

  • The golden-path flow was validated end-to-end in the test: add static E2E tests for azure.ai.agents extension #8692 prototype fork run (code ~669s, container ~711s). The driver has since been rewritten from tmux/Python to the Go PTY harness in this PR; the live run of the Go version is exercised post-merge via /azp run (the test is gated behind AZURE_AI_AGENTS_E2E_LIVE=1 and never runs in PR CI).
  • Local validation of the Go harness: gofmt, go vet, golangci-lint (ubuntu + windows), and a go test -c compile all clean on the linux build tag; the no-tag assert.go answer-matching logic is covered by assert_test.go (runs on the host); README.md passes cspell.
  • Driver design choices hardened across review: an event-driven prompt loop with output-settle detection (replacing a fixed poll) so live conditional prompts don't desync; loop detection that advances past or fails out of a stuck prompt; t.Context() as the run-context parent so go test -timeout cancellation propagates; LIFO t.Cleanup ordering so teardown runs before the copied AZD_CONFIG_DIR is removed; a destructive-RemoveAll guardrail on the test dir; and a robust standalone-token check for the invoke result (accepts 4 or four, ignores incidental 4s such as gpt-4o-mini / 4.1 / 404).
  • Timeout hierarchy: per-mode runTimeout (60 min, a context deadline) > the sum of the per-phase budgets (init/provision/deploy/invoke ≈ 32 min) so a slow-but-healthy run is never preempted mid-phase; both modes fit under go test -timeout 125m, and the ADO AzureCLI@2 step (130 min) and job (150 min) caps sit strictly above that, so a run can never be hard-killed mid-azd down (which would leak live resources).
  • The ADO pipeline itself only runs once registered in ADO (inert in GitHub CI by design).

Adds eng/pipelines/ext-azure-ai-agents-live.yml, an on-demand/weekly Azure DevOps pipeline that drives the real 'azd ai agent' CLI through tmux against live Azure (TME), exercising init -> provision -> deploy -> invoke -> down for both code and container deploy modes.

This is the live counterpart to the PR-gate checks (Tier 0 offline + Tier 1 recording/playback in #8754). Per Azure SDK EngSys / SFI guidance, live access stays out of the automatic PR pipeline (trigger: none) and runs only via '/azp run ext-azure-ai-agents-live' or the weekly schedule.

The Tier 2 tmux driver (test_full_e2e.py, test_tier2.py) is migrated from the #8692 prototype; CI auth detection is extended to recognize Azure DevOps (TF_BUILD) and an explicit E2E_USE_AZ_CLI_AUTH override.
@v1212 v1212 added ext-agents azure.ai.agents extension ext-foundry azure.ai.{agents,connections,inspector,projects,routines,skills,toolboxes}, microsoft.foundry labels Jun 22, 2026
Jian Wu added 2 commits June 22, 2026 19:54
…zSessionActive

The azure-sdk-tests service connection uses Workload Identity Federation, whose
az session is isolated to the task's private AZURE_CONFIG_DIR and expires after
~10 min. Running the ~50 min golden-path test (and the cleanup) as plain bash
steps after a separate login step would fail auth on both counts. Run them
inside AzureCLI@2 with keepAzSessionActive:true (matching build-cli.yml) so the
session stays refreshed and reaches azd (auth.useAzCliAuth) through tmux, which
inherits AZURE_CONFIG_DIR. Subscription/tenant are now read in-script via
az account show instead of cross-step pipeline variables.
test_tier2.py always ran sequentially, but kept a tautological if-condition
(len==1 or len>1), an unused concurrent.futures import, a no-op --serial flag,
and a docstring/print claiming parallel execution. Simplify to an explicit
sequential loop and update the docstring to match. Also fix test_full_e2e.py's
module docstring to point at README.md (LOCAL-TEST-GUIDE.md does not exist).
@v1212 v1212 marked this pull request as ready for review June 22, 2026 12:08
Copilot AI review requested due to automatic review settings June 22, 2026 12:08

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a dedicated Azure DevOps pipeline and supporting Python drivers/docs to run the Tier 2 live golden-path E2E for the azure.ai.agents azd extension (init → provision → deploy → invoke → down) against real Azure resources, outside of the GitHub PR gate.

Changes:

  • Added an on-demand + weekly ADO pipeline (ext-azure-ai-agents-live) to run Tier 2 live E2E for both code and container deploy modes.
  • Added a tmux-driven Python Tier 2 runner (test_tier2.py) and a full golden-path driver (test_full_e2e.py) adapted for ADO CI auth detection.
  • Added documentation for running the live Tier 2 tests locally and in CI.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

File Description
eng/pipelines/ext-azure-ai-agents-live.yml New ADO pipeline wiring to build azd + extension, run Tier 2 live E2E under AzureCLI@2, and publish logs/cleanup.
cli/azd/extensions/azure.ai.agents/tests/e2e-live/test_tier2.py Tier 2 orchestrator: runs code + container golden paths sequentially with isolation and timeout handling.
cli/azd/extensions/azure.ai.agents/tests/e2e-live/test_full_e2e.py tmux-driven end-to-end golden path driver (init/provision/deploy/invoke/down) with CI/local auth switching.
cli/azd/extensions/azure.ai.agents/tests/e2e-live/README.md Docs for CI setup (ADO registration/service connection/secrets) and local WSL execution.

Comment thread eng/pipelines/ext-azure-ai-agents-live.yml Outdated
Comment thread eng/pipelines/ext-azure-ai-agents-live.yml Outdated
Comment thread eng/pipelines/ext-azure-ai-agents-live.yml Outdated
Comment thread cli/azd/extensions/azure.ai.agents/tests/e2e-live/test_full_e2e.py Outdated
Comment thread cli/azd/extensions/azure.ai.agents/tests/e2e-live/test_tier2.py Outdated
Comment thread cli/azd/extensions/azure.ai.agents/tests/e2e-live/test_full_e2e.py Outdated
- Use the ambient azure-sdk org secret `azuresdk-github-pat` for GH_TOKEN
  instead of an empty `GitHubPat` placeholder variable (mirrors eval-waza.yml);
  removes a misleading masked variable and the need for admin PAT setup.
- Harden the AzureCLI@2 inline script: `set -euo pipefail` and assign-then-verify
  subscription/tenant so an `az account show` failure fails fast (a plain
  `export X=$(...)` would have masked the error from set -e).
- Reword the extension-install comment to be self-contained (it no longer
  inaccurately claims to mirror lint-ext-azure-ai-agents.yml).
- Clarify the test_full_e2e.py auth prerequisite: only local WSL runs leave
  auth.useAzCliAuth unset; CI auto-enables az CLI auth.
- Clear tmux scrollback after env setup so the exported GH token cannot leak
  into capture() output on failures/timeouts.
- _cleanup_leaked_resources now checks azd down's return code and reports
  failures instead of always printing "Cleanup complete".

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

Comment thread cli/azd/extensions/azure.ai.agents/tests/e2e-live/test_tier2.py Outdated
Comment thread cli/azd/extensions/azure.ai.agents/tests/e2e-live/test_full_e2e.py Outdated
Comment thread cli/azd/extensions/azure.ai.agents/tests/e2e-live/test_tier2.py Outdated
Comment thread cli/azd/extensions/azure.ai.agents/tests/e2e-live/test_tier2.py Outdated
Comment thread eng/pipelines/ext-azure-ai-agents-live.yml Outdated
- Stream child E2E output live with a watchdog-enforced hard timeout
  instead of buffering everything via capture_output
- Shell-escape the GitHub token (shlex.quote) before exporting in tmux
- Clean up the per-mode AZD_CONFIG_DIR temp copy unless E2E_KEEP_ARTIFACTS
- Use sha256 instead of md5 for the agent-name uniqueness suffix
- Derive the agent binary arch from uname -m instead of hard-coding amd64

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Comment thread cli/azd/extensions/azure.ai.agents/tests/e2e-live/test_full_e2e.py Outdated
Comment thread cli/azd/extensions/azure.ai.agents/tests/e2e-live/test_full_e2e.py Outdated
Comment thread cli/azd/extensions/azure.ai.agents/tests/e2e-live/test_full_e2e.py Outdated
Comment thread cli/azd/extensions/azure.ai.agents/tests/e2e-live/test_tier2.py Outdated
- Shell-escape HOME/PATH/TENANT, the cd target, and the agent name with
  shlex.quote() (consistent with the earlier token fix)
- On Tier 2 timeout, kill the child's detached tmux server so reused CI
  agents do not accumulate orphaned tmux sockets
@v1212 v1212 requested a review from Copilot June 22, 2026 13:18

@jongio jongio left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor nit on the location dispatch case for consistency with the other tightened prompt matches.

Merging main shifted prompt_service.go, leaving the cited line numbers
(143/190/226) pointing at the wrong lines. Drop the brittle file:line
reference and rely on the function name plus the quoted picker messages,
which the case clauses already match exactly.

@jongio jongio left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previous feedback (errors.AsType, t.Context(), yaml.Unmarshal for service name parsing, tightening prompt dispatch matches, logging default-case fall-throughs) all addressed in recent commits. Code quality is solid.

One observation: the PR description mentions a condition: always() AzureCLI@2 step as a "second safety net" for crash-path cleanup, but the pipeline YAML only has PublishPipelineArtifact@1 with condition: always(). The Go .Cleanup(r.teardown) approach is sufficient given the timeout hierarchy (job 150m > step 130m > go test 125m > per-mode 60m), so this isn't a gap in practice, but the description is slightly overstated. Consider updating the PR body to reflect the actual cleanup mechanism (t.Cleanup + LIFO ordering) or adding the explicit cleanup step if you want the belt-and-suspenders guarantee for cases where the Go process is killed before t.Cleanup runs.

No blocking issues. The event-driven prompt loop design, the timeout hierarchy, and the �ssertSafeTestDir guard are all well thought out.

ab890b1 tightened the subscription case to has("select an azure
subscription"), but that string is the extension's fmt.Println preamble,
not the survey "?" line activePrompt reads. ensureSubscription passes an
empty PromptSubscriptionRequest, so the picker renders azd-core's default
message "Select subscription" (prompt_service.go:507). The tightened match
missed it, sending the prompt to the default Enter and selecting the wrong
subscription in a live run. Match "select subscription" instead.

@jongio jongio left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incremental commit fixes the subscription dispatch to match azd-core's actual picker message (Select subscription from prompt_service.go:507) rather than the extension's preamble text. Verified: activePrompt lowercases before dispatch, so the case-sensitive has() correctly matches.

Minor note for future awareness: select subscription (singular) is also a prefix/substring of select subscriptions to include in the filter (config_sub_filter.go:188). This won't cause false matches since that multi-subscription filter prompt doesn't appear in the agent init flow, but worth keeping in mind if future init steps add subscription filtering.

@RickWinter RickWinter left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds a Tier 2 live golden-path test (init/provision/deploy/invoke/down) for the azure.ai.agents extension, driven through a real PTY with go-expect + vt10x, plus an on-demand/weekly Azure DevOps pipeline that runs it against the TME subscription. The shape is right: it is gated behind AZURE_AI_AGENTS_E2E_LIVE, kept out of the PR pipeline, runs the two deploy modes sequentially, and registers an azd down teardown with a condition: always() safety net so a crash does not leak billable resources. go.sum already carries the three new direct dependencies, so the extension module still builds, and the answer matcher is well-factored and covered on its own by assert_test.go.

One thing is worth tightening before a green run is trusted as signal: the invoke assertion matches a standalone "4" against the entire combined CLI output rather than the agent's actual response, so incidental azd chatter (an elapsed-seconds token, a count, a step number) could turn the check green while the agent is broken. That is the difference between a live test that verifies the agent and one that mostly verifies the pipeline ran, so I would scope it to the response payload.

The rest are non-blocking. The numerous file:line source citations will rot silently the first time those files are edited, and the pipeline hand-writes the extension manifest into config.json where it can drift from the real one. Nothing here runs in PR CI, so the blast radius is bounded to the manual or scheduled live run. None of it blocks merge; I would fix the invoke matcher scope before relying on a green run and treat the other two as follow-ups.

Comment thread cli/azd/extensions/azure.ai.agents/tests/e2e-live/tier2_live_test.go Outdated
Comment thread cli/azd/extensions/azure.ai.agents/tests/e2e-live/tier2_live_test.go Outdated
Comment thread eng/pipelines/ext-azure-ai-agents-live.yml Outdated

@jongio jongio left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed RickWinter's invoke assertion concern (replied on his thread with details). responseHasExpectedAnswer runs on the full combined stdout+stderr stream, and patterns like "4 seconds" or "step 4:" pass the standalone-four check since space/colon aren't word runes. Consider adding -o raw to the invoke args so the assertion is scoped to the server response payload only.

No other new findings. The event-driven dispatch loop, layered timeout hierarchy (per-mode 60m < go test 125m < ADO step 130m < job 150m), and LIFO cleanup ordering are correctly structured. Prior review feedback has been addressed in the current HEAD.

Jian Wu added 2 commits June 29, 2026 10:49
Match only the model's reply ([agent] ... up to 'Server responded in')
so stray digits elsewhere in CLI output can't false-pass; replace stale
source line numbers in dispatch comments with function names.
Derive the extension manifest fields from extension.yaml so they cannot drift from a hand-maintained copy; only path/source/test version injected.

@jongio jongio left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incremental review of commit 3c0f717 (scopes the 2+2 invoke assertion to the agent response region).

The agentResponseRegion approach is correct: slicing between [<agent>] and "Server responded in" cleanly isolates the model's answer from surrounding CLI noise (model names, status codes, timing info). Fallback to full text when markers are absent is a reasonable defensive choice since hasStandaloneFour already rejects most false-positive patterns.

The yq-from-extension.yaml change in the pipeline eliminates the manifest drift problem. Comment references now use function names instead of fragile line numbers. Both solid improvements.

One suggestion on test coverage below.

Comment thread cli/azd/extensions/azure.ai.agents/tests/e2e-live/assert_test.go

@jongio jongio left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incremental review of commit 3c0f717 (scopes the 2+2 invoke assertion to the agent response region).

The agentResponseRegion approach is correct: slicing between [<agent>] and "Server responded in" cleanly isolates the model's answer from surrounding CLI noise (model names, status codes, timing info). Fallback to full text when markers are absent is a reasonable defensive choice since hasStandaloneFour already rejects most false-positive patterns.

The yq-from-extension.yaml change in the pipeline eliminates the manifest drift problem. Comment references now use function names instead of fragile line numbers. Both solid improvements.

One suggestion on test coverage below.

Comment thread cli/azd/extensions/azure.ai.agents/tests/e2e-live/assert_test.go
Per review: verify a real standalone digit ahead of the [agent] line is
excluded by the response-region slicing.

@jongio jongio left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been reviewing this PR across multiple rounds. Here's my final assessment of the current state:

The Go PTY-based test driver is well-structured. The event-driven prompt dispatch (waitForQuiet + screen-read + dispatch) is a significant improvement over fixed-interval polling, and the loop detection (promptKey + repeat counter) prevents silent hangs when prompts change upstream.

The answer assertion logic (assert.go) correctly handles the tricky problem of detecting a standalone "4" without false-positiving on model names, versions, or status codes. The agentResponseRegion scoping adds a solid second layer of defense.

Timeout hierarchy is correctly ordered (per-mode runTimeout < go test -timeout < ADO step < ADO job) so the in-test teardown always gets a chance to run before anything force-kills the process.

My prior comments were all non-blocking suggestions (broader keyword-match tightening in dispatchPrompt, an additional test case for region-scoping in assert_test.go). None are blockers.

One note for the broader reviewer group: the go.mod adds go-expect, creack/pty, and vt10x at 2022-era commits. These are stable, widely-used packages for PTY testing in Go (used by AlecAivazis/survey itself). They don't have newer tagged releases that would be materially better.

@v1212

v1212 commented Jun 29, 2026

Copy link
Copy Markdown
Collaborator Author

Thanks @RickWinter @jongio — all threads addressed and resolved. This round:

  • invoke 2+2 check scoped to the agent reply region (agentResponseRegion), so azd's own "4 warnings/4s" can't false-pass. 3c0f717
  • dispatch comments name source functions instead of stale file:line. 3c0f717
  • live pipeline generates config.json from extension.yaml via yq (no drift). 517544a
  • added the standalone-4-before-[agent] region test. 2d9dfb7

CI's green.

Comment thread eng/pipelines/ext-azure-ai-agents-live.yml

@jongio jongio left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One gap to close before merge: the EngSys DeleteAfter tag on resource groups (as @danieljurek noted). The in-test azd down and the condition: always() safety-net cover the happy and crash paths, but if both fail (VM hard-kill, network timeout during teardown), resources leak with no EngSys visibility.

Suggested fix: after phaseProvision returns, shell out to tag the resource group:

// inside runner, after phaseProvision succeeds:
rgName := r.resolveResourceGroup() // azd env get-values | grep AZURE_RESOURCE_GROUP
if rgName != "" {
    deleteAfter := time.Now().Add(48 * time.Hour).UTC().Format(time.RFC3339)
    r.runCmd(ctx, "az", "tag", "update",
        "--resource-id", "/subscriptions/"+sub+"/resourceGroups/"+rgName,
        "--operation", "merge", "--tags", "DeleteAfter="+deleteAfter)
}

This way the EngSys cleanup pipeline can find and purge any survivors. The 48h window gives plenty of room for the normal teardown to succeed first.

The rest of the implementation is solid: event-driven prompt loop, proper timeout hierarchy, LIFO cleanup ordering, and the agentResponseRegion scoping all look correct.

Comment thread eng/pipelines/ext-azure-ai-agents-live.yml
…eanup

Resolve cspell.yaml (union word lists) and go.mod (keep main's otel/x-deps
bumps and the temporary azd replace; re-add the PTY test deps) conflicts.

Stamp a DeleteAfter cleanup tag (now+48h, RFC3339 UTC) on the provisioned
resource group so the EngSys garbage collector reclaims it even if the
post-run `azd down` teardown never runs (agent crash, post-timeout budget
exhaustion, lost network). Tagged in two places:
- the Go driver, immediately after a successful `azd provision`
- the pipeline cleanup loop, before `azd down`

Uses `az group update --set tags.DeleteAfter=` so the single tag is merged
without clobbering azd's own tags (e.g. azd-env-name). Addresses review
feedback from @danieljurek and @jongio.

@jongio jongio left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incremental review against 6c66c25 (new commits since my last review at 2d9dfb7).

The DeleteAfter tagging implementation addresses my prior concern and @danieljurek's feedback. Both the in-test tag (after successful provision) and the pipeline cleanup step correctly use RFC 3339 UTC format per the EngSys spec. One non-blocking edge case noted inline.

Jian Wu added 2 commits June 30, 2026 09:48
A provision that times out before persisting AZURE_RESOURCE_GROUP would
leave its resource group untagged, so the EngSys cleanup pipeline could
not reclaim it if azd down also failed. Enumerate azd envs on disk
(.azure/<env>) and tag the deterministic rg-<env> group, which exists
from the start of provisioning, closing the only permanent-leak gap.

@jongio jongio left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incremental review of 2 new commits since my last pass (6c66c25).

30dfb29 (SplitSeq): Correct go fix modernization. strings.SplitSeq returns a lazy iterator, semantically identical to the previous strings.Split usage.

e1c7ab8 (cleanup tagging): The pipeline safety-net step now enumerates .azure/*/ directories to construct
g- directly, instead of reading AZURE_RESOURCE_GROUP from �zd env get-values. This closes the edge case I flagged in my last review: if provision times out before persisting env values, the old approach would skip tagging entirely. The hard-coded
g- convention is safe here since the test always uses default azd naming and the || true guard makes it best-effort.

All prior findings (errors.AsType, t.Context(), yaml.Unmarshal for service parsing, stderr visibility, prompt dispatch tightening, DeleteAfter tagging) are addressed.

No new issues.

@v1212

v1212 commented Jun 30, 2026

Copy link
Copy Markdown
Collaborator Author

@vhvb1989 @danieljurek @trangevi could you help re-approve the PR and merge? latest commits addressed some nit issues, and they staled the earlier approvals. thanks.

@jongio jongio left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All prior review concerns (response region scoping, function-name anchors instead of file:line, yq-generated config.json, DeleteAfter tagging including the provision-timeout case) are addressed in the latest commits.

One low-severity note on the test dependencies (non-blocking).

Comment thread cli/azd/extensions/azure.ai.agents/go.mod

@jongio jongio left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incremental review of 990ffa5 (bumps creack/pty from v1.1.17 to v1.1.24).

This addresses my prior comment about starting on the latest stable version for new test infrastructure. v1.1.24 is a legitimate release (Oct 2024), the pty.Open/Setsize API is unchanged, and the author confirmed clean Linux build/vet/lint.

No new issues found across the full changeset. Prior concerns (response region scoping, function-name anchors instead of file:line, yq-generated config.json, DeleteAfter tagging for the provision-timeout case) are all resolved in earlier commits.

@huimiu huimiu merged commit 0c6294c into main Jun 30, 2026
27 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ext-agents azure.ai.agents extension ext-foundry azure.ai.{agents,connections,inspector,projects,routines,skills,toolboxes}, microsoft.foundry

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add automated test infrastructure (PR-gate + live E2E) for the azure.ai.agents extension

8 participants