Skip to content

Stop assigning Azure AI User role to per-agent managed identity after deploy#8941

Merged
m5i-work merged 2 commits into
mainfrom
m5i-work-drop-agent-identity-role-assignment
Jul 3, 2026
Merged

Stop assigning Azure AI User role to per-agent managed identity after deploy#8941
m5i-work merged 2 commits into
mainfrom
m5i-work-drop-agent-identity-role-assignment

Conversation

@m5i-work

@m5i-work m5i-work commented Jul 2, 2026

Copy link
Copy Markdown
Member

Why

When deploying a hosted agent, the azure.ai.agents extension assigned the Azure AI User role to each hosted agent's per-agent managed identity in the post-deploy step. That client-side assignment is now redundant: the Microsoft Foundry service grants the per-agent identity its required permissions internally. Worse, it failed noisily when the deploying user lacked Microsoft.Authorization/roleAssignments/write, blocking otherwise-successful deploys for users who only hold data-plane roles (for example Foundry User).

This mirrors the equivalent change already made in the Foundry service tooling (microsoft/Skylight#4910).

What changed

  • Post-deploy handler (listen.go): dropped the block that fetched the agent version, read its instance-identity principal ID, and called EnsureAgentIdentityRBAC. The handler still sets up the project endpoint and credential and does optimization reporting.
  • RBAC write path (agent_identity_rbac.go): removed EnsureAgentIdentityRBAC and its helpers. Kept the shared helpers (parseAgentIdentityInfo, assignRoleToIdentity, extractSubscriptionID, roleAzureAIUser) that are still used by the separate developer pre-flight RBAC check.
  • Doctor: removed the remote.agent-identity-roles check (and its query module agent_identity_query.go). Because the service no longer creates ARM role assignments, that check enumerated nothing and folded every agent into a false aggregate failure. Updated the remote-checks contract test (6 -> 5 checks), the remediation switch, and stale doc comments.

Scope note

This PR only removes the redundant Foundry Azure AI User assignment. ACR role assignment behavior (including ABAC-aware roles) is intentionally left untouched and out of scope.

Verification

  • go build, go vet, and the full go test ./... suite pass for the extension.
  • Live end-to-end: deployed the echo-dual dual-protocol hosted container agent against an existing Foundry project. The deploy produced no client-side RBAC banner, the agent became active and responded to azd ai agent invoke (responses protocol), and azd ai agent doctor no longer lists remote.agent-identity-roles.

Fixes #8940

@github-actions github-actions Bot added the ext-agents azure.ai.agents extension label Jul 2, 2026
The Foundry service now grants each hosted agent's per-agent managed identity its required permissions internally, so the client-side Azure AI User role assignment in the post-deploy handler was redundant. It also failed noisily when the deploying user lacked roleAssignments/write, blocking deploys for users holding only data-plane roles.

Remove the post-deploy role-assignment path and the now-false remote.agent-identity-roles doctor check, which enumerated ARM role assignments the service no longer creates and therefore reported false failures.

Fixes #8940

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
@m5i-work m5i-work force-pushed the m5i-work-drop-agent-identity-role-assignment branch from 7021e1c to b1cf34f Compare July 2, 2026 09:41
@m5i-work m5i-work marked this pull request as ready for review July 2, 2026 09:49
Copilot AI review requested due to automatic review settings July 2, 2026 09:49
@github-actions

github-actions Bot commented Jul 2, 2026

Copy link
Copy Markdown

📋 Prioritization Note

Thanks for the contribution! The linked issue isn't in the current milestone yet.
Thank you for logging this issue; our team is reviewing it. If you need urgent prioritization, tag @RickWinter and @kristenwomack to let us know.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR removes the redundant client-side assignment of the Azure AI User role to each hosted agent's per-agent managed identity in the azure.ai.agents extension's post-deploy step. The Microsoft Foundry service now grants the per-agent identity its required permissions internally, so the client-side assignment was both unnecessary and harmful — it failed noisily (blocking otherwise-successful deploys) when the deploying user lacked Microsoft.Authorization/roleAssignments/write. It also removes the now-always-false remote.agent-identity-roles doctor check that enumerated ARM role assignments the service no longer creates.

Changes:

  • Dropped the post-deploy RBAC block in listen.go (fetching agent version + EnsureAgentIdentityRBAC), keeping endpoint/credential setup for optimization reporting.
  • Removed EnsureAgentIdentityRBAC/verifyRoleAssignment and the entire agent_identity_query.go, while preserving shared helpers (parseAgentIdentityInfo, assignRoleToIdentity, extractSubscriptionID, roleAzureAIUser) still used by the developer pre-flight RBAC check.
  • Removed the remote.agent-identity-roles doctor check and its wiring, updating the contract test (6 → 5 checks), the remediation switch, test seams, and stale doc comments.

I verified via repo-wide search that no references to the removed symbols (EnsureAgentIdentityRBAC, QueryAgentIdentityRoles, newCheckAgentIdentityRoles, probeAgentPrincipal, queryAgentIdentityRoles) remain, that all kept helpers/struct fields are still consumed, and that the agent_api import was correctly dropped from listen.go while toServiceKey remains used elsewhere.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated no comments.

Show a summary per file
File Description
internal/cmd/listen.go Removed post-deploy agent-identity RBAC block and unused agent_api import; retained endpoint/credential setup for optimization reporting
internal/project/agent_identity_rbac.go Removed EnsureAgentIdentityRBAC, ensureAgentIdentityRBACWithCred, ensureSingleAgentRBAC, verifyRoleAssignment, and now-unused imports/consts; kept shared helpers
internal/project/agent_identity_query.go Deleted entirely (read-side role enumeration no longer needed)
internal/cmd/doctor/checks_agent_identity_roles.go Deleted the remote.agent-identity-roles check implementation
internal/cmd/doctor/checks_agent_identity_roles_test.go Deleted the corresponding tests
internal/cmd/doctor/checks_remote.go Removed check registration; added explanatory note
internal/cmd/doctor/checks_remote_test.go Updated contract test to expect 5 checks and new ordering
internal/cmd/doctor/checks_local.go Removed probeAgentPrincipal / queryAgentIdentityRoles test seams from Dependencies
internal/cmd/doctor/types.go Updated StatusInfo doc comment to drop the removed check as its example
internal/cmd/doctor/shared_test.go Updated helper comment referencing the removed check
internal/cmd/doctor_format.go Removed remote.agent-identity-roles from the remediation switch

@jongio jongio left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean removal of the redundant Azure AI User role assignment from the post-deploy path. The Foundry service handles these permissions internally now, so the client-side write (and the doctor check that enumerated them) are correctly removed.

Verified:

  • Shared helpers (parseAgentIdentityInfo, �ssignRoleToIdentity, �xtractSubscriptionID,
    oleAzureAIUser) are still used by the developer pre-flight RBAC check in developer_rbac_check.go
  • listen.go post-deploy flow remains correct: credential is still created for the optimization reporting that follows
  • Remote checks contract test correctly updated (6 to 5 checks)
  • No dead imports or unreferenced code left behind
  • ACR role assignment behavior intentionally untouched (out of scope)

@RickWinter RickWinter left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This drops the client-side Azure AI User role assignment for per-agent managed identities (the post-deploy step and the doctor remote.agent-identity-roles check), on the basis that Foundry now grants that permission server-side and the client assignment only produced noisy failures for data-plane-only users. The direction is right and the deletion is clean: the shared helpers (parseAgentIdentityInfo, assignRoleToIdentity, extractSubscriptionID, roleAzureAIUser) are correctly retained because developer_rbac_check.go still uses them, endpointResp and cred remain consumed by optimization reporting so nothing goes unused, and the remote-checks contract test is updated from 6 to 5.

One follow-up worth considering: with RBAC gone, the endpoint/tenant/credential resolution in postdeployHandler now feeds only best-effort optimization reporting, yet it still returns hard errors. That is a heavier failure mode than the remaining feature warrants. Not a blocker. The ACR scope note is correct to leave that path untouched.

Comment thread cli/azd/extensions/azure.ai.agents/internal/cmd/listen.go
…metry env

Addresses PR #8941 review follow-up: with the client-side agent-identity RBAC
assignment removed, the endpoint/tenant/credential resolution in
postdeployHandler now feeds only best-effort optimization reporting (wrapped in
recover, never propagated). Downgrade the FOUNDRY_PROJECT_ENDPOINT /
AZURE_TENANT_ID / credential setup guards from hard errors to logged warnings +
return nil, so missing optimization telemetry inputs can no longer fail an
otherwise-successful deploy.

Adds TestPostdeployHandler_MissingTelemetryEnv_ReturnsNil covering the missing/
empty endpoint and missing tenant cases.
@m5i-work m5i-work enabled auto-merge (squash) July 3, 2026 05:03
@m5i-work m5i-work merged commit 19c2319 into main Jul 3, 2026
26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ext-agents azure.ai.agents extension

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Stop assigning Azure AI User role to per-agent managed identity after deploy

6 participants