Bug Report
Follow up #6883 (review)
Description
When running the error handling workflow with the new Copilot SDK changes, the retry of the provision step fails with an authentication context cancellation error. The authentication request to the Microsoft identity platform is canceled before it can complete.
Error Message
ERROR: error executing step command 'provision': failed to authenticate: server response error:
Get "https://login.microsoftonline.com/common/discovery/instance?api-version=1.1&authorization_endpoint=https%3A%2F%2Flogin.microsoftonline.com%2Forganizations%2Foauth2%2Fv2.0%2Fauthorize": context canceled
Steps to Reproduce
- Enable the LLM alpha feature: azd config set alpha.llm on
- Run azd up (or a command that triggers provisioning)
- Trigger the error handling workflow (e.g., a failing provision step that engages the Copilot agent)
- After the Copilot agent generates troubleshooting guidance and the user opts to retry, the retried provisioning command fails with the authentication "context canceled" error
Expected Behavior
The retry loop in the error handling middleware should properly re-establish authentication context. The authentication HTTP request to login.microsoftonline.com should complete successfully without context cancellation.
Actual Behavior
The authentication request is canceled mid-flight, suggesting the context passed to the authentication layer during the retry is already canceled or gets canceled prematurely.
Analysis
The error originates from the MSAL HTTP client making a discovery request to login.microsoftonline.com. The "context canceled" message reflects an HTTP-level cancellation, not a user-initiated Ctrl+C.
Potential root causes:
- Context lifecycle in retry loop: In cmd/middleware/error.go, when the error middleware retries the original command (line ~287: actionResult, err = next(ctx)), the context may inherit cancellation from the previous failed attempt or the agent session teardown.
- Agent session context leakage: The azdAgent.Stop() is deferred, but the Copilot agent session may share or affect the parent context used for the retry. If the agent's internal context propagates cancellation upstream, the retried command's auth flow would see a canceled context.
- shouldSkipErrorAnalysis classification: The error contains the string "context canceled" but is wrapped as an auth/HTTP error, not as context.Canceled. This means errors.Is(err, context.Canceled) in shouldSkipErrorAnalysis() may not match, causing the middleware to attempt AI analysis on a fundamentally broken context, compounding the issue.
- classifyError handling: If the error is not wrapped as *auth.AuthFailedError or *auth.ReLoginRequiredError, it may be classified as AzureContextAndOtherError instead of UserContextError, leading to an inappropriate automated fix attempt.
Environment
- OS: Windows
- Feature flags: alpha.llm on
- Command: azd up (provision step)
Relevant Code
cli/azd/cmd/middleware/error.go — Error handling middleware and retry loop
cli/azd/pkg/auth/ — Authentication flow and token acquisition
cli/azd/internal/agent/copilot_agent.go — Copilot agent session management