feat: Add validation-provider extension capability for preflight checks #8656
Add a new 'validation-provider' capability that allows extensions to register validation checks dispatched during bicep provider local preflight. This enables extensions to contribute custom validation rules alongside the built-in checks.
Key changes:
- New gRPC validation protocol with chunked context delivery for large templates
- Extension-side SDK: ValidationCheckProvider interface, ValidationContext with typed accessors (ARMTemplate, PredictedResources, ResourcesSnapshot, etc.)
- Core-side grouped dispatch: sends context once per extension, runs all checks
- PredictedResource exported SDK type for extensions to parse resource data
- Automatic context building via validationContext.extensionContext()
- Telemetry: new PreflightExtensionRulesKey for extension check metrics
- Demo extension check using ParsePredictedResources() for E2E validation
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Adds a new validation-provider extension capability and end-to-end plumbing so extensions can register and run validation checks during the Bicep provider’s local preflight pipeline, with chunked context delivery to avoid gRPC message-size limits.
Changes:
- Introduces a new validation gRPC protocol/service and extension-side SDK types (context, provider interface, typed predicted resources).
- Integrates extension check dispatch into `BicepProvider.validatePreflight()` and records extension rule IDs separately for telemetry.
- Updates extension registry validation, docs, and adds a demo extension check + tests.
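To make the extension side of these changes concrete, here is a minimal sketch of the kind of check the demo extension is described as running: parse the predicted resources and report how many there are. The `predictedResource` and `checkResult` types, the JSON shape, and the function name are local stand-ins invented for illustration. The real `PredictedResource` and `ValidationCheckProvider` definitions live in `pkg/azdext` and may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// predictedResource is a local stand-in (assumption); the SDK's
// PredictedResource type may have different fields and JSON tags.
type predictedResource struct {
	Type string            `json:"type"`
	Name string            `json:"name"`
	Tags map[string]string `json:"tags,omitempty"`
}

// checkResult is a hypothetical result shape for this sketch only.
type checkResult struct {
	RuleID  string
	Message string
}

// countResourcesCheck parses a JSON array of predicted resources and
// returns a single informational result carrying the resource count.
func countResourcesCheck(predictedJSON []byte) (checkResult, error) {
	var resources []predictedResource
	if err := json.Unmarshal(predictedJSON, &resources); err != nil {
		return checkResult{}, fmt.Errorf("parsing predicted resources: %w", err)
	}
	return checkResult{
		RuleID:  "demo_warning",
		Message: fmt.Sprintf("template predicts %d resource(s)", len(resources)),
	}, nil
}

func main() {
	payload := []byte(`[{"type":"Microsoft.Storage/storageAccounts","name":"st1"}]`)
	res, err := countResourcesCheck(payload)
	if err != nil {
		fmt.Println("check failed:", err)
		return
	}
	fmt.Println(res.RuleID, "-", res.Message)
}
```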
Reviewed changes
Copilot reviewed 30 out of 35 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| cli/azd/pkg/infra/provisioning/validation_dispatcher.go | Adds dispatcher interface to decouple provisioning from gRPC server implementation. |
| cli/azd/pkg/infra/provisioning/bicep/local_preflight.go | Builds extension validation context from local preflight inputs/snapshot. |
| cli/azd/pkg/infra/provisioning/bicep/bicep_provider.go | Dispatches extension validation checks and records extension rule IDs to telemetry. |
| cli/azd/pkg/infra/provisioning/bicep/bicep_provider_coverage_test.go | Updates tests for changed validate() signature. |
| cli/azd/pkg/grpcbroker/message_broker.go | Adds ResourceExhausted detection helper and invokes it on send/recv paths. |
| cli/azd/pkg/extensions/validate_registry.go | Allows validation-provider in capability validation list. |
| cli/azd/pkg/extensions/registry.go | Defines new ValidationProviderCapability constant. |
| cli/azd/pkg/azdext/validation.pb.go | Generated protobuf types for validation protocol. |
| cli/azd/pkg/azdext/validation_test.go | Unit tests for envelope/context helpers and chunk assembler behavior. |
| cli/azd/pkg/azdext/validation_provider.go | Adds extension SDK types: ValidationContext, ValidationCheckProvider, PredictedResource, context keys. |
| cli/azd/pkg/azdext/validation_manager.go | Implements extension-side registration, context chunk assembly, and request dispatch. |
| cli/azd/pkg/azdext/validation_grpc.pb.go | Generated gRPC client/server code for validation service. |
| cli/azd/pkg/azdext/validation_envelope.go | Adds message envelope implementation for validation broker messages. |
| cli/azd/pkg/azdext/prompt.pb.go | Updates comments around quota semantics. |
| cli/azd/pkg/azdext/extension_host.go | Wires validation manager + registration lifecycle into extension host. |
| cli/azd/pkg/azdext/azd_client.go | Adds validation client and enables gzip compression by default. |
| cli/azd/pkg/azdext/ai_model.pb.go | Updates comments around quota semantics. |
| cli/azd/internal/tracing/fields/fields.go | Adds PreflightExtensionRulesKey telemetry attribute. |
| cli/azd/internal/grpcserver/validation_service.go | Implements core validation service: stream auth/capability check, registration, context chunking, dispatch. |
| cli/azd/internal/grpcserver/validation_service_test.go | Adds tests for registration validation and no-check dispatch behavior. |
| cli/azd/internal/grpcserver/server.go | Registers validation service and imports gzip support. |
| cli/azd/internal/grpcserver/server_test.go | Updates server ctor usage for new service parameter. |
| cli/azd/internal/grpcserver/server_coverage3_test.go | Updates NewServer call signature in coverage test. |
| cli/azd/internal/grpcserver/prompt_service_test.go | Updates test server setup for new service parameter. |
| cli/azd/grpc/proto/validation.proto | Defines validation protocol (registration, chunked context delivery, execution, results). |
| cli/azd/extensions/microsoft.azd.demo/internal/project/demo_validation_check.go | Adds demo validation check implementation using predicted resources parsing. |
| cli/azd/extensions/microsoft.azd.demo/internal/project/demo_validation_check_test.go | Tests demo validation check message behavior and predicted resource parsing. |
| cli/azd/extensions/microsoft.azd.demo/internal/cmd/root.go | Adds root annotations and reserved flags for demo extension. |
| cli/azd/extensions/microsoft.azd.demo/internal/cmd/listen.go | Registers demo validation check with extension host. |
| cli/azd/extensions/microsoft.azd.demo/extension.yaml | Bumps demo extension version and declares validation-provider capability. |
| cli/azd/docs/extensions/extension-framework.md | Documents validation-provider capability and usage example. |
| cli/azd/docs/design/local-preflight-validation.md | Documents extension-provided checks and context keys in local preflight design doc. |
| cli/azd/cmd/middleware/extensions.go | Includes validation capability in startup set and logs extension-reported error on readiness failure. |
| cli/azd/cmd/container.go | Registers validation service and binds dispatcher interface in IoC container. |
Files not reviewed (4)
- cli/azd/pkg/azdext/ai_model.pb.go: Generated file
- cli/azd/pkg/azdext/prompt.pb.go: Generated file
- cli/azd/pkg/azdext/validation.pb.go: Generated file
- cli/azd/pkg/azdext/validation_grpc.pb.go: Generated file
- Only set arm_parameters in extensionContext when serialization succeeds
- Fix doc example to match actual ValidationCheckProvider interface signature
- Replace panicOnResourceExhausted with wrapResourceExhausted (returns error)
- Add PreflightExtensionRulesKey to telemetry-data.md reference docs
- Suppress broker response for nil handler returns (intermediate chunks)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Copilot reviewed 31 out of 36 changed files in this pull request and generated 8 comments.
- Track invoked rule IDs via dispatcher return (not from diagnostic results)
- Remove type assertion in container.go DI; use concrete *ValidationService
- Wrap original gRPC error in wrapResourceExhausted (multiple %w)
- Fix tab alignment in server.go struct
- Make demo RuleID consistent with DiagnosticId (demo_warning)
- Add cache eviction after validation check completes
- Reject duplicate registrations on server side
- Fix Register race: check+set under same Lock
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
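The "multiple %w" item relies on `fmt.Errorf` accepting more than one `%w` verb (Go 1.20 and later), so callers can match either the sentinel or the original error with `errors.Is`. The sketch below shows that pattern in isolation. The sentinel's message, `wrapBoth`, `matchesBoth`, and `errSample` are assumptions for illustration, not the broker's actual `wrapResourceExhausted` code.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrResourceExhausted stands in for the broker-level sentinel; its text
// here is an assumption.
var ErrResourceExhausted = errors.New("message exceeds gRPC size limit")

// errSample plays the role of the original gRPC error in this sketch.
var errSample = errors.New("rpc error: code = ResourceExhausted")

// wrapBoth wraps the sentinel and the original error together so that
// neither is lost from the error chain.
func wrapBoth(original error) error {
	return fmt.Errorf("%w: %w", ErrResourceExhausted, original)
}

// matchesBoth reports whether err matches the sentinel and the original.
func matchesBoth(err, original error) bool {
	return errors.Is(err, ErrResourceExhausted) && errors.Is(err, original)
}

func main() {
	err := wrapBoth(errSample)
	fmt.Println(err)
	fmt.Println("matches both:", matchesBoth(err, errSample))
}
```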
Pull request overview
Copilot reviewed 31 out of 36 changed files in this pull request and generated 2 comments.
- Use reference counting for context eviction instead of immediate delete
- Track invoked rule IDs only after successful dispatch (not upfront)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Copilot reviewed 31 out of 36 changed files in this pull request and generated 3 comments.
- Move invokedRuleIDs append after error check (only track successful dispatches)
- Clear contextRefCounts in Close() for consistency
- Use status.Errorf(codes.AlreadyExists) for duplicate registration errors
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Copilot reviewed 31 out of 36 changed files in this pull request and generated 2 comments.
- Return concrete *ValidationService from constructor; add interface binding
- Use defer for context ref count decrement to handle error paths
- Guard against negative ref counts when context wasn't cached
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Copilot reviewed 31 out of 36 changed files in this pull request and generated 3 comments.
- Remove global gzip compression default for backward compatibility
- Clone registeredKeys under mutex before cleanup iteration (race fix)
- Enforce rule_id uniqueness across all extensions (not just per-extension)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
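The last two items (clone keys under the mutex, enforce `rule_id` uniqueness across extensions) can be sketched with a small registry. This is an illustrative model, not the code in `validation_service.go`: `ruleRegistry`, its methods, and the extension IDs are hypothetical names, and a plain error replaces the gRPC `AlreadyExists` status so that the example needs only the standard library.

```go
package main

import (
	"fmt"
	"sync"
)

// ruleRegistry maps each rule ID to the extension that registered it.
type ruleRegistry struct {
	mu     sync.Mutex
	owners map[string]string // rule_id -> extension ID
}

func newRuleRegistry() *ruleRegistry {
	return &ruleRegistry{owners: make(map[string]string)}
}

// register performs the existence check and the write under the same lock,
// so two concurrent registrations of one rule ID cannot both succeed.
func (r *ruleRegistry) register(extensionID, ruleID string) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if owner, exists := r.owners[ruleID]; exists {
		return fmt.Errorf("rule %q already registered by extension %q", ruleID, owner)
	}
	r.owners[ruleID] = extensionID
	return nil
}

// unregisterExtension copies the extension's rule IDs while holding the
// lock, then iterates over the copy, so cleanup never ranges over a map
// that another goroutine may be modifying. It returns the number removed.
func (r *ruleRegistry) unregisterExtension(extensionID string) int {
	r.mu.Lock()
	var owned []string
	for ruleID, owner := range r.owners {
		if owner == extensionID {
			owned = append(owned, ruleID)
		}
	}
	r.mu.Unlock()

	for _, ruleID := range owned {
		r.mu.Lock()
		delete(r.owners, ruleID)
		r.mu.Unlock()
	}
	return len(owned)
}

func main() {
	reg := newRuleRegistry()
	fmt.Println(reg.register("ext.a", "demo_warning"))
	fmt.Println(reg.register("ext.b", "demo_warning"))
	fmt.Println("removed:", reg.unregisterExtension("ext.a"))
}
```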
Pull request overview
Copilot reviewed 31 out of 36 changed files in this pull request and generated 1 comment.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Copilot reviewed 31 out of 36 changed files in this pull request and generated 3 comments.
- Clarify extensionContext comment (not all fields sent, only extension-relevant)
- Add grpc.UseCompressor hint to ResourceExhausted error message
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Run gofmt on bicep_provider.go and local_preflight.go
- Fix gosec G115 integer overflow: use int32 for totalKeys upfront
- Fix cspell: 'acks' -> 'acknowledges', 'azd\'s' -> 'the azd'
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- TestListenCapabilities: expect 5 capabilities (added validation-provider)
- Test_PackageLevelErrorsMapped: exclude ErrResourceExhausted (broker-level)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
📋 Prioritization Note
Thanks for the contribution! The linked issue isn't in the current milestone yet.
Add tests to address code coverage gate failures:
- grpcbroker: wrapResourceExhausted, invokeHandler nil suppression, processHandlerRequest paths
- grpcserver: DispatchChecks with mock broker, sendContextChunks, duplicate registration rejection
- azdext: ParsePredictedResources, onPrepareContextChunk reassembly, onValidationCheck with ref-counting, Close, getOrCreateProvider
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Cover the previously-untested streaming paths (ensureStream, registerHandlers, Register, Receive, Ready) using a bufconn-backed fake ValidationServiceServer. Raises pkg/azdext coverage to resolve the coverage gate regression. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
jongio left a comment
Solid extension capability design. The chunked protocol handles the 4MB gRPC limit cleanly, the grouped dispatch (context once per extension, checks in parallel) is efficient, and the interface separation via ValidationCheckDispatcher keeps the bicep provider decoupled from gRPC internals.
One concurrency concern in the chunk assembly path worth considering (details inline).
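For readers unfamiliar with the chunking idea praised above, the following sketch splits one context value into size-bounded pieces and reassembles them in arrival order. It is a simplified model, not the wire format in `validation.proto`: the `chunk` struct, its field names, and the tiny 4-byte limit are assumptions chosen to keep the example readable.

```go
package main

import (
	"bytes"
	"fmt"
)

// chunk is a hypothetical stand-in for one context-delivery message.
type chunk struct {
	Key    string
	Data   []byte
	IsLast bool // true on the final piece of this key's value
}

// splitValue cuts a value into pieces of at most maxBytes each. An empty
// value still yields one chunk so the receiver learns that the key exists.
func splitValue(key string, value []byte, maxBytes int) []chunk {
	if maxBytes <= 0 {
		panic("maxBytes must be positive")
	}
	if len(value) == 0 {
		return []chunk{{Key: key, IsLast: true}}
	}
	var out []chunk
	for start := 0; start < len(value); start += maxBytes {
		end := start + maxBytes
		if end > len(value) {
			end = len(value)
		}
		out = append(out, chunk{Key: key, Data: value[start:end], IsLast: end == len(value)})
	}
	return out
}

// assemble concatenates chunk payloads in the order they were received.
func assemble(chunks []chunk) []byte {
	var buf bytes.Buffer
	for _, c := range chunks {
		buf.Write(c.Data)
	}
	return buf.Bytes()
}

// roundTrips reports whether split followed by assemble reproduces value.
func roundTrips(value string, maxBytes int) bool {
	return string(assemble(splitValue("arm_template", []byte(value), maxBytes))) == value
}

func main() {
	chunks := splitValue("arm_template", []byte("0123456789"), 4)
	fmt.Println("chunks:", len(chunks), "round trip ok:", roundTrips("0123456789", 4))
}
```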
- validation_manager: make the IsLastKey chunk handler the sole owner of the PrepareValidationContextResponse ack and have it wait on a per-assembler done channel. This fixes a low-probability race where concurrent chunk handlers could ack with a non-matching request_id, leaving the core's SendAndWait blocked until cancellation. The wait respects context cancellation to avoid leaking the handler goroutine.
- validation_provider: use omitempty (not omitzero) on PredictedResource.SKU and Tags for consistency with the rest of the struct.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
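The done-channel wait described in this commit follows a common Go pattern: block on either the completion signal or context cancellation, whichever comes first. A minimal sketch, assuming a hypothetical `assembler` type and helper names that are not taken from `validation_manager.go`:

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// assembler tracks completion of one context's chunk assembly.
type assembler struct {
	once sync.Once
	done chan struct{} // closed once every chunk has been merged
}

func newAssembler() *assembler {
	return &assembler{done: make(chan struct{})}
}

// markComplete signals that assembly finished; it is safe to call twice.
func (a *assembler) markComplete() {
	a.once.Do(func() { close(a.done) })
}

// waitForAssembly is what the last-key handler would call before sending
// the ack. It returns early with the context's error on cancellation, so
// the handler goroutine is not leaked.
func (a *assembler) waitForAssembly(ctx context.Context) error {
	select {
	case <-a.done:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// simulate completes assembly after assembleAfterMs and waits for at most
// timeoutMs, returning "acked" or "cancelled".
func simulate(assembleAfterMs, timeoutMs int) string {
	a := newAssembler()
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeoutMs)*time.Millisecond)
	defer cancel()
	go func() {
		time.Sleep(time.Duration(assembleAfterMs) * time.Millisecond)
		a.markComplete()
	}()
	if err := a.waitForAssembly(ctx); err != nil {
		return "cancelled"
	}
	return "acked"
}

func main() {
	fmt.Println(simulate(10, 1000)) // assembly finishes well before the deadline
	fmt.Println(simulate(1000, 10)) // deadline expires first
}
```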
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- bicep_provider: bound extension validation DispatchChecks with a 60s context.WithTimeout and treat a timeout as a non-fatal skip (log + continue), so a blocked or unresponsive extension check cannot hang core preflight.
- demo extension: remove the accidentally checked-in 'demo' build binary and ignore it via .gitignore.
- demo extension: refactor NewRootCommand to use azdext.NewExtensionRootCommand instead of hand-rolling the azd-sdk-root annotation and --debug/--no-prompt flags, matching the standard extension pattern.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
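The timeout guard in the first item can be sketched as follows. This is an assumption-laden model of the behavior the commit describes, not the code in `bicep_provider.go`: `dispatchFunc`, `runExtensionChecks`, and the two sample dispatchers are hypothetical, and returning non-timeout errors to the caller is a choice made for this sketch only.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"log"
	"time"
)

// dispatchFunc stands in for the dispatcher call; it returns the rule IDs
// that were invoked.
type dispatchFunc func(ctx context.Context) ([]string, error)

// runExtensionChecks bounds dispatch with a timeout. A timeout is logged
// and treated as a skip (nil, nil); any other error goes to the caller.
func runExtensionChecks(parent context.Context, timeout time.Duration, dispatch dispatchFunc) ([]string, error) {
	ctx, cancel := context.WithTimeout(parent, timeout)
	defer cancel()

	ids, err := dispatch(ctx)
	if errors.Is(err, context.DeadlineExceeded) {
		log.Printf("extension validation timed out after %s; skipping", timeout)
		return nil, nil
	}
	return ids, err
}

// runWithTimeoutMs is a convenience wrapper for the examples below.
func runWithTimeoutMs(timeoutMs int, dispatch dispatchFunc) ([]string, error) {
	return runExtensionChecks(context.Background(), time.Duration(timeoutMs)*time.Millisecond, dispatch)
}

// blockingDispatch simulates an unresponsive extension.
func blockingDispatch(ctx context.Context) ([]string, error) {
	<-ctx.Done()
	return nil, ctx.Err()
}

// fastDispatch simulates an extension that answers immediately.
func fastDispatch(ctx context.Context) ([]string, error) {
	return []string{"demo_warning"}, nil
}

func main() {
	ids, err := runWithTimeoutMs(20, blockingDispatch)
	fmt.Println("blocked extension:", ids, err)
	ids, err = runWithTimeoutMs(1000, fastDispatch)
	fmt.Println("responsive extension:", ids, err)
}
```

The PR uses a 60s bound; the millisecond values here only keep the example fast.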
Thanks Victor, this looks great. One ask as a close, non-blocking follow-up: the demo extension is the only worked example right now. For extension authors outside our team to adopt this, we need in-repo docs covering a conceptual overview of
Happy to help write these, or at minimum review.
hemarina left a comment
The design is solid and the prior review rounds resolved the major concerns (context eviction, error-path refcount leaks, cross-extension rule uniqueness, the done-channel ack race, and the preflight dispatch timeout). One issue from those rounds looks incompletely fixed.
Refcount eviction still races when checks complete sequentially
cli/azd/pkg/azdext/validation_manager.go — onValidationCheck (refcount increment / defer decrement)
The earlier fix replaced eager eviction with reference counting, described as: "Context is only deleted when the ref count reaches zero, ensuring parallel checks sharing the same context_id don't race." That guarantee only holds when the checks' lifetimes overlap.
The refcount is incremented when each check handler starts, not seeded upfront with the expected number of checks. Core dispatches all checks for one extension in parallel under a single context_id (validation_service.go, DispatchChecks → wg.Go), and the extension broker runs one goroutine per message. So this interleaving is possible:
- Handler A: `refcount[id] = 1`, runs `Validate` (fast), `defer` → `refcount[id] = 0` → `delete(cachedContexts, id)`.
- Handler B (started microseconds later): `cachedContexts[id]` is already gone → `valCtx == nil` → falls back to an empty `ValidationContext{Data: map[string][]byte{}}`.
Check B then silently validates against empty context (no ARM template, no predicted resources). The window is narrow — it requires 2+ checks registered by one extension and a fast Validate — which is why the single-check demo and TestValidationManager_Dispatch_Integration don't catch it.
Suggested fix: seed the refcount before dispatch instead of counting on the fly. Core already knows the per-group count (len(g.entries)); send it on the final context chunk (e.g. an expected_checks field) so the extension initializes contextRefCounts[id] once, and decrement-to-evict only after all checks have run. Alternatively, retain the context until stream teardown with a bounded LRU/TTL.
Test gap: add a case with two checks sharing one context_id where the first returns immediately, asserting the second still receives populated context.
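A minimal sketch of the seeded-refcount idea, and of the scenario the proposed test would cover, might look like this. The `contextCache` type and its methods are hypothetical and simplified. The `expectedChecks` argument plays the role of the suggested `expected_checks` field, which does not exist in the protocol today.

```go
package main

import (
	"fmt"
	"sync"
)

// contextCache models the extension-side cache of assembled contexts.
type contextCache struct {
	mu        sync.Mutex
	contexts  map[string]map[string][]byte
	refCounts map[string]int
}

func newContextCache() *contextCache {
	return &contextCache{
		contexts:  make(map[string]map[string][]byte),
		refCounts: make(map[string]int),
	}
}

// store caches a context and seeds its refcount with the number of checks
// core intends to dispatch, instead of counting handlers as they start.
func (c *contextCache) store(id string, data map[string][]byte, expectedChecks int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.contexts[id] = data
	c.refCounts[id] = expectedChecks
}

// get returns the cached context, if it is still present.
func (c *contextCache) get(id string) (map[string][]byte, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	data, ok := c.contexts[id]
	return data, ok
}

// release decrements the refcount and evicts the context once all expected
// checks have run. Unknown IDs are ignored, so counts never go negative.
func (c *contextCache) release(id string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	n, ok := c.refCounts[id]
	if !ok {
		return
	}
	if n <= 1 {
		delete(c.refCounts, id)
		delete(c.contexts, id)
		return
	}
	c.refCounts[id] = n - 1
}

// runCheck simulates one check handler and returns how many context keys
// it saw. The deferred release also covers error paths.
func (c *contextCache) runCheck(id string) int {
	defer c.release(id)
	data, _ := c.get(id)
	return len(data)
}

func main() {
	cache := newContextCache()
	cache.store("ctx-1", map[string][]byte{"arm_template": []byte("{}")}, 2)
	first := cache.runCheck("ctx-1")  // finishes before the second check starts
	second := cache.runCheck("ctx-1") // still sees the populated context
	_, stillCached := cache.get("ctx-1")
	fmt.Println(first, second, stillCached) // prints: 1 1 false
}
```

Because the count is fixed before any handler runs, a fast first check can no longer evict the context from under a sibling that has not started yet.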
Minor
`armParametersJSON` (bicep_provider.go) returns `nil` on `json.Marshal` failure, and `extensionContext` then silently omits `arm_parameters`. Extensions can't distinguish "no params" from "serialization failed." A `log.Printf` on the error path would aid diagnosis. Low impact.
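The suggested logging could look roughly like the sketch below. The parameter type, the empty-input handling, and the log text are assumptions. Only the function name and the nil-on-failure behavior come from the comment above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// armParametersJSON serializes parameters for extensions. The signature is
// hypothetical; the real helper in bicep_provider.go may take other inputs.
func armParametersJSON(params map[string]any) []byte {
	if len(params) == 0 {
		return nil // nothing to send
	}
	data, err := json.Marshal(params)
	if err != nil {
		// Log so that "serialization failed" can be told apart from
		// "no params" when diagnosing a missing arm_parameters key.
		log.Printf("preflight: failed to serialize ARM parameters for extensions: %v", err)
		return nil
	}
	return data
}

func main() {
	fmt.Println(string(armParametersJSON(map[string]any{"location": "eastus"})))
	// Channels cannot be marshaled to JSON, so this takes the error path.
	fmt.Println(armParametersJSON(map[string]any{"bad": make(chan int)}) == nil)
}
```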
jongio left a comment
Both concerns from my prior review are addressed cleanly.
The done channel for the chunk ack race is correct: only the IsLastKey handler owns the ack, and it blocks on done (with ctx cancellation) when assembly hasn't finished yet. This eliminates the window where a wrong request_id could leave the core's SendAndWait hung.
The omitempty consistency fix and the dispatch timeout guard (60s, non-fatal skip with differentiated logging) are both solid.
One note: @hemarina raised a valid point about the refcount race when checks complete sequentially (Handler A finishes and evicts context before Handler B starts). That's worth tracking as a follow-up, but it's a separate concern from the chunk ordering race this round fixed, and the window is narrow (requires a check to complete entirely before its sibling acquires the lock). Doesn't block this PR.
Summary
Add a new `validation-provider` capability that allows extensions to register validation checks dispatched during the bicep provider's local preflight pipeline. This enables extensions to contribute custom validation rules alongside built-in checks.
Key Changes
Protocol & SDK
- New gRPC validation protocol (`validation.proto`) with chunked context delivery for arbitrarily large ARM templates
- Extension-side SDK: `ValidationCheckProvider` interface, `ValidationContext` with typed accessors (`ARMTemplate()`, `PredictedResources()`, `ResourcesSnapshot()`, etc.)
- `PredictedResource` exported SDK type in `pkg/azdext` for extensions to parse resource data without raw JSON handling
- Automatic context building via `validationContext.extensionContext()`: new context fields are automatically sent to extensions
Core Integration
- Telemetry: `PreflightExtensionRulesKey` for tracking extension check metrics separately from core rules
Demo Extension
- Demo check using `ParsePredictedResources()`: displays resource count from bicep snapshot
Design Decisions
- `validation-provider` naming (not `preflight-check-provider`) to support future check types via `check_type` field
- `extensionContext()` as single source of truth ensures extensions always receive complete context
Testing
fix: #8715