Description
We've been experiencing throughput issues with our validation infrastructure.
We have work in progress that should help directly, as well as longer-term work to address the root cause. The initial work is moving to a GitHub App and implementing a queueing mechanism so we can throttle our traffic and reduce the number of failures caused by timeouts.
- Use GitHub App integration for validation #263724
- [New Feature]: Improve queueing/scheduling for Validation pipelines #185545
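
To illustrate the general idea of throttling through a queue, here is a minimal sketch of a sliding-window rate limiter in front of a FIFO queue. It is not the planned implementation: the `ThrottledQueue` class, its parameters, and the limits used are hypothetical and chosen only for illustration.

```python
import collections
import time


class ThrottledQueue:
    """Hypothetical FIFO queue that releases at most `max_per_window`
    items in any sliding window of `window_seconds`."""

    def __init__(self, max_per_window, window_seconds, clock=time.monotonic):
        self.max_per_window = max_per_window
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self.pending = collections.deque()  # items waiting to be dispatched
        self.sent_at = collections.deque()  # timestamps of recent dispatches

    def submit(self, item):
        """Add an item (for example, a PR awaiting validation) to the queue."""
        self.pending.append(item)

    def poll(self):
        """Return the next item if the rate limit allows it, else None."""
        now = self.clock()
        # Forget dispatches that have aged out of the sliding window.
        while self.sent_at and now - self.sent_at[0] >= self.window_seconds:
            self.sent_at.popleft()
        if not self.pending or len(self.sent_at) >= self.max_per_window:
            return None
        self.sent_at.append(now)
        return self.pending.popleft()
```

A caller would `submit` each incoming request and call `poll` on a timer, starting a run only when `poll` returns an item. Requests over the limit stay queued instead of being sent and timing out.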
Longer term, we're working with our infrastructure partner on increasing our overall capacity as well as the compute resources we run validation against.
The Azure DevOps Analytics view can provide some detail on the current success/failure rates and on how long validation typically takes to complete.
We do have automated retry enabled, but we're still seeing higher failure rates than normal. The most common result is Internal-Error-Dynamic-Scan.
We're also tracking some other known classes of failures:
Both of the work items linked above are large in scope, and the current ETA for them to be completed and deployed is February. We're also going to include several "quality of experience" upgrades, including exposing the validation failure reasons in the PR body.