Summary
We evaluated microsoft/waza against the Agentic Repository Rubric and identified a few real product improvements that would strengthen the framework itself. This issue is not about gaming the leaderboard or adding cosmetic automation; it is about shipping capabilities that make Waza more useful for users.
Goal
Improve Waza in ways that are directly valuable to the product and its users, while also increasing the maturity of the repository’s own agentic workflows as a byproduct.
Proposed work
Replace the squad-ci.yml, squad-preview.yml, squad-release.yml, and squad-insider-release.yml stubs with real build, test, validation, and release steps.
Add a genuine failure-handling path for evaluation runs (for example: capture failing artifacts, surface a concise triage summary, and open a follow-up issue or safe remediation PR when appropriate).
Add a recurring improvement loop that turns telemetry or evaluation output into actionable regression tasks or benchmark updates.
Expand run-time observability so eval results, agent activity, and validation output are easier to inspect and compare over time.
Increase safe, measurable agent-assisted throughput only where human review and branch protections are still in place.
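To make the failure-handling item more concrete, here is a minimal sketch of the "concise triage summary" step. It is illustrative only: `EvalResult`, its fields, and `triage_summary` are hypothetical names, not Waza's actual result schema or API, and a real implementation would read whatever format the eval runner emits.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    """Hypothetical record for one evaluation task; not Waza's actual schema."""
    task: str
    passed: bool
    message: str = ""


def triage_summary(results: list[EvalResult], max_items: int = 5) -> str:
    """Build a short plain-text summary of failing tasks for a CI log or issue body."""
    failures = [r for r in results if not r.passed]
    if not failures:
        return f"All {len(results)} eval tasks passed."
    lines = [f"{len(failures)} of {len(results)} eval tasks failed:"]
    # Cap the list so the summary stays readable when many tasks fail.
    for r in failures[:max_items]:
        lines.append(f"- {r.task}: {r.message or 'no message recorded'}")
    hidden = len(failures) - max_items
    if hidden > 0:
        lines.append(f"... and {hidden} more")
    return "\n".join(lines)
```

A workflow step could write this string to the job summary or use it as the body of a follow-up issue, leaving the decision to open a remediation PR to a human reviewer.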
Non-goals
Do not add fake automation just to raise the rubric score.
Do not weaken human review or branch protections to make the score look better.
Do not change the rubric or leaderboard to hide gaps.
Notes
If these items turn out to be independent enough, this issue can be split into child issues later. The expectation is that each item should result in a real product or platform improvement, not just a scoring artifact.