Summary
Add support for evaluating VS Code custom agents (.agent.md files) alongside existing SKILL.md-based skills. Custom agents use the same Copilot engine but have a different file format and frontmatter schema.
Motivation
Custom agents (VS Code docs) are .agent.md files with YAML frontmatter defining tools, model preferences, handoffs, and MCP servers. They are the primary way teams build specialized AI personas (security reviewers, planners, implementers). Waza should evaluate them with the same rigor as skills.
Changes Required
P0 — File Discovery
- Extend discoverSkills() in internal/orchestration/skill_discovery.go to detect .agent.md files alongside SKILL.md
- Extend buildSkillSystemMessage() in internal/execution/copilot.go to load .agent.md content
- Extend loadSkillDefinition() to parse agent frontmatter (name, description, tools, model)
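As a rough illustration of the discovery step, the sketch below walks a directory tree and classifies files as skills or agents by file name. It is not the existing discoverSkills() implementation: the names definitionKind, classifyDefinition, and findDefinitions are hypothetical, and matching purely on the .agent.md suffix is an assumption.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// definitionKind is a hypothetical type distinguishing the two file formats.
type definitionKind string

const (
	kindSkill definitionKind = "skill"
	kindAgent definitionKind = "agent"
)

// classifyDefinition reports which kind of definition a file name denotes.
// Assumption: skills are named exactly SKILL.md and agents end in .agent.md.
func classifyDefinition(name string) (definitionKind, bool) {
	switch {
	case name == "SKILL.md":
		return kindSkill, true
	case strings.HasSuffix(name, ".agent.md"):
		return kindAgent, true
	}
	return "", false
}

// findDefinitions walks root and returns a map from file path to kind.
func findDefinitions(root string) (map[string]definitionKind, error) {
	found := make(map[string]definitionKind)
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil {
			return walkErr
		}
		if d.IsDir() {
			return nil
		}
		if kind, ok := classifyDefinition(d.Name()); ok {
			found[path] = kind
		}
		return nil
	})
	return found, err
}

func main() {
	root := "."
	if len(os.Args) > 1 {
		root = os.Args[1]
	}
	defs, err := findDefinitions(root)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for path, kind := range defs {
		fmt.Printf("%s\t%s\n", kind, path)
	}
}
```

Keeping the classification in a small pure function makes the naming rule easy to unit test separately from the filesystem walk.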
P0 — Agent Frontmatter Parsing
- Parse agent-specific fields: tools, model, handoffs, mcp-servers, agents
- Map to waza concepts: tools → tool constraints, model → eval model, handoffs → follow-up workflows
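Before any of these fields can be parsed, the YAML frontmatter has to be separated from the Markdown body. The sketch below shows one way to do that with the standard library only; splitFrontmatter is a hypothetical helper, and the assumption that frontmatter is delimited by lines containing only --- follows common Markdown frontmatter convention rather than anything stated in this issue. Decoding the resulting YAML text into typed fields would need a YAML library and is left out.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// splitFrontmatter separates a leading "---" delimited frontmatter block
// from the rest of the file. If the content does not start with a
// delimiter line, the whole content is returned as the body.
func splitFrontmatter(content string) (frontmatter, body string, err error) {
	// Normalize Windows line endings so delimiter matching is uniform.
	content = strings.ReplaceAll(content, "\r\n", "\n")

	const delim = "---"
	if !strings.HasPrefix(content, delim+"\n") {
		return "", content, nil
	}

	// Scan line by line for the closing delimiter.
	lines := strings.Split(content[len(delim)+1:], "\n")
	for i, line := range lines {
		if strings.TrimRight(line, " \t") == delim {
			frontmatter = strings.Join(lines[:i], "\n")
			body = strings.Join(lines[i+1:], "\n")
			return frontmatter, body, nil
		}
	}
	return "", "", errors.New("unterminated frontmatter: missing closing ---")
}

func main() {
	// Illustrative file content; the agent name and body text are made up.
	content := "---\nname: reviewer\ntools: ['search/codebase', 'web/fetch']\n---\nYou review code for security issues.\n"

	fm, body, err := splitFrontmatter(content)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("frontmatter:\n%s\n\nbody:\n%s", fm, body)
}
```

The body returned here corresponds to the instructions that would be injected as skill content, while the frontmatter text carries the fields to be mapped onto waza concepts.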
P1 — Auto-Generated Tool Constraints
- When an .agent.md specifies tools: ['search/codebase', 'web/fetch'], automatically validate that the agent used only those tools during eval
- Generate an implicit tool_constraint grader from agent frontmatter
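The core check behind such an implicit grader could look roughly like the sketch below: compare the tools observed during an eval run against the allow-list from frontmatter and report anything outside it. This is not the existing tool_constraint grader. disallowedTools is a hypothetical helper, terminal/run is a made-up tool name, and exact string matching between frontmatter names and recorded tool calls is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// disallowedTools returns the sorted, de-duplicated names of tools that
// appear in used but not in allowed.
func disallowedTools(allowed, used []string) []string {
	allowedSet := make(map[string]struct{}, len(allowed))
	for _, t := range allowed {
		allowedSet[t] = struct{}{}
	}

	seen := make(map[string]struct{})
	var violations []string
	for _, t := range used {
		if _, ok := allowedSet[t]; ok {
			continue // tool is on the allow-list
		}
		if _, dup := seen[t]; dup {
			continue // already reported
		}
		seen[t] = struct{}{}
		violations = append(violations, t)
	}
	sort.Strings(violations)
	return violations
}

func main() {
	// Allow-list as it would appear in the agent's tools field.
	allowed := []string{"search/codebase", "web/fetch"}
	// Tool calls observed during a run; terminal/run is hypothetical.
	used := []string{"search/codebase", "terminal/run", "web/fetch", "terminal/run"}

	if v := disallowedTools(allowed, used); len(v) > 0 {
		fmt.Println("FAIL: disallowed tools used:", v)
		return
	}
	fmt.Println("PASS")
}
```

If tool names recorded during eval differ in form from the names used in frontmatter, a normalization step would be needed before this comparison.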
P1 — Example Eval Suite
- Add examples/custom-agent/ with a sample .agent.md + eval.yaml + tasks + trigger tests
- Demonstrate tool restriction validation, output quality grading, handoff testing
Documentation
- Update site CLI reference, eval-yaml guide
- Add an "Evaluating Custom Agents" guide to the docs site
- Update README
Agent ↔ Waza Mapping
| Agent Property | Waza Equivalent |
|---|---|
| tools | tool_calls / tool_constraint grader |
| model | config.model |
| description | Trigger testing |
| Body instructions | Skill content injection |
| handoffs | follow_up_prompts + action_sequence |
| mcp-servers | config.mcp_servers |