feat: support custom agent (.agent.md) file discovery and parsing #225 #226
Merged
- Add AgentFrontmatter types in internal/skill/agent.go
- Extend loadSkillDefinition() to detect .agent.md files
- Extend discoverSkills() for agent file discovery
- Extend workspace detection for .agent.md
- Extend coverage command to include .agent.md files
- Add comprehensive tests for agent frontmatter parsing

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
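To make the detection and parsing steps in this commit concrete, here is a minimal sketch using only the standard library. It is not the code in internal/skill/agent.go: `isAgentFile`, `splitFrontmatter`, and `parseInlineTools` are hypothetical stand-ins, the suffix check is an assumed detection rule, and the tool names `read` and `search` are made up for illustration. A real implementation would most likely hand the frontmatter to a YAML parser rather than scan lines by hand.

```go
package main

import (
	"fmt"
	"strings"
)

// isAgentFile is a hypothetical stand-in for the PR's IsAgentFile helper.
// Assumption: detection is a plain suffix match on ".agent.md".
func isAgentFile(name string) bool {
	return strings.HasSuffix(name, ".agent.md")
}

// splitFrontmatter separates a leading "---" delimited block from the
// markdown body. ok is false when the document has no closed block.
func splitFrontmatter(doc string) (frontmatter, body string, ok bool) {
	lines := strings.Split(doc, "\n")
	if strings.TrimSpace(lines[0]) != "---" {
		return "", doc, false
	}
	for i := 1; i < len(lines); i++ {
		if strings.TrimSpace(lines[i]) == "---" {
			return strings.Join(lines[1:i], "\n"), strings.Join(lines[i+1:], "\n"), true
		}
	}
	return "", doc, false
}

// parseInlineTools reads a single-line `tools: [a, b]` declaration.
// Simplification: block-style YAML lists are not handled here.
func parseInlineTools(frontmatter string) []string {
	for _, line := range strings.Split(frontmatter, "\n") {
		line = strings.TrimSpace(line)
		if !strings.HasPrefix(line, "tools:") {
			continue
		}
		rest := strings.TrimSpace(strings.TrimPrefix(line, "tools:"))
		rest = strings.TrimSuffix(strings.TrimPrefix(rest, "["), "]")
		var tools []string
		for _, item := range strings.Split(rest, ",") {
			item = strings.Trim(strings.TrimSpace(item), `"'`)
			if item != "" {
				tools = append(tools, item)
			}
		}
		return tools
	}
	return nil
}

func main() {
	// Illustrative agent file content; the tool names are invented.
	doc := "---\nname: security-reviewer\ntools: [read, search]\n---\nYou review code for security issues.\n"
	fm, body, ok := splitFrontmatter(doc)
	fmt.Println(isAgentFile("security-reviewer.agent.md"), ok)
	fmt.Println(parseInlineTools(fm))
	fmt.Print(body)
}
```

Keeping the frontmatter split separate from field extraction mirrors the shared YAML-frontmatter / markdown-body structure that SKILL.md and .agent.md files have in common, so only the field handling would need to differ per file type.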
- New guide: Evaluating Custom Agents with tool constraint validation
- Update eval-yaml guide: add agent targeting and custom agents section
- Update graders guide: add callout for auto-injected tool_constraint
- Update CLI reference: document .agent.md discovery in coverage and run
- Add custom-agents to sidebar navigation
- Update README.md with custom agents support note

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…nt example #225

P1 scope for #225:
- Auto-inject tool_constraint grader when eval targets a .agent.md with tools field
- Skip injection if user already defined a tool_constraint grader (opt-out)
- Add LoadAgentDefinition() helper in internal/skill/agent.go
- Add examples/custom-agent/ with security-reviewer agent, tasks, and fixtures
- 9 new tests covering injection, opt-out, no-tools, non-agent, and missing file cases

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
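The injection and opt-out rules in this commit can be sketched as follows. This is an illustration under stated assumptions, not the PR's augmentGradersFromAgent(): the `Grader` struct, its field names, the `augmentGraders` function, and the grader name `implicit-tool-constraint` are all hypothetical, and the real grader types in waza may look quite different.

```go
package main

import "fmt"

// Grader is a hypothetical, simplified stand-in for an eval grader entry.
type Grader struct {
	Type        string
	Name        string
	ExpectTools []string
}

// augmentGraders returns the graders to run for an eval that targets an agent.
// It appends an implicit tool_constraint grader built from the agent's tools,
// except when the agent declares no tools or the user already defined one.
func augmentGraders(graders []Grader, agentTools []string) []Grader {
	if len(agentTools) == 0 {
		return graders // no tools field: nothing to inject
	}
	for _, g := range graders {
		if g.Type == "tool_constraint" {
			return graders // user-defined grader wins (opt-out)
		}
	}
	implicit := Grader{
		Type:        "tool_constraint",
		Name:        "implicit-tool-constraint", // illustrative name
		ExpectTools: append([]string(nil), agentTools...),
	}
	// Copy into a fresh slice so the caller's slice is never modified.
	out := make([]Grader, 0, len(graders)+1)
	out = append(out, graders...)
	return append(out, implicit)
}

func main() {
	userGraders := []Grader{{Type: "text", Name: "mentions-injection"}}
	for _, g := range augmentGraders(userGraders, []string{"read", "search"}) {
		fmt.Println(g.Type, g.Name, g.ExpectTools)
	}
}
```

Treating any existing tool_constraint grader as an opt-out keeps the behavior predictable: the implicit grader never competes with an explicit one, so users who need stricter or looser tool expectations simply declare their own.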
The clean.go fixture imports a SQL driver to demonstrate parameterized queries for the security-reviewer agent eval, but it isn't part of the module build. Add `//go:build ignore` to keep `go test ./...` clean. Also includes Livingston's history + decision file for the docs work.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
CI status — all changes-related checks pass ✅
The "Run Waza Evaluation" failure is not caused by this PR. I reproduced the identical failure on a fresh Local verification on this branch:
Ready for review.
github-actions bot pushed a commit that referenced this pull request on Apr 28, 2026
- Add .agent.md coverage to quick-start.mdx, getting-started.mdx, docs/GETTING-STARTED.md, docs/GUIDE.md, docs/TUTORIAL.md for #226
- Add custom-agent, required-skills-demo, rubrics to examples/README.md
- Update mock engine description in docs/INTEGRATION-TESTING.md and eval-yaml.mdx to reflect #228 file content echo behavior
- No stale BenchmarkSpec/TestRunner refs found (#222 rename was thorough)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Closes #225
Summary
Adds support for evaluating VS Code custom agents (`.agent.md` files) alongside existing SKILL.md-based skills. Custom agents share the same Copilot engine and YAML-frontmatter / markdown-body structure but expose agent-specific frontmatter fields (`tools`, `model`, `handoffs`, `mcp-servers`, `agents`).

What Changed
P0 — Discovery & loading
- `internal/skill/agent.go` — `AgentFrontmatter`, `AgentHandoff`, `AgentMCPServer`, `ParseAgentFrontmatter`, `IsAgentFile`, `LoadAgentDefinition`
- `loadSkillDefinition()` (copilot.go) — falls back to `.agent.md` when no `SKILL.md` present
- `discoverSkills()` (orchestration) — discovers `.agent.md` for skill injection
- `tryParseSkill()` (workspace) — workspace detection picks up `.agent.md`
- `discoverSkillFiles()` (cmd_coverage) — coverage grid includes agent files

P1 — Auto-injected tool_constraint grader
- `internal/orchestration/agent_graders.go` — `augmentGradersFromAgent()`
- When an eval targets a `.agent.md` whose frontmatter declares `tools: [...]`, an implicit `tool_constraint` grader is added with `expect_tools` populated from the frontmatter
- If the user already defined a `tool_constraint` grader, the implicit one is skipped

P1 — Example suite
`examples/custom-agent/`:
- `security-reviewer.agent.md` — realistic security-review agent with `tools:` declared
- `eval.yaml` — uses text + prompt graders (tool_constraint auto-injected)
- `tasks/` — 3 tasks: SQL injection, XSS, clean-code (negative case)
- `fixtures/` — vulnerable.py, xss.html, clean.go (build-tagged `ignore`)
- `trigger_tests.yaml` — should/shouldn't trigger prompts
- `README.md` — walkthrough

Docs
- `site/src/content/docs/guides/custom-agents.mdx` (Evaluating Custom Agents)
- `eval-yaml.mdx` — added "Targeting Custom Agents" section
- `graders.mdx` — auto-injection callout on tool_constraint
- `reference/cli.mdx` — agent.md notes on `waza run` and `waza coverage`

Design decisions
- `.agent.md` match is used
- `SkillInfo` — minimal blast radius, no parallel type hierarchy
- A user-defined `tool_constraint` grader disables the implicit one

Testing
- `internal/skill`, `internal/orchestration`, `cmd/waza` — all pass
- `go test ./...` green
- `go vet ./...` clean

Out of Scope (future work for #225)
- `handoffs` and `mcp-servers` frontmatter fields are parsed but not yet wired into evals (P2)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>