docs: custom-agent guide uses `agent:` but eval parser/schema require `skill:` (#275)

## Summary

The custom-agent docs tell users to target `.agent.md` files with a top-level `agent:` field in `eval.yaml`, but the current Go eval spec model and JSON schema only accept `skill:`. Because `LoadEvalSpec` uses strict YAML parsing, an `eval.yaml` written by following the docs is likely to be rejected as containing an unknown field before any eval can run.

I could not find a `.github/ISSUE_TEMPLATE` directory in this repository, so I am using this plain bug report format.

## Evidence

- `site/src/content/docs/guides/custom-agents.mdx` documents:
  - `agent: my-agent` in the example eval: https://github.com/microsoft/waza/blob/0f5f24508a075dd3e11f0fde4162f447bf16540d/site/src/content/docs/guides/custom-agents.mdx#L42-L46
  - "Use the `agent:` field instead of `skill:`" later in the guide: https://github.com/microsoft/waza/blob/0f5f24508a075dd3e11f0fde4162f447bf16540d/site/src/content/docs/guides/custom-agents.mdx#L247-L252
- `internal/models/spec.go` only has ``SkillName string `yaml:"skill"` `` and no `agent` field: https://github.com/microsoft/waza/blob/0f5f24508a075dd3e11f0fde4162f447bf16540d/internal/models/spec.go#L16-L28
- `LoadEvalSpec` uses `decoder.KnownFields(true)`, so unknown fields should be rejected: https://github.com/microsoft/waza/blob/0f5f24508a075dd3e11f0fde4162f447bf16540d/internal/models/spec.go#L253-L270
- `schemas/eval.schema.json` also requires `skill` and does not include `agent`: https://github.com/microsoft/waza/blob/0f5f24508a075dd3e11f0fde4162f447bf16540d/schemas/eval.schema.json#L7-L14
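
To illustrate the failure mode without pulling in the YAML dependency, here is a minimal sketch that uses the standard library's `encoding/json` `Decoder.DisallowUnknownFields` as an analogue of `decoder.KnownFields(true)`. It is not waza code: `miniSpec` and `decodeStrict` are hypothetical names, and the input is JSON rather than YAML, so it only shows the general behavior of strict decoding when a document carries `agent` but the struct declares `skill`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// miniSpec is a hypothetical, trimmed-down stand-in for the eval spec model.
// Like the real struct described above, it declares "skill" but not "agent".
type miniSpec struct {
	Name      string `json:"name"`
	SkillName string `json:"skill"`
}

// decodeStrict decodes doc into a miniSpec and rejects any key that the
// struct does not declare (the JSON analogue of yaml KnownFields(true)).
func decodeStrict(doc string) (miniSpec, error) {
	var spec miniSpec
	dec := json.NewDecoder(strings.NewReader(doc))
	dec.DisallowUnknownFields()
	err := dec.Decode(&spec)
	return spec, err
}

func main() {
	// Document written the way the custom-agent guide suggests.
	_, err := decodeStrict(`{"name": "my-agent-eval", "agent": "my-agent"}`)
	fmt.Println("with agent:", err)

	// Same document using the field the model actually declares.
	_, err = decodeStrict(`{"name": "my-agent-eval", "skill": "my-agent"}`)
	fmt.Println("with skill:", err)
}
```

The first call returns a non-nil error because `agent` is not a declared field; the second decodes cleanly.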

## Expected

One of these should be true:

1. The parser/schema support `agent:` as a first-class target field, matching the custom-agent docs.
2. The docs use `skill:` for custom agents and explain that `.agent.md` files are currently discovered through the skill-discovery path.

## Actual

The docs and source appear to disagree. A user following the docs can write an `eval.yaml` that the current strict parser/schema do not accept.

## Suggested fix

If `agent:` is intended:

- Add ``AgentName string `yaml:"agent,omitempty"` `` or equivalent to the eval spec model.
- Update validation to require exactly one of `skill` or `agent`.
- Update `schemas/eval.schema.json` with a `oneOf`/mutual-exclusion rule.
- Add tests that a custom-agent eval with `agent:` loads and runs.
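
A minimal sketch of the "exactly one of `skill` or `agent`" check, assuming the `agent` field is added as suggested. `targetSpec`, `validateTarget`, and `errTarget` are hypothetical names for illustration only; the real model has more fields, and `skill` would need `omitempty` or equivalent handling for this to fit.

```go
package main

import (
	"errors"
	"fmt"
)

// targetSpec is a hypothetical stand-in for the eval spec model that keeps
// only the two target fields. AgentName is the assumed new field.
type targetSpec struct {
	SkillName string `yaml:"skill,omitempty"`
	AgentName string `yaml:"agent,omitempty"`
}

// errTarget is returned when zero or both target fields are set.
var errTarget = errors.New("eval spec must set exactly one of skill or agent")

// validateTarget enforces mutual exclusion between skill and agent.
func validateTarget(s targetSpec) error {
	hasSkill := s.SkillName != ""
	hasAgent := s.AgentName != ""
	if hasSkill == hasAgent { // both set, or neither set
		return errTarget
	}
	return nil
}

func main() {
	cases := []targetSpec{
		{SkillName: "my-skill"},
		{AgentName: "my-agent"},
		{},
		{SkillName: "my-skill", AgentName: "my-agent"},
	}
	for _, c := range cases {
		fmt.Printf("%+v -> %v\n", c, validateTarget(c))
	}
}
```

The same four cases (skill only, agent only, neither, both) would make a reasonable table-driven test alongside the loader tests.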

If `agent:` is not intended yet:

- Update the custom-agent docs and CLI reference to use `skill:` for `.agent.md` targets.
- Add a warning that `SKILL.md` takes priority when both `SKILL.md` and `.agent.md` exist in one directory.


Excerpts from `site/src/content/docs/guides/custom-agents.mdx`:

```yaml
name: my-agent-eval
description: Evaluating my custom agent
agent: my-agent # Points to my-agent.agent.md in the same directory
version: "1.0"
```

```yaml
- type: code
  name: suggests_fix
  config:
    assertions:
      - "output_length > 200"
```

Excerpt from `internal/models/spec.go` (L16-L28):

```go
type EvalSpec struct {
	SpecIdentity `yaml:",inline"`
	SkillName    string               `yaml:"skill"`
	Version      string               `yaml:"version"`
	Config       Config               `yaml:"config"`
	Hooks        hooks.HooksConfig    `yaml:"hooks,omitempty"`
	Inputs       map[string]string    `yaml:"inputs,omitempty" json:"inputs,omitempty"`
	TasksFrom    string               `yaml:"tasks_from,omitempty" json:"tasks_from,omitempty"`
	Range        [2]int               `yaml:"range,omitempty" json:"range,omitempty"`
	Graders      []GraderConfig       `yaml:"graders"`
	Metrics      []MeasurementDef     `yaml:"metrics"`
	Tasks        []string             `yaml:"tasks"`
	Baseline     bool                 `yaml:"baseline,omitempty" json:"baseline,omitempty"`
```

Excerpt from `internal/models/spec.go` (L253-L270):

```go
// LoadEvalSpec loads a spec from a YAML file with strict validation.
//
// Normally the schema validation will catch errors in the eval.yaml, but this also does
// strict YAML parsing to catch errors like unknown fields or type errors that the schema
// validation might miss.
func LoadEvalSpec(path string) (*EvalSpec, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}

	var spec EvalSpec

	decoder := yaml.NewDecoder(bytes.NewReader(data))
	decoder.KnownFields(true)
	if err := decoder.Decode(&spec); err != nil {
		return nil, fmt.Errorf("parsing eval spec YAML (%s): %w", path, err)
	}
```

Excerpt from `schemas/eval.schema.json` (L7-L14):

```json
"required": [
  "name",
  "skill",
  "config",
  "metrics",
  "tasks"
],
"additionalProperties": false,
```
