feat: Add trigger heuristic grader #80

Description

@spboyer

Summary

Add a new grader that validates whether a skill should or shouldn't activate for a given prompt. The grader extracts keywords from SKILL.md (name, description, content) and matches them against the task prompt with a configurable threshold.

Motivation

Currently there's no way to validate trigger accuracy in evals — whether a skill correctly activates for relevant prompts and stays silent for irrelevant ones. This is the primary missing grader type for skill quality assessment.

Proposed Implementation

Create internal/graders/trigger_grader.go:

  • Inputs: SKILL.md path, prompt text, expected outcome (should_trigger / should_not_trigger)
  • Keyword extraction: Parse SKILL.md description, trigger phrases, and content to build a keyword set
  • Matching: Score the prompt against the keyword set and compare the score to a configurable threshold (default: 0.6)
  • Modes:
    • positive: skill SHOULD activate for the prompt (score >= threshold → pass)
    • negative: skill should NOT activate (score < threshold → pass)
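
To make the extraction and matching steps concrete, here is a minimal sketch. The issue does not define how the score is computed, so this assumes it is the fraction of distinct prompt tokens that also appear in the keyword set. The function names (tokenize, extractKeywords, scorePrompt) and the three-character minimum token length are hypothetical choices for illustration, not the planned trigger_grader.go API.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// tokenize lowercases text, splits it on anything that is not a letter or
// digit, and drops tokens shorter than three bytes (an assumed noise filter).
func tokenize(text string) []string {
	fields := strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
	var tokens []string
	for _, f := range fields {
		if len(f) >= 3 {
			tokens = append(tokens, f)
		}
	}
	return tokens
}

// extractKeywords builds a keyword set from raw SKILL.md text.
// A real grader would likely parse name, description, and trigger phrases
// separately; this sketch treats the file as one blob of text.
func extractKeywords(skillMD string) map[string]struct{} {
	set := make(map[string]struct{})
	for _, t := range tokenize(skillMD) {
		set[t] = struct{}{}
	}
	return set
}

// scorePrompt returns the fraction of distinct prompt tokens that are in the
// keyword set (assumed scoring rule). An empty prompt scores 0.
func scorePrompt(keywords map[string]struct{}, prompt string) float64 {
	seen := make(map[string]struct{})
	hits := 0
	for _, t := range tokenize(prompt) {
		if _, dup := seen[t]; dup {
			continue
		}
		seen[t] = struct{}{}
		if _, ok := keywords[t]; ok {
			hits++
		}
	}
	if len(seen) == 0 {
		return 0
	}
	return float64(hits) / float64(len(seen))
}

func main() {
	keywords := extractKeywords("azure-deploy: Deploy applications to Azure")
	fmt.Printf("%.2f\n", scorePrompt(keywords, "deploy my app to azure"))  // prints 0.67
	fmt.Printf("%.2f\n", scorePrompt(keywords, "write a poem about cats")) // prints 0.00
}
```

With the default threshold of 0.6, the first prompt would pass in positive mode and the second would pass in negative mode. Normalizing by prompt tokens rather than by keyword count keeps a long SKILL.md from pushing every score toward zero, but either choice is open for discussion.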

Example eval.yaml

graders:
  - type: trigger
    params:
      skill_path: skills/azure-deploy/SKILL.md
      mode: positive
      threshold: 0.6
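
On the Go side, the params block above could be validated roughly as follows. This is a sketch under assumptions: the TriggerParams type, the parseTriggerParams function, and the idea that params arrive as an already-decoded map are all hypothetical, and the issue does not say whether mode is required or whether threshold is limited to [0, 1]. Only the 0.6 default and the two mode names come from the proposal.

```go
package main

import (
	"errors"
	"fmt"
)

// TriggerParams mirrors the params block of the example eval.yaml.
// The type and field names are hypothetical.
type TriggerParams struct {
	SkillPath string
	Mode      string
	Threshold float64
}

// defaultThreshold is the default stated in the proposal.
const defaultThreshold = 0.6

// parseTriggerParams validates an already-decoded params map and fills in
// the default threshold when none is given.
func parseTriggerParams(raw map[string]any) (TriggerParams, error) {
	p := TriggerParams{Threshold: defaultThreshold}

	path, ok := raw["skill_path"].(string)
	if !ok || path == "" {
		return p, errors.New("trigger grader: skill_path is required")
	}
	p.SkillPath = path

	// Assumption: mode is required and must be one of the two proposed values.
	mode, ok := raw["mode"].(string)
	if !ok || (mode != "positive" && mode != "negative") {
		return p, fmt.Errorf("trigger grader: mode must be positive or negative, got %v", raw["mode"])
	}
	p.Mode = mode

	if v, present := raw["threshold"]; present {
		switch t := v.(type) {
		case float64:
			p.Threshold = t
		case int:
			p.Threshold = float64(t)
		default:
			return p, fmt.Errorf("trigger grader: threshold must be a number, got %T", v)
		}
	}
	// Assumption: scores are fractions, so the threshold must lie in [0, 1].
	if p.Threshold < 0 || p.Threshold > 1 {
		return p, fmt.Errorf("trigger grader: threshold %v is outside [0, 1]", p.Threshold)
	}
	return p, nil
}

func main() {
	p, err := parseTriggerParams(map[string]any{
		"skill_path": "skills/azure-deploy/SKILL.md",
		"mode":       "positive",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", p)
}
```

Failing fast on an unknown mode or an out-of-range threshold keeps a typo in eval.yaml from silently turning into a grader that always passes or always fails.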

Acceptance Criteria

  • trigger_grader.go with keyword extraction from SKILL.md
  • Configurable threshold and positive/negative modes
  • Tests covering: positive match, negative match, threshold edge cases
  • Registered in grader factory
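
For the threshold edge cases, a table of cases along these lines may be a useful starting point. The passes function is a hypothetical stand-in for the grader's decision step; it only encodes the rule from the proposal (positive passes when score >= threshold, negative passes when score < threshold). In the repository this would presumably live in a _test.go file using the testing package rather than in main.

```go
package main

import "fmt"

// passes applies the decision rule from the proposal:
// positive mode passes when score >= threshold,
// negative mode passes when score < threshold.
func passes(score, threshold float64, mode string) (bool, error) {
	switch mode {
	case "positive":
		return score >= threshold, nil
	case "negative":
		return score < threshold, nil
	default:
		return false, fmt.Errorf("unknown mode %q", mode)
	}
}

func main() {
	cases := []struct {
		name      string
		score     float64
		threshold float64
		mode      string
		want      bool
	}{
		{"positive above threshold", 0.8, 0.6, "positive", true},
		{"positive exactly at threshold", 0.6, 0.6, "positive", true},
		{"positive just below threshold", 0.59, 0.6, "positive", false},
		{"negative below threshold", 0.2, 0.6, "negative", true},
		{"negative exactly at threshold", 0.6, 0.6, "negative", false},
	}
	for _, c := range cases {
		got, err := passes(c.score, c.threshold, c.mode)
		if err != nil || got != c.want {
			fmt.Printf("FAIL %s: got %v, err %v\n", c.name, got, err)
			continue
		}
		fmt.Printf("ok   %s\n", c.name)
	}
}
```

The case worth pinning down is score == threshold: under the rule as written it passes in positive mode and fails in negative mode, so the two modes never both pass for the same score.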

Metadata

Labels

  • enhancement: New feature or request
  • go: Pull requests that update go code
  • priority:p0: Blocking release
  • squad:linus: Assigned to Linus (Backend Developer)
