
Adding Skill Dependencies with Version Validation + Testing Specification (useful for enterprise adopters) #110

@AndoSan84

## A Little Bit of Context
I am pushing my company to adopt skills, and this looks like a must-have for enterprise clients: any software engineer who has worked with Maven, npm, or similar tools will ask for something like it.
Everything needed to adopt this change request, including the code for the validator, is in the repository linked under Reference Implementation below.

Let me know if you like the idea. I was going to implement this for myself anyway and felt like sharing it.

Summary

This RFC proposes two optional additions to the Agent Skills specification:

  1. requires — Declare dependencies between skills with version validation
  2. test — Declare test cases for skill validation

I've seen related discussions in #90 (skill relationships) and #46 (versioning). This proposal takes a complementary approach focused on tooling-side validation rather than agent-side hints, plus adds testing which hasn't been addressed yet.

Relationship to Existing Proposals

| Aspect | #90 (Relationships) | This RFC |
| --- | --- | --- |
| Dependencies | Soft hints for agent | Hard validation at submission |
| Versioning | Not addressed | Minimum version check |
| Validation | Agent decides | Tooling enforces |
| Testing | Not addressed | ✅ Full test specification |

The key difference: #90 helps agents reason about skill combinations. This RFC helps tooling catch problems before skills reach agents.

Both can coexist — prerequisite-skills for agent hints, requires for tooling validation.

The Problems

1. Silent breakage

Skill A assumes skill B exists. Someone modifies B. A breaks at runtime with no warning.

2. Version drift

Skill A was developed with B@1.2.0. Environment has B@1.0.0. Subtle bugs emerge.

3. No standard testing

Every organization invents its own approach, so skills can't be validated portably.

Proposed Solution

requires field

---
name: integration-test-runner
description: Runs integration tests on specified environment.

metadata:
  version: "2.1.0"

requires:
  - skill: environment-selector
    version: "1.2.0"
  - skill: logging-standards
---

Validation (at submission time):

  • environment-selector must exist
  • Its version must be >= 1.2.0
  • Circular dependencies are rejected

Agents require no changes. They never see requires — tooling validates before skills become available.
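
The validation rules above (dependency exists, version meets the minimum) could be sketched roughly as follows. This is an illustration only, not the code from the reference implementation: the `Skill` class, `parse_version`, and `validate_requires` are hypothetical names, and it assumes plain dotted numeric versions such as "1.2.0" with no pre-release tags.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Skill:
    """Hypothetical in-memory view of a skill's frontmatter."""
    name: str
    version: str | None = None                 # metadata.version, if declared
    requires: list[dict] = field(default_factory=list)


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '1.2.0' into (1, 2, 0) so versions compare numerically.

    Assumption: versions are dotted integers only.
    """
    return tuple(int(part) for part in text.split("."))


def validate_requires(skill: Skill, available: dict[str, Skill]) -> list[str]:
    """Return human-readable problems; an empty list means the skill is valid."""
    problems = []
    for dep in skill.requires:
        name = dep["skill"]
        minimum = dep.get("version")           # optional, as for logging-standards
        found = available.get(name)
        if found is None:
            problems.append(f"{skill.name}: required skill '{name}' not found")
            continue
        if minimum is None:
            continue                           # existence is the only requirement
        if found.version is None:
            problems.append(
                f"{skill.name}: '{name}' declares no version, need >= {minimum}"
            )
        elif parse_version(found.version) < parse_version(minimum):
            problems.append(
                f"{skill.name}: '{name}' is {found.version}, need >= {minimum}"
            )
    return problems


# The version-drift scenario: the environment only has 1.0.0.
available = {
    "environment-selector": Skill("environment-selector", version="1.0.0"),
    "logging-standards": Skill("logging-standards"),
}
runner = Skill(
    "integration-test-runner",
    version="2.1.0",
    requires=[
        {"skill": "environment-selector", "version": "1.2.0"},
        {"skill": "logging-standards"},
    ],
)
print(validate_requires(runner, available))
# ["integration-test-runner: 'environment-selector' is 1.0.0, need >= 1.2.0"]
```

Returning a list of problems rather than raising on the first one lets tooling report every broken dependency in a single run.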

test field

---
name: environment-selector
description: Selects deployment environment.

test:
  cases: test/cases.yaml
---

# test/cases.yaml
cases:
  - name: select_dev
    input: "Select DEV environment"
    assertions:
      output_contains:
        - "DEV"
      output_not_contains:
        - "PROD"
        - "error"

  - name: require_prod_confirmation
    input: "Select PROD environment"
    assertions:
      output_contains:
        - "confirm"
      semantic_match:
        criterion: "Response asks for explicit confirmation before production deployment"

Assertions are simple and agent-agnostic:

  • output_contains / output_not_contains — string matching
  • output_matches — regex
  • semantic_match — LLM judge for fuzzy criteria

Tests are deterministic. No retry logic. If a well-written criterion fails, either the skill is broken or the test needs fixing.
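
To make the assertion semantics concrete, here is one way a runner might evaluate a single case. It is a sketch under assumptions, not the reference implementation: `check_assertions` and the `judge` callback are hypothetical, string matching is assumed to be case-sensitive, and `output_matches` is assumed to hold a single regex string. The LLM judge is passed in as a plain function so the sketch stays agent-agnostic.

```python
from __future__ import annotations

import re
from typing import Callable, Optional

# Hypothetical judge signature: (output, criterion) -> True if the criterion holds.
Judge = Callable[[str, str], bool]


def check_assertions(
    output: str, assertions: dict, judge: Optional[Judge] = None
) -> list[str]:
    """Return a list of failed assertions; an empty list means the case passed."""
    failures = []

    for needle in assertions.get("output_contains", []):
        if needle not in output:
            failures.append(f"output_contains: missing {needle!r}")

    for needle in assertions.get("output_not_contains", []):
        if needle in output:
            failures.append(f"output_not_contains: found {needle!r}")

    pattern = assertions.get("output_matches")
    if pattern is not None and re.search(pattern, output) is None:
        failures.append(f"output_matches: no match for {pattern!r}")

    semantic = assertions.get("semantic_match")
    if semantic is not None:
        if judge is None:
            # Without a judge the criterion cannot be evaluated, so fail loudly.
            failures.append("semantic_match: no judge configured")
        elif not judge(output, semantic["criterion"]):
            failures.append(f"semantic_match: {semantic['criterion']!r} not met")

    return failures


# The select_dev case from test/cases.yaml, against a made-up skill output.
select_dev = {"output_contains": ["DEV"], "output_not_contains": ["PROD", "error"]}
print(check_assertions("Selected DEV environment.", select_dev))  # []
```

There is no retry loop here, in line with the "no retry logic" rule: each assertion is evaluated once and every failure is reported.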

Design Decisions

| Decision | Rationale |
| --- | --- |
| Tooling validates, not agents | Zero changes to existing agents |
| Version = minimum required | Snapshot of "tested with", allows newer |
| No lockfiles | Git handles version control |
| No namespaces | Simple name lookup in skills directory |
| No optional dependencies | If it's optional, it's not a dependency |
| Deterministic tests | Flaky tests get disabled; clear criteria pass consistently |

Reference Implementation

I've built a Python tool that extends skills-ref:

# Validate dependencies and versions
skills-ref validate ./my-skill --skills-root ./skills

# Initialize skill with dependency snapshot
skills-ref init ./new-skill --skills-root ./skills

# Run tests
skills-ref test ./my-skill

# Check for circular dependencies
skills-ref deps --check-circular

# Visualize dependency graph
skills-ref deps --graph

Repository: https://github.com/AndoSan84/scalble_skills
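
For readers wondering what a circular-dependency check involves, a depth-first search over the `requires` graph is enough. The sketch below is illustrative and makes no claim about how `skills-ref deps --check-circular` is implemented; `find_cycle` is a hypothetical helper, and the graph is assumed to map each skill name to the names it requires.

```python
from __future__ import annotations


def find_cycle(graph: dict[str, list[str]]) -> list[str] | None:
    """Return one dependency cycle as a list of names, or None if acyclic."""
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {name: UNSEEN for name in graph}
    path: list[str] = []                      # skills on the current DFS path

    def visit(name: str) -> list[str] | None:
        state[name] = IN_PROGRESS
        path.append(name)
        for dep in graph.get(name, []):
            dep_state = state.get(dep, UNSEEN)
            if dep_state == IN_PROGRESS:
                # dep is already on the current path, so we have looped back.
                return path[path.index(dep):] + [dep]
            if dep_state == UNSEEN:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[name] = DONE
        return None

    for name in graph:
        if state[name] == UNSEEN:
            cycle = visit(name)
            if cycle:
                return cycle
    return None


print(find_cycle({"a": ["b"], "b": ["c"], "c": ["a"]}))   # ['a', 'b', 'c', 'a']
print(find_cycle({
    "integration-test-runner": ["environment-selector", "logging-standards"],
    "environment-selector": [],
    "logging-standards": [],
}))                                                        # None
```

Returning the offending path, rather than a plain boolean, gives the skill author something actionable in the error message.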

Full RFC Document

Complete specification with schema definitions, validation semantics, security considerations, and migration guide:

📄 RFC-001-dependencies-testing.md

Questions

  1. Does this complement #90 (Proposal: Skill Relationship Fields, adding prerequisite_skills and related_skills to the SKILL.md spec) well, or should we merge the approaches?
  2. Is the testing spec useful? Any concerns about semantic_match?
  3. PR to skills-ref or separate package?

Thanks for considering!
