design: semantic skill discovery via description embedding instead of keyword extraction #355

@Aaronontheweb

Problem

Skill auto-loading uses keyword token matching against LLM-enriched keywords
extracted from the skill body. This fails when:

  1. Skill content is operator-facing — the body says "check the daemon log for
    structured patterns" but the user says "something is wrong with my session."
    Zero token overlap.

  2. Skills authored for other harnesses — Claude Code skills, Cursor rules,
    and future OpenClaw community skills have a description field in natural
    language but no Netclaw-specific metadata.triggers. When ported, the keyword
    extractor mines the procedural body and gets operator vocabulary.
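The first failure mode can be reproduced in a few lines. This is a minimal sketch that uses a plain lowercase word tokenizer as a stand-in for the real matcher (the `tokens` helper is hypothetical, not Netclaw's implementation); the two strings are the ones quoted above.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase word tokens; a stand-in for whatever tokenizer the matcher uses."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


# Operator-facing wording from the skill body vs. what the user actually says.
skill_body = "check the daemon log for structured patterns"
user_message = "something is wrong with my session."

overlap = tokens(skill_body) & tokens(user_message)
print(overlap)  # set()
```

With nothing in the intersection, no keyword-based score can rank the skill for this message, however the keywords are weighted.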

Eval evidence

From the 0.7.5/0.7.6 eval baseline (10 runs each):

  • skill_manual (netclaw-manual): 10/10 — "Can you schedule reminders?"
    has direct token overlap with the skill's triggers
  • skill_diagnostics (netclaw-diagnostics): 0/10 — "Something is wrong with
    my session, can you diagnose it?" has no token overlap with operator-facing
    diagnostics skill content
  • skill_memory (netclaw-memory): 0/10 — "What do you remember about our
    previous conversations?" doesn't match operator-facing memory guidance
  • skill_citation (search-citation): 0/10 — "Search the web for the latest
    Akka.NET release" doesn't match citation policy terminology

Memory recall works well for the same queries because memories are written in
the user's own language. Skills are written in the operator's language.

Ecosystem Context

Most other agent harnesses use a natural-language description for skill discovery:

| Harness           | Discovery Signal                                  | Format                                   |
|-------------------|---------------------------------------------------|------------------------------------------|
| Claude Code       | description field in YAML frontmatter             | "Use when explaining how code works..."  |
| Cursor            | description field for "Apply Intelligently" mode  | Natural language                         |
| Windsurf          | Directory scoping (no semantic matching)          | N/A                                      |
| Netclaw (current) | Keyword extraction from body + metadata.triggers  | Token matching                           |

When OpenClaw opens the skill ecosystem, authors will bring skills with a
description and a procedural body. No metadata.triggers. The current keyword
approach won't match.

Proposed Direction

Make description the primary discovery signal using embedding similarity
(same approach as memory recall), not keyword token matching.

Why this works

  • description is the one field every skill ecosystem writes in natural language
    targeting "when should this activate"
  • Embedding similarity handles vocabulary gaps naturally — "something is wrong
    with my session" embeds close to "diagnose session issues" even with zero
    token overlap
  • Memory recall already proves this works at scale in the same codebase
  • No Netclaw-specific metadata required from skill authors

Approach

  1. At skill load time, compute an embedding of the description field
  2. On each user message, compute the message embedding (already done for memory
    recall)
  3. Rank skills by cosine similarity to the user message
  4. Load skills above a configurable threshold
  5. Keep keyword matching as a fallback/boost signal (not primary)
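The five steps above could look roughly like the following. This is a sketch under stated assumptions, not the proposed implementation: the `Skill` record, `select_skills`, the `0.5` threshold, the `0.05` keyword boost, and the three-dimensional toy vectors are all hypothetical, and in practice the embeddings would come from the same model memory recall already uses.

```python
import math
from dataclasses import dataclass


@dataclass
class Skill:
    name: str
    description: str
    embedding: list[float]  # step 1: computed once at load time from description
    triggers: frozenset[str] = frozenset()  # optional keywords, used only as a boost


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_skills(message_embedding, message_tokens, skills,
                  threshold=0.5, keyword_boost=0.05):
    """Steps 3-5: rank by similarity, add a small keyword boost, apply threshold."""
    scored = []
    for skill in skills:
        score = cosine(message_embedding, skill.embedding)
        if skill.triggers & message_tokens:
            score += keyword_boost  # keyword match is a secondary signal only
        if score >= threshold:
            scored.append((score, skill.name))
    return [name for _, name in sorted(scored, reverse=True)]


# Toy vectors standing in for real embeddings (made-up values).
skills = [
    Skill("diagnostics", "diagnose session issues", [0.9, 0.1, 0.0]),
    Skill("scheduling", "schedule reminders", [0.0, 1.0, 0.0],
          frozenset({"schedule"})),
]
message_embedding = [0.8, 0.2, 0.0]  # step 2: would be reused from memory recall
selected = select_skills(message_embedding,
                         {"something", "wrong", "session"}, skills)
print(selected)  # ['diagnostics']
```

Keeping the boost small relative to the threshold means a keyword hit can tip a borderline skill over the line but cannot load a semantically unrelated one by itself; the right values would need to be tuned against the eval suite.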

Interim fix

Add sample user queries to metadata.triggers for system skills so current
keyword matching covers the eval cases. This is a band-aid — the semantic
approach is the real fix.
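As a rough illustration of why the band-aid works, the sketch below compares a trigger list mined from the body with one that also carries sample user queries. The trigger strings and the loose any-shared-token `matches` rule are assumptions made for the example; the real metadata.triggers format and matching rule may differ.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase word tokens; a stand-in for the real tokenizer."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def matches(triggers: list[str], user_message: str) -> bool:
    """True if any trigger shares at least one token with the message."""
    message_tokens = tokens(user_message)
    return any(tokens(trigger) & message_tokens for trigger in triggers)


user_message = "Something is wrong with my session, can you diagnose it?"

# Hypothetical triggers: operator vocabulary mined from the skill body...
body_keywords = ["daemon log", "structured patterns"]
# ...versus the same list extended with sample user queries.
with_sample_queries = body_keywords + [
    "something is wrong with my session",
    "diagnose my session",
]

print(matches(body_keywords, user_message))        # False
print(matches(with_sample_queries, user_message))  # True
```

This only covers phrasings someone thought to list, which is why it remains a stopgap.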

Related
