Problem
Skill auto-loading uses keyword token matching against LLM-enriched keywords
extracted from the skill body. This fails when:
-
Skill content is operator-facing — the body says "check the daemon log for
structured patterns" but the user says "something is wrong with my session."
Zero token overlap.
-
Skills authored for other harnesses — Claude Code skills, Cursor rules,
and future OpenClaw community skills have a description field in natural
language but no Netclaw-specific metadata.triggers. When ported, the keyword
extractor mines the procedural body and gets operator vocabulary.
Eval evidence
From the 0.7.5/0.7.6 eval baseline (10 runs each):
skill_manual (netclaw-manual): 10/10 — "Can you schedule reminders?"
has direct token overlap with the skill's triggers
skill_diagnostics (netclaw-diagnostics): 0/10 — "Something is wrong with
my session, can you diagnose it?" has no token overlap with operator-facing
diagnostics skill content
skill_memory (netclaw-memory): 0/10 — "What do you remember about our
previous conversations?" doesn't match operator-facing memory guidance
skill_citation (search-citation): 0/10 — "Search the web for the latest
Akka.NET release" doesn't match citation policy terminology
Memory recall works well for the same queries because memories are written in
the user's own language. Skills are written in the operator's language.
Ecosystem Context
All major agent harnesses use natural language description for skill discovery:
| Harness |
Discovery Signal |
Format |
| Claude Code |
description field in YAML frontmatter |
"Use when explaining how code works..." |
| Cursor |
description field for "Apply Intelligently" mode |
Natural language |
| Windsurf |
Directory scoping (no semantic matching) |
N/A |
| Netclaw (current) |
Keyword extraction from body + metadata.triggers |
Token matching |
When OpenClaw opens the skill ecosystem, authors will bring skills with a
description and a procedural body. No metadata.triggers. The current keyword
approach won't match.
Proposed Direction
Make description the primary discovery signal using embedding similarity
(same approach as memory recall), not keyword token matching.
Why this works
description is the one field every skill ecosystem writes in natural language
targeting "when should this activate"
- Embedding similarity handles vocabulary gaps naturally — "something is wrong
with my session" embeds close to "diagnose session issues" even with zero
token overlap
- Memory recall already proves this works at scale in the same codebase
- No Netclaw-specific metadata required from skill authors
Approach
- At skill load time, compute an embedding of the
description field
- On each user message, compute the message embedding (already done for memory
recall)
- Rank skills by cosine similarity to the user message
- Load skills above a configurable threshold
- Keep keyword matching as a fallback/boost signal (not primary)
Interim fix
Add sample user queries to metadata.triggers for system skills so current
keyword matching covers the eval cases. This is a band-aid — the semantic
approach is the real fix.
Related
Problem
Skill auto-loading uses keyword token matching against LLM-enriched keywords
extracted from the skill body. This fails when:
Skill content is operator-facing — the body says "check the daemon log for
structured patterns" but the user says "something is wrong with my session."
Zero token overlap.
Skills authored for other harnesses — Claude Code skills, Cursor rules,
and future OpenClaw community skills have a
descriptionfield in naturallanguage but no Netclaw-specific
metadata.triggers. When ported, the keywordextractor mines the procedural body and gets operator vocabulary.
Eval evidence
From the 0.7.5/0.7.6 eval baseline (10 runs each):
skill_manual(netclaw-manual): 10/10 — "Can you schedule reminders?"has direct token overlap with the skill's triggers
skill_diagnostics(netclaw-diagnostics): 0/10 — "Something is wrong withmy session, can you diagnose it?" has no token overlap with operator-facing
diagnostics skill content
skill_memory(netclaw-memory): 0/10 — "What do you remember about ourprevious conversations?" doesn't match operator-facing memory guidance
skill_citation(search-citation): 0/10 — "Search the web for the latestAkka.NET release" doesn't match citation policy terminology
Memory recall works well for the same queries because memories are written in
the user's own language. Skills are written in the operator's language.
Ecosystem Context
All major agent harnesses use natural language
descriptionfor skill discovery:descriptionfield in YAML frontmatterdescriptionfield for "Apply Intelligently" modemetadata.triggersWhen OpenClaw opens the skill ecosystem, authors will bring skills with a
descriptionand a procedural body. Nometadata.triggers. The current keywordapproach won't match.
Proposed Direction
Make
descriptionthe primary discovery signal using embedding similarity(same approach as memory recall), not keyword token matching.
Why this works
descriptionis the one field every skill ecosystem writes in natural languagetargeting "when should this activate"
with my session" embeds close to "diagnose session issues" even with zero
token overlap
Approach
descriptionfieldrecall)
Interim fix
Add sample user queries to
metadata.triggersfor system skills so currentkeyword matching covers the eval cases. This is a band-aid — the semantic
approach is the real fix.
Related
SkillRegistry.MatchByKeywords, enrichment inSystemSkillSyncService.csSQLiteMemoryRecallCoordinator