design: semantic skill discovery via description embedding instead of keyword extraction

## Problem

Skill auto-loading uses keyword token matching against LLM-enriched keywords
extracted from the skill body. This fails when:

1. **Skill content is operator-facing** — the body says "check the daemon log for
   structured patterns" but the user says "something is wrong with my session."
   Zero token overlap.

2. **Skills authored for other harnesses** — Claude Code skills, Cursor rules,
   and future OpenClaw community skills have a `description` field in natural
   language but no Netclaw-specific `metadata.triggers`. When ported, the keyword
   extractor mines the procedural body and gets operator vocabulary.

### Eval evidence

From the 0.7.5/0.7.6 eval baseline (10 runs each):

- `skill_manual` (netclaw-manual): **10/10** — "Can you schedule reminders?"
  has direct token overlap with the skill's triggers
- `skill_diagnostics` (netclaw-diagnostics): **0/10** — "Something is wrong with
  my session, can you diagnose it?" has no token overlap with operator-facing
  diagnostics skill content
- `skill_memory` (netclaw-memory): **0/10** — "What do you remember about our
  previous conversations?" doesn't match operator-facing memory guidance
- `skill_citation` (search-citation): **0/10** — "Search the web for the latest
  Akka.NET release" doesn't match citation policy terminology

Memory recall works well for the same queries because memories are written in
the user's own language. Skills are written in the operator's language.

## Ecosystem Context

All major agent harnesses use natural language `description` for skill discovery:

| Harness | Discovery Signal | Format |
|---------|-----------------|--------|
| Claude Code | `description` field in YAML frontmatter | "Use when explaining how code works..." |
| Cursor | `description` field for "Apply Intelligently" mode | Natural language |
| Windsurf | Directory scoping (no semantic matching) | N/A |
| Netclaw (current) | Keyword extraction from body + `metadata.triggers` | Token matching |

When OpenClaw opens the skill ecosystem, authors will bring skills with a
`description` and a procedural body. No `metadata.triggers`. The current keyword
approach won't match.

## Proposed Direction

Make `description` the primary discovery signal using **embedding similarity**
(same approach as memory recall), not keyword token matching.

### Why this works

- `description` is the one field every skill ecosystem writes in natural language
  targeting "when should this activate"
- Embedding similarity handles vocabulary gaps naturally — "something is wrong
  with my session" embeds close to "diagnose session issues" even with zero
  token overlap
- Memory recall already proves this works at scale in the same codebase
- No Netclaw-specific metadata required from skill authors

### Approach

1. At skill load time, compute an embedding of the `description` field
2. On each user message, compute the message embedding (already done for memory
   recall)
3. Rank skills by cosine similarity to the user message
4. Load skills above a configurable threshold
5. Keep keyword matching as a fallback/boost signal (not primary)

### Interim fix

Add sample user queries to `metadata.triggers` for system skills so current
keyword matching covers the eval cases. This is a band-aid — the semantic
approach is the real fix.

## Related

- PR #347 — eval suite (surfaced this gap)
- #352 — command-level approval gates (related security posture work)
- Current keyword matching: `SkillRegistry.MatchByKeywords`, enrichment in
  `SystemSkillSyncService.cs`
- Memory recall (reference implementation): `SQLiteMemoryRecallCoordinator`

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

design: semantic skill discovery via description embedding instead of keyword extraction #355

Problem

Eval evidence

Ecosystem Context

Proposed Direction

Why this works

Approach

Interim fix

Related

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Harness	Discovery Signal	Format
Claude Code	`description` field in YAML frontmatter	"Use when explaining how code works..."
Cursor	`description` field for "Apply Intelligently" mode	Natural language
Windsurf	Directory scoping (no semantic matching)	N/A
Netclaw (current)	Keyword extraction from body + `metadata.triggers`	Token matching

design: semantic skill discovery via description embedding instead of keyword extraction #355

Description

Problem

Eval evidence

Ecosystem Context

Proposed Direction

Why this works

Approach

Interim fix

Related

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions