Feature: Semantic Skill Retrieval with SQLite FTS5

# Feature: Semantic Skill Retrieval with SQLite FTS5

**Problem:** Skills system doesn't scale — 167 skills × 4500 tokens/turn = massive waste

## Current State

```
Every turn injects ALL skills into system prompt:
- 167 skills indexed
- 17,943 characters
- ~4,485 tokens per turn
- Cost: ~$40/month at 3k turns
- Relevant skills per turn: 0-3 (98% waste)
```

Memory system has strict caps (MEMORY.md: 3000 chars, USER.md: 2500 chars) to control token cost, but skills get broadcast unfiltered every turn.

## Architectural Inconsistency

The codebase ALREADY does semantic retrieval for:
- `session_search()` — FTS5 across past conversations
- `mem0_search()` — vector search for facts
- Issue #13332 proposes hybrid search for **tools** (BM25 + embeddings)

But skills still use batch broadcast from when there were 20 skills, not 167.

## Proposed Solution

**SQLite-backed skill index with FTS5 semantic search** (same pattern as SessionDB):

```python
# ~/.hermes/skills.db

CREATE TABLE skills (
    name TEXT PRIMARY KEY,
    description TEXT,
    category TEXT,
    tags TEXT,  -- JSON
    path TEXT,
    use_count INTEGER DEFAULT 0,
    last_used INTEGER
);

CREATE VIRTUAL TABLE skills_fts USING fts5(
    name, description, tags, content='skills'
);
```

### Workflow

```python
# On skill sync: populate DB (one-time cost)
sync_skills_to_db(~/.hermes/skills/, db)

# Per turn: build compact skill index
def build_skills_system_prompt(user_message=""):
    # Top 10 by usage (proven useful)
    pinned = db.get_top_by_usage(limit=10)
    
    # Top 5 by FTS5 match to current message
    candidates = db.search(user_message, limit=5) if user_message else []
    
    # Result: ~15 skills max
    return format_index(pinned + candidates)
```

### Impact

| Metric | Before | After | Savings |
|--------|--------|-------|---------|
| Skills injected | 167 | 15 | 91% |
| Tokens/turn | 4,485 | ~200 | 95% |
| Cost/month (3k turns) | $40.50 | $1.80 | $38.70 |

## Implementation Notes

**Infrastructure already exists:**
- SQLite + FTS5 experience (SessionDB)
- Skill sync mechanism (tools/skills_sync.py)
- Usage tracking (Skill Topography telemetry)

**Code changes needed:**
1. `skill_db.py` — new module with SkillDB class
2. `agent/prompt_builder.py` — replace `build_skills_system_prompt()` with semantic retrieval
3. `tools/skills_sync.py` — call `db.upsert_skill()` during sync
4. `tools/skills_tool.py` — call `db.record_usage()` on skill_view()

**Config:**
```yaml
skills:
  retrieval: semantic  # new: use DB search (default)
  # retrieval: broadcast  # old: inject all skills
  top_k: 15  # max skills per turn
```

## Related

- #13332 — Hybrid tool pre-selection (same problem, different namespace)
- Skill Topography system already tracks usage stats (can feed `use_count`)

## Priority

**P2 (High)** — Real cost impact. Memory is capped at 2500 chars to save tokens, but skills waste 4500 tokens/turn. Architectural debt at scale.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature: Semantic Skill Retrieval with SQLite FTS5 — Replace 4500-token broadcast with on-demand search #17649

Current State

Architectural Inconsistency

Proposed Solution

Workflow

Impact

Implementation Notes

Related

Priority

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Metric	Before	After	Savings
Skills injected	167	15	91%
Tokens/turn	4,485	~200	95%
Cost/month (3k turns)	$40.50	$1.80	$38.70

Feature: Semantic Skill Retrieval with SQLite FTS5 — Replace 4500-token broadcast with on-demand search #17649

Description

Feature: Semantic Skill Retrieval with SQLite FTS5

Current State

Architectural Inconsistency

Proposed Solution

Workflow

Impact

Implementation Notes

Related

Priority

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions