Feature: Semantic Skill Retrieval with SQLite FTS5
Problem: Skills system doesn't scale — 167 skills × 4500 tokens/turn = massive waste
Current State
Every turn injects ALL skills into system prompt:
- 167 skills indexed
- 17,943 characters
- ~4,485 tokens per turn
- Cost: ~$40/month at 3k turns
- Relevant skills per turn: 0-3 (98% waste)
Memory system has strict caps (MEMORY.md: 3000 chars, USER.md: 2500 chars) to control token cost, but skills get broadcast unfiltered every turn.
Architectural Inconsistency
The codebase ALREADY does semantic retrieval for:
But skills still use batch broadcast from when there were 20 skills, not 167.
Proposed Solution
SQLite-backed skill index with FTS5 semantic search (same pattern as SessionDB):
# ~/.hermes/skills.db
CREATE TABLE skills (
name TEXT PRIMARY KEY,
description TEXT,
category TEXT,
tags TEXT, -- JSON
path TEXT,
use_count INTEGER DEFAULT 0,
last_used INTEGER
);
CREATE VIRTUAL TABLE skills_fts USING fts5(
name, description, tags, content='skills'
);
Workflow
# On skill sync: populate DB (one-time cost)
sync_skills_to_db(~/.hermes/skills/, db)
# Per turn: build compact skill index
def build_skills_system_prompt(user_message=""):
# Top 10 by usage (proven useful)
pinned = db.get_top_by_usage(limit=10)
# Top 5 by FTS5 match to current message
candidates = db.search(user_message, limit=5) if user_message else []
# Result: ~15 skills max
return format_index(pinned + candidates)
Impact
| Metric |
Before |
After |
Savings |
| Skills injected |
167 |
15 |
91% |
| Tokens/turn |
4,485 |
~200 |
95% |
| Cost/month (3k turns) |
$40.50 |
$1.80 |
$38.70 |
Implementation Notes
Infrastructure already exists:
- SQLite + FTS5 experience (SessionDB)
- Skill sync mechanism (tools/skills_sync.py)
- Usage tracking (Skill Topography telemetry)
Code changes needed:
skill_db.py — new module with SkillDB class
agent/prompt_builder.py — replace build_skills_system_prompt() with semantic retrieval
tools/skills_sync.py — call db.upsert_skill() during sync
tools/skills_tool.py — call db.record_usage() on skill_view()
Config:
skills:
retrieval: semantic # new: use DB search (default)
# retrieval: broadcast # old: inject all skills
top_k: 15 # max skills per turn
Related
Priority
P2 (High) — Real cost impact. Memory is capped at 2500 chars to save tokens, but skills waste 4500 tokens/turn. Architectural debt at scale.
Feature: Semantic Skill Retrieval with SQLite FTS5
Problem: Skills system doesn't scale — 167 skills × 4500 tokens/turn = massive waste
Current State
Memory system has strict caps (MEMORY.md: 3000 chars, USER.md: 2500 chars) to control token cost, but skills get broadcast unfiltered every turn.
Architectural Inconsistency
The codebase ALREADY does semantic retrieval for:
session_search()— FTS5 across past conversationsmem0_search()— vector search for factsBut skills still use batch broadcast from when there were 20 skills, not 167.
Proposed Solution
SQLite-backed skill index with FTS5 semantic search (same pattern as SessionDB):
Workflow
Impact
Implementation Notes
Infrastructure already exists:
Code changes needed:
skill_db.py— new module with SkillDB classagent/prompt_builder.py— replacebuild_skills_system_prompt()with semantic retrievaltools/skills_sync.py— calldb.upsert_skill()during synctools/skills_tool.py— calldb.record_usage()on skill_view()Config:
Related
use_count)Priority
P2 (High) — Real cost impact. Memory is capped at 2500 chars to save tokens, but skills waste 4500 tokens/turn. Architectural debt at scale.