Skip to content

Feature: Semantic Skill Retrieval with SQLite FTS5 — Replace 4500-token broadcast with on-demand search #17649

@rustyorb

Description

@rustyorb

Feature: Semantic Skill Retrieval with SQLite FTS5

Problem: Skills system doesn't scale — 167 skills × 4500 tokens/turn = massive waste

Current State

Every turn injects ALL skills into system prompt:
- 167 skills indexed
- 17,943 characters
- ~4,485 tokens per turn
- Cost: ~$40/month at 3k turns
- Relevant skills per turn: 0-3 (98% waste)

Memory system has strict caps (MEMORY.md: 3000 chars, USER.md: 2500 chars) to control token cost, but skills get broadcast unfiltered every turn.

Architectural Inconsistency

The codebase ALREADY does semantic retrieval for:

But skills still use batch broadcast from when there were 20 skills, not 167.

Proposed Solution

SQLite-backed skill index with FTS5 semantic search (same pattern as SessionDB):

# ~/.hermes/skills.db

CREATE TABLE skills (
    name TEXT PRIMARY KEY,
    description TEXT,
    category TEXT,
    tags TEXT,  -- JSON
    path TEXT,
    use_count INTEGER DEFAULT 0,
    last_used INTEGER
);

CREATE VIRTUAL TABLE skills_fts USING fts5(
    name, description, tags, content='skills'
);

Workflow

# On skill sync: populate DB (one-time cost)
sync_skills_to_db(~/.hermes/skills/, db)

# Per turn: build compact skill index
def build_skills_system_prompt(user_message=""):
    # Top 10 by usage (proven useful)
    pinned = db.get_top_by_usage(limit=10)
    
    # Top 5 by FTS5 match to current message
    candidates = db.search(user_message, limit=5) if user_message else []
    
    # Result: ~15 skills max
    return format_index(pinned + candidates)

Impact

Metric Before After Savings
Skills injected 167 15 91%
Tokens/turn 4,485 ~200 95%
Cost/month (3k turns) $40.50 $1.80 $38.70

Implementation Notes

Infrastructure already exists:

  • SQLite + FTS5 experience (SessionDB)
  • Skill sync mechanism (tools/skills_sync.py)
  • Usage tracking (Skill Topography telemetry)

Code changes needed:

  1. skill_db.py — new module with SkillDB class
  2. agent/prompt_builder.py — replace build_skills_system_prompt() with semantic retrieval
  3. tools/skills_sync.py — call db.upsert_skill() during sync
  4. tools/skills_tool.py — call db.record_usage() on skill_view()

Config:

skills:
  retrieval: semantic  # new: use DB search (default)
  # retrieval: broadcast  # old: inject all skills
  top_k: 15  # max skills per turn

Related

Priority

P2 (High) — Real cost impact. Memory is capped at 2500 chars to save tokens, but skills waste 4500 tokens/turn. Architectural debt at scale.

Metadata

Metadata

Assignees

No one assigned

    Labels

    P3Low — cosmetic, nice to havecomp/agentCore agent loop, run_agent.py, prompt buildertool/skillsSkills system (list, view, manage)type/featureNew feature or request

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions