Skip to content

Feature: Per-Agent SQLite Database Tool — Persistent Queryable Structured Storage (inspired by Gobii) #485

@teknium1

Description

@teknium1

Overview

Gobii (GitHub, MIT) is an open-source AI agent platform built on browser-use. One of its most compelling features is a per-agent private SQLite database — every agent gets an embedded, persistent SQL database as a first-class tool. Agents can create tables, insert data, run queries, and build structured datasets across sessions.

This is fundamentally different from Hermes's current memory system (flat text notes in MEMORY.md) and from #346 (Structured Memory with typed nodes and graph edges). The SQLite tool is a data workspace — it enables structured data collection, tracking, analysis, and reporting workflows that are currently awkward or impossible with flat-text memory.

Think of it as giving the agent a spreadsheet on steroids: collect web scraping results into tables, track metrics over time, maintain inventories, aggregate research findings, build datasets — all queryable with SQL across sessions.


Research Findings

How Gobii's SQLite State Works

Gobii's implementation (api/agent/tools/sqlite_state.py) uses a Restore-Operate-Persist lifecycle:

  1. Restore: Downloads a compressed .db.zst file from object storage at session start, decompresses via subprocess
  2. Operate: Yields the temp .db path for tool use during the session
  3. Maintenance: Drops ephemeral tables, runs VACUUM + PRAGMA optimize
  4. Persist: Re-compresses with zstd level 3, re-uploads (databases >100MB are wiped)

Key Design Details:

  • Schema-aware prompting: get_sqlite_schema_prompt() builds LLM-readable schema summaries with row counts, data samples, and column type detection (JSON, CSV, emails, URLs, dates within text columns). Capped at 30KB, max 25 tables.
  • 30+ custom SQL functions registered for agent convenience:
    • Text: REGEXP, regexp_extract, grep_context, word_count
    • CSV: csv_parse, csv_column, csv_headers
    • Cleaning: html_to_text, clean_text, parse_number, parse_date
    • Extraction: extract_json, extract_emails, extract_urls
    • LLM hallucination aliases: NOW(), CURDATE(), GETDATE(), LEN() (maps to SQLite equivalents)
    • Statistics: CORR, STDDEV, VARIANCE (Welford's algorithm)
  • Guardrails (sqlite_guardrails.py):
    • SQLite authorizer blocks ATTACH, DETACH, load_extension, readfile
    • 30-second query timeout via set_progress_handler
    • SQLITE_LIMIT_ATTACHED = 0 (no external DB mounting)
  • Kanban bridge: Gobii even seeds an ephemeral __kanban_cards table per turn and syncs changes back to Postgres — demonstrating the power of giving agents SQL as a thinking/state tool.

Tool exposed to LLM:

Name: sqlite_query
Params: { query: string }
Description: "Executes a single SQL statement against the agent's private SQLite memory"

Why SQL is a Better Fit Than Key-Value for Structured Data

The current memory tool stores free-text notes. For structured data workflows, this creates problems:

  • No way to query across entries (e.g., "show me all items where price > $50")
  • No aggregation (sum, count, average, group by)
  • No schema enforcement or data types
  • Appending data is easy; querying it later is hard
  • LLMs are already excellent at generating SQL

Current State in Hermes Agent

What we have:

  • memory_tool.py: File-backed ~/.hermes/memories/MEMORY.md — plain text notes with add/replace/remove operations. Great for preferences and context, bad for structured data.
  • todo_tool.py: In-memory task list, not persistent across sessions.
  • hermes_state.py (SessionDB): Global SQLite at ~/.hermes/state.db with FTS5 search — stores sessions and messages. Internal only, not agent-accessible.
  • MCP: Can connect @modelcontextprotocol/server-sqlite as an external MCP server, but this requires manual setup and doesn't get the lifecycle management, guardrails, or custom functions.

What's missing:

  • No per-agent/per-user queryable database
  • No structured data tool
  • No SQL interface for agents
  • No persistent data collection across sessions

Related issues:


Implementation Plan

Skill vs. Tool Classification

This should be a tool because:

  • Requires database lifecycle management (create, checkpoint, cleanup, size limits)
  • Needs deterministic SQL execution with precise guardrails (not "best effort" LLM interpretation)
  • Handles security-sensitive operations (SQL injection prevention, filesystem isolation, resource limits)
  • Needs custom Python integration for registering SQL functions and the SQLite authorizer API

What We'd Need

  1. SQLite database manager — Per-user persistent .db file at ~/.hermes/databases/<user_id>.db
  2. sqlite_query tool — Execute SQL statements against the user's private database
  3. Guardrail system — Authorizer to block dangerous operations, query timeout, size limits
  4. Custom SQL functions — Convenience functions LLMs commonly need (regexp, date parsing, etc.)
  5. Schema context injection — Optionally inject current schema summary into system prompt or make it queryable

Phased Rollout

Phase 1: Core SQLite Tool

  • New tools/sqlite_tool.py with sqlite_query tool
  • Per-user persistent database at ~/.hermes/databases/
  • Basic guardrails: block ATTACH/DETACH/load_extension, query timeout (30s), max DB size (50MB)
  • Schema introspection tool: sqlite_schema returns current tables, columns, row counts
  • WAL mode for concurrent read safety
  • Register common aliases: NOW()datetime('now'), LEN()length()
  • Deliverable: Agents can create tables, insert data, query across sessions

Phase 2: Custom Functions & Schema Context

  • Register 15-20 custom SQL functions: REGEXP, parse_date, extract_emails, extract_urls, html_to_text, csv_parse, word_count, statistical aggregates
  • Auto-inject schema summary into system prompt when database has tables (like Gobii's get_sqlite_schema_prompt())
  • sqlite_export tool: export query results as CSV/JSON to a file
  • Result formatting: return query results as formatted tables for LLM readability
  • Deliverable: Agents have a rich SQL environment with data processing functions

Phase 3: Multi-Database & Integration

  • Named databases: sqlite_query(database="research", query="...") for topic-scoped storage
  • Import tool: sqlite_import to load CSV/JSON files directly into tables
  • Cronjob integration: scheduled agents can accumulate data across runs in the same database
  • Optional: expose via MCP for external tool access
  • Deliverable: Full-featured data workspace

Pros & Cons

Pros

  • Unlocks structured data workflows — Data collection, tracking, analysis, and reporting become natural agent tasks
  • LLMs are great at SQL — Models generate correct SQL far more reliably than custom query DSLs
  • Persistent across sessions — Data survives session resets, unlike todo lists
  • Complementary to memory — Memory stores context/preferences; SQLite stores structured datasets
  • Low dependency cost — SQLite is in Python's stdlib (sqlite3), zero external dependencies
  • Battle-tested pattern — Gobii has proven this works well in production with real agents
  • Enables new use cases — Price tracking, competitor monitoring, research aggregation, personal CRM, reading lists, etc.

Cons / Risks

  • Disk usage — Databases grow over time. Need size limits and cleanup policies.
  • SQL injection via prompts — Indirect prompt injection could craft malicious SQL. The authorizer guardrail mitigates this, but it's a real threat surface.
  • Schema drift — Agents may create messy schemas over many sessions. Schema introspection helps but doesn't prevent bloat.
  • Complexity — Another persistent state system alongside memory. Need clear guidance on when to use which.
  • Query result size — Large result sets could blow up context. Need row limits and truncation.

Open Questions

  1. One database per user or per purpose? Gobii uses one per agent. For Hermes (single-user, many sessions), one per user seems right for Phase 1, with named databases in Phase 3.
  2. Should schema be auto-injected into the system prompt? Pro: agent always knows what data it has. Con: uses context tokens. Could be opt-in or triggered by keyword detection.
  3. Size limits? Gobii caps at 100MB. For local Hermes, 50MB default seems reasonable with configurable override.
  4. Should this be enabled by default? Or gated behind a config flag? Default-on seems right since it's zero-dependency.
  5. Interaction with memory tool — Should we document when to use memory vs SQLite? Memory = preferences, context, notes. SQLite = structured data, collections, tracking.

References

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions