| 1 | +# KG Triple Extraction |
| 2 | + |
| 3 | +LLM-based extraction of typed relationship facts from drawer content |
| 4 | +into the AGE knowledge graph. Complements the existing regex |
| 5 | +`MENTIONS` extractor with structured `(subject)-[:RELATION]->(object)` |
| 6 | +triples that enable temporal queries, dependency maps, and |
| 7 | +relationship-aware graph search. |
| 8 | + |
| 9 | +Spec: [`docs/specs/kg-triple-extraction.md`](specs/kg-triple-extraction.md). |
| 10 | + |
| 11 | +## Architecture |
| 12 | + |
| 13 | +``` |
| 14 | +┌──────────────────────┐ |
| 15 | +│ Drawer write path │ PostgresCollection._insert_rows() |
| 16 | +│ (kg_writethrough) │ ├─ regex → MENTIONS edges (50ms, inline) |
| 17 | +│ │ └─ enqueue drawer_id → extraction queue (1ms) |
| 18 | +└──────────────────────┘ |
| 19 | + │ |
| 20 | + ▼ |
| 21 | +┌──────────────────────┐ mempalace_kg_extraction_queue |
| 22 | +│ Queue table │ (drawer_id, queued_at, started_at, |
| 23 | +│ │ completed_at, error) |
| 24 | +└──────────────────────┘ |
| 25 | + │ |
| 26 | + ▼ |
| 27 | +┌──────────────────────┐ asyncio + semaphore(N) |
| 28 | +│ Worker (systemd) │ ├─ UPDATE ... SKIP LOCKED claim |
| 29 | +│ kg_triple_worker.py │ ├─ POST to llama-server |
| 30 | +│ │ ├─ parse JSON triples |
| 31 | +│ │ └─ kg.add_triple() per fact |
| 32 | +└──────────────────────┘ |
| 33 | + │ |
| 34 | + ▼ |
| 35 | +┌──────────────────────┐ |
| 36 | +│ AGE knowledge graph │ (Entity)-[:RELATION {confidence}]->(Entity) |
| 37 | +└──────────────────────┘ |
| 38 | +``` |
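
The worker stage in the diagram bounds concurrency with `asyncio` and a semaphore. A minimal sketch of that pattern follows, assuming a hypothetical `extract` coroutine that stands in for the llama-server POST. The real worker's function names are not shown in this document, so `process_all` and `extract` are illustrative only.

```python
import asyncio


async def process_all(drawer_ids, extract, max_in_flight):
    """Run `extract` over every drawer with at most `max_in_flight` at once.

    `extract` is a hypothetical coroutine (drawer_id -> result); in the
    worker it would wrap the HTTP request to llama-server.
    """
    semaphore = asyncio.Semaphore(max_in_flight)

    async def bounded(drawer_id):
        # Only `max_in_flight` coroutines can hold the semaphore at a time;
        # the rest wait here until a slot is released.
        async with semaphore:
            return await extract(drawer_id)

    # gather() preserves input order in its result list.
    return await asyncio.gather(*(bounded(d) for d in drawer_ids))
```

Setting `max_in_flight` above the server's slot count would only move the waiting from the client to the server, which matches the `--workers` note in the Tuning table.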
| 39 | + |
| 40 | +Two key invariants: |
| 41 | + |
| 42 | +- **Idempotent.** `add_triple` uses `MERGE`, so re-processing a drawer |
| 43 | + creates no duplicate edges. SIGTERM and SIGKILL are both safe for the graph; SIGKILL can strand queue claims (see Common failures). |
| 44 | +- **Resumable.** The queue table is the cursor. Stop and restart the |
| 45 | + worker; it resumes with the unclaimed rows, and `SKIP LOCKED` keeps concurrent workers off the same row. |
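
The `UPDATE ... SKIP LOCKED` shorthand used in this document is usually written in PostgreSQL as an `UPDATE` whose row set comes from a `SELECT ... FOR UPDATE SKIP LOCKED` subquery. The sketch below shows one way to express the claim, assuming the table and column names from the diagram and any DB-API cursor (for example psycopg2). `CLAIM_SQL` and `claim_batch` are hypothetical names, and the worker's actual statement may differ.

```python
# Hypothetical claim statement; column names follow the queue table
# shown in the architecture diagram.
CLAIM_SQL = """
UPDATE mempalace_kg_extraction_queue
   SET started_at = NOW()
 WHERE drawer_id IN (
       SELECT drawer_id
         FROM mempalace_kg_extraction_queue
        WHERE started_at IS NULL
          AND completed_at IS NULL
        ORDER BY queued_at
        LIMIT %s
          FOR UPDATE SKIP LOCKED)
RETURNING drawer_id
"""


def claim_batch(cursor, batch_size):
    """Claim up to `batch_size` unstarted rows and return their drawer ids.

    Rows locked by another transaction are skipped rather than waited on,
    so concurrent workers receive disjoint batches.
    """
    cursor.execute(CLAIM_SQL, (batch_size,))
    return [row[0] for row in cursor.fetchall()]
```

The claim only takes effect once the surrounding transaction commits, so the caller is assumed to commit right after `claim_batch` returns.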
| 46 | + |
| 47 | +## Install |
| 48 | + |
| 49 | +On familiar (where the worker and llama-server both run): |
| 50 | + |
| 51 | +```bash |
| 52 | +# 1. install the package |
| 53 | +cd /opt/mempalace |
| 54 | +sudo -u jp pip install -e . |
| 55 | + |
| 56 | +# 2. create the env file |
| 57 | +sudo install -d -m 0750 -o root -g jp /etc/mempalace |
| 58 | +sudo install -m 0640 -o root -g jp \ |
| 59 | + deploy/systemd/kg-extract.env.example /etc/mempalace/kg-extract.env |
| 60 | +sudo editor /etc/mempalace/kg-extract.env # set MEMPALACE_POSTGRES_DSN |
| 61 | + |
| 62 | +# 3. install + enable the unit (llama-server-extractor must be up first) |
| 63 | +sudo install -m 0644 deploy/systemd/mempalace-kg-extract.service \ |
| 64 | + /etc/systemd/system/ |
| 65 | +sudo systemctl daemon-reload |
| 66 | +sudo systemctl enable --now mempalace-kg-extract.service |
| 67 | +``` |
| 68 | + |
| 69 | +The unit declares `Wants=llama-server-extractor.service`, so starting it |
| 70 | +also starts llama-server if it isn't already running. |
| 71 | + |
| 72 | +## Backfill the existing palace |
| 73 | + |
| 74 | +The writethrough hook only enqueues drawers written *after* it was |
| 75 | +installed. For the existing 364K drawers, run the backfill driver: |
| 76 | + |
| 77 | +```bash |
| 78 | +# default — 24 in-flight workers, batch of 100 |
| 79 | +python scripts/backfill_kg_triples.py |
| 80 | + |
| 81 | +# custom tuning |
| 82 | +python scripts/backfill_kg_triples.py --workers 16 --batch-size 50 --poll-interval 30 |
| 83 | +``` |
| 84 | + |
| 85 | +The driver: |
| 86 | + |
| 87 | +- Wraps `mempalace-kg-extract --backfill --workers N --batch-size N`. |
| 88 | +- Emits one-line progress every 60s: |
| 89 | + ``` |
| 90 | + drawers_completed=12345 in_flight=7 pending=350000 rate=24.6/min errors=12 eta=10.2d elapsed=1800s |
| 91 | + ``` |
| 92 | +- Releases in-flight queue rows on SIGTERM so a restart re-claims them. |
| 93 | +- Is resumable — the queue table itself is the cursor. Kill and re-run. |
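
The progress line is a flat list of `key=value` tokens, so it is easy to scrape for dashboards or alerts. A small sketch, assuming the format stays exactly as in the sample line above; `parse_progress` is a hypothetical helper, not part of the driver.

```python
def parse_progress(line):
    """Split a driver progress line into a dict of raw string fields."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:  # ignore any token that is not key=value
            fields[key] = value
    return fields


sample = (
    "drawers_completed=12345 in_flight=7 pending=350000 "
    "rate=24.6/min errors=12 eta=10.2d elapsed=1800s"
)
progress = parse_progress(sample)
pending = int(progress["pending"])  # counts parse directly as integers
```

Values such as `rate` and `eta` keep their unit suffixes (`/min`, `d`), so strip those before converting to numbers.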
| 94 | + |
| 95 | +For true CPU parallelism, run several processes side by side. The queue |
| 96 | +claim uses `UPDATE ... SKIP LOCKED`, so they never claim the same row: |
| 97 | + |
| 98 | +```bash |
| 99 | +# four parallel backfill processes |
| 100 | +for i in 1 2 3 4; do |
| 101 | + python scripts/backfill_kg_triples.py --workers 8 \ |
| 102 | + > /var/log/mempalace/backfill-$i.log 2>&1 & |
| 103 | +done |
| 104 | +``` |
| 105 | + |
| 106 | +At ~25 drawers/min per process, 364K drawers take ~10 days with a |
| 107 | +single 24-worker process, or ~2.5 days with four parallel processes if throughput scales linearly (all processes share one llama-server, so it may not). |
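
The estimate can be reproduced with a few lines of arithmetic. This sketch treats ~25 drawers/min as the aggregate rate of one driver process (consistent with the `rate=24.6/min` sample line) and assumes, optimistically, that four processes scale linearly; `eta_days` is a hypothetical helper.

```python
MINUTES_PER_DAY = 24 * 60


def eta_days(pending_drawers, drawers_per_minute):
    """Days needed to drain the queue at a constant rate."""
    return pending_drawers / drawers_per_minute / MINUTES_PER_DAY


single = eta_days(364_000, 25)     # one process at ~25 drawers/min
quad = eta_days(364_000, 4 * 25)   # four processes, linear scaling assumed
print(f"{single:.1f} days, {quad:.1f} days")  # prints "10.1 days, 2.5 days"
```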
| 108 | + |
| 109 | +## Observability |
| 110 | + |
| 111 | +### Worker status |
| 112 | + |
| 113 | +```bash |
| 114 | +mempalace-kg-extract --status |
| 115 | +``` |
| 116 | + |
| 117 | +Prints queue depth, in-flight, completed today, errors, and recent |
| 118 | +throughput in plain text. |
| 119 | + |
| 120 | +### Daemon endpoint |
| 121 | + |
| 122 | +```bash |
| 123 | +curl -H "X-API-Key: $PALACE_API_KEY" \ |
| 124 | + http://familiar.jphe.in:8085/kg-extract/status | jq |
| 125 | +``` |
| 126 | + |
| 127 | +Returns JSON — see `scratch/kg-extract/palace-daemon-patch.md` for the |
| 128 | +shape. Pair with the existing `/backfill-age/status` endpoint for a |
| 129 | +complete picture of graph-population state. |
| 130 | + |
| 131 | +### Journal tail |
| 132 | + |
| 133 | +```bash |
| 134 | +# systemd-managed worker |
| 135 | +journalctl -u mempalace-kg-extract.service -f |
| 136 | + |
| 137 | +# backfill driver (foreground or via systemd-run) |
| 138 | +journalctl -t backfill-kg-triples -f |
| 139 | +``` |
| 140 | + |
| 141 | +### AGE-side counters |
| 142 | + |
| 143 | +```bash |
| 144 | +mempalace kg-stats | jq '.relationships' |
| 145 | +``` |
| 146 | + |
| 147 | +After a successful backfill on the full palace, expect 100K+ RELATION |
| 148 | +edges across the typical predicate set (`works_on`, `depends_on`, |
| 149 | +`migrated_from`, `lives_in`, …). Tune from there. |
| 150 | + |
| 151 | +## Common failures |
| 152 | + |
| 153 | +| Symptom | Cause | Fix | |
| 154 | +|---|---|---| |
| 155 | +| `error: connection refused` on extractor | llama-server not running | `systemctl status llama-server-extractor` and start it | |
| 156 | +| Queue stalls (no `completed_at` advancing) | Worker crashed, leaving in-flight claims | The backfill driver's SIGTERM hook releases these; after SIGKILL, release them manually: `UPDATE ... SET started_at = NULL WHERE completed_at IS NULL AND started_at < NOW() - INTERVAL '10 minutes'` | |
| 157 | +| Duplicate-looking triples | Different drawers, same fact | Expected — `MERGE` makes it idempotent at the AGE layer. Confidence is averaged across sources. | |
| 158 | +| `cuda: out of memory` in llama-server log | `--parallel` too high for VRAM | Lower llama-server `--parallel` (currently 8 on P102 10GB) | |
| 159 | +| `errors_total` climbing in status endpoint | Malformed LLM JSON output | Check `error` column on the queue table; usually a context-length overflow. Drawer-splitter (spec open question #2) addresses this. | |
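
The manual release for stalled claims can be wrapped in a small helper for cron or an ops shell. This is a sketch only: it assumes the queue table is the `mempalace_kg_extraction_queue` table named in the architecture diagram, that unfinished rows have `completed_at IS NULL`, and that the caller supplies a DB-API cursor (for example psycopg2). `RELEASE_SQL` and `release_stale_claims` are hypothetical names.

```python
# Hypothetical release statement; the completed_at guard keeps finished
# rows from being re-queued.
RELEASE_SQL = """
UPDATE mempalace_kg_extraction_queue
   SET started_at = NULL
 WHERE completed_at IS NULL
   AND started_at < NOW() - %s * INTERVAL '1 minute'
"""


def release_stale_claims(cursor, older_than_minutes=10):
    """Un-claim rows whose claim is older than the cutoff.

    Returns the number of rows released, as reported by the cursor.
    """
    cursor.execute(RELEASE_SQL, (older_than_minutes,))
    return cursor.rowcount
```

Pick a cutoff comfortably longer than the slowest expected extraction, otherwise a healthy in-flight drawer could be released and processed twice (harmless for the graph because of `MERGE`, but wasted inference).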
| 160 | + |
| 161 | +## Tuning |
| 162 | + |
| 163 | +| Knob | Default | Notes | |
| 164 | +|---|---|---| |
| 165 | +| `--workers` (driver / unit) | 24 (driver), 8 (unit) | In-flight HTTP requests to llama-server. Above llama-server's `--parallel` (8), excess requests queue at the server with no extra throughput. | |
| 166 | +| `--batch-size` | 100 (driver), 20 (unit) | Drawers claimed per dequeue round-trip. Larger = fewer round-trips but bigger claim window (more rows orphaned on SIGKILL). | |
| 167 | +| `--poll-interval` | 30 | Seconds the worker sleeps between dequeues when the queue is empty. Lower = quicker resumption after a write burst, higher = lighter DB load. | |
| 168 | +| llama-server `--parallel` | 8 | Concurrent inference slots on the P102. Bumping above 8 risks OOM at Q4. | |
| 169 | +| DB connection pool | psycopg2 default | Each worker opens 1-2 connections. With 24 in-flight workers in each of 4 backfill processes that's roughly 100 to 200 connections, at or above PostgreSQL's default `max_connections=100`, so raise the limit before running many parallel backfills. | |
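
A quick way to sanity-check the connection budget before launching parallel backfills. The sketch assumes every in-flight worker holds its connections for the life of the process, which is a worst case; `peak_connections` is a hypothetical helper.

```python
POSTGRES_DEFAULT_MAX_CONNECTIONS = 100  # stock PostgreSQL default


def peak_connections(processes, workers_per_process, conns_per_worker):
    """Worst-case number of simultaneous database connections."""
    return processes * workers_per_process * conns_per_worker


low = peak_connections(4, 24, 1)    # 96: barely under the default limit
high = peak_connections(4, 24, 2)   # 192: well over it
needs_bump = high > POSTGRES_DEFAULT_MAX_CONNECTIONS
```

With the four-process example from the backfill section (`--workers 8`), the same formula gives 32 to 64 connections, which fits within the default.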
| 170 | + |
| 171 | +## See also |
| 172 | + |
| 173 | +- Spec: [`docs/specs/kg-triple-extraction.md`](specs/kg-triple-extraction.md) |
| 174 | +- Companion: [`docs/AGE_NOTES.md`](AGE_NOTES.md) for the underlying graph |
| 175 | +- Daemon patch: [`scratch/kg-extract/palace-daemon-patch.md`](../scratch/kg-extract/palace-daemon-patch.md) |
| 176 | +- llama-server unit: [`scratch/kg-extract/llama-server-extractor.service`](../scratch/kg-extract/llama-server-extractor.service) |