
Commit 59ac0bc

feat(kg-extract): backfill driver + systemd unit + docs
Phases 4 + 5 of docs/specs/kg-triple-extraction.md.

- scripts/backfill_kg_triples.py: 24-worker driver wrapping mempalace-kg-extract --backfill. SIGTERM releases in-flight queue rows, one-line progress log every 60s, resumable via queue table cursor. SKIP LOCKED claim lets multiple processes run side-by-side.
- deploy/systemd/mempalace-kg-extract.service + kg-extract.env.example: systemd unit (NOT installed on familiar yet; that's consolidation), Wants= the llama-server-extractor unit, EnvironmentFile keeps DSN out of the unit.
- docs/kg-extraction.md: operator guide covering install, backfill, observability (worker --status, daemon endpoint, journalctl), common failures, and tuning knobs.
- tests/test_backfill_kg_triples.py: 10 tests covering progress log format (operator grep contract), SIGTERM release SQL shape, counters query, and CLI defaults match docs.

Palace-daemon /kg-extract/status endpoint patch lives at scratch/kg-extract/palace-daemon-patch.md for Sandman to land in the daemon's own PR flow.
1 parent 8d9a1be commit 59ac0bc

5 files changed

Lines changed: 870 additions & 0 deletions

deploy/systemd/kg-extract.env.example

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
1+
# MemPalace KG triple extraction worker environment.
2+
#
3+
# Install path on familiar (suggested):
4+
# sudo install -m 0640 -o root -g jp kg-extract.env /etc/mempalace/kg-extract.env
5+
#
6+
# Then enable the unit:
7+
# sudo systemctl daemon-reload
8+
# sudo systemctl enable --now mempalace-kg-extract.service
9+
#
10+
# Required.
11+
MEMPALACE_POSTGRES_DSN=postgresql://mempalace:CHANGE_ME@localhost:5433/mempalace
12+
13+
# Required — llama.cpp inference server hosting the extraction model.
14+
# When the worker runs on familiar alongside llama-server, this is
15+
# localhost; when running off-host, use http://familiar.jphe.in:11436.
16+
MEMPALACE_KG_LLM_ENDPOINT=http://localhost:11436
17+
18+
# Optional — defaults documented in docs/kg-extraction.md.
19+
# MEMPALACE_KG_LLM_MODEL=phi-4-mini
20+
# MEMPALACE_KG_MIN_CONFIDENCE=0.5
21+
# MEMPALACE_KG_MAX_TRIPLES_PER_DRAWER=10
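
The settings above could be read by a worker roughly as follows. This is a minimal sketch, not the worker's actual loader: `ExtractorConfig` and `load_config` are hypothetical names, and the fallback values simply mirror the commented-out lines in this file on the assumption that those are the defaults.

```python
import os
from dataclasses import dataclass

# Variables the env file marks as "Required".
REQUIRED = ("MEMPALACE_POSTGRES_DSN", "MEMPALACE_KG_LLM_ENDPOINT")


@dataclass(frozen=True)
class ExtractorConfig:
    """Hypothetical container for the worker's environment settings."""
    postgres_dsn: str
    llm_endpoint: str
    llm_model: str
    min_confidence: float
    max_triples_per_drawer: int


def load_config(env=None):
    """Build an ExtractorConfig from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise KeyError("missing required settings: " + ", ".join(missing))
    return ExtractorConfig(
        postgres_dsn=env["MEMPALACE_POSTGRES_DSN"],
        # Strip a trailing slash so paths can be appended consistently.
        llm_endpoint=env["MEMPALACE_KG_LLM_ENDPOINT"].rstrip("/"),
        # Assumed defaults, copied from the commented-out examples.
        llm_model=env.get("MEMPALACE_KG_LLM_MODEL", "phi-4-mini"),
        min_confidence=float(env.get("MEMPALACE_KG_MIN_CONFIDENCE", "0.5")),
        max_triples_per_drawer=int(env.get("MEMPALACE_KG_MAX_TRIPLES_PER_DRAWER", "10")),
    )
```

Failing fast on a missing DSN or endpoint keeps a misconfigured unit from looping through `Restart=on-failure` with an unhelpful error.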
deploy/systemd/mempalace-kg-extract.service

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
1+
[Unit]
2+
Description=MemPalace KG triple extraction worker
3+
Documentation=https://github.com/techempower-org/mempalace/blob/main/docs/kg-extraction.md
4+
After=network.target llama-server-extractor.service
5+
Wants=llama-server-extractor.service
6+
7+
[Service]
8+
Type=simple
9+
User=jp
10+
EnvironmentFile=/etc/mempalace/kg-extract.env
11+
ExecStart=/usr/local/bin/mempalace-kg-extract --workers 8 --batch-size 20 --poll-interval 30
12+
Restart=on-failure
13+
RestartSec=10
14+
15+
# Resource hints — extraction is HTTP-bound (waits on llama-server),
16+
# so CPU/memory budgets are modest. Tune if running multiple workers.
17+
MemoryMax=512M
18+
CPUQuota=200%
19+
20+
# Logging — journalctl is the operator's primary observability channel.
21+
# See docs/kg-extraction.md for tail commands.
22+
StandardOutput=journal
23+
StandardError=journal
24+
SyslogIdentifier=mempalace-kg-extract
25+
26+
[Install]
27+
WantedBy=multi-user.target

docs/kg-extraction.md

Lines changed: 176 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,176 @@
1+
# KG Triple Extraction
2+
3+
LLM-based extraction of typed relationship facts from drawer content
4+
into the AGE knowledge graph. Complements the existing regex
5+
`MENTIONS` extractor with structured `(subject)-[:RELATION]->(object)`
6+
triples that enable temporal queries, dependency maps, and
7+
relationship-aware graph search.
8+
9+
Spec: [`docs/specs/kg-triple-extraction.md`](specs/kg-triple-extraction.md).
10+
11+
## Architecture
12+
13+
```
14+
┌──────────────────────┐
15+
│ Drawer write path │ PostgresCollection._insert_rows()
16+
│ (kg_writethrough) │ ├─ regex → MENTIONS edges (50ms, inline)
17+
│ │ └─ enqueue drawer_id → extraction queue (1ms)
18+
└──────────────────────┘
19+
20+
21+
┌──────────────────────┐ mempalace_kg_extraction_queue
22+
│ Queue table │ (drawer_id, queued_at, started_at,
23+
│ │ completed_at, error)
24+
└──────────────────────┘
25+
26+
27+
┌──────────────────────┐ asyncio + semaphore(N)
28+
│ Worker (systemd) │ ├─ UPDATE ... SKIP LOCKED claim
29+
│ kg_triple_worker.py │ ├─ POST to llama-server
30+
│ │ ├─ parse JSON triples
31+
│ │ └─ kg.add_triple() per fact
32+
└──────────────────────┘
33+
34+
35+
┌──────────────────────┐
36+
│ AGE knowledge graph │ (Entity)-[:RELATION {confidence}]->(Entity)
37+
└──────────────────────┘
38+
```
39+
40+
Two key invariants:
41+
42+
- **Idempotent.** `add_triple` uses `MERGE`, so re-processing a drawer
43+
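adds no duplicate edges. SIGTERM and SIGKILL are both safe for graph data, though SIGKILL can orphan in-flight queue claims (see Common failures).
44+
- **Resumable.** The queue table is the cursor. Stop and restart the
45+
worker; it resumes from the rows still unclaimed, and `SKIP LOCKED` keeps concurrent workers from claiming the same row.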
46+
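
The claim step in the diagram can be sketched as below. This is an illustration under assumptions, not the SQL in `kg_triple_worker.py`: the table and column names come from the architecture diagram, while `CLAIM_SQL`, `claim_batch`, and the `ORDER BY queued_at` policy are hypothetical. In PostgreSQL, `SKIP LOCKED` belongs to the `SELECT ... FOR UPDATE` locking clause, so the `UPDATE` wraps a subquery.

```python
# Claim a batch of unstarted queue rows without blocking on rows that
# another worker has already locked.
CLAIM_SQL = """
UPDATE mempalace_kg_extraction_queue
   SET started_at = NOW()
 WHERE drawer_id IN (
        SELECT drawer_id
          FROM mempalace_kg_extraction_queue
         WHERE started_at IS NULL
           AND completed_at IS NULL
         ORDER BY queued_at
         LIMIT %s
           FOR UPDATE SKIP LOCKED
       )
RETURNING drawer_id
"""


def claim_batch(cur, batch_size):
    """Claim up to batch_size rows and return their drawer ids.

    `cur` is any DB-API cursor on a PostgreSQL connection (for example
    one from psycopg2); the caller commits the transaction.
    """
    cur.execute(CLAIM_SQL, (batch_size,))
    return [row[0] for row in cur.fetchall()]
```

Because locked rows are skipped rather than waited on, two workers issuing this statement at the same time receive disjoint batches.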
47+
## Install
48+
49+
On familiar (where the worker and llama-server both run):
50+
51+
```bash
52+
# 1. install the package
53+
cd /opt/mempalace
54+
sudo -u jp pip install -e .
55+
56+
# 2. create the env file
57+
sudo install -d -m 0750 -o root -g jp /etc/mempalace
58+
sudo install -m 0640 -o root -g jp \
59+
deploy/systemd/kg-extract.env.example /etc/mempalace/kg-extract.env
60+
sudo editor /etc/mempalace/kg-extract.env # set MEMPALACE_POSTGRES_DSN
61+
62+
# 3. install + enable the unit (llama-server-extractor must be up first)
63+
sudo install -m 0644 deploy/systemd/mempalace-kg-extract.service \
64+
/etc/systemd/system/
65+
sudo systemctl daemon-reload
66+
sudo systemctl enable --now mempalace-kg-extract.service
67+
```
68+
69+
The unit declares `Wants=llama-server-extractor.service`, so starting it
70+
also starts llama-server if it is not already running.
71+
72+
## Backfill the existing palace
73+
74+
The writethrough hook only enqueues drawers written *after* it was
75+
installed. For the existing 364K drawers, run the backfill driver:
76+
77+
```bash
78+
# default — 24 in-flight workers, batch of 100
79+
python scripts/backfill_kg_triples.py
80+
81+
# custom tuning
82+
python scripts/backfill_kg_triples.py --workers 16 --batch-size 50 --poll-interval 30
83+
```
84+
85+
The driver:
86+
87+
- Wraps `mempalace-kg-extract --backfill --workers N --batch-size N`.
88+
- Emits one-line progress every 60s:
89+
```
90+
drawers_completed=12345 in_flight=7 pending=350000 rate=24.6/min errors=12 eta=10.2d elapsed=1800s
91+
```
92+
- Releases in-flight queue rows on SIGTERM so a restart re-claims them.
93+
- Is resumable — the queue table itself is the cursor. Kill and re-run.
94+
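
Since the progress line is a grep contract, it is also easy to parse. A minimal sketch, assuming the line arrives exactly as shown above with no timestamp or log-level prefix; `parse_progress` is a hypothetical helper, not part of the driver, and it needs Python 3.9+ for `str.removesuffix`.

```python
def parse_progress(line):
    """Parse one key=value progress line into typed fields."""
    raw = dict(token.split("=", 1) for token in line.split())
    return {
        "drawers_completed": int(raw["drawers_completed"]),
        "in_flight": int(raw["in_flight"]),
        "pending": int(raw["pending"]),
        "errors": int(raw["errors"]),
        # "24.6/min" -> 24.6
        "rate_per_min": float(raw["rate"].removesuffix("/min")),
        # Kept as text ("10.2d") because the unit suffix may vary.
        "eta": raw["eta"],
        # "1800s" -> 1800
        "elapsed_s": int(raw["elapsed"].removesuffix("s")),
    }
```

For example, feeding it the sample line yields `pending == 350000` and `rate_per_min == 24.6`, which is enough to drive a simple alert on stalled throughput.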
95+
To run more than one process, rely on the queue claim: it uses `UPDATE ... SKIP LOCKED`,
96+
so multiple processes can run side by side without claiming the same rows:
97+
98+
```bash
99+
# four parallel backfill processes
100+
for i in 1 2 3 4; do
101+
python scripts/backfill_kg_triples.py --workers 8 \
102+
> /var/log/mempalace/backfill-$i.log 2>&1 &
103+
done
104+
```
105+
106+
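At ~25 drawers/min per process (the rate in the sample progress line above), 364K drawers takes ~10 days with a
107+
single 24-worker process, or ~2.5 days with four parallel processes if throughput scales linearly. Expect less than that once llama-server's `--parallel` slots are saturated (see Tuning).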
108+
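
The estimate is plain arithmetic on the observed rate. A small sketch, assuming throughput scales linearly with the number of processes, which is an upper bound rather than a measurement; `eta_days` is a hypothetical helper.

```python
MINUTES_PER_DAY = 60 * 24


def eta_days(pending, rate_per_min, processes=1):
    """Days needed to drain `pending` drawers at a per-process rate.

    Assumes perfect linear scaling across processes.
    """
    if rate_per_min <= 0 or processes < 1:
        raise ValueError("rate_per_min must be > 0 and processes >= 1")
    return pending / (rate_per_min * processes) / MINUTES_PER_DAY


single = eta_days(364_000, 25)               # one process
quad = eta_days(364_000, 25, processes=4)    # four processes, ideal scaling
```

With these inputs `single` is a little over 10 days and `quad` a little over 2.5 days, matching the figures quoted above.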
109+
## Observability
110+
111+
### Worker status
112+
113+
```bash
114+
mempalace-kg-extract --status
115+
```
116+
117+
Prints queue depth, in-flight, completed today, errors, and recent
118+
throughput in plain text.
119+
120+
### Daemon endpoint
121+
122+
```bash
123+
curl -H "X-API-Key: $PALACE_API_KEY" \
124+
http://familiar.jphe.in:8085/kg-extract/status | jq
125+
```
126+
127+
Returns JSON — see `scratch/kg-extract/palace-daemon-patch.md` for the
128+
shape. Pair with the existing `/backfill-age/status` endpoint for a
129+
complete picture of graph-population state.
130+
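
The same call from Python, using only the standard library. This is a sketch: the URL and `X-API-Key` header are taken from the `curl` example, the function names are hypothetical, and the response is returned as parsed JSON without assuming anything about its fields.

```python
import json
import urllib.request

STATUS_URL = "http://familiar.jphe.in:8085/kg-extract/status"


def build_status_request(api_key, url=STATUS_URL):
    """Build the GET request with the API key header attached."""
    return urllib.request.Request(
        url,
        headers={"X-API-Key": api_key, "Accept": "application/json"},
    )


def fetch_status(api_key, url=STATUS_URL, timeout=10.0):
    """Fetch and decode the status document; raises on HTTP errors."""
    request = build_status_request(api_key, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from the network call makes the header handling checkable without a running daemon.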
131+
### Journal tail
132+
133+
```bash
134+
# systemd-managed worker
135+
journalctl -u mempalace-kg-extract.service -f
136+
137+
# backfill driver (foreground or via systemd-run)
138+
journalctl -t backfill-kg-triples -f
139+
```
140+
141+
### AGE-side counters
142+
143+
```bash
144+
mempalace kg-stats | jq '.relationships'
145+
```
146+
147+
After a successful backfill on the full palace, expect 100K+ RELATION
148+
edges across the typical predicate set (`works_on`, `depends_on`,
149+
`migrated_from`, `lives_in`, …). Tune from there.
150+
151+
## Common failures
152+
153+
| Symptom | Cause | Fix |
154+
|---|---|---|
155+
| `error: connection refused` on extractor | llama-server not running | `systemctl status llama-server-extractor` and start it |
156+
| Queue stalls (no `completed_at` advancing) | Worker crashed, leaving in-flight claims | The backfill driver's SIGTERM hook releases these; after SIGKILL, release them manually: `UPDATE ... SET started_at = NULL WHERE completed_at IS NULL AND started_at < NOW() - INTERVAL '10 minutes'` |
157+
| Duplicate-looking triples | Different drawers, same fact | Expected — `MERGE` makes it idempotent at the AGE layer. Confidence is averaged across sources. |
158+
| `cuda: out of memory` in llama-server log | `--parallel` too high for VRAM | Lower llama-server `--parallel` (currently 8 on P102 10GB) |
159+
| `errors_total` climbing in status endpoint | Malformed LLM JSON output | Check `error` column on the queue table; usually a context-length overflow. Drawer-splitter (spec open question #2) addresses this. |
160+
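
The manual release after a SIGKILL can be wrapped in a small helper. A sketch under assumptions: the table and column names come from the architecture diagram, `release_stale_claims` is a hypothetical function, and the `completed_at IS NULL` guard is added so that finished rows are never reopened.

```python
# Reset claims that were started long ago and never completed.
RELEASE_SQL = """
UPDATE mempalace_kg_extraction_queue
   SET started_at = NULL
 WHERE completed_at IS NULL
   AND started_at < NOW() - (%s * INTERVAL '1 minute')
"""


def release_stale_claims(cur, older_than_minutes=10):
    """Release orphaned claims and return how many rows were reset.

    `cur` is any DB-API cursor on a PostgreSQL connection; the caller
    commits. Pick a threshold longer than the slowest expected
    extraction so live claims are not released underneath a worker.
    """
    if older_than_minutes <= 0:
        raise ValueError("older_than_minutes must be positive")
    cur.execute(RELEASE_SQL, (older_than_minutes,))
    return cur.rowcount
```

Because `add_triple` uses `MERGE`, re-claiming a row that a slow worker later finishes costs a repeated extraction but does not duplicate edges.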
161+
## Tuning
162+
163+
| Knob | Default | Notes |
164+
|---|---|---|
165+
| `--workers` (driver / unit) | 24 (driver), 8 (unit) | In-flight HTTP requests to llama-server. Above llama-server's `--parallel` (8), excess requests queue at the server with no extra throughput. |
166+
| `--batch-size` | 100 (driver), 20 (unit) | Drawers claimed per dequeue round-trip. Larger = fewer round-trips but bigger claim window (more rows orphaned on SIGKILL). |
167+
| `--poll-interval` | 30 | Seconds the worker sleeps between dequeues when the queue is empty. Lower = quicker resumption after a write burst, higher = lighter DB load. |
168+
| llama-server `--parallel` | 8 | Concurrent inference slots on the P102. Bumping above 8 risks OOM at Q4. |
169+
| DB connection pool | psycopg2 default | Each worker opens 1-2 connections. With 24 in-flight workers across 4 backfill processes that's roughly 100-200 connections, at or above PostgreSQL's default `max_connections=100`, so raise `max_connections` or lower `--workers` before running parallel backfills. |
170+
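
The connection arithmetic from the last row, as a quick check before launching parallel backfills. A sketch only: `connection_budget` and `fits` are hypothetical helpers, the 1-2 connections per worker figure comes from the table, and the `reserved` headroom for other clients is an assumption.

```python
def connection_budget(processes, workers_per_process, conns_per_worker=(1, 2)):
    """Return the (low, high) number of DB connections the run may open."""
    low, high = conns_per_worker
    total_workers = processes * workers_per_process
    return total_workers * low, total_workers * high


def fits(processes, workers_per_process, max_connections=100, reserved=10):
    """True if even the high estimate leaves `reserved` slots free."""
    _, high = connection_budget(processes, workers_per_process)
    return high <= max_connections - reserved


budget = connection_budget(4, 24)  # four 24-worker backfill processes
```

Here `budget` is `(96, 192)`, so four 24-worker processes do not fit under a default `max_connections=100`, while four 8-worker processes (at most 64 connections) do.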
171+
## See also
172+
173+
- Spec: [`docs/specs/kg-triple-extraction.md`](specs/kg-triple-extraction.md)
174+
- Companion: [`docs/AGE_NOTES.md`](AGE_NOTES.md) for the underlying graph
175+
- Daemon patch: [`scratch/kg-extract/palace-daemon-patch.md`](../scratch/kg-extract/palace-daemon-patch.md)
176+
- llama-server unit: [`scratch/kg-extract/llama-server-extractor.service`](../scratch/kg-extract/llama-server-extractor.service)
