perf(age): index edge endpoints in backfill + bind anonymous RELATION targets (#335) by jphein · Pull Request #337 · techempower-org/mempalace

jphein · 2026-05-31T01:18:57Z

Closes #335 — the three mempalace-side AGE graph-walk follow-ups surfaced by the Cat 7b hybrid-latency root-cause (palace-daemon docs/perf/2026-05-30-hybrid-graph-walk-latency.md, merged as palace-daemon#206). The per-entity graph-walk Cypher in searcher._graph_expand_* and the daemon's /search/age-fused lookup were the entire hybrid-vs-union latency delta.

1. Auto edge-endpoint indexes in backfill

AGE only btree-indexes a label table's own id (_ag_label_edge_pkey), never the start_id/end_id graphid columns the edge walks join on. So every per-entity lookup parallel-seq-scanned the whole edge table (MENTIONS 6.69M rows on prod, ~5.8s cold for a hot entity, × N query entities).

KnowledgeGraphAGE._ensure_edge_endpoint_indexes() (new, mirrors _ensure_drawer_unique_index) creates idx_mentions_{start,end}_id + idx_relation_{start,end}_id with CREATE INDEX IF NOT EXISTS, skipping any label whose backing table doesn't exist yet (the label is only created on first edge write). backfill_age.backfill() calls it after the edge pass so fresh palaces get the indexes automatically. Names/columns kept in sync with palace-daemon's scripts/age_graph_indexes.sql + the operator-online POST /backfill-age/indexes (CONCURRENTLY) route.
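A minimal sketch of the skip-then-create logic described above, not the code in this PR. The index names, labels and columns are the ones listed in the description; the function name `build_edge_index_statements`, the `graph_schema` argument and the injected `table_exists` callable are hypothetical, and the sketch assumes AGE's layout of one table per label inside a schema named after the graph.

```python
from typing import Callable, List

# The four (index name, edge label, column) targets named in the PR description.
EDGE_ENDPOINT_INDEX_TARGETS = (
    ("idx_mentions_start_id", "MENTIONS", "start_id"),
    ("idx_mentions_end_id", "MENTIONS", "end_id"),
    ("idx_relation_start_id", "RELATION", "start_id"),
    ("idx_relation_end_id", "RELATION", "end_id"),
)


def build_edge_index_statements(
    graph_schema: str,
    table_exists: Callable[[str], bool],
) -> List[str]:
    """Return CREATE INDEX statements for edge labels whose table exists.

    `table_exists` is a hypothetical hook: given a label name, it reports
    whether the backing table has been created yet (a label table only
    appears after the first edge of that label is written).
    """
    statements = []
    for index_name, label, column in EDGE_ENDPOINT_INDEX_TARGETS:
        if not table_exists(label):
            # Nothing to index yet; a later backfill run picks it up.
            continue
        statements.append(
            f"CREATE INDEX IF NOT EXISTS {index_name} "
            f'ON "{graph_schema}"."{label}" ({column})'
        )
    return statements
```

Because every statement uses IF NOT EXISTS, re-running the generated statements is a no-op, which is the idempotence the integration test checks.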

2. Bind the anonymous RELATION target

MATCH (a:Entity)-[r:RELATION]->() left the far endpoint anonymous, so AGE built a Parallel Append over every vertex label (Entity + Drawer + Room + Wing + _ag_label_vertex, ~1.58M rows on prod) and nested-loop-joined to validate the endpoint exists — materializing the union and spilling to /dev/shm.

RELATION is always (Entity)->(Entity), so binding the open end to :Entity is semantically identical and collapses the Append to a single Entity scan. Verified row-for-row on a 300K-edge AGE graph: ->() and ->(:Entity) both returned 1816 rows; with the start_id index the RELATION scan became a Bitmap Index Scan (152ms → 46ms). All four expand sites updated (_graph_expand_from_seeds outbound/inbound/per-entity + _graph_expand_from_entities).
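The difference between the anonymous and bound forms can be checked mechanically. The following is a rough sketch of such a source guard, assuming the Cypher is available as a plain string; the regexes and the helper name `has_anonymous_relation_endpoint` are illustrative and are not the guards added in this PR.

```python
import re

# Outbound walk whose far endpoint is left anonymous: -[r:RELATION]->()
_ANON_TARGET = re.compile(r"-\[\s*\w*\s*:RELATION\s*\]->\(\s*\)")
# Inbound walk whose near endpoint is left anonymous: ()-[r:RELATION]->
_ANON_SOURCE = re.compile(r"\(\s*\)-\[\s*\w*\s*:RELATION\s*\]->")


def has_anonymous_relation_endpoint(cypher: str) -> bool:
    """Return True if a RELATION pattern leaves either endpoint as ()."""
    return bool(_ANON_TARGET.search(cypher) or _ANON_SOURCE.search(cypher))


# Before: the open end forces a scan over every vertex label.
BEFORE = "MATCH (a:Entity)-[r:RELATION]->() RETURN r"
# After: binding the open end to :Entity keeps the same rows.
AFTER = "MATCH (a:Entity)-[r:RELATION]->(:Entity) RETURN r"
```

Here `has_anonymous_relation_endpoint(BEFORE)` is true and `has_anonymous_relation_endpoint(AFTER)` is false. The RETURN clauses are placeholders, since the real queries' projections are not shown in this description.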

3. shm-size (deploy-side — documented, not code)

mempalace-db is built from the sibling disks repo and started with docker run on familiar, not provisioned from this repo, so there is no in-repo compose file to patch. The required value (--shm-size=256m, up from Docker's 64MB default) is documented as an operator action in docs/operators/2026-05-31-age-graph-walk-shm-size.md, with the EXPLAIN evidence and verify step.
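One way an operator might sanity-check the setting from inside the container is to compare the size of /dev/shm against the required value. This is only a sketch under that assumption; the verify step in the operator doc may differ, and the helper names are hypothetical.

```python
import shutil

REQUIRED_SHM_BYTES = 256 * 1024 * 1024       # --shm-size=256m, per the PR
DOCKER_DEFAULT_SHM_BYTES = 64 * 1024 * 1024  # Docker's default, per the PR


def shm_is_sufficient(total_bytes: int, required_bytes: int = REQUIRED_SHM_BYTES) -> bool:
    """Return True if the shared-memory mount is at least the required size."""
    return total_bytes >= required_bytes


def current_shm_bytes(path: str = "/dev/shm") -> int:
    """Return the total size of the shared-memory mount at `path`."""
    return shutil.disk_usage(path).total
```

Running `shm_is_sufficient(current_shm_bytes())` inside the mempalace-db container would report whether the container was started with a large enough value.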

Tests

6 new, all green:

test_searcher_stopwords.py::TestGraphExpandBoundEndpoint — 4 source guards asserting that neither expand function leaves an anonymous -[r:RELATION]->() / ()-[r:RELATION]-> endpoint (sits alongside the existing directional guards from #291, "perf(search): make graph-expand cypher directional in hybrid candidate strategy — ~100× speedup").
test_knowledge_graph_age.py::test_edge_endpoint_index_targets_are_the_four_we_expect — pure-unit, pins the index name/label/column set in sync with the daemon SQL.
test_knowledge_graph_age.py::test_ensure_edge_endpoint_indexes_skips_absent_then_installs — @pgmark integration test against a live AGE container: skip-when-edge-tables-absent → install-all-when-present → idempotent on re-run.

Full suite green without TEST_POSTGRES_DSN (repo default); ruff (0.15.14) clean. FORK_CHANGELOG.md + README.md re-rendered from docs/fork-changes.yaml (the row-renumber is generator output, commit: HEAD resolves at ship-prep per repo convention).

🤖 Generated with Claude Code


… targets

Three AGE graph-walk follow-ups to the Cat 7b hybrid-latency root-cause (MemPalace#335). The per-entity graph-walk Cypher in searcher._graph_expand_* and the daemon's /search/age-fused lookup were the entire hybrid-vs-union latency delta.

1. Auto edge-endpoint indexes. AGE only btree-indexes a label table's own id, never the start_id/end_id graphid columns the edge walks join on — so every per-entity lookup parallel-seq-scanned the whole edge table (MENTIONS 6.69M rows, ~5.8s cold). Add KnowledgeGraphAGE._ensure_edge_endpoint_indexes (idx_mentions_{start,end}_id + idx_relation_{start,end}_id, CREATE INDEX IF NOT EXISTS, skips labels whose table doesn't exist yet), mirroring _ensure_drawer_unique_index; backfill_age.backfill calls it after the edge pass so fresh palaces get them automatically. Names/columns kept in sync with palace-daemon scripts/age_graph_indexes.sql + POST /backfill-age/indexes.

2. Bind the anonymous RELATION target. MATCH (a:Entity)-[r:RELATION]->() left the far endpoint anonymous, so AGE built a Parallel Append over every vertex label (~1.58M rows) + nested-loop to validate it, spilling to /dev/shm. RELATION is always (Entity)->(Entity); binding the open end to :Entity collapses the Append to a single Entity scan and is row-for-row identical (verified on a 300K-edge AGE graph: ->() and ->(:Entity) both return 1816 rows; with the start_id index the RELATION scan became a bitmap index scan, 152ms->46ms). All four expand sites updated.

3. shm-size (deploy-side). mempalace-db is built from the sibling disks repo, not provisioned here, so the required --shm-size=256m (off Docker's 64MB default) is documented as an operator action in docs/operators/2026-05-31-age-graph-walk-shm-size.md.

Tests: 6 new (4 anonymous-target source guards + edge-index-targets unit + @pgmark integration verifying skip-when-absent -> install-when-present -> idempotent against a live AGE container). Full suite green without TEST_POSTGRES_DSN; ruff clean. FORK_CHANGELOG.md + README.md re-rendered from docs/fork-changes.yaml (the row-renumber is generator output).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>


…llowups

The mempalace check-docs gate (step 1) was red on main: #336 added tests but left README at 3853 while pytest collected more. Merged origin/main (#336) into this branch and bumped the count via the canonical method (pytest --collect-only -q → "N/M tests collected" → N), which now includes both #336's predicate-norm tests and #335's 6 new graph-walk tests. check-docs step 1 verified green: README 3872 == pytest collects 3872. render-docs --check clean; ruff clean; #336's 40 predicate-norm tests pass on the merged branch. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
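The count method mentioned in that commit (take N from the "N/M tests collected" line) could be sketched as below. The parser name `collected_count` and the sample output lines are assumptions for illustration; the repo's check-docs gate may implement this differently.

```python
import re
from typing import Optional

# Matches the summary line of `pytest --collect-only -q`, e.g.
# "3872 tests collected in 1.0s" or "3872/3900 tests collected (28 deselected) in 4.2s".
_COLLECTED = re.compile(r"^(\d+)(?:/\d+)? tests? collected")


def collected_count(output: str) -> Optional[int]:
    """Return N from the last 'N[/M] tests collected' line, or None if absent."""
    for line in reversed(output.splitlines()):
        match = _COLLECTED.match(line.strip())
        if match:
            return int(match.group(1))
    return None


def readme_count_is_current(readme_count: int, output: str) -> bool:
    """Return True when the README's stated count equals what pytest collects."""
    return collected_count(output) == readme_count
```

With output ending in "3872 tests collected", `readme_count_is_current(3872, output)` holds, while the stale value 3853 would fail the comparison.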

…link

Green the check-docs / lint / lychee gates on #337 (all mechanical):

- lint: `ruff format` knowledge_graph_age.py (the _ensure_edge_endpoint_indexes block — multi-line arg + split f-string the formatter rejoins). ruff check and ruff format --check now both clean.
- check-docs: regenerate website/public/llms-full.txt (render-llms-full.py) and website/reference/python-api/ (render-api-docs.py) — both were stale relative to the merged tree.
- lychee: the fork-changes entry had `pr: 335`, which rendered an upstream-PR link to github.com/MemPalace/pull/335 — that PR does not exist (#335 is a techempower-org fork *issue*, referenced as MemPalace#335 in the prose). Dropped the pr/pr_state fields (the field is for upstream PRs only). Also changed the commit placeholder HEAD -> TBD: commit/TBD is in lychee.toml's exclude list (the documented "resolved to a real SHA when the entry lands" placeholder), commit/HEAD is not.

check-docs all 7 sub-checks green (README 3872 == pytest 3872); ruff clean. vale (advisory prose lint) intentionally left as-is.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The only real lychee error on #337 (CI run 26700068716) was [404] https://github.com/doobidoo/mcp-memory-service in README.md + docs/research/2026-05-24-memory-system-benchmarks.md — a comparison pointer whose upstream repo was deleted/renamed (confirmed 404). It's pre-existing on main (added in 5502c1d/9f8a1f7/0aceea3, not by #335) so lychee is red on main too; this greens it for both. Added to lychee.toml's existing "external sites that 404 from CI but are intentionally referenced" exclude block, matching kostadis/CampaignGenerator etc. Verified: the pattern matches the URL, commit/TBD stays excluded, real commit SHAs are not over-excluded, TOML parses. (My #335 commit-link fix — HEAD→TBD — already landed in d35b8ad and shows [EXCLUDED] in the same CI log.) Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
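The three verification points in that commit (the new pattern matches the dead URL, commit/TBD stays excluded, real commit SHAs are not over-excluded) can be expressed as a small check. The patterns below are hypothetical stand-ins, not the actual lychee.toml entries, and Python's `re` is used here only as an approximation of lychee's regex matching.

```python
import re

# Hypothetical exclude patterns; the real lychee.toml entries may differ.
EXCLUDE_PATTERNS = [
    r"^https://github\.com/doobidoo/mcp-memory-service",
    r"/commit/TBD$",
]


def is_excluded(url: str) -> bool:
    """Return True if any exclude pattern matches the URL."""
    return any(re.search(pattern, url) for pattern in EXCLUDE_PATTERNS)


# Illustrative URLs for the three cases named in the commit message.
DEAD_REPO = "https://github.com/doobidoo/mcp-memory-service"
PLACEHOLDER = "https://github.com/techempower-org/mempalace/commit/TBD"
REAL_SHA = "https://github.com/techempower-org/mempalace/commit/e37978e"
```

Under these assumed patterns, `DEAD_REPO` and `PLACEHOLDER` are excluded while `REAL_SHA` is still checked.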

Copilot AI review requested due to automatic review settings May 31, 2026 01:18

Copilot started reviewing on behalf of jphein May 31, 2026 01:19 View session

jphein mentioned this pull request May 31, 2026

perf: AGE graph-walk follow-ups from Cat 7b (auto-index in backfill_age, bind ->() target, raise shm-size) #335

Closed

Copilot AI reviewed May 31, 2026

jphein and others added 3 commits May 30, 2026 18:21

Merge remote-tracking branch 'origin/main' into perf/335-age-graph-fo…

8d840aa

…llowups

jphein merged commit e37978e into main May 31, 2026
9 of 11 checks passed

jphein deleted the perf/335-age-graph-followups branch May 31, 2026 01:35

This was referenced May 31, 2026

ci(lychee): main-level link liabilities fail the gate on every PR (dead external ref + commit/HEAD placeholders) #338

Closed

fix(lychee): exclude deleted doobidoo/mcp-memory-service (404 on main) #339

Merged

jphein merged 4 commits into main from perf/335-age-graph-followups
