perf(age): index edge endpoints in backfill + bind anonymous RELATION targets (#335) #337
Merged
… targets

Three AGE graph-walk follow-ups to the Cat 7b hybrid-latency root-cause (MemPalace#335). The per-entity graph-walk Cypher in searcher._graph_expand_* and the daemon's /search/age-fused lookup were the entire hybrid-vs-union latency delta.

1. Auto edge-endpoint indexes. AGE only btree-indexes a label table's own id, never the start_id/end_id graphid columns the edge walks join on, so every per-entity lookup parallel-seq-scanned the whole edge table (MENTIONS 6.69M rows, ~5.8s cold). Add KnowledgeGraphAGE._ensure_edge_endpoint_indexes (idx_mentions_{start,end}_id + idx_relation_{start,end}_id, CREATE INDEX IF NOT EXISTS, skips labels whose table doesn't exist yet), mirroring _ensure_drawer_unique_index; backfill_age.backfill calls it after the edge pass so fresh palaces get them automatically. Names/columns kept in sync with palace-daemon scripts/age_graph_indexes.sql + POST /backfill-age/indexes.

2. Bind the anonymous RELATION target. MATCH (a:Entity)-[r:RELATION]->() left the far endpoint anonymous, so AGE built a Parallel Append over every vertex label (~1.58M rows) plus a nested-loop join to validate it, spilling to /dev/shm. RELATION is always (Entity)->(Entity); binding the open end to :Entity collapses the Append to a single Entity scan and is row-for-row identical (verified on a 300K-edge AGE graph: ->() and ->(:Entity) both return 1816 rows; with the start_id index the RELATION scan became a bitmap index scan, 152ms->46ms). All four expand sites updated.

3. shm-size (deploy-side). mempalace-db is built from the sibling disks repo, not provisioned here, so the required --shm-size=256m (off Docker's 64MB default) is documented as an operator action in docs/operators/2026-05-31-age-graph-walk-shm-size.md.

Tests: 6 new (4 anonymous-target source guards + edge-index-targets unit + @pgmark integration verifying skip-when-absent -> install-when-present -> idempotent against a live AGE container). Full suite green without TEST_POSTGRES_DSN; ruff clean.

FORK_CHANGELOG.md + README.md re-rendered from docs/fork-changes.yaml (the row-renumber is generator output).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
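As a rough illustration of item 1, the skip-when-absent and idempotent behaviour can be sketched as a pure function that builds the DDL. The index names come from the commit message. The function name, the `graph_name` parameter, and the `"graph"."LABEL"` table layout are assumptions made for this sketch, not the actual `_ensure_edge_endpoint_indexes` implementation.

```python
# Sketch only: helper name, graph name, and table quoting are assumptions,
# not the real KnowledgeGraphAGE._ensure_edge_endpoint_indexes code.
EDGE_INDEX_TARGETS = [
    # (index name, edge label, indexed column)
    ("idx_mentions_start_id", "MENTIONS", "start_id"),
    ("idx_mentions_end_id", "MENTIONS", "end_id"),
    ("idx_relation_start_id", "RELATION", "start_id"),
    ("idx_relation_end_id", "RELATION", "end_id"),
]


def edge_index_statements(graph_name, existing_tables):
    """Build CREATE INDEX statements for edge labels whose table exists.

    Labels missing from existing_tables are skipped, since the backing
    table only appears after the first edge of that label is written.
    """
    statements = []
    for index_name, label, column in EDGE_INDEX_TARGETS:
        if label not in existing_tables:
            continue  # label table not created yet; nothing to index
        statements.append(
            f"CREATE INDEX IF NOT EXISTS {index_name} "
            f'ON "{graph_name}"."{label}" ({column})'
        )
    return statements
```

Because each statement uses `IF NOT EXISTS`, running the generated list a second time should be a no-op, which is the idempotence the integration test checks.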
The mempalace check-docs gate (step 1) was red on main: #336 added tests but left README at 3853 while pytest collected more. Merged origin/main (#336) into this branch and bumped the count via the canonical method (pytest --collect-only -q → "N/M tests collected" → N), which now includes both #336's predicate-norm tests and #335's 6 new graph-walk tests. check-docs step 1 verified green: README 3872 == pytest collects 3872. render-docs --check clean; ruff clean; #336's 40 predicate-norm tests pass on the merged branch. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…link

Green the check-docs / lint / lychee gates on #337 (all mechanical):

- lint: `ruff format` knowledge_graph_age.py (the _ensure_edge_endpoint_indexes block: multi-line arg + split f-string the formatter rejoins). ruff check and ruff format --check now both clean.
- check-docs: regenerate website/public/llms-full.txt (render-llms-full.py) and website/reference/python-api/ (render-api-docs.py); both were stale relative to the merged tree.
- lychee: the fork-changes entry had `pr: 335`, which rendered an upstream-PR link to github.com/MemPalace/pull/335. That PR does not exist (#335 is a techempower-org fork *issue*, referenced as MemPalace#335 in the prose). Dropped the pr/pr_state fields (the field is for upstream PRs only). Also changed the commit placeholder HEAD -> TBD: commit/TBD is in lychee.toml's exclude list (the documented "resolved to a real SHA when the entry lands" placeholder), commit/HEAD is not.

check-docs all 7 sub-checks green (README 3872 == pytest 3872); ruff clean. vale (advisory prose lint) intentionally left as-is.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This was referenced May 31, 2026
jphein added a commit that referenced this pull request May 31, 2026
The only real lychee error on #337 (CI run 26700068716) was [404] https://github.com/doobidoo/mcp-memory-service in README.md + docs/research/2026-05-24-memory-system-benchmarks.md — a comparison pointer whose upstream repo was deleted/renamed (confirmed 404). It's pre-existing on main (added in 5502c1d/9f8a1f7/0aceea3, not by #335) so lychee is red on main too; this greens it for both. Added to lychee.toml's existing "external sites that 404 from CI but are intentionally referenced" exclude block, matching kostadis/CampaignGenerator etc. Verified: the pattern matches the URL, commit/TBD stays excluded, real commit SHAs are not over-excluded, TOML parses. (My #335 commit-link fix — HEAD→TBD — already landed in d35b8ad and shows [EXCLUDED] in the same CI log.) Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Closes #335: the three mempalace-side AGE graph-walk follow-ups surfaced by the Cat 7b hybrid-latency root-cause (palace-daemon `docs/perf/2026-05-30-hybrid-graph-walk-latency.md`, merged as palace-daemon#206). The per-entity graph-walk Cypher in `searcher._graph_expand_*` and the daemon's `/search/age-fused` lookup were the entire hybrid-vs-union latency delta.

1. Auto edge-endpoint indexes in backfill

AGE only btree-indexes a label table's own `id` (`_ag_label_edge_pkey`), never the `start_id`/`end_id` graphid columns the edge walks join on. So every per-entity lookup parallel-seq-scanned the whole edge table (MENTIONS 6.69M rows on prod, ~5.8s cold for a hot entity, × N query entities).

`KnowledgeGraphAGE._ensure_edge_endpoint_indexes()` (new, mirrors `_ensure_drawer_unique_index`) creates `idx_mentions_{start,end}_id` + `idx_relation_{start,end}_id` with `CREATE INDEX IF NOT EXISTS`, skipping any label whose backing table doesn't exist yet (the label is only created on first edge write). `backfill_age.backfill()` calls it after the edge pass so fresh palaces get the indexes automatically. Names/columns kept in sync with palace-daemon's `scripts/age_graph_indexes.sql` + the operator-online `POST /backfill-age/indexes` (`CONCURRENTLY`) route.

2. Bind the anonymous RELATION target

`MATCH (a:Entity)-[r:RELATION]->()` left the far endpoint anonymous, so AGE built a `Parallel Append` over every vertex label (Entity + Drawer + Room + Wing + _ag_label_vertex, ~1.58M rows on prod) and nested-loop-joined to validate the endpoint exists, materializing the union and spilling to `/dev/shm`.

RELATION is always `(Entity)->(Entity)`, so binding the open end to `:Entity` is semantically identical and collapses the Append to a single Entity scan. Verified row-for-row on a 300K-edge AGE graph: `->()` and `->(:Entity)` both returned 1816 rows; with the `start_id` index the RELATION scan became a `Bitmap Index Scan` (152ms → 46ms). All four expand sites updated (`_graph_expand_from_seeds` outbound/inbound/per-entity + `_graph_expand_from_entities`).

3. shm-size (deploy-side: documented, not code)

`mempalace-db` is built from the sibling `disks` repo and `docker run` on familiar, not provisioned from this repo, so there's no in-repo compose file to patch. The required value (`--shm-size=256m`, off Docker's 64MB default) is documented as an operator action in `docs/operators/2026-05-31-age-graph-walk-shm-size.md`, with the EXPLAIN evidence and verify step.

Tests

6 new, all green:

- `test_searcher_stopwords.py::TestGraphExpandBoundEndpoint`: source guards that neither expand fn leaves a `-[r:RELATION]->()` / `()-[r:RELATION]->` anonymous endpoint (sits alongside the existing #291 directional guards, "perf(search): make graph-expand cypher directional in hybrid candidate strategy, ~100× speedup").
- `test_knowledge_graph_age.py::test_edge_endpoint_index_targets_are_the_four_we_expect`: pure-unit, pins the index name/label/column set in sync with the daemon SQL.
- `test_knowledge_graph_age.py::test_ensure_edge_endpoint_indexes_skips_absent_then_installs`: `@pgmark` integration test against a live AGE container: skip-when-edge-tables-absent → install-all-when-present → idempotent on re-run.

Full suite green without `TEST_POSTGRES_DSN` (repo default); ruff (0.15.14) clean.

`FORK_CHANGELOG.md` + `README.md` re-rendered from `docs/fork-changes.yaml` (the row-renumber is generator output, `commit: HEAD` resolves at ship-prep per repo convention).

🤖 Generated with Claude Code
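
For readers unfamiliar with the source-guard style of test mentioned above, a minimal sketch of the idea follows. It scans a Cypher string for an empty `()` node pattern on either side of a RELATION edge. The regexes and the helper name `has_anonymous_relation_endpoint` are hypothetical and only cover the two pattern shapes named in the test description; the real `TestGraphExpandBoundEndpoint` may check the source differently.

```python
import re

# Hypothetical guard patterns: an empty () node on the target side
# (-[r:RELATION]->()) or on the source side (()-[r:RELATION]->).
ANON_TARGET = re.compile(r"\[\s*\w*\s*:RELATION\s*\]\s*->\s*\(\s*\)")
ANON_SOURCE = re.compile(r"\(\s*\)\s*-\s*\[\s*\w*\s*:RELATION\s*\]")


def has_anonymous_relation_endpoint(cypher):
    """Return True if a RELATION edge in the query has an unbound () endpoint."""
    return bool(ANON_TARGET.search(cypher) or ANON_SOURCE.search(cypher))


# The pre-fix query shape is flagged; the bound shape is not.
before = "MATCH (a:Entity)-[r:RELATION]->() RETURN r"
after = "MATCH (a:Entity)-[r:RELATION]->(:Entity) RETURN r"
flagged = [has_anonymous_relation_endpoint(q) for q in (before, after)]
```

A guard like this only inspects query text, so it runs without a database and can sit in the default suite that passes without `TEST_POSTGRES_DSN`.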