
Commit ddf00b4

jphein and claude authored
docs: soften engram-2 '17% E2E QA' framing per research doc fact-check (#321)
Closes #319.

docs/research/2026-05-24-memory-system-benchmarks.md:215 documented that the "17% E2E QA for MemPalace" attribution to engram-2 was not substantiated in their published materials. What engram-2 actually published is a ~17-point gap between their own LoCoMo score (74.5%) and SOTA (91.7%), attributed to the answerer model. README.md:163 and docs/ECOSYSTEM.md:45 predated that fact-check and still stated the attribution as fact.

This change adopts the 2026-05-24 doc as source-of-truth:

- README "Active investigations" reframes the entry as "End-to-end QA measurement on the post-structural-fix palace". The corpus-shape pathology that prompted #168 is real (pre-migration kind=content returned 3 tokens/Q vs post-migration 1,267) and is closed; the deliverable becomes a positive measurement rather than a rebuttal.
- docs/ECOSYSTEM.md describes what engram-2 actually published and cross-links the research doc.

#168 itself stays open: the deliverable at notebook/data/cat9-postmigrate-e2e/REPORT.md is unchanged. Only the framing of what the numbers answer changes.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
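The distinction the commit message draws, a roughly 17-point gap as opposed to a 17% score, can be checked with a little arithmetic. The snippet below is only an illustrative sketch: the two scores are the figures quoted in the message, while the constant names and the `point_gap` helper are hypothetical and not part of the repository.

```python
# Figures quoted in the commit message (engram-2's published LoCoMo numbers).
ENGRAM2_LOCOMO_SCORE = 74.5  # percent, engram-2's own score
SOTA_LOCOMO_SCORE = 91.7     # percent, state of the art cited alongside it


def point_gap(score: float, reference: float) -> float:
    """Return the difference in percentage points, rounded to one decimal."""
    return round(reference - score, 1)


# Hypothetical helper usage: the gap is a difference between two scores,
# not an end-to-end QA score for any particular system.
gap = point_gap(ENGRAM2_LOCOMO_SCORE, SOTA_LOCOMO_SCORE)
print(f"gap to SOTA: {gap} percentage points")
```

Read this way, the "17" describes how far one system's score sits below another's, which is why the commit treats it as a gap attributed to the answerer model and not as a measured 17% result for MemPalace.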
1 parent 3fb9428 commit ddf00b4

5 files changed

Lines changed: 247 additions & 172 deletions

File tree

FORK_CHANGELOG.md

Lines changed: 36 additions & 1 deletion
@@ -251,10 +251,45 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
   *Files:* `mempalace/knowledge_graph_age.py`, `mempalace/kg_triple_worker.py`, `mempalace/mcp_server.py`, `tests/test_knowledge_graph_age.py`, `tests/test_kg_triple_worker.py`, `tests/test_mcp_server.py`
 
 
+### Changed
+
+
+- **README.md + docs/ECOSYSTEM.md: soften 'engram-2 17% E2E QA' framing per the 2026-05-24 research doc's unsubstantiated finding (#319)** ([`HEAD`](https://github.com/techempower-org/mempalace/commit/HEAD))
+  Three docs referenced the engram-2 "17% E2E QA for MemPalace"
+  attribution with inconsistent framing:
+
+  - ``README.md:163`` treated it as a critique we're answering.
+  - ``docs/ECOSYSTEM.md:45`` stated it as a factual claim engram-2
+    makes.
+  - ``docs/research/2026-05-24-memory-system-benchmarks.md:215``
+    documented that the 17% attribution to engram-2 is **not
+    substantiated** in their published materials. What engram-2
+    actually published is a ~17-point gap between their own LoCoMo
+    score (74.5%) and SOTA (91.7%), attributed to the answerer model.
+
+  The 2026-05-24 doc is the more recent and more carefully sourced;
+  README + ECOSYSTEM.md predated the fact-check. This PR adopts the
+  2026-05-24 doc as source-of-truth:
+
+  - ``README.md`` "Active investigations" reframes the entry as
+    "End-to-end QA measurement on the post-structural-fix palace".
+    The corpus-shape pathology that prompted #168 is real and the
+    structural fix closed it on our corpus; the deliverable becomes
+    a positive measurement rather than a rebuttal.
+  - ``docs/ECOSYSTEM.md`` describes what engram-2 actually published
+    and cross-links the research doc for readers wanting more.
+
+  #168 itself stays open: the deliverable (publish E2E numbers at
+  ``notebook/data/cat9-postmigrate-e2e/REPORT.md``) is unchanged.
+  Only the framing of what the numbers are answering changes.
+
+  *Files:* `README.md`, `docs/ECOSYSTEM.md`
+
+
 ### Fixed
 
 
-- **kg_llm_extractor rewrites AGE dollar-quote tag in triples so drawers indexing palace source code don't fail at add_triple (#313)** ([`HEAD`](https://github.com/techempower-org/mempalace/commit/HEAD))
+- **kg_llm_extractor rewrites AGE dollar-quote tag in triples so drawers indexing palace source code don't fail at add_triple (#313)** ([`3fb9428`](https://github.com/techempower-org/mempalace/commit/3fb9428))
   Drawers indexing palace-daemon / mempalace source code contain
   ``mp_age_q`` verbatim (the AGE dollar-quote tag the cypher
   wrapper uses to delimit its outer SQL literal). When the LLM
