
refactor(preamble): G1 renders graft INDEX only — workers fetch on demand (closes #71 PR3)#73

Merged: PowerCreek merged 1 commit into main from issue-71-preamble-index-only on May 24, 2026.

Final PR (3 of 3) closing #71. Depends on:

Summary

G1's preamble loader no longer inlines graft bodies. Each grafted context becomes a single index line of the form ``- `graft_id` · source · `path` — abstract``, where the abstract is the first 120 chars of the content. The worker fetches full bodies on demand via the `grafted_context_fetch(graft_id)` MCP tool added in PR2.
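A minimal sketch of how one index line could be rendered, assuming each graft arrives as a dict with `graft_id`, `source`, `path`, and `content` keys. The helper name, the dict shape, and the whitespace collapsing are illustrative assumptions, not the loader's actual code; only the line format and the 120-char limit come from this PR.

```python
ABSTRACT_CHARS = 120  # mirrors the _ABSTRACT_CHARS cap described in this PR
EM_DASH = "\u2014"    # separator between path and abstract in the index line


def render_index_line(graft: dict) -> str:
    """Render one graft as a single index line (hypothetical helper)."""
    # Assumption: collapse whitespace so a multi-line body still yields
    # a one-line abstract before taking the first 120 chars.
    abstract = " ".join(graft["content"].split())[:ABSTRACT_CHARS]
    return (
        f"- `{graft['graft_id']}` · {graft['source']} · "
        f"`{graft['path']}` {EM_DASH} {abstract}"
    )
```

With this shape, a 7KB body contributes at most 120 chars of abstract to the preamble, however large the graft is.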

The loader explicitly teaches the worker this idiom in the preamble itself:

> Full bodies are NOT included in this preamble. Use the grafted_context_fetch(graft_id) MCP tool to load any row's full content on demand — typical pattern is to skim the index, identify relevant rows by source/path/abstract, then fetch only what you need for the current turn.

Per-turn cost reduction

| Vertical size | Pre-#71 preamble | Post-#71 preamble |
| --- | --- | --- |
| 5 grafts × 2KB content | ~10KB | ~750 bytes |
| 8 grafts × 2KB content (cap) | ~16KB | ~1.2KB |
| 70 grafts × varying (poly-explorer) | ~16KB (8 shown, 62 hidden) | ~11KB (ALL 70 indexed) |
| 200 grafts (theoretical future) | ~16KB (8 shown, 192 hidden) | ~30KB (ALL 200 indexed) |
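The figures above can be reproduced with back-of-the-envelope arithmetic. This sketch assumes roughly 150 chars per index line (the estimate given in the notes) and, for the old behavior, that every rendered graft filled its 2048-char cap. Function names are illustrative.

```python
OLD_PER_GRAFT_CHARS = 2048  # the removed _PER_GRAFT_CONTENT_CHARS
OLD_MAX_GRAFTS = 8          # the removed _MAX_GRAFTS_RENDERED
INDEX_LINE_CHARS = 150      # assumption: approximate size of one index line


def old_graft_chars(n_grafts: int) -> int:
    """Pre-#71: at most 8 grafts rendered, each capped at 2048 chars."""
    return min(n_grafts, OLD_MAX_GRAFTS) * OLD_PER_GRAFT_CHARS


def new_index_chars(n_grafts: int) -> int:
    """Post-#71: every graft indexed at roughly one 150-char line."""
    return n_grafts * INDEX_LINE_CHARS


for n in (5, 8, 70, 200):
    print(f"{n:>3} grafts: old ~{old_graft_chars(n)} chars, "
          f"new ~{new_index_chars(n)} chars")
```

Under these assumptions the index grows linearly (750, 1200, 10500, and 30000 chars for the four rows), so it would approach the 32768-char safety net at a little over 200 grafts, before counting the header and guardrails.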

Old caps removed: `_PER_GRAFT_CONTENT_CHARS=2048` and `_MAX_GRAFTS_RENDERED=8` are deleted. They are no longer meaningful because graft content isn't in the preamble at all. New cap: `_ABSTRACT_CHARS=120` per graft. `_MAX_TOTAL_PREAMBLE_CHARS=32768` remains as a safety net.

What stays inlined

  • Vertical-spec header (name, user_id, created_ts, manifest path, source count) — small, always relevant.
  • Worker-guardrails — short, critical, can't reasonably be lazy-loaded since they govern every move the worker makes.

Test plan

  • 11 tests in tests/plugins/devagentic-vertical-preamble/test_preamble.py cover:
    • no-user_id gate
    • default-profile gate
    • empty-rollup gate
    • NEW: index-not-full-body assertion (poly-explorer-scale fixture with a 7KB content body verifies that the full body is NOT inlined, only the abstract)
    • NEW: 70-graft all-indexed assertion
    • NEW: abstract-truncated-to-120-chars assertion
    • load-once gate
    • fetch exception
    • resolve_user_id exception
    • client env override
    • client empty-user_id guard
  • All 11 pass.
  • Sibling tests (test_grafted_context_fetch.py from PR2, test_client.py from G4): no regression.
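For illustration, the three NEW assertions could look roughly like the sketch below. It runs against a stand-in `render_index` function rather than the real loader; the function, the fixture builder, and the graft IDs are all hypothetical, and the actual tests in test_preamble.py may be structured differently.

```python
ABSTRACT_CHARS = 120


def render_index(grafts):
    """Stand-in for the real loader (hypothetical): one line per graft."""
    return "\n".join(
        f"- `{g['graft_id']}` · {g['source']} · `{g['path']}` "
        f"\u2014 {g['content'][:ABSTRACT_CHARS]}"
        for g in grafts
    )


def make_grafts(n, body_chars=7 * 1024):
    """Build n fixture grafts, each with a roughly 7KB body."""
    return [
        {
            "graft_id": f"g-{i:03d}",
            "source": "docs",
            "path": f"notes/{i:03d}.md",
            "content": f"graft {i:03d} " + "x" * body_chars,
        }
        for i in range(n)
    ]


def test_index_not_full_body():
    grafts = make_grafts(1)
    preamble = render_index(grafts)
    # The full body must not appear; only its 120-char abstract does.
    assert grafts[0]["content"] not in preamble
    assert grafts[0]["content"][:ABSTRACT_CHARS] in preamble


def test_all_70_grafts_indexed():
    preamble = render_index(make_grafts(70))
    assert len(preamble.splitlines()) == 70
    assert "g-069" in preamble


def test_abstract_truncated_to_120_chars():
    line = render_index(make_grafts(1))
    abstract = line.split(" \u2014 ", 1)[1]
    assert len(abstract) == ABSTRACT_CHARS
```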

Polynomial-explorer end-to-end after deploy

  1. Devbox restart picks up devagentic#220 (server-side query).
  2. Container rebuild picks up #72 (feat(mcp): grafted_context_fetch tool, which extends the devagentic-lane-h plugin; #71 PR2) plus this PR. Together these are the hermes side: MCP tool + index loader.
  3. Next polynomial-explorer turn:
    • Preamble: ~11KB index + ~1KB guardrails + vertical-spec header ≈ 12KB (was 16-32KB).
    • Worker sees 70 grafts in the index. Reads abstracts. Identifies relevant rows for the current concept-permutation step.
    • Fetches selectively via grafted_context_fetch("doc-graft-...") — one or two at a time.
    • Model attention now scales with what the worker asks for, not what the vertical has.
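The skim-then-fetch pattern in step 3 can be sketched as follows. The parser targets the index line format from this PR; the `fetch` callable stands in for the `grafted_context_fetch` MCP tool, and the function names, keyword matching, and `limit` parameter are assumptions made for illustration.

```python
import re
from typing import Callable, Iterable

EM_DASH = "\u2014"
# Matches: - `<id>` · <source> · `<path>` <em dash> <abstract>
INDEX_LINE = re.compile(
    r"^- `(?P<graft_id>[^`]+)` · (?P<source>.+?) · `(?P<path>[^`]+)` "
    + EM_DASH
    + r" (?P<abstract>.*)$"
)


def parse_index(preamble: str) -> list[dict]:
    """Extract index rows from a preamble, skipping non-index lines."""
    rows = []
    for line in preamble.splitlines():
        match = INDEX_LINE.match(line.strip())
        if match:
            rows.append(match.groupdict())
    return rows


def fetch_relevant(
    preamble: str,
    keywords: Iterable[str],
    fetch: Callable[[str], str],
    limit: int = 2,
) -> dict:
    """Fetch full bodies only for rows whose source/path/abstract match."""
    wanted = [k.lower() for k in keywords]
    bodies = {}
    for row in parse_index(preamble):
        haystack = f"{row['source']} {row['path']} {row['abstract']}".lower()
        if any(k in haystack for k in wanted):
            # `fetch` is a placeholder for the grafted_context_fetch tool call.
            bodies[row["graft_id"]] = fetch(row["graft_id"])
            if len(bodies) >= limit:
                break
    return bodies
```

Passing the fetcher in as a callable keeps the sketch independent of any particular MCP client, and the small default `limit` reflects the "one or two at a time" usage described above.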

The empty-content failure (the symptom that surfaced #71) should resolve on its own, since the model is no longer overwhelmed by tools + preamble + grafts. PR #69's synthetic recovery remains as a safety net for any residual failures.

Notes

  • Index ordering is preserved from the rollup (path-sorted server-side by verticalContext).
  • One-line abstract format chosen for grep-ability: - `<id>` · <source> · `<path>` — <abstract>. Workers can grep the preamble for source/path keywords to find candidate grafts.
  • No "showing first N" suffix anymore — ALL grafts are in the index since each line is ~150 chars.
