fix(scripts): use line-anchored regex to extract ARCHON_STATE_JSON markers#1695
Conversation
The persist script extracted Claude's state-JSON block by substring-matching the BEGIN/END markers, which gave a false match whenever the marker string appeared inside the brief's prose or inside a JSON string value (e.g. a PR title narrating this very bug). PR #1676's switch to lastIndexOf made the self-referential case worse, since the last substring occurrence may now sit inside a JSON value. Match the markers only when they occupy an entire line (line-anchored ^...$ regex with the m flag) and pick the last END plus the last BEGIN before it, so duplicate-emission and substring-in-content both resolve correctly. Changes: - Replace indexOf/lastIndexOf substring matching in Tier 1 with line-anchored matchAll regex over BEGIN/END markers - Add tests for marker substring in brief prose, marker substring inside state JSON value, and combined duplicate-BEGIN + marker-in-prose case - Keep PR #1676's existing test cases (single block, duplicate BEGINs, JSON-wrapper fallback, no-valid-format exit 1) Fixes #1674
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughUpdated ChangesMaintainer standup persist robustness
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Comprehensive PR ReviewPR: #1695 — fix(scripts): use line-anchored regex to extract ARCHON_STATE_JSON markers SummaryThe core fix is correct — replacing Verdict:
🟠 High Issues (Should Fix Before Merge)1. Silent fallthrough when all BEGIN markers follow the last END📍 When View fixif (beginsBeforeEnd.length > 0) {
// ... existing extraction code
} else {
process.stderr.write(
`WARN: ARCHON_STATE_JSON_BEGIN/END markers found but all BEGIN markers appear after the last END; skipping delimiter extraction.\n`,
);
}2. JSON parse error lacks candidate preview — message is not actionable📍 The catch block logs View fix} catch (err) {
const preview = stateText.length > 200 ? stateText.slice(0, 200) + '…' : stateText;
process.stderr.write(
`Delimiter found but state JSON parse failed: ${(err as Error).message}\nFailed candidate (first 200 chars): ${preview}\n`,
);
}🟡 Medium Issues (Consider Fixing in Same Pass)3.
|
| Issue | Location | Note |
|---|---|---|
CRLF line endings silently break $ matching |
ts:36-37 |
Theoretical — all known AI providers emit \n. YAGNI applies. |
Test 6 description is misleading — case also passed under old indexOf code |
test.ts:100-121 |
Test is not harmful; consider relabeling or adding a clarifying comment. |
| Prose preamble stripping (lines 108-113) is pre-existing and untested | ts:108-113 |
Low regression risk; not introduced by this PR. |
Section header // ── Tier 1: ── explains what, not why |
ts:33 |
Pre-existing; kept for scannability alongside the Tier 2 header. |
✅ What's Good
- Selection strategy is correct — "last END, then last BEGIN before it" is the right heuristic for duplicate/truncated emissions and the strategy comment captures the WHY
beginMatches[0]for brief extraction is correct — preserves all human-readable prose even in duplicate-BEGIN scenarios- WARN fires only on successful parse — the duplicate-marker warning is inside the
tryblock, avoiding confusing mixed signals on failure - Robust test harness —
Promise.all([stdout, stderr])prevents pipe-buffer deadlock;finallyensures temp dir cleanup on assertion failure - Scope discipline maintained — Tier 2 JSON-wrapper path untouched, no packages/ changes
- All CLAUDE.md rules pass — type safety, no
any, fail-fast, YAGNI, test determinism
No Docs Changes Required
Internal project script fix — transparent to users, no user-facing API or config change, no CHANGELOG entry needed.
Reviewed by Archon comprehensive-pr-review workflow
- Add WARN diagnostic when all BEGIN markers appear after last END (previously silent fallthrough produced misleading terminal error) - Include 200-char candidate preview in JSON parse error message (error was unactionable in headless workflow without raw output) - Wrap mkdirSync/writeFileSync in try-catch with structured PERSIST FAILED message instead of raw Bun stack trace on disk errors - Fix brief assignment comment to explain WHY [0] vs last-pair asymmetry - Add test: BEGIN present but END absent (truncated output) exits 1 - Add test: prose preamble before first heading is stripped from brief - Add clarifying comment to Test 6 noting it is defence-in-depth, not a case that failed under the old indexOf approach
⚡ Self-Fix Report (Aggressive)Status: COMPLETE Fixes Applied (5 total)
View all fixes
Tests Added
Skipped (2)
Suggested Follow-up Issues(none — all actionable findings addressed) Validation✅ Type check | ✅ Lint | ✅ Tests (9 passed) Self-fix by Archon · aggressive mode · fixes pushed to |
…rkers (coleam00#1695) * Fix: extract ARCHON_STATE_JSON markers as standalone lines (coleam00#1674) The persist script extracted Claude's state-JSON block by substring-matching the BEGIN/END markers, which gave a false match whenever the marker string appeared inside the brief's prose or inside a JSON string value (e.g. a PR title narrating this very bug). PR coleam00#1676's switch to lastIndexOf made the self-referential case worse, since the last substring occurrence may now sit inside a JSON value. Match the markers only when they occupy an entire line (line-anchored ^...$ regex with the m flag) and pick the last END plus the last BEGIN before it, so duplicate-emission and substring-in-content both resolve correctly. Changes: - Replace indexOf/lastIndexOf substring matching in Tier 1 with line-anchored matchAll regex over BEGIN/END markers - Add tests for marker substring in brief prose, marker substring inside state JSON value, and combined duplicate-BEGIN + marker-in-prose case - Keep PR coleam00#1676's existing test cases (single block, duplicate BEGINs, JSON-wrapper fallback, no-valid-format exit 1) Fixes coleam00#1674 * fix(scripts): address review findings in maintainer-standup-persist - Add WARN diagnostic when all BEGIN markers appear after last END (previously silent fallthrough produced misleading terminal error) - Include 200-char candidate preview in JSON parse error message (error was unactionable in headless workflow without raw output) - Wrap mkdirSync/writeFileSync in try-catch with structured PERSIST FAILED message instead of raw Bun stack trace on disk errors - Fix brief assignment comment to explain WHY [0] vs last-pair asymmetry - Add test: BEGIN present but END absent (truncated output) exits 1 - Add test: prose preamble before first heading is stripped from brief - Add clarifying comment to Test 6 noting it is defence-in-depth, not a case that failed under the old indexOf approach * simplify: collapse multi-line comment blocks to single lines
Summary
maintainer-standup-persist.tsmatchedARCHON_STATE_JSON_BEGIN/ENDas arbitrary substrings, causingJSON.parseto fail when either marker appeared inside brief prose or as a substring within a JSON string value (e.g., a PR title referencing the bug fix itself)..archon/scripts/maintainer-standup-persist.tsnow uses line-anchored regex (^ARCHON_STATE_JSON_BEGIN$/^ARCHON_STATE_JSON_END$,mflag) withmatchAll, selecting the last completeBEGIN/ENDpair. New test file.archon/scripts/maintainer-standup-persist.test.tscovers 7 cases (4 from PR fix(scripts): handle duplicate ARCHON_STATE_JSON_BEGIN blocks in persist #1676 + 3 new failure modes).packages/code.UX Journey
Before
After
Architecture Diagram
Before
After
Connection inventory:
maintainer-standup.yamlmaintainer-standup-persist.tsmaintainer-standup-minimax.yamlmaintainer-standup-persist.tsmaintainer-standup-persist.ts$ARTIFACTS_DIR/state.jsonLabel Snapshot
risk: lowsize: Sscriptsscripts:maintainer-standup-persistChange Metadata
bugmulti(scripts + test)Linked Issue
lastIndexOffix — this branch is a strict superset; PR fix(scripts): handle duplicate ARCHON_STATE_JSON_BEGIN blocks in persist #1676 should be closed)Validation Evidence (required)
Test breakdown:
Test 6 fails on PR #1676's
lastIndexOfapproach, confirming the regression.bun run validateclean.Security Impact (required)
Compatibility / Migration
Human Verification (required)
BEGIN(Tier 2 fallback runs), multipleENDmarkers (last used),BEGINas first line (empty brief — acceptable).Side Effects / Blast Radius (required)
maintainer-standup.yamlandmaintainer-standup-minimax.yamlboth use the persist script — fix applies to both automatically.BEGINmarker is found (duplicate emission case), preserving PR fix(scripts): handle duplicate ARCHON_STATE_JSON_BEGIN blocks in persist #1676's diagnostic intent.Rollback Plan (required)
git revert 5d51f764and re-push — script reverts to substring matching.FAILEDon the persist node.Risks and Mitigations
matchAllnot available in very old Bun versions.String.prototype.matchAllis ES2020 (Bun 1.0+); all supported Archon targets meet this bar.Summary by CodeRabbit
Bug Fixes
Tests