Skip to content

fix(scripts): handle duplicate ARCHON_STATE_JSON_BEGIN blocks in persist#1676

Closed
kagura-agent wants to merge 1 commit into
coleam00:devfrom
kagura-agent:fix/1674-duplicate-json-begin
Closed

fix(scripts): handle duplicate ARCHON_STATE_JSON_BEGIN blocks in persist#1676
kagura-agent wants to merge 1 commit into
coleam00:devfrom
kagura-agent:fix/1674-duplicate-json-begin

Conversation

@kagura-agent

@kagura-agent kagura-agent commented May 14, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Problem: maintainer-standup-persist.ts fails when the synthesize node emits duplicate ARCHON_STATE_JSON_BEGIN blocks (e.g. a truncated emission followed by a complete one after mid-output restart). indexOf(BEGIN) pairs the first (partial) BEGIN with the first END, slicing across both blocks → JSON.parse fails → exit 1, no brief/state written.
  • Why it matters: Standup workflow reports failure even though a usable state was produced. Requires manual log extraction.
  • What changed: Tier 1 extraction uses lastIndexOf(BEGIN) to find the last (complete) block, then searches for END only after that position. Brief boundary uses indexOf(BEGIN) (first occurrence) so preamble including truncated content is captured and cleaned by the existing heading filter.
  • What did not change: Tier 2 JSON-wrapper fallback, file output paths, heading-strip logic, error reporting.

UX Journey

Before

User                       Archon synthesize           persist script
────                       ──────────────────          ──────────────
runs maintainer-standup ──▶ emits partial state ──┐
                            restarts mid-output    │──▶ reads stdin
                            emits complete state   │    indexOf(BEGIN) → first (partial)
                                                   │    indexOf(END)  → only one
                                                   ▼    slice spans both blocks
                                                        JSON.parse fails
                                                        exits 1
sees "Workflow failed" ◀── no brief, no state.json

After

User                       Archon synthesize           persist script
────                       ──────────────────          ──────────────
runs maintainer-standup ──▶ emits partial state ──┐
                            restarts mid-output    │──▶ reads stdin
                            emits complete state   │    lastIndexOf(BEGIN) → last (complete)
                                                   │    indexOf(END, lastBegin) → correct END
                                                   ▼    slice contains only complete JSON
                                                        JSON.parse succeeds
sees standup brief     ◀── brief + state.json written ✅

Architecture Diagram

Before & After

synthesize node ──stdin──▶ [~] maintainer-standup-persist.ts ──▶ .archon/maintainer-standup/
                                                                    ├── state.json
                                                                    └── briefs/YYYY-MM-DD.md

Only the persist script is modified. No new modules, no changed connections.

Connection inventory:

From To Status Notes
synthesize node persist script Unchanged stdin pipe
persist script filesystem Unchanged writes state.json + brief

Label Snapshot

Label Value
Type Bug fix
Scope .archon/scripts/
Risk Low — isolated script, no runtime imports

Change Metadata

Metric Value
Files changed 1 (+ 1 test file)
Insertions ~90
Deletions 3
New dependencies None
DB migrations None

Linked Issue

Closes #1674

Validation Evidence

$ bun test ./.archon/scripts/maintainer-standup-persist.test.ts

 4 pass, 0 fail
 - single BEGIN/END block succeeds
 - duplicate BEGIN blocks — takes last complete block (fixes #1674)
 - JSON-wrapper fallback works
 - no valid format exits 1

Lint-staged (eslint + prettier) passed on commit.

Security Impact

None — no auth, no network, no user input beyond stdin from a trusted workflow node.

Compatibility/Migration

Fully backward compatible. Single-block inputs behave identically (lastIndexOf == indexOf when there is only one occurrence).

Human Verification

Risks and Mitigations

Risk Mitigation
Multiple END markers indexOf(END, beginIdx) finds first END after last BEGIN — correct pairing
Brief includes truncated JSON Existing heading filter (lines.findIndex(l => l.startsWith("# "))) strips preamble prose

Side Effects/Blast Radius

None. The persist script is standalone — no imports from packages/, no runtime callers beyond the bash node in the standup workflow.

Rollback Plan

Revert this commit. Previous behavior (crash on duplicate blocks) is restored.

Summary by CodeRabbit

  • Tests

    • Added comprehensive test suite for maintainer standup persistence, covering standard and alternate input formats.
  • Bug Fixes

    • Enhanced JSON block parsing to reliably locate correct blocks when duplicate markers are present.

Review Change Stack

…ist (coleam00#1674)

When the synthesize node emits a truncated state block followed by a
complete one (e.g. after a mid-emission restart), the persist script
paired the first BEGIN with the first END — slicing across both blocks
and producing invalid JSON.

Fix: use lastIndexOf(BEGIN) to find the last (complete) block, then
search for END only after that position. Brief text uses the first
BEGIN as its boundary so preamble prose (including truncated earlier
emission) is captured and cleaned by the existing heading filter.

Adds integration tests covering single-block, duplicate-block (coleam00#1674
repro), JSON-wrapper fallback, and invalid-input paths.
@coderabbitai

coderabbitai Bot commented May 14, 2026

Copy link
Copy Markdown

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5cd0d71e-8500-4168-bc85-ceb389c433b4

📥 Commits

Reviewing files that changed from the base of the PR and between 3d290d8 and 2f76174.

📒 Files selected for processing (2)
  • .archon/scripts/maintainer-standup-persist.test.ts
  • .archon/scripts/maintainer-standup-persist.ts

📝 Walkthrough

Walkthrough

This PR fixes a robustness issue in the maintainer-standup-persist script where duplicate ARCHON_STATE_JSON_BEGIN/END blocks (emitted when the synthesize node restarts) caused extraction failure. The fix switches from indexOf to lastIndexOf(BEGIN) with a forward-scanning END, and adds a comprehensive test suite validating single blocks, duplicate blocks, JSON fallback, and error cases.

Changes

Delimiter Extraction and Persistence Robustness

Layer / File(s) Summary
Delimiter extraction logic for duplicate blocks
.archon/scripts/maintainer-standup-persist.ts
Extraction now finds the last ARCHON_STATE_JSON_BEGIN marker and searches forward for the first ARCHON_STATE_JSON_END, enabling the script to recover from multiple incomplete emissions and select the final complete block.
Test helper and validation suite
.archon/scripts/maintainer-standup-persist.test.ts
Added runPersist(stdin) helper that spawns the script in a temp directory and captures persisted state and brief files. Four test cases validate: single-block persistence, duplicate-block recovery (fixes #1674), JSON-wrapper fallback format, and exit-code/stderr handling on invalid input.

Possibly related PRs

  • coleam00/Archon#1480: Introduced the original delimiter-based JSON extraction with fallback; this PR fixes a robustness issue in that extraction logic to handle duplicate marker emissions.

Poem

🐰 When delimiters dance in duplicate,
Last-first shall reign—no more regret!
The persist now grabs the final block,
With tests to guard it, solid as a rock! ✨

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~15 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title concisely and accurately summarizes the main change—fixing duplicate ARCHON_STATE_JSON_BEGIN block handling in the persist script.
Description check ✅ Passed The description follows the template with all required sections: summary, UX journey (before/after flows), architecture diagram, change metadata, linked issue (closes #1674), validation evidence (4 passing tests), security impact, backward compatibility, human verification, risks/mitigations, and rollback plan.
Linked Issues check ✅ Passed The code changes directly address issue #1674: tier-1 extraction now uses lastIndexOf(BEGIN) to find the last complete block instead of pairing first BEGIN with first END, brief extraction preserves the first BEGIN boundary, and tier-2 fallback remains unchanged. Test coverage validates all four objectives (single-block, duplicate-BEGIN fix, JSON-wrapper fallback, invalid input handling).
Out of Scope Changes check ✅ Passed All changes are tightly scoped to the stated objectives: the main script change addresses the duplicate-marker bug, the test file validates the fix and existing behavior, and no unrelated modifications are present.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Wirasm added a commit that referenced this pull request May 15, 2026
…rkers (#1695)

* Fix: extract ARCHON_STATE_JSON markers as standalone lines (#1674)

The persist script extracted Claude's state-JSON block by substring-matching
the BEGIN/END markers, which gave a false match whenever the marker string
appeared inside the brief's prose or inside a JSON string value (e.g. a PR
title narrating this very bug). PR #1676's switch to lastIndexOf made the
self-referential case worse, since the last substring occurrence may now sit
inside a JSON value.

Match the markers only when they occupy an entire line (line-anchored
^...$ regex with the m flag) and pick the last END plus the last BEGIN
before it, so duplicate-emission and substring-in-content both resolve
correctly.

Changes:
- Replace indexOf/lastIndexOf substring matching in Tier 1 with
  line-anchored matchAll regex over BEGIN/END markers
- Add tests for marker substring in brief prose, marker substring inside
  state JSON value, and combined duplicate-BEGIN + marker-in-prose case
- Keep PR #1676's existing test cases (single block, duplicate BEGINs,
  JSON-wrapper fallback, no-valid-format exit 1)

Fixes #1674

* fix(scripts): address review findings in maintainer-standup-persist

- Add WARN diagnostic when all BEGIN markers appear after last END
  (previously silent fallthrough produced misleading terminal error)
- Include 200-char candidate preview in JSON parse error message
  (error was unactionable in headless workflow without raw output)
- Wrap mkdirSync/writeFileSync in try-catch with structured PERSIST FAILED
  message instead of raw Bun stack trace on disk errors
- Fix brief assignment comment to explain WHY [0] vs last-pair asymmetry
- Add test: BEGIN present but END absent (truncated output) exits 1
- Add test: prose preamble before first heading is stripped from brief
- Add clarifying comment to Test 6 noting it is defence-in-depth, not
  a case that failed under the old indexOf approach

* simplify: collapse multi-line comment blocks to single lines
@Wirasm

Wirasm commented May 15, 2026

Copy link
Copy Markdown
Collaborator

@kagura-agent — closing this in favor of #1695, which has just merged with a more complete fix.

Genuine credit owed: #1695 builds directly on your test design. Your four tests (`single BEGIN/END block succeeds`, `duplicate BEGIN blocks — takes last complete block (fixes #1674)`, `JSON-wrapper fallback works`, `no valid format exits 1`) are preserved byte-for-byte in #1695's test file — the scaffolding (`runPersist` helper, temp-dir-per-test isolation, exit-code assertions) is yours. #1695 added five more tests on top to cover the secondary bug class (marker substring inside brief prose / JSON string values) that surfaced after your fix attempt landed.

Why #1695 over this PR: your approach used `lastIndexOf` for substring matching, which fixes the duplicate-emission case but doesn't handle markers appearing as substrings inside content (e.g. a PR title containing `ARCHON_STATE_JSON_BEGIN`). #1695 switches to line-anchored regex (`^ARCHON_STATE_JSON_BEGIN$/gm`), which solves both classes at once.

This wasn't wasted work — your fast turnaround on issue #1674 (3 hours from filing to PR) set the test framework that #1695 extended. Thank you.

@Wirasm Wirasm closed this May 15, 2026
cropse pushed a commit to cropse/Archon that referenced this pull request May 19, 2026
…rkers (coleam00#1695)

* Fix: extract ARCHON_STATE_JSON markers as standalone lines (coleam00#1674)

The persist script extracted Claude's state-JSON block by substring-matching
the BEGIN/END markers, which gave a false match whenever the marker string
appeared inside the brief's prose or inside a JSON string value (e.g. a PR
title narrating this very bug). PR coleam00#1676's switch to lastIndexOf made the
self-referential case worse, since the last substring occurrence may now sit
inside a JSON value.

Match the markers only when they occupy an entire line (line-anchored
^...$ regex with the m flag) and pick the last END plus the last BEGIN
before it, so duplicate-emission and substring-in-content both resolve
correctly.

Changes:
- Replace indexOf/lastIndexOf substring matching in Tier 1 with
  line-anchored matchAll regex over BEGIN/END markers
- Add tests for marker substring in brief prose, marker substring inside
  state JSON value, and combined duplicate-BEGIN + marker-in-prose case
- Keep PR coleam00#1676's existing test cases (single block, duplicate BEGINs,
  JSON-wrapper fallback, no-valid-format exit 1)

Fixes coleam00#1674

* fix(scripts): address review findings in maintainer-standup-persist

- Add WARN diagnostic when all BEGIN markers appear after last END
  (previously silent fallthrough produced misleading terminal error)
- Include 200-char candidate preview in JSON parse error message
  (error was unactionable in headless workflow without raw output)
- Wrap mkdirSync/writeFileSync in try-catch with structured PERSIST FAILED
  message instead of raw Bun stack trace on disk errors
- Fix brief assignment comment to explain WHY [0] vs last-pair asymmetry
- Add test: BEGIN present but END absent (truncated output) exits 1
- Add test: prose preamble before first heading is stripped from brief
- Add clarifying comment to Test 6 noting it is defence-in-depth, not
  a case that failed under the old indexOf approach

* simplify: collapse multi-line comment blocks to single lines
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

maintainer-standup persist fails when synthesize emits duplicate ARCHON_STATE_JSON_BEGIN/END blocks

2 participants