
feat: no-self-review pattern + auditable decisions drop-box for approval gates #700

@Aureliolo

Context

GitHub Squad independently validates three patterns:

  1. No-self-review: Coordinator prevents original agent from reviewing its own work
  2. Decisions drop-box: Append-only decisions.md for persistent auditable shared knowledge
  3. Versioned agent identities: Agent charters stored as plain text alongside code
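A minimal sketch of the first pattern, assuming a coordinator that knows which agent authored each work item. `WorkItem`, `SelfReviewError`, and `assign_reviewer` are hypothetical names for illustration, not existing SynthOrg APIs.

```python
from dataclasses import dataclass


class SelfReviewError(Exception):
    """Raised when no reviewer other than the author is available."""


@dataclass(frozen=True)
class WorkItem:
    item_id: str
    author_agent: str  # agent that produced the output (hypothetical field)


def assign_reviewer(item: WorkItem, candidates: list[str]) -> str:
    """Return the first candidate that is not the item's author."""
    for agent in candidates:
        if agent != item.author_agent:
            return agent
    # Fail loudly rather than fall back to the author.
    raise SelfReviewError(f"no eligible reviewer for {item.item_id}")
```

Raising instead of falling back to the author is what makes the rule structural: a gate with only the author available cannot pass.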

Why This Matters

SynthOrg's approval gates should structurally prevent self-review (not just by convention). The decisions drop-box is more durable than current org memory consolidation for cross-agent institutional knowledge. Agent Skill Bus (#55) independently arrived at the same append-only pattern (JSONL queue).
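The append-only pattern could look roughly like the sketch below. It assumes a JSONL file with one decision per line; the file layout, field names, and the helpers `append_decision` and `read_decisions` are illustrative assumptions, not the format used by GitHub Squad or Agent Skill Bus.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_decision(log_path: Path, agent: str, decision: str) -> dict:
    """Append one decision record as a JSON line; existing lines are not rewritten."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "agent": agent,
        "decision": decision,
    }
    # Mode "a" writes at the end of the file only.
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record


def read_decisions(log_path: Path) -> list[dict]:
    """Return all recorded decisions in the order they were appended."""
    if not log_path.exists():
        return []
    with log_path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Opening in append mode only constrains this writer. Another process could still edit the file, so keeping it under version control alongside the code is what makes changes auditable.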

Action Items

  • Audit approval gates: ensure original agent cannot review its own output
  • Design an auditable shared-knowledge artifact (versioned org-decisions file/store)
  • Evaluate append-only pattern vs current org memory consolidation approach
  • Consider versioning agent identities/charters as first-class artifacts

References


Additional Research (2026-03-26)

Generator/Evaluator/Planner Separation

Source: Harness Design for Long-Running Application Development (Anthropic Engineering, 2026-03-24)

Key findings that validate and extend the no-self-review pattern:

  • "Out of the box, Claude is a poor QA agent" -- required iterative prompt refinement to achieve reasonable grading. Separate evaluator agents are "far more tractable than making a generator critical of its own work."
  • GAN-inspired adversarial dynamic: Generator and Evaluator improve iteratively through adversarial feedback loops
  • Sprint contract pattern: Generator and Evaluator negotiate expected outcomes before each sprint, bridging high-level specs and testable implementations
  • Context anxiety: models lose coherence as context fills, exhibiting premature task completion. Solution: context resets with clean slates (connects to the compaction work in #687, "research: evaluate agent-controlled compaction for engine hybrid loop")
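The sprint contract and the generator/evaluator loop can be sketched as follows. This is a simplified illustration, not the harness from the cited article: the contract is modeled as named deterministic checks standing in for a separate evaluator agent, and `run_sprint` is a hypothetical name.

```python
from typing import Callable

Check = Callable[[str], bool]


def run_sprint(
    generate: Callable[[list[str]], str],
    contract: dict[str, Check],
    max_rounds: int = 3,
) -> tuple[str, list[str]]:
    """Iterate generator output against the contract until it passes or rounds run out.

    Returns the last output and the names of any checks still failing.
    """
    feedback: list[str] = []
    output = ""
    for _ in range(max_rounds):
        # The generator sees only the evaluator's feedback, never grades itself.
        output = generate(feedback)
        feedback = [name for name, check in contract.items() if not check(output)]
        if not feedback:
            break
    return output, feedback
```

Agreeing on `contract` before the loop starts mirrors the negotiation step: both sides know what "done" means before any output is produced.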

Cost Implications

  • Roughly 20x cost multiplier for multi-agent quality: $9 solo agent vs $200 full harness (3 agents, 6 hours), about 22x. The budget module needs to account for multi-agent QA cost when review gates are active.
  • Harness simplification principle: complexity should decrease as models improve. Components encode assumptions about model limitations -- those assumptions need periodic stress testing.
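A rough sketch of how a budget estimate might apply the cost figures above. The two dollar amounts come from the cited article; `qa_cost_multiplier` and `budgeted_cost` are hypothetical helpers, and applying a single flat multiplier is an assumption made for illustration.

```python
SOLO_COST_USD = 9.0       # solo agent run, from the cited figures
HARNESS_COST_USD = 200.0  # full harness (3 agents, 6 hours), from the cited figures


def qa_cost_multiplier(solo_usd: float, harness_usd: float) -> float:
    """Ratio of full-harness cost to solo-agent cost."""
    return harness_usd / solo_usd


def budgeted_cost(base_usd: float, review_gates_active: bool, multiplier: float) -> float:
    """Scale the base estimate only when review gates add multi-agent QA."""
    return base_usd * multiplier if review_gates_active else base_usd


multiplier = qa_cost_multiplier(SOLO_COST_USD, HARNESS_COST_USD)
print(round(multiplier, 1))  # 22.2
```

In practice the multiplier would likely vary by task and model, which fits the note that such assumptions need periodic stress testing.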

Metadata

Assignees

No one assigned

    Labels

    prio:high (Important, should be prioritized)
    scope:medium (1-3 days of work)
    spec:security (DESIGN_SPEC Section 12 - Security & Approval System)
    spec:task-workflow (DESIGN_SPEC Section 6 - Task & Workflow Engine)
    type:feature (New feature implementation)
    v0.7 (Minor version v0.7)
    v0.7.0 (Patch release v0.7.0)

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
