Skip to content

feat(skills): add santa-method - multi-agent adversarial verification#760

Merged
affaan-m merged 2 commits into
affaan-m:mainfrom
fischperson0000000:fischperson0000000-patch-1
Mar 22, 2026
Merged

feat(skills): add santa-method - multi-agent adversarial verification#760
affaan-m merged 2 commits into
affaan-m:mainfrom
fischperson0000000:fischperson0000000-patch-1

Conversation

@fischperson0000000

@fischperson0000000 fischperson0000000 commented Mar 22, 2026

Copy link
Copy Markdown
Contributor

Summary

Adds the Santa Method - a multi-agent adversarial verification framework for post-generation quality assurance.

Core idea: A single agent reviewing its own output shares the same biases and blind spots that produced it. Two independent reviewers with no shared context break this failure mode.

Why This Change

The existing verification-loop skill handles deterministic checks (build, lint, test). Santa Method fills the gap for semantic verification - catching hallucinations, factual errors, compliance violations, and inconsistencies that no linter can detect.

The 4 Phases

  1. Make a List - Generate the deliverable (no changes to normal workflow)
  2. Check It Twice - Two independent review agents evaluate against a shared rubric (context-isolated, parallel)
  3. Naughty or Nice - Both must pass. No partial credit.
  4. Fix Until Nice - Convergence loop with max 3 iterations, then escalate

What's Included

  • Architecture diagram and full phase documentation
  • Reviewer prompt template with structured JSON output
  • Rubric design guide with domain-specific extensions (content, code, compliance)
  • Three implementation patterns: subagent (recommended), sequential inline, batch sampling
  • Failure modes table with mitigations
  • Integration guidance with existing ECC skills (verification-loop, eval-harness, continuous-learning-v2)
  • Metrics framework and cost analysis

Type of Change

  • feat: New feature

Testing Done

  • Manual testing completed - used in production across multiple content generation and code deployment workflows
  • Edge cases considered and tested - includes failure mode mitigations and max iteration caps

Origin: Ronald Skelton - Founder, RapportScore.ai


Summary by cubic

Adds the Santa Method skill, a multi-agent adversarial verification framework with a convergence loop. Two independent reviewers verify generated output against a shared rubric; both must pass before shipping.

  • New Features
    • Adds skills/santa-method/SKILL.md with architecture diagram and four phases (generate, dual review, verdict gate, fix loop).
    • Includes reviewer prompt with structured JSON output and a rubric design guide with domain-specific extensions.
    • Provides three implementation patterns (subagent recommended, inline fallback, batch sampling) and a max-3-iteration convergence loop with escalation.
    • Documents failure mode mitigations, metrics, cost analysis, and integration guidance with verification-loop, eval-harness, and continuous-learning-v2.

Written for commit 90518eb. Summary will update on new commits.

Summary by CodeRabbit

Documentation

  • Added comprehensive documentation for the Santa Method, a multi-agent verification framework. Includes activation criteria, architecture details, control flow phases, reviewer requirements, evaluation rubrics, implementation patterns, failure mitigation strategies, and integration guidance.

Multi-agent adversarial verification with convergence loop. Two independent review agents evaluate output against a shared rubric. Both must pass before shipping. Includes architecture diagram, implementation patterns (subagent, inline, batch sampling), domain-specific rubric extensions, failure mode mitigations, and integration guidance with existing ECC skills.
Expanded the SKILL.md documentation for the Santa Method, detailing architecture, phases, implementation patterns, failure modes, integration with other skills, metrics, and cost analysis.
@coderabbitai

coderabbitai Bot commented Mar 22, 2026

Copy link
Copy Markdown
Contributor
📝 Walkthrough

Walkthrough

A new documentation file introduces the Santa Method, a multi-agent adversarial verification framework. The document outlines a four-phase control flow: generation, dual independent reviews, verdict gate requiring both passes, and iterative fixes. It includes architecture overview, reviewer independence requirements, evaluation rubrics, JSON output specifications, failure modes, mitigations, and integration guidance.

Changes

Cohort / File(s) Summary
Santa Method Framework Documentation
skills/santa-method/SKILL.md
Comprehensive framework documentation (306 lines) defining the multi-agent adversarial verification approach, including activation criteria, four-phase control flow, reviewer independence requirements, structured evaluation rubrics, JSON output formats, convergence loop with iteration caps, failure modes, implementation patterns (parallel, sequential, batch), and integration guidance.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~15 minutes

Suggested reviewers

  • affaan-m

Poem

🎅 A framework checked twice, with agents so clever,
Dual reviews ensuring excellence ever,
Santa's verdict gate keeps quality bright,
Iterating till perfect, making things right! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat(skills): add santa-method - multi-agent adversarial verification' directly and clearly summarizes the main change: introducing a new skill called the Santa Method for multi-agent adversarial verification.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Tip

You can validate your CodeRabbit configuration file in your editor.

If your editor has YAML language server, you can enable auto-completion and validation by adding # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json at the top of your CodeRabbit configuration file.

@greptile-apps

greptile-apps Bot commented Mar 22, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR adds the Santa Method skill — a multi-agent adversarial verification framework that spawns two context-isolated reviewer agents in parallel after generation and gates shipping on both reviewers passing, with a convergence loop of up to three fix iterations before escalating to a human. The concept fills a real gap between the existing verification-loop skill (deterministic checks) and fully manual review by targeting semantic quality issues such as hallucinations, compliance violations, and factual errors.

Key observations from the review:

  • santa_verdict return-type inconsistency (P1): The function returns a bare string "NICE" on the happy path but a 3-tuple on the NAUGHTY path. The call site unpacks it as verdict, issues, suggestions = santa_verdict(...), which will raise ValueError: too many values to unpack in Python the moment both reviewers pass — crashing the loop on its first success. The NICE branch must return a consistent tuple (e.g. return "NICE", [], []).

  • Convergence loop off-by-one (P1): After MAX_ITERATIONS = 3 fixes, fresh review_b/review_c agents are created but the for loop terminates and those agents are never consulted. escalate_to_human is then called with issues from the verdict check before the third fix, potentially escalating stale problem descriptions. A single additional santa_verdict call after the loop would close this gap.

  • santa_batch uncapped recursion (P1): santa_batch calls itself recursively on every NAUGHTY sample result with no depth guard. The skill's own failure-mode table prescribes a "Max iteration cap (3)" for infinite loops, but santa_batch does not implement it — a stubborn batch could trigger a Python RecursionError.

  • Style / format alignment (P2): CLAUDE.md specifies skill sections as "When to Use, How It Works, Examples". This skill uses ## When to Activate and omits both ## How It Works and ## Examples. Additionally, the tools frontmatter field (used by other agent-spawning skills such as eval-harness) is absent despite the skill's core reliance on the Agent tool.

Confidence Score: 2/5

  • The skill introduces three reproducible logic bugs in its pseudocode that need to be fixed before the skill can be reliably used or implemented.
  • While the concept and documentation structure are valuable, the pseudocode examples — which practitioners will directly implement — contain a crash-on-success bug (santa_verdict unpacking), a logic correctness error (last fix never verified), and an unbounded recursion in the batch pattern. These are not hypothetical edge cases: the santa_verdict bug hits every time the output passes review on the first try.
  • skills/santa-method/SKILL.md — specifically the Phase 3 verdict function, Phase 4 convergence loop, and Pattern C batch sampling function.

Important Files Changed

Filename Overview
skills/santa-method/SKILL.md New skill introducing a multi-agent adversarial verification framework. The concept and documentation structure are solid, but the pseudocode contains three reproducible bugs: (1) santa_verdict returns a bare string on NICE causing a tuple-unpacking error at the call site, (2) the convergence loop never checks the output after the final fix iteration, and (3) santa_batch recurses without a depth limit, contradicting the skill's own failure-mode mitigation table.

Sequence Diagram

sequenceDiagram
    participant U as User/Caller
    participant G as Generator (Agent A)
    participant B as Reviewer B (fresh per round)
    participant C as Reviewer C (fresh per round)
    participant V as Verdict Gate
    participant F as Fix Agent
    participant H as Human Escalation

    U->>G: task_spec + rubric
    G-->>U: output (Phase 1: Make a List)

    loop Phase 4: Fix Until Nice (max 3 iterations)
        par Phase 2: Check It Twice (parallel, context-isolated)
            U->>B: output + rubric (no prior context)
            B-->>U: JSON verdict (PASS/FAIL + issues)
        and
            U->>C: output + rubric (no prior context)
            C-->>U: JSON verdict (PASS/FAIL + issues)
        end

        U->>V: (review_b, review_c)
        alt Phase 3: Both PASS → NICE
            V-->>U: NICE → ship output
        else At least one FAIL → NAUGHTY
            V-->>U: NAUGHTY + merged issues
            U->>F: output + critical_issues
            F-->>U: fixed output
            Note over B,C: Fresh agents created (no anchoring bias)
        end
    end

    U->>H: escalate(output, issues) [if max iterations exceeded]
Loading

Reviews (1): Last reviewed commit: "Enhance SKILL.md with detailed Santa Met..." | Re-trigger Greptile

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 issues found across 1 file

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="skills/santa-method/SKILL.md">

<violation number="1" location="skills/santa-method/SKILL.md:93">
P1: Reviewer prompt embeds untrusted output directly with no anti-injection guardrails, allowing adversarial reviewed text to steer verdicts.</violation>

<violation number="2" location="skills/santa-method/SKILL.md:164">
P1: `santa_verdict` returns inconsistent shapes (string vs tuple), causing runtime unpacking failure on the NICE path.</violation>

<violation number="3" location="skills/santa-method/SKILL.md:255">
P1: Batch-sampling pseudocode can crash on small batches and recurse without a convergence cap, risking runtime errors and non-termination.</violation>
</file>

Since this is your first cubic review, here's how it works:

  • cubic automatically reviews your code and comments on bugs and improvements
  • Teach cubic by replying to its comments. cubic learns from your replies and gets better over time
  • Add one-off context when rerunning by tagging @cubic-dev-ai with guidance or docs links (including llms.txt)
  • Ask questions if you need clarification on any suggestion

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

{task_spec}

## Output Under Review
{output}

@cubic-dev-ai cubic-dev-ai Bot Mar 22, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Reviewer prompt embeds untrusted output directly with no anti-injection guardrails, allowing adversarial reviewed text to steer verdicts.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At skills/santa-method/SKILL.md, line 93:

<comment>Reviewer prompt embeds untrusted output directly with no anti-injection guardrails, allowing adversarial reviewed text to steer verdicts.</comment>

<file context>
@@ -0,0 +1,306 @@
+{task_spec}
+
+## Output Under Review
+{output}
+
+## Evaluation Rubric
</file context>
Fix with Cubic

import random

def santa_batch(items, rubric, sample_rate=0.15):
sample = random.sample(items, max(5, int(len(items) * sample_rate)))

@cubic-dev-ai cubic-dev-ai Bot Mar 22, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Batch-sampling pseudocode can crash on small batches and recurse without a convergence cap, risking runtime errors and non-termination.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At skills/santa-method/SKILL.md, line 255:

<comment>Batch-sampling pseudocode can crash on small batches and recurse without a convergence cap, risking runtime errors and non-termination.</comment>

<file context>
@@ -0,0 +1,306 @@
+import random
+
+def santa_batch(items, rubric, sample_rate=0.15):
+    sample = random.sample(items, max(5, int(len(items) * sample_rate)))
+
+    for item in sample:
</file context>
Fix with Cubic

def santa_verdict(review_b, review_c):
"""Both reviewers must pass. No partial credit."""
if review_b.verdict == "PASS" and review_c.verdict == "PASS":
return "NICE" # Ship it

@cubic-dev-ai cubic-dev-ai Bot Mar 22, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: santa_verdict returns inconsistent shapes (string vs tuple), causing runtime unpacking failure on the NICE path.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At skills/santa-method/SKILL.md, line 164:

<comment>`santa_verdict` returns inconsistent shapes (string vs tuple), causing runtime unpacking failure on the NICE path.</comment>

<file context>
@@ -0,0 +1,306 @@
+def santa_verdict(review_b, review_c):
+    """Both reviewers must pass. No partial credit."""
+    if review_b.verdict == "PASS" and review_c.verdict == "PASS":
+        return "NICE"  # Ship it
+
+    # Merge flags from both reviewers, deduplicate
</file context>
Suggested change
return "NICE" # Ship it
return "NICE", [], [] # Ship it
Fix with Cubic

Comment on lines +161 to +170
def santa_verdict(review_b, review_c):
"""Both reviewers must pass. No partial credit."""
if review_b.verdict == "PASS" and review_c.verdict == "PASS":
return "NICE" # Ship it

# Merge flags from both reviewers, deduplicate
all_issues = dedupe(review_b.critical_issues + review_c.critical_issues)
all_suggestions = dedupe(review_b.suggestions + review_c.suggestions)

return "NAUGHTY", all_issues, all_suggestions

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 santa_verdict return value causes tuple-unpacking error on NICE path

santa_verdict returns the bare string "NICE" on success, but the call site at line 181 attempts to destructure the result as verdict, issues, suggestions = santa_verdict(review_b, review_c). In Python, unpacking a 4-character string into exactly 3 variables raises ValueError: too many values to unpack (expected 3) — so any successful first-pass would crash the convergence loop instead of shipping.

The fix is to return a consistent tuple from santa_verdict in both branches:

Suggested change
def santa_verdict(review_b, review_c):
"""Both reviewers must pass. No partial credit."""
if review_b.verdict == "PASS" and review_c.verdict == "PASS":
return "NICE" # Ship it
# Merge flags from both reviewers, deduplicate
all_issues = dedupe(review_b.critical_issues + review_c.critical_issues)
all_suggestions = dedupe(review_b.suggestions + review_c.suggestions)
return "NAUGHTY", all_issues, all_suggestions
def santa_verdict(review_b, review_c):
"""Both reviewers must pass. No partial credit."""
if review_b.verdict == "PASS" and review_c.verdict == "PASS":
return "NICE", [], [] # Ship it
# Merge flags from both reviewers, deduplicate
all_issues = dedupe(review_b.critical_issues + review_c.critical_issues)
all_suggestions = dedupe(review_b.suggestions + review_c.suggestions)
return "NAUGHTY", all_issues, all_suggestions

Comment on lines +178 to +200
MAX_ITERATIONS = 3

for iteration in range(MAX_ITERATIONS):
verdict, issues, suggestions = santa_verdict(review_b, review_c)

if verdict == "NICE":
log_santa_result(output, iteration, "passed")
return ship(output)

# Fix all critical issues (suggestions are optional)
output = fix_agent.execute(
output=output,
issues=issues,
instruction="Fix ONLY the flagged issues. Do not refactor or add unrequested changes."
)

# Re-run BOTH reviewers on fixed output (fresh agents, no memory of previous round)
review_b = Agent(prompt=REVIEWER_PROMPT.format(output=output, ...))
review_c = Agent(prompt=REVIEWER_PROMPT.format(output=output, ...))

# Exhausted iterations — escalate
log_santa_result(output, MAX_ITERATIONS, "escalated")
escalate_to_human(output, issues)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Convergence loop off-by-one: the final fix's output is never verified

With MAX_ITERATIONS = 3, the loop runs iterations 0, 1, and 2. At the end of iteration 2 the code applies a third fix and creates fresh review_b/review_c agents — but the loop ends immediately after and those agents are never consulted. escalate_to_human is then called with issues from the verdict check before the third fix, which may no longer accurately describe the current (post-fix) state of output.

The architecture diagram also shows inconsistency: it says if i > MAX: escalate (implying the check happens before the fix on the last iteration) but the pseudocode fixes first and then escalates — the final fix is silently dropped.

A minimal correction is to perform one final verdict check after the loop:

MAX_ITERATIONS = 3

for iteration in range(MAX_ITERATIONS):
    verdict, issues, suggestions = santa_verdict(review_b, review_c)

    if verdict == "NICE":
        log_santa_result(output, iteration, "passed")
        return ship(output)

    output = fix_agent.execute(
        output=output,
        issues=issues,
        instruction="Fix ONLY the flagged issues. Do not refactor or add unrequested changes."
    )

    review_b = Agent(prompt=REVIEWER_PROMPT.format(output=output, ...))
    review_c = Agent(prompt=REVIEWER_PROMPT.format(output=output, ...))

# One final check on the output produced by the last fix
verdict, issues, suggestions = santa_verdict(review_b, review_c)
if verdict == "NICE":
    log_santa_result(output, MAX_ITERATIONS, "passed")
    return ship(output)

# Genuinely exhausted — escalate with current issues
log_santa_result(output, MAX_ITERATIONS, "escalated")
escalate_to_human(output, issues)

Comment on lines +254 to +264
def santa_batch(items, rubric, sample_rate=0.15):
sample = random.sample(items, max(5, int(len(items) * sample_rate)))

for item in sample:
result = santa_full(item, rubric)
if result.verdict == "NAUGHTY":
pattern = classify_failure(result.issues)
items = batch_fix(items, pattern) # Fix all items matching pattern
return santa_batch(items, rubric) # Re-sample

return items # Clean sample → ship batch

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 santa_batch has no recursion depth limit — contradicts stated failure-mode mitigation

santa_batch calls itself recursively every time a NAUGHTY item is found. There is no max_iterations guard, so if batch_fix never fully resolves the failure pattern (or introduces new ones), this recurses until a Python RecursionError is raised. The failure modes table in the same document explicitly lists "Infinite loop" and prescribes a "Max iteration cap (3)" as the mitigation — but santa_batch does not implement it.

Suggested change
def santa_batch(items, rubric, sample_rate=0.15):
sample = random.sample(items, max(5, int(len(items) * sample_rate)))
for item in sample:
result = santa_full(item, rubric)
if result.verdict == "NAUGHTY":
pattern = classify_failure(result.issues)
items = batch_fix(items, pattern) # Fix all items matching pattern
return santa_batch(items, rubric) # Re-sample
return items # Clean sample → ship batch
def santa_batch(items, rubric, sample_rate=0.15, _depth=0, max_depth=3):
if _depth >= max_depth:
raise RuntimeError(f"santa_batch: exceeded max fix depth ({max_depth}). Escalate to human.")
sample = random.sample(items, max(5, int(len(items) * sample_rate)))
for item in sample:
result = santa_full(item, rubric)
if result.verdict == "NAUGHTY":
pattern = classify_failure(result.issues)
items = batch_fix(items, pattern) # Fix all items matching pattern
return santa_batch(items, rubric, sample_rate, _depth + 1, max_depth)
return items # Clean sample → ship batch

Comment on lines +1 to +5
---
name: santa-method
description: "Multi-agent adversarial verification with convergence loop. Two independent review agents must both pass before output ships."
origin: "Ronald Skelton - Founder, RapportScore.ai"
---

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Missing tools frontmatter field and non-standard section headings

CLAUDE.md specifies that skills should have "Clear sections (When to Use, How It Works, Examples)". Two deviations:

  1. Section headings — this skill uses ## When to Activate instead of the standard ## When to Use, and there is no ## How It Works or ## Examples section. The detailed phase/pattern content could be reorganised under those headings to stay consistent with the rest of the skill library.

  2. Missing tools frontmatter — Other skills that invoke the Agent tool (e.g. eval-harness) declare tools: Read, Write, Edit, Bash, Grep, Glob in their YAML frontmatter. Since this skill explicitly relies on spawning subagents, adding a tools key helps the harness and users understand the required permissions upfront.

Suggested change
---
name: santa-method
description: "Multi-agent adversarial verification with convergence loop. Two independent review agents must both pass before output ships."
origin: "Ronald Skelton - Founder, RapportScore.ai"
---
---
name: santa-method
description: "Multi-agent adversarial verification with convergence loop. Two independent review agents must both pass before output ships."
origin: "Ronald Skelton - Founder, RapportScore.ai"
tools: Agent
---

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@skills/santa-method/SKILL.md`:
- Line 13: Change the section header "## When to Activate" to "## When to Use"
in SKILL.md (look for the header string "When to Activate"), and ensure the file
includes the required sibling sections "How It Works" and "Examples" (add them
if missing) so the skill follows the repository documentation standard.
- Line 65: The document's headings don't follow the repo skill format: move the
existing "Phase Details" section to be a subsection under a new top-level "How
It Works" heading; specifically add a top-level "How It Works" header and nest
"Phase Details" beneath it (e.g., "## How It Works" followed by "### Phase
Details"), ensuring the rest of the content is reflowed under that new structure
so the skill matches repository patterns.
- Line 206: The doc is missing a top-level "Examples" section and
"Implementation Patterns" should be moved under it to match repo guidelines;
update SKILL.md to ensure there are clear top-level headings "When to Use", "How
It Works", and "Examples", create an "Examples" section if absent, relocate the
existing "Implementation Patterns" content under that "Examples" heading
(renaming if needed), and adjust any Table of Contents or cross-references so
the new heading references (e.g., "Implementation Patterns" -> "Examples /
Implementation Patterns") remain correct.
- Around line 1-305: The workflow-quality module's paths array in the install
modules manifest is missing the new skill; add the string "skills/santa-method"
to the paths array alongside existing entries like "verification-loop",
"eval-harness", and "continuous-learning-v2" so the Santa Method skill is
registered for installation; ensure the value is quoted and comma-separated
consistently with the other path entries and commit the manifest change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a437ed88-a5bd-45f8-8996-78dc6ca254cb

📥 Commits

Reviewing files that changed from the base of the PR and between 57fa3b5 and 90518eb.

📒 Files selected for processing (1)
  • skills/santa-method/SKILL.md

Comment on lines +1 to +305
---
name: santa-method
description: "Multi-agent adversarial verification with convergence loop. Two independent review agents must both pass before output ships."
origin: "Ronald Skelton - Founder, RapportScore.ai"
---

# Santa Method

Multi-agent adversarial verification framework. Make a list, check it twice. If it's naughty, fix it until it's nice.

The core insight: a single agent reviewing its own output shares the same biases, knowledge gaps, and systematic errors that produced the output. Two independent reviewers with no shared context break this failure mode.

## When to Activate

Invoke this skill when:
- Output will be published, deployed, or consumed by end users
- Compliance, regulatory, or brand constraints must be enforced
- Code ships to production without human review
- Content accuracy matters (technical docs, educational material, customer-facing copy)
- Batch generation at scale where spot-checking misses systemic patterns
- Hallucination risk is elevated (claims, statistics, API references, legal language)

Do NOT use for internal drafts, exploratory research, or tasks with deterministic verification (use build/test/lint pipelines for those).

## Architecture

```
┌─────────────┐
│ GENERATOR │ Phase 1: Make a List
│ (Agent A) │ Produce the deliverable
└──────┬───────┘
│ output
┌──────────────────────────────┐
│ DUAL INDEPENDENT REVIEW │ Phase 2: Check It Twice
│ │
│ ┌───────────┐ ┌───────────┐ │ Two agents, same rubric,
│ │ Reviewer B │ │ Reviewer C │ │ no shared context
│ └─────┬─────┘ └─────┬─────┘ │
│ │ │ │
└────────┼──────────────┼────────┘
│ │
▼ ▼
┌──────────────────────────────┐
│ VERDICT GATE │ Phase 3: Naughty or Nice
│ │
│ B passes AND C passes → NICE │ Both must pass.
│ Otherwise → NAUGHTY │ No exceptions.
└──────┬──────────────┬─────────┘
│ │
NICE NAUGHTY
│ │
▼ ▼
[ SHIP ] ┌─────────────┐
│ FIX CYCLE │ Phase 4: Fix Until Nice
│ │
│ iteration++ │ Collect all flags.
│ if i > MAX: │ Fix all issues.
│ escalate │ Re-run both reviewers.
│ else: │ Loop until convergence.
│ goto Ph.2 │
└──────────────┘
```

## Phase Details

### Phase 1: Make a List (Generate)

Execute the primary task. No changes to your normal generation workflow. Santa Method is a post-generation verification layer, not a generation strategy.

```python
# The generator runs as normal
output = generate(task_spec)
```

### Phase 2: Check It Twice (Independent Dual Review)

Spawn two review agents in parallel. Critical invariants:

1. **Context isolation** — neither reviewer sees the other's assessment
2. **Identical rubric** — both receive the same evaluation criteria
3. **Same inputs** — both receive the original spec AND the generated output
4. **Structured output** — each returns a typed verdict, not prose

```python
REVIEWER_PROMPT = """
You are an independent quality reviewer. You have NOT seen any other review of this output.

## Task Specification
{task_spec}

## Output Under Review
{output}

## Evaluation Rubric
{rubric}

## Instructions
Evaluate the output against EACH rubric criterion. For each:
- PASS: criterion fully met, no issues
- FAIL: specific issue found (cite the exact problem)

Return your assessment as structured JSON:
{
"verdict": "PASS" | "FAIL",
"checks": [
{"criterion": "...", "result": "PASS|FAIL", "detail": "..."}
],
"critical_issues": ["..."], // blockers that must be fixed
"suggestions": ["..."] // non-blocking improvements
}

Be rigorous. Your job is to find problems, not to approve.
"""
```

```python
# Spawn reviewers in parallel (Claude Code subagents)
review_b = Agent(prompt=REVIEWER_PROMPT.format(...), description="Santa Reviewer B")
review_c = Agent(prompt=REVIEWER_PROMPT.format(...), description="Santa Reviewer C")

# Both run concurrently — neither sees the other
```

### Rubric Design

The rubric is the most important input. Vague rubrics produce vague reviews. Every criterion must have an objective pass/fail condition.

| Criterion | Pass Condition | Failure Signal |
|-----------|---------------|----------------|
| Factual accuracy | All claims verifiable against source material or common knowledge | Invented statistics, wrong version numbers, nonexistent APIs |
| Hallucination-free | No fabricated entities, quotes, URLs, or references | Links to pages that don't exist, attributed quotes with no source |
| Completeness | Every requirement in the spec is addressed | Missing sections, skipped edge cases, incomplete coverage |
| Compliance | Passes all project-specific constraints | Banned terms used, tone violations, regulatory non-compliance |
| Internal consistency | No contradictions within the output | Section A says X, section B says not-X |
| Technical correctness | Code compiles/runs, algorithms are sound | Syntax errors, logic bugs, wrong complexity claims |

#### Domain-Specific Rubric Extensions

**Content/Marketing:**
- Brand voice adherence
- SEO requirements met (keyword density, meta tags, structure)
- No competitor trademark misuse
- CTA present and correctly linked

**Code:**
- Type safety (no `any` leaks, proper null handling)
- Error handling coverage
- Security (no secrets in code, input validation, injection prevention)
- Test coverage for new paths

**Compliance-Sensitive (regulated, legal, financial):**
- No outcome guarantees or unsubstantiated claims
- Required disclaimers present
- Approved terminology only
- Jurisdiction-appropriate language

### Phase 3: Naughty or Nice (Verdict Gate)

```python
def santa_verdict(review_b, review_c):
"""Both reviewers must pass. No partial credit."""
if review_b.verdict == "PASS" and review_c.verdict == "PASS":
return "NICE" # Ship it

# Merge flags from both reviewers, deduplicate
all_issues = dedupe(review_b.critical_issues + review_c.critical_issues)
all_suggestions = dedupe(review_b.suggestions + review_c.suggestions)

return "NAUGHTY", all_issues, all_suggestions
```

Why both must pass: if only one reviewer catches an issue, that issue is real. The other reviewer's blind spot is exactly the failure mode Santa Method exists to eliminate.

### Phase 4: Fix Until Nice (Convergence Loop)

```python
MAX_ITERATIONS = 3

for iteration in range(MAX_ITERATIONS):
verdict, issues, suggestions = santa_verdict(review_b, review_c)

if verdict == "NICE":
log_santa_result(output, iteration, "passed")
return ship(output)

# Fix all critical issues (suggestions are optional)
output = fix_agent.execute(
output=output,
issues=issues,
instruction="Fix ONLY the flagged issues. Do not refactor or add unrequested changes."
)

# Re-run BOTH reviewers on fixed output (fresh agents, no memory of previous round)
review_b = Agent(prompt=REVIEWER_PROMPT.format(output=output, ...))
review_c = Agent(prompt=REVIEWER_PROMPT.format(output=output, ...))

# Exhausted iterations — escalate
log_santa_result(output, MAX_ITERATIONS, "escalated")
escalate_to_human(output, issues)
```

Critical: each review round uses **fresh agents**. Reviewers must not carry memory from previous rounds, as prior context creates anchoring bias.

## Implementation Patterns

### Pattern A: Claude Code Subagents (Recommended)

Subagents provide true context isolation. Each reviewer is a separate process with no shared state.

```bash
# In a Claude Code session, use the Agent tool to spawn reviewers
# Both agents run in parallel for speed
```

```python
# Pseudocode for Agent tool invocation
reviewer_b = Agent(
description="Santa Review B",
prompt=f"Review this output for quality...\n\nRUBRIC:\n{rubric}\n\nOUTPUT:\n{output}"
)
reviewer_c = Agent(
description="Santa Review C",
prompt=f"Review this output for quality...\n\nRUBRIC:\n{rubric}\n\nOUTPUT:\n{output}"
)
```

### Pattern B: Sequential Inline (Fallback)

When subagents aren't available, simulate isolation with explicit context resets:

1. Generate output
2. New context: "You are Reviewer 1. Evaluate ONLY against this rubric. Find problems."
3. Record findings verbatim
4. Clear context completely
5. New context: "You are Reviewer 2. Evaluate ONLY against this rubric. Find problems."
6. Compare both reviews, fix, repeat

The subagent pattern is strictly superior — inline simulation risks context bleed between reviewers.

### Pattern C: Batch Sampling

For large batches (100+ items), full Santa on every item is cost-prohibitive. Use stratified sampling:

1. Run Santa on a random sample (10-15% of batch, minimum 5 items)
2. Categorize failures by type (hallucination, compliance, completeness, etc.)
3. If systematic patterns emerge, apply targeted fixes to the entire batch
4. Re-sample and re-verify the fixed batch
5. Continue until a clean sample passes

```python
import random

def santa_batch(items, rubric, sample_rate=0.15):
sample = random.sample(items, max(5, int(len(items) * sample_rate)))

for item in sample:
result = santa_full(item, rubric)
if result.verdict == "NAUGHTY":
pattern = classify_failure(result.issues)
items = batch_fix(items, pattern) # Fix all items matching pattern
return santa_batch(items, rubric) # Re-sample

return items # Clean sample → ship batch
```

## Failure Modes and Mitigations

| Failure Mode | Symptom | Mitigation |
|-------------|---------|------------|
| Infinite loop | Reviewers keep finding new issues after fixes | Max iteration cap (3). Escalate. |
| Rubber stamping | Both reviewers pass everything | Adversarial prompt: "Your job is to find problems, not approve." |
| Subjective drift | Reviewers flag style preferences, not errors | Tight rubric with objective pass/fail criteria only |
| Fix regression | Fixing issue A introduces issue B | Fresh reviewers each round catch regressions |
| Reviewer agreement bias | Both reviewers miss the same thing | Mitigated by independence, not eliminated. For critical output, add a third reviewer or human spot-check. |
| Cost explosion | Too many iterations on large outputs | Batch sampling pattern. Budget caps per verification cycle. |

## Integration with Other Skills

| Skill | Relationship |
|-------|-------------|
| Verification Loop | Use for deterministic checks (build, lint, test). Santa for semantic checks (accuracy, hallucinations). Run verification-loop first, Santa second. |
| Eval Harness | Santa Method results feed eval metrics. Track pass@k across Santa runs to measure generator quality over time. |
| Continuous Learning v2 | Santa findings become instincts. Repeated failures on the same criterion → learned behavior to avoid the pattern. |
| Strategic Compact | Run Santa BEFORE compacting. Don't lose review context mid-verification. |

## Metrics

Track these to measure Santa Method effectiveness:

- **First-pass rate**: % of outputs that pass Santa on round 1 (target: >70%)
- **Mean iterations to convergence**: average rounds to NICE (target: <1.5)
- **Issue taxonomy**: distribution of failure types (hallucination vs. completeness vs. compliance)
- **Reviewer agreement**: % of issues flagged by both reviewers vs. only one (low agreement = rubric needs tightening)
- **Escape rate**: issues found post-ship that Santa should have caught (target: 0)

## Cost Analysis

Santa Method costs approximately 2-3x the token cost of generation alone per verification cycle. For most high-stakes output, this is a bargain:

```
Cost of Santa = (generation tokens) + 2×(review tokens per round) × (avg rounds)
Cost of NOT Santa = (reputation damage) + (correction effort) + (trust erosion)
```

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Check if santa-method is registered in the workflow-quality module

# Search for the workflow-quality module and display its paths array
jq '.[] | select(.id == "workflow-quality") | {id, paths}' manifests/install-modules.json

Repository: affaan-m/everything-claude-code

Length of output: 162


🏁 Script executed:

#!/bin/bash
# First, check if the manifest file exists and view its overall structure
echo "=== Checking if manifest file exists ==="
if [ -f "manifests/install-modules.json" ]; then
    echo "File found. First 50 lines:"
    head -50 manifests/install-modules.json
    echo ""
    echo "=== Total lines in file ==="
    wc -l manifests/install-modules.json
else
    echo "File not found at manifests/install-modules.json"
    # Try to find it
    echo "Searching for install-modules.json:"
    find . -name "install-modules.json" -type f 2>/dev/null
fi

Repository: affaan-m/everything-claude-code

Length of output: 1283


🏁 Script executed:

#!/bin/bash
# Search for the workflow-quality module in the manifest
echo "=== Searching for workflow-quality module ==="
grep -n "workflow-quality" manifests/install-modules.json

echo ""
echo "=== Checking if santa-method is mentioned in manifest ==="
grep -n "santa-method" manifests/install-modules.json

echo ""
echo "=== Using jq to extract workflow-quality module ==="
python3 <<'PYTHON_EOF'
import json

with open('manifests/install-modules.json', 'r') as f:
    data = json.load(f)

# Find workflow-quality module
for module in data.get('modules', []):
    if module.get('id') == 'workflow-quality':
        print("Found workflow-quality module:")
        print(json.dumps(module, indent=2))
        break
else:
    print("workflow-quality module not found")
    print("\nAvailable modules:")
    for module in data.get('modules', []):
        print(f"  - {module.get('id')}")
PYTHON_EOF

Repository: affaan-m/everything-claude-code

Length of output: 1116


Add santa-method to the workflow-quality module manifest.

The new skill is not registered in manifests/install-modules.json. It should be added to the workflow-quality module's paths array (currently at line 182) alongside related skills like verification-loop, eval-harness, and continuous-learning-v2.

Add "skills/santa-method" to the paths array in the workflow-quality module definition.

🧰 Tools
🪛 LanguageTool

[style] ~173-~173: Consider an alternative for the overused word “exactly”.
Context: ...eal. The other reviewer's blind spot is exactly the failure mode Santa Method exists to...

(EXACTLY_PRECISELY)


[style] ~235-~235: Consider using a different adverb to strengthen your wording.
Context: ...cord findings verbatim 4. Clear context completely 5. New context: "You are Reviewer 2. Ev...

(COMPLETELY_ENTIRELY)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/santa-method/SKILL.md` around lines 1 - 305, The workflow-quality
module's paths array in the install modules manifest is missing the new skill;
add the string "skills/santa-method" to the paths array alongside existing
entries like "verification-loop", "eval-harness", and "continuous-learning-v2"
so the Santa Method skill is registered for installation; ensure the value is
quoted and comma-separated consistently with the other path entries and commit
the manifest change.


The core insight: a single agent reviewing its own output shares the same biases, knowledge gaps, and systematic errors that produced the output. Two independent reviewers with no shared context break this failure mode.

## When to Activate

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Rename section to match repository guidelines.

The section title should be "When to Use" rather than "When to Activate" to align with the repository's skill documentation standards. As per coding guidelines, skills must be formatted with clear sections for "When to Use", "How It Works", and "Examples".

📝 Proposed fix
-## When to Activate
+## When to Use
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
## When to Activate
## When to Use
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/santa-method/SKILL.md` at line 13, Change the section header "## When
to Activate" to "## When to Use" in SKILL.md (look for the header string "When
to Activate"), and ensure the file includes the required sibling sections "How
It Works" and "Examples" (add them if missing) so the skill follows the
repository documentation standard.

└──────────────┘
```

## Phase Details

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Restructure to align with repository skill format.

The repository guidelines require an explicit "How It Works" section for skill documentation. Consider restructuring the content so that "Phase Details" becomes a subsection under a top-level "How It Works" section. This would align the document with established repository patterns for skills.

📝 Suggested structure
-## Phase Details
+## How It Works
+
+### Phase Details
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
## Phase Details
## How It Works
### Phase Details
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/santa-method/SKILL.md` at line 65, The document's headings don't
follow the repo skill format: move the existing "Phase Details" section to be a
subsection under a new top-level "How It Works" heading; specifically add a
top-level "How It Works" header and nest "Phase Details" beneath it (e.g., "##
How It Works" followed by "### Phase Details"), ensuring the rest of the content
is reflowed under that new structure so the skill matches repository patterns.

Critical: each review round uses **fresh agents**. Reviewers must not carry memory from previous rounds, as prior context creates anchoring bias.

## Implementation Patterns

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Restructure to include explicit "Examples" section.

Repository guidelines require skills to have clear sections for "When to Use", "How It Works", and "Examples". Consider moving or reorganizing "Implementation Patterns" under a top-level "Examples" section to align with established repository patterns.

📝 Suggested structure
-## Implementation Patterns
+## Examples
+
+### Implementation Patterns
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/santa-method/SKILL.md` at line 206, The doc is missing a top-level
"Examples" section and "Implementation Patterns" should be moved under it to
match repo guidelines; update SKILL.md to ensure there are clear top-level
headings "When to Use", "How It Works", and "Examples", create an "Examples"
section if absent, relocate the existing "Implementation Patterns" content under
that "Examples" heading (renaming if needed), and adjust any Table of Contents
or cross-references so the new heading references (e.g., "Implementation
Patterns" -> "Examples / Implementation Patterns") remain correct.

@affaan-m

Copy link
Copy Markdown
Owner

Interesting approach to multi-agent verification. Will review this week.

@fischperson0000000

fischperson0000000 commented Mar 22, 2026 via email

Copy link
Copy Markdown
Contributor Author

@affaan-m affaan-m left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well-written adversarial verification pattern. Self-contained, integrates nicely with existing ECC skills.

@affaan-m affaan-m merged commit 6f13b05 into affaan-m:main Mar 22, 2026
4 checks passed
mecemis pushed a commit to mecemis/everything-claude-code that referenced this pull request Mar 25, 2026
…affaan-m#760)

* feat(skills): add santa-method

Multi-agent adversarial verification with convergence loop. Two independent review agents evaluate output against a shared rubric. Both must pass before shipping. Includes architecture diagram, implementation patterns (subagent, inline, batch sampling), domain-specific rubric extensions, failure mode mitigations, and integration guidance with existing ECC skills.

* Enhance SKILL.md with detailed Santa Method documentation

Expanded the SKILL.md documentation for the Santa Method, detailing architecture, phases, implementation patterns, failure modes, integration with other skills, metrics, and cost analysis.
FrancescoRosciano pushed a commit to FRosciano-Mambo/everything-claude-code that referenced this pull request Jun 1, 2026
…affaan-m#760)

* feat(skills): add santa-method

Multi-agent adversarial verification with convergence loop. Two independent review agents evaluate output against a shared rubric. Both must pass before shipping. Includes architecture diagram, implementation patterns (subagent, inline, batch sampling), domain-specific rubric extensions, failure mode mitigations, and integration guidance with existing ECC skills.

* Enhance SKILL.md with detailed Santa Method documentation

Expanded the SKILL.md documentation for the Santa Method, detailing architecture, phases, implementation patterns, failure modes, integration with other skills, metrics, and cost analysis.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants