feat(skills): add santa-method - multi-agent adversarial verification#760
Conversation
Multi-agent adversarial verification with convergence loop. Two independent review agents evaluate output against a shared rubric. Both must pass before shipping. Includes architecture diagram, implementation patterns (subagent, inline, batch sampling), domain-specific rubric extensions, failure mode mitigations, and integration guidance with existing ECC skills.
Expanded the SKILL.md documentation for the Santa Method, detailing architecture, phases, implementation patterns, failure modes, integration with other skills, metrics, and cost analysis.
📝 WalkthroughWalkthroughA new documentation file introduces the Santa Method, a multi-agent adversarial verification framework. The document outlines a four-phase control flow: generation, dual independent reviews, verdict gate requiring both passes, and iterative fixes. It includes architecture overview, reviewer independence requirements, evaluation rubrics, JSON output specifications, failure modes, mitigations, and integration guidance. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~15 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment Tip You can validate your CodeRabbit configuration file in your editor.If your editor has YAML language server, you can enable auto-completion and validation by adding |
Greptile SummaryThis PR adds the Santa Method skill — a multi-agent adversarial verification framework that spawns two context-isolated reviewer agents in parallel after generation and gates shipping on both reviewers passing, with a convergence loop of up to three fix iterations before escalating to a human. The concept fills a real gap between the existing Key observations from the review:
Confidence Score: 2/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant U as User/Caller
participant G as Generator (Agent A)
participant B as Reviewer B (fresh per round)
participant C as Reviewer C (fresh per round)
participant V as Verdict Gate
participant F as Fix Agent
participant H as Human Escalation
U->>G: task_spec + rubric
G-->>U: output (Phase 1: Make a List)
loop Phase 4: Fix Until Nice (max 3 iterations)
par Phase 2: Check It Twice (parallel, context-isolated)
U->>B: output + rubric (no prior context)
B-->>U: JSON verdict (PASS/FAIL + issues)
and
U->>C: output + rubric (no prior context)
C-->>U: JSON verdict (PASS/FAIL + issues)
end
U->>V: (review_b, review_c)
alt Phase 3: Both PASS → NICE
V-->>U: NICE → ship output
else At least one FAIL → NAUGHTY
V-->>U: NAUGHTY + merged issues
U->>F: output + critical_issues
F-->>U: fixed output
Note over B,C: Fresh agents created (no anchoring bias)
end
end
U->>H: escalate(output, issues) [if max iterations exceeded]
Reviews (1): Last reviewed commit: "Enhance SKILL.md with detailed Santa Met..." | Re-trigger Greptile |
There was a problem hiding this comment.
3 issues found across 1 file
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="skills/santa-method/SKILL.md">
<violation number="1" location="skills/santa-method/SKILL.md:93">
P1: Reviewer prompt embeds untrusted output directly with no anti-injection guardrails, allowing adversarial reviewed text to steer verdicts.</violation>
<violation number="2" location="skills/santa-method/SKILL.md:164">
P1: `santa_verdict` returns inconsistent shapes (string vs tuple), causing runtime unpacking failure on the NICE path.</violation>
<violation number="3" location="skills/santa-method/SKILL.md:255">
P1: Batch-sampling pseudocode can crash on small batches and recurse without a convergence cap, risking runtime errors and non-termination.</violation>
</file>
Since this is your first cubic review, here's how it works:
- cubic automatically reviews your code and comments on bugs and improvements
- Teach cubic by replying to its comments. cubic learns from your replies and gets better over time
- Add one-off context when rerunning by tagging
@cubic-dev-aiwith guidance or docs links (includingllms.txt) - Ask questions if you need clarification on any suggestion
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| {task_spec} | ||
|
|
||
| ## Output Under Review | ||
| {output} |
There was a problem hiding this comment.
P1: Reviewer prompt embeds untrusted output directly with no anti-injection guardrails, allowing adversarial reviewed text to steer verdicts.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At skills/santa-method/SKILL.md, line 93:
<comment>Reviewer prompt embeds untrusted output directly with no anti-injection guardrails, allowing adversarial reviewed text to steer verdicts.</comment>
<file context>
@@ -0,0 +1,306 @@
+{task_spec}
+
+## Output Under Review
+{output}
+
+## Evaluation Rubric
</file context>
| import random | ||
|
|
||
| def santa_batch(items, rubric, sample_rate=0.15): | ||
| sample = random.sample(items, max(5, int(len(items) * sample_rate))) |
There was a problem hiding this comment.
P1: Batch-sampling pseudocode can crash on small batches and recurse without a convergence cap, risking runtime errors and non-termination.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At skills/santa-method/SKILL.md, line 255:
<comment>Batch-sampling pseudocode can crash on small batches and recurse without a convergence cap, risking runtime errors and non-termination.</comment>
<file context>
@@ -0,0 +1,306 @@
+import random
+
+def santa_batch(items, rubric, sample_rate=0.15):
+ sample = random.sample(items, max(5, int(len(items) * sample_rate)))
+
+ for item in sample:
</file context>
| def santa_verdict(review_b, review_c): | ||
| """Both reviewers must pass. No partial credit.""" | ||
| if review_b.verdict == "PASS" and review_c.verdict == "PASS": | ||
| return "NICE" # Ship it |
There was a problem hiding this comment.
P1: santa_verdict returns inconsistent shapes (string vs tuple), causing runtime unpacking failure on the NICE path.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At skills/santa-method/SKILL.md, line 164:
<comment>`santa_verdict` returns inconsistent shapes (string vs tuple), causing runtime unpacking failure on the NICE path.</comment>
<file context>
@@ -0,0 +1,306 @@
+def santa_verdict(review_b, review_c):
+ """Both reviewers must pass. No partial credit."""
+ if review_b.verdict == "PASS" and review_c.verdict == "PASS":
+ return "NICE" # Ship it
+
+ # Merge flags from both reviewers, deduplicate
</file context>
| return "NICE" # Ship it | |
| return "NICE", [], [] # Ship it |
| def santa_verdict(review_b, review_c): | ||
| """Both reviewers must pass. No partial credit.""" | ||
| if review_b.verdict == "PASS" and review_c.verdict == "PASS": | ||
| return "NICE" # Ship it | ||
|
|
||
| # Merge flags from both reviewers, deduplicate | ||
| all_issues = dedupe(review_b.critical_issues + review_c.critical_issues) | ||
| all_suggestions = dedupe(review_b.suggestions + review_c.suggestions) | ||
|
|
||
| return "NAUGHTY", all_issues, all_suggestions |
There was a problem hiding this comment.
santa_verdict return value causes tuple-unpacking error on NICE path
santa_verdict returns the bare string "NICE" on success, but the call site at line 181 attempts to destructure the result as verdict, issues, suggestions = santa_verdict(review_b, review_c). In Python, unpacking a 4-character string into exactly 3 variables raises ValueError: too many values to unpack (expected 3) — so any successful first-pass would crash the convergence loop instead of shipping.
The fix is to return a consistent tuple from santa_verdict in both branches:
| def santa_verdict(review_b, review_c): | |
| """Both reviewers must pass. No partial credit.""" | |
| if review_b.verdict == "PASS" and review_c.verdict == "PASS": | |
| return "NICE" # Ship it | |
| # Merge flags from both reviewers, deduplicate | |
| all_issues = dedupe(review_b.critical_issues + review_c.critical_issues) | |
| all_suggestions = dedupe(review_b.suggestions + review_c.suggestions) | |
| return "NAUGHTY", all_issues, all_suggestions | |
| def santa_verdict(review_b, review_c): | |
| """Both reviewers must pass. No partial credit.""" | |
| if review_b.verdict == "PASS" and review_c.verdict == "PASS": | |
| return "NICE", [], [] # Ship it | |
| # Merge flags from both reviewers, deduplicate | |
| all_issues = dedupe(review_b.critical_issues + review_c.critical_issues) | |
| all_suggestions = dedupe(review_b.suggestions + review_c.suggestions) | |
| return "NAUGHTY", all_issues, all_suggestions |
| MAX_ITERATIONS = 3 | ||
|
|
||
| for iteration in range(MAX_ITERATIONS): | ||
| verdict, issues, suggestions = santa_verdict(review_b, review_c) | ||
|
|
||
| if verdict == "NICE": | ||
| log_santa_result(output, iteration, "passed") | ||
| return ship(output) | ||
|
|
||
| # Fix all critical issues (suggestions are optional) | ||
| output = fix_agent.execute( | ||
| output=output, | ||
| issues=issues, | ||
| instruction="Fix ONLY the flagged issues. Do not refactor or add unrequested changes." | ||
| ) | ||
|
|
||
| # Re-run BOTH reviewers on fixed output (fresh agents, no memory of previous round) | ||
| review_b = Agent(prompt=REVIEWER_PROMPT.format(output=output, ...)) | ||
| review_c = Agent(prompt=REVIEWER_PROMPT.format(output=output, ...)) | ||
|
|
||
| # Exhausted iterations — escalate | ||
| log_santa_result(output, MAX_ITERATIONS, "escalated") | ||
| escalate_to_human(output, issues) |
There was a problem hiding this comment.
Convergence loop off-by-one: the final fix's output is never verified
With MAX_ITERATIONS = 3, the loop runs iterations 0, 1, and 2. At the end of iteration 2 the code applies a third fix and creates fresh review_b/review_c agents — but the loop ends immediately after and those agents are never consulted. escalate_to_human is then called with issues from the verdict check before the third fix, which may no longer accurately describe the current (post-fix) state of output.
The architecture diagram also shows inconsistency: it says if i > MAX: escalate (implying the check happens before the fix on the last iteration) but the pseudocode fixes first and then escalates — the final fix is silently dropped.
A minimal correction is to perform one final verdict check after the loop:
MAX_ITERATIONS = 3
for iteration in range(MAX_ITERATIONS):
verdict, issues, suggestions = santa_verdict(review_b, review_c)
if verdict == "NICE":
log_santa_result(output, iteration, "passed")
return ship(output)
output = fix_agent.execute(
output=output,
issues=issues,
instruction="Fix ONLY the flagged issues. Do not refactor or add unrequested changes."
)
review_b = Agent(prompt=REVIEWER_PROMPT.format(output=output, ...))
review_c = Agent(prompt=REVIEWER_PROMPT.format(output=output, ...))
# One final check on the output produced by the last fix
verdict, issues, suggestions = santa_verdict(review_b, review_c)
if verdict == "NICE":
log_santa_result(output, MAX_ITERATIONS, "passed")
return ship(output)
# Genuinely exhausted — escalate with current issues
log_santa_result(output, MAX_ITERATIONS, "escalated")
escalate_to_human(output, issues)| def santa_batch(items, rubric, sample_rate=0.15): | ||
| sample = random.sample(items, max(5, int(len(items) * sample_rate))) | ||
|
|
||
| for item in sample: | ||
| result = santa_full(item, rubric) | ||
| if result.verdict == "NAUGHTY": | ||
| pattern = classify_failure(result.issues) | ||
| items = batch_fix(items, pattern) # Fix all items matching pattern | ||
| return santa_batch(items, rubric) # Re-sample | ||
|
|
||
| return items # Clean sample → ship batch |
There was a problem hiding this comment.
santa_batch has no recursion depth limit — contradicts stated failure-mode mitigation
santa_batch calls itself recursively every time a NAUGHTY item is found. There is no max_iterations guard, so if batch_fix never fully resolves the failure pattern (or introduces new ones), this recurses until a Python RecursionError is raised. The failure modes table in the same document explicitly lists "Infinite loop" and prescribes a "Max iteration cap (3)" as the mitigation — but santa_batch does not implement it.
| def santa_batch(items, rubric, sample_rate=0.15): | |
| sample = random.sample(items, max(5, int(len(items) * sample_rate))) | |
| for item in sample: | |
| result = santa_full(item, rubric) | |
| if result.verdict == "NAUGHTY": | |
| pattern = classify_failure(result.issues) | |
| items = batch_fix(items, pattern) # Fix all items matching pattern | |
| return santa_batch(items, rubric) # Re-sample | |
| return items # Clean sample → ship batch | |
| def santa_batch(items, rubric, sample_rate=0.15, _depth=0, max_depth=3): | |
| if _depth >= max_depth: | |
| raise RuntimeError(f"santa_batch: exceeded max fix depth ({max_depth}). Escalate to human.") | |
| sample = random.sample(items, max(5, int(len(items) * sample_rate))) | |
| for item in sample: | |
| result = santa_full(item, rubric) | |
| if result.verdict == "NAUGHTY": | |
| pattern = classify_failure(result.issues) | |
| items = batch_fix(items, pattern) # Fix all items matching pattern | |
| return santa_batch(items, rubric, sample_rate, _depth + 1, max_depth) | |
| return items # Clean sample → ship batch |
| --- | ||
| name: santa-method | ||
| description: "Multi-agent adversarial verification with convergence loop. Two independent review agents must both pass before output ships." | ||
| origin: "Ronald Skelton - Founder, RapportScore.ai" | ||
| --- |
There was a problem hiding this comment.
Missing
tools frontmatter field and non-standard section headings
CLAUDE.md specifies that skills should have "Clear sections (When to Use, How It Works, Examples)". Two deviations:
-
Section headings — this skill uses
## When to Activateinstead of the standard## When to Use, and there is no## How It Worksor## Examplessection. The detailed phase/pattern content could be reorganised under those headings to stay consistent with the rest of the skill library. -
Missing
toolsfrontmatter — Other skills that invoke theAgenttool (e.g.eval-harness) declaretools: Read, Write, Edit, Bash, Grep, Globin their YAML frontmatter. Since this skill explicitly relies on spawning subagents, adding atoolskey helps the harness and users understand the required permissions upfront.
| --- | |
| name: santa-method | |
| description: "Multi-agent adversarial verification with convergence loop. Two independent review agents must both pass before output ships." | |
| origin: "Ronald Skelton - Founder, RapportScore.ai" | |
| --- | |
| --- | |
| name: santa-method | |
| description: "Multi-agent adversarial verification with convergence loop. Two independent review agents must both pass before output ships." | |
| origin: "Ronald Skelton - Founder, RapportScore.ai" | |
| tools: Agent | |
| --- |
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@skills/santa-method/SKILL.md`:
- Line 13: Change the section header "## When to Activate" to "## When to Use"
in SKILL.md (look for the header string "When to Activate"), and ensure the file
includes the required sibling sections "How It Works" and "Examples" (add them
if missing) so the skill follows the repository documentation standard.
- Line 65: The document's headings don't follow the repo skill format: move the
existing "Phase Details" section to be a subsection under a new top-level "How
It Works" heading; specifically add a top-level "How It Works" header and nest
"Phase Details" beneath it (e.g., "## How It Works" followed by "### Phase
Details"), ensuring the rest of the content is reflowed under that new structure
so the skill matches repository patterns.
- Line 206: The doc is missing a top-level "Examples" section and
"Implementation Patterns" should be moved under it to match repo guidelines;
update SKILL.md to ensure there are clear top-level headings "When to Use", "How
It Works", and "Examples", create an "Examples" section if absent, relocate the
existing "Implementation Patterns" content under that "Examples" heading
(renaming if needed), and adjust any Table of Contents or cross-references so
the new heading references (e.g., "Implementation Patterns" -> "Examples /
Implementation Patterns") remain correct.
- Around line 1-305: The workflow-quality module's paths array in the install
modules manifest is missing the new skill; add the string "skills/santa-method"
to the paths array alongside existing entries like "verification-loop",
"eval-harness", and "continuous-learning-v2" so the Santa Method skill is
registered for installation; ensure the value is quoted and comma-separated
consistently with the other path entries and commit the manifest change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a437ed88-a5bd-45f8-8996-78dc6ca254cb
📒 Files selected for processing (1)
skills/santa-method/SKILL.md
| --- | ||
| name: santa-method | ||
| description: "Multi-agent adversarial verification with convergence loop. Two independent review agents must both pass before output ships." | ||
| origin: "Ronald Skelton - Founder, RapportScore.ai" | ||
| --- | ||
|
|
||
| # Santa Method | ||
|
|
||
| Multi-agent adversarial verification framework. Make a list, check it twice. If it's naughty, fix it until it's nice. | ||
|
|
||
| The core insight: a single agent reviewing its own output shares the same biases, knowledge gaps, and systematic errors that produced the output. Two independent reviewers with no shared context break this failure mode. | ||
|
|
||
| ## When to Activate | ||
|
|
||
| Invoke this skill when: | ||
| - Output will be published, deployed, or consumed by end users | ||
| - Compliance, regulatory, or brand constraints must be enforced | ||
| - Code ships to production without human review | ||
| - Content accuracy matters (technical docs, educational material, customer-facing copy) | ||
| - Batch generation at scale where spot-checking misses systemic patterns | ||
| - Hallucination risk is elevated (claims, statistics, API references, legal language) | ||
|
|
||
| Do NOT use for internal drafts, exploratory research, or tasks with deterministic verification (use build/test/lint pipelines for those). | ||
|
|
||
| ## Architecture | ||
|
|
||
| ``` | ||
| ┌─────────────┐ | ||
| │ GENERATOR │ Phase 1: Make a List | ||
| │ (Agent A) │ Produce the deliverable | ||
| └──────┬───────┘ | ||
| │ output | ||
| ▼ | ||
| ┌──────────────────────────────┐ | ||
| │ DUAL INDEPENDENT REVIEW │ Phase 2: Check It Twice | ||
| │ │ | ||
| │ ┌───────────┐ ┌───────────┐ │ Two agents, same rubric, | ||
| │ │ Reviewer B │ │ Reviewer C │ │ no shared context | ||
| │ └─────┬─────┘ └─────┬─────┘ │ | ||
| │ │ │ │ | ||
| └────────┼──────────────┼────────┘ | ||
| │ │ | ||
| ▼ ▼ | ||
| ┌──────────────────────────────┐ | ||
| │ VERDICT GATE │ Phase 3: Naughty or Nice | ||
| │ │ | ||
| │ B passes AND C passes → NICE │ Both must pass. | ||
| │ Otherwise → NAUGHTY │ No exceptions. | ||
| └──────┬──────────────┬─────────┘ | ||
| │ │ | ||
| NICE NAUGHTY | ||
| │ │ | ||
| ▼ ▼ | ||
| [ SHIP ] ┌─────────────┐ | ||
| │ FIX CYCLE │ Phase 4: Fix Until Nice | ||
| │ │ | ||
| │ iteration++ │ Collect all flags. | ||
| │ if i > MAX: │ Fix all issues. | ||
| │ escalate │ Re-run both reviewers. | ||
| │ else: │ Loop until convergence. | ||
| │ goto Ph.2 │ | ||
| └──────────────┘ | ||
| ``` | ||
|
|
||
| ## Phase Details | ||
|
|
||
| ### Phase 1: Make a List (Generate) | ||
|
|
||
| Execute the primary task. No changes to your normal generation workflow. Santa Method is a post-generation verification layer, not a generation strategy. | ||
|
|
||
| ```python | ||
| # The generator runs as normal | ||
| output = generate(task_spec) | ||
| ``` | ||
|
|
||
| ### Phase 2: Check It Twice (Independent Dual Review) | ||
|
|
||
| Spawn two review agents in parallel. Critical invariants: | ||
|
|
||
| 1. **Context isolation** — neither reviewer sees the other's assessment | ||
| 2. **Identical rubric** — both receive the same evaluation criteria | ||
| 3. **Same inputs** — both receive the original spec AND the generated output | ||
| 4. **Structured output** — each returns a typed verdict, not prose | ||
|
|
||
| ```python | ||
| REVIEWER_PROMPT = """ | ||
| You are an independent quality reviewer. You have NOT seen any other review of this output. | ||
|
|
||
| ## Task Specification | ||
| {task_spec} | ||
|
|
||
| ## Output Under Review | ||
| {output} | ||
|
|
||
| ## Evaluation Rubric | ||
| {rubric} | ||
|
|
||
| ## Instructions | ||
| Evaluate the output against EACH rubric criterion. For each: | ||
| - PASS: criterion fully met, no issues | ||
| - FAIL: specific issue found (cite the exact problem) | ||
|
|
||
| Return your assessment as structured JSON: | ||
| { | ||
| "verdict": "PASS" | "FAIL", | ||
| "checks": [ | ||
| {"criterion": "...", "result": "PASS|FAIL", "detail": "..."} | ||
| ], | ||
| "critical_issues": ["..."], // blockers that must be fixed | ||
| "suggestions": ["..."] // non-blocking improvements | ||
| } | ||
|
|
||
| Be rigorous. Your job is to find problems, not to approve. | ||
| """ | ||
| ``` | ||
|
|
||
| ```python | ||
| # Spawn reviewers in parallel (Claude Code subagents) | ||
| review_b = Agent(prompt=REVIEWER_PROMPT.format(...), description="Santa Reviewer B") | ||
| review_c = Agent(prompt=REVIEWER_PROMPT.format(...), description="Santa Reviewer C") | ||
|
|
||
| # Both run concurrently — neither sees the other | ||
| ``` | ||
|
|
||
| ### Rubric Design | ||
|
|
||
| The rubric is the most important input. Vague rubrics produce vague reviews. Every criterion must have an objective pass/fail condition. | ||
|
|
||
| | Criterion | Pass Condition | Failure Signal | | ||
| |-----------|---------------|----------------| | ||
| | Factual accuracy | All claims verifiable against source material or common knowledge | Invented statistics, wrong version numbers, nonexistent APIs | | ||
| | Hallucination-free | No fabricated entities, quotes, URLs, or references | Links to pages that don't exist, attributed quotes with no source | | ||
| | Completeness | Every requirement in the spec is addressed | Missing sections, skipped edge cases, incomplete coverage | | ||
| | Compliance | Passes all project-specific constraints | Banned terms used, tone violations, regulatory non-compliance | | ||
| | Internal consistency | No contradictions within the output | Section A says X, section B says not-X | | ||
| | Technical correctness | Code compiles/runs, algorithms are sound | Syntax errors, logic bugs, wrong complexity claims | | ||
|
|
||
| #### Domain-Specific Rubric Extensions | ||
|
|
||
| **Content/Marketing:** | ||
| - Brand voice adherence | ||
| - SEO requirements met (keyword density, meta tags, structure) | ||
| - No competitor trademark misuse | ||
| - CTA present and correctly linked | ||
|
|
||
| **Code:** | ||
| - Type safety (no `any` leaks, proper null handling) | ||
| - Error handling coverage | ||
| - Security (no secrets in code, input validation, injection prevention) | ||
| - Test coverage for new paths | ||
|
|
||
| **Compliance-Sensitive (regulated, legal, financial):** | ||
| - No outcome guarantees or unsubstantiated claims | ||
| - Required disclaimers present | ||
| - Approved terminology only | ||
| - Jurisdiction-appropriate language | ||
|
|
||
| ### Phase 3: Naughty or Nice (Verdict Gate) | ||
|
|
||
| ```python | ||
| def santa_verdict(review_b, review_c): | ||
| """Both reviewers must pass. No partial credit.""" | ||
| if review_b.verdict == "PASS" and review_c.verdict == "PASS": | ||
| return "NICE" # Ship it | ||
|
|
||
| # Merge flags from both reviewers, deduplicate | ||
| all_issues = dedupe(review_b.critical_issues + review_c.critical_issues) | ||
| all_suggestions = dedupe(review_b.suggestions + review_c.suggestions) | ||
|
|
||
| return "NAUGHTY", all_issues, all_suggestions | ||
| ``` | ||
|
|
||
| Why both must pass: if only one reviewer catches an issue, that issue is real. The other reviewer's blind spot is exactly the failure mode Santa Method exists to eliminate. | ||
|
|
||
| ### Phase 4: Fix Until Nice (Convergence Loop) | ||
|
|
||
| ```python | ||
| MAX_ITERATIONS = 3 | ||
|
|
||
| for iteration in range(MAX_ITERATIONS): | ||
| verdict, issues, suggestions = santa_verdict(review_b, review_c) | ||
|
|
||
| if verdict == "NICE": | ||
| log_santa_result(output, iteration, "passed") | ||
| return ship(output) | ||
|
|
||
| # Fix all critical issues (suggestions are optional) | ||
| output = fix_agent.execute( | ||
| output=output, | ||
| issues=issues, | ||
| instruction="Fix ONLY the flagged issues. Do not refactor or add unrequested changes." | ||
| ) | ||
|
|
||
| # Re-run BOTH reviewers on fixed output (fresh agents, no memory of previous round) | ||
| review_b = Agent(prompt=REVIEWER_PROMPT.format(output=output, ...)) | ||
| review_c = Agent(prompt=REVIEWER_PROMPT.format(output=output, ...)) | ||
|
|
||
| # Exhausted iterations — escalate | ||
| log_santa_result(output, MAX_ITERATIONS, "escalated") | ||
| escalate_to_human(output, issues) | ||
| ``` | ||
|
|
||
| Critical: each review round uses **fresh agents**. Reviewers must not carry memory from previous rounds, as prior context creates anchoring bias. | ||
|
|
||
| ## Implementation Patterns | ||
|
|
||
| ### Pattern A: Claude Code Subagents (Recommended) | ||
|
|
||
| Subagents provide true context isolation. Each reviewer is a separate process with no shared state. | ||
|
|
||
| ```bash | ||
| # In a Claude Code session, use the Agent tool to spawn reviewers | ||
| # Both agents run in parallel for speed | ||
| ``` | ||
|
|
||
| ```python | ||
| # Pseudocode for Agent tool invocation | ||
| reviewer_b = Agent( | ||
| description="Santa Review B", | ||
| prompt=f"Review this output for quality...\n\nRUBRIC:\n{rubric}\n\nOUTPUT:\n{output}" | ||
| ) | ||
| reviewer_c = Agent( | ||
| description="Santa Review C", | ||
| prompt=f"Review this output for quality...\n\nRUBRIC:\n{rubric}\n\nOUTPUT:\n{output}" | ||
| ) | ||
| ``` | ||
|
|
||
| ### Pattern B: Sequential Inline (Fallback) | ||
|
|
||
| When subagents aren't available, simulate isolation with explicit context resets: | ||
|
|
||
| 1. Generate output | ||
| 2. New context: "You are Reviewer 1. Evaluate ONLY against this rubric. Find problems." | ||
| 3. Record findings verbatim | ||
| 4. Clear context completely | ||
| 5. New context: "You are Reviewer 2. Evaluate ONLY against this rubric. Find problems." | ||
| 6. Compare both reviews, fix, repeat | ||
|
|
||
| The subagent pattern is strictly superior — inline simulation risks context bleed between reviewers. | ||
|
|
||
| ### Pattern C: Batch Sampling | ||
|
|
||
| For large batches (100+ items), full Santa on every item is cost-prohibitive. Use stratified sampling: | ||
|
|
||
| 1. Run Santa on a random sample (10-15% of batch, minimum 5 items) | ||
| 2. Categorize failures by type (hallucination, compliance, completeness, etc.) | ||
| 3. If systematic patterns emerge, apply targeted fixes to the entire batch | ||
| 4. Re-sample and re-verify the fixed batch | ||
| 5. Continue until a clean sample passes | ||
|
|
||
| ```python | ||
| import random | ||
|
|
||
| def santa_batch(items, rubric, sample_rate=0.15): | ||
| sample = random.sample(items, max(5, int(len(items) * sample_rate))) | ||
|
|
||
| for item in sample: | ||
| result = santa_full(item, rubric) | ||
| if result.verdict == "NAUGHTY": | ||
| pattern = classify_failure(result.issues) | ||
| items = batch_fix(items, pattern) # Fix all items matching pattern | ||
| return santa_batch(items, rubric) # Re-sample | ||
|
|
||
| return items # Clean sample → ship batch | ||
| ``` | ||
|
|
||
| ## Failure Modes and Mitigations | ||
|
|
||
| | Failure Mode | Symptom | Mitigation | | ||
| |-------------|---------|------------| | ||
| | Infinite loop | Reviewers keep finding new issues after fixes | Max iteration cap (3). Escalate. | | ||
| | Rubber stamping | Both reviewers pass everything | Adversarial prompt: "Your job is to find problems, not approve." | | ||
| | Subjective drift | Reviewers flag style preferences, not errors | Tight rubric with objective pass/fail criteria only | | ||
| | Fix regression | Fixing issue A introduces issue B | Fresh reviewers each round catch regressions | | ||
| | Reviewer agreement bias | Both reviewers miss the same thing | Mitigated by independence, not eliminated. For critical output, add a third reviewer or human spot-check. | | ||
| | Cost explosion | Too many iterations on large outputs | Batch sampling pattern. Budget caps per verification cycle. | | ||
|
|
||
| ## Integration with Other Skills | ||
|
|
||
| | Skill | Relationship | | ||
| |-------|-------------| | ||
| | Verification Loop | Use for deterministic checks (build, lint, test). Santa for semantic checks (accuracy, hallucinations). Run verification-loop first, Santa second. | | ||
| | Eval Harness | Santa Method results feed eval metrics. Track pass@k across Santa runs to measure generator quality over time. | | ||
| | Continuous Learning v2 | Santa findings become instincts. Repeated failures on the same criterion → learned behavior to avoid the pattern. | | ||
| | Strategic Compact | Run Santa BEFORE compacting. Don't lose review context mid-verification. | | ||
|
|
||
| ## Metrics | ||
|
|
||
| Track these to measure Santa Method effectiveness: | ||
|
|
||
| - **First-pass rate**: % of outputs that pass Santa on round 1 (target: >70%) | ||
| - **Mean iterations to convergence**: average rounds to NICE (target: <1.5) | ||
| - **Issue taxonomy**: distribution of failure types (hallucination vs. completeness vs. compliance) | ||
| - **Reviewer agreement**: % of issues flagged by both reviewers vs. only one (low agreement = rubric needs tightening) | ||
| - **Escape rate**: issues found post-ship that Santa should have caught (target: 0) | ||
|
|
||
| ## Cost Analysis | ||
|
|
||
| Santa Method costs approximately 2-3x the token cost of generation alone per verification cycle. For most high-stakes output, this is a bargain: | ||
|
|
||
| ``` | ||
| Cost of Santa = (generation tokens) + 2×(review tokens per round) × (avg rounds) | ||
| Cost of NOT Santa = (reputation damage) + (correction effort) + (trust erosion) | ||
| ``` | ||
|
|
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Check if santa-method is registered in the workflow-quality module
# Search for the workflow-quality module and display its paths array
jq '.[] | select(.id == "workflow-quality") | {id, paths}' manifests/install-modules.jsonRepository: affaan-m/everything-claude-code
Length of output: 162
🏁 Script executed:
#!/bin/bash
# First, check if the manifest file exists and view its overall structure
echo "=== Checking if manifest file exists ==="
if [ -f "manifests/install-modules.json" ]; then
echo "File found. First 50 lines:"
head -50 manifests/install-modules.json
echo ""
echo "=== Total lines in file ==="
wc -l manifests/install-modules.json
else
echo "File not found at manifests/install-modules.json"
# Try to find it
echo "Searching for install-modules.json:"
find . -name "install-modules.json" -type f 2>/dev/null
fiRepository: affaan-m/everything-claude-code
Length of output: 1283
🏁 Script executed:
#!/bin/bash
# Search for the workflow-quality module in the manifest
echo "=== Searching for workflow-quality module ==="
grep -n "workflow-quality" manifests/install-modules.json
echo ""
echo "=== Checking if santa-method is mentioned in manifest ==="
grep -n "santa-method" manifests/install-modules.json
echo ""
echo "=== Using jq to extract workflow-quality module ==="
python3 <<'PYTHON_EOF'
import json
with open('manifests/install-modules.json', 'r') as f:
data = json.load(f)
# Find workflow-quality module
for module in data.get('modules', []):
if module.get('id') == 'workflow-quality':
print("Found workflow-quality module:")
print(json.dumps(module, indent=2))
break
else:
print("workflow-quality module not found")
print("\nAvailable modules:")
for module in data.get('modules', []):
print(f" - {module.get('id')}")
PYTHON_EOFRepository: affaan-m/everything-claude-code
Length of output: 1116
Add santa-method to the workflow-quality module manifest.
The new skill is not registered in manifests/install-modules.json. It should be added to the workflow-quality module's paths array (currently at line 182) alongside related skills like verification-loop, eval-harness, and continuous-learning-v2.
Add "skills/santa-method" to the paths array in the workflow-quality module definition.
🧰 Tools
🪛 LanguageTool
[style] ~173-~173: Consider an alternative for the overused word “exactly”.
Context: ...eal. The other reviewer's blind spot is exactly the failure mode Santa Method exists to...
(EXACTLY_PRECISELY)
[style] ~235-~235: Consider using a different adverb to strengthen your wording.
Context: ...cord findings verbatim 4. Clear context completely 5. New context: "You are Reviewer 2. Ev...
(COMPLETELY_ENTIRELY)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@skills/santa-method/SKILL.md` around lines 1 - 305, The workflow-quality
module's paths array in the install modules manifest is missing the new skill;
add the string "skills/santa-method" to the paths array alongside existing
entries like "verification-loop", "eval-harness", and "continuous-learning-v2"
so the Santa Method skill is registered for installation; ensure the value is
quoted and comma-separated consistently with the other path entries and commit
the manifest change.
|
|
||
| The core insight: a single agent reviewing its own output shares the same biases, knowledge gaps, and systematic errors that produced the output. Two independent reviewers with no shared context break this failure mode. | ||
|
|
||
| ## When to Activate |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Rename section to match repository guidelines.
The section title should be "When to Use" rather than "When to Activate" to align with the repository's skill documentation standards. As per coding guidelines, skills must be formatted with clear sections for "When to Use", "How It Works", and "Examples".
📝 Proposed fix
-## When to Activate
+## When to Use📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ## When to Activate | |
| ## When to Use |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@skills/santa-method/SKILL.md` at line 13, Change the section header "## When
to Activate" to "## When to Use" in SKILL.md (look for the header string "When
to Activate"), and ensure the file includes the required sibling sections "How
It Works" and "Examples" (add them if missing) so the skill follows the
repository documentation standard.
| └──────────────┘ | ||
| ``` | ||
|
|
||
| ## Phase Details |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Restructure to align with repository skill format.
The repository guidelines require an explicit "How It Works" section for skill documentation. Consider restructuring the content so that "Phase Details" becomes a subsection under a top-level "How It Works" section. This would align the document with established repository patterns for skills.
📝 Suggested structure
-## Phase Details
+## How It Works
+
+### Phase Details📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ## Phase Details | |
| ## How It Works | |
| ### Phase Details |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@skills/santa-method/SKILL.md` at line 65, The document's headings don't
follow the repo skill format: move the existing "Phase Details" section to be a
subsection under a new top-level "How It Works" heading; specifically add a
top-level "How It Works" header and nest "Phase Details" beneath it (e.g., "##
How It Works" followed by "### Phase Details"), ensuring the rest of the content
is reflowed under that new structure so the skill matches repository patterns.
| Critical: each review round uses **fresh agents**. Reviewers must not carry memory from previous rounds, as prior context creates anchoring bias. | ||
|
|
||
| ## Implementation Patterns | ||
|
|
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Restructure to include explicit "Examples" section.
Repository guidelines require skills to have clear sections for "When to Use", "How It Works", and "Examples". Consider moving or reorganizing "Implementation Patterns" under a top-level "Examples" section to align with established repository patterns.
📝 Suggested structure
-## Implementation Patterns
+## Examples
+
+### Implementation Patterns🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@skills/santa-method/SKILL.md` at line 206, The doc is missing a top-level
"Examples" section and "Implementation Patterns" should be moved under it to
match repo guidelines; update SKILL.md to ensure there are clear top-level
headings "When to Use", "How It Works", and "Examples", create an "Examples"
section if absent, relocate the existing "Implementation Patterns" content under
that "Examples" heading (renaming if needed), and adjust any Table of Contents
or cross-references so the new heading references (e.g., "Implementation
Patterns" -> "Examples / Implementation Patterns") remain correct.
|
Interesting approach to multi-agent verification. Will review this week. |
|
I have the changes recommended by the auto reviews ready. I’ll update it
when I’m back in the office.
Ronald Skelton
CEO/Founder RapportScore.ai
Conversational Intelligence and training platform.
918-409-9141
***@***.***
…On Sun, Mar 22, 2026 at 3:01 PM Affaan Mustafa ***@***.***> wrote:
*affaan-m* left a comment (affaan-m/ECC#760)
<#760 (comment)>
Interesting approach to multi-agent verification. Will review this week.
—
Reply to this email directly, view it on GitHub
<#760>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/BONB4L7SRR5UVX4XR7D3Z734SBPELAVCNFSM6AAAAACW23OQIKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHM2DCMBXGA3DSMZSG4>
.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
affaan-m
left a comment
There was a problem hiding this comment.
Well-written adversarial verification pattern. Self-contained, integrates nicely with existing ECC skills.
…affaan-m#760) * feat(skills): add santa-method Multi-agent adversarial verification with convergence loop. Two independent review agents evaluate output against a shared rubric. Both must pass before shipping. Includes architecture diagram, implementation patterns (subagent, inline, batch sampling), domain-specific rubric extensions, failure mode mitigations, and integration guidance with existing ECC skills. * Enhance SKILL.md with detailed Santa Method documentation Expanded the SKILL.md documentation for the Santa Method, detailing architecture, phases, implementation patterns, failure modes, integration with other skills, metrics, and cost analysis.
…affaan-m#760) * feat(skills): add santa-method Multi-agent adversarial verification with convergence loop. Two independent review agents evaluate output against a shared rubric. Both must pass before shipping. Includes architecture diagram, implementation patterns (subagent, inline, batch sampling), domain-specific rubric extensions, failure mode mitigations, and integration guidance with existing ECC skills. * Enhance SKILL.md with detailed Santa Method documentation Expanded the SKILL.md documentation for the Santa Method, detailing architecture, phases, implementation patterns, failure modes, integration with other skills, metrics, and cost analysis.
Summary
Adds the Santa Method - a multi-agent adversarial verification framework for post-generation quality assurance.
Core idea: A single agent reviewing its own output shares the same biases and blind spots that produced it. Two independent reviewers with no shared context break this failure mode.
Why This Change
The existing
verification-loopskill handles deterministic checks (build, lint, test). Santa Method fills the gap for semantic verification - catching hallucinations, factual errors, compliance violations, and inconsistencies that no linter can detect.The 4 Phases
What's Included
Type of Change
feat:New featureTesting Done
Origin: Ronald Skelton - Founder, RapportScore.ai
Summary by cubic
Adds the Santa Method skill, a multi-agent adversarial verification framework with a convergence loop. Two independent reviewers verify generated output against a shared rubric; both must pass before shipping.
skills/santa-method/SKILL.mdwith architecture diagram and four phases (generate, dual review, verdict gate, fix loop).verification-loop,eval-harness, andcontinuous-learning-v2.Written for commit 90518eb. Summary will update on new commits.
Summary by CodeRabbit
Documentation