
About
Multi-agent adversarial verification with convergence loop. Two independent review agents must both pass before output ships.
name: santa-method description: "Multi-agent adversarial verification with convergence loop. Two independent review agents must both pass before output ships." origin: "Ronald Skelton - Founder, RapportScore.ai"
Santa Method
Multi-agent adversarial verification framework. Make a list, check it twice. If it's naughty, fix it until it's nice.
The core insight: a single agent reviewing its own output shares the same biases, knowledge gaps, and systematic errors that produced the output. Two independent reviewers with no shared context break this failure mode.
When to Activate
Invoke this skill when:
- Output will be published, deployed, or consumed by end users
- Compliance, regulatory, or brand constraints must be enforced
- Code ships to production without human review
- Content accuracy matters (technical docs, educational material, customer-facing copy)
- Batch generation at scale where spot-checking misses systemic patterns
- Hallucination risk is elevated (claims, statistics, API references, legal language)
Do NOT use for internal drafts, exploratory research, or tasks with deterministic verification (use build/test/lint pipelines for those).
Architecture
┌─────────────┐
│ GENERATOR │ Phase 1: Make a List
│ (Agent A) │ Produce the deliverable
└──────┬───────┘
│ output
▼
┌──────────────────────────────┐
│ DUAL INDEPENDENT REVIEW │ Phase 2: Check It Twice
│ │
│ ┌───────────┐ ┌───────────┐ │ Two agents, same rubric,
│ │ Reviewer B │ │ Reviewer C │ │ no shared context
│ └─────┬─────┘ └─────┬─────┘ │
│ │ │ │
└────────┼──────────────┼────────┘
│ │
▼ ▼
┌──────────────────────────────┐
│ VERDICT GATE │ Phase 3: Naughty or Nice
│ │
│ B passes AND C passes → NICE │ Both must pass.
│ Otherwise → NAUGHTY │ No exceptions.
└──────┬──────────────┬─────────┘
│ │
NICE NAUGHTY
│ │
▼ ▼
[ SHIP ] ┌─────────────┐
│ FIX CYCLE │ Phase 4: Fix Until Nice
│ │
│ iteration++ │ Collect all flags.
│ if i > MAX: │ Fix all issues.
│ escalate │ Re-run both reviewers.
│ else: │ Loop until convergence.
│ goto Ph.2 │
└──────────────┘
Phase Details
Phase 1: Make a List (Generate)
Execute the primary task. No changes to your normal generation workflow. Santa Method is a post-generation verification layer, not a generation strategy.
# The generator runs as normal
output = generate(task_spec)
Phase 2: Check It Twice (Independent Dual Review)
Spawn two review agents in parallel. Critical invariants:
- Context isolation — neither reviewer sees the other's assessment
- Identical rubric — both receive the same evaluation criteria
- Same inputs — both receive the original spec AND the generated output
- Structured output — each returns a typed verdict, not prose
REVIEWER_PROMPT = """
You are an independent quality reviewer. You have NOT seen any other review of this output.
## Task Specification
{task_spec}
## Output Under Review
{output}
## Evaluation Rubric
{rubric}
## Instructions
Evaluate the output against EACH rubric criterion. For each:
- PASS: criterion fully met, no issues
- FAIL: specific issue found (cite the exact problem)
Return your assessment as structured JSON:
{
"verdict": "PASS" | "FAIL",
"checks": [
{"criterion": "...", "result": "PASS|FAIL", "detail": "..."}
],
"critical_issues": ["..."], // blockers that must be fixed
"suggestions": ["..."] // non-blocking improvements
}
Be rigorous. Your job is to find problems, not to approve.
"""
# Spawn reviewers in parallel (Claude Code subagents)
review_b = Agent(prompt=REVIEWER_PROMPT.format(...), description="Santa Reviewer B")
review_c = Agent(prompt=REVIEWER_PROMPT.format(...), description="Santa Reviewer C")
# Both run concurrently — neither sees the other
Rubric Design
The rubric is the most important input. Vague rubrics produce vague reviews. Every criterion must have an objective pass/fail condition.
| Criterion | Pass Condition | Failure Signal | |-----------|---------------|----------------| | Factual accuracy | All claims verifiable against source material or common knowledge | Invented statistics, wrong version numbers, nonexistent APIs | | Hallucination-free | No fabricated entities, quotes, URLs, or references | Links to pages that don't exist, attributed quotes with no source | | Completeness | Every requirement in the spec is addressed | Missing sections, skipped edge cases, incomplete coverage | | Compliance | Passes all project-specific constraints | Banned terms used, tone violations, regu

