An Epistemic Consensus Protocol
Structured multi-agent deliberation for verifiable AI reasoning.
No single model can verify itself. This protocol makes multiple models prove their reasoning to each other.
No AI company can grade its own homework. OpenAI won't use Claude to red-team GPT. Google won't let Gemini be challenged by competitors. ThoughtProof exists because neutral verification requires structural independence.
Normalize → Generate → Critique → Evaluate → Synthesize. Multiple models independently propose solutions. A cross-model critic tears them apart. An arbitrator synthesizes consensus. Auditable end-to-end.
Every deliberation produces a signed, versioned block capturing the full reasoning chain: who proposed what, who criticized whom, where models disagreed, and why specific conclusions were reached.
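A minimal sketch of what such a block might contain. The field names and types here are illustrative assumptions, not the protocol's actual schema:

```typescript
// Hypothetical shape of an Epistemic Block — field names are illustrative,
// not the protocol's published schema.
interface Proposal { model: string; answer: string }
interface Critique { critic: string; target: string; issues: string[] }

interface EpistemicBlock {
  version: string;        // protocol version of this block
  proposals: Proposal[];  // who proposed what
  critiques: Critique[];  // who criticized whom, and why
  dissent: string[];      // points where models disagreed
  conclusion: string;     // synthesized consensus
  confidence: number;     // 0..1 confidence score
  signature: string;      // signature over the block contents
}

// Example block for one deliberation round (placeholder values).
const block: EpistemicBlock = {
  version: "1.0",
  proposals: [{ model: "model-a", answer: "42" }],
  critiques: [{ critic: "model-b", target: "model-a", issues: [] }],
  dissent: [],
  conclusion: "42",
  confidence: 0.9,
  signature: "sig-placeholder",
};
```

The point of the structure is that every claim in the conclusion can be traced back to a proposal, a critique, or a recorded disagreement.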
Unlike single-model systems that optimize for confidence, ThoughtProof optimizes for epistemic rigor through mandatory cross-model adversarial evaluation.
Input Query
↓
[1. NORMALIZE] → Structured query + success criteria
↓
[2. GENERATE] → Multiple models, independent proposals
↓
[3. CRITIQUE] → Cross-model adversarial evaluation
↓
[4. EVALUATE] → Multi-dimensional scoring
↓
[5. SYNTHESIZE] → Epistemic Block (signed, versioned)
184+ benchmark runs. Same models. Two approaches.
Ask 4 models, take the consensus. This approach let fabricated statistics, hallucinated citations, and factual errors pass unchallenged. Correlated bias masquerades as independent confirmation.
Same models + adversarial critic. 96.7% adversarial detection. 92% hallucination detection. The critic catches what consensus misses.
In 23% of cases where majority vote accepted a hallucination, the dedicated critic identified the error through structural analysis — checking reasoning chains, not counting votes. Full results →
Multi-model divergence as prompt injection detection.
PoT identified critical vulnerabilities in major AI frameworks that single-model review missed — including prompt injection bypasses, RCE vectors, and compliance risks. Details published after responsible disclosure windows.
Different models have different vulnerability profiles. When the same input produces divergent responses, that's a reliable injection signal. The more models, the harder coordinated attacks become.
Every deliberation is recorded: which model complied with an injection, which resisted, what the critic flagged. Epistemic Blocks create a growing corpus of attack signatures.
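One simple way to operationalize divergence as a signal: compare responses pairwise and flag when any pair falls below a similarity threshold. The token-set Jaccard metric and the threshold value below are illustrative choices, not the protocol's specified detector.

```typescript
// Token-set Jaccard similarity between two responses (illustrative metric).
function jaccard(a: string, b: string): number {
  const ta = new Set(a.toLowerCase().split(/\s+/));
  const tb = new Set(b.toLowerCase().split(/\s+/));
  const inter = [...ta].filter((t) => tb.has(t)).length;
  const union = new Set([...ta, ...tb]).size;
  return union === 0 ? 1 : inter / union;
}

// Flag a possible injection when any pair of model responses to the same
// input diverges beyond the threshold (an assumed default of 0.3).
function divergenceFlag(responses: string[], threshold = 0.3): boolean {
  for (let i = 0; i < responses.length; i++) {
    for (let j = i + 1; j < responses.length; j++) {
      if (jaccard(responses[i], responses[j]) < threshold) return true;
    }
  }
  return false;
}
```

A production detector would use semantic rather than lexical similarity, but the principle is the same: agreement is cheap to fake for one model, expensive to fake across several.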
Verifiable audit trails for high-risk AI systems. Regulation (EU) 2024/1689.
High-risk AI systems require continuous risk assessment with documented evidence. Epistemic blocks capture what was assessed, by which models, where disagreement occurred, and what confidence level was reached — creating the audit trail Art. 9(2) demands.
Users must be able to interpret AI outputs and use them appropriately. ThoughtProof's multi-perspective verification with explicit confidence scoring and dissent preservation implements transparency by design — not as an afterthought.
The Act requires humans to "fully understand the capacities and limitations" of high-risk AI. Epistemic blocks present structured disagreement rather than false consensus — enabling the informed oversight Art. 14(4) requires.
Conformity assessment requires documented evidence across all Chapter 2 requirements. A chain of epistemic blocks provides verifiable, timestamped, provider-neutral records that auditors and notified bodies can inspect.
Conformity assessment cannot rely on tools built by the AI provider being assessed. ThoughtProof is structurally independent — no vendor can grade its own homework. The protocol works across all model providers.
Regulators and notified bodies can inspect the verification methodology end-to-end. No black boxes. Every block, every critique, every confidence score is traceable and reproducible.
Where AI decisions have real consequences, single-model confidence isn't enough.
AI agents making clinical recommendations, triaging patients, or summarizing records need verifiable reasoning — not just confidence scores. Epistemic blocks document why a recommendation was made, which models disagreed, and where a physician should look twice. Built by a healthcare practitioner.
AI agents with wallet access can transfer funds based on corrupted reasoning. ThoughtProof verifies the decision chain before execution — catching prompt injection, hallucinated instructions, and social engineering that single-model systems miss. @pot-sdk2/pay →
When AI generates legal analysis, contract reviews, or regulatory assessments, the reasoning must be auditable. Multi-model verification with dissent preservation creates the evidence trail that legal teams and regulators expect.
Zero data leaves your infrastructure. Fully self-hostable. No ThoughtProof servers involved.
Run with Ollama, vLLM, or any OpenAI-compatible endpoint. Models and data stay on your hardware. Full air-gap capable.
docker run thoughtproof/pot-cli \
--provider http://ollama:11434
Use any LLM provider with your own API keys. ThoughtProof never sees your credentials, prompts, or outputs. No intermediary servers.
No data processing by ThoughtProof means no GDPR data processing agreement needed. You control where your data lives — EU, on-prem, or air-gapped.
From CLI to SDK to Agent-to-Agent trust infrastructure.
JavaScript/TypeScript SDK. Drop-in verification for any AI pipeline. verify() any output in one line. Plugin architecture: friend (persistent critic), graph (structural verification), bridge (agent-to-agent trust). npm →
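A stub showing the intended shape of one-line verification. This is illustrative only, not pot-sdk's actual API: the verdict type and the regex check are stand-ins for the real multi-model generate/critique/evaluate loop.

```typescript
// Illustrative verdict shape (assumed, not the SDK's real type).
type Verdict = { verified: boolean; confidence: number; dissent: string[] };

// Stand-in verify(): a real implementation would run the full
// cross-model deliberation instead of a pattern check.
function verify(output: string): Verdict {
  const suspicious = /ignore previous instructions/i.test(output);
  return {
    verified: !suspicious,
    confidence: suspicious ? 0.2 : 0.9,
    dissent: suspicious ? ["matched an injection signature"] : [],
  };
}

// One-line use inside a pipeline:
const verdict = verify("The invoice total is 1,204 EUR.");
```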
Agent-to-agent verification via MCP. Agents verify each other's reasoning chains before acting on outputs. Transitive trust with configurable decay. The missing trust layer for multi-agent systems.
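Transitive trust with decay can be sketched as follows. The multiplicative model and the default decay factor are assumptions for illustration; the protocol's actual trust calculus may differ.

```typescript
// Trust along a chain of agents: multiply per-edge trust values and
// attenuate each transitive hop by a configurable decay factor.
// The decay model here is an illustrative assumption.
function chainTrust(edgeTrusts: number[], decay = 0.9): number {
  let trust = 1;
  for (let hop = 0; hop < edgeTrusts.length; hop++) {
    trust *= edgeTrusts[hop];
    if (hop > 0) trust *= decay; // direct edges are not decayed
  }
  return trust;
}
```

The design intent: trust should never increase along a chain, and long chains of hearsay should converge toward zero unless every link is strong.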
Run epistemic blocks from the terminal. Connect any model. Audit any output. npm install -g pot-cli
Read the specification. Review the code. Verify everything.
npm install pot-sdk
·
npm install -g pot-cli
Contact: raul@thoughtproof.ai