Claude Trusted — Verifiable / hallucination on PDFs

Home
Chat

Inspiration

One of the most popular uses of LLMs is to ask questions about reference material. For example, the user may want to ask a question about an online article, a research paper, or a textbook. One major problem with this usage is trust; how can the user confirm that what the LLM is claiming is faithful to the provided material? LLMs have been shown to exhibit hallucination, where an LLM confidently outputs made-up information. As shown in the Claude 3.7 Sonnet System Card (Figure 13), even frontier models suffer from this problem. Our project is a step to increase this trust by better grounding information output by the LLM in the reference material.

What it does

In order to help the LLM have more accurate responses, we help confirm that the LLM's response is truly grounded in the reference material. This confirmation is done by asking the LLM to quote part of the reference material as evidence for each claim it makes. For example, if we attach the Wikipedia page about Anthropic to the conversation, and Claude claims that Anthropic was founded in 2021, it should add a quote from the Wikipedia article that shows that Anthropic was founded in 2021. The problem with just doing this is that the quote may be hallucinated. In our tests, Claude hallucinated 17% of the quoted material when just using this approach. Our approach helps solve this problem by augmenting Claude with an MCP tool that helps it confirm that the quotes are correct. Claude calls this tool after every quotation, and the tool tells Claude if the quote is true to the reference material.

How we built it

Frontend – Open WebUI fork with a streamlined “Upload to PDF Search” button and persistent attachment badge. 
Backend scaffolding – FastAPI routes for upload & chat, env‑config.
Verification core – pure search_pdf + MCP wrapper, orchestrated in a Claude tool‑loop.

The clean three‑tier layout lets us parallelise work and plug in the benchmark quickly.

Challenges we ran into

Similarity checking and PDF parsing: When checking if quoted text is truly part of the reference material PDF, we need to parse the PDF on the MCP server. However, since Claude uses proprietary parsing tools, our parsing is not consistent with Claude's. This can lead to false negatives in quote verification, checking if we naively check if the quote is in the parsed PDF text. We had to run extensive tests to deal with this.

* Designing a drop‑in API for Open WebUI: Open WebUI only speaks the OpenAI /v1 spec and expects SSE‑style chunks. We wrote a brand‑new FastAPI layer that mimics those endpoints yet secretly pipes each request through our MCP orchestrator. Getting the stream framing, error codes, and auth handshake right burned most of day 1.

Understanding MCP overnight: Real-world examples of MCP are scarce. We combed through the spec, mapped Claudeʼs tool_use events to FastMCP decorators, and wrestled with JSON‑schema mismatches until the tool loop finally stabilised.
Implementing benchmarking using UK AI Safety Institute's evaluation framework.
Turning Open WebUI into a “trust‑first” demo: The stock UI is a general‑purpose chat wrapper. We ripped out the multi‑menu uploader, injected a single “Upload PDF for Trust Mode” button, and added a persistent blue badge reminding users which document is in scope. Svelte template‑nesting errors and Vite cache gremlins made this surprisingly gnarly