Inspiration

In 2026, generative AI has made it trivially cheap to flood the internet with convincing misinformation. A bad actor can produce thousands of plausible-sounding false claims in minutes, and existing tools like Snopes or PolitiFact only tell you what to think, not how to think.

We kept coming back to one question: what if the goal wasn't to fact-check every claim, but to make people need fact-checkers less? That inversion, a tool designed to make itself obsolete, became the founding idea behind Infodote.

The name is a portmanteau of info and antidote. We're treating misinformation as a public health problem. The goal isn't to treat every infection individually. It's to build population-level immunity through education.

What it does

Infodote is a real-time misinformation analysis platform. Paste any claim (a news headline, a social post, a viral statistic) and Infodote returns:

  • A verdict — True, False, or Misleading
  • The manipulation technique — the specific rhetorical pattern used (False Causation, Appeal to Fear, Cherry Picking, False Equivalence, and more)
  • A plain-English explanation of why the claim is factually wrong or misleading
  • A Critical Thinking Guide showing how to spot this exact technique yourself next time, without needing a tool
  • Three scored dimensions — bias, logical soundness, and provocation
  • Verified sources — the top matched fact-checks from our database of 283 real-world verified claims

Every claim submitted is also written to a live Kibana dashboard that tracks verdict breakdowns, the most common manipulation techniques, average bias scores, and submission volume over time. This turns Infodote from a single-user tool into a population-level misinformation monitoring system.

How we built it

The architecture has four layers working together.

When a user submits a claim, the Next.js frontend sends it to a same-origin proxy, which forwards to our FastAPI backend. This keeps the browser free of cross-origin requests. The backend first checks an in-memory cache so repeated claims return instantly.
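The cache step can be sketched in a few lines of Python. The class name and the normalisation scheme are our own illustration of the idea, not the exact implementation:

```python
import hashlib

class ClaimCache:
    """Minimal in-memory cache keyed by a hash of the normalised claim text.
    A sketch of the concept; a production version might add a TTL or LRU bound."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def _key(claim: str) -> str:
        # Normalise case and whitespace so trivially different
        # resubmissions of the same claim still hit the cache.
        normalised = " ".join(claim.lower().split())
        return hashlib.sha256(normalised.encode()).hexdigest()

    def get(self, claim: str):
        return self._store.get(self._key(claim))

    def put(self, claim: str, analysis: dict):
        self._store[self._key(claim)] = analysis
```

Hashing a normalised form means "5G causes cancer" and "  5g  CAUSES cancer " resolve to the same entry.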

If it's a new claim, it gets converted into a 384-dimensional dense vector using the all-MiniLM-L6-v2 sentence-transformers model, running entirely locally in roughly 20 milliseconds with no external API call. That vector is sent to Elasticsearch Serverless on Google Cloud, where two searches run in parallel: kNN vector search handles semantic meaning while BM25 handles keyword matching, and the results are merged using Reciprocal Rank Fusion (RRF). This means "mobile towers linked to tumours" matches "5G causes cancer" because the intent is the same, even when the words are completely different.
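A hybrid query of this shape can be expressed with Elasticsearch's `rrf` retriever, which fuses the two legs server-side. The field names (`claim_text`, `embedding`) and the `k`/`num_candidates` values below are assumptions for illustration; `query_vector` is the 384-dim embedding from all-MiniLM-L6-v2:

```python
def build_hybrid_query(query_text: str, query_vector: list[float],
                       text_field: str = "claim_text",
                       vector_field: str = "embedding") -> dict:
    """Builds an Elasticsearch search body that runs BM25 and kNN
    in one request and merges them with Reciprocal Rank Fusion."""
    return {
        "retriever": {
            "rrf": {
                "retrievers": [
                    # Lexical leg: classic BM25 keyword matching.
                    {"standard": {"query": {"match": {text_field: query_text}}}},
                    # Semantic leg: dense-vector kNN over claim embeddings.
                    {"knn": {"field": vector_field,
                             "query_vector": query_vector,
                             "k": 10,
                             "num_candidates": 100}},
                ]
            }
        },
        "size": 3,  # top matches become grounding context for the LLM
    }
```

The body would then be sent with the official client, e.g. `es.search(index="fact_checks", body=build_hybrid_query(text, vec))` (index name assumed).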

The search runs against 283 verified fact-checks sourced from the Google Fact Check Tools API, covering publishers including Full Fact, PolitiFact, Reuters, AFP, AAP FactCheck, FactCheck.org, and Science Feedback across 41 misinformation categories. The top 3 matches are passed as grounding context to Claude Haiku 4.5, which returns a structured JSON analysis including the verdict, technique, explanation, critical thinking lesson, and all three scores.
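The grounding step amounts to folding the top matches into a prompt that demands one JSON object. The prompt wording and the fact-check dict shape below are illustrative assumptions; the JSON keys mirror the seven fields described above:

```python
def build_analysis_prompt(claim: str, fact_checks: list[dict]) -> str:
    """Builds the grounded analysis prompt. Each fact-check dict is assumed
    to carry 'claim', 'rating', and 'publisher' keys."""
    context = "\n".join(
        f"- {fc['claim']} — rated {fc['rating']} by {fc['publisher']}"
        for fc in fact_checks
    )
    return (
        "You are a misinformation analyst. Using ONLY the verified fact-checks "
        "below as grounding, analyse the claim and respond with a single JSON "
        "object containing exactly these keys: verdict, technique, explanation, "
        "critical_thinking_guide, bias_score, logical_soundness_score, "
        "provocation_score.\n\n"
        f"Verified fact-checks:\n{context}\n\n"
        f"Claim: {claim}\n"
    )

# The prompt is then sent via the Anthropic Messages API, e.g.:
#   client = anthropic.Anthropic()
#   resp = client.messages.create(
#       model="claude-haiku-4-5",   # exact model ID is an assumption
#       max_tokens=1024,
#       messages=[{"role": "user", "content": prompt}],
#   )
```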

The result is stored in both the in-memory cache and a claim_analyses Elasticsearch index that powers the live Kibana dashboard.
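The document written to `claim_analyses` might look like the sketch below. The field names are illustrative assumptions; the `@timestamp` field is what lets Kibana chart submission volume over time:

```python
from datetime import datetime, timezone

def to_dashboard_doc(claim: str, analysis: dict) -> dict:
    """Shapes one analysis result into a claim_analyses document
    for the Kibana dashboard panels (field names assumed)."""
    return {
        "@timestamp": datetime.now(timezone.utc).isoformat(),
        "claim": claim,
        "verdict": analysis["verdict"],                  # verdict breakdown panel
        "technique": analysis["technique"],              # trending-techniques panel
        "bias_score": analysis["bias_score"],            # average-bias panel
        "logical_soundness_score": analysis["logical_soundness_score"],
        "provocation_score": analysis["provocation_score"],
    }

# Indexed with the official client, e.g.:
#   es.index(index="claim_analyses", document=to_dashboard_doc(claim, analysis))
```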

Stack: Next.js 16 + TypeScript + Tailwind + shadcn/ui · FastAPI · sentence-transformers · Elasticsearch Serverless (kNN + BM25 + RRF) · Claude Haiku 4.5 · Google Fact Check Tools API · Kibana

Challenges we ran into

The hybrid search implementation was the most technically demanding part. Getting kNN vector search and BM25 to run simultaneously and merge correctly using Reciprocal Rank Fusion took significant iteration. The ranking behaviour was unpredictable until we understood how RRF weights results from each search method.
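The intuition that finally demystified the ranking behaviour fits in a few lines: under RRF, each document earns 1/(k + rank) from every result list it appears in, so a document ranked near the top of both lists beats one ranked first in only one. A minimal sketch:

```python
def rrf_merge(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion over ranked lists of doc IDs.
    k=60 is the conventional default; a larger k flattens the
    score gap between top-ranked and lower-ranked results."""
    scores: dict[str, float] = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25 = ["a", "b", "c"]   # keyword-ranked doc IDs
knn  = ["b", "d", "a"]   # vector-ranked doc IDs
merged = rrf_merge([bm25, knn])  # "b" wins: near the top of both lists
```

Elasticsearch performs this fusion server-side, but reimplementing it by hand was how we learned to predict which leg would dominate.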

The Elasticsearch Serverless setup introduced its own friction. Serverless behaves differently from hosted Elastic in subtle ways, particularly around index configuration and API key scoping. We spent more time here than expected.

Prompt engineering for structured output was also harder than it looked. Getting Claude to return consistent, parseable JSON with all seven fields populated correctly (including nuanced scores that actually reflect the claim's properties) required many iterations before the output was reliable enough to render in the UI.

On the frontend side, coordinating work across multiple branches meant some files didn't land on the shared branch at the right time, which caused integration delays. We resolved this by establishing a clearer branching convention midway through.

Accomplishments that we're proud of

We're proudest of the Critical Thinking Guide. It's the feature that separates Infodote from every other fact-checker we looked at. Teaching the pattern behind misinformation, not just debunking the specific claim, is a fundamentally different product philosophy and we think it's the right one.

The live Kibana dashboard exceeded our expectations. Watching real-time panels update with each new claim submission during testing made the population-level vision feel real. You can see which manipulation techniques are trending and how bias scores distribute across submitted claims.

We're also proud of the data pipeline: 283 real verified fact-checks from internationally recognised publishers, covering 41 topic categories, ingested, vectorised, and live in Elasticsearch, all built and running within the hackathon window.

What we learned

Hybrid search is meaningfully better than either vector or keyword search alone. We tested all three approaches against the same claim set and the RRF-merged results were consistently more relevant, especially for claims that used unusual phrasing. Elasticsearch's vector database capabilities are genuinely powerful when used correctly.

We also learned that prompt design for structured output is a discipline in itself. The difference between a prompt that returns clean JSON 95% of the time and one that returns it 70% of the time is subtle but critically important for a production demo.

The most important architectural decision we made was the proxy layer. Routing browser requests through a Next.js API route rather than hitting FastAPI directly meant zero CORS issues during integration, something that would have cost us hours to debug otherwise.

What's next for Infodote

The immediate next step is expanding the fact-check corpus. 283 claims is a solid foundation, but real-world coverage requires thousands. We would ingest from additional publishers via the Google Fact Check Tools API and build a continuous ingestion pipeline so the database stays current.

The browser extension is the roadmap feature we're most excited about. Imagine highlighting any text on any webpage and getting an Infodote analysis inline, with no copy-paste and no tab switching. That's the version that reaches people at the moment they encounter misinformation, not after.

Longer term, the Kibana dashboard becomes genuinely valuable as a public-facing tool. A live map of which manipulation techniques are trending in a region, which sources are most frequently associated with misinformation, and how community digital literacy is improving over time is a product journalists and researchers would actually use.

Built With

  • claude-haiku-4.5-(anthropic-api)
  • elastic
  • elasticsearch-serverless
  • fastapi
  • github
  • google-cloud
  • google-fact-check-tools-api
  • kibana
  • next.js
  • python
  • sentence-transformers
  • shadcn/ui
  • tailwind-css
  • typescript