Inspiration

Identify the gap

Large language models (LLMs) are now primary sources of answers for many users.

Brands have SEO metrics for search engines but no easy way to measure visibility inside AI-generated answers.

Spark

We wanted a tool that treats LLMs like “search engines” and measures how often they cite or recommend a brand/website.

Real-world value

Marketing/PR teams, product owners, and SEO managers gain actionable insights about AI-driven discovery and reputation.

What it does (step-wise)

User submits a prompt (e.g., “best AI SEO tools”) or a list of queries.

Multi-LLM query: the system sends the prompt to several AI models (ChatGPT, Claude, Gemini, Perplexity — extendable).

Collect responses from each model and store raw output.

Parse responses to extract domain names, explicit brand mentions, and links.

Normalize and count occurrences (per model and overall).

Run a quick verification step with a web search API to validate any factual claims and confirm whether cited links actually exist.

Compute metrics:

Mentions per model

Overall mentions

GEO Score — a visibility percentage for a target domain (e.g., how often the domain appears across total model responses)

Display results on an interactive dashboard: charts, tables, and a fact-checked summary answer.

Allow export of results (CSV/JSON) and saving queries for historical tracking.
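The GEO Score step above can be sketched in a few lines. This is a minimal illustration, assuming the score is simply the percentage of model responses whose extracted domains include the target domain; the function and variable names are ours, not from the production code.

```python
def geo_score(responses_domains, target_domain):
    """Share of model responses (as a percentage) that mention the target domain.

    responses_domains: one set of extracted domains per model response.
    """
    if not responses_domains:
        return 0.0
    hits = sum(1 for domains in responses_domains if target_domain in domains)
    return 100.0 * hits / len(responses_domains)

# Target appears in 3 of 4 responses -> GEO Score of 75.0
score = geo_score(
    [{"example.com", "other.io"}, {"example.com"}, {"other.io"}, {"example.com"}],
    "example.com",
)
```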

How we built it (step-wise, tech + implementation)

Project scaffold

Language: Python

Repo layout: /app (Streamlit UI) | /agents (LangChain logic) | /db (SQLite) | /utils (parsers/verifier)

Agents pipeline (LangChain / crew-style agents)

Prompt Agent — receives and normalizes user queries.

Multi-LLM Agent — concurrently calls configured LLM APIs and returns raw responses.

Analyzer Agent — extracts links, domains, and brand names via regex + lightweight NLP (spaCy / simple named-entity checks).

Verifier Agent — quick web lookups (e.g., using Bing or Google Custom Search API) to confirm sources/links and flag unsupported claims.

Visualization Agent — prepares dataframes and Plotly figure objects for the dashboard.

Final Answer Agent — synthesizes a concise, fact-checked summary combining model consensus and verification results.
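The Analyzer Agent's extraction step can be approximated with two regexes, one for full URLs and one for bare domains. This is a simplified sketch (the spaCy entity checks are omitted, and the patterns here are illustrative, not the exact ones we shipped):

```python
import re

URL_RE = re.compile(r"https?://[^\s)\]>\"']+")
DOMAIN_RE = re.compile(r"(?:[a-z0-9-]+\.)+[a-z]{2,}", re.IGNORECASE)

def extract_mentions(text):
    """Pull links and normalized domains out of one raw LLM response."""
    links = URL_RE.findall(text)
    domains = set()
    for link in links:
        host = re.sub(r"^https?://", "", link).split("/")[0]
        domains.add(host.lower().removeprefix("www."))
    # Also catch bare domains mentioned without a scheme (e.g. "try ahrefs.com")
    for match in DOMAIN_RE.findall(text):
        domains.add(match.lower().removeprefix("www."))
    return links, domains
```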

UI & dashboard

Streamlit for rapid interactive UI.

Plotly for charts: bar charts (mentions per model), pie (share), time series (if tracking over time).

Interactive controls: target domain input, date range, model selection, number of prompts to run.

Storage & persistence

SQLite (lightweight) stores queries, raw responses, parsed mentions, verification results, timestamps, and GEO scores.
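A minimal version of that persistence layer fits in one `executescript` call with Python's built-in `sqlite3`. The table and column names below are illustrative, not the exact production schema:

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS queries (
    id INTEGER PRIMARY KEY,
    prompt TEXT NOT NULL,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS responses (
    id INTEGER PRIMARY KEY,
    query_id INTEGER REFERENCES queries(id),
    model TEXT NOT NULL,
    raw_text TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS mentions (
    id INTEGER PRIMARY KEY,
    response_id INTEGER REFERENCES responses(id),
    domain TEXT NOT NULL,
    verified INTEGER DEFAULT 0,   -- set by the Verifier Agent
    geo_score REAL
);
"""

conn = sqlite3.connect(":memory:")  # swap in a file path (e.g. db/geo.db) for persistence
conn.executescript(SCHEMA)
```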

Concurrency & rate limits

Use async calls or threadpool for parallel LLM requests.

Rate-limit management and exponential backoff for API failures.
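The two bullets above can be combined into one small helper: fan the prompt out to every model concurrently, and retry each call with jittered exponential backoff. A hedged sketch with `asyncio` (the client interface is assumed to be any awaitable callable, not a specific SDK):

```python
import asyncio
import random

async def call_with_backoff(fn, *args, retries=4, base_delay=0.5):
    """Call an async LLM client, retrying with exponential backoff on failure."""
    for attempt in range(retries):
        try:
            return await fn(*args)
        except Exception:
            if attempt == retries - 1:
                raise
            # Jittered backoff: ~0.5s, 1s, 2s, ... between attempts
            await asyncio.sleep(base_delay * 2 ** attempt * (1 + 0.1 * random.random()))

async def query_all(clients, prompt):
    """Fan one prompt out to every configured model concurrently."""
    tasks = [call_with_backoff(c, prompt) for c in clients]
    # return_exceptions=True lets the UI show partial results if one model fails
    return await asyncio.gather(*tasks, return_exceptions=True)
```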

Security & privacy

Redact sensitive inputs before storage (opt-in).

Store API keys server-side in environment variables; do not expose in frontend.

Demo wiring

One-click demo mode: runs a few curated prompts and shows live charts for immediate presentation.

Challenges we ran into (and how we addressed them)

Inconsistent output formats across LLMs

Solution: a normalization layer (post-processing) that extracts domains/mentions with robust regex + token-based matching.
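The core of that normalization layer is canonicalizing each extracted string before counting, so "https://www.Example.com/" and "example.com." land in the same bucket. A simplified sketch (names are ours):

```python
from collections import Counter

def normalize_domain(raw):
    """Canonicalize a domain string: lowercase, strip scheme/www and trailing punctuation."""
    d = raw.strip().lower()
    d = d.removeprefix("https://").removeprefix("http://").removeprefix("www.")
    return d.split("/")[0].rstrip(".,)")

def count_mentions(domains_per_model):
    """Per-model and overall mention counts from raw extracted strings."""
    per_model = {model: Counter(normalize_domain(d) for d in raw_domains)
                 for model, raw_domains in domains_per_model.items()}
    overall = sum(per_model.values(), Counter())
    return per_model, overall
```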

Noisy or hallucinated links in model answers

Solution: Verifier Agent that performs quick URL checks and flags broken or fabricated links; give models lower trust weight when unverified.

Rate limits & latency when querying multiple LLMs

Solution: parallelize calls, implement timeouts, and show partial results in the UI while slower models finish.

Ambiguous brand mentions (e.g., "apple" = fruit vs Apple Inc.)

Solution: context-aware entity disambiguation using simple heuristics + optional manual confirmation in the UI.
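The heuristic amounts to checking whether ambiguous brand words appear near domain-relevant vocabulary. A toy version for the "apple" case (the context word list is illustrative, not the one we actually tuned):

```python
TECH_CONTEXT = {"iphone", "mac", "ios", "inc", "company", "software", "device"}

def is_brand_mention(text, brand, window=8):
    """Crude disambiguation: is the candidate word near tech-related context words?

    Scans up to `window` tokens on either side of each occurrence of `brand`.
    """
    tokens = text.lower().replace(",", " ").replace(".", " ").split()
    for i, token in enumerate(tokens):
        if token != brand.lower():
            continue
        context = tokens[max(0, i - window): i + window + 1]
        if TECH_CONTEXT & set(context):
            return True
    return False
```

Borderline cases fall through to the optional manual confirmation step in the UI.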

Balancing speed and verification depth

Solution: two verification modes — fast (domain existence + title match) and deep (page content checks) selectable by user.
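The two modes share one entry point; only the deep mode touches the network. In this sketch the page fetcher is injected as a callable (e.g. a thin HTTP GET wrapper) so the fast path stays offline and the whole thing is testable; `expected_terms` stands in for the title-match check:

```python
from urllib.parse import urlparse

def verify_link(url, mode="fast", fetch=None, expected_terms=()):
    """Two-speed link verification.

    fast: structural check only (valid scheme + hostname present).
    deep: additionally fetch the page and require expected terms in its text.
    """
    parts = urlparse(url)
    ok = parts.scheme in ("http", "https") and bool(parts.netloc)
    if mode == "fast" or not ok:
        return ok
    page = fetch(url)  # injected HTTP GET; stubbed in tests
    return all(term.lower() in page.lower() for term in expected_terms)
```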

Dashboard UX clarity

Solution: iterative feedback from teammates; emphasize simple charts, tooltips, and downloadable CSVs for judges.

Accomplishments that we're proud of

End-to-end prototype: from prompt input → multi-LLM responses → verification → dashboard visualization, all working in a single demo flow.

Accurate parsing: reliable domain/brand extraction across multiple LLM outputs, with roughly 90% precision on our test set.

Meaningful GEO metric: we designed a simple, interpretable GEO Score that helps stakeholders quickly understand AI visibility.

Fast demo-ready UI: Streamlit dashboard that loads results and interactive charts within seconds for common prompts.

Verification integration: implemented a lightweight verification step to reduce reliance on hallucinated model citations.

Extensibility: modular agent architecture makes it easy to add more LLMs or richer verification steps later.

What we learned

LLMs behave differently — each model has distinct writing style and citation habits; aggregating them yields richer insights than any single model.

Verification is essential — models can invent plausible-sounding sources; a verification pass prevents misleading conclusions.

Simplicity scales — a small set of robust parsing rules + a basic verifier gave much better demo value than over-engineered NLP.

UX matters for judges — clear visualizations and a concise final summary are what judges remember, not internal complexity.

Rate-limit planning — always design demos with fallback content or cached responses to avoid live API failures during presentation.

What’s next for Intelligent GEO (step-wise roadmap)

Add more LLMs & data sources

Integrate additional models (Mistral, LLaMA endpoints) and specialty Q&A engines.

Historical tracking & alerts

Save time series of GEO scores and send alerts when visibility increases or decreases for a domain.

Sentiment & context analysis

Add sentiment scoring to measure whether mentions are neutral, positive, or negative.

Ranking & attribution

Provide deeper attribution: did the mention come as a recommendation, part of a list, or a passing reference?

Team & enterprise features

Multi-user accounts, API for programmatic monitoring, and scheduled automated scans.

Improved verification

Fact extraction and structured citation mapping (match claim → citation → verification confidence).

Monetization & integrations

Export to Google Data Studio, Slack alerts, and integrations for PR/SEO tools.

Production hardening

Move DB to cloud (Postgres), add authentication, CI/CD pipeline, and cost monitoring for LLM calls.

Built With

  • react-18 (frontend)
  • fastapi (backend)
  • react-router-dom (routing)
  • axios (HTTP client)
  • lucide-react (icons)
  • framer-motion
  • tailwind-css
  • vite

Updates


Just launched our hackathon project, WebMind Intelligent GEO! Built with FastAPI, AI, and a modern frontend, it is designed to make intelligent GEO analysis fast and efficient.

We’ve put a lot of effort into the backend intelligence and performance. Check it out and drop a like or comment if you find it interesting — feedback means a lot!
