MenuMind — Turning a Menu Into a Conversation
What inspired us
We kept noticing the same problem every time we visited a local restaurant in Hayward. You go to their website or Instagram, you have a simple question — is this dish gluten-free? what's the cheapest vegan option? — and there's nothing. No chatbot, no live support, just a static PDF menu if you're lucky.
These restaurants aren't ignoring their customers. They just don't have the budget or the technical team to build AI. The tools that exist — Toast, Yelp, OpenTable — handle payments and reviews. Nobody handles the conversation.
That gap is what inspired MenuMind. We wanted to build something that gave every independent restaurant the same AI customer service capability that only large chains could afford — and make it so simple that a restaurant owner who has never written a line of code could be up and running in under 2 minutes.
What we built
MenuMind is a platform where any restaurant owner uploads their menu as a PDF or image and instantly gets a live AI-powered chatbot for their business. Customers access it through a shared link — no app download, no signup, just open and chat.
The pipeline works in four stages:
Stage 1 — Extraction
When a restaurant uploads their menu, Google Gemini Vision reads the PDF or image and extracts every dish into structured JSON — names, prices, descriptions, ingredients, and dietary tags.
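As a rough sketch, the extraction step comes down to one `generateContent` call with a strict JSON instruction. The payload builder below targets Gemini's public REST API; the prompt wording and the exact field list are our illustration here, not the production code.

```python
import base64

# Illustrative extraction prompt — the real one handles many more edge cases.
EXTRACTION_PROMPT = (
    "Extract every dish from this menu as JSON: a list of objects with "
    "name, price, description, ingredients (list of strings), and "
    "dietary_tags (list of strings). Return only valid JSON, no prose."
)

def build_gemini_request(image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Build a generateContent payload pairing the prompt with the menu image."""
    return {
        "contents": [{
            "parts": [
                {"text": EXTRACTION_PROMPT},
                {"inline_data": {
                    "mime_type": mime_type,
                    "data": base64.b64encode(image_bytes).decode("ascii"),
                }},
            ],
        }],
        # Ask Gemini to emit JSON only, so the response parses directly.
        "generationConfig": {"response_mime_type": "application/json"},
    }
```

The payload is then POSTed to the `gemini-2.0-flash:generateContent` endpoint with an API key, and the structured menu comes back in the first candidate's text part.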
Stage 2 — Enrichment
We pass every menu item through the Perplexity Sonar API, which enriches each dish with allergen safety context, dish origins, nutritional information, and pairing suggestions. This is data the original menu never contained — and it makes the chatbot dramatically more useful than a plain menu lookup tool.
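Perplexity exposes an OpenAI-style chat completions endpoint, so the enrichment step is one HTTP call per item. The prompt below is an illustrative version of ours, not the exact production wording:

```python
import requests

PPLX_URL = "https://api.perplexity.ai/chat/completions"  # OpenAI-style chat endpoint

def build_enrichment_prompt(item: dict) -> str:
    """Ask for context the printed menu never contained."""
    return (
        f"For the dish '{item['name']}' ({item.get('description', 'no description')}): "
        "list likely allergens, the dish's origin, approximate nutrition, "
        "and one pairing suggestion. Be concise."
    )

def enrich_item(item: dict, api_key: str) -> str:
    """One enrichment call per menu item during ingestion."""
    resp = requests.post(
        PPLX_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        json={
            "model": "sonar",
            "messages": [{"role": "user", "content": build_enrichment_prompt(item)}],
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]
```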
Stage 3 — Knowledge Graph
The enriched data is stored in a Neo4j graph database — not as a flat list, but as a connected web of relationships:
`(Restaurant)-[:HAS_CATEGORY]->(Category)-[:HAS_ITEM]->(MenuItem)-[:HAS_INGREDIENT]->(Ingredient)`

`(MenuItem)-[:TAGGED_AS]->(DietaryTag)`
This graph structure means that when a customer asks "what vegan dishes don't contain nuts?", the retrieval is a deterministic graph traversal — precise, fast, and with no model in the loop that could hallucinate a dish.
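That traversal can be written directly in Cypher. The labels and relationship types follow the schema above; the property names (`name`, `price`) are our assumption here:

```python
# Cypher for "vegan dishes that don't contain nuts" — a pure graph traversal,
# with no LLM involved in retrieval.
VEGAN_NO_NUTS = """
MATCH (m:MenuItem)-[:TAGGED_AS]->(:DietaryTag {name: 'vegan'})
WHERE NOT (m)-[:HAS_INGREDIENT]->(:Ingredient {name: 'nuts'})
RETURN m.name AS dish, m.price AS price
"""

# With the official neo4j driver the query runs in one session call:
#   with driver.session() as session:
#       rows = session.run(VEGAN_NO_NUTS).data()
```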
Stage 4 — Chat
When a customer sends a message, a rule-based router classifies the intent and queries the knowledge graph directly. The retrieved dishes are passed to Perplexity Sonar as context, which generates a natural, grounded response. Perplexity also acts as a fallback layer for questions outside the menu — reviews, allergy safety, general food knowledge — so the chatbot never hits a dead end.
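A minimal version of that rule-based router can be a first-match keyword table — the intents and keyword lists below are illustrative, not our production rules:

```python
# First matching rule wins; anything unmatched falls through to Perplexity.
INTENT_RULES = [
    ("dietary", ("vegan", "vegetarian", "gluten", "halal", "allerg", "nut")),
    ("price",   ("cheap", "price", "cost", "expensive", "under")),
    ("menu",    ("menu", "dish", "serve", "recommend")),
]

def classify_intent(message: str) -> str:
    """Return the first matching intent, or 'fallback' for out-of-menu questions."""
    text = message.lower()
    for intent, keywords in INTENT_RULES:
        if any(keyword in text for keyword in keywords):
            return intent
    return "fallback"  # routed to Perplexity for general food knowledge
```

Each intent maps to a graph query template, so a "dietary" question becomes a `TAGGED_AS` traversal rather than a free-form LLM lookup.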
Tech stack
| Layer | Technology |
|---|---|
| Backend | Flask (Python) |
| Knowledge graph | Neo4j AuraDB |
| Menu extraction | Google Gemini 2.0 Flash Vision |
| Enrichment + Chat | Perplexity Sonar API |
| Frontend | React + Vite |
| Hosting | Render |
Challenges we faced
Getting Gemini to return clean JSON consistently
Real-world menus are messy: prices are formatted inconsistently, categories lack clear labels, and dishes are listed without descriptions. We spent significant time prompt engineering the extraction step to handle edge cases gracefully and always return valid structured JSON regardless of menu format.
Neo4j graph design
Designing the schema correctly before writing any code was critical. We learned early that getting the node types and relationships wrong would cascade into every other part of the system. Deduplicating ingredient and dietary tag nodes — using MERGE instead of CREATE — was a subtle but important decision that makes cross-menu graph traversal accurate.
Perplexity enrichment at scale
Enriching every menu item individually with a separate API call works for a demo but becomes slow for large menus. We handled this by enriching items sequentially during ingestion with clear progress feedback, and plan to move to batch enrichment in production.
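One plausible path to faster ingestion is a small worker pool, since each item's enrichment call is independent — a sketch of the idea, not what we currently ship:

```python
from concurrent.futures import ThreadPoolExecutor

def enrich_all(items, enrich_one, max_workers=4):
    """Run independent per-item enrichment calls concurrently.

    `enrich_one` is whatever single-item enrichment function already exists;
    results come back in the same order as `items`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(enrich_one, items))
```

Threads suit this workload because the per-item calls are I/O-bound HTTP requests, so a modest pool cuts wall-clock time without touching the enrichment logic itself.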
Keeping the chatbot grounded
The hardest challenge was making sure the chatbot never made things up. Our solution was strict context grounding — the LLM only ever sees the retrieved graph data as context, and is explicitly instructed never to answer from outside it. The Perplexity fallback layer handles out-of-menu questions separately, keeping the two sources of truth clean and distinct.
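Strict context grounding is mostly prompt construction: the model sees only the dishes retrieved from the graph, plus an explicit instruction not to go beyond them. A simplified builder (wording illustrative):

```python
def build_grounded_messages(dishes: list, question: str) -> list:
    """Assemble a chat request where the model's only knowledge is the graph data."""
    context = "\n".join(f"- {d['name']}: {d['description']}" for d in dishes)
    system = (
        "You are a restaurant menu assistant. Answer ONLY from the menu "
        "context below. If the context does not contain the answer, say so "
        "instead of guessing.\n\nMENU CONTEXT:\n" + context
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]
```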
What we learned
- Graph databases are genuinely the right tool for relational menu data. The moment you need to answer "what dishes share an ingredient with X?", a SQL table becomes painful. Neo4j makes it a two-line query.
- Prompt engineering is a real engineering discipline. The difference between a prompt that works 80% of the time and one that works 99% of the time is not luck — it's iteration, specificity, and testing with real messy data.
- Splitting AI responsibilities clearly — Gemini for extraction, Perplexity for enrichment and fallback, graph traversal for retrieval — made the system far more reliable than using a single model for everything.
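For the curious, the "two-line query" for shared ingredients really is about two lines of Cypher (matching dishes by a `name` property is assumed):

```python
# Every dish that shares at least one Ingredient node with the given dish.
SHARED_INGREDIENT = """
MATCH (:MenuItem {name: $dish})-[:HAS_INGREDIENT]->(i:Ingredient)<-[:HAS_INGREDIENT]-(other:MenuItem)
RETURN other.name AS dish, collect(i.name) AS shared_ingredients
"""
```

The equivalent SQL needs a self-join through an ingredients junction table; here the relationship traversal does that join implicitly.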
What's next
- Embeddable website widget (one script tag, like Intercom)
- QR code generation for physical menus
- Batch enrichment for faster ingestion of large menus
- Multi-language support for non-English menus
- Analytics dashboard for restaurant owners — most asked questions, common dietary requests, peak chat times