FairBite is a bias-aware restaurant rating system that detects and corrects for cultural and linguistic bias in Yelp reviews. It uses a trained NLP sentiment model to measure what reviewers actually expressed in their writing, then compares that against their star ratings to surface systematic disparities across cuisine types.
Star ratings on platforms like Yelp are often treated as objective measures of quality, but they are produced by humans whose language reflects cultural familiarity and expectation. Reviewers may unconsciously use different vocabulary and tone when writing about Mexican food than when writing about French food. When that pattern is systematic across thousands of reviews, certain cuisines end up with ratings that don't accurately reflect the food.
FairBite detects this bias and produces a corrected rating.
- A bidirectional LSTM sentiment model is trained on Yelp review text, producing a continuous sentiment score in [0, 1] for any input review.
- Every review in the dataset is scored. For each city–cuisine pair, the mean sentiment score is compared against the city-wide baseline.
- The gap between a cuisine's sentiment average and the city average is the bias score.
- Each business receives an adjusted rating that partially corrects for detected bias.
bias_score = cuisine_avg_sentiment − city_avg_sentiment
adjusted_rating = original_rating + 0.5 × bias_score (clipped to [1, 5])
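The two formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the code in `bias/scoring.py`: the function names and the input record shape are hypothetical, and it assumes the city-wide baseline is the plain mean over every scored review in that city.

```python
from collections import defaultdict
from statistics import mean


def compute_bias_scores(reviews):
    """Return {(city, cuisine): bias_score}.

    `reviews` is an iterable of dicts with 'city', 'cuisine' and
    'sentiment' (float in [0, 1]); this record shape is an assumption.
    """
    by_city = defaultdict(list)
    by_pair = defaultdict(list)
    for r in reviews:
        by_city[r["city"]].append(r["sentiment"])
        by_pair[(r["city"], r["cuisine"])].append(r["sentiment"])

    # Assumption: the baseline is the mean over all reviews in the city.
    city_avg = {city: mean(scores) for city, scores in by_city.items()}

    # bias_score = cuisine_avg_sentiment - city_avg_sentiment
    return {
        pair: mean(scores) - city_avg[pair[0]]
        for pair, scores in by_pair.items()
    }


def adjusted_rating(original_rating, bias_score, weight=0.5):
    """Shift the star rating by weight * bias_score, clipped to [1, 5]."""
    return min(5.0, max(1.0, original_rating + weight * bias_score))
```

Because sentiment lies in [0, 1], the bias score lies in [-1, 1], so with a weight of 0.5 the correction moves a rating by at most half a star in either direction.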
TF-IDF analysis runs in parallel to identify the vocabulary most distinctive to each cuisine, providing interpretable evidence of the language patterns behind any detected bias.
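To make the idea of "distinctive vocabulary" concrete, here is a small standard-library sketch of TF-IDF where each cuisine's reviews are pooled into one document. The project itself lists scikit-learn for this step, so treat this as a simplified stand-in: the function name, the whitespace tokenization, and the unsmoothed `log(N / df)` weighting are all assumptions made for illustration.

```python
import math
from collections import Counter


def top_words_per_cuisine(reviews_by_cuisine, k=3):
    """Return {cuisine: [top k words by TF-IDF]}.

    `reviews_by_cuisine` maps a cuisine name to a list of review strings.
    """
    # One pooled "document" per cuisine: term counts over all its reviews.
    counts = {
        cuisine: Counter(w for text in texts for w in text.lower().split())
        for cuisine, texts in reviews_by_cuisine.items()
    }
    n_docs = len(counts)
    # Document frequency: in how many cuisines does each word appear?
    doc_freq = Counter(w for tf in counts.values() for w in tf)

    top = {}
    for cuisine, tf in counts.items():
        total = sum(tf.values())
        scores = {
            w: (n / total) * math.log(n_docs / doc_freq[w])
            for w, n in tf.items()
        }
        # Highest score first; ties broken alphabetically for stable output.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        top[cuisine] = ranked[:k]
    return top
```

A word that appears in every cuisine's reviews (for example "great") scores zero under this weighting, which is why the top words tend to be the ones specific to a single cuisine.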
├── ml-server/ # Sentiment model training and inference API
│ ├── model/
│ │ ├── sentiment_model.py # Bidirectional LSTM architecture
│ │ └── train.py # Training script (STSRS balancing, MSE loss)
│ ├── utils/
│ │ ├── clean_text.py # Tokenization, stopword removal, lemmatization
│ │ ├── data_loader.py # Yelp dataset loader (businesses + reviews)
│ │ └── encode.py # Vocabulary builder and sequence encoder
│ └── app.py # FastAPI inference server (POST /score_review)
│
├── bias-server/ # Bias computation and REST API
│ ├── api/
│ │ ├── routes.py # API endpoints
│ │ └── schemas.py # Pydantic response models
│ ├── bias/
│ │ ├── scoring.py # Bias score computation and data loading
│ │ └── explanation.py # Human-readable bias explanation generator
│ ├── precompute/
│ │ └── run_precompute.py # Offline pipeline: score reviews, compute averages, TF-IDF, sample reviews
│ └── data/ # Precomputed JSON output (averages, top_words, sample_reviews)
│
├── frontend/ # Next.js web interface
│ └── src/
│ ├── app/
│ │ ├── App.tsx # Main search page
│ │ ├── compare/page.tsx # Compare cities for a cuisine
│ │ ├── compare-cuisines/page.tsx # Compare cuisines within a city
│ │ └── components/
│ │ ├── Nav.tsx
│ │ ├── SearchBar.tsx
│ │ ├── RestaurantCard.tsx
│ │ ├── BiasInsightPanel.tsx
│ │ └── ReviewsPanel.tsx
│ └── lib/
│ └── api.ts # API client
│
└── data/ # Yelp Academic Dataset (not committed)
├── business.json
└── review.json
- Python 3.10+
- Node.js 18+
- Yelp Academic Dataset (business.json, review.json) placed in data/
cd ml-server
pip install -r requirements.txt
python model/train.py

This saves model/saved/model.pt and model/saved/vocab.json.
cd ml-server
uvicorn app:app --port 8000

With the ML server running, score all reviews and generate the bias data:
cd bias-server
python precompute/run_precompute.py

This produces three files in bias-server/data/:
- averages.json: sentiment averages and bias scores per city/cuisine
- top_words.json: TF-IDF top words per cuisine
- sample_reviews.json: most illustrative reviews (largest star/sentiment gap)
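One plausible way to pick the "most illustrative" reviews is to rank them by how far the star rating diverges from the model's sentiment score. The sketch below is an assumption about how that could work, not the logic in `run_precompute.py`: the function name is hypothetical, and it assumes stars are mapped linearly from 1 to 5 onto [0, 1] before comparison.

```python
def most_illustrative(reviews, n=3):
    """Return the n reviews with the largest star/sentiment gap.

    `reviews` is a list of dicts with 'stars' (1-5) and 'sentiment' (0-1);
    this record shape is an assumption.
    """
    def gap(review):
        # Assumption: put stars on the same [0, 1] scale as sentiment.
        normalized_stars = (review["stars"] - 1) / 4
        return abs(normalized_stars - review["sentiment"])

    return sorted(reviews, key=gap, reverse=True)[:n]
```

Under this ranking, a 2-star review whose text scores 0.95 would surface ahead of a 5-star review scoring 0.9, since the first shows a much larger mismatch between what was written and what was rated.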
cd bias-server
uvicorn app:app --port 8001

cd frontend
npm install
npm run dev

Open http://localhost:3000.
| Method | Endpoint | Description |
|---|---|---|
| GET | /cities | List available cities |
| GET | /cuisines | List available cuisines |
| GET | /bias/{city}/{cuisine} | Bias score and adjusted ratings for a cuisine in a city |
| GET | /reviews/{city}/{cuisine} | Most illustrative reviews (highest star/sentiment gap) |
| GET | /top_words | TF-IDF top words per cuisine |
| GET | /city/{city} | Overview of all cuisines in a city |
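As a rough example of calling the bias API from Python, the sketch below builds the URL for `/bias/{city}/{cuisine}` and fetches the JSON response using only the standard library. It assumes the bias server is running locally on port 8001 as in the setup steps; the helper names are hypothetical, and the response fields and the exact city string the server expects are not documented here, so the result is returned as parsed JSON without further interpretation.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://localhost:8001"  # assumption: local bias server


def bias_url(city, cuisine, base_url=BASE_URL):
    """Build the /bias/{city}/{cuisine} URL, percent-encoding both parts."""
    return f"{base_url}/bias/{quote(city, safe='')}/{quote(cuisine, safe='')}"


def fetch_bias(city, cuisine, base_url=BASE_URL):
    """GET the bias data for one city/cuisine pair and parse it as JSON."""
    with urlopen(bias_url(city, cuisine, base_url), timeout=10) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Requires the bias server to be running.
    print(fetch_bias("Philadelphia", "Mexican"))
```

Percent-encoding the path segments keeps the request valid if a city or cuisine name contains spaces or other reserved characters.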
| Layer | Technology |
|---|---|
| Model | PyTorch — Bidirectional LSTM |
| ML API | FastAPI |
| Bias API | FastAPI |
| NLP preprocessing | NLTK |
| TF-IDF | scikit-learn |
| Data processing | pandas |
| Frontend | Next.js 15, TypeScript, Tailwind CSS |
| Deployment | Render |
| Model versioning | Git LFS |
| Dataset | Yelp Academic Dataset |
This project uses the Yelp Academic Dataset, available for academic and research use. The dataset files are not included in this repository. Download and place business.json and review.json in the data/ directory before running precompute.
Cuisines: Mexican, French, Chinese, Italian, Japanese
Cities: Philadelphia PA · Tampa FL · Indianapolis IN