Trusty Krusty Reviews
Overview
Trusty Krusty Reviews is an ML system that assesses the quality and relevancy of Google location reviews. It reduces noise (ads, off-topic posts, rants) and surfaces trustworthy feedback so users, businesses, and platforms can rely on cleaner signals at scale.
Problem
Public ratings are often distorted by irrelevant, promotional, or low-effort reviews. The challenge is to design and implement an ML-based system that evaluates the quality and relevancy of location reviews and supports policy-aligned filtering at scale.
Solution
Our system classifies each review into one of four categories (Valid, Advertisement, Irrelevant, or Rant), assigns a confidence score to each class, highlights suspicious tokens (e.g., URLs and promo phrases), and supports bulk processing and export. Inference runs locally for reliability and cost control.
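As a rough illustration of the token highlighting mentioned above, the sketch below marks URLs and promo phrases with regular expressions. The pattern list, the function names, and the `[[...]]` marker style are assumptions made for this example, not the project's actual rules.

```python
import re

# Hypothetical patterns for illustration; the project's real list is not documented here.
SUSPICIOUS_PATTERNS = {
    "url": re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE),
    "promo": re.compile(r"\b(?:discount|promo code|free delivery|call now)\b", re.IGNORECASE),
}


def find_suspicious_spans(text):
    """Return (start, end, kind, matched_text) tuples sorted by position."""
    spans = []
    for kind, pattern in SUSPICIOUS_PATTERNS.items():
        for match in pattern.finditer(text):
            spans.append((match.start(), match.end(), kind, match.group()))
    return sorted(spans)


def highlight(text):
    """Wrap each suspicious span in [[...]] markers for display."""
    parts, cursor = [], 0
    for start, end, _kind, matched in find_suspicious_spans(text):
        if start < cursor:  # skip a match that overlaps one already emitted
            continue
        parts.append(text[cursor:start])
        parts.append(f"[[{matched}]]")
        cursor = end
    parts.append(text[cursor:])
    return "".join(parts)
```

A UI layer could swap the `[[...]]` markers for colored spans; the span tuples keep the match kind so URLs and promo phrases can be styled differently.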
Key Features
- Dual modes: Business Mode (CSV upload and analysis) and Places Mode (live Google Places search and review classification)
- Multi-class predictions with per-class confidence
- Before/after comparison and metrics dashboard
- Compact table view of all class confidence scores
- Export of full datasets with predictions and classifications
- Local inference (no external API keys required at runtime)
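The per-class confidence and export features can be sketched with the standard library alone. This example assumes the classifier emits one raw score (logit) per class and converts those scores to probabilities with a softmax; the column names and helper functions are hypothetical.

```python
import csv
import io
import math

# The four categories named in this write-up.
LABELS = ["Valid", "Advertisement", "Irrelevant", "Rant"]


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def to_row(review_text, logits):
    """Build one export row: the predicted label plus a confidence per class."""
    probs = softmax(logits)
    best = max(range(len(LABELS)), key=probs.__getitem__)
    row = {"review": review_text, "prediction": LABELS[best]}
    row.update({f"conf_{label}": round(p, 4) for label, p in zip(LABELS, probs)})
    return row


def export_csv(rows):
    """Serialize rows to CSV text; assumes a non-empty list of uniform dicts."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Keeping every class probability in the row, rather than only the winning label, is what makes a compact confidence table and a before/after comparison possible from the same export.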
How It Works
- Preprocessing and normalization (emoji handling, token cleanup, optional translation)
- Feature engineering (DistilRoBERTa embeddings + numerical features such as review length and relevance scores)
- Classification (multi-modal model + high-precision rules for obvious ads/low-effort content)
- Streamlit UI for analysis, comparison, and export
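The first three steps above could be wired together roughly as follows. This is a minimal sketch under stated assumptions: `embed` and `model_predict` are placeholder callables standing in for the DistilRoBERTa encoder and the trained classifier, the single ad rule is invented for illustration, and emoji handling and translation are omitted.

```python
import re
import unicodedata

# Hypothetical high-precision ad rule: a link plus a promotional keyword.
URL_RE = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)
PROMO_RE = re.compile(r"\b(?:discount|promo|coupon)\b", re.IGNORECASE)


def normalize(text):
    """Unicode-normalize and collapse runs of whitespace."""
    text = unicodedata.normalize("NFKC", text)
    return re.sub(r"\s+", " ", text).strip()


def numeric_features(text):
    """Simple numerical features: character count and word count."""
    return [float(len(text)), float(len(text.split()))]


def rule_label(text):
    """Return a label when a rule fires, otherwise None."""
    if URL_RE.search(text) and PROMO_RE.search(text):
        return "Advertisement"
    return None


def classify(text, embed, model_predict):
    """Apply rules first; fall back to the model on embedding + numeric features."""
    clean = normalize(text)
    label = rule_label(clean)
    if label is not None:
        return label, "rule"
    features = list(embed(clean)) + numeric_features(clean)
    return model_predict(features), "model"
```

Returning the decision source alongside the label makes it easy to audit how often the rules, rather than the model, decided the outcome.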
Development Tools
- VSCode, Jupyter Notebook
- Git/GitHub for version control
- Streamlit for the interactive application
- Local Python environment with optional GPU
APIs Used
- Google Maps API (googlemaps) for business details and location metadata
- Apify: Google Maps Reviews Scraper and Google Maps Scraper for review and business data collection
- LLM-assisted pre-labeling (ChatGPT) followed by manual validation
Libraries and Frameworks
- PyTorch, Transformers, Sentence-Transformers, scikit-learn, datasets, safetensors
- pandas, numpy, emoji, googletrans, python-dotenv, watchdog
- Streamlit, streamlit-folium, matplotlib, seaborn
Assets and Datasets
- ~3,800 Google Maps reviews collected via Apify scrapers
- Semi-automated labeling: LLM-generated initial labels with a manual validation pass
- Repository data files:
  - dataset/all_reviews.csv (raw)
  - dataset/final_df.csv (processed features)
  - dataset/label_data.csv (labeled training set)
  - dataset/places_data.csv (business metadata)
  - assets/sample_reviews.csv (10-row demo subset)
Relevance and Impact
By filtering promotional, off-topic, and unconstructive content and elevating reliable reviews, the system provides a cleaner, policy-aligned signal for any Google location category. Users make better choices, businesses gain fairer representation, and platforms reduce moderation overhead with transparent, reproducible tooling.
Built With
- google-maps
- googleplacesclient
- numpy
- pandas
- python
- streamlit