Amazon Review Analyzer
Machine-learning project that classifies Amazon product reviews as AI-generated (fake) or human-written. It combines text preprocessing and engineered features with three model families: a TF–IDF + logistic regression baseline, XGBoost, and BERT fine-tuned with a LoRA adapter. Includes a Streamlit web app for live inference and model comparison.
Problem
Fake or AI-generated product reviews harm customers and distort product ratings. This project detects likely AI-generated or fake Amazon reviews from review text (and optional rating context) to improve trust and support moderation workflows.
Approach
The pipeline ingests raw CSV review data, then cleans and normalizes the text. Three models are trained on the result: a TF–IDF + logistic regression baseline, an XGBoost model on engineered features, and a BERT model fine-tuned with LoRA adapters for efficient training. Model artifacts are saved to `model/`, and an interactive Streamlit app (`webapp/streamlit_app.py`) supports live classification and feature inspection.
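The cleaning and feature-engineering step could look roughly like the sketch below. It is illustrative only: the function names `clean_text` and `engineer_features` and every feature name are hypothetical, not taken from the repository, and the project's actual feature set may differ.

```python
import re
import string
from typing import Dict, Optional


def clean_text(text: str) -> str:
    """Strip HTML-like tags and URLs, lowercase, and collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", text)       # drop tags such as <br> or <b>
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = text.lower()
    return re.sub(r"\s+", " ", text).strip()


def engineer_features(raw: str, rating: Optional[float] = None) -> Dict[str, float]:
    """Build a small dictionary of hand-crafted features (hypothetical set)."""
    cleaned = clean_text(raw)
    words = cleaned.split()
    # Tokens with surrounding punctuation removed; empty results are skipped.
    tokens = [t for t in (w.strip(string.punctuation) for w in words) if t]

    features = {
        "char_count": float(len(cleaned)),
        "word_count": float(len(words)),
        "avg_word_len": sum(len(t) for t in tokens) / len(tokens) if tokens else 0.0,
        # Share of distinct tokens; low values indicate repetitive wording.
        "type_token_ratio": len(set(tokens)) / len(tokens) if tokens else 0.0,
        "exclamation_count": float(raw.count("!")),
        "uppercase_ratio": sum(c.isupper() for c in raw) / len(raw) if raw else 0.0,
    }
    # The star rating is optional context, so it is only added when supplied.
    if rating is not None:
        features["rating"] = float(rating)
    return features
```

A dictionary like this can be turned into a numeric row for a tree-based model such as XGBoost, while the cleaned text feeds the TF–IDF and BERT models.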
Results & Impact
The repository includes trained artifacts (joblib and PEFT/LoRA adapter files) and utilities for preprocessing, training, and evaluation. The Streamlit app lets users quickly compare model predictions and confidence scores to support moderation or further analysis.
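As a rough illustration of the side-by-side comparison, the sketch below turns each model's probability that a review is AI-generated into a label and a confidence score. The helper names, the model keys, the 0.5 threshold, and the probabilities in the usage example are all assumptions made for illustration; they are not outputs or code from this project.

```python
from typing import Dict, List, Tuple

Row = Tuple[str, str, float]  # (model name, predicted label, confidence)


def summarize_predictions(probs: Dict[str, float], threshold: float = 0.5) -> List[Row]:
    """Map each model's P(AI-generated) to a label and a confidence score.

    Confidence is the probability of the predicted class, so it is always
    at least 0.5 when the default threshold is used.
    """
    rows: List[Row] = []
    for name, p_fake in probs.items():
        if p_fake >= threshold:
            rows.append((name, "AI-generated", p_fake))
        else:
            rows.append((name, "human-written", 1.0 - p_fake))
    # Most confident model first.
    return sorted(rows, key=lambda row: row[2], reverse=True)


def models_agree(rows: List[Row]) -> bool:
    """Return True when every model predicts the same label."""
    return len({label for _, label, _ in rows}) == 1


if __name__ == "__main__":
    # Made-up probabilities, not real model outputs.
    example = {"tfidf_logreg": 0.9, "xgboost": 0.75, "bert_lora": 0.2}
    for name, label, confidence in summarize_predictions(example):
        print(f"{name:14s} {label:14s} {confidence:.2f}")
```

Disagreement between models, as flagged by `models_agree`, is one possible signal for routing a review to manual moderation.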