Inspiration
In a world where our phones buzz endlessly with promotions, fake prize offers, and phishing links, we noticed how common SMS spam has become, especially in regions with limited awareness of digital fraud. One of our team members fell victim to a phishing SMS, and that sparked the idea: can we build an AI tool that empowers people to detect spam before it harms them?
What it does
SpamShield is an AI-powered web application that allows users to paste or type SMS messages to instantly determine whether they are Spam or Not Spam. It offers:
Multi-message analysis at once
Confidence levels for each prediction
Visual insights (pie charts, summary statistics)
A user-friendly UI that requires no technical knowledge
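As a rough illustration of the multi-message analysis and per-message confidence described above, here is a minimal sketch in plain Python. It assumes one SMS per line of pasted text, which the write-up does not specify, and the names `split_messages`, `summarize`, and `keyword_classifier` are hypothetical. The keyword rule is only a stand-in so the sketch runs without a trained model.

```python
from collections import Counter
from typing import Callable, Iterable


def split_messages(raw: str) -> list[str]:
    """Treat each non-empty line of the pasted text as one SMS (an assumption)."""
    return [line.strip() for line in raw.splitlines() if line.strip()]


def summarize(
    messages: Iterable[str],
    classify: Callable[[str], tuple[str, float]],
) -> dict:
    """Classify every message and collect label counts for a chart."""
    results = [(msg, *classify(msg)) for msg in messages]
    counts = Counter(label for _, label, _ in results)
    return {"results": results, "counts": dict(counts)}


def keyword_classifier(message: str) -> tuple[str, float]:
    """Hypothetical stand-in for the real model: returns (label, confidence)."""
    is_spam = "prize" in message.lower()
    return ("Spam", 0.9) if is_spam else ("Not Spam", 0.8)


pasted = "You won a PRIZE! Click here\n\nSee you at lunch?\n"
summary = summarize(split_messages(pasted), keyword_classifier)
print(summary["counts"])  # label counts that could feed a pie chart
```

The `counts` dictionary is the kind of data a pie chart needs, and `results` keeps the confidence for each message.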
How we built it
We built SpamShield with the following tech stack:
Backend ML Model:
Algorithm: Multinomial Naive Bayes
Language: Python
Libraries: scikit-learn, pandas, joblib, nltk
Data: We used the SMS Spam Collection Dataset
Preprocessing: Lowercasing, stopword removal, stemming, and TF-IDF vectorization
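The preprocessing steps listed above could look roughly like the following sketch. It is not the project's actual code: the function name `clean_text` and the regex tokenizer are hypothetical, and scikit-learn's built-in English stop-word list is used here as a stand-in because it needs no corpus download. The stemmer is NLTK's `PorterStemmer`.

```python
import re

from nltk.stem import PorterStemmer
from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS, TfidfVectorizer

stemmer = PorterStemmer()


def clean_text(message: str) -> str:
    """Lowercase, drop stop words, and stem the remaining tokens."""
    tokens = re.findall(r"[a-z0-9]+", message.lower())
    kept = [stemmer.stem(t) for t in tokens if t not in ENGLISH_STOP_WORDS]
    return " ".join(kept)


# Two made-up messages, for illustration only.
messages = ["You are WINNING the prizes", "Are we meeting for lunch today?"]
cleaned = [clean_text(m) for m in messages]

# lowercase=False because clean_text has already lowercased the text.
vectorizer = TfidfVectorizer(lowercase=False)
X = vectorizer.fit_transform(cleaned)  # sparse matrix, one row per message

print(cleaned)
print(X.shape)
```

The cleaning function works on raw strings and runs before vectorization, so the same function can be reused unchanged at inference time.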
Frontend Application:
Framework: Streamlit
Libraries: Plotly for visualizations
Deployment: Streamlit Cloud
Model Training Pipeline:
Cleaned and labeled the data
Converted messages into TF-IDF vectors
Trained the model and saved it using joblib
Integrated prediction + confidence scores into the Streamlit frontend
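The pipeline steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the team's code: the four training messages are invented toy data rather than the SMS Spam Collection, the file name `spam_pipeline.joblib` is hypothetical, and bundling the vectorizer and classifier in one scikit-learn pipeline is one possible packaging choice that the write-up does not confirm.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy data for illustration only; the project used the SMS Spam Collection.
texts = [
    "Win a free prize now, click this link",
    "Congratulations, you won cash, claim today",
    "Are we still meeting for lunch?",
    "Call me when you get home",
]
labels = ["spam", "spam", "ham", "ham"]

# One object holds both the TF-IDF vectorizer and the Naive Bayes model.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(texts, labels)

# Save and reload, as a deployed app would do at startup.
path = os.path.join(tempfile.mkdtemp(), "spam_pipeline.joblib")  # hypothetical name
joblib.dump(model, path)
loaded = joblib.load(path)

new_messages = ["Claim your free prize", "See you at home"]
predictions = loaded.predict(new_messages)
probabilities = loaded.predict_proba(new_messages)  # columns follow loaded.classes_

for message, label, probs in zip(new_messages, predictions, probabilities):
    confidence = probs.max()  # probability of the predicted class
    print(f"{message!r}: {label} ({confidence:.0%})")
```

Because the saved object accepts raw strings, the frontend only has to call `predict` and `predict_proba` on whatever the user pastes.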
Challenges we ran into
Creating a clean preprocessing pipeline that avoids common bugs (e.g., calling .lower() on a sparse matrix, which happens when vectorized data reaches a step that expects raw text)
Packaging the vectorizer and model in a compatible way across local and cloud deployments
Making the UI visually appealing while being informative
Avoiding overfitting caused by dataset imbalance (far more ham than spam messages)
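The write-up does not say how the class imbalance was handled. One common way to keep it visible is a stratified split plus per-class metrics, sketched below. The corpus is synthetic (40 ham, 10 spam) and exists only to make the example runnable.

```python
from collections import Counter

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Synthetic, imbalanced toy corpus: 40 ham and 10 spam messages.
ham = [f"see you at lunch tomorrow {i}" for i in range(40)]
spam = [f"win a free prize now {i}" for i in range(10)]
texts = ham + spam
labels = ["ham"] * len(ham) + ["spam"] * len(spam)

# stratify=labels keeps the ham/spam ratio the same in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, stratify=labels, random_state=0
)
print(Counter(y_test))

model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(X_train, y_train)

# Per-class precision and recall show problems that overall accuracy can hide.
print(classification_report(y_test, model.predict(X_test)))
```

On imbalanced data, spam recall is usually the number to watch, since a model that labels everything as ham can still score high accuracy.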
Accomplishments that we're proud of
Integrated an ML model seamlessly into a real-time, full-stack web app
Designed an intuitive user experience for non-technical users
Achieved >98% accuracy on test data
Built a polished demo within a limited hackathon timeline
What we learned
How to turn a basic ML classifier into a usable product
The importance of model serialization and inference-time preprocessing
Using Streamlit effectively for rapid front-end prototyping
Real-world UX considerations for AI-powered tools
What's next for SpamShield
Build a browser extension to auto-detect SMS content in real time
Create a mobile app version for Android
Enable SMS forwarding analysis via WhatsApp/Telegram bot
Add custom model training with user-supplied messages for enterprise use
Improve performance with deep learning models like BERT or DistilBERT
Built With
- joblib
- nltk
- pandas
- plotly
- python
- scikit-learn
- streamlit