Inspiration

In a world where our phones buzz endlessly with promotions, fake prize offers, and phishing links, we noticed how common SMS spam has become, especially in regions with limited awareness of digital fraud. One of our team members fell victim to a phishing SMS, and that sparked the idea: can we build an AI tool that helps people detect spam before it harms them?

What it does

SpamShield is an AI-powered web application where users paste or type SMS messages and instantly see whether each one is Spam or Not Spam. It offers:

Analysis of multiple messages at once

Confidence levels for each prediction

Visual insights (pie charts, stats)

A user-friendly UI that requires no technical knowledge
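As a rough illustration of the multi-message feature, the sketch below splits pasted text into one message per line, classifies each, and tallies labels for a chart. The one-message-per-line rule, the function names, and the `keyword_stub` classifier are assumptions made for this example, not the app's actual code.

```python
from collections import Counter
from typing import Callable, List, Tuple

# Hypothetical classifier signature: takes a message, returns (label, confidence).
Classifier = Callable[[str], Tuple[str, float]]


def split_messages(raw_text: str) -> List[str]:
    """Treat each non-empty line of the pasted text as one SMS message."""
    return [line.strip() for line in raw_text.splitlines() if line.strip()]


def analyze(raw_text: str, classify: Classifier) -> dict:
    """Classify every message and collect per-label counts for a chart."""
    results = []
    for message in split_messages(raw_text):
        label, confidence = classify(message)
        results.append({"message": message, "label": label, "confidence": confidence})
    counts = Counter(r["label"] for r in results)
    return {"results": results, "counts": dict(counts)}


def keyword_stub(message: str) -> Tuple[str, float]:
    """Stand-in for the trained model, used only to make the sketch runnable."""
    if "prize" in message.lower():
        return "Spam", 0.9
    return "Not Spam", 0.8


report = analyze("You won a PRIZE, click now\n\nSee you at lunch\n", keyword_stub)
print(report["counts"])
```

The `counts` dictionary is the kind of summary a pie chart could be drawn from, and `results` carries the per-message confidence values.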

How we built it

We built SpamShield with the following tech stack:

Backend ML Model:

Algorithm: Multinomial Naive Bayes

Language: Python

Libraries: scikit-learn, pandas, joblib

Data: We used the SMS Spam Collection Dataset

Preprocessing: Lowercasing, stopword removal, stemming, and TF-IDF vectorization
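The text-cleaning steps listed above can be sketched in plain Python. This is a simplified stand-in: the stopword list is a tiny sample, and `crude_stem` is a hypothetical suffix stripper rather than a real stemming algorithm such as Porter, which the project may well have used instead.

```python
import re

# Tiny illustrative stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"a", "an", "the", "is", "to", "you", "your", "and", "of", "for"}

# Crude suffix stripping as a stand-in for a real stemmer.
SUFFIXES = ("ing", "ed", "es", "s")


def crude_stem(token: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(message: str) -> str:
    """Lowercase, tokenize, drop stopwords, and stem; returns a cleaned string."""
    tokens = re.findall(r"[a-z0-9]+", message.lower())
    kept = [crude_stem(t) for t in tokens if t not in STOPWORDS]
    return " ".join(kept)


print(preprocess("You have WON a prize! Claiming is free."))
# prints: have won prize claim free
```

The cleaned string, not a vector, is what would then be handed to TF-IDF vectorization.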

Frontend Application:

Framework: Streamlit

Libraries: Plotly for visualizations

Deployment: Streamlit Cloud

Model Training Pipeline:

Cleaned and labeled the data

Converted messages into TF-IDF vectors

Trained the model and saved it using joblib

Integrated prediction + confidence scores into the Streamlit frontend
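The pipeline steps above might look roughly like the following with scikit-learn and joblib. This is a minimal sketch under stated assumptions: the six training messages are toy data, the file name is a placeholder, stemming is omitted, and wrapping both steps in a single `Pipeline` is one reasonable choice rather than a confirmed detail of the project.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Toy data for illustration only; the project trained on the SMS Spam Collection.
messages = [
    "win a free prize now",
    "free cash prize claim now",
    "urgent claim your free reward",
    "are we still meeting for lunch",
    "see you at home tonight",
    "call me when you get home",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# Bundling vectorizer and classifier keeps training-time and
# inference-time preprocessing identical.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(lowercase=True, stop_words="english")),
    ("nb", MultinomialNB()),
])
pipeline.fit(messages, labels)

# Save and reload the whole pipeline as one artifact.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "spam_pipeline.joblib")  # placeholder name
    joblib.dump(pipeline, model_path)
    loaded = joblib.load(model_path)


def predict_with_confidence(text):
    """Return (label, confidence); confidence is the top class probability."""
    probabilities = loaded.predict_proba([text])[0]
    best = probabilities.argmax()
    return loaded.classes_[best], float(probabilities[best])


label, confidence = predict_with_confidence("claim your free prize")
print(label, round(confidence, 2))
```

Because the vectorizer travels with the classifier, the frontend only needs to load one file and pass raw strings to it.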

Challenges we ran into

Creating a clean preprocessing pipeline that avoids common bugs (e.g., calling .lower() on a sparse matrix instead of on the raw text)

Packaging the vectorizer and model in a compatible way across local and cloud deployments

Making the UI visually appealing while being informative

Avoiding a model biased toward the majority class, given the dataset imbalance (more ham than spam)
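To make the imbalance issue concrete, the sketch below computes accuracy alongside precision and recall for the spam class. The 90/10 split is made up for illustration and is not the dataset's real ratio; `spam_metrics` is a hypothetical helper name.

```python
def spam_metrics(y_true, y_pred, positive="spam"):
    """Compute accuracy plus precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Made-up split for illustration: 90 ham, 10 spam.
y_true = ["ham"] * 90 + ["spam"] * 10

# A degenerate model that always answers "ham" still looks accurate.
always_ham = ["ham"] * 100
print(spam_metrics(y_true, always_ham))
# prints: {'accuracy': 0.9, 'precision': 0.0, 'recall': 0.0}
```

A model that never flags spam scores 90% accuracy here while catching nothing, which is why spam-class recall is worth checking next to the headline accuracy.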

Accomplishments that we're proud of

Integrated an ML model into a real-time web app across the full stack

Designed an intuitive user experience for non-technical users

Achieved >98% accuracy on test data

Built a polished demo within a limited hackathon timeline

What we learned

How to turn a basic ML classifier into a usable product

The importance of model serialization and inference-time preprocessing

Using Streamlit effectively for rapid front-end prototyping

Real-world UX considerations for AI-powered tools

What's next for SpamShield

Build a browser extension to auto-detect SMS content in real time

Create a mobile app version for Android

Enable analysis of forwarded SMS messages via a WhatsApp/Telegram bot

Add custom model training with user-supplied messages for enterprise use

Improve performance with deep learning models like BERT or DistilBERT
