SpamShield

Inspiration

Our phones buzz endlessly with promotions, fake prize offers, and phishing links, and SMS spam is especially common in regions with limited awareness of digital fraud. After one of our team members fell victim to a phishing SMS, we asked ourselves: can we build an AI tool that helps people detect spam before it harms them?

What it does

SpamShield is an AI-powered web application that lets users paste or type SMS messages and instantly classifies each one as Spam or Not Spam. It offers:

Multi-message analysis at once

Confidence levels for each prediction

Visual insights (pie charts, summary statistics)

A user-friendly UI that requires no technical knowledge
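
As a rough illustration of the multi-message and statistics features listed above, the sketch below splits pasted text into individual messages and computes the share of each label for a chart. It assumes one message per line, which the description does not confirm, and the names `split_messages` and `summarize` are hypothetical helpers, not taken from the project.

```python
from collections import Counter


def split_messages(raw_text: str) -> list[str]:
    """Treat each non-empty line of pasted text as one SMS message."""
    return [line.strip() for line in raw_text.splitlines() if line.strip()]


def summarize(labels: list[str]) -> dict[str, float]:
    """Return the percentage share of each label, e.g. for a pie chart."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: 100 * n / total for label, n in counts.items()}


pasted = "WIN a free prize now!\n\nSee you at lunch\n  Claim your cash  \n"
messages = split_messages(pasted)
print(messages)

# Hypothetical labels, standing in for real model output.
shares = summarize(["Spam", "Not Spam", "Spam", "Not Spam"])
print(shares)
```

Blank lines and surrounding whitespace are dropped, so an accidental empty line in the input does not become a message of its own.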

How we built it

We built SpamShield with the following tech stack:

Backend ML Model:

Algorithm: Multinomial Naive Bayes

Language: Python

Libraries: scikit-learn, pandas, joblib

Data: We used the SMS Spam Collection Dataset

Preprocessing: Lowercasing, stopword removal, stemming, and TF-IDF vectorization
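
The text-cleaning steps named above (lowercasing, stopword removal, stemming) could look roughly like the sketch below. It is an illustration only: the stopword list is a tiny made-up sample, and the suffix stripper is a deliberately crude stand-in for a real stemmer such as Porter's, which the project may well have used instead. TF-IDF vectorization would then run on the cleaned text.

```python
import re

# Tiny illustrative stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"a", "an", "the", "to", "you", "your", "is", "are", "for", "have"}

# Crude suffix stripping, a stand-in for a real stemmer such as Porter's.
SUFFIXES = ("ing", "ed", "s")


def stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def clean_message(text: str) -> str:
    """Lowercase, keep runs of ASCII letters, drop stopwords, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return " ".join(stem(t) for t in tokens if t not in STOPWORDS)


print(clean_message("You have WON 2 free tickets! Claiming is easy."))
# prints: won free ticket claim easy
```

Whatever cleaning is applied at training time has to be applied identically at prediction time, which is why it helps to keep it in one function.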

Frontend Application:

Framework: Streamlit

Libraries: Plotly for visualizations

Deployment: Streamlit Cloud

Model Training Pipeline:

Cleaned and labeled the data

Converted messages into TF-IDF vectors

Trained the model and saved it using joblib

Integrated prediction + confidence scores into the Streamlit frontend
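
The pipeline steps above can be sketched with scikit-learn and joblib as follows. The six messages are made up for illustration and are not the SMS Spam Collection data, and bundling the vectorizer and classifier into a single `Pipeline` is one reasonable design, not necessarily the one the project used.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Made-up toy messages for illustration; not the SMS Spam Collection data.
messages = [
    "win a free prize now",
    "free cash claim now",
    "urgent claim your free prize",
    "are we meeting for lunch today",
    "see you at home tonight",
    "call me when you get home",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# Bundling vectorizer and classifier keeps inference-time preprocessing
# identical to training-time preprocessing.
model = make_pipeline(TfidfVectorizer(lowercase=True), MultinomialNB())
model.fit(messages, labels)

# Save and reload the whole pipeline as one artifact.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "spam_model.joblib"
    joblib.dump(model, model_path)
    restored = joblib.load(model_path)

new_messages = ["claim your free prize", "see you at lunch"]
predictions = restored.predict(new_messages)
# One row per message, one column per class in restored.classes_.
probabilities = restored.predict_proba(new_messages)

for text, label, row in zip(new_messages, predictions, probabilities):
    print(f"{label} (confidence {row.max():.2f}): {text}")
```

Because the reloaded object takes raw strings, the frontend never has to handle the vectorizer separately, which removes one source of mismatch between local and cloud deployments.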

Challenges we ran into

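Creating a clean preprocessing pipeline that avoids common bugs (e.g., calling .lower() on a sparse matrix, which typically happens when already-vectorized text is passed to the vectorizer a second time)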

Packaging the vectorizer and model in a compatible way across local and cloud deployments

Making the UI visually appealing while being informative

Keeping the model from favoring the majority class, since the dataset is imbalanced (more ham than spam)
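
To illustrate why class imbalance needs care, the sketch below scores a deliberately useless classifier that always answers "ham". The labels are made up for illustration and are not project results; the point is only that accuracy can look high while spam recall is zero, so per-class metrics are worth checking alongside accuracy.

```python
def evaluate(y_true: list[str], y_pred: list[str], positive: str = "spam"):
    """Return accuracy plus precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Made-up labels where ham outnumbers spam: 9 ham, 1 spam.
y_true = ["ham"] * 9 + ["spam"]
# A useless classifier that always answers "ham".
always_ham = ["ham"] * 10

accuracy, precision, recall = evaluate(y_true, always_ham)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
# prints: accuracy=0.90 precision=0.00 recall=0.00
```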

Accomplishments that we're proud of

Seamless full-stack integration of an ML model into a real-time web app

Designed an intuitive user experience for non-technical users

Achieved >98% accuracy on test data

Built a polished demo within a limited hackathon timeline

What we learned

How to turn a basic ML classifier into a usable product

The importance of model serialization and inference-time preprocessing

Using Streamlit effectively for rapid front-end prototyping

Real-world UX considerations for AI-powered tools

What's next for SpamShield

Build a browser extension to auto-detect SMS content in real-time

Create a mobile app version for Android

Enable SMS forwarding analysis via WhatsApp/Telegram bot

Add custom model training with user-supplied messages for enterprise use

Improve performance with deep learning models like BERT or DistilBERT

Updates

Aaryan Pawar posted an update — Jul 27, 2025 08:53 AM EDT

I updated the YouTube link after the submission deadline; I didn't know it would result in disqualification.

Aaryan Pawar started this project — Jul 25, 2025 01:27 PM EDT
