Inspiration

  • We live in a state of constant digital alert. Every unexpected text or email forces a difficult choice: Is this a critical notification or a sophisticated trap? The fear of being scammed is matched only by the fear of missing something important, like a delivery update or an appointment reminder.

  • We realized that without an integrated, instant verification tool, users are left guessing. This uncertainty leads to two costly outcomes: missed opportunities (ignoring real messages) or financial/data loss (falling for malicious links). We wanted to eliminate this guesswork by building a tool that provides clarity in a world of deception.

What it does

  • Lumos is an AI Co-pilot for Digital Trust: a centralized platform that delivers instant, AI-powered analysis of suspicious messages and images, built on machine learning models such as XGBoost.

  • Multi-Modal Analysis: Users can paste text or upload a screenshot of a suspicious message.

  • Instant Verdict: The system analyzes the input and delivers a clear Risk Score (0-100) in seconds.

  • Actionable Intelligence: Unlike simple blockers, Lumos explains why a message is risky through "Analysis Evidence" (e.g., unknown sender, urgency keywords, flagged URLs) and provides concrete "Recommendations" (e.g., Block number, Do not click links). An example of the verdict shape follows this list.
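
For illustration, here is a minimal sketch of the verdict the UI renders. The field names (riskScore, evidence, recommendations) and the sample values are assumptions chosen to mirror the labels above, not our exact API contract.

```typescript
// Illustrative shape of an analysis verdict; the field names are assumptions
// that mirror the UI ("Risk Score", "Analysis Evidence", "Recommendations").
interface AnalysisVerdict {
  riskScore: number;         // 0-100, higher means more likely a scam
  evidence: string[];        // human-readable reasons behind the score
  recommendations: string[]; // concrete next steps for the user
}

const exampleVerdict: AnalysisVerdict = {
  riskScore: 87,
  evidence: [
    "Sender is not in your contacts and uses an international number",
    "Message contains urgency keywords ('final notice', 'within 24 hours')",
    "Shortened URL points to a recently registered domain",
  ],
  recommendations: ["Do not click the link", "Block the number", "Report as spam"],
};
```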

How we built it

  • We engineered a Multi-Layered Intelligence Engine that deconstructs each incoming message into its component signals (sender, links, language) for analysis.

  • The Stack: We built a scalable backend using Node.js and Express.js, serving a responsive Web UI (a minimal endpoint sketch follows this list).

  • Data Processing (OCR): To handle image-based scams that evade text filters, we integrated Tesseract.js. This lets Lumos extract and analyze text, URLs, and phone numbers directly from screenshots (see the OCR sketch after this list).

  • External Intelligence: We chained multiple powerful APIs to gather context (see the lookup sketch after this list):

    • Twilio API: For sender reputation and phone number validation.
    • Google Safe Browsing API: For checking URL reputation and domain age.
    • OpenAI API: For semantic analysis to detect urgency, threats, and linguistic patterns.
  • The Brain (Machine Learning): All these signals feed into our custom-trained XGBoost model, which evaluates 45 distinct features, ranging from keyword density to international sender status, to calculate the final probability of a scam (see the feature sketch below).
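
To make the stack concrete, here is a minimal sketch of how the analysis endpoints could be wired up in Express. The route paths, the multer upload field, and the stubbed analyzeMessage helper are illustrative, not our production code.

```typescript
import express from "express";
import multer from "multer";

// Stand-in for the full pipeline (OCR -> external lookups -> XGBoost scoring);
// the real implementation chains the layers described in this section.
async function analyzeMessage(input: { text?: string; imageBuffer?: Buffer }) {
  return { riskScore: 0, evidence: [] as string[], recommendations: [] as string[] };
}

const app = express();
app.use(express.json());
const upload = multer({ storage: multer.memoryStorage() });

// Text path: the client POSTs the pasted message body.
app.post("/api/analyze/text", async (req, res) => {
  res.json(await analyzeMessage({ text: req.body.text }));
});

// Screenshot path: the client uploads an image and OCR runs first.
app.post("/api/analyze/image", upload.single("screenshot"), async (req, res) => {
  res.json(await analyzeMessage({ imageBuffer: req.file?.buffer }));
});

app.listen(3000, () => console.log("Lumos API listening on :3000"));
```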
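
A simplified sketch of the OCR layer, using Tesseract.js's one-shot recognize call. The extraction regexes are deliberately basic stand-ins for the real parser.

```typescript
import Tesseract from "tesseract.js";

// Pull raw text out of a screenshot, then extract the artifacts we score.
async function extractFromScreenshot(imageBuffer: Buffer) {
  const { data } = await Tesseract.recognize(imageBuffer, "eng");
  const text = data.text;

  const urls = text.match(/https?:\/\/[^\s)>"']+/gi) ?? [];
  const phoneNumbers = text.match(/\+?\d[\d\s().-]{7,}\d/g) ?? [];

  return { text, urls, phoneNumbers };
}
```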
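
A sketch of the three external lookups. The Safe Browsing request follows the public v4 Lookup API; the Twilio Lookup call, the OpenAI model name, and the prompt are illustrative assumptions rather than our exact configuration.

```typescript
import OpenAI from "openai";
import twilio from "twilio";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
const twilioClient = twilio(process.env.TWILIO_SID, process.env.TWILIO_TOKEN);

// Google Safe Browsing v4 Lookup API: is any extracted URL on a threat list?
async function checkUrls(urls: string[]): Promise<boolean> {
  const res = await fetch(
    `https://safebrowsing.googleapis.com/v4/threatMatches:find?key=${process.env.SAFE_BROWSING_KEY}`,
    {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        client: { clientId: "lumos", clientVersion: "1.0" },
        threatInfo: {
          threatTypes: ["MALWARE", "SOCIAL_ENGINEERING"],
          platformTypes: ["ANY_PLATFORM"],
          threatEntryTypes: ["URL"],
          threatEntries: urls.map((url) => ({ url })),
        },
      }),
    }
  );
  const body = await res.json();
  return (body.matches ?? []).length > 0;
}

// Twilio Lookup v2: validation and basic metadata for the sender's number.
async function lookupSender(phoneNumber: string) {
  return twilioClient.lookups.v2.phoneNumbers(phoneNumber).fetch();
}

// OpenAI: semantic read on urgency, threats, and impersonation cues.
async function analyzeLanguage(text: string): Promise<string | null> {
  const completion = await openai.chat.completions.create({
    model: "gpt-4o-mini", // assumed model; any chat model works for the sketch
    messages: [
      {
        role: "system",
        content: "Assess whether this message is a scam. List urgency, threat, and impersonation cues as JSON.",
      },
      { role: "user", content: text },
    ],
  });
  return completion.choices[0].message.content;
}
```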
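
Finally, a sketch of the feature layer: a handful of the 45 features assembled into the vector the model scores. The feature names and the scoreWithModel placeholder are hypothetical; the trained XGBoost model and the full feature list are not reproduced here.

```typescript
// A small, illustrative subset of the 45 features that feed the XGBoost model.
interface MessageSignals {
  text: string;
  urls: string[];
  senderIsInternational: boolean;
  urlFlaggedBySafeBrowsing: boolean;
}

const URGENCY_KEYWORDS = ["urgent", "immediately", "final notice", "suspended"];

function buildFeatureVector(s: MessageSignals): number[] {
  const lower = s.text.toLowerCase();
  const wordCount = Math.max(lower.split(/\s+/).length, 1);
  const urgencyHits = URGENCY_KEYWORDS.filter((k) => lower.includes(k)).length;
  return [
    urgencyHits / wordCount,            // urgency keyword density
    s.urls.length,                      // number of links in the message
    s.senderIsInternational ? 1 : 0,    // international sender flag
    s.urlFlaggedBySafeBrowsing ? 1 : 0, // URL reputation signal
    // ...the remaining linguistic, sender, and URL features are omitted here
  ];
}

// Placeholder for the trained model: maps the feature vector to a scam
// probability, which the UI rescales to the 0-100 Risk Score.
const scoreWithModel = (_features: number[]): number => 0.5;

function riskScore(signals: MessageSignals): number {
  return Math.round(scoreWithModel(buildFeatureVector(signals)) * 100);
}
```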

Challenges we ran into

  • OCR Accuracy on Varied Backgrounds: Extracting text reliably from screenshots with different lighting, resolutions, and compression artifacts using Tesseract.js required significant tuning to ensure we didn't miss critical URLs or phone numbers (a preprocessing sketch follows this list).

  • Feature Engineering for XGBoost: Determining which features were most predictive of a scam was difficult. We had to balance technical signals (such as URL age) with linguistic signals (such as "urgency") so that legitimate high-priority messages (such as 2FA codes) didn't turn into false positives.

  • Real-Time Latency: Chaining multiple APIs (Twilio, Google, OpenAI) creates a risk of slow responses. We had to optimize our asynchronous requests to ensure the user gets a "verdict in seconds" rather than waiting for a long analysis (see the timeout sketch after this list).
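
On the OCR accuracy point above: one kind of tuning that helps is normalizing the screenshot before recognition. The sketch below assumes the sharp library for image cleanup; the steps and parameters are illustrative, not the settings we shipped.

```typescript
import sharp from "sharp";
import Tesseract from "tesseract.js";

// Clean a screenshot before OCR: grayscale, stretch contrast, and bring it to
// a consistent width so small, heavily compressed images read better.
async function ocrWithPreprocessing(imageBuffer: Buffer): Promise<string> {
  const cleaned = await sharp(imageBuffer)
    .grayscale()             // drop color noise from chat-bubble backgrounds
    .normalise()             // stretch contrast across the full range
    .resize({ width: 1600 }) // consistent width; small screenshots get upscaled
    .toBuffer();

  const { data } = await Tesseract.recognize(cleaned, "eng");
  return data.text;
}
```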
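
And the timeout pattern behind the latency fix: run the lookups in parallel and cap each one with a fallback value so a single slow provider cannot stall the verdict. The timeout values and fallback results are illustrative.

```typescript
// Cap each external lookup so one slow provider cannot stall the verdict.
function withTimeout<T>(task: Promise<T>, ms: number, fallback: T): Promise<T> {
  return Promise.race([
    task,
    new Promise<T>((resolve) => setTimeout(() => resolve(fallback), ms)),
  ]);
}

// Run the three lookups in parallel; checkUrls / lookupSender / analyzeLanguage
// stand for the helpers sketched in "How we built it".
async function gatherSignals(
  checkUrls: () => Promise<boolean>,
  lookupSender: () => Promise<unknown>,
  analyzeLanguage: () => Promise<string | null>
) {
  return Promise.all([
    withTimeout(checkUrls(), 2000, false),      // assume "not flagged" on timeout
    withTimeout(lookupSender(), 2000, null),    // drop the sender signal on timeout
    withTimeout(analyzeLanguage(), 4000, null), // drop the semantic signal on timeout
  ]);
}
```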

Accomplishments that we're proud of

  • The "Glass Box" Approach: We didn't just build a black-box AI that says "Bad." We are proud of our UI that breaks down the Analysis Evidence. Showing the user exactly why a message was flagged (e.g., "URL not flagged, but AI analysis detected urgency tactics") builds genuine user trust.

  • 45-Feature Threat Analysis: Successfully implementing a model that considers 45 different variables gives our detection engine a depth that simple keyword matching cannot achieve.

  • Seamless Image Handling: Getting the "Parser & OCR" layer to work smoothly lets us catch scams that hide inside images, a blind spot for many current security tools.

What we learned

  • Scams are Structural: We learned that while the content of scams changes, the structure (urgency + obscure link + unknown sender) remains remarkably consistent. This validated our choice to use XGBoost to detect these patterns.

  • Context is King: A URL might be safe, but the context (e.g., asking for a toll payment via text) is what makes it a scam. Relying on a single data point (like Safe Browsing) isn't enough; you need the multi-layered approach we built.

What's next for Lumos

  • Lumos is just our first step toward a safer digital ecosystem. Our roadmap includes:
    • Browser Extensions & Email Plug-ins: Moving Lumos from a destination site to a tool that lives where the users are, automatically scanning content as it arrives.
    • Real-Time APIs: Opening our detection engine to other developers to bring trust to their platforms.
    • Enhanced Community Feedback: Allowing users to report new scam templates to retrain our XGBoost model in real time.

Built With

  • Node.js & Express.js
  • Tesseract.js
  • Twilio API
  • Google Safe Browsing API
  • OpenAI API
  • XGBoost