riptide-06/clue_zero

Clue_zero

An autonomous home-security AI agent. Clue_zero watches a webcam feed, reasons about what it sees, tells real threats apart from false positives, and alerts the household only when something actually matters.

Built for the Hack-a-Claw x NVIDIA hackathon.


Why this exists

This project is personal. My father has anxiety about home security. Every single night he barricades the front door before he can sleep. A motion sensor would not help him — it would cry wolf at every moth, leaf, and passing car, and the noise would make the anxiety worse, not better.

What he needs is something that can actually look and judge: "that is just the wind" vs. "that is a person trying the door handle." Clue_zero is that judgment, running quietly all night, only speaking up when it should.


What it does

  1. Captures frames from a Logitech C270 webcam, one per second.
  2. Analyzes each frame with NVIDIA Nemotron 3 Nano Omni (vision reasoning, via the build.nvidia.com API).
  3. Decides whether the scene is a real threat or a false positive (bug, leaf, car, delivery driver vs. an actual intruder).
  4. Remembers every event in a persistent SQLite database.
  5. Alerts the household — and shows a calm, glanceable dashboard designed to be readable by a senior citizen across the room.
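
Step 4 (persistent memory) could look roughly like the sketch below. It uses only Python's standard `sqlite3` module. The table schema, column names, and the helpers `open_memory`, `log_event`, and `recent_events` are assumptions made for illustration; the real layout of memory/events.db may differ.

```python
import sqlite3
import time

# Hypothetical schema: the real memory/events.db layout may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS events (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    ts          REAL NOT NULL,   -- Unix timestamp of the frame
    label       TEXT NOT NULL,   -- e.g. "leaf", "person", "intruder"
    threat      TEXT NOT NULL,   -- e.g. "none", "low", "high"
    confidence  REAL NOT NULL,   -- 0.0 to 1.0
    frame_path  TEXT             -- saved frame, if any
)
"""


def open_memory(db_path=":memory:"):
    """Open (or create) the event store and make sure the table exists."""
    conn = sqlite3.connect(db_path)
    conn.row_factory = sqlite3.Row  # rows can be converted to dicts
    conn.execute(SCHEMA)
    conn.commit()
    return conn


def log_event(conn, label, threat, confidence, frame_path=None, ts=None):
    """Insert one event and return its row id."""
    ts = time.time() if ts is None else ts
    cur = conn.execute(
        "INSERT INTO events (ts, label, threat, confidence, frame_path) "
        "VALUES (?, ?, ?, ?, ?)",
        (ts, label, threat, confidence, frame_path),
    )
    conn.commit()
    return cur.lastrowid


def recent_events(conn, limit=20):
    """Return the newest events first, as plain dicts."""
    rows = conn.execute(
        "SELECT * FROM events ORDER BY ts DESC, id DESC LIMIT ?", (limit,)
    ).fetchall()
    return [dict(row) for row in rows]
```

Passing a file path such as memory/events.db instead of the default `:memory:` keeps the history across restarts, which is what lets the dashboard replay the whole night.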

How it satisfies the hackathon "agent" requirements

| Requirement | How Clue_zero meets it |
| --- | --- |
| Autonomous | Runs unattended all night. The capture loop and analysis pipeline need no human in the loop: it watches, judges, and acts on its own. |
| Multi-step reasoning | For each frame it chains: capture → vision analysis → threat classification → alert decision. It weighs confidence and context, not just "motion: yes/no". |
| Tool use | The agent calls discrete tools: vision_analyze (perception), event_logger (memory), alert_sender (action). Each is a clean, swappable interface. |
| Persistent memory | Every event is written to memory/events.db (SQLite). The agent and the dashboard can both recall the full history of the night. |
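
To make the chain and the swappable tools concrete, here is a minimal sketch of one pass through the pipeline. It is not the project's agent loop (that is not written yet). The `Observation` and `Event` types, the label sets, the 0.7 threshold, and the `classify` and `process_frame` functions are all hypothetical; only the three tool names come from this README.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Observation:
    label: str         # what the vision tool says it sees
    confidence: float  # 0.0 to 1.0


@dataclass
class Event:
    frame_path: str
    label: str
    confidence: float
    threat: str  # "none", "low", or "high"


# Hypothetical label sets and threshold, for illustration only.
BENIGN = {"bug", "leaf", "car"}
HOSTILE = {"intruder"}
ALERT_CONFIDENCE = 0.7


def classify(obs: Observation) -> str:
    """Turn a raw observation into a threat level."""
    if obs.label in HOSTILE and obs.confidence >= ALERT_CONFIDENCE:
        return "high"
    if obs.label in BENIGN:
        return "none"
    # Everything else (a delivery driver, a low-confidence intruder) is
    # logged as "low" but does not wake anyone up.
    return "low"


def process_frame(
    frame_path: str,
    vision_analyze: Callable[[str], Observation],
    event_logger: Callable[[Event], None],
    alert_sender: Callable[[Event], None],
) -> Event:
    """Perceive, classify, remember, and alert only on a high threat."""
    obs = vision_analyze(frame_path)
    threat = classify(obs)
    event = Event(frame_path, obs.label, obs.confidence, threat)
    event_logger(event)      # every event is remembered
    if threat == "high":
        alert_sender(event)  # only real threats raise an alert
    return event
```

Because each tool is passed in as a plain callable, a canned stub can stand in for the vision model during testing and be swapped for the real call later without touching `process_frame`.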

Architecture

  Webcam (Logitech C270)
        |
        v
  capture/webcam_capture.py  --- saves frames to captures/, 1 fps
        |
        v
  tools/vision_analyze.py    --- Nemotron vision reasoning  [STUB today]
        |
        v
  agent/  (reasoning loop)   --- threat vs. false-positive  [coming soon]
        |
        +--> tools/event_logger.py  --- SQLite persistent memory
        |
        +--> tools/alert_sender.py  --- console alert (SMS/push later)
        |
        v
  server/server.py (FastAPI)
        |
        +--  GET  /api/events        recent events as JSON
        +--  GET  /api/latest-frame  most recent webcam frame
        +--  POST /api/test-event    inject a demo event
        +--  WS   /ws/live           pushes events in real time
        |
        v
  dashboard/  --- dark-mode, senior-friendly live view + event log

Folder layout

| Path | Purpose |
| --- | --- |
| agent/ | The reasoning loop (intentionally empty for now). |
| tools/ | The agent's tools: vision analysis, event memory, alerting. |
| capture/ | Webcam capture loop and single-frame grab. |
| captures/ | Saved webcam frames (frame_<timestamp>.jpg). |
| dashboard/ | The web UI (HTML/CSS/JS), served at /. |
| server/ | FastAPI app: REST API + WebSocket + static dashboard. |
| memory/ | events.db, the SQLite persistent memory. |
| demo_clips/ | Montage videos for the final pitch. |
| assets/ | Icons, logos, placeholder images. |

Running it

Quick start (Windows)

run.bat

This creates a virtual environment, installs dependencies, and starts the server. Then open http://localhost:8080.

Manual

pip install -r requirements.txt
python server/server.py

The dashboard is at http://localhost:8080.

Try the pipeline without a camera or model

With the server running, inject a test event:

curl -X POST http://localhost:8080/api/test-event -H "Content-Type: application/json" -d "{\"scenario\": \"intruder\"}"

Scenarios: person, intruder, delivery, leaf, car, bug. A high-threat event makes the dashboard banner flash red and play an alert tone.
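
If curl quoting is awkward in your shell, the same request can be sent from Python using only the standard library. This is a sketch: the endpoint, port, and scenario names come from this README, while the helper names `build_test_event_request` and `inject` are made up here, and the shape of the server's response is not assumed (it is returned as raw text).

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"
SCENARIOS = ("person", "intruder", "delivery", "leaf", "car", "bug")


def build_test_event_request(scenario, base_url=BASE_URL):
    """Build the POST request for /api/test-event without sending it."""
    if scenario not in SCENARIOS:
        raise ValueError(f"unknown scenario: {scenario!r}")
    body = json.dumps({"scenario": scenario}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/test-event",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def inject(scenario, base_url=BASE_URL, timeout=5.0):
    """Send the test event; requires the server to be running."""
    req = build_test_event_request(scenario, base_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")
```

With the server up, `inject("intruder")` should trigger the same dashboard reaction as the curl command, and looping over `SCENARIOS` is a quick way to exercise every case.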

Capture frames from the webcam

python capture/webcam_capture.py

Saves one frame per second to captures/. Press Ctrl+C to stop.
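
For readers curious what a one-frame-per-second loop involves, here is a rough sketch using OpenCV. It is not the contents of capture/webcam_capture.py. The function names, the camera index 0, and the timestamp format in the filename are assumptions; the README only specifies the frame_<timestamp>.jpg pattern.

```python
import time
from datetime import datetime
from pathlib import Path


def frame_path(out_dir, when):
    """Build captures/frame_<timestamp>.jpg (timestamp format is assumed)."""
    return Path(out_dir) / f"frame_{when.strftime('%Y%m%d_%H%M%S')}.jpg"


def capture_loop(out_dir="captures", camera_index=0, interval=1.0):
    """Grab one frame per interval until interrupted with Ctrl+C."""
    # Imported here so frame_path stays usable without OpenCV installed.
    import cv2

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    cap = cv2.VideoCapture(camera_index)
    if not cap.isOpened():
        raise RuntimeError(f"could not open camera {camera_index}")
    try:
        while True:
            started = time.monotonic()
            ok, frame = cap.read()
            if ok:
                cv2.imwrite(str(frame_path(out_dir, datetime.now())), frame)
            # Sleep for the rest of the interval to hold roughly 1 fps.
            elapsed = time.monotonic() - started
            time.sleep(max(0.0, interval - elapsed))
    except KeyboardInterrupt:
        pass  # Ctrl+C ends the loop cleanly
    finally:
        cap.release()
```

Subtracting the time spent reading and writing the frame from the sleep keeps the rate close to one frame per second even when disk writes are slow.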


Current status

This is the scaffolding build. The capture, memory, alerting, server, and dashboard are all real and working. The vision analysis (tools/vision_analyze.py) is a stub returning canned results, and the agent/ reasoning loop is not written yet.

Next: wire in the real NVIDIA Nemotron vision call and build the agent reasoning loop on top. See BUILD_NOTES.md for the detailed punch list.


Tech stack

Python 3.11+ · OpenCV · FastAPI · Uvicorn · SQLite · Rich · vanilla HTML/CSS/JS · NVIDIA Nemotron 3 Nano Omni (build.nvidia.com).
