Clue_zero

An autonomous home-security AI agent. Clue_zero watches a webcam feed, reasons about what it sees, tells real threats apart from false positives, and alerts the household only when something actually matters.

Built for the Hack-a-Claw x NVIDIA hackathon.

Why this exists

This project is personal. My father has anxiety about home security. Every single night he barricades the front door before he can sleep. A motion sensor would not help him — it would cry wolf at every moth, leaf, and passing car, and the noise would make the anxiety worse, not better.

What he needs is something that can actually look and judge: "that is just the wind" vs. "that is a person trying the door handle." Clue_zero is that judgment, running quietly all night, only speaking up when it should.

What it does

Captures frames from a Logitech C270 webcam, one per second.
Analyzes each frame with NVIDIA Nemotron 3 Nano Omni (vision reasoning, via the build.nvidia.com API).
Decides whether the scene is a real threat or a false positive (bug, leaf, car, delivery driver vs. an actual intruder).
Remembers every event in a persistent SQLite database.
Alerts the household — and shows a calm, glanceable dashboard designed to be readable by a senior citizen across the room.
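The threat versus false-positive decision described above can be sketched as a small rule. This is only an illustration, not the project's actual logic: the `Observation` type, the label sets, and the 0.7 threshold are assumptions made for the example, with the labels borrowed from the scenarios this README mentions.

```python
from dataclasses import dataclass

# Hypothetical label sets, borrowed from the examples in this README.
BENIGN_LABELS = {"bug", "leaf", "car", "delivery"}
THREAT_LABELS = {"intruder"}


@dataclass(frozen=True)
class Observation:
    label: str         # what the vision tool reports seeing
    confidence: float  # assumed to be a score between 0.0 and 1.0


def should_alert(obs: Observation, threshold: float = 0.7) -> bool:
    """Alert only when a threat label is seen with enough confidence."""
    return obs.label in THREAT_LABELS and obs.confidence >= threshold


if __name__ == "__main__":
    for obs in (Observation("leaf", 0.99), Observation("intruder", 0.91)):
        print(obs.label, "->", "ALERT" if should_alert(obs) else "stay quiet")
```

Requiring both a threat label and a minimum confidence is one way to keep a confident "leaf" or an uncertain "intruder" from waking anyone up.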

How it satisfies the hackathon "agent" requirements

Requirement	How Clue_zero meets it
Autonomous	Runs unattended all night. The capture loop and analysis pipeline need no human in the loop: the agent watches, judges, and acts on its own.
Multi-step reasoning	For each frame it chains: capture → vision analysis → threat classification → alert decision. It weighs confidence and context, not just "motion: yes/no".
Tool use	The agent calls discrete tools: `vision_analyze` (perception), `event_logger` (memory), `alert_sender` (action). Each is a clean, swappable interface.
Persistent memory	Every event is written to `memory/events.db` (SQLite). The agent — and the dashboard — can recall the full history of the night.
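As a rough sketch of how the chain in the table might fit together, the example below passes the three tools in as plain callables, so a stub and a real implementation are interchangeable. The `Analysis` type, the `process_frame` function, and the callable signatures are assumptions for illustration. The real `agent/` loop is not written yet and may look quite different.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Analysis:
    label: str
    confidence: float
    is_threat: bool


# Hypothetical tool signatures; each tool is just a callable.
VisionAnalyze = Callable[[str], Analysis]      # frame path -> analysis
EventLogger = Callable[[str, Analysis], None]  # store the event
AlertSender = Callable[[Analysis], None]       # notify the household


def process_frame(
    frame_path: str,
    vision_analyze: VisionAnalyze,
    event_logger: EventLogger,
    alert_sender: AlertSender,
) -> bool:
    """Run one frame through perception, memory, and (maybe) action."""
    analysis = vision_analyze(frame_path)
    event_logger(frame_path, analysis)  # every event is remembered
    if analysis.is_threat:              # only threats trigger an alert
        alert_sender(analysis)
    return analysis.is_threat


def stub_vision(frame_path: str) -> Analysis:
    """Stand-in for the vision tool: always reports a harmless leaf."""
    return Analysis(label="leaf", confidence=0.95, is_threat=False)


logged: List[str] = []
alerts: List[str] = []
alerted = process_frame(
    "captures/frame_example.jpg",
    stub_vision,
    lambda path, a: logged.append(f"{path}: {a.label}"),
    lambda a: alerts.append(a.label),
)
print(alerted, logged, alerts)
```

Because the tools are injected, swapping the stub for a real vision call would not require touching `process_frame`.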

Architecture

  Webcam (Logitech C270)
        |
        v
  capture/webcam_capture.py  --- saves frames to captures/, 1 fps
        |
        v
  tools/vision_analyze.py    --- Nemotron vision reasoning  [STUB today]
        |
        v
  agent/  (reasoning loop)   --- threat vs. false-positive  [coming soon]
        |
        +--> tools/event_logger.py  --- SQLite persistent memory
        |
        +--> tools/alert_sender.py  --- console alert (SMS/push later)
        |
        v
  server/server.py (FastAPI)
        |
        +--  GET  /api/events        recent events as JSON
        +--  GET  /api/latest-frame  most recent webcam frame
        +--  POST /api/test-event    inject a demo event
        +--  WS   /ws/live           pushes events in real time
        |
        v
  dashboard/  --- dark-mode, senior-friendly live view + event log

Folder layout

Path	Purpose
`agent/`	The reasoning loop (intentionally empty for now).
`tools/`	The agent's tools: vision analysis, event memory, alerting.
`capture/`	Webcam capture loop and single-frame grab.
`captures/`	Saved webcam frames (`frame_<timestamp>.jpg`).
`dashboard/`	The web UI (HTML/CSS/JS), served at `/`.
`server/`	FastAPI app: REST API + WebSocket + static dashboard.
`memory/`	`events.db` — the SQLite persistent memory.
`demo_clips/`	Montage videos for the final pitch.
`assets/`	Icons, logos, placeholder images.
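To make the `memory/` entry more concrete, here is a minimal sketch of SQLite-backed event memory using Python's standard `sqlite3` module. The table name, columns, and function names are assumptions for illustration. The actual schema in `tools/event_logger.py` is not shown in this README.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema; the real events.db layout may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS events (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    created_at TEXT NOT NULL,
    label TEXT NOT NULL,
    confidence REAL NOT NULL,
    is_threat INTEGER NOT NULL,
    frame_path TEXT
)
"""


def open_memory(db_path: str = "memory/events.db") -> sqlite3.Connection:
    """Open the database and create the events table if it is missing."""
    conn = sqlite3.connect(db_path)
    conn.execute(SCHEMA)
    return conn


def log_event(conn, label, confidence, is_threat, frame_path=None) -> int:
    """Insert one event and return its row id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO events (created_at, label, confidence, is_threat, frame_path) "
            "VALUES (?, ?, ?, ?, ?)",
            (datetime.now(timezone.utc).isoformat(), label, confidence,
             int(is_threat), frame_path),
        )
    return cur.lastrowid


def recent_events(conn, limit: int = 20):
    """Return the newest events first."""
    rows = conn.execute(
        "SELECT id, label, confidence, is_threat, frame_path "
        "FROM events ORDER BY id DESC LIMIT ?",
        (limit,),
    )
    return rows.fetchall()
```

Passing `":memory:"` as `db_path` gives a throwaway in-memory database, which is handy for trying the functions without touching the real file.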

Running it

Quick start (Windows)

run.bat

This creates a virtual environment, installs dependencies, and starts the server. Then open http://localhost:8080.

Manual

pip install -r requirements.txt
python server/server.py

The dashboard is at http://localhost:8080.

Try the pipeline without a camera or model

With the server running, inject a test event:

curl -X POST http://localhost:8080/api/test-event -H "Content-Type: application/json" -d "{\"scenario\": \"intruder\"}"

Scenarios: person, intruder, delivery, leaf, car, bug. A high-threat event makes the dashboard banner flash red and play an alert tone.
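The curl command above uses Windows-style quote escaping. As a shell-independent alternative, the same request could be sent from Python with only the standard library. The helper names here are made up for the example, and since the response format is not documented in this README the sketch simply returns the raw body.

```python
import json
import urllib.request

SCENARIOS = ("person", "intruder", "delivery", "leaf", "car", "bug")


def build_test_event_request(scenario: str,
                             base_url: str = "http://localhost:8080"):
    """Build the POST /api/test-event request without sending it."""
    if scenario not in SCENARIOS:
        raise ValueError(f"unknown scenario: {scenario!r}")
    body = json.dumps({"scenario": scenario}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/test-event",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def inject_test_event(scenario: str) -> str:
    """Send the request to a running server and return the raw response."""
    request = build_test_event_request(scenario)
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.read().decode("utf-8")
```

With the server running, `inject_test_event("intruder")` should have the same effect as the curl command.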

Capture frames from the webcam

python capture/webcam_capture.py

Saves one frame per second to captures/. Press Ctrl+C to stop.
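For readers curious what a one-frame-per-second loop involves, here is a minimal sketch with OpenCV. It is not the contents of `capture/webcam_capture.py`. The function names, the camera index, and the timestamp format in the file name are assumptions, since the README only specifies `frame_<timestamp>.jpg`.

```python
import time
from datetime import datetime
from pathlib import Path


def frame_path(captures_dir: Path, when: datetime) -> Path:
    """Build a frame file name; the timestamp format is an assumption."""
    return captures_dir / f"frame_{when.strftime('%Y%m%d_%H%M%S')}.jpg"


def capture_loop(captures_dir: Path = Path("captures"),
                 interval_s: float = 1.0,
                 camera_index: int = 0) -> None:
    """Save one frame per interval until interrupted with Ctrl+C."""
    import cv2  # imported here so frame_path works without OpenCV installed

    captures_dir.mkdir(parents=True, exist_ok=True)
    cap = cv2.VideoCapture(camera_index)
    if not cap.isOpened():
        raise RuntimeError("could not open the webcam")
    try:
        while True:
            started = time.monotonic()
            ok, frame = cap.read()
            if ok:
                cv2.imwrite(str(frame_path(captures_dir, datetime.now())), frame)
            # Sleep for whatever remains of the interval.
            time.sleep(max(0.0, interval_s - (time.monotonic() - started)))
    except KeyboardInterrupt:
        pass
    finally:
        cap.release()


if __name__ == "__main__":
    capture_loop()
```

Subtracting the time spent reading and writing the frame keeps the loop close to one frame per second instead of drifting slower.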

Current status

This is the scaffolding build. The capture, memory, alerting, server, and dashboard are all real and working. The vision analysis (tools/vision_analyze.py) is a stub returning canned results, and the agent/ reasoning loop is not written yet.

Next: wire in the real NVIDIA Nemotron vision call and build the agent reasoning loop on top. See BUILD_NOTES.md for the detailed punch list.

Tech stack

Python 3.11+ · OpenCV · FastAPI · Uvicorn · SQLite · Rich · vanilla HTML/CSS/JS · NVIDIA Nemotron 3 Nano Omni (build.nvidia.com).
