Inspiration

Large gatherings—festivals, protests, transport hubs—often become unsafe due to overcrowding, delayed response, or lack of real-time visibility. Traditional CCTV systems are passive; they record events but don’t actively prevent incidents.

This project was inspired by the need to shift from reactive monitoring to proactive crowd intelligence, where risks can be detected and mitigated before escalation.

What it does

CrowdCareAI is an AI-powered crowd monitoring system that detects, tracks, and analyzes individuals in real time.

It focuses on:

- Accurate person-level detection (not just crowd blobs)
- Density estimation and congestion alerts
- Real-time risk detection for overcrowding scenarios
- Scalable deployment across CCTV and live camera feeds

At its core, the system transforms video streams into actionable insights for safety authorities.

How we built it

The system is built using a computer vision pipeline optimized for real-time inference:

Data Collection & Annotation
- Used crowd datasets and custom-labeled data
- Focused on bounding-box annotations for individual people

Model Selection & Training
- Evaluated multiple object detection architectures
- Selected models optimized for person detection precision (e.g., YOLO-based variants)
- Training performed using platforms like Roboflow and custom pipelines

Inference Pipeline
- Video frames processed in real time

Each frame produces bounding boxes:

$$B = \{(x, y, w, h, c)\}$$

where $c$ is the confidence score of each detection.
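As a rough illustration of the set $B$ above, the sketch below models one detection as a small record and drops low-confidence boxes before any counting. The `Detection` class, the `filter_detections` helper, the 0.5 cutoff, and the reading of `(x, y)` as the top-left corner in pixels are all assumptions made for this example, not details taken from the project.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Detection:
    """One element of B: a box (x, y, w, h) with confidence c."""
    x: float  # assumed: left edge of the box, in pixels
    y: float  # assumed: top edge of the box, in pixels
    w: float  # box width, in pixels
    h: float  # box height, in pixels
    c: float  # confidence score in [0, 1]


def filter_detections(boxes: List[Detection], min_conf: float = 0.5) -> List[Detection]:
    """Keep only detections whose confidence is at least min_conf."""
    return [b for b in boxes if b.c >= min_conf]


# Hypothetical detections for a single frame.
frame_boxes = [
    Detection(10, 20, 40, 90, 0.91),
    Detection(200, 35, 38, 85, 0.42),
    Detection(120, 50, 42, 95, 0.77),
]

kept = filter_detections(frame_boxes, min_conf=0.5)
print(len(kept))  # 2
```

In a real pipeline the list would come from the detector's output for each frame; the cutoff would likely need tuning per camera.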

Crowd Density Estimation
- Count of detected individuals per frame

Density approximated as:

$$D = \frac{N}{A}$$

where $N$ is the number of people and $A$ is the area of the region.
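A minimal sketch of the density formula follows. The function name `crowd_density` and the choice of square metres for the area are assumptions for illustration; the writeup does not state the units used for $A$.

```python
def crowd_density(num_people: int, area_m2: float) -> float:
    """Return D = N / A, here in people per square metre (assumed unit)."""
    if area_m2 <= 0:
        raise ValueError("area_m2 must be positive")
    return num_people / area_m2


# Hypothetical example: 120 people counted in a 60 m^2 region.
density = crowd_density(num_people=120, area_m2=60.0)
print(density)  # 2.0
```

Here `num_people` would be the per-frame count of detections, and the area would have to be measured or calibrated for each monitored region.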

Alert System
- Threshold-based triggers for overcrowding
- Can be extended to anomaly detection

Challenges we ran into

- Model accuracy vs speed tradeoff: High-accuracy models reduced FPS, affecting real-time usability
- Crowded scene ambiguity: Overlapping people caused missed or merged detections
- Dataset limitations: Many datasets detect “crowds” rather than individuals, requiring refinement
- Lighting and camera angles: Performance drops in low-light or extreme perspectives
- Scalability: Running inference across multiple live feeds requires optimization and possibly edge deployment

What we learned

- The importance of dataset quality over quantity
- Real-world deployment constraints differ significantly from lab performance
- Bounding-box detection alone is not enough; contextual understanding is key
- Optimization (latency, throughput) is as critical as model accuracy
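Returning to the threshold-based triggers listed under Alert System, one way they might look in code is sketched below. The function name `overcrowding_alert`, the two alert levels, and the threshold values are hypothetical and chosen only for illustration; real limits would depend on the venue and on how density is measured.

```python
def overcrowding_alert(
    density: float,
    warn_at: float = 2.0,      # hypothetical warning threshold
    critical_at: float = 4.0,  # hypothetical critical threshold
) -> str:
    """Map a density value to an alert level using fixed thresholds."""
    if density >= critical_at:
        return "critical"
    if density >= warn_at:
        return "warning"
    return "ok"


# Hypothetical per-frame densities for one monitored region.
for d in (1.0, 2.5, 4.5):
    print(d, overcrowding_alert(d))
```

A deployed version would probably also require the threshold to be exceeded for several consecutive frames before raising an alert, so that a single noisy frame does not trigger one.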

Built With

computervision
deeplearning
numpy
opencv
python
pytorch
realtimeinference
roboflow
yolo

Updates

Salmon Angelo started this project — Mar 24, 2026 05:14 AM EDT
