ZenFlow: The Digital Sanctuary 🧘‍♂️✨
Inspiration
In a world saturated with generic fitness trackers and impersonal video tutorials, we felt a disconnect. Yoga is not just about hitting a pose; it is about the union of breath, body, and mind. We wanted to bridge the gap between physical practice and mental presence.
Our inspiration was to create a "Digital Sanctuary"—a space that doesn't just track your movement but understands your journey. We set out to build a tool that democratizes professional-grade coaching, using Computer Vision to correct form and Agentic AI to nurture the mind.
What it does
ZenFlow is a premium, AI-powered yoga platform that functions as a holistic wellness companion.
- Vision Lab (The Eye): Using Ultra-Res 4FPS Sampling, it analyzes video feeds to identify distinct poses. It utilizes a 10-Second Quality Gate to filter out transition movements, ensuring only meaningful holds are tracked for duration, accuracy, and form consistency.
- AI Mentor (The Coach): Powered by Gemini 2.0 Flash and CrewAI, the mentor provides biomechanical feedback. It doesn't just say "good job"; it suggests specific regressions or modifications based on the visual data collected.
- Wellness Reflection (The Mind): It integrates mental health with physical metrics. The AI analyzes journal entries to adjust the intensity of coaching based on your current mood and recovery status.
- The Vault: A comprehensive library of technical breakdowns and reference imagery for every asana.
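The 10-Second Quality Gate above can be sketched as a simple streak counter: at the 4 FPS sampling rate, a pose must persist for 40 consecutive frames before it counts as a hold. The class and constant names here are illustrative, not ZenFlow's actual API.

```python
FPS = 4                          # Ultra-Res sampling rate (frames per second)
HOLD_SECONDS = 10                # minimum hold before a pose counts
MIN_FRAMES = FPS * HOLD_SECONDS  # 40 consecutive frames

class QualityGate:
    """Emit a pose label only after it has persisted for MIN_FRAMES frames."""

    def __init__(self):
        self.current = None      # pose seen in the ongoing streak
        self.streak = 0          # consecutive frames of that pose

    def update(self, pose_label):
        if pose_label == self.current:
            self.streak += 1
        else:                    # transition detected: reset the streak
            self.current = pose_label
            self.streak = 1
        # Report a confirmed hold only once the gate opens
        return pose_label if self.streak >= MIN_FRAMES else None
```

Feeding per-frame classifier output through `update` means any pose held for less than ten seconds (a transition) never reaches the duration and accuracy trackers.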
How we built it
We architected ZenFlow as a multi-layered system to ensure high performance and a serene user experience.
- Frontend: Built with React 18 and Vite, styled with Tailwind CSS. We used Framer Motion to create a fluid, "Glassmorphic" design that feels calming to use.
- Backend: A high-speed FastAPI server (running on Uvicorn) acts as the nervous system, orchestrating data flow between the client and the AI agents.
- Computer Vision: We utilized MediaPipe for pose landmarking and TensorFlow/Keras for custom pose classification. The system runs inference on video inputs to detect micro-metrics.
- Intelligence: The "brain" is built on CrewAI and Gemini 2.0 Flash. This allows for context-aware reasoning, where the AI remembers past sessions and journal entries to provide hyper-personalized advice.
- Storage: A secure SQLite ledger managed via SQLAlchemy keeps user data local and private.
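One micro-metric the Vision Engine could derive from MediaPipe's (x, y) landmarks is the angle at a joint, e.g. the front knee in Warrior II, computed from three adjacent landmarks via the dot product. This helper is a sketch of that common technique, not ZenFlow's exact pipeline.

```python
import math

def joint_angle(a, b, c):
    """Angle ABC in degrees, where b is the joint (e.g. knee) and
    a, c are the adjacent landmarks (e.g. hip and ankle)."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    # Clamp to guard against floating-point drift outside [-1, 1]
    cos = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cos))
```

Comparing the measured angle against a target range (say, roughly 90° at the front knee) is what lets the mentor suggest a concrete correction rather than a generic cue.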
Challenges we ran into
- Signal vs. Noise in Vision: One of the hardest parts was distinguishing between a user attempting a pose and a user transitioning between poses. We had to engineer a 10-Second Quality Gate to filter out transient movements so the data didn't get polluted with half-formed asanas.
- Latency in Feedback: Processing video streams while simultaneously querying an LLM can cause lag. We had to optimize the Vision Engine to run efficiently alongside the Agentic Core to ensure the experience remained smooth.
- Context Management: Teaching the AI to "remember" previous injuries or mood states from the journal required careful prompt engineering within the CrewAI framework to ensure the context was retrieved accurately for every session.
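The context-retrieval step above amounts to condensing prior sessions and the latest journal entry into a compact block that is prepended to the agent's task prompt. The function and field names below are hypothetical; the real CrewAI task wiring is omitted.

```python
def build_mentor_context(history, journal):
    """Condense recent sessions and the latest journal entry into a
    compact context block for the mentor agent's prompt."""
    lines = ["Known constraints and history:"]
    for s in history[-3:]:  # keep the context window small: last 3 sessions
        lines.append(f"- {s['date']}: held {s['pose']} for {s['hold_s']}s"
                     f" (form score {s['form']}/100)")
    if journal.get("injury"):
        lines.append(f"- Reported injury: {journal['injury']}")
    lines.append(f"- Current mood: {journal.get('mood', 'unknown')}")
    return "\n".join(lines)
```

Keeping the window to the last few sessions is one way to stay inside the model's context budget while still letting the AI "remember" an injury reported weeks earlier, provided it is re-surfaced on every session.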
Accomplishments that we're proud of
- Surgical Precision: We achieved Multi-Pose Detection that identifies complex asanas with high accuracy using our custom Keras classification model.
- The "Agentic" Feel: The AI Mentor feels less like a chatbot and more like a coach. By giving Gemini access to the user's history and vision data, the advice feels genuinely grounded in the user's reality.
- The Aesthetic: We are incredibly proud of the Zen UI. The Glassmorphic design system proves that technical dashboards don't have to look clinical—they can be beautiful and calming.
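Before landmarks reach a classifier like the custom Keras model mentioned above, they are typically normalized so the prediction is invariant to where the user stands and how large they appear in frame. This sketch centers (x, y) landmarks on their mean and scales by the largest coordinate; it is an assumed preprocessing step, not necessarily ZenFlow's exact one.

```python
def normalize_landmarks(points):
    """points: list of (x, y) landmark tuples -> flat feature vector
    centered on the landmark mean and scaled into [-1, 1]."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    centered = [(x - cx, y - cy) for x, y in points]
    # Scale by the largest absolute coordinate (guard against all-zero input)
    scale = max(max(abs(x), abs(y)) for x, y in centered) or 1.0
    feats = []
    for x, y in centered:
        feats.extend((x / scale, y / scale))
    return feats
```

The resulting flat vector is what a small dense classifier can consume, so the same pose is recognized whether the user is near the camera or across the room.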
What we learned
- The Power of Multimodal AI: Combining visual data (pose landmarks) with textual data (journal entries) creates a user experience that is exponentially more valuable than either on its own.
- Agentic Workflows: We learned how to use CrewAI to structure complex reasoning tasks, allowing Gemini to focus on empathy and strategy rather than just raw text generation.
- User-Centric ML: We learned that raw accuracy isn't enough; the data needs to be presented in a way that encourages the user, rather than overwhelming them with charts.
What's next for ZenFlow
- Real-Time Voice Corrections: Implementing text-to-speech to give audio cues during the practice so users don't have to look at the screen.
- Wearable Integration: Syncing with Apple Watch or Fitbit to correlate heart rate variability (HRV) with hold times and journal sentiment.
- AR Overlay: Using Augmented Reality to project the "ideal" skeleton over the user's video feed for immediate visual correction.