About Artemis

Inspiration

The idea for Artemis came from watching my sister struggle with ADHD and from experiencing similar focus challenges ourselves. Sitting down to code, opening "just one tab" for documentation, and suddenly finding ourselves 30 minutes deep in YouTube videos with 47 open tabs and zero lines of code written became a pattern we couldn't ignore. Traditional productivity apps would block entire websites, but that felt too restrictive. A YouTube tutorial about React hooks is valuable learning content, while YouTube's homepage autoplay is a distraction trap. The difference isn't the domain; it's the intent and the timing.

We wanted to build something smarter: a system that could actually understand what you're doing, what matters right now, and reshape your environment accordingly. Not by forcing rigid rules, but by learning your patterns and adapting in real time. That's when we discovered that the intersection of eye tracking, browser automation, and LLM reasoning could create something truly intelligent. For people with ADHD, or anyone struggling with focus, context-aware assistance could be life-changing.

What It Does

Artemis is an AI-powered focus orchestration system that monitors your cognitive state and automatically optimizes your workspace. It combines:

  • Eye tracking via MediaPipe and EyeTrax to detect attention patterns, blink rates, and fixation stability
  • Browser telemetry through Chrome DevTools Protocol to analyze tab content, usage patterns, and engagement scores
  • Window monitoring to track active applications and infer task context
  • LLM reasoning using Claude 3.5 to semantically analyze content and make intelligent decisions
  • Environment control for smart lights (WiZ/LIFX) and music (Spotify) synchronized to flow phases

The system operates across four cognitive phases:

  1. Calibration - Warming up, high exploration, dispersed attention
  2. Engagement - Focus forming, gaze clusters stabilizing
  3. Flow - Deep focus achieved, minimal distractions
  4. Recovery - Fatigue detected, gradual cooldown
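As a rough sketch, the phase detection above can be reduced to a few threshold rules over attention signals. The signal names and cutoffs below are illustrative assumptions for demonstration, not Artemis's actual tuning:

```python
# Illustrative phase classifier over two attention signals plus session age.
# Thresholds are placeholder assumptions, not the system's real values.

def classify_phase(gaze_dispersion: float, blink_rate: float,
                   session_minutes: float) -> str:
    """Map simple attention signals to one of the four cognitive phases.

    gaze_dispersion: 0.0 (tight fixation clusters) .. 1.0 (scattered)
    blink_rate: blinks per minute (elevated rates suggest fatigue)
    session_minutes: time since the session started
    """
    if blink_rate > 25:              # fatigue signal dominates everything
        return "recovery"
    if session_minutes < 5:          # warm-up window, high exploration
        return "calibration"
    if gaze_dispersion < 0.2:        # stable, clustered gaze
        return "flow"
    return "engagement"              # focus forming, clusters stabilizing
```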

How We Built It

Architecture

Artemis is built as a multi-layered system with distinct components:

1. FlowSync Core (Electron + React + TypeScript)

  • Desktop application with glassmorphic UI using TailwindCSS and Framer Motion
  • IPC bridge between renderer and main process for system-level operations
  • Real-time metrics dashboard showing attention analytics

2. Eye Tracking Service (Python + Node.js Bridge)

  • Python service using EyeTrax library for gaze estimation
  • JSON-RPC communication over stdin/stdout for bidirectional messaging
  • Implemented Kalman filtering and variable scaling for 50% accuracy improvement
  • Calibration quality metrics with statistical analysis
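The JSON-RPC bridge can be sketched roughly as follows: the Python service reads one request per line on stdin and writes one response per line on stdout. The method name and stub result here are illustrative, not the real service's API:

```python
# Minimal sketch of the Python side of a JSON-RPC 2.0 bridge over
# stdin/stdout. "get_gaze" and its stub result are illustrative.
import json
import sys

HANDLERS = {
    "get_gaze": lambda params: {"x": 512, "y": 384},  # stub gaze estimate
}

def handle_line(line: str) -> str:
    """Parse one JSON-RPC request line and return the response line."""
    req = json.loads(line)
    handler = HANDLERS.get(req["method"])
    if handler is None:
        resp = {"jsonrpc": "2.0", "id": req.get("id"),
                "error": {"code": -32601, "message": "Method not found"}}
    else:
        resp = {"jsonrpc": "2.0", "id": req.get("id"),
                "result": handler(req.get("params", {}))}
    return json.dumps(resp)

def serve() -> None:
    """Blocking loop: one request per line in, one response per line out."""
    for line in sys.stdin:
        if line.strip():
            print(handle_line(line), flush=True)
```

Line-delimited JSON keeps framing trivial on both sides; the Node.js process just writes a line and awaits the matching `id`.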

3. Chrome Monitor (Chrome DevTools Protocol)

  • Connects to Chrome via CDP on port 9222
  • Executes JavaScript in target tabs to extract:
    • Page content (up to 10,000 chars)
    • Semantic information (headings, topics, sentiment)
    • Technical context (frameworks, languages, code blocks)
    • Behavioral metrics (scroll position, time spent, engagement)
  • Smart tab change detection that triggers LLM updates only when needed
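Tab discovery over CDP is plain HTTP before any JavaScript runs inside a tab. A minimal sketch (field names follow Chrome's /json endpoint; the filtering choice is ours):

```python
# Sketch: discovering debuggable tabs via CDP's HTTP endpoint. Chrome
# started with --remote-debugging-port=9222 serves target metadata as
# JSON at /json.
import json
import urllib.request

def parse_tabs(payload: str) -> list:
    """Keep only real page targets (skip extensions, service workers)."""
    return [t for t in json.loads(payload) if t.get("type") == "page"]

def list_tabs(port: int = 9222) -> list:
    """Fetch the live target list from a locally running Chrome."""
    with urllib.request.urlopen(f"http://localhost:{port}/json") as resp:
        return parse_tabs(resp.read().decode())
```

Each page target carries a `webSocketDebuggerUrl`, which is what the monitor attaches to before executing extraction JavaScript.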

4. LLM Reasoning Engine (Claude 3.5 Haiku)

  • Temporal memory system that stores significant moments with semantic tags
  • Computes real metrics from actual data (no hallucinated numbers):
    • Focus stability based on window/tab dwell time
    • Distraction level from tab switching frequency
    • Task coherence from domain consistency
    • Cognitive load from tab count and complexity
  • Conservative tab filtering with explicit decision logic:
    • Preserves educational content for learning tasks
    • Keeps documentation for development tasks
    • Only hides truly irrelevant or long-unused tabs
  • Learns user patterns over time with baseline improvement tracking
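The "no hallucinated numbers" rule means metrics like these are computed in code before the LLM ever sees them. A rough sketch of two of them; the formulas and thresholds are illustrative (the production versions live in TypeScript):

```python
# Illustrative metric computations from raw telemetry. Formulas and
# constants are placeholder assumptions, not the production definitions.

def focus_stability(dwell_seconds: list) -> float:
    """Share of time spent in dwells of 60s or longer (0..1)."""
    total = sum(dwell_seconds)
    if total == 0:
        return 0.0
    return sum(d for d in dwell_seconds if d >= 60) / total

def distraction_level(tab_switches: int, window_minutes: float) -> float:
    """Tab switches per minute, normalized to 0..1 at 6 switches/min."""
    if window_minutes <= 0:
        return 0.0
    return min(1.0, tab_switches / window_minutes / 6.0)
```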

5. Environment Controllers (Python)

  • Spotify Web API integration with OAuth2 authentication
  • Smart lighting control for WiZ-compatible bulbs
  • Neuroergonomic presets based on research (cool white for focus, warm for breaks)
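WiZ bulbs accept local JSON commands over UDP (port 38899, method "setPilot"). A hedged sketch of how presets map onto that protocol; the exact color temperatures and dimming levels here are illustrative choices:

```python
# Sketch: neuroergonomic presets expressed as WiZ "setPilot" UDP payloads.
# Preset values are illustrative, not research-validated settings.
import json
import socket

PRESETS = {
    "focus": {"temp": 6500, "dimming": 100},  # cool white for alertness
    "break": {"temp": 2700, "dimming": 60},   # warm, dimmer light for rest
}

def preset_payload(name: str) -> bytes:
    """Build the JSON command a WiZ bulb expects."""
    return json.dumps({"method": "setPilot",
                       "params": PRESETS[name]}).encode()

def apply_preset(bulb_ip: str, name: str) -> None:
    """Fire-and-forget UDP command to a bulb on the local network."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(preset_payload(name), (bulb_ip, 38899))
```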

Key Technical Decisions

Why Electron? We needed desktop-level access to control Chrome, manage windows, and integrate with system APIs while maintaining a modern UI. Electron provided the perfect bridge.

Why Python for eye tracking? The EyeTrax and MediaPipe libraries are Python-native. We built a clean IPC bridge to Node.js rather than fighting the ecosystem.

Why Claude instead of GPT? Claude 3.5 Haiku offered the best balance of speed, cost, and reasoning quality for our use case. The 30-second analysis interval meant we needed fast, accurate responses.

Why Chrome DevTools Protocol? Unlike browser extensions (which can't access certain pages), CDP gives us unrestricted access to all tabs and can execute arbitrary JavaScript for deep content extraction.

Challenges We Faced

1. Gaze Tracking Accuracy

Problem: Initial calibration accuracy was poor (80-150px error), making it impossible to know what the user was actually looking at.

Solution:

  • Implemented variable scaling to weight features by variance (20-40% error reduction)
  • Added Kalman filtering for smooth trajectories (60-80% jitter reduction)
  • Enhanced sample collection from 15 to 20-30 samples per calibration point
  • Added calibration quality metrics so users know when to recalibrate

Result: Achieved 40-80px mean error with smooth, natural-feeling tracking.
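The Kalman step can be sketched as a scalar filter applied per gaze axis (variance-based feature scaling happens upstream of this stage). The noise parameters below are illustrative; tuning them trades responsiveness against smoothness:

```python
# Scalar Kalman filter (random-walk model) for one gaze axis.
# q and r values are illustrative placeholders.

class Kalman1D:
    def __init__(self, q: float = 1.0, r: float = 50.0):
        self.q = q        # process noise: how fast true gaze can move
        self.r = r        # measurement noise: sensor jitter magnitude
        self.x = None     # current estimate
        self.p = 1.0      # estimate variance

    def update(self, z: float) -> float:
        """Fold one raw measurement into the smoothed estimate."""
        if self.x is None:
            self.x = z                        # initialize on first sample
            return z
        self.p += self.q                      # predict: uncertainty grows
        k = self.p / (self.p + self.r)        # Kalman gain
        self.x += k * (z - self.x)            # correct toward measurement
        self.p *= (1.0 - k)
        return self.x
```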

2. LLM Hallucination of Metrics

Problem: When we asked Claude to analyze context and provide metrics, it would hallucinate numbers rather than computing them from real data.

Solution:

  • Pre-compute all metrics in TypeScript from actual telemetry
  • Pass computed values explicitly in the prompt
  • Constrain the LLM to interpret, not generate, numerical data
  • Added temporal evolution metrics (baseline improvement, learning curve)

Result: Real metrics that accurately reflect user behavior, with LLM providing qualitative interpretation.
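A sketch of the prompt shape that enforces this constraint; the wording is illustrative, not the production prompt:

```python
# Illustrative prompt builder: computed metrics are injected verbatim and
# the model is instructed to interpret them, never to generate numbers.
import json

def build_analysis_prompt(metrics: dict, context: str) -> str:
    return (
        "You are a focus assistant. The metrics below were computed from "
        "real telemetry. Do NOT invent or alter any number; only interpret "
        "them qualitatively.\n\n"
        f"METRICS (authoritative):\n{json.dumps(metrics, indent=2)}\n\n"
        f"CONTEXT:\n{context}\n\n"
        "Respond with a one-paragraph interpretation and one suggestion."
    )
```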

3. Over-Aggressive Tab Closing

Problem: Early versions would close valuable tabs like YouTube tutorials or documentation because they matched "distraction" patterns.

Solution:

  • Implemented semantic content analysis of actual page content
  • Added task-aware filtering that understands context:
    • Educational videos are valuable for learning tasks
    • Documentation is essential for development tasks
    • Reference materials support research tasks
  • Conservative filtering rules: "when in doubt, keep it visible"
  • Detailed evaluation per tab with explicit reasoning

Result: System preserves task-relevant content while filtering genuine distractions.
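The decision logic can be sketched as explicit rules with a conservative default. The category labels here are illustrative; the real system derives them from semantic content analysis:

```python
# Illustrative per-tab evaluation with an explicit reason for every
# decision and a "keep" default. Categories are placeholder labels.

def evaluate_tab(category: str, task: str, minutes_idle: float):
    """Return (action, reason); defaults to keeping the tab visible."""
    if category == "educational" and task == "learning":
        return "keep", "educational content supports the learning task"
    if category == "documentation" and task == "development":
        return "keep", "documentation is essential for development"
    if category == "distraction" and minutes_idle > 30:
        return "hide", "irrelevant and unused for over 30 minutes"
    return "keep", "when in doubt, keep it visible"
```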

4. Chrome DevTools Protocol Connection Issues

Problem: CDP requires Chrome to be launched with --remote-debugging-port=9222, which users often forget or configure incorrectly.

Solution:

  • Created comprehensive setup documentation with platform-specific instructions
  • Added connection testing and clear error messages
  • Implemented automatic reconnection logic
  • Built verification tools (curl http://localhost:9222/json)

Result: Smooth onboarding with clear troubleshooting steps.
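A sketch of the connection test with actionable error messages. The opener parameter is injectable so the check can be exercised without a running Chrome; CDP's /json/version endpoint reports the browser build:

```python
# Connection probe against CDP's /json/version endpoint, with a clear
# remediation hint when Chrome isn't listening.
import json
import urllib.error
import urllib.request

def check_cdp(port: int = 9222, opener=urllib.request.urlopen) -> str:
    url = f"http://localhost:{port}/json/version"
    try:
        with opener(url) as resp:
            info = json.loads(resp.read().decode())
        return f"Connected: {info.get('Browser', 'unknown browser')}"
    except OSError:
        return (f"Cannot reach Chrome on port {port}. Start it with "
                f"--remote-debugging-port={port} and retry.")
```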

5. Rate Limiting and Performance

Problem: Making an LLM API call on every tab switch or window change would be slow and expensive.

Solution:

  • Implemented intelligent caching (1-minute validity)
  • Tab change detection with hashing to avoid redundant calls
  • 10-second minimum interval between LLM updates
  • 15-second initialization delay to prevent startup spam
  • Pre-computed metrics reduce LLM workload

Result: Responsive system with minimal API costs (typically 2-4 calls per hour).
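The gating logic above can be sketched as a small state machine: a content hash skips calls when nothing meaningful changed, and a minimum interval bounds call frequency regardless:

```python
# Illustrative gate deciding whether a tab-state change warrants a new
# LLM call. Intervals mirror the ones described above.
import hashlib

class LLMGate:
    MIN_INTERVAL = 10.0   # seconds between LLM updates
    CACHE_TTL = 60.0      # cached analysis stays valid for one minute

    def __init__(self):
        self.last_call = -float("inf")
        self.last_hash = None

    def should_call(self, tab_state: str, now: float) -> bool:
        digest = hashlib.sha256(tab_state.encode()).hexdigest()
        if digest == self.last_hash and now - self.last_call < self.CACHE_TTL:
            return False                      # unchanged and still cached
        if now - self.last_call < self.MIN_INTERVAL:
            return False                      # respect the rate limit
        self.last_call, self.last_hash = now, digest
        return True
```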

6. Cross-Platform Window Monitoring

Problem: Window tracking APIs differ significantly across macOS, Windows, and Linux.

Solution:

  • Used active-win library which abstracts platform differences
  • Fallback to basic tracking when detailed info isn't available
  • Graceful degradation on permission errors

Result: Consistent experience across all platforms with platform-specific optimizations.
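The fallback chain can be sketched as a wrapper that never raises; the probe functions below are injectable stand-ins for the platform APIs (active-win on the Node side):

```python
# Illustrative graceful-degradation wrapper for window tracking: try the
# detailed probe, fall back to basic info, then to a placeholder.

def get_active_window(detailed_probe, basic_probe) -> dict:
    """Return the richest window info available without ever raising."""
    try:
        return detailed_probe()           # title, app, bounds, owner pid
    except (OSError, PermissionError):
        pass                              # e.g. macOS screen-recording perms
    try:
        return basic_probe()              # app name only
    except (OSError, PermissionError):
        return {"app": "unknown"}         # last-resort placeholder
```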

7. Temporal Memory and Learning

Problem: Each analysis was stateless, missing patterns that emerge over time.

Solution:

  • Built a temporal memory system with significance scoring
  • Track focus sessions, task transitions, and flow state changes
  • Retrieve relevant memories based on recency and importance
  • Calculate baseline improvement and learning curves
  • Semantic tagging for efficient memory retrieval

Result: System learns from past behavior and adapts recommendations over time.
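A sketch of the memory store: moments carry a significance score, and retrieval blends exponentially decayed significance with tag overlap. The decay constant and weights are illustrative:

```python
# Illustrative temporal memory with significance scoring and
# recency-plus-relevance retrieval. Weights are placeholder choices.
import math

class TemporalMemory:
    def __init__(self):
        self.moments = []   # (timestamp, significance, tags, note)

    def record(self, ts: float, significance: float, tags: set, note: str):
        self.moments.append((ts, significance, tags, note))

    def recall(self, now: float, query_tags: set, k: int = 3) -> list:
        """Top-k moments by significance decayed over elapsed hours."""
        def score(m):
            ts, sig, tags, _ = m
            hours = (now - ts) / 3600.0
            return sig * math.exp(-hours / 24.0) + len(tags & query_tags)
        return sorted(self.moments, key=score, reverse=True)[:k]
```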

What We Learned

Technical Skills

  • Advanced IPC patterns: Building robust bridges between Python and Node.js with JSON-RPC
  • Chrome automation: Deep dive into Chrome DevTools Protocol beyond basic Puppeteer usage
  • Computer vision: Implementing and optimizing gaze tracking with Kalman filters and feature scaling
  • LLM prompt engineering: Crafting prompts that produce structured, accurate outputs without hallucination
  • Real-time system design: Building responsive UIs that react to multiple async data streams
  • Semantic analysis: Extracting meaning from unstructured web content

System Design

  • Multi-modal sensor fusion: Combining data from eye tracking, browser telemetry, window monitoring, and content analysis into a coherent cognitive state model
  • Conservative automation: Building intelligent systems that preserve user agency rather than forcing rigid rules
  • Temporal awareness: Designing systems that learn and adapt based on historical patterns
  • Graceful degradation: Ensuring core functionality works even when components fail

Product Insights

  • Context is everything: The same content (YouTube, documentation, social media) can be valuable or distracting depending on the user's current task
  • Trust through transparency: Users need to understand why tabs are being closed, not just have it happen mysteriously
  • Incremental adaptation: Gentle, gradual changes are more effective than sudden dramatic interventions
  • Learning from behavior: Observing what users actually do is more valuable than what they say they want

Challenges of AI Systems

  • Hallucination mitigation: LLMs will confidently generate plausible-sounding but incorrect data if not constrained
  • Explainability: Users need to understand the system's reasoning, especially for actions that affect their work
  • Performance vs. accuracy trade-offs: Real-time responsiveness requires caching and rate limiting, which can delay perfect accuracy
  • Cold start problem: System needs time to learn user patterns before making confident recommendations

What's Next

Immediate Improvements

  • Multi-user profiles: Save calibrations and learned patterns per user
  • Enhanced calibration: Adaptive calibration that adds points in low-accuracy regions
  • Online learning: Update the gaze model continuously during tracking
  • Better visualization: Show gaze heatmaps and attention patterns over time

Advanced Features

  • Predictive focus state modeling: Anticipate when user is about to lose focus and intervene proactively
  • Calendar integration: Prepare environment for upcoming tasks based on schedule
  • Collaborative focus sessions: Shared focus time with friends or team members
  • Mobile integration: Extend environment control to phone notifications and apps
  • Voice control: Hands-free commands for common actions

Research Directions

  • Personalized flow triggers: Learn individual patterns that lead to deep focus
  • Distraction prediction: Identify early warning signs before attention breaks
  • Cognitive load optimization: Dynamically adjust task complexity based on current capacity
  • Long-term pattern analysis: Track productivity patterns over weeks and months

Conclusion

Artemis represents a new approach to productivity: instead of blocking or restricting, it understands context and adapts intelligently. By combining eye tracking, browser automation, and AI reasoning with a deep respect for user agency, we've built a system that genuinely helps maintain focus without getting in the way.

The journey taught us that building AI-powered systems requires careful attention to hallucination, explainability, and performance. Real-time cognitive state detection is possible with consumer hardware, but it requires thoughtful integration of multiple data sources and conservative decision-making.

Most importantly, we learned that the best productivity tools are the ones you forget are running - they just make your environment feel naturally conducive to focus. That's the ultimate goal of Artemis: invisible assistance that lets you do your best work.
