A real-time GPU cluster monitoring and optimization platform with integrated AI assistant.
Modern AI data centers are the backbone of innovation — but also major energy consumers:
- GPUs burn massive amounts of power, even when idle
- Cooling systems often overcompensate, wasting additional electricity and water
- Inefficient workload scheduling can waste 40% or more of a cluster's energy
- By 2030, data centers could consume as much power as entire countries
ThermaMind introduces an AI-driven orchestration layer that continuously monitors and optimizes compute and cooling efficiency in real time.
- 32 GPU Nodes organized into 4 physical clusters (A, B, C, D)
- Live WebSocket connection updating every 2 seconds
- Heatmap visualization showing GPU load, temperature, and cooling efficiency
- Time-series charts tracking energy usage, cooling efficiency, and workload distribution
- Generates synthetic data for GPU load, temperature, and cooling metrics across all clusters
- AI-optimized cooling: 70% of nodes use predictive cooling (GPU load + 5%)
- Legacy systems: 30% of nodes still catching up from over-cooling issues
- Dynamic status indicators: Active, Idle, or Optimizing based on real metrics
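The simulator rules above (predictive cooling tracking GPU load + 5%, legacy nodes over-cooling, status derived from metrics) could be sketched as follows. The type and function names (`NodeReading`, `makeNodeReading`) and the exact thresholds are illustrative assumptions, not the actual `simulator.js` API:

```typescript
// Hypothetical sketch of synthetic per-node telemetry. The "load + 5%"
// predictive-cooling rule comes from the README; everything else
// (names, thresholds, temperature model) is an illustrative assumption.
type NodeReading = {
  nodeId: string;
  cluster: "A" | "B" | "C" | "D";
  gpuLoadPct: number; // 0-100
  tempC: number;
  coolingPct: number; // cooling effort, 0-100
  status: "Active" | "Idle" | "Optimizing";
};

function makeNodeReading(
  nodeId: string,
  cluster: NodeReading["cluster"],
  predictive: boolean
): NodeReading {
  const gpuLoadPct = Math.round(Math.random() * 100);
  // Predictive nodes cool just ahead of load; legacy nodes over-cool.
  const coolingPct = predictive
    ? Math.min(100, gpuLoadPct + 5)
    : Math.min(100, gpuLoadPct + 20 + Math.random() * 15);
  // Simple linear model: load heats the GPU, excess cooling pulls it down.
  const tempC = 30 + gpuLoadPct * 0.5 - (coolingPct - gpuLoadPct) * 0.2;
  const status: NodeReading["status"] =
    gpuLoadPct < 10 ? "Idle" : predictive ? "Active" : "Optimizing";
  return { nodeId, cluster, gpuLoadPct, tempC, coolingPct, status };
}
```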
- Predictive node management: Automatically determines when to power down idle nodes and adjust cooling dynamically
- Real-time metrics visualization: Energy savings, CO₂ offset, power usage
- PUE (Power Usage Effectiveness) tracking for efficiency monitoring
- Live performance indicators showing the environmental impact of optimization decisions
- Actionable insights for reducing data center carbon footprint
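PUE has a standard definition: total facility power divided by IT equipment power. A minimal sketch of the calculation (the power breakdown in the example is illustrative, not real cluster data):

```typescript
// PUE (Power Usage Effectiveness) = total facility power / IT equipment power.
// A PUE of 1.0 means every watt goes to compute; typical data centers run ~1.2-2.0.
function pue(itPowerKw: number, coolingPowerKw: number, overheadKw: number): number {
  const totalKw = itPowerKw + coolingPowerKw + overheadKw;
  return totalKw / itPowerKw;
}

// Illustrative example: 320 kW of GPUs, 96 kW of cooling, 16 kW of other overhead.
pue(320, 96, 16); // (320 + 96 + 16) / 320 = 1.35
```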
- Continuous telemetry streams stored in JSONL format (`data_logs/telemetry_log.jsonl`)
- Ready for future analysis or ML model fine-tuning
- Compact data summaries: Aggregates raw telemetry into hourly insights
- Context-efficient logs optimized for LLM prompts and downstream processing
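Aggregating raw JSONL telemetry into compact summaries might look like the sketch below. The `Sample` shape and `hourlySummary` helper are hypothetical, not the project's actual code:

```typescript
// Hypothetical sketch: collapse raw JSONL telemetry lines into one compact
// summary per cluster, keeping LLM prompts small. Field names are assumptions.
type Sample = { ts: string; cluster: string; gpuLoadPct: number; powerKw: number };
type Summary = { samples: number; avgLoad: number; avgPowerKw: number };

function hourlySummary(jsonl: string): Record<string, Summary> {
  const out: Record<string, Summary> = {};
  for (const line of jsonl.trim().split("\n")) {
    const s: Sample = JSON.parse(line);
    let agg = out[s.cluster];
    if (!agg) agg = out[s.cluster] = { samples: 0, avgLoad: 0, avgPowerKw: 0 };
    // Running mean: single pass over the log, flat memory use.
    agg.samples += 1;
    agg.avgLoad += (s.gpuLoadPct - agg.avgLoad) / agg.samples;
    agg.avgPowerKw += (s.powerKw - agg.avgPowerKw) / agg.samples;
  }
  return out;
}
```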
- Real-time analysis: Ask the AI to analyze current cluster status
- Question answering: Get insights about power consumption, efficiency issues, and workload recommendations
- Text-to-Speech: AI responses are spoken using ElevenLabs voice synthesis
- Context-aware: The AI has access to live telemetry data, not canned responses
- Node.js 18+
- npm or yarn
- Gemini API Key (for AI assistant)
- ElevenLabs API Key (for voice synthesis)
- Clone the repository:

  ```bash
  git clone <repo-url>
  cd codeRedAstra
  ```

- Install backend dependencies:

  ```bash
  cd backend
  npm install
  ```

- Install frontend dependencies:

  ```bash
  cd ../frontend
  npm install
  ```

- Configure environment variables:

  ```bash
  # Copy the example file
  cd ../backend
  cp .env.example .env
  # Edit .env and add your API keys:
  # GEMINI_API_KEY=your_key_here
  # ELEVENLABS_API_KEY=your_key_here
  ```

- Start the backend server:

  ```bash
  cd backend
  node src/server.js
  ```

  The backend will run on http://localhost:8080

- In a new terminal, start the frontend:

  ```bash
  cd frontend
  npm run dev
  ```

  The frontend will run on http://localhost:5173

- Open your browser to http://localhost:5173
- Click the 🤖 floating button in the bottom-right corner
- Click "📊 Analyze Current Status"
- Wait for the AI to analyze real-time telemetry
- Read or listen to the AI's insights
Example questions you can ask:
- "Which cluster should I run my next job on?"
- "Why is my power consumption high?"
- "Are there any efficiency issues?"
- "What's happening with Cluster C?"
- "Should I be worried about the cooling on Cluster A?"
- Gemini API: ~$0.001-0.005 per request (flash model)
- ElevenLabs TTS: ~$0.01 per 1,000 characters (~$0.005 per response)
- Total: ~$0.006-0.015 per AI interaction with voice

Note: You can disable TTS by removing `withAudio: true` from the WebSocket messages in `AIAssistant.tsx` to reduce costs.
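The cost-saving toggle might look like the sketch below. The exact message shape sent by `AIAssistant.tsx` is an assumption here; only the `withAudio` flag is taken from the note above:

```typescript
// Hypothetical shape of the WebSocket message AIAssistant.tsx sends;
// the real fields may differ. Dropping withAudio skips the ElevenLabs call.
type AssistantRequest = {
  type: "analyze" | "question";
  text?: string;
  withAudio?: boolean;
};

const withVoice: AssistantRequest = {
  type: "question",
  text: "Why is my power consumption high?",
  withAudio: true,
};

// Text-only variant: cheaper, since no TTS characters are billed.
const textOnly: AssistantRequest = { ...withVoice, withAudio: false };
// ws.send(JSON.stringify(textOnly));
```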
- Express server: REST API and static file serving
- WebSocket server: Real-time telemetry broadcast (2-second intervals)
- Simulator: Generates realistic GPU workload and cooling data
- AI Assistant: Gemini integration for telemetry analysis
- React + TypeScript: Modern component-based UI
- Chart.js: Time-series visualization
- TailwindCSS: Responsive styling
- WebSocket hook: Real-time data connection
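Inside a hook like `useTelemetry.ts`, each incoming frame needs to be parsed defensively before updating state. The message shape and helper below are assumptions, not the project's real code; the hook would call something like this from `onmessage` and only set state on a non-null result:

```typescript
// Hypothetical sketch of per-frame parsing for the telemetry hook.
// The Telemetry shape is assumed, not the actual wire format.
type Telemetry = {
  clusters: Record<string, { avgLoad: number; avgTempC: number }>;
  ts: number;
};

function parseTelemetryFrame(raw: string): Telemetry | null {
  try {
    const msg = JSON.parse(raw);
    // Ignore non-telemetry frames (e.g. AI assistant replies on the same socket).
    if (!msg || typeof msg !== "object" || !msg.clusters) return null;
    return msg as Telemetry;
  } catch {
    return null; // malformed frame: keep the last good state
  }
}
```

Keeping the parse pure like this makes the 2-second update path easy to unit-test without a live socket.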
```
Simulator → WebSocket Server → Frontend Components
                  ↓
         AI Assistant (on demand)
                  ↓
        Gemini API → ElevenLabs TTS
```
```
backend/
  src/
    server.js         # Express + WebSocket server
    simulator.js      # GPU telemetry generator
    webSocket.js      # WebSocket handlers
    aiAssistant.js    # Gemini + ElevenLabs integration
frontend/
  src/
    App.tsx           # Main dashboard
    hooks/
      useTelemetry.ts # WebSocket connection hook
    components/
      AIAssistant.tsx # AI chat interface
```
MIT