A real-time GPU cluster monitoring and optimization platform with integrated AI assistant.
Modern AI data centers are the backbone of innovation — but also major energy consumers:
- GPUs burn massive amounts of power, even when idle
- Cooling systems often overcompensate, wasting additional electricity and water
- Inefficient workload scheduling can waste 40% or more of a cluster's energy
- By 2030, data centers could consume as much power as entire countries
ThermaMind introduces an AI-driven orchestration layer that continuously monitors and optimizes compute and cooling efficiency in real time.
- 32 GPU Nodes organized into 4 physical clusters (A, B, C, D)
- Live WebSocket connection updating every 2 seconds
- Heatmap visualization showing GPU load, temperature, and cooling efficiency
- Time-series charts tracking energy usage, cooling efficiency, and workload distribution
- Generates synthetic data for GPU load, temperature, and cooling metrics across all clusters
- AI-optimized cooling: 70% of nodes use predictive cooling (GPU load + 5%)
- Legacy systems: 30% of nodes still catching up from over-cooling issues
- Dynamic status indicators: Active, Idle, or Optimizing based on real metrics
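The simulator rules above (predictive cooling tracking GPU load + 5%, legacy nodes over-cooling, status derived from metrics) could be sketched as follows. The type and function names (`NodeReading`, `makeNodeReading`) and the exact thresholds are illustrative assumptions, not the actual `simulator.js` API:

```typescript
// Hypothetical sketch of synthetic per-node telemetry. The "load + 5%"
// predictive-cooling rule comes from the README; everything else
// (names, thresholds, temperature model) is an illustrative assumption.
type NodeReading = {
  nodeId: string;
  cluster: "A" | "B" | "C" | "D";
  gpuLoadPct: number; // 0-100
  tempC: number;
  coolingPct: number; // cooling effort, 0-100
  status: "Active" | "Idle" | "Optimizing";
};

function makeNodeReading(
  nodeId: string,
  cluster: NodeReading["cluster"],
  predictive: boolean
): NodeReading {
  const gpuLoadPct = Math.round(Math.random() * 100);
  // Predictive nodes cool just ahead of load; legacy nodes over-cool.
  const coolingPct = predictive
    ? Math.min(100, gpuLoadPct + 5)
    : Math.min(100, gpuLoadPct + 20 + Math.random() * 15);
  // Simple linear model: load heats the GPU, excess cooling pulls it down.
  const tempC = 30 + gpuLoadPct * 0.5 - (coolingPct - gpuLoadPct) * 0.2;
  const status: NodeReading["status"] =
    gpuLoadPct < 10 ? "Idle" : predictive ? "Active" : "Optimizing";
  return { nodeId, cluster, gpuLoadPct, tempC, coolingPct, status };
}
```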
- Predictive node management: Automatically determines when to power down idle nodes and adjust cooling dynamically
- Real-time metrics visualization: Energy savings, CO₂ offset, power usage
- PUE (Power Usage Effectiveness) tracking for efficiency monitoring
- Live performance indicators showing the environmental impact of optimization decisions
- Actionable insights for reducing data center carbon footprint
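PUE has a standard definition: total facility power divided by IT equipment power. A minimal sketch of the calculation (the power breakdown in the example is illustrative, not real cluster data):

```typescript
// PUE (Power Usage Effectiveness) = total facility power / IT equipment power.
// A PUE of 1.0 means every watt goes to compute; typical data centers run ~1.2-2.0.
function pue(itPowerKw: number, coolingPowerKw: number, overheadKw: number): number {
  const totalKw = itPowerKw + coolingPowerKw + overheadKw;
  return totalKw / itPowerKw;
}

// Illustrative example: 320 kW of GPUs, 96 kW of cooling, 16 kW of other overhead.
pue(320, 96, 16); // (320 + 96 + 16) / 320 = 1.35
```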
- Continuous telemetry streams stored in JSONL format (`data_logs/telemetry_log.jsonl`)
- Ready for future analysis or ML model fine-tuning
- Compact data summaries: Aggregates raw telemetry into hourly insights
- Context-efficient logs optimized for LLM prompts and downstream processing
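Aggregating raw JSONL telemetry into compact summaries might look like the sketch below. The `Sample` shape and `hourlySummary` helper are hypothetical, not the project's actual code:

```typescript
// Hypothetical sketch: collapse raw JSONL telemetry lines into one compact
// summary per cluster, keeping LLM prompts small. Field names are assumptions.
type Sample = { ts: string; cluster: string; gpuLoadPct: number; powerKw: number };
type Summary = { samples: number; avgLoad: number; avgPowerKw: number };

function hourlySummary(jsonl: string): Record<string, Summary> {
  const out: Record<string, Summary> = {};
  for (const line of jsonl.trim().split("\n")) {
    const s: Sample = JSON.parse(line);
    let agg = out[s.cluster];
    if (!agg) agg = out[s.cluster] = { samples: 0, avgLoad: 0, avgPowerKw: 0 };
    // Running mean: single pass over the log, flat memory use.
    agg.samples += 1;
    agg.avgLoad += (s.gpuLoadPct - agg.avgLoad) / agg.samples;
    agg.avgPowerKw += (s.powerKw - agg.avgPowerKw) / agg.samples;
  }
  return out;
}
```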
- Real-time analysis: Ask the AI to analyze current cluster status
- Question answering: Get insights about power consumption, efficiency issues, and workload recommendations
- Text-to-Speech: AI responses are spoken using ElevenLabs voice synthesis
- Context-aware: The AI has access to live telemetry data, not canned responses
- Node.js 18+
- npm or yarn
- Gemini API Key (for AI assistant)
- ElevenLabs API Key (for voice synthesis)
- Clone the repository:

  ```bash
  git clone <repo-url>
  cd codeRedAstra
  ```

- Install backend dependencies:

  ```bash
  cd backend
  npm install
  ```

- Install frontend dependencies:

  ```bash
  cd ../frontend
  npm install
  ```

- Configure environment variables:

  ```bash
  # Copy the example file
  cd ../backend
  cp .env.example .env
  # Edit .env and add your API keys:
  # GEMINI_API_KEY=your_key_here
  # ELEVENLABS_API_KEY=your_key_here
  ```

- Start the backend server:

  ```bash
  cd backend
  node src/server.js
  ```

  The backend will run on http://localhost:8080

- In a new terminal, start the frontend:

  ```bash
  cd frontend
  npm run dev
  ```

  The frontend will run on http://localhost:5173

- Open your browser to http://localhost:5173
- Click the 🤖 floating button in the bottom-right corner
- Click "📊 Analyze Current Status"
- Wait for the AI to analyze real-time telemetry
- Read or listen to the AI's insights
Example questions you can ask:
- "Which cluster should I run my next job on?"
- "Why is my power consumption high?"
- "Are there any efficiency issues?"
- "What's happening with Cluster C?"
- "Should I be worried about the cooling on Cluster A?"
- Gemini API: ~$0.001-0.005 per request (flash model)
- ElevenLabs TTS: ~$0.01 per 1,000 characters (~$0.005 per response)
- Total: ~$0.006-0.015 per AI interaction with voice

Note: You can disable TTS by removing `withAudio: true` from the WebSocket messages in `AIAssistant.tsx` to reduce costs.
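The cost-saving toggle might look like the sketch below. The exact message shape sent by `AIAssistant.tsx` is an assumption here; only the `withAudio` flag is taken from the note above:

```typescript
// Hypothetical shape of the WebSocket message AIAssistant.tsx sends;
// the real fields may differ. Dropping withAudio skips the ElevenLabs call.
type AssistantRequest = {
  type: "analyze" | "question";
  text?: string;
  withAudio?: boolean;
};

const withVoice: AssistantRequest = {
  type: "question",
  text: "Why is my power consumption high?",
  withAudio: true,
};

// Text-only variant: cheaper, since no TTS characters are billed.
const textOnly: AssistantRequest = { ...withVoice, withAudio: false };
// ws.send(JSON.stringify(textOnly));
```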
- Express server: REST API and static file serving
- WebSocket server: Real-time telemetry broadcast (2-second intervals)
- Simulator: Generates realistic GPU workload and cooling data
- AI Assistant: Gemini integration for telemetry analysis
- React + TypeScript: Modern component-based UI
- Chart.js: Time-series visualization
- TailwindCSS: Responsive styling
- WebSocket hook: Real-time data connection
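Inside a hook like `useTelemetry.ts`, each incoming frame needs to be parsed defensively before updating state. The message shape and helper below are assumptions, not the project's real code; the hook would call something like this from `onmessage` and only set state on a non-null result:

```typescript
// Hypothetical sketch of per-frame parsing for the telemetry hook.
// The Telemetry shape is assumed, not the actual wire format.
type Telemetry = {
  clusters: Record<string, { avgLoad: number; avgTempC: number }>;
  ts: number;
};

function parseTelemetryFrame(raw: string): Telemetry | null {
  try {
    const msg = JSON.parse(raw);
    // Ignore non-telemetry frames (e.g. AI assistant replies on the same socket).
    if (!msg || typeof msg !== "object" || !msg.clusters) return null;
    return msg as Telemetry;
  } catch {
    return null; // malformed frame: keep the last good state
  }
}
```

Keeping the parse pure like this makes the 2-second update path easy to unit-test without a live socket.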
```
Simulator → WebSocket Server → Frontend Components
                  ↓
         AI Assistant (on demand)
                  ↓
        Gemini API → ElevenLabs TTS
```
```
backend/
  src/
    server.js         # Express + WebSocket server
    simulator.js      # GPU telemetry generator
    webSocket.js      # WebSocket handlers
    aiAssistant.js    # Gemini + ElevenLabs integration
frontend/
  src/
    App.tsx           # Main dashboard
    hooks/
      useTelemetry.ts # WebSocket connection hook
    components/
      AIAssistant.tsx # AI chat interface
```
MIT