Reduce your AI carbon footprint with smart model routing, advanced token pruning, and real-time carbon tracking.
```
cd web-app
cp config.example.js config.js
```

Edit `config.js` and add your API keys, then start a local server:

```
python -m http.server 8080
```

Go to: http://localhost:8080
Green-Pipe analyzes your prompt and routes it to the most carbon-efficient model:
| Routed to EFFICIENT | Routed to POWERFUL |
|---|---|
| Simple Q&A | Code generation |
| Definitions | Detailed analysis |
| Math calculations | Creative writing |
| Short factual | Long complex |
Efficient Model: Llama 3.1 8B (~0.01g CO₂/query)
Powerful Model: Llama 3.3 70B (~0.12g CO₂/query)
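A minimal sketch of how this complexity-based routing could look (the heuristics, thresholds, and model identifiers below are illustrative, not the actual `arbitrator.js` logic):

```javascript
// Illustrative routing heuristic; the real arbitrator may use different signals.
const COMPLEX_HINTS = /\b(write|generate|analyze|create|refactor|story|essay)\b/i;

function routeQuery(prompt) {
  const wordCount = prompt.trim().split(/\s+/).length;
  // Long prompts or "creative/analytical" verbs suggest a complex task.
  const looksComplex = COMPLEX_HINTS.test(prompt) || wordCount > 60;
  return looksComplex
    ? { model: "llama-3.3-70b", gramsCO2: 0.12 } // powerful
    : { model: "llama-3.1-8b", gramsCO2: 0.01 }; // efficient
}

console.log(routeQuery("What is photosynthesis?").model); // llama-3.1-8b
console.log(routeQuery("Write a detailed analysis of LLM trends").model); // llama-3.3-70b
```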
Green Boost: When your local grid is clean (high renewable share or low carbon intensity), the app automatically unlocks the powerful model for any query, guilt-free.
High Carbon Warning: If the grid is dirty and you request a complex task, you’ll be prompted to wait, schedule for later, or proceed anyway.
The Green-Pipe architecture is designed for extensibility and future appliance scenarios:
- Default Routing: The system can be configured to always attempt to run queries on a local model (e.g., via Ollama or other on-device LLMs) for maximum privacy, speed, and zero network carbon cost.
- Fallback Logic: If the local model cannot handle the query (due to complexity, resource limits, or lack of capability), the system can automatically escalate to a more powerful local or cloud model.
- Benefits:
- Keeps data private and on-premises
- Reduces latency and network energy use
- Enables offline or low-connectivity operation
- Seamless Escalation: For queries that exceed local model capabilities, Green-Pipe can route requests to cloud-based LLMs (e.g., Groq, OpenAI, or other providers), factoring in real-time grid carbon intensity and user preferences.
- Smart Selection: The router can be expanded to select among multiple cloud providers based on carbon intensity, cost, or performance.
- Hybrid Workflows: Future versions may support hybrid execution, where part of a workflow runs locally and part in the cloud, optimizing for both carbon and capability.
This flexible architecture allows organizations to start with local, low-carbon inference and expand to cloud-based AI as needed, always with carbon awareness and user control.
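A sketch of that local-first escalation flow (the client interface and the carbon threshold below are hypothetical, not the project's actual `ollamaClient.js` API):

```javascript
// Hypothetical escalation flow: try the local model first, escalate only when
// it cannot handle the query, and gate cloud use on grid carbon intensity.
async function answer(prompt, { localClient, cloudClient, gridIntensity }) {
  try {
    const local = await localClient.generate(prompt); // e.g. Ollama on-device
    if (local.confident) return { ...local, source: "local" };
  } catch (err) {
    // Local model unavailable or over capacity; fall through to the cloud path.
  }
  if (gridIntensity > 400) {
    // Illustrative "dirty grid" threshold (gCO2/kWh): confirm before escalating.
    return { needsConfirmation: true, reason: "high-carbon-grid" };
  }
  const cloud = await cloudClient.generate(prompt);
  return { ...cloud, source: "cloud" };
}
```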
Green-Pipe uses both rule-based and AI-powered (Transformers.js) pruning to reduce token count before sending your prompt:
- Removes greetings, closings, and filler words
- Converts embedded JSON to a compact TOON format
- Optionally prunes semantically unimportant words (if AI pruning is enabled)
Animated visualizations show you exactly what was pruned and how many tokens were saved.
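The rule-based pass might look something like this (the filler patterns are illustrative examples, not the app's actual rule set):

```javascript
// Illustrative rule-based pruning; the real pattern list may differ.
const FILLER = [
  /^\s*(hi|hello|hey)[,!.\s]+/i,                        // greetings
  /\b(please|kindly|just|basically|actually)\b\s*/gi,   // filler words
  /\s*(thanks( in advance)?|thank you)[,!.\s]*$/i,      // closings
];

function prunePrompt(prompt) {
  let pruned = prompt;
  for (const pattern of FILLER) pruned = pruned.replace(pattern, " ");
  pruned = pruned.replace(/\s+/g, " ").trim();
  // Rough token count by whitespace splitting, for the savings display.
  const saved = prompt.split(/\s+/).length - pruned.split(/\s+/).length;
  return { pruned, tokensSaved: Math.max(saved, 0) };
}

console.log(prunePrompt("Hello, please explain recursion. Thanks!").pruned);
// "explain recursion."
```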
- Live Grid Intensity: Powered by Electricity Maps API, updates every 5 minutes
- Per-Query CO₂: Calculated using actual grid data and model energy specs
- Green Boost: Automatically enables powerful model when grid is clean
- CO₂ Forecast & Scheduling: If the grid is dirty, you can schedule your query for the next green window
- Session Stats: Tracks tokens saved, CO₂ saved, and energy saved in real time
Green-Pipe provides transparent, real-time calculations for every query and session:

CO₂ (g) = Energy (kWh) × Grid Intensity (gCO₂/kWh)

Where:
- Energy (kWh) is estimated from the number of tokens, model specs, and datacenter PUE.
- Grid Intensity is live from Electricity Maps.
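Under those definitions, a per-query estimate can be sketched as follows (the per-token energy and PUE values are placeholders, not the app's measured specs):

```javascript
// Illustrative per-query CO2 estimate; energy-per-token and PUE are assumptions.
function estimateCO2({ tokens, whPerToken, pue, gridIntensity }) {
  // Energy in kWh: tokens × per-token energy (Wh), scaled by datacenter PUE.
  const energyKWh = (tokens * whPerToken * pue) / 1000;
  // Grams of CO2: energy times the live grid intensity (gCO2/kWh).
  return energyKWh * gridIntensity;
}

// Example: 500 tokens at 0.002 Wh/token, PUE 1.2, grid at 300 gCO2/kWh.
const grams = estimateCO2({ tokens: 500, whPerToken: 0.002, pue: 1.2, gridIntensity: 300 });
// 500 × 0.002 × 1.2 / 1000 = 0.0012 kWh → 0.0012 × 300 ≈ 0.36 g
```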
Token pruning removes unnecessary tokens before the prompt is sent to the LLM, then calculates the energy and CO₂ saved by the reduced token count.
Model comparison shows the difference in CO₂ between the efficient and powerful models for the same query.
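Using the per-query estimates quoted earlier (~0.01 g for the efficient model, ~0.12 g for the powerful one), this comparison reduces to:

```javascript
// Per-query routing savings: powerful-model CO2 minus efficient-model CO2.
// The gram figures are the README's per-query estimates, not live measurements.
const GRAMS_PER_QUERY = { powerful: 0.12, efficient: 0.01 };

function routingSavings(chosenModel) {
  // Savings accrue only when the router picked the efficient model.
  return chosenModel === "efficient"
    ? GRAMS_PER_QUERY.powerful - GRAMS_PER_QUERY.efficient
    : 0;
}
```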
Session stats track and display:
- Total tokens processed
- Tokens pruned
- Energy saved
- CO₂ saved (routing + pruning)
Query scheduling calculates the potential CO₂ savings from waiting for a lower-carbon grid window, using live or forecasted grid intensity.
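Assuming a simple hourly forecast shape (an assumption for illustration, not the Electricity Maps response format), that estimate could be sketched as:

```javascript
// Illustrative scheduling-savings calculation against a forecast of grid intensity.
function schedulingSavings(energyKWh, nowIntensity, forecast) {
  // forecast: array of { hour, intensity } in gCO2/kWh for upcoming hours.
  const best = forecast.reduce((a, b) => (b.intensity < a.intensity ? b : a));
  // Savings = same energy, multiplied by the drop in grid intensity.
  const savedGrams = energyKWh * (nowIntensity - best.intensity);
  return { bestHour: best.hour, savedGrams: Math.max(savedGrams, 0) };
}

const plan = schedulingSavings(0.001, 400, [
  { hour: 1, intensity: 350 },
  { hour: 2, intensity: 200 },
  { hour: 3, intensity: 300 },
]);
// Best window is hour 2; saves 0.001 × (400 − 200) = 0.2 g
```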
All calculations are shown in the UI, with breakdowns for each query and session.
```
web-app/
├── index.html            # Main UI
├── app.js                # Core application logic
├── config.js             # API keys (gitignored)
├── config.example.js     # API keys template
└── llmClients/
    └── ollamaClient.js   # LLM client abstraction

extension/
├── arbitrator.js         # Model routing logic
├── background.js         # Service worker
├── content.js            # Content script
└── manifest.json         # Chrome extension manifest
```
This project uses two free APIs:

- **Groq API**: fast LLM inference (free tier: 30 requests/minute)
- **Electricity Maps API**: real-time carbon intensity (free tier: 100 requests/day)
- Smart model routing based on query complexity and live grid data
- Green Boost: unlocks powerful model when grid is clean
- Token pruning with animated and AI-powered visualization
- Live carbon intensity and CO₂ forecast
- Per-query and session CO₂/energy stats
- Query scheduling for optimal carbon savings
- Real-time CO₂ savings display
MIT License