Inspiration
I’m a student who basically lives in ChatGPT and Gemini. I always assumed using these tools was "free" environmentally, but I recently read that a standard conversation can consume up to 500mL of water in data center cooling.
That number seemed high, so I dug into the technicals. I realized most people (myself included) misunderstand how LLMs work. It’s not just the new text you generate that costs energy; it’s the Context Window. Every time I send a message, the model re-processes my entire history.
I realized I was wasting resources by keeping massive, stale threads open. I built Token Ocean because I wanted to see the actual numbers for myself in real-time.
What it does
Token Ocean is a Chrome Extension that audits your ChatGPT and Gemini sessions.
- Tracks the Context: It scrapes the DOM to count the total tokens in your current thread (history + new response).
- Calculates the Cost: It applies a weighted formula to estimate water (mL) and energy (Wh) usage, distinguishing between the "Prefill" (reading history) and "Decoding" (generating text) phases.
- Visualizes Impact: It shows a simple water bottle filling up. When your context gets too heavy, it warns you that starting a new chat would be more efficient.
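The bottle metaphor above maps estimated water use onto a fill level. A minimal sketch of that logic, where the 500 mL bottle size comes from the figure cited earlier but the threshold and function names are my own illustrative assumptions:

```javascript
// A standard 500 mL bottle fills as the session's estimated water use grows.
const BOTTLE_ML = 500;

function bottleFillPercent(sessionMl) {
  // Cap at 100% once the session has "drunk" a full bottle.
  return Math.min(100, (sessionMl / BOTTLE_ML) * 100);
}

function shouldSuggestNewChat(contextTokens, threshold = 20000) {
  // Assumed cutoff: past this much context, a fresh thread is cheaper
  // than re-prefilling the whole history on every message.
  return contextTokens > threshold;
}
```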
How I built it
I built it as a Manifest V3 extension with no external dependencies.
- The Listener: I used a `MutationObserver` to watch the chat container. Since AI responses stream character-by-character, I had to implement aggressive debouncing to stop the extension from recalculating 50 times a second and freezing the browser.
- The Scraper: I reverse-engineered the obfuscated class names on ChatGPT and Gemini to reliably extract text content without an API.
- The Math: I used `chrome.storage.local` to maintain a running state. The core logic calculates `(Context_Tokens * Prefill_Cost) + (Output_Tokens * Decode_Cost)` to get a realistic energy estimate, rather than just using a flat "per query" average.
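A minimal sketch of how those three pieces could fit together. The per-token coefficients, the `main` selector, and the ~4-characters-per-token heuristic are illustrative assumptions, not the extension's actual values:

```javascript
// Coalesce rapid DOM mutations: only recalculate after the stream pauses.
function debounce(fn, waitMs) {
  let timer = null;
  return (...args) => {
    clearTimeout(timer);
    timer = setTimeout(() => fn(...args), waitMs);
  };
}

// The core formula: prefill cost for the whole context, decode cost for the
// newly generated tokens. Coefficients here are illustrative placeholders.
function estimateWh(contextTokens, outputTokens, prefillWh = 0.00002, decodeWh = 0.0003) {
  return contextTokens * prefillWh + outputTokens * decodeWh;
}

// Wiring (browser-only, so guarded): watch the chat container and persist
// the running token count to chrome.storage.local.
if (typeof MutationObserver !== 'undefined') {
  const recalc = debounce(() => {
    const text = document.querySelector('main')?.innerText ?? ''; // assumed container
    const tokens = Math.ceil(text.length / 4); // crude ~4 chars/token heuristic
    chrome.storage.local.set({ contextTokens: tokens });
  }, 250);
  new MutationObserver(recalc).observe(document.body, {
    childList: true,
    subtree: true,
    characterData: true,
  });
}
```

Debouncing at ~250 ms means the observer fires once per pause in the stream instead of once per character, which is what keeps the page responsive.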
Challenges I ran into
- Streaming UI: Handling the streaming text was the biggest headache. The DOM updates constantly during generation, which messed up my token counts. I had to write logic to detect when the "Stop Generating" button disappeared to know when to run the final calculation.
- Different Architectures: ChatGPT uses a completely different DOM structure than Gemini. I had to write modular content scripts to handle each site separately while feeding into a shared calculation engine.
- Context Inflation: I initially had a bug where I only counted the text on screen. I had to rewrite the logic to cumulatively sum previous turns to accurately model how the LLM backend actually works.
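The context-inflation fix boils down to summing every earlier turn into the prefill cost of the current one. A sketch of that cumulative model, plus the stream-end check (the selector is an assumption, since the real class names are obfuscated and change often):

```javascript
// Each message in the thread is re-read (prefilled) on every subsequent turn,
// so the true cost of turn N includes all tokens from turns 1..N-1.
function cumulativePrefillTokens(turnTokenCounts) {
  let context = 0;      // tokens accumulated so far in the thread
  let totalPrefill = 0; // total tokens the backend has re-processed
  for (const tokens of turnTokenCounts) {
    totalPrefill += context; // everything before this turn is re-read
    context += tokens;       // then this turn joins the context
  }
  return totalPrefill;
}

// Browser-only: generation is finished once the stop control disappears.
// The aria-label match is an assumed selector, not the one the extension uses.
function streamFinished() {
  return !document.querySelector('button[aria-label*="Stop"]');
}
```

With three 100-token turns, the on-screen count is 300 tokens, but the backend has re-prefilled 0 + 100 + 200 = 300 *additional* tokens of history, which is exactly the inflation the original on-screen-only logic missed.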
Accomplishments that I'm proud of
- Privacy-First: The extension is 100% client-side. I don't send user data to any server. It’s a pure local auditor.
- It actually works: Seeing the water bottle fill up during a long debugging session was a legit wake-up call for me. It changed my own behavior immediately.
- Cross-Platform: Getting it to work reliably on both ChatGPT and Gemini's current UIs.
What I learned
- Length > Frequency: I learned that a short question at the end of a long thread is significantly more expensive than a long question in a new thread. The context window is the main driver of cost.
- Manifest V3 Limits: I learned a lot about the constraints of modern Chrome extensions, specifically around service workers and persistent state.
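The "length beats frequency" lesson is easy to verify with arithmetic. A worked example comparing ten questions asked in one growing thread against ten fresh chats (question/answer sizes are made up for illustration):

```javascript
// Total tokens prefilled across a thread of `turns` question/answer rounds.
function threadPrefillTokens(turns, questionTokens, answerTokens) {
  let context = 0, prefill = 0;
  for (let i = 0; i < turns; i++) {
    prefill += context + questionTokens; // re-read history + the new question
    context += questionTokens + answerTokens; // Q and A both join the context
  }
  return prefill;
}

const oneLongThread = threadPrefillTokens(10, 50, 200); // 11750 tokens prefilled
const tenFreshChats = 10 * threadPrefillTokens(1, 50, 200); // 500 tokens prefilled
```

Same ten questions, but the single long thread prefills over 23x more tokens, because every 250-token round gets re-read by every round after it.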
What's next for Token Ocean
- More Accurate Tokenization & Water Calculation: Adding dynamic tokenization and water-usage calculations that adjust to the specific model version being used (e.g. GPT-4 vs GPT-3).
- More Models: Adding support for Claude and DeepSeek.
- Context Pruning: I want to add a feature that suggests when to "summarize" a chat to clear the context window and save compute.
Built With
- css3
- html5
- javascript
- manifest