Inspiration

I’m a student who basically lives in ChatGPT and Gemini. I always assumed using these tools was "free" environmentally, but I recently read that a standard conversation can consume up to 500mL of water in data center cooling.

That number seemed high, so I dug into the technicals. I realized most people (myself included) misunderstand how LLMs work. It’s not just the new text you generate that costs energy; it’s the context window. Every time I send a message, the model re-processes my entire conversation history before producing a single new token.

I realized I was wasting resources by keeping massive, stale threads open. I built Token Ocean because I wanted to see the actual numbers for myself in real-time.

What it does

Token Ocean is a Chrome Extension that audits your ChatGPT and Gemini sessions.

  1. Tracks the Context: It scrapes the DOM to estimate the total token count of your current thread (history + new response).
  2. Calculates the Cost: It applies a weighted formula to estimate water (mL) and energy (Wh) usage, distinguishing between the "Prefill" (reading history) and "Decoding" (generating text) phases.
  3. Visualizes Impact: It shows a simple water bottle filling up. When your context gets too heavy, it warns you that starting a new chat would be more efficient.
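Since a content script can't call the model's real tokenizer, a rough client-side estimate is enough to drive a visual gauge. A minimal sketch, assuming the common "roughly 4 characters per token" heuristic for English text (the function name and constant are illustrative, not the extension's actual code):

```javascript
// Rough client-side token estimate. ~4 characters per token is a
// common rule of thumb for English text; good enough for a gauge,
// not for billing-grade accuracy.
const CHARS_PER_TOKEN = 4;

function estimateTokens(text) {
  return Math.ceil(text.trim().length / CHARS_PER_TOKEN);
}
```

A real tokenizer would give different counts for code, non-English text, or long runs of whitespace, but for a "your context is getting heavy" warning, the heuristic is within the right order of magnitude.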

How I built it

I built it as a Manifest V3 extension with no external dependencies.

  • The Listener: I used a MutationObserver to watch the chat container. Since AI responses stream character-by-character, I had to implement aggressive debouncing to stop the extension from recalculating 50 times a second and freezing the browser.
  • The Scraper: I reverse-engineered the obfuscated class names on ChatGPT and Gemini to reliably extract text content without an API.
  • The Math: I used chrome.storage.local to maintain a running state. The core logic calculates (Context_Tokens * Prefill_Cost) + (Output_Tokens * Decode_Cost) to get a realistic energy estimate, rather than just using a flat "per query" average.
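The core calculation splits cost between the prefill and decoding phases. A minimal sketch of that formula; the per-token constants below are illustrative placeholders I made up for demonstration, not measured values (real figures vary by model, hardware, and data center):

```javascript
// Sketch of the prefill/decode split. The constants are NOT
// measurements; they are placeholders to show the structure.
const PREFILL_WH_PER_TOKEN = 0.00025; // re-reading existing context (cheap per token)
const DECODE_WH_PER_TOKEN  = 0.0025;  // generating new tokens (far costlier per token)
const ML_WATER_PER_WH      = 1.8;     // hypothetical cooling-water intensity

function estimateQueryCost(contextTokens, outputTokens) {
  const energyWh =
    contextTokens * PREFILL_WH_PER_TOKEN +
    outputTokens * DECODE_WH_PER_TOKEN;
  return { energyWh, waterMl: energyWh * ML_WATER_PER_WH };
}
```

The key property this captures: `contextTokens` keeps growing turn after turn, while `outputTokens` stays roughly constant per reply, so the context term eventually dominates.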

Challenges I ran into

  • Streaming UI: Handling the streaming text was the biggest headache. The DOM updates constantly during generation, which messed up my token counts. I had to write logic to detect when the "Stop Generating" button disappeared to know when to run the final calculation.
  • Different Architectures: ChatGPT uses a completely different DOM structure than Gemini. I had to write modular content scripts to handle each site separately while feeding into a shared calculation engine.
  • Context Inflation: I initially had a bug where I only counted the text currently on screen. I had to rewrite the logic to cumulatively sum all previous turns, because that is what the LLM backend actually re-processes on every request.
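The cumulative-sum fix can be sketched as follows (a simplified model, assuming each turn's prefill re-reads everything before it; the function name is hypothetical):

```javascript
// Each turn, the backend re-processes all previous turns as prefill.
// So the total prefill work over a thread is the running sum of the
// context at each turn, not just the visible text of the last message.
function totalPrefillTokens(turnTokens) {
  // turnTokens: token count of each message in order, e.g. [120, 340, 80]
  let context = 0;
  let totalPrefill = 0;
  for (const t of turnTokens) {
    totalPrefill += context; // history re-read before this turn
    context += t;            // this turn joins the context afterwards
  }
  return totalPrefill;
}
```

This is why long threads grow quadratically in total prefill work: each new message pays for every message before it.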

Accomplishments that I'm proud of

  • Privacy-First: The extension is 100% client-side. I don't send user data to any server. It’s a pure local auditor.
  • It actually works: Seeing the water bottle fill up during a long debugging session was a legit wake-up call for me. It changed my own behavior immediately.
  • Cross-Platform: Getting it to work reliably on both ChatGPT's and Gemini's current UIs.

What I learned

  • Length > Frequency: I learned that a short question at the end of a long thread is significantly more expensive than a long question in a new thread. The context window is the main driver of cost.
  • Manifest V3 Limits: I learned a lot about the constraints of modern Chrome extensions, specifically around service workers and persistent state.
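To make the "length beats frequency" point concrete, here is a toy comparison using illustrative per-token costs (the numbers are made up for demonstration, not measurements):

```javascript
// Toy illustration of "length > frequency". Constants are placeholders.
const PREFILL_WH = 0.00025; // Wh per context token re-read
const DECODE_WH  = 0.0025;  // Wh per generated token

const cost = (ctx, out) => ctx * PREFILL_WH + out * DECODE_WH;

// A 20-token question at the end of a 40,000-token thread...
const staleThread = cost(40000 + 20, 300);
// ...versus a 500-token question in a fresh thread:
const freshThread = cost(500, 300);
```

Under these assumptions the stale-thread query costs over ten times the fresh-thread one, even though the question itself is much shorter: the context term dwarfs everything else.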

What's next for Token Ocean

  • More Accurate Tokenization & Water Calculation: Adding dynamic tokenization and water-usage estimates based on the model version (e.g. GPT-4 vs GPT-3), since different models have different tokenizers and per-token costs.
  • More Models: Adding support for Claude and DeepSeek.
  • Context Pruning: I want to add a feature that suggests when to "summarize" a chat to clear the context window and save compute.
