Per-model token tracking for mid-session model switches #28842

Closed
yitang wants to merge 1 commit into NousResearch:main from yitang:feat/per-model-token-tracking

@yitang yitang commented May 19, 2026

Closes #28637

Adds a usage_by_model TEXT column (holding JSON) to the sessions table
that records a per-model token/cost breakdown on every API call, enabling
accurate cost attribution and display after mid-session /model
switches.

Changes

  • Schema: usage_by_model TEXT column on sessions, auto-added to
    existing DBs via _reconcile_columns(), bump SCHEMA_VERSION to 12
  • In-memory: agent.session_usage_by_model dict on AIAgent
  • Per-call tracking: populates per-model counters in
    conversation_loop.py — tokens, cache, reasoning, API calls, and
    cost — then serializes to JSON and persists on every turn
  • CLI /usage: per-model breakdown when multiple models detected;
    single-model fallback for legacy sessions
  • Gateway /usage: same breakdown in messaging platforms
  • /insights: model breakdown uses usage_by_model when present,
    falls back to model column for backward compat
  • TUI: usage_by_model included in Python → TypeScript payload
    (display rendering not yet implemented — separate follow-up)
  • Tests: round-trip serialization, cumulative sum invariant, backward
    compat fallback, schema version bump
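
To illustrate the per-call tracking and the cumulative-sum invariant described above, here is a minimal sketch. It is not the code from this PR: the counter names, `record_call`, and `totals` are hypothetical, and only `session_usage_by_model` and the JSON serialization step come from the description.

```python
import json

# Hypothetical counter names; the description only says tokens, cache,
# reasoning, API calls, and cost are tracked per model.
COUNTERS = ("input_tokens", "output_tokens", "cache_tokens",
            "reasoning_tokens", "cost")


def record_call(usage_by_model, model, usage):
    """Add one API call's usage to the bucket for `model`, in place."""
    bucket = usage_by_model.setdefault(
        model, {"api_calls": 0, **{name: 0 for name in COUNTERS}}
    )
    bucket["api_calls"] += 1
    for name in COUNTERS:
        bucket[name] += usage.get(name, 0)
    return usage_by_model


def totals(usage_by_model):
    """Sum every counter across models; should match the session totals."""
    out = {"api_calls": 0, **{name: 0 for name in COUNTERS}}
    for bucket in usage_by_model.values():
        for name, value in bucket.items():
            out[name] += value
    return out


# Simulate a session that switches from model-a to model-b and back.
session_usage_by_model = {}
record_call(session_usage_by_model, "model-a",
            {"input_tokens": 100, "output_tokens": 20, "cost": 0.5})
record_call(session_usage_by_model, "model-b",
            {"input_tokens": 40, "output_tokens": 10, "cost": 0.25})
record_call(session_usage_by_model, "model-a",
            {"input_tokens": 60, "output_tokens": 5, "cost": 0.25})

# Serialize for the TEXT column, then read it back (round trip).
serialized = json.dumps(session_usage_by_model, sort_keys=True)
restored = json.loads(serialized)
```

Keeping the per-model buckets additive means the existing cumulative counters can stay as they are and be cross-checked against the sum of the buckets.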

Backward compatibility

  • Old sessions with no usage_by_model data render single-model as before
  • _reconcile_columns() adds the column on next startup — zero-downtime
  • Cumulative counters (input_tokens, output_tokens, etc.) are
    unchanged
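
The backward-compatibility behaviour can be sketched with the standard `sqlite3` module. This is an assumption-laden illustration, not the actual `_reconcile_columns()` implementation: `add_column_if_missing` and `breakdown_for` are hypothetical helpers, and the table layout is reduced to a few columns.

```python
import json
import sqlite3


def add_column_if_missing(conn, table, column, decl):
    """Add `column` to `table` unless it already exists. Returns True if added."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl}")
    return True


def breakdown_for(conn, session_id):
    """Return the per-model breakdown, falling back to the single model column."""
    model, input_tokens, output_tokens, raw = conn.execute(
        "SELECT model, input_tokens, output_tokens, usage_by_model "
        "FROM sessions WHERE id = ?",
        (session_id,),
    ).fetchone()
    if raw:
        return json.loads(raw)
    # Legacy row: no JSON stored, so attribute everything to the one model.
    return {model: {"input_tokens": input_tokens,
                    "output_tokens": output_tokens}}


# A legacy database created before the column existed (simplified schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions (id TEXT PRIMARY KEY, model TEXT, "
    "input_tokens INTEGER, output_tokens INTEGER)"
)
conn.execute("INSERT INTO sessions VALUES ('legacy', 'model-a', 100, 20)")

added_first = add_column_if_missing(conn, "sessions", "usage_by_model", "TEXT")
added_again = add_column_if_missing(conn, "sessions", "usage_by_model", "TEXT")

# A session written after the upgrade stores the JSON breakdown.
new_usage = {"model-a": {"input_tokens": 60, "output_tokens": 5},
             "model-b": {"input_tokens": 40, "output_tokens": 10}}
conn.execute(
    "INSERT INTO sessions VALUES ('new', 'model-b', 100, 15, ?)",
    (json.dumps(new_usage),),
)

legacy = breakdown_for(conn, "legacy")
current = breakdown_for(conn, "new")
```

Because the added column is nullable with no default, existing rows simply read back as NULL and take the single-model path.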

@alt-glitch added labels on May 19, 2026: type/feature (New feature or request), P2 (Medium: degraded but workaround exists), comp/agent (Core agent loop, run_agent.py, prompt builder), comp/cli (CLI entry point, hermes_cli/, setup wizard), comp/gateway (Gateway runner, session dispatch, delivery), comp/tui (Terminal UI, ui-tui/ + tui_gateway/)
@teknium1

Contributor

Closed in favor of PR #35256 (merged) for the dashboard-staleness fix (#34850). Your per-model token-tracking work is a larger feature than this P3 fix needs, but if you want to revisit per-model accounting as a standalone proposal we're happy to look. Thanks @yitang!

@teknium1 teknium1 closed this May 30, 2026

Development

Successfully merging this pull request may close these issues.

Per-model token usage is lost during mid-session model switches

3 participants