
Feature: usage token tracking#43

Merged
SantiagoDePolonia merged 21 commits into main from feature/usage-token-tracking
Jan 21, 2026
Conversation


@SantiagoDePolonia SantiagoDePolonia commented Jan 21, 2026

Summary by CodeRabbit

  • New Features

    • Token usage tracking: buffered async logging, multi-backend storage (sqlite/postgres/mongo), and usage data included in streaming and final responses.
    • Dynamic model listing for a provider.
  • Bug Fixes

    • 404/Not Found responses correctly classified.
  • Configuration

    • Consolidated storage config to STORAGE_TYPE (default sqlite); new USAGE_* env vars to enable/tune usage tracking.
  • Documentation

    • Added token-usage tracking plan and updated project docs.
  • Tests

    • Extensive tests for usage extraction, stores, stream wrappers, and providers.


@SantiagoDePolonia SantiagoDePolonia self-assigned this Jan 21, 2026

coderabbitai bot commented Jan 21, 2026

📝 Walkthrough

Walkthrough

Adds a token usage tracking subsystem: new usage types, config/env, multi-backend stores (SQLite/Postgres/MongoDB), async buffered logger, SSE stream extraction/wrapping, request-ID propagation, and wiring into app/server/handlers and provider streams; also moves storage-type config to a top-level storage section.

Changes

Cohort / File(s) Summary
Configuration & Env
/.claude/settings.local.json, .env.template, config/config.go, config/config.yaml, config/config_test.go, docker-compose.yaml
Introduce STORAGE_TYPE and USAGE_* env vars; remove LOGGING_STORAGE_TYPE; add top-level storage config and Usage loading/defaults.
Core Types & Context
internal/core/types.go, internal/core/responses.go, internal/core/context.go
Add StreamOptions and RawUsage fields; add WithRequestID/GetRequestID helpers; propagate StreamOptions in WithStreaming.
Usage Core & Constants
internal/usage/usage.go, internal/usage/constants.go, internal/usage/cleanup.go
New UsageEntry, UsageStore interface, Config struct, defaults, constants, and cleanup loop.
Usage Logger & Factory
internal/usage/logger.go, internal/usage/factory.go, internal/usage/usage_test.go
Async buffered Logger, NoopLogger, LoggerInterface, Result factory (New / NewWithSharedStorage), lifecycle/close behavior.
Usage Extractors & Stream Wrapper
internal/usage/extractor.go, internal/usage/extractor_test.go, internal/usage/stream_wrapper.go, internal/usage/stream_wrapper_test.go
Extract usage from responses and SSE; StreamUsageWrapper buffers SSE and emits UsageEntry on Close; comprehensive tests.
Usage Storage Backends
internal/usage/store_sqlite.go, internal/usage/store_postgresql.go, internal/usage/store_mongodb.go
SQLite/Postgres/MongoDB stores with batching, retention/cleanup, index/TTL handling, partial-write detection.
Server & Handler Wiring
internal/app/app.go, internal/server/http.go, internal/server/handlers.go, internal/server/handlers_test.go, internal/auditlog/factory.go, internal/auditlog/middleware.go
Wire usage logger into App and server Config; NewHandler gains usageLogger; handlers wrap streaming/non-streaming flows for usage extraction/logging and propagate request IDs; middleware sets X-Request-ID on response.
Providers & Streaming
internal/providers/anthropic/..., internal/providers/responses_converter.go, internal/providers/*
Anthropic: ListModels, parse created timestamps, include usage in streaming chunks; responses_converter caches usage for final response.done; several providers forward request-id in headers; OpenAI validates client request id.
Errors & Tests
internal/core/errors.go, internal/core/errors_test.go, numerous tests under internal/usage/* and provider tests
404 provider responses map to NotFound error; tests updated; many new/updated tests for usage extraction, stream wrapper, stores, and provider changes.
Docs & Misc
docs/plans/token-usage-tracking.md, CLAUDE.md, EXAMPLES.md, .claude/settings.local.json
New design doc and docs updates; example model update; permission added for WebFetch(domain:docs.anthropic.com) and minor punctuation tweak.
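The "Usage Core & Constants" cohort introduces a `UsageEntry` record, a `UsageStore` interface, and a `Config` struct. As a rough sketch of how those pieces fit together (field and method names here are inferred from the summary, not the actual definitions in `internal/usage/usage.go`):

```go
package main

import (
	"fmt"
	"time"
)

// UsageEntry is a hypothetical mirror of the record the PR adds;
// field names are illustrative only.
type UsageEntry struct {
	RequestID    string
	Provider     string
	Model        string
	Endpoint     string
	InputTokens  int
	OutputTokens int
	TotalTokens  int
	RawData      map[string]any
	Timestamp    time.Time
}

// UsageStore is the interface the SQLite/Postgres/Mongo backends
// would satisfy (assumed signature).
type UsageStore interface {
	WriteBatch(entries []UsageEntry) error
	Close() error
}

// memStore is a trivial in-memory implementation, useful in tests.
type memStore struct{ entries []UsageEntry }

func (s *memStore) WriteBatch(entries []UsageEntry) error {
	s.entries = append(s.entries, entries...)
	return nil
}
func (s *memStore) Close() error { return nil }

func main() {
	var store UsageStore = &memStore{}
	_ = store.WriteBatch([]UsageEntry{{RequestID: "req-1", TotalTokens: 42, Timestamp: time.Now()}})
	fmt.Println("stored one entry")
}
```

Keeping the storage backends behind a single interface is what lets `STORAGE_TYPE` swap between sqlite/postgres/mongo without touching the logger.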

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Handler
    participant Provider as Provider API
    participant UsageLogger
    participant Buffer
    participant Storage

    Note over Client,Storage: Non-streaming flow (usage write)
    Client->>Handler: POST /chat/completions
    Handler->>Provider: Forward request (ctx includes request-id)
    Provider-->>Handler: Response (includes usage)
    Handler->>UsageLogger: Write(UsageEntry)
    UsageLogger->>Buffer: Enqueue entry
    alt Flush by threshold or timer
        UsageLogger->>Storage: WriteBatch(entries)
        Storage-->>UsageLogger: Ack
    end
    Handler-->>Client: Response (+ usage)
sequenceDiagram
    participant Client
    participant Handler
    participant ProviderSSE as Provider SSE
    participant StreamWrapper
    participant UsageLogger
    participant Storage

    Note over Client,Storage: Streaming flow with SSE extraction
    Client->>Handler: POST /chat/completions?stream=true
    Handler->>ProviderSSE: Open provider SSE (wrapped for usage)
    ProviderSSE-->>StreamWrapper: SSE events (data chunks)
    StreamWrapper->>Client: Relay events
    loop streaming
        ProviderSSE->>StreamWrapper: Event (may include usage JSON)
        StreamWrapper->>StreamWrapper: Buffer rolling window
    end
    StreamWrapper->>StreamWrapper: Parse buffered SSE for usage on Close
    StreamWrapper->>UsageLogger: Write(UsageEntry)
    UsageLogger->>Storage: WriteBatch(entries)
    Storage-->>UsageLogger: Ack

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

Suggested labels

enhancement

Poem

🐰 I nibble tokens late at night,

I buffer, flush, and log each bite,
SQLite, PG, Mongo keep score,
SSE crumbs tracked forevermore—
Hop! The usage logs take flight!

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 40.96%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
✅ Passed checks (2 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title 'Feature: usage token tracking' directly summarizes the main change: implementing token usage tracking functionality across the codebase.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 17

🤖 Fix all issues with AI agents
In `@docs/plans/token-usage-tracking.md`:
- Around line 12-36: Fix markdownlint errors by ensuring all fenced code blocks
are surrounded by blank lines and include language tags (e.g., ```go for the
Usage struct block), replace emphasis-used headings (like **1. Core Usage Struct
is Minimal**) with proper heading syntax (e.g., "### Core Usage Struct is
Minimal") and ensure a blank line before/after headings, and run through the
document to correct other lint rules (MD031, MD036, MD040, MD022) so headings
and code blocks follow consistent spacing and language-tag conventions; update
the examples mentioning Usage, StreamLogWrapper and audit log generation to use
proper fenced code blocks and heading markup.

In `@internal/core/errors.go`:
- Around line 158-162: The 404 branch creating NewNotFoundError(message)
currently drops the originalErr; update the case handling so after creating err
:= NewNotFoundError(message) and setting err.Provider = provider you also set
err.OriginalErr = originalErr (matching other branches) before returning to
preserve debugging context; reference NewNotFoundError, err.Provider and
originalErr in your change.

In `@internal/core/responses.go`:
- Around line 73-78: The ResponsesUsage struct currently exposes RawUsage as
map[string]any which weakens payload typing; update ResponsesUsage to replace
RawUsage with json.RawMessage (importing encoding/json) or a dedicated typed
struct to preserve strong typing for provider-specific data, and adjust any
marshaling/unmarshaling code that reads/writes ResponsesUsage (references:
ResponsesUsage, RawUsage) to handle json.RawMessage (or the new typed struct)
instead of a map while keeping the json tag `json:"raw_usage,omitempty"`.

In `@internal/providers/anthropic/anthropic.go`:
- Around line 405-426: Provider.ListModels currently fetches only the first page
(limit=1000) and ignores anthropicModelsResponse.HasMore/LastID; change
ListModels to loop calling p.client.Do repeatedly while anthropicResp.HasMore is
true, passing after_id=anthropicResp.LastID (and limit=1000) on subsequent
requests, appending each page's models (convert with parseCreatedAt into
core.Model) into the same slice, and only return the full core.ModelsResponse
after no more pages remain; use the anthropicModelsResponse struct fields
HasMore and LastID to control pagination and aggregation.
- Around line 434-441: Change parseCreatedAt to log parse errors via the slog
package and return 0 instead of time.Now().Unix() so invalid timestamps are not
silently mapped to current time; specifically, in function
parseCreatedAt(createdAt string) int64 call slog.Error (or slog.With(...).Error)
with a clear message including the invalid createdAt and the parse error, then
return int64(0) on error. Also update the TestParseCreatedAt_InvalidFormat unit
test to expect 0 for invalid or unparsable timestamps.
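The pagination loop requested for `ListModels` follows a standard cursor pattern: keep calling while `HasMore` is true, passing `LastID` as `after_id`, and aggregate every page. A sketch against a fake pager (the `page`/`fetch` names are illustrative; real code would issue HTTP requests via `p.client.Do` with `limit=1000`):

```go
package main

import "fmt"

// page mirrors the anthropicModelsResponse fields used for paging.
type page struct {
	IDs     []string
	HasMore bool
	LastID  string
}

// fetch abstracts one API call: return the page starting after afterID.
type fetch func(afterID string) page

// listAll loops while HasMore is true, passing LastID as after_id,
// and aggregates every page into one slice.
func listAll(do fetch) []string {
	var all []string
	afterID := ""
	for {
		p := do(afterID)
		all = append(all, p.IDs...)
		if !p.HasMore {
			return all
		}
		afterID = p.LastID
	}
}

func main() {
	pages := map[string]page{
		"":  {IDs: []string{"a", "b"}, HasMore: true, LastID: "b"},
		"b": {IDs: []string{"c"}, HasMore: false},
	}
	models := listAll(func(after string) page { return pages[after] })
	fmt.Println(models) // [a b c]
}
```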

In `@internal/server/handlers.go`:
- Around line 55-56: The request ID is being read from the response headers
(using c.Response().Header().Get("X-Request-ID")) which is incorrect; change all
occurrences to read from the incoming request headers (use
c.Request().Header.Get("X-Request-ID")) so the value set by the audit log
middleware is returned—update the occurrences around the requestID variable in
handlers.go (and the similar spots referenced at the other two occurrences) to
use c.Request().Header.Get instead of c.Response().Header().Get.

In `@internal/usage/cleanup.go`:
- Around line 8-25: Extract the duplicate RunCleanupLoop and CleanupInterval
into a shared utility function that accepts the interval as a parameter (e.g.,
create common.RunCleanupLoop(interval time.Duration, stop <-chan struct{},
cleanupFn func())), remove the local constant/duplicate RunCleanupLoop in both
packages, and update the callers in usage and auditlog to call
common.RunCleanupLoop(CleanupInterval, stop, cleanupFn) (or pass their existing
interval constant) so both packages use the single shared implementation; ensure
you keep the initial immediate cleanup call and ticker.Stop defer behavior in
the new common function.

In `@internal/usage/extractor.go`:
- Around line 31-34: The code assigns resp.Usage.RawUsage directly to
entry.RawData which can lead to races if the source map is mutated after Write
because entries are enqueued asynchronously; fix by making a defensive deep copy
of the map before assigning (for every occurrence where resp.Usage.RawUsage ->
entry.RawData is set, including the occurrences around the blocks at lines
similar to the ones shown and the other spots referenced) so entry.RawData owns
its own map instance rather than referencing the original resp.Usage.RawUsage.
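The defensive copy can be as simple as `maps.Clone` (Go 1.21+). Note this is a shallow copy: if `RawUsage` can hold nested maps, those would need copying too. A sketch:

```go
package main

import (
	"fmt"
	"maps"
)

// cloneRaw makes the entry own its map so later mutation of the
// provider response can't race with the async writer. maps.Clone is
// shallow; nested maps would still be shared.
func cloneRaw(src map[string]any) map[string]any {
	if src == nil {
		return nil
	}
	return maps.Clone(src)
}

func main() {
	raw := map[string]any{"cached_tokens": 12}
	entryRaw := cloneRaw(raw)
	raw["cached_tokens"] = 999 // caller mutates after Write returns

	fmt.Println(entryRaw["cached_tokens"]) // 12: the entry is unaffected
}
```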

In `@internal/usage/logger.go`:
- Around line 51-139: The Write path can still panic if l.buffer is closed
concurrently; make Close idempotent and block until in-flight Write calls finish
by adding a write-tracking WaitGroup and a close guard: introduce e.g.
l.writesWg (sync.WaitGroup) and l.closeOnce (sync.Once) or l.closeMu
(sync.Mutex)/RWMutex, update Write to take a read lock or check under the close
guard, call l.writesWg.Add(1) before attempting to send and defer
l.writesWg.Done(); if closed return without sending; update Close to use the
close guard or closeOnce to only close l.done once, set l.closed, then wait for
l.writesWg.Wait() and then wait for l.wg.Wait() (flushLoop) before closing the
store; reference functions/fields: Write, Close, flushLoop, l.closed, l.buffer,
l.done, l.wg, and add l.writesWg and l.closeOnce/closeMu as needed.
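The suggested guard pattern (read-locked `Write`, `sync.Once`-protected `Close`, and a WaitGroup for in-flight writes) can be sketched independently of the real logger; all names below are illustrative:

```go
package main

import (
	"fmt"
	"sync"
)

// Logger sketches the close guard: Write checks closed under a read
// lock before sending; Close flips the flag under the write lock,
// waits for in-flight writes, and closes done exactly once.
type Logger struct {
	buffer    chan int
	done      chan struct{}
	closed    bool
	closeMu   sync.RWMutex
	closeOnce sync.Once
	writesWg  sync.WaitGroup
}

func NewLogger() *Logger {
	return &Logger{buffer: make(chan int, 64), done: make(chan struct{})}
}

func (l *Logger) Write(v int) {
	l.closeMu.RLock()
	defer l.closeMu.RUnlock()
	if l.closed {
		return // never send on a closed logger
	}
	l.writesWg.Add(1)
	defer l.writesWg.Done()
	select {
	case l.buffer <- v:
	default: // drop when full
	}
}

func (l *Logger) Close() {
	l.closeOnce.Do(func() {
		l.closeMu.Lock()
		l.closed = true
		l.closeMu.Unlock()
		l.writesWg.Wait() // let in-flight Writes finish first
		close(l.done)
	})
}

func main() {
	l := NewLogger()
	l.Write(1)
	l.Close()
	l.Close()  // idempotent: no double-close panic
	l.Write(2) // no panic: returns without sending
	fmt.Println("buffered:", len(l.buffer))
}
```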

In `@internal/usage/store_mongodb.go`:
- Around line 123-145: The error handling after s.collection.InsertMany should
use errors.As to detect a wrapped mongo.BulkWriteException instead of a direct
type assertion; update the block that currently does "if bulkErr, ok :=
err.(mongo.BulkWriteException); ok { ... }" to use var bulkErr
mongo.BulkWriteException and errors.As(err, &bulkErr) so partial write detection
and logging (slog.Warn, usagePartialWriteFailures.Inc, and returning
&PartialWriteError{... Cause: bulkErr}) still occur for wrapped errors, and add
the "errors" import if missing; apply the same pattern used in
internal/auditlog/store_mongodb.go for consistency.

In `@internal/usage/store_postgresql.go`:
- Around line 67-76: The PostgreSQLStore currently always allocates stopCleanup
even when retentionDays <= 0; change the constructor to only create and assign
stopCleanup when retentionDays > 0 and start the goroutine, otherwise leave
stopCleanup nil (or zero value). Locate the PostgreSQLStore struct instantiation
(PostgreSQLStore) and the RunCleanupLoop call and condition on retentionDays to
perform both the make(chan struct{}) assignment and the go RunCleanupLoop(...)
invocation only when retentionDays > 0 so the unused channel isn't allocated.
- Around line 123-159: writeBatchLarge currently opens a transaction but still
calls tx.Exec per row (tx.Exec loop), which negates batch performance and also
collects insert errors yet proceeds to tx.Commit; change it to perform a true
batch insert using pgx.Batch (create a pgx.Batch, Queue the INSERT for each
UsageEntry, execute via s.pool.SendBatch or tx.SendBatch and read results) or
use tx.CopyFrom with pgx.CopyFromRows to insert all rows efficiently, and ensure
transactional semantics: if any batch/CopyFrom error occurs, call tx.Rollback
and return the error without committing (remove committing when errs were
collected), and replace the per-row error collection logic with handling the
batch/CopyFrom error result. Reference symbols: writeBatchLarge, tx.Exec
(remove), pgx.Batch / tx.CopyFrom, tx.Commit, tx.Rollback.

In `@internal/usage/store_sqlite.go`:
- Around line 167-171: The cleanup query uses time.RFC3339 but entries are
written using time.RFC3339Nano (causing incorrect string comparisons); update
the cutoff generation in the cleanup path (where cutoff is set before
s.db.Exec("DELETE FROM usage WHERE timestamp < ?", cutoff)) to use
time.RFC3339Nano instead of time.RFC3339 so the stored timestamps and the cutoff
use the same nanosecond precision.

In `@internal/usage/stream_wrapper.go`:
- Around line 41-52: The rolling buffer in the Write path currently calls
w.buffer.Reset() which doesn't shrink the backing array, so when trimming to
SSEBufferSize you must reallocate a new smaller buffer to cap memory; modify the
trim branch (where w.buffer.Len() > SSEBufferSize and you compute data :=
w.buffer.Bytes()) to create a fresh buffer of size SSEBufferSize (or a
bytes.NewBuffer with the sliced tail) and replace w.buffer with it (or copy the
tail into a new bytes.Buffer) instead of Reset()+Write(), ensuring capacity is
limited while preserving the last SSEBufferSize bytes.
- Around line 132-191: The code currently leaves totalTokens at 0 when
"total_tokens" is missing, undercounting usage; update the logic after
extracting inputTokens, outputTokens, totalTokens from usageMap to set
totalTokens = inputTokens + outputTokens when totalTokens == 0 and (inputTokens
> 0 || outputTokens > 0). Keep the existing rawData handling and call to
ExtractFromSSEUsage(providerID, inputTokens, outputTokens, totalTokens, rawData,
w.requestID, model, w.provider, w.endpoint) so the derived totalTokens is passed
through.
- Around line 104-130: The extractUsageFromJSON function in StreamUsageWrapper
currently unmarshals into map[string]interface{}; change it to unmarshal into
strongly-typed structs (e.g., SSEChunk struct with ID, Model, Usage
json.RawMessage fields and a UsagePayload struct with Token fields as
json.Number or *int64) inside StreamUsageWrapper.extractUsageFromJSON, then
decode UsagePayload to populate and return a UsageEntry (preserve any
provider-specific remaining data into UsageEntry.RawData as raw JSON); ensure
model override logic still uses SSEChunk.Model when present and convert
json.Number to int safely for token counts.
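The typed-decode and derived-total suggestions above combine naturally: decode the chunk into small structs with `json.RawMessage` for the usage payload, then fill in `total_tokens` when the provider omitted it. A sketch (the `SSEChunk`/`UsagePayload` shapes follow an OpenAI-style payload and are assumptions, not the project's actual structs):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// UsagePayload uses pointers so "absent" is distinguishable from zero.
type UsagePayload struct {
	PromptTokens     *int64 `json:"prompt_tokens"`
	CompletionTokens *int64 `json:"completion_tokens"`
	TotalTokens      *int64 `json:"total_tokens"`
}

type SSEChunk struct {
	ID    string          `json:"id"`
	Model string          `json:"model"`
	Usage json.RawMessage `json:"usage"`
}

func val(p *int64) int64 {
	if p == nil {
		return 0
	}
	return *p
}

// extractUsage decodes the chunk, then derives total_tokens when the
// provider omitted it.
func extractUsage(data []byte) (model string, in, out, total int64, err error) {
	var chunk SSEChunk
	if err = json.Unmarshal(data, &chunk); err != nil {
		return
	}
	var u UsagePayload
	if err = json.Unmarshal(chunk.Usage, &u); err != nil {
		return
	}
	in, out, total = val(u.PromptTokens), val(u.CompletionTokens), val(u.TotalTokens)
	if total == 0 && (in > 0 || out > 0) {
		total = in + out // derive when total_tokens is missing
	}
	return chunk.Model, in, out, total, nil
}

func main() {
	data := []byte(`{"id":"c1","model":"gpt-x","usage":{"prompt_tokens":10,"completion_tokens":5}}`)
	model, in, out, total, _ := extractUsage(data)
	fmt.Println(model, in, out, total) // gpt-x 10 5 15
}
```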

In `@internal/usage/usage_test.go`:
- Around line 44-85: Replace the fixed time.Sleep in TestLogger with a polling
loop that repeatedly checks store.getEntries() until the expected 5 entries
appear or a timeout elapses; locate the TestLogger test that constructs logger
via NewLogger, writes entries with logger.Write and currently does
time.Sleep(200*time.Millisecond), and change it to poll (using time.After for
overall timeout and time.Tick or short sleeps) to assert len(entries)==5 or fail
if timeout reached, then continue to Close() the logger and verify store.closed
as before.

Comment on lines +434 to +441
// parseCreatedAt parses an RFC3339 timestamp string to Unix timestamp
func parseCreatedAt(createdAt string) int64 {
	t, err := time.Parse(time.RFC3339, createdAt)
	if err != nil {
		return time.Now().Unix()
	}
	return t.Unix()
}

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Scripts executed (repository: ENTERPILOT/GOModel):

# First, let's examine the file structure and the function in question
wc -l internal/providers/anthropic/anthropic.go

# Read the specific section around lines 434-441
sed -n '430,445p' internal/providers/anthropic/anthropic.go

# Check if slog is imported in the file
rg "import|slog" internal/providers/anthropic/anthropic.go | head -30

# Search for usages of parseCreatedAt function
rg "parseCreatedAt" internal/providers/anthropic/

# Get broader context of the anthropic.go file to understand the module and imports
head -50 internal/providers/anthropic/anthropic.go

# Find and read the parseCreatedAt test
rg -A 15 "parseCreatedAt.*invalid" internal/providers/anthropic/anthropic_test.go

# Get the full test function for parseCreatedAt
rg -B 5 -A 20 "func.*Test.*parseCreatedAt" internal/providers/anthropic/anthropic_test.go

# Check the context where parseCreatedAt is called in anthropic.go
rg -B 5 -A 5 "created := parseCreatedAt" internal/providers/anthropic/anthropic.go

# Check whether there are more comprehensive test cases for parseCreatedAt
rg -B 10 -A 30 "func Test.*parseCreatedAt\|parseCreatedAt\(\"" internal/providers/anthropic/anthropic_test.go | head -60

# Search more broadly for test functions around parseCreatedAt
rg "parseCreatedAt\|func Test" internal/providers/anthropic/anthropic_test.go | head -20

# Look at the test file around where parseCreatedAt is tested
rg -B 20 'parseCreatedAt\("invalid' internal/providers/anthropic/anthropic_test.go

# See more context around the test to understand the full test setup
rg -B 40 "TestParseCreatedAt_InvalidFormat" internal/providers/anthropic/anthropic_test.go | tail -50

# Check how the Created field is used in callers of ListModels or core.Model
rg "\.Created|Model\{" internal/providers/anthropic/anthropic.go | head -10

# Check the core.Model structure to understand the Created field
rg -A 10 "type Model struct" internal/core/

# Search for how Created timestamp is used in the codebase
rg "\.Created" --type go | head -20


Log invalid timestamps and return 0 instead of silently mapping to current time.

Returning time.Now().Unix() on parse errors masks upstream data issues and can unexpectedly reorder model lists. The suggested change uses the already-imported slog package to log the error and return 0 for invalid timestamps.

Note: This will require updating the TestParseCreatedAt_InvalidFormat test that currently expects non-zero values for invalid dates.

🔧 Suggested fix
 func parseCreatedAt(createdAt string) int64 {
 	t, err := time.Parse(time.RFC3339, createdAt)
 	if err != nil {
-		return time.Now().Unix()
+		slog.Warn("invalid model created_at", "value", createdAt, "error", err)
+		return 0
 	}
 	return t.Unix()
 }
🤖 Prompt for AI Agents
In `@internal/providers/anthropic/anthropic.go` around lines 434 - 441, Change
parseCreatedAt to log parse errors via the slog package and return 0 instead of
time.Now().Unix() so invalid timestamps are not silently mapped to current time;
specifically, in function parseCreatedAt(createdAt string) int64 call slog.Error
(or slog.With(...).Error) with a clear message including the invalid createdAt
and the parse error, then return int64(0) on error. Also update the
TestParseCreatedAt_InvalidFormat unit test to expect 0 for invalid or unparsable
timestamps.

Comment on lines +132 to +191
	var inputTokens, outputTokens, totalTokens int
	rawData := make(map[string]any)

	// Extract standard fields
	if v, ok := usageMap["prompt_tokens"].(float64); ok {
		inputTokens = int(v)
	}
	if v, ok := usageMap["input_tokens"].(float64); ok {
		inputTokens = int(v)
	}
	if v, ok := usageMap["completion_tokens"].(float64); ok {
		outputTokens = int(v)
	}
	if v, ok := usageMap["output_tokens"].(float64); ok {
		outputTokens = int(v)
	}
	if v, ok := usageMap["total_tokens"].(float64); ok {
		totalTokens = int(v)
	}

	// Extract extended usage data (provider-specific)
	extendedFields := []string{
		"cached_tokens", "reasoning_tokens", // OpenAI
		"cache_creation_input_tokens", "cache_read_input_tokens", // Anthropic
		"thought_tokens", "tool_use_tokens", // Gemini
		"image_tokens", // xAI
	}
	for _, field := range extendedFields {
		if v, ok := usageMap[field].(float64); ok && v > 0 {
			rawData[field] = int(v)
		}
	}

	// Also check for nested prompt_tokens_details and completion_tokens_details (OpenAI)
	if details, ok := usageMap["prompt_tokens_details"].(map[string]interface{}); ok {
		for k, v := range details {
			if fv, ok := v.(float64); ok && fv > 0 {
				rawData["prompt_"+k] = int(fv)
			}
		}
	}
	if details, ok := usageMap["completion_tokens_details"].(map[string]interface{}); ok {
		for k, v := range details {
			if fv, ok := v.(float64); ok && fv > 0 {
				rawData["completion_"+k] = int(fv)
			}
		}
	}

	// Only create entry if we found some usage data
	if inputTokens > 0 || outputTokens > 0 || totalTokens > 0 {
		if len(rawData) == 0 {
			rawData = nil
		}
		return ExtractFromSSEUsage(
			providerID,
			inputTokens, outputTokens, totalTokens,
			rawData,
			w.requestID, model, w.provider, w.endpoint,
		)

⚠️ Potential issue | 🟠 Major

Derive total_tokens when only input/output are present.

If a provider omits total_tokens, the current logic emits TotalTokens=0 even when input/output are set, which undercounts usage. Consider deriving totalTokens from input/output when missing.

🐛 Suggested fix
 	if v, ok := usageMap["total_tokens"].(float64); ok {
 		totalTokens = int(v)
 	}
 
+	if totalTokens == 0 && (inputTokens > 0 || outputTokens > 0) {
+		totalTokens = inputTokens + outputTokens
+	}
+
 	// Only create entry if we found some usage data
🤖 Prompt for AI Agents
In `@internal/usage/stream_wrapper.go` around lines 132 - 191, The code currently
leaves totalTokens at 0 when "total_tokens" is missing, undercounting usage;
update the logic after extracting inputTokens, outputTokens, totalTokens from
usageMap to set totalTokens = inputTokens + outputTokens when totalTokens == 0
and (inputTokens > 0 || outputTokens > 0). Keep the existing rawData handling
and call to ExtractFromSSEUsage(providerID, inputTokens, outputTokens,
totalTokens, rawData, w.requestID, model, w.provider, w.endpoint) so the derived
totalTokens is passed through.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/auditlog/middleware.go (1)

173-184: Unbounded single write may exceed MaxBodyCapture.

The current logic only checks if the buffer length is below MaxBodyCapture before writing, but writes the entire slice b regardless of its size. A single large write could significantly exceed the intended limit.

♻️ Proposed fix to enforce strict size limit
 func (r *responseBodyCapture) Write(b []byte) (int, error) {
 	// Write to the capture buffer (limit to MaxBodyCapture to avoid memory issues)
 	if r.body.Len() < int(MaxBodyCapture) {
-		r.body.Write(b)
-		// Check if we just hit the limit
-		if r.body.Len() >= int(MaxBodyCapture) {
+		remaining := int(MaxBodyCapture) - r.body.Len()
+		if len(b) <= remaining {
+			r.body.Write(b)
+		} else {
+			r.body.Write(b[:remaining])
 			r.truncated = true
 		}
 	}
 	// Write to the original response writer
 	return r.ResponseWriter.Write(b)
 }

SantiagoDePolonia and others added 4 commits January 21, 2026 15:52

Add a new configuration flag that automatically adds stream_options:
{"include_usage": true} to streaming requests for OpenAI-compatible
providers. This ensures usage data is always returned in streaming
responses when the flag is enabled (default: true).

Changes:
- Add StreamOptions struct to core/types.go for stream_options param
- Add StreamOptions field to ChatRequest and ResponsesRequest
- Update WithStreaming() methods to preserve StreamOptions
- Add EnforceReturningUsageData to usage.Config and config.UsageConfig
- Update handlers to set StreamOptions when flag is enabled
- Document ENFORCE_RETURNING_USAGE_DATA in .env.template

Co-authored-by: Claude <noreply@anthropic.com>
Use time.After for timeout and time.Ticker for polling instead of
a fixed time.Sleep, making the test more reliable and faster.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
docker-compose.yaml (1)

17-24: Align the storage backend comment with the active setting.

The block still says PostgreSQL for audit logs while STORAGE_TYPE=mongodb is active. That mismatch is confusing for operators—either update the comment or switch the active backend to match.

✏️ Suggested clarification
-      # Storage configuration (PostgreSQL for audit logs)
+      # Storage configuration (audit/usage storage backend)
...
-      # - STORAGE_TYPE=postgresql
-      - STORAGE_TYPE=mongodb
+      # - STORAGE_TYPE=postgresql  # use PostgreSQL backend
+      - STORAGE_TYPE=mongodb       # active backend
config/config.go (1)

266-301: Env var name for enforce-usage is inconsistent with USAGE_* naming.

ENFORCE_RETURNING_USAGE_DATA skips the USAGE_ prefix, so USAGE_ENFORCE_RETURNING_USAGE_DATA won’t be honored in either the legacy env path or the override path. Align it with the rest of the usage env vars.

🛠️ Proposed fix
-				EnforceReturningUsageData: getEnvBoolOrDefault("ENFORCE_RETURNING_USAGE_DATA", true),
+				EnforceReturningUsageData: getEnvBoolOrDefault("USAGE_ENFORCE_RETURNING_USAGE_DATA", true),
@@
-	if enforceUsage := os.Getenv("ENFORCE_RETURNING_USAGE_DATA"); enforceUsage != "" {
+	if enforceUsage := os.Getenv("USAGE_ENFORCE_RETURNING_USAGE_DATA"); enforceUsage != "" {

Also applies to: 416-437

🤖 Fix all issues with AI agents
In `@CLAUDE.md`:
- Around line 208-236: Add a single clarifying sentence to the "Manual Testing
with Docker" section stating that Docker Compose is optional and intended only
for manual storage-backend validation—not for running automated E2E or unit
tests—and that automated E2E tests must remain in-process without Docker; insert
this sentence near the examples that show STORAGE_TYPE, POSTGRES_URL and
MONGODB_URL so readers understand the distinction between manual validation and
in-process test suites.

In `@FUTURE_ARCHITECTURE.md`:
- Line 1: Add a top-level heading and specify the fenced-code language: insert a
heading "Future Architecture" at the top of FUTURE_ARCHITECTURE.md and change
the opening fence from an untyped "```" to "```text" and replace the malformed
closing "-```" with a proper closing "```"; apply the same change wherever an
untyped or malformed fenced block occurs (e.g., the other occurrence referenced
as "123-123").

In `@internal/core/types.go`:
- Around line 59-64: The Usage struct exposes RawUsage as map[string]any which
weakens API payload typing; change RawUsage to json.RawMessage to preserve
provider-specific raw bytes while keeping strong typing, update the Usage type
declaration (field RawUsage) to use json.RawMessage and ensure encoding/json is
imported, and then update any functions that marshal/unmarshal or assign to
Usage.RawUsage to handle raw JSON (e.g., pass through or
json.Marshal/json.RawMessage conversions) so existing
serialization/deserialization behavior is preserved.

In `@internal/usage/factory.go`:
- Around line 130-139: The defaults for storage type and paths are hardcoded;
replace the literal values (".cache/gomodel.db" and "gomodel") with shared
constants or getters from the storage/config package and fall back to storage
package defaults where available: update the initialization logic that sets
storageCfg.Type (currently comparing to storage.TypeSQLite),
storageCfg.SQLite.Path, and storageCfg.MongoDB.Database to use exported
constants (e.g., storage.DefaultSQLitePath, storage.DefaultMongoDatabase) or
call a storage.Defaults() helper so defaults remain single-sourced and cannot
drift.
- Around line 66-71: When createUsageStore(store, cfg.Usage.RetentionDays)
returns an error, the subsequent call to store.Close() currently ignores any
error; change the cleanup path in the block that handles the error to capture
the result of store.Close(), then either log it (using the existing logger in
scope) or wrap/aggregate it with the original createUsageStore error before
returning so cleanup failures aren’t lost; update the code around
createUsageStore, usageStore variable handling, and the store.Close() call to
return or log both errors together.
- Around line 23-41: The "Safe to call multiple times" comment is incorrect
because Logger.Close calls close(l.done) and will panic on a second call; either
remove or correct the comment or make Close idempotent: add a guarded closed
flag and mutex to both Result and the Logger implementation (check the closed
flag before calling Logger.Close, Storage.Close or closing l.done, set the flag
under lock after performing close operations) so repeated calls return nil
instead of closing channels twice; update the comment accordingly or remove it
if you implement the guard.

In `@internal/usage/store_sqlite.go`:
- Around line 58-70: The indexes list in the usage creation snippet (the indexes
slice and subsequent loop using db.Exec) omits an index on the endpoint column;
add "CREATE INDEX IF NOT EXISTS idx_usage_endpoint ON usage(endpoint)" to the
indexes slice so the `for _, idx := range indexes { if _, err := db.Exec(idx); err != nil { ... } }`
loop will create it, ensuring queries that filter by endpoint are
indexed (reference the indexes variable, idx_usage_endpoint, and the db.Exec
call).

In `@internal/usage/usage.go`:
- Around line 49-55: The RawData field in the Usage struct currently uses
map[string]any; replace it with a typed struct (e.g., RawUsageData) that
contains optional, provider-specific sub-structs/fields (e.g., OpenAI,
Anthropic, Gemini) with proper json/bson tags and pointer types to keep fields
optional, then update the Usage.RawData field signature to use this new type;
ensure all code paths that read/write Usage.RawData
(serialization/deserialization, DB persistence and any unmarshalling code) are
updated to the new struct shape and handle backward compatibility (e.g.,
implement custom UnmarshalJSON/UnmarshalBSON or a migration helper to convert
existing map payloads into the typed RawUsageData).
♻️ Duplicate comments (4)
internal/core/responses.go (1)

76-81: Avoid map[string]any in API response payloads.

Same concern as previously flagged—ResponsesUsage is part of ResponsesResponse (an API payload). Use json.RawMessage instead for type safety.

internal/server/handlers.go (3)

54-57: X-Request-ID retrieved from wrong header source.

The requestID is read from c.Response().Header().Get("X-Request-ID"), but the X-Request-ID header is set on the request by the audit log middleware. This should be c.Request().Header.Get("X-Request-ID") to match the pattern used elsewhere in the codebase.

🔧 Suggested fix
-	requestID := c.Response().Header().Get("X-Request-ID")
+	requestID := c.Request().Header.Get("X-Request-ID")

116-123: Same X-Request-ID issue as streaming path.

Line 118 retrieves requestID from response headers instead of request headers.

🔧 Suggested fix
-		requestID := c.Response().Header().Get("X-Request-ID")
+		requestID := c.Request().Header.Get("X-Request-ID")

182-189: Same X-Request-ID issue in Responses handler.

Line 184 retrieves requestID from response headers instead of request headers.

🔧 Suggested fix
-		requestID := c.Response().Header().Get("X-Request-ID")
+		requestID := c.Request().Header.Get("X-Request-ID")

Comment on lines +66 to +71
// Create the usage store based on storage type
usageStore, err := createUsageStore(store, cfg.Usage.RetentionDays)
if err != nil {
store.Close()
return nil, err
}

⚠️ Potential issue | 🟡 Minor

Error from store.Close() is silently ignored.

When createUsageStore fails, the error from store.Close() is discarded. While this is a cleanup path, logging or aggregating this error would help with debugging resource cleanup failures.

Proposed fix
 	// Create the usage store based on storage type
 	usageStore, err := createUsageStore(store, cfg.Usage.RetentionDays)
 	if err != nil {
-		store.Close()
-		return nil, err
+		closeErr := store.Close()
+		if closeErr != nil {
+			return nil, fmt.Errorf("%w (also failed to close storage: %v)", err, closeErr)
+		}
+		return nil, err
 	}
🤖 Prompt for AI Agents
In `@internal/usage/factory.go` around lines 66 - 71, When createUsageStore(store,
cfg.Usage.RetentionDays) returns an error, the subsequent call to store.Close()
currently ignores any error; change the cleanup path in the block that handles
the error to capture the result of store.Close(), then either log it (using the
existing logger in scope) or wrap/aggregate it with the original
createUsageStore error before returning so cleanup failures aren’t lost; update
the code around createUsageStore, usageStore variable handling, and the
store.Close() call to return or log both errors together.

Comment on lines +130 to +139
// Apply defaults
if storageCfg.Type == "" {
storageCfg.Type = storage.TypeSQLite
}
if storageCfg.SQLite.Path == "" {
storageCfg.SQLite.Path = ".cache/gomodel.db"
}
if storageCfg.MongoDB.Database == "" {
storageCfg.MongoDB.Database = "gomodel"
}

🧹 Nitpick | 🔵 Trivial

Consider using constants for default values.

The default values (e.g., ".cache/gomodel.db", "gomodel" for MongoDB database) are hardcoded here. If these defaults are also defined elsewhere (e.g., in the storage or config package), they could drift out of sync. Consider using shared constants or deferring to the storage package's defaults.

#!/bin/bash
# Check if these defaults are duplicated elsewhere
echo "=== Checking for duplicate default paths ==="
rg -n '\.cache/gomodel\.db|gomodel\.db' --type go
🤖 Prompt for AI Agents
In `@internal/usage/factory.go` around lines 130 - 139, The defaults for storage
type and paths are hardcoded; replace the literal values (".cache/gomodel.db"
and "gomodel") with shared constants or getters from the storage/config package
and fall back to storage package defaults where available: update the
initialization logic that sets storageCfg.Type (currently comparing to
storage.TypeSQLite), storageCfg.SQLite.Path, and storageCfg.MongoDB.Database to
use exported constants (e.g., storage.DefaultSQLitePath,
storage.DefaultMongoDatabase) or call a storage.Defaults() helper so defaults
remain single-sourced and cannot drift.

Comment on lines +58 to +70
// Create indexes for common queries
indexes := []string{
"CREATE INDEX IF NOT EXISTS idx_usage_timestamp ON usage(timestamp)",
"CREATE INDEX IF NOT EXISTS idx_usage_request_id ON usage(request_id)",
"CREATE INDEX IF NOT EXISTS idx_usage_provider_id ON usage(provider_id)",
"CREATE INDEX IF NOT EXISTS idx_usage_model ON usage(model)",
"CREATE INDEX IF NOT EXISTS idx_usage_provider ON usage(provider)",
}
for _, idx := range indexes {
if _, err := db.Exec(idx); err != nil {
slog.Warn("failed to create index", "error", err)
}
}

🧹 Nitpick | 🔵 Trivial

Consider adding an index on endpoint for completeness.

Indexes are created for timestamp, request_id, provider_id, model, and provider, but endpoint is omitted. If queries filtering by endpoint are expected (e.g., usage reports per endpoint), consider adding it.

♻️ Optional enhancement
 	indexes := []string{
 		"CREATE INDEX IF NOT EXISTS idx_usage_timestamp ON usage(timestamp)",
 		"CREATE INDEX IF NOT EXISTS idx_usage_request_id ON usage(request_id)",
 		"CREATE INDEX IF NOT EXISTS idx_usage_provider_id ON usage(provider_id)",
 		"CREATE INDEX IF NOT EXISTS idx_usage_model ON usage(model)",
 		"CREATE INDEX IF NOT EXISTS idx_usage_provider ON usage(provider)",
+		"CREATE INDEX IF NOT EXISTS idx_usage_endpoint ON usage(endpoint)",
 	}
🤖 Prompt for AI Agents
In `@internal/usage/store_sqlite.go` around lines 58 - 70, The indexes list in the
usage creation snippet (the indexes slice and subsequent loop using db.Exec)
omits an index on the endpoint column; add "CREATE INDEX IF NOT EXISTS
idx_usage_endpoint ON usage(endpoint)" to the indexes slice so the
`for _, idx := range indexes { if _, err := db.Exec(idx); err != nil { ... } }` loop will create it,
ensuring queries that filter by endpoint are indexed (reference the indexes
variable, idx_usage_endpoint, and the db.Exec call).

Comment on lines +49 to +55
// RawData contains provider-specific extended usage data (JSONB)
// Examples:
// OpenAI: {"cached_tokens": 100, "reasoning_tokens": 50}
// Anthropic: {"cache_creation_input_tokens": 200, "cache_read_input_tokens": 150}
// Gemini: {"cached_tokens": 100, "thought_tokens": 75, "tool_use_tokens": 25}
RawData map[string]any `json:"raw_data,omitempty" bson:"raw_data,omitempty"`
}

🛠️ Refactor suggestion | 🟠 Major

Prefer a typed RawData struct instead of map[string]any.

This is persisted payload data; a typed struct improves safety and aligns with the “no map[string]any for payloads” guideline. Consider a struct with optional provider-specific fields.

♻️ Possible refactor
 type UsageEntry struct {
@@
-	RawData map[string]any `json:"raw_data,omitempty" bson:"raw_data,omitempty"`
+	RawData *UsageRawData `json:"raw_data,omitempty" bson:"raw_data,omitempty"`
 }
+
+// UsageRawData captures provider-specific usage extensions.
+type UsageRawData struct {
+	CachedTokens             *int `json:"cached_tokens,omitempty" bson:"cached_tokens,omitempty"`
+	ReasoningTokens          *int `json:"reasoning_tokens,omitempty" bson:"reasoning_tokens,omitempty"`
+	CacheCreationInputTokens *int `json:"cache_creation_input_tokens,omitempty" bson:"cache_creation_input_tokens,omitempty"`
+	CacheReadInputTokens     *int `json:"cache_read_input_tokens,omitempty" bson:"cache_read_input_tokens,omitempty"`
+	ThoughtTokens            *int `json:"thought_tokens,omitempty" bson:"thought_tokens,omitempty"`
+	ToolUseTokens            *int `json:"tool_use_tokens,omitempty" bson:"tool_use_tokens,omitempty"`
+}

Based on learnings, keep payload fields strongly typed.

🤖 Prompt for AI Agents
In `@internal/usage/usage.go` around lines 49 - 55, The RawData field in the Usage
struct currently uses map[string]any; replace it with a typed struct (e.g.,
RawUsageData) that contains optional, provider-specific sub-structs/fields
(e.g., OpenAI, Anthropic, Gemini) with proper json/bson tags and pointer types
to keep fields optional, then update the Usage.RawData field signature to use
this new type; ensure all code paths that read/write Usage.RawData
(serialization/deserialization, DB persistence and any unmarshalling code) are
updated to the new struct shape and handle backward compatibility (e.g.,
implement custom UnmarshalJSON/UnmarshalBSON or a migration helper to convert
existing map payloads into the typed RawUsageData).


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/providers/responses_converter.go (1)

108-130: Extract duplicated response.done construction into a helper method.

The response.done event construction logic is duplicated between the [DONE] handling path (lines 108-130) and the EOF path (lines 174-196). This violates DRY and risks divergence during future changes.

♻️ Proposed refactor
+func (sc *OpenAIResponsesStreamConverter) buildDoneMessage() string {
+	responseData := map[string]interface{}{
+		"id":         sc.responseID,
+		"object":     "response",
+		"status":     "completed",
+		"model":      sc.model,
+		"provider":   sc.provider,
+		"created_at": time.Now().Unix(),
+	}
+	if sc.cachedUsage != nil {
+		responseData["usage"] = sc.cachedUsage
+	}
+	doneEvent := map[string]interface{}{
+		"type":     "response.done",
+		"response": responseData,
+	}
+	jsonData, err := json.Marshal(doneEvent)
+	if err != nil {
+		slog.Error("failed to marshal response.done event", "error", err, "response_id", sc.responseID)
+		return ""
+	}
+	return fmt.Sprintf("event: response.done\ndata: %s\n\ndata: [DONE]\n\n", jsonData)
+}

Then replace both duplicated blocks with:

sc.sentDone = true
doneMsg := sc.buildDoneMessage()
if doneMsg != "" {
    sc.buffer = append(sc.buffer, []byte(doneMsg)...)
}

Also applies to: 174-196

🤖 Fix all issues with AI agents
In `@.claude/settings.local.json`:
- Around line 16-18: The permission entry Bash(mongosh:*) is too broad — replace
it with a least‑privilege alternative: either change Bash(mongosh:*) to require
confirmation (move it to an "ask" scope) or restrict it to explicit
subcommands/targets (e.g., list only the exact mongosh actions needed such as
count/find/aggregate or specific database/collection identifiers) so arbitrary
shell commands cannot be executed; update the settings entry accordingly and add
a brief comment explaining why the narrower scope or "ask" mode was chosen.
♻️ Duplicate comments (8)
CLAUDE.md (1)

208-236: Consider the past feedback about clarifying Docker's optional nature.

The section provides clear manual testing instructions with correct commands and connection details. The previously raised concern about potential confusion with the "No Docker or external dependencies needed for testing" statement (line 205) remains relevant.

internal/providers/anthropic/anthropic.go (1)

447-454: Silent fallback to current time masks data issues.

Returning time.Now().Unix() on parse failure silently masks upstream data problems and can cause unexpected model ordering if models are sorted by creation time. Consider logging the error and returning 0 to make issues visible.

🔧 Suggested fix
 func parseCreatedAt(createdAt string) int64 {
 	t, err := time.Parse(time.RFC3339, createdAt)
 	if err != nil {
-		return time.Now().Unix()
+		slog.Warn("invalid model created_at timestamp", "value", createdAt, "error", err)
+		return 0
 	}
 	return t.Unix()
 }

Note: This will require updating any tests that expect non-zero values for invalid timestamps.
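As a standalone sketch of the suggested behavior (using the standard library's `slog`; the Unix value in the comment corresponds to the literal timestamp shown):

```go
package main

import (
	"fmt"
	"log/slog"
	"time"
)

// parseCreatedAt logs and returns 0 on malformed input instead of silently
// substituting the current time, per the suggested fix.
func parseCreatedAt(createdAt string) int64 {
	t, err := time.Parse(time.RFC3339, createdAt)
	if err != nil {
		slog.Warn("invalid model created_at timestamp", "value", createdAt, "error", err)
		return 0
	}
	return t.Unix()
}

func main() {
	fmt.Println(parseCreatedAt("2026-01-21T00:00:00Z")) // 1768953600
	fmt.Println(parseCreatedAt("not-a-date"))           // 0 (plus a warning on stderr)
}
```

Callers that sort models by creation time can then treat 0 as "unknown" explicitly instead of seeing a spurious "just created" ordering.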

internal/usage/stream_wrapper.go (3)

46-52: Reallocate the rolling buffer when trimming to cap memory.

Line 49 resets the buffer but keeps capacity, so a single large stream can retain a large backing array even after trimming. Re-buffer the tail into a fresh buffer instead.

♻️ Suggested fix
 		if w.buffer.Len() > SSEBufferSize {
 			// Discard old data, keep recent
 			data := w.buffer.Bytes()
-			w.buffer.Reset()
-			if _, errBuf := w.buffer.Write(data[len(data)-SSEBufferSize:]); errBuf != nil {
-				return n, errBuf
-			}
+			tail := append([]byte(nil), data[len(data)-SSEBufferSize:]...)
+			w.buffer = *bytes.NewBuffer(tail)
 		}

105-148: Use typed structs for SSE/usage payload parsing.

Parsing into map[string]interface{} forces float64 casts and bypasses compile-time checks. Define typed structs with json.RawMessage for the usage payload, then unmarshal into a typed usage struct (keeping provider-specific extras in RawData).

♻️ Suggested refactor (outline)
+type sseChunk struct {
+	ID       string          `json:"id"`
+	Model    string          `json:"model"`
+	Type     string          `json:"type"`
+	Usage    json.RawMessage `json:"usage"`
+	Response *struct {
+		ID    string          `json:"id"`
+		Model string          `json:"model"`
+		Usage json.RawMessage `json:"usage"`
+	} `json:"response,omitempty"`
+}
+
+type usagePayload struct {
+	PromptTokens     int `json:"prompt_tokens"`
+	InputTokens      int `json:"input_tokens"`
+	CompletionTokens int `json:"completion_tokens"`
+	OutputTokens     int `json:"output_tokens"`
+	TotalTokens      int `json:"total_tokens"`
+	// add provider-specific fields as needed
+}
...
-	var chunk map[string]interface{}
-	if err := json.Unmarshal(data, &chunk); err != nil {
+	var chunk sseChunk
+	if err := json.Unmarshal(data, &chunk); err != nil {
 		return nil
 	}
As per coding guidelines, avoid map-based payloads for API responses.
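A self-contained sketch of that typed approach (struct and field names follow the outline above, not the project's actual types): the envelope is decoded once, and the usage payload stays as `json.RawMessage` until a typed decode is needed.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sseChunk is a typed envelope; the usage payload is deferred as raw JSON.
type sseChunk struct {
	ID    string          `json:"id"`
	Model string          `json:"model"`
	Usage json.RawMessage `json:"usage"`
}

// usagePayload decodes the deferred usage bytes with compile-time-checked fields.
type usagePayload struct {
	PromptTokens     int `json:"prompt_tokens"`
	CompletionTokens int `json:"completion_tokens"`
	TotalTokens      int `json:"total_tokens"`
}

// parseChunk returns (nil, nil) when the chunk carries no usage data.
func parseChunk(data []byte) (*usagePayload, error) {
	var chunk sseChunk
	if err := json.Unmarshal(data, &chunk); err != nil {
		return nil, err
	}
	if len(chunk.Usage) == 0 {
		return nil, nil
	}
	var u usagePayload
	if err := json.Unmarshal(chunk.Usage, &u); err != nil {
		return nil, err
	}
	return &u, nil
}

func main() {
	raw := []byte(`{"id":"c1","model":"m","usage":{"prompt_tokens":3,"completion_tokens":5,"total_tokens":8}}`)
	u, err := parseChunk(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(u.TotalTokens) // 8
}
```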

150-168: Derive total_tokens when only input/output are present.

If total_tokens is omitted, usage is undercounted even when input/output are present. Derive totalTokens as the sum when missing.

🐛 Suggested fix
 	if v, ok := usageMap["total_tokens"].(float64); ok {
 		totalTokens = int(v)
 	}
+	if totalTokens == 0 && (inputTokens > 0 || outputTokens > 0) {
+		totalTokens = inputTokens + outputTokens
+	}
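The fallback is small enough to isolate in a helper (name illustrative), which also makes the precedence easy to test:

```go
package main

import "fmt"

// deriveTotal trusts a provider-reported total when present and only
// falls back to input+output when total_tokens was omitted.
func deriveTotal(input, output, total int) int {
	if total == 0 && (input > 0 || output > 0) {
		return input + output
	}
	return total
}

func main() {
	fmt.Println(deriveTotal(10, 20, 0))  // 30 (derived)
	fmt.Println(deriveTotal(10, 20, 31)) // 31 (provider-reported value wins)
}
```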
internal/usage/factory.go (3)

23-40: “Safe to call multiple times” can be incorrect if Logger.Close isn’t idempotent.

Line 24 claims idempotency, but if Logger.Close or Storage.Close can panic on double-close, Result.Close isn’t safe. Either remove the claim or guard with a sync.Once/flag.

🐛 Suggested fix (guard Close)
 type Result struct {
 	Logger  LoggerInterface
 	Storage storage.Storage
+	closeOnce sync.Once
 }
 
 func (r *Result) Close() error {
-	var errs []error
-	if r.Logger != nil {
-		if err := r.Logger.Close(); err != nil {
-			errs = append(errs, fmt.Errorf("logger close: %w", err))
-		}
-	}
-	if r.Storage != nil {
-		if err := r.Storage.Close(); err != nil {
-			errs = append(errs, fmt.Errorf("storage close: %w", err))
-		}
-	}
-	if len(errs) > 0 {
-		return fmt.Errorf("close errors: %w", errors.Join(errs...))
-	}
-	return nil
+	var outErr error
+	r.closeOnce.Do(func() {
+		var errs []error
+		if r.Logger != nil {
+			if err := r.Logger.Close(); err != nil {
+				errs = append(errs, fmt.Errorf("logger close: %w", err))
+			}
+		}
+		if r.Storage != nil {
+			if err := r.Storage.Close(); err != nil {
+				errs = append(errs, fmt.Errorf("storage close: %w", err))
+			}
+		}
+		if len(errs) > 0 {
+			outErr = fmt.Errorf("close errors: %w", errors.Join(errs...))
+		}
+	})
+	return outErr
 }

To verify idempotency, inspect Logger.Close:

#!/bin/bash
rg -n 'func \(.*Logger\) Close' internal/usage/logger.go -A20
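The double-close hazard and the `sync.Once` guard can be demonstrated with a stripped-down type (not the project's Logger):

```go
package main

import (
	"fmt"
	"sync"
)

// closer owns a done channel; closing it twice would panic without the guard.
type closer struct {
	done      chan struct{}
	closeOnce sync.Once
}

// Close is idempotent: only the first call closes the channel.
func (c *closer) Close() error {
	c.closeOnce.Do(func() { close(c.done) })
	return nil
}

func main() {
	c := &closer{done: make(chan struct{})}
	fmt.Println(c.Close()) // <nil>
	fmt.Println(c.Close()) // <nil> — second call is a no-op instead of a panic
}
```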

66-71: Don’t drop the cleanup error from store.Close().

If createUsageStore fails, the cleanup error is discarded. Capture and return/log it so cleanup failures aren’t lost.

🐛 Suggested fix
 	usageStore, err := createUsageStore(store, cfg.Usage.RetentionDays)
 	if err != nil {
-		store.Close()
-		return nil, err
+		closeErr := store.Close()
+		if closeErr != nil {
+			return nil, fmt.Errorf("%w (also failed to close storage: %v)", err, closeErr)
+		}
+		return nil, err
 	}

130-139: Avoid hardcoded storage defaults if shared constants exist.

The sqlite path and MongoDB database defaults are literal strings here; consider using shared defaults from the storage/config layer to avoid drift.

SantiagoDePolonia and others added 3 commits January 21, 2026 17:40
The NotFound case in ParseProviderError was dropping the original error,
losing debugging context. Other error branches preserve this information.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add context-based request ID passing via core.WithRequestID/GetRequestID
- Fix middleware to set X-Request-ID on both request and response headers
- Update handlers to enrich context with request ID before calling providers
- Forward request ID to all providers with provider-specific header names:
  - OpenAI, Groq, xAI: X-Request-ID
  - Anthropic: X-Request-Id
  - Gemini: X-Goog-Request-Id

This enables end-to-end request tracing from client through GOModel to
upstream providers. Clients can override the auto-generated ID by passing
their own X-Request-ID header.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
internal/auditlog/middleware.go (1)

47-58: Request ID propagation is skipped when audit logging is disabled.

Because the middleware returns early before this block, requests can lose IDs (and usage entries will miss correlation) when AuditLogger is off. Consider generating/setting the ID before the early-return or moving it to a lightweight middleware.

🔧 Suggested adjustment
-			// Skip if logging is disabled
-			if logger == nil || !logger.Config().Enabled {
-				return next(c)
-			}
-
-			cfg := logger.Config()
-
-			start := time.Now()
-			req := c.Request()
-
-			// Generate request ID if not present
-			requestID := req.Header.Get("X-Request-ID")
+			req := c.Request()
+			// Generate request ID even when audit logging is disabled
+			requestID := req.Header.Get("X-Request-ID")
 			if requestID == "" {
 				requestID = uuid.NewString()
 			}
 			req.Header.Set("X-Request-ID", requestID)
 			c.Response().Header().Set("X-Request-ID", requestID)
+
+			// Skip if logging is disabled
+			if logger == nil || !logger.Config().Enabled {
+				return next(c)
+			}
+
+			cfg := logger.Config()
+			start := time.Now()
internal/providers/groq/groq.go (1)

138-151: Consider strongly-typed input parsing.

Per coding guidelines, request/response payloads should prefer strongly-typed structs over interface{} and map[string]interface{}. While this internal conversion logic handles dynamic ResponsesRequest.Input, the per-request type assertions add runtime overhead and reduce type safety.

If core.ResponsesRequest.Input must remain interface{} for API flexibility, consider centralizing the input parsing logic into a shared utility with well-defined intermediate types, reducing duplicated type assertions across providers.

🤖 Fix all issues with AI agents
In `@internal/providers/gemini/gemini.go`:
- Around line 78-81: The code forwards a context request ID using
core.GetRequestID and sets header via req.Header.Set("X-Goog-Request-Id", ...)
even though Gemini may not accept that header; update the behavior to either (A)
verify Gemini acceptance and keep using "X-Goog-Request-Id" if confirmed, or (B)
change to a standard tracing header by replacing the header key with
"X-Request-Id" (or add both headers) so request tracing works; modify the logic
around core.GetRequestID(req.Context()) and req.Header.Set to implement the
chosen header(s) and ensure tests/logs reflect the header name change.

In `@internal/providers/openai/openai.go`:
- Around line 61-64: Replace the incorrect header "X-Request-ID" with OpenAI's
expected "X-Client-Request-Id" when forwarding the request ID from context
(where core.GetRequestID(req.Context()) is used); additionally validate the
request ID for ASCII-only characters and a maximum length of 512 bytes and only
set the header if the value passes validation (otherwise omit it) to avoid
triggering OpenAI 400 errors.
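One possible shape for that validator, matching the limits described above (printable ASCII only, at most 512 bytes); this is a sketch, and the real constraint set should be confirmed against OpenAI's documentation:

```go
package main

import "fmt"

// isValidClientRequestID accepts only printable ASCII (0x20–0x7E) and
// rejects anything longer than 512 bytes; empty is treated as valid
// (callers simply omit the header in that case).
func isValidClientRequestID(id string) bool {
	if len(id) > 512 {
		return false
	}
	for i := 0; i < len(id); i++ {
		if id[i] < 0x20 || id[i] > 0x7E {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isValidClientRequestID("req-123"))   // true
	fmt.Println(isValidClientRequestID("req-日本語")) // false: non-ASCII bytes
}
```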
♻️ Duplicate comments (1)
internal/providers/anthropic/anthropic.go (1)

452-459: Consider logging parse errors instead of silently falling back to current time.

Returning time.Now().Unix() on parse failure masks upstream data issues and could unexpectedly affect model list ordering. Since slog is already imported, consider logging the error for observability while deciding on an appropriate fallback value.

🔧 Suggested fix
 func parseCreatedAt(createdAt string) int64 {
 	t, err := time.Parse(time.RFC3339, createdAt)
 	if err != nil {
+		slog.Warn("failed to parse model created_at timestamp", "value", createdAt, "error", err)
 		return time.Now().Unix()
 	}
 	return t.Unix()
 }

SantiagoDePolonia and others added 2 commits January 21, 2026 19:15
- Reduce from 272 to 84 lines (~69% reduction)
- Fix incorrect claim about init() registration (now explicit factory.Register())
- Fix incorrect 5-minute refresh interval (actually 1 hour default)
- Add clarification that Docker Compose is for manual testing only
- Update config section to reference .env.template and config/config.yaml

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@CLAUDE.md`:
- Line 81: Update the phrasing on the line that reads "requires initialized
registry—check `ModelCount() > 0`" to use correct grammar; for example change it
to "requires an initialized registry—check `ModelCount() > 0`" or "requires the
registry to be initialized—check `ModelCount() > 0`" so the sentence reads
naturally.
- Line 9: Replace the bare URL "https://github.com/ENTERPILOT/GOModel" in the
Module line with a markdown link; e.g., update the line containing `**Module:**
`gomodel` | **Go:** 1.24.0 | **Repo:** https://github.com/ENTERPILOT/GOModel` to
use a labeled link such as `**Module:** `gomodel` | **Go:** 1.24.0 | **Repo:**
[GOModel](https://github.com/ENTERPILOT/GOModel)` so the URL is formatted as a
proper markdown link and avoids linter warnings.
- Around line 52-62: The fenced code block in the CLAUDE.md snippet (the block
beginning with ```) is missing a language specifier causing a markdown linter
warning; update that fenced block to include a language like text or plaintext
(e.g., change the opening fence from ``` to ```text) so the directory listing
block is properly marked and the linter warning is resolved.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@internal/providers/openai/openai_test.go`:
- Around line 753-809: Add tests for control characters to
TestIsValidClientRequestID by including cases like "\x00", "\t", and "\x7F"
(expecting invalid) to document handling of non-printable ASCII; and add an
integration test for setHeaders that constructs a provider via
NewWithHTTPClient, creates an http.Request with a context set via
core.WithRequestID, calls provider.setHeaders(req) and asserts the
"X-Client-Request-Id" header is set to the request ID for valid IDs and absent
for empty/invalid IDs (reference functions: isValidClientRequestID, setHeaders,
NewWithHTTPClient, core.WithRequestID, header "X-Client-Request-Id").

Comment on lines +753 to +809
func TestIsValidClientRequestID(t *testing.T) {
tests := []struct {
name string
id string
valid bool
}{
{
name: "valid UUID",
id: "123e4567-e89b-12d3-a456-426614174000",
valid: true,
},
{
name: "valid short ID",
id: "req-123",
valid: true,
},
{
name: "valid empty string",
id: "",
valid: true,
},
{
name: "valid 512 chars",
id: strings.Repeat("a", 512),
valid: true,
},
{
name: "invalid - 513 chars (too long)",
id: strings.Repeat("a", 513),
valid: false,
},
{
name: "invalid - non-ASCII character",
id: "req-123-日本語",
valid: false,
},
{
name: "invalid - emoji",
id: "req-123-🎉",
valid: false,
},
{
name: "valid - all printable ASCII",
id: "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.",
valid: true,
},
}

for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := isValidClientRequestID(tt.id)
if got != tt.valid {
t.Errorf("isValidClientRequestID(%q) = %v, want %v", tt.id, got, tt.valid)
}
})
}
}
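The boundary behavior these cases pin down can be sketched as a standalone validator. This is an assumption reconstructed from the test table (at most 512 bytes, printable ASCII only), not the repository's actual `isValidClientRequestID` implementation:

```go
package main

import "fmt"

// maxClientRequestIDLen matches the 512/513 boundary in the test cases above.
const maxClientRequestIDLen = 512

// validClientRequestID is a hypothetical reconstruction: an ID is valid when
// it is at most 512 bytes and every byte is printable ASCII (0x20–0x7E).
// This rejects control characters (\x00, \t), DEL (\x7F), and any multi-byte
// UTF-8 sequence such as 日本語 or emoji.
func validClientRequestID(id string) bool {
	if len(id) > maxClientRequestIDLen {
		return false
	}
	for i := 0; i < len(id); i++ {
		if c := id[i]; c < 0x20 || c > 0x7E {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(validClientRequestID("req-123"))    // true
	fmt.Println(validClientRequestID("\x00bad"))    // false: NUL control byte
	fmt.Println(validClientRequestID("req-日本語")) // false: multi-byte UTF-8
}
```

Iterating over the string by byte index (rather than by rune) is what makes non-ASCII input fail: each byte of a UTF-8 multi-byte sequence is above 0x7E.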

🧹 Nitpick | 🔵 Trivial

Good test coverage for the validation function.

The test cases comprehensively cover boundary conditions and invalid inputs. Consider adding test cases for control characters (e.g., "\x00", "\t", "\x7F") to document the expected behavior for these edge cases.

Additionally, consider adding an integration test for setHeaders that verifies the X-Client-Request-Id header is actually set when a valid request ID is present in the context:

🧪 Suggested test for header integration
func TestSetHeadersWithRequestID(t *testing.T) {
	provider := NewWithHTTPClient("test-api-key", nil, llmclient.Hooks{})
	
	tests := []struct {
		name      string
		requestID string
		expectSet bool
	}{
		{"valid ID", "req-123", true},
		{"empty ID", "", false},
		{"invalid non-ASCII", "req-日本語", false},
	}
	
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			req, _ := http.NewRequest(http.MethodGet, "http://example.com", nil)
			ctx := context.Background()
			if tt.requestID != "" {
				ctx = core.WithRequestID(ctx, tt.requestID)
			}
			req = req.WithContext(ctx)
			
			provider.setHeaders(req)
			
			got := req.Header.Get("X-Client-Request-Id")
			if tt.expectSet && got != tt.requestID {
				t.Errorf("X-Client-Request-Id = %q, want %q", got, tt.requestID)
			}
			if !tt.expectSet && got != "" {
				t.Errorf("X-Client-Request-Id should not be set, got %q", got)
			}
		})
	}
}

Entries are queued asynchronously; if the source map is mutated after
Write, the logged data can be corrupted. Added defensive copy via
cloneRawData helper for all three extractor functions.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
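The defensive-copy idea in that commit can be sketched as follows. The helper name `cloneRawData` comes from the commit message, but its signature and the surrounding types are assumptions for illustration:

```go
package main

import "fmt"

// cloneRawData makes a shallow copy of the usage payload before it is handed
// to the async logger. Because entries are queued and written later, copying
// at enqueue time ensures that if the caller mutates its map after Write
// returns, the logged data is unaffected. Nested values are still shared
// (shallow copy), which is sufficient when the map holds plain scalars.
func cloneRawData(src map[string]any) map[string]any {
	if src == nil {
		return nil
	}
	dst := make(map[string]any, len(src))
	for k, v := range src {
		dst[k] = v
	}
	return dst
}

func main() {
	raw := map[string]any{"prompt_tokens": 12}
	queued := cloneRawData(raw) // copy taken at enqueue time
	raw["prompt_tokens"] = 99   // caller mutates after Write returns
	fmt.Println(queued["prompt_tokens"]) // still 12
}
```

Calling the helper in each extractor before the entry is buffered is the pattern the commit describes; without it, the asynchronous writer would observe whatever state the map is in when the buffer flushes.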
@SantiagoDePolonia SantiagoDePolonia merged commit 4896e97 into main Jan 21, 2026
7 checks passed
@SantiagoDePolonia SantiagoDePolonia deleted the feature/usage-token-tracking branch March 22, 2026 14:23