
Feature: usage token tracking#43

Merged
SantiagoDePolonia merged 21 commits into main from feature/usage-token-tracking
Jan 21, 2026
Conversation


@SantiagoDePolonia SantiagoDePolonia commented Jan 21, 2026

Summary by CodeRabbit

  • New Features

    • Token usage tracking: buffered async logging, multi-backend storage (sqlite/postgres/mongo), and usage data included in streaming and final responses.
    • Dynamic model listing for a provider.
  • Bug Fixes

    • 404/Not Found responses correctly classified.
  • Configuration

    • Consolidated storage config to STORAGE_TYPE (default sqlite); new USAGE_* env vars to enable/tune usage tracking.
  • Documentation

    • Added token-usage tracking plan and updated project docs.
  • Tests

    • Extensive tests for usage extraction, stores, stream wrappers, and providers.


@SantiagoDePolonia SantiagoDePolonia self-assigned this Jan 21, 2026

coderabbitai bot commented Jan 21, 2026

📝 Walkthrough

Walkthrough

Adds a token usage tracking subsystem: new usage types, config/env, multi-backend stores (SQLite/Postgres/MongoDB), async buffered logger, SSE stream extraction/wrapping, request-ID propagation, and wiring into app/server/handlers and provider streams; also moves storage-type config to a top-level storage section.

Changes

Cohort / File(s) Summary
Configuration & Env
/.claude/settings.local.json, .env.template, config/config.go, config/config.yaml, config/config_test.go, docker-compose.yaml
Introduce STORAGE_TYPE and USAGE_* env vars; remove LOGGING_STORAGE_TYPE; add top-level storage config and Usage loading/defaults.
Core Types & Context
internal/core/types.go, internal/core/responses.go, internal/core/context.go
Add StreamOptions and RawUsage fields; add WithRequestID/GetRequestID helpers; propagate StreamOptions in WithStreaming.
Usage Core & Constants
internal/usage/usage.go, internal/usage/constants.go, internal/usage/cleanup.go
New UsageEntry, UsageStore interface, Config struct, defaults, constants, and cleanup loop.
Usage Logger & Factory
internal/usage/logger.go, internal/usage/factory.go, internal/usage/usage_test.go
Async buffered Logger, NoopLogger, LoggerInterface, Result factory (New / NewWithSharedStorage), lifecycle/close behavior.
Usage Extractors & Stream Wrapper
internal/usage/extractor.go, internal/usage/extractor_test.go, internal/usage/stream_wrapper.go, internal/usage/stream_wrapper_test.go
Extract usage from responses and SSE; StreamUsageWrapper buffers SSE and emits UsageEntry on Close; comprehensive tests.
Usage Storage Backends
internal/usage/store_sqlite.go, internal/usage/store_postgresql.go, internal/usage/store_mongodb.go
SQLite/Postgres/MongoDB stores with batching, retention/cleanup, index/TTL handling, partial-write detection.
Server & Handler Wiring
internal/app/app.go, internal/server/http.go, internal/server/handlers.go, internal/server/handlers_test.go, internal/auditlog/factory.go, internal/auditlog/middleware.go
Wire usage logger into App and server Config; NewHandler gains usageLogger; handlers wrap streaming/non-streaming flows for usage extraction/logging and propagate request IDs; middleware sets X-Request-ID on response.
Providers & Streaming
internal/providers/anthropic/..., internal/providers/responses_converter.go, internal/providers/*
Anthropic: ListModels, parse created timestamps, include usage in streaming chunks; responses_converter caches usage for final response.done; several providers forward request-id in headers; OpenAI validates client request id.
Errors & Tests
internal/core/errors.go, internal/core/errors_test.go, numerous tests under internal/usage/* and provider tests
404 provider responses map to NotFound error; tests updated; many new/updated tests for usage extraction, stream wrapper, stores, and provider changes.
Docs & Misc
docs/plans/token-usage-tracking.md, CLAUDE.md, EXAMPLES.md, .claude/settings.local.json
New design doc and docs updates; example model update; permission added for WebFetch(domain:docs.anthropic.com) and minor punctuation tweak.
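The "Usage Core & Constants" cohort introduces a `UsageEntry` record, a `UsageStore` interface, and a `Config` struct. As a rough sketch of how those pieces fit together (field and method names here are inferred from the summary, not the actual definitions in `internal/usage/usage.go`):

```go
package main

import (
	"fmt"
	"time"
)

// UsageEntry is a hypothetical mirror of the record the PR adds;
// field names are illustrative only.
type UsageEntry struct {
	RequestID    string
	Provider     string
	Model        string
	Endpoint     string
	InputTokens  int
	OutputTokens int
	TotalTokens  int
	RawData      map[string]any
	Timestamp    time.Time
}

// UsageStore is the interface the SQLite/Postgres/Mongo backends
// would satisfy (assumed signature).
type UsageStore interface {
	WriteBatch(entries []UsageEntry) error
	Close() error
}

// memStore is a trivial in-memory implementation, useful in tests.
type memStore struct{ entries []UsageEntry }

func (s *memStore) WriteBatch(entries []UsageEntry) error {
	s.entries = append(s.entries, entries...)
	return nil
}
func (s *memStore) Close() error { return nil }

func main() {
	var store UsageStore = &memStore{}
	_ = store.WriteBatch([]UsageEntry{{RequestID: "req-1", TotalTokens: 42, Timestamp: time.Now()}})
	fmt.Println("stored one entry")
}
```

Keeping the storage backends behind a single interface is what lets `STORAGE_TYPE` swap between sqlite/postgres/mongo without touching the logger.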

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Handler
    participant Provider as Provider API
    participant UsageLogger
    participant Buffer
    participant Storage

    Note over Client,Storage: Non-streaming flow (usage write)
    Client->>Handler: POST /chat/completions
    Handler->>Provider: Forward request (ctx includes request-id)
    Provider-->>Handler: Response (includes usage)
    Handler->>UsageLogger: Write(UsageEntry)
    UsageLogger->>Buffer: Enqueue entry
    alt Flush by threshold or timer
        UsageLogger->>Storage: WriteBatch(entries)
        Storage-->>UsageLogger: Ack
    end
    Handler-->>Client: Response (+ usage)
sequenceDiagram
    participant Client
    participant Handler
    participant ProviderSSE as Provider SSE
    participant StreamWrapper
    participant UsageLogger
    participant Storage

    Note over Client,Storage: Streaming flow with SSE extraction
    Client->>Handler: POST /chat/completions?stream=true
    Handler->>ProviderSSE: Open provider SSE (wrapped for usage)
    ProviderSSE-->>StreamWrapper: SSE events (data chunks)
    StreamWrapper->>Client: Relay events
    loop streaming
        ProviderSSE->>StreamWrapper: Event (may include usage JSON)
        StreamWrapper->>StreamWrapper: Buffer rolling window
    end
    StreamWrapper->>StreamWrapper: Parse buffered SSE for usage on Close
    StreamWrapper->>UsageLogger: Write(UsageEntry)
    UsageLogger->>Storage: WriteBatch(entries)
    Storage-->>UsageLogger: Ack

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

Suggested labels

enhancement

Poem

🐰 I nibble tokens late at night,

I buffer, flush, and log each bite,
SQLite, PG, Mongo keep score,
SSE crumbs tracked forevermore—
Hop! The usage logs take flight!

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 40.96%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
✅ Passed checks (2 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title 'Feature: usage token tracking' directly summarizes the main change: implementing token usage tracking functionality across the codebase.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 17

🤖 Fix all issues with AI agents
In `@docs/plans/token-usage-tracking.md`:
- Around line 12-36: Fix markdownlint errors by ensuring all fenced code blocks
are surrounded by blank lines and include language tags (e.g., ```go for the
Usage struct block), replace emphasis-used headings (like **1. Core Usage Struct
is Minimal**) with proper heading syntax (e.g., "### Core Usage Struct is
Minimal") and ensure a blank line before/after headings, and run through the
document to correct other lint rules (MD031, MD036, MD040, MD022) so headings
and code blocks follow consistent spacing and language-tag conventions; update
the examples mentioning Usage, StreamLogWrapper and audit log generation to use
proper fenced code blocks and heading markup.

In `@internal/core/errors.go`:
- Around line 158-162: The 404 branch creating NewNotFoundError(message)
currently drops the originalErr; update the case handling so after creating err
:= NewNotFoundError(message) and setting err.Provider = provider you also set
err.OriginalErr = originalErr (matching other branches) before returning to
preserve debugging context; reference NewNotFoundError, err.Provider and
originalErr in your change.

In `@internal/core/responses.go`:
- Around line 73-78: The ResponsesUsage struct currently exposes RawUsage as
map[string]any which weakens payload typing; update ResponsesUsage to replace
RawUsage with json.RawMessage (importing encoding/json) or a dedicated typed
struct to preserve strong typing for provider-specific data, and adjust any
marshaling/unmarshaling code that reads/writes ResponsesUsage (references:
ResponsesUsage, RawUsage) to handle json.RawMessage (or the new typed struct)
instead of a map while keeping the json tag `json:"raw_usage,omitempty"`.

In `@internal/providers/anthropic/anthropic.go`:
- Around line 405-426: Provider.ListModels currently fetches only the first page
(limit=1000) and ignores anthropicModelsResponse.HasMore/LastID; change
ListModels to loop calling p.client.Do repeatedly while anthropicResp.HasMore is
true, passing after_id=anthropicResp.LastID (and limit=1000) on subsequent
requests, appending each page's models (convert with parseCreatedAt into
core.Model) into the same slice, and only return the full core.ModelsResponse
after no more pages remain; use the anthropicModelsResponse struct fields
HasMore and LastID to control pagination and aggregation.
- Around line 434-441: Change parseCreatedAt to log parse errors via the slog
package and return 0 instead of time.Now().Unix() so invalid timestamps are not
silently mapped to current time; specifically, in function
parseCreatedAt(createdAt string) int64 call slog.Error (or slog.With(...).Error)
with a clear message including the invalid createdAt and the parse error, then
return int64(0) on error. Also update the TestParseCreatedAt_InvalidFormat unit
test to expect 0 for invalid or unparsable timestamps.
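The pagination loop requested for `ListModels` follows a standard cursor pattern: keep calling while `HasMore` is true, passing `LastID` as `after_id`, and aggregate every page. A sketch against a fake pager (the `page`/`fetch` names are illustrative; real code would issue HTTP requests via `p.client.Do` with `limit=1000`):

```go
package main

import "fmt"

// page mirrors the anthropicModelsResponse fields used for paging.
type page struct {
	IDs     []string
	HasMore bool
	LastID  string
}

// fetch abstracts one API call: return the page starting after afterID.
type fetch func(afterID string) page

// listAll loops while HasMore is true, passing LastID as after_id,
// and aggregates every page into one slice.
func listAll(do fetch) []string {
	var all []string
	afterID := ""
	for {
		p := do(afterID)
		all = append(all, p.IDs...)
		if !p.HasMore {
			return all
		}
		afterID = p.LastID
	}
}

func main() {
	pages := map[string]page{
		"":  {IDs: []string{"a", "b"}, HasMore: true, LastID: "b"},
		"b": {IDs: []string{"c"}, HasMore: false},
	}
	models := listAll(func(after string) page { return pages[after] })
	fmt.Println(models) // [a b c]
}
```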

In `@internal/server/handlers.go`:
- Around line 55-56: The request ID is being read from the response headers
(using c.Response().Header().Get("X-Request-ID")) which is incorrect; change all
occurrences to read from the incoming request headers (use
c.Request().Header.Get("X-Request-ID")) so the value set by the audit log
middleware is returned—update the occurrences around the requestID variable in
handlers.go (and the similar spots referenced at the other two occurrences) to
use c.Request().Header.Get instead of c.Response().Header().Get.

In `@internal/usage/cleanup.go`:
- Around line 8-25: Extract the duplicate RunCleanupLoop and CleanupInterval
into a shared utility function that accepts the interval as a parameter (e.g.,
create common.RunCleanupLoop(interval time.Duration, stop <-chan struct{},
cleanupFn func())), remove the local constant/duplicate RunCleanupLoop in both
packages, and update the callers in usage and auditlog to call
common.RunCleanupLoop(CleanupInterval, stop, cleanupFn) (or pass their existing
interval constant) so both packages use the single shared implementation; ensure
you keep the initial immediate cleanup call and ticker.Stop defer behavior in
the new common function.

In `@internal/usage/extractor.go`:
- Around line 31-34: The code assigns resp.Usage.RawUsage directly to
entry.RawData which can lead to races if the source map is mutated after Write
because entries are enqueued asynchronously; fix by making a defensive deep copy
of the map before assigning (for every occurrence where resp.Usage.RawUsage ->
entry.RawData is set, including the occurrences around the blocks at lines
similar to the ones shown and the other spots referenced) so entry.RawData owns
its own map instance rather than referencing the original resp.Usage.RawUsage.
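The defensive copy can be as simple as `maps.Clone` (Go 1.21+). Note this is a shallow copy: if `RawUsage` can hold nested maps, those would need copying too. A sketch:

```go
package main

import (
	"fmt"
	"maps"
)

// cloneRaw makes the entry own its map so later mutation of the
// provider response can't race with the async writer. maps.Clone is
// shallow; nested maps would still be shared.
func cloneRaw(src map[string]any) map[string]any {
	if src == nil {
		return nil
	}
	return maps.Clone(src)
}

func main() {
	raw := map[string]any{"cached_tokens": 12}
	entryRaw := cloneRaw(raw)
	raw["cached_tokens"] = 999 // caller mutates after Write returns

	fmt.Println(entryRaw["cached_tokens"]) // 12: the entry is unaffected
}
```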

In `@internal/usage/logger.go`:
- Around line 51-139: The Write path can still panic if l.buffer is closed
concurrently; make Close idempotent and block until in-flight Write calls finish
by adding a write-tracking WaitGroup and a close guard: introduce e.g.
l.writesWg (sync.WaitGroup) and l.closeOnce (sync.Once) or l.closeMu
(sync.Mutex)/RWMutex, update Write to take a read lock or check under the close
guard, call l.writesWg.Add(1) before attempting to send and defer
l.writesWg.Done(); if closed return without sending; update Close to use the
close guard or closeOnce to only close l.done once, set l.closed, then wait for
l.writesWg.Wait() and then wait for l.wg.Wait() (flushLoop) before closing the
store; reference functions/fields: Write, Close, flushLoop, l.closed, l.buffer,
l.done, l.wg, and add l.writesWg and l.closeOnce/closeMu as needed.
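The suggested guard pattern (read-locked `Write`, `sync.Once`-protected `Close`, and a WaitGroup for in-flight writes) can be sketched independently of the real logger; all names below are illustrative:

```go
package main

import (
	"fmt"
	"sync"
)

// Logger sketches the close guard: Write checks closed under a read
// lock before sending; Close flips the flag under the write lock,
// waits for in-flight writes, and closes done exactly once.
type Logger struct {
	buffer    chan int
	done      chan struct{}
	closed    bool
	closeMu   sync.RWMutex
	closeOnce sync.Once
	writesWg  sync.WaitGroup
}

func NewLogger() *Logger {
	return &Logger{buffer: make(chan int, 64), done: make(chan struct{})}
}

func (l *Logger) Write(v int) {
	l.closeMu.RLock()
	defer l.closeMu.RUnlock()
	if l.closed {
		return // never send on a closed logger
	}
	l.writesWg.Add(1)
	defer l.writesWg.Done()
	select {
	case l.buffer <- v:
	default: // drop when full
	}
}

func (l *Logger) Close() {
	l.closeOnce.Do(func() {
		l.closeMu.Lock()
		l.closed = true
		l.closeMu.Unlock()
		l.writesWg.Wait() // let in-flight Writes finish first
		close(l.done)
	})
}

func main() {
	l := NewLogger()
	l.Write(1)
	l.Close()
	l.Close()  // idempotent: no double-close panic
	l.Write(2) // no panic: returns without sending
	fmt.Println("buffered:", len(l.buffer))
}
```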

In `@internal/usage/store_mongodb.go`:
- Around line 123-145: The error handling after s.collection.InsertMany should
use errors.As to detect a wrapped mongo.BulkWriteException instead of a direct
type assertion; update the block that currently does "if bulkErr, ok :=
err.(mongo.BulkWriteException); ok { ... }" to use var bulkErr
mongo.BulkWriteException and errors.As(err, &bulkErr) so partial write detection
and logging (slog.Warn, usagePartialWriteFailures.Inc, and returning
&PartialWriteError{... Cause: bulkErr}) still occur for wrapped errors, and add
the "errors" import if missing; apply the same pattern used in
internal/auditlog/store_mongodb.go for consistency.

In `@internal/usage/store_postgresql.go`:
- Around line 67-76: The PostgreSQLStore currently always allocates stopCleanup
even when retentionDays <= 0; change the constructor to only create and assign
stopCleanup when retentionDays > 0 and start the goroutine, otherwise leave
stopCleanup nil (or zero value). Locate the PostgreSQLStore struct instantiation
(PostgreSQLStore) and the RunCleanupLoop call and condition on retentionDays to
perform both the make(chan struct{}) assignment and the go RunCleanupLoop(...)
invocation only when retentionDays > 0 so the unused channel isn't allocated.
- Around line 123-159: writeBatchLarge currently opens a transaction but still
calls tx.Exec per row (tx.Exec loop), which negates batch performance and also
collects insert errors yet proceeds to tx.Commit; change it to perform a true
batch insert using pgx.Batch (create a pgx.Batch, Queue the INSERT for each
UsageEntry, execute via s.pool.SendBatch or tx.SendBatch and read results) or
use tx.CopyFrom with pgx.CopyFromRows to insert all rows efficiently, and ensure
transactional semantics: if any batch/CopyFrom error occurs, call tx.Rollback
and return the error without committing (remove committing when errs were
collected), and replace the per-row error collection logic with handling the
batch/CopyFrom error result. Reference symbols: writeBatchLarge, tx.Exec
(remove), pgx.Batch / tx.CopyFrom, tx.Commit, tx.Rollback.

In `@internal/usage/store_sqlite.go`:
- Around line 167-171: The cleanup query uses time.RFC3339 but entries are
written using time.RFC3339Nano (causing incorrect string comparisons); update
the cutoff generation in the cleanup path (where cutoff is set before
s.db.Exec("DELETE FROM usage WHERE timestamp < ?", cutoff)) to use
time.RFC3339Nano instead of time.RFC3339 so the stored timestamps and the cutoff
use the same nanosecond precision.

In `@internal/usage/stream_wrapper.go`:
- Around line 41-52: The rolling buffer in the Write path currently calls
w.buffer.Reset() which doesn't shrink the backing array, so when trimming to
SSEBufferSize you must reallocate a new smaller buffer to cap memory; modify the
trim branch (where w.buffer.Len() > SSEBufferSize and you compute data :=
w.buffer.Bytes()) to create a fresh buffer of size SSEBufferSize (or a
bytes.NewBuffer with the sliced tail) and replace w.buffer with it (or copy the
tail into a new bytes.Buffer) instead of Reset()+Write(), ensuring capacity is
limited while preserving the last SSEBufferSize bytes.
- Around line 132-191: The code currently leaves totalTokens at 0 when
"total_tokens" is missing, undercounting usage; update the logic after
extracting inputTokens, outputTokens, totalTokens from usageMap to set
totalTokens = inputTokens + outputTokens when totalTokens == 0 and (inputTokens
> 0 || outputTokens > 0). Keep the existing rawData handling and call to
ExtractFromSSEUsage(providerID, inputTokens, outputTokens, totalTokens, rawData,
w.requestID, model, w.provider, w.endpoint) so the derived totalTokens is passed
through.
- Around line 104-130: The extractUsageFromJSON function in StreamUsageWrapper
currently unmarshals into map[string]interface{}; change it to unmarshal into
strongly-typed structs (e.g., SSEChunk struct with ID, Model, Usage
json.RawMessage fields and a UsagePayload struct with Token fields as
json.Number or *int64) inside StreamUsageWrapper.extractUsageFromJSON, then
decode UsagePayload to populate and return a UsageEntry (preserve any
provider-specific remaining data into UsageEntry.RawData as raw JSON); ensure
model override logic still uses SSEChunk.Model when present and convert
json.Number to int safely for token counts.
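The typed-decode and derived-total suggestions above combine naturally: decode the chunk into small structs with `json.RawMessage` for the usage payload, then fill in `total_tokens` when the provider omitted it. A sketch (the `SSEChunk`/`UsagePayload` shapes follow an OpenAI-style payload and are assumptions, not the project's actual structs):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// UsagePayload uses pointers so "absent" is distinguishable from zero.
type UsagePayload struct {
	PromptTokens     *int64 `json:"prompt_tokens"`
	CompletionTokens *int64 `json:"completion_tokens"`
	TotalTokens      *int64 `json:"total_tokens"`
}

type SSEChunk struct {
	ID    string          `json:"id"`
	Model string          `json:"model"`
	Usage json.RawMessage `json:"usage"`
}

func val(p *int64) int64 {
	if p == nil {
		return 0
	}
	return *p
}

// extractUsage decodes the chunk, then derives total_tokens when the
// provider omitted it.
func extractUsage(data []byte) (model string, in, out, total int64, err error) {
	var chunk SSEChunk
	if err = json.Unmarshal(data, &chunk); err != nil {
		return
	}
	var u UsagePayload
	if err = json.Unmarshal(chunk.Usage, &u); err != nil {
		return
	}
	in, out, total = val(u.PromptTokens), val(u.CompletionTokens), val(u.TotalTokens)
	if total == 0 && (in > 0 || out > 0) {
		total = in + out // derive when total_tokens is missing
	}
	return chunk.Model, in, out, total, nil
}

func main() {
	data := []byte(`{"id":"c1","model":"gpt-x","usage":{"prompt_tokens":10,"completion_tokens":5}}`)
	model, in, out, total, _ := extractUsage(data)
	fmt.Println(model, in, out, total) // gpt-x 10 5 15
}
```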

In `@internal/usage/usage_test.go`:
- Around line 44-85: Replace the fixed time.Sleep in TestLogger with a polling
loop that repeatedly checks store.getEntries() until the expected 5 entries
appear or a timeout elapses; locate the TestLogger test that constructs logger
via NewLogger, writes entries with logger.Write and currently does
time.Sleep(200*time.Millisecond), and change it to poll (using time.After for
overall timeout and time.Tick or short sleeps) to assert len(entries)==5 or fail
if timeout reached, then continue to Close() the logger and verify store.closed
as before.

Comment on lines +434 to +441
// parseCreatedAt parses an RFC3339 timestamp string to Unix timestamp
func parseCreatedAt(createdAt string) int64 {
	t, err := time.Parse(time.RFC3339, createdAt)
	if err != nil {
		return time.Now().Unix()
	}
	return t.Unix()
}

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Scripts executed (repository: ENTERPILOT/GOModel):

# First, let's examine the file structure and the function in question
wc -l internal/providers/anthropic/anthropic.go

# Read the specific section around lines 434-441
sed -n '430,445p' internal/providers/anthropic/anthropic.go

# Check if slog is imported in the file
rg "import|slog" internal/providers/anthropic/anthropic.go | head -30

# Search for usages of parseCreatedAt function
rg "parseCreatedAt" internal/providers/anthropic/

# Get broader context of the anthropic.go file to understand the module and imports
head -50 internal/providers/anthropic/anthropic.go

# Find and read the parseCreatedAt test
rg -A 15 "parseCreatedAt.*invalid" internal/providers/anthropic/anthropic_test.go

# Get the full test function for parseCreatedAt
rg -B 5 -A 20 "func.*Test.*parseCreatedAt" internal/providers/anthropic/anthropic_test.go

# Check the context where parseCreatedAt is called in anthropic.go
rg -B 5 -A 5 "created := parseCreatedAt" internal/providers/anthropic/anthropic.go

# Check whether there are more comprehensive test cases for parseCreatedAt
rg -B 10 -A 30 "func Test.*parseCreatedAt\|parseCreatedAt\(\"" internal/providers/anthropic/anthropic_test.go | head -60

# Search more broadly for test functions around parseCreatedAt
rg "parseCreatedAt\|func Test" internal/providers/anthropic/anthropic_test.go | head -20

# Look at the test file around where parseCreatedAt is tested
rg -B 20 'parseCreatedAt\("invalid' internal/providers/anthropic/anthropic_test.go

# See more context around the test to understand the full test setup
rg -B 40 "TestParseCreatedAt_InvalidFormat" internal/providers/anthropic/anthropic_test.go | tail -50

# Check how the Created field is used in callers of ListModels or core.Model
rg "\.Created|Model\{" internal/providers/anthropic/anthropic.go | head -10

# Check the core.Model structure to understand the Created field
rg -A 10 "type Model struct" internal/core/

# Search for how Created timestamp is used in the codebase
rg "\.Created" --type go | head -20


Log invalid timestamps and return 0 instead of silently mapping to current time.

Returning time.Now().Unix() on parse errors masks upstream data issues and can unexpectedly reorder model lists. The suggested change uses the already-imported slog package to log the error and return 0 for invalid timestamps.

Note: This will require updating the TestParseCreatedAt_InvalidFormat test that currently expects non-zero values for invalid dates.

🔧 Suggested fix
 func parseCreatedAt(createdAt string) int64 {
 	t, err := time.Parse(time.RFC3339, createdAt)
 	if err != nil {
-		return time.Now().Unix()
+		slog.Warn("invalid model created_at", "value", createdAt, "error", err)
+		return 0
 	}
 	return t.Unix()
 }
🤖 Prompt for AI Agents
In `@internal/providers/anthropic/anthropic.go` around lines 434 - 441, Change
parseCreatedAt to log parse errors via the slog package and return 0 instead of
time.Now().Unix() so invalid timestamps are not silently mapped to current time;
specifically, in function parseCreatedAt(createdAt string) int64 call slog.Error
(or slog.With(...).Error) with a clear message including the invalid createdAt
and the parse error, then return int64(0) on error. Also update the
TestParseCreatedAt_InvalidFormat unit test to expect 0 for invalid or unparsable
timestamps.

Comment on lines +132 to +191
	var inputTokens, outputTokens, totalTokens int
	rawData := make(map[string]any)

	// Extract standard fields
	if v, ok := usageMap["prompt_tokens"].(float64); ok {
		inputTokens = int(v)
	}
	if v, ok := usageMap["input_tokens"].(float64); ok {
		inputTokens = int(v)
	}
	if v, ok := usageMap["completion_tokens"].(float64); ok {
		outputTokens = int(v)
	}
	if v, ok := usageMap["output_tokens"].(float64); ok {
		outputTokens = int(v)
	}
	if v, ok := usageMap["total_tokens"].(float64); ok {
		totalTokens = int(v)
	}

	// Extract extended usage data (provider-specific)
	extendedFields := []string{
		"cached_tokens", "reasoning_tokens", // OpenAI
		"cache_creation_input_tokens", "cache_read_input_tokens", // Anthropic
		"thought_tokens", "tool_use_tokens", // Gemini
		"image_tokens", // xAI
	}
	for _, field := range extendedFields {
		if v, ok := usageMap[field].(float64); ok && v > 0 {
			rawData[field] = int(v)
		}
	}

	// Also check for nested prompt_tokens_details and completion_tokens_details (OpenAI)
	if details, ok := usageMap["prompt_tokens_details"].(map[string]interface{}); ok {
		for k, v := range details {
			if fv, ok := v.(float64); ok && fv > 0 {
				rawData["prompt_"+k] = int(fv)
			}
		}
	}
	if details, ok := usageMap["completion_tokens_details"].(map[string]interface{}); ok {
		for k, v := range details {
			if fv, ok := v.(float64); ok && fv > 0 {
				rawData["completion_"+k] = int(fv)
			}
		}
	}

	// Only create entry if we found some usage data
	if inputTokens > 0 || outputTokens > 0 || totalTokens > 0 {
		if len(rawData) == 0 {
			rawData = nil
		}
		return ExtractFromSSEUsage(
			providerID,
			inputTokens, outputTokens, totalTokens,
			rawData,
			w.requestID, model, w.provider, w.endpoint,
		)

⚠️ Potential issue | 🟠 Major

Derive total_tokens when only input/output are present.

If a provider omits total_tokens, the current logic emits TotalTokens=0 even when input/output are set, which undercounts usage. Consider deriving totalTokens from input/output when missing.

🐛 Suggested fix
 	if v, ok := usageMap["total_tokens"].(float64); ok {
 		totalTokens = int(v)
 	}
 
+	if totalTokens == 0 && (inputTokens > 0 || outputTokens > 0) {
+		totalTokens = inputTokens + outputTokens
+	}
+
 	// Only create entry if we found some usage data
🤖 Prompt for AI Agents
In `@internal/usage/stream_wrapper.go` around lines 132 - 191, The code currently
leaves totalTokens at 0 when "total_tokens" is missing, undercounting usage;
update the logic after extracting inputTokens, outputTokens, totalTokens from
usageMap to set totalTokens = inputTokens + outputTokens when totalTokens == 0
and (inputTokens > 0 || outputTokens > 0). Keep the existing rawData handling
and call to ExtractFromSSEUsage(providerID, inputTokens, outputTokens,
totalTokens, rawData, w.requestID, model, w.provider, w.endpoint) so the derived
totalTokens is passed through.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/auditlog/middleware.go (1)

173-184: Unbounded single write may exceed MaxBodyCapture.

The current logic only checks if the buffer length is below MaxBodyCapture before writing, but writes the entire slice b regardless of its size. A single large write could significantly exceed the intended limit.

♻️ Proposed fix to enforce strict size limit
 func (r *responseBodyCapture) Write(b []byte) (int, error) {
 	// Write to the capture buffer (limit to MaxBodyCapture to avoid memory issues)
 	if r.body.Len() < int(MaxBodyCapture) {
-		r.body.Write(b)
-		// Check if we just hit the limit
-		if r.body.Len() >= int(MaxBodyCapture) {
+		remaining := int(MaxBodyCapture) - r.body.Len()
+		if len(b) <= remaining {
+			r.body.Write(b)
+		} else {
+			r.body.Write(b[:remaining])
 			r.truncated = true
 		}
 	}
 	// Write to the original response writer
 	return r.ResponseWriter.Write(b)
 }

SantiagoDePolonia and others added 4 commits January 21, 2026 15:52

Add a new configuration flag that automatically adds stream_options:
{"include_usage": true} to streaming requests for OpenAI-compatible
providers. This ensures usage data is always returned in streaming
responses when the flag is enabled (default: true).

Changes:
- Add StreamOptions struct to core/types.go for stream_options param
- Add StreamOptions field to ChatRequest and ResponsesRequest
- Update WithStreaming() methods to preserve StreamOptions
- Add EnforceReturningUsageData to usage.Config and config.UsageConfig
- Update handlers to set StreamOptions when flag is enabled
- Document ENFORCE_RETURNING_USAGE_DATA in .env.template

Co-authored-by: Claude <noreply@anthropic.com>
Use time.After for timeout and time.Ticker for polling instead of
a fixed time.Sleep, making the test more reliable and faster.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
docker-compose.yaml (1)

17-24: Align the storage backend comment with the active setting.

The block still says PostgreSQL for audit logs while STORAGE_TYPE=mongodb is active. That mismatch is confusing for operators—either update the comment or switch the active backend to match.

✏️ Suggested clarification
-      # Storage configuration (PostgreSQL for audit logs)
+      # Storage configuration (audit/usage storage backend)
...
-      # - STORAGE_TYPE=postgresql
-      - STORAGE_TYPE=mongodb
+      # - STORAGE_TYPE=postgresql  # use PostgreSQL backend
+      - STORAGE_TYPE=mongodb       # active backend
config/config.go (1)

266-301: Env var name for enforce-usage is inconsistent with USAGE_* naming.

ENFORCE_RETURNING_USAGE_DATA skips the USAGE_ prefix, so USAGE_ENFORCE_RETURNING_USAGE_DATA won’t be honored in either the legacy env path or the override path. Align it with the rest of the usage env vars.

🛠️ Proposed fix
-				EnforceReturningUsageData: getEnvBoolOrDefault("ENFORCE_RETURNING_USAGE_DATA", true),
+				EnforceReturningUsageData: getEnvBoolOrDefault("USAGE_ENFORCE_RETURNING_USAGE_DATA", true),
@@
-	if enforceUsage := os.Getenv("ENFORCE_RETURNING_USAGE_DATA"); enforceUsage != "" {
+	if enforceUsage := os.Getenv("USAGE_ENFORCE_RETURNING_USAGE_DATA"); enforceUsage != "" {

Also applies to: 416-437

🤖 Fix all issues with AI agents
In `@CLAUDE.md`:
- Around line 208-236: Add a single clarifying sentence to the "Manual Testing
with Docker" section stating that Docker Compose is optional and intended only
for manual storage-backend validation—not for running automated E2E or unit
tests—and that automated E2E tests must remain in-process without Docker; insert
this sentence near the examples that show STORAGE_TYPE, POSTGRES_URL and
MONGODB_URL so readers understand the distinction between manual validation and
in-process test suites.

In `@FUTURE_ARCHITECTURE.md`:
- Line 1: Add a top-level heading and specify the fenced-code language: insert a
heading "Future Architecture" at the top of FUTURE_ARCHITECTURE.md and change
the opening fence from an untyped "```" to "```text" and replace the malformed
closing "-```" with a proper closing "```"; apply the same change wherever an
untyped or malformed fenced block occurs (e.g., the other occurrence referenced
as "123-123").

In `@internal/core/types.go`:
- Around line 59-64: The Usage struct exposes RawUsage as map[string]any which
weakens API payload typing; change RawUsage to json.RawMessage to preserve
provider-specific raw bytes while keeping strong typing, update the Usage type
declaration (field RawUsage) to use json.RawMessage and ensure encoding/json is
imported, and then update any functions that marshal/unmarshal or assign to
Usage.RawUsage to handle raw JSON (e.g., pass through or
json.Marshal/json.RawMessage conversions) so existing
serialization/deserialization behavior is preserved.

In `@internal/usage/factory.go`:
- Around line 130-139: The defaults for storage type and paths are hardcoded;
replace the literal values (".cache/gomodel.db" and "gomodel") with shared
constants or getters from the storage/config package and fall back to storage
package defaults where available: update the initialization logic that sets
storageCfg.Type (currently comparing to storage.TypeSQLite),
storageCfg.SQLite.Path, and storageCfg.MongoDB.Database to use exported
constants (e.g., storage.DefaultSQLitePath, storage.DefaultMongoDatabase) or
call a storage.Defaults() helper so defaults remain single-sourced and cannot
drift.
- Around line 66-71: When createUsageStore(store, cfg.Usage.RetentionDays)
returns an error, the subsequent call to store.Close() currently ignores any
error; change the cleanup path in the block that handles the error to capture
the result of store.Close(), then either log it (using the existing logger in
scope) or wrap/aggregate it with the original createUsageStore error before
returning so cleanup failures aren’t lost; update the code around
createUsageStore, usageStore variable handling, and the store.Close() call to
return or log both errors together.
- Around line 23-41: The "Safe to call multiple times" comment is incorrect
because Logger.Close calls close(l.done) and will panic on a second call; either
remove or correct the comment or make Close idempotent: add a guarded closed
flag and mutex to both Result and the Logger implementation (check the closed
flag before calling Logger.Close, Storage.Close or closing l.done, set the flag
under lock after performing close operations) so repeated calls return nil
instead of closing channels twice; update the comment accordingly or remove it
if you implement the guard.

In `@internal/usage/store_sqlite.go`:
- Around line 58-70: The indexes list in the usage creation snippet (the indexes
slice and subsequent loop using db.Exec) omits an index on the endpoint column;
add "CREATE INDEX IF NOT EXISTS idx_usage_endpoint ON usage(endpoint)" to the
indexes slice so the `for _, idx := range indexes { if _, err := db.Exec(idx); err != nil { ... } }`
loop will create it, ensuring queries that filter by endpoint are
indexed (reference the indexes variable, idx_usage_endpoint, and the db.Exec
call).

In `@internal/usage/usage.go`:
- Around line 49-55: The RawData field in the Usage struct currently uses
map[string]any; replace it with a typed struct (e.g., RawUsageData) that
contains optional, provider-specific sub-structs/fields (e.g., OpenAI,
Anthropic, Gemini) with proper json/bson tags and pointer types to keep fields
optional, then update the Usage.RawData field signature to use this new type;
ensure all code paths that read/write Usage.RawData
(serialization/deserialization, DB persistence and any unmarshalling code) are
updated to the new struct shape and handle backward compatibility (e.g.,
implement custom UnmarshalJSON/UnmarshalBSON or a migration helper to convert
existing map payloads into the typed RawUsageData).
♻️ Duplicate comments (4)
internal/core/responses.go (1)

76-81: Avoid map[string]any in API response payloads.

Same concern as previously flagged—ResponsesUsage is part of ResponsesResponse (an API payload). Use json.RawMessage instead for type safety.

internal/server/handlers.go (3)

54-57: X-Request-ID retrieved from wrong header source.

The requestID is read from c.Response().Header().Get("X-Request-ID"), but the X-Request-ID header is set on the request by the audit log middleware. This should be c.Request().Header.Get("X-Request-ID") to match the pattern used elsewhere in the codebase.

🔧 Suggested fix
-	requestID := c.Response().Header().Get("X-Request-ID")
+	requestID := c.Request().Header.Get("X-Request-ID")

116-123: Same X-Request-ID issue as streaming path.

Line 118 retrieves requestID from response headers instead of request headers.

🔧 Suggested fix
-		requestID := c.Response().Header().Get("X-Request-ID")
+		requestID := c.Request().Header.Get("X-Request-ID")

182-189: Same X-Request-ID issue in Responses handler.

Line 184 retrieves requestID from response headers instead of request headers.

🔧 Suggested fix
-		requestID := c.Response().Header().Get("X-Request-ID")
+		requestID := c.Request().Header.Get("X-Request-ID")

Comment on lines +66 to +71
// Create the usage store based on storage type
usageStore, err := createUsageStore(store, cfg.Usage.RetentionDays)
if err != nil {
store.Close()
return nil, err
}

⚠️ Potential issue | 🟡 Minor

Error from store.Close() is silently ignored.

When createUsageStore fails, the error from store.Close() is discarded. While this is a cleanup path, logging or aggregating this error would help with debugging resource cleanup failures.

Proposed fix
 	// Create the usage store based on storage type
 	usageStore, err := createUsageStore(store, cfg.Usage.RetentionDays)
 	if err != nil {
-		store.Close()
-		return nil, err
+		closeErr := store.Close()
+		if closeErr != nil {
+			return nil, fmt.Errorf("%w (also failed to close storage: %v)", err, closeErr)
+		}
+		return nil, err
 	}
🤖 Prompt for AI Agents
In `@internal/usage/factory.go` around lines 66 - 71, When createUsageStore(store,
cfg.Usage.RetentionDays) returns an error, the subsequent call to store.Close()
currently ignores any error; change the cleanup path in the block that handles
the error to capture the result of store.Close(), then either log it (using the
existing logger in scope) or wrap/aggregate it with the original
createUsageStore error before returning so cleanup failures aren’t lost; update
the code around createUsageStore, usageStore variable handling, and the
store.Close() call to return or log both errors together.

Comment on lines +130 to +139
// Apply defaults
if storageCfg.Type == "" {
storageCfg.Type = storage.TypeSQLite
}
if storageCfg.SQLite.Path == "" {
storageCfg.SQLite.Path = ".cache/gomodel.db"
}
if storageCfg.MongoDB.Database == "" {
storageCfg.MongoDB.Database = "gomodel"
}

🧹 Nitpick | 🔵 Trivial

Consider using constants for default values.

The default values (e.g., ".cache/gomodel.db", "gomodel" for MongoDB database) are hardcoded here. If these defaults are also defined elsewhere (e.g., in the storage or config package), they could drift out of sync. Consider using shared constants or deferring to the storage package's defaults.

#!/bin/bash
# Check if these defaults are duplicated elsewhere
echo "=== Checking for duplicate default paths ==="
rg -n '\.cache/gomodel\.db|gomodel\.db' --type go
🤖 Prompt for AI Agents
In `@internal/usage/factory.go` around lines 130 - 139, The defaults for storage
type and paths are hardcoded; replace the literal values (".cache/gomodel.db"
and "gomodel") with shared constants or getters from the storage/config package
and fall back to storage package defaults where available: update the
initialization logic that sets storageCfg.Type (currently comparing to
storage.TypeSQLite), storageCfg.SQLite.Path, and storageCfg.MongoDB.Database to
use exported constants (e.g., storage.DefaultSQLitePath,
storage.DefaultMongoDatabase) or call a storage.Defaults() helper so defaults
remain single-sourced and cannot drift.

Comment on lines +58 to +70
// Create indexes for common queries
indexes := []string{
"CREATE INDEX IF NOT EXISTS idx_usage_timestamp ON usage(timestamp)",
"CREATE INDEX IF NOT EXISTS idx_usage_request_id ON usage(request_id)",
"CREATE INDEX IF NOT EXISTS idx_usage_provider_id ON usage(provider_id)",
"CREATE INDEX IF NOT EXISTS idx_usage_model ON usage(model)",
"CREATE INDEX IF NOT EXISTS idx_usage_provider ON usage(provider)",
}
for _, idx := range indexes {
if _, err := db.Exec(idx); err != nil {
slog.Warn("failed to create index", "error", err)
}
}

🧹 Nitpick | 🔵 Trivial

Consider adding an index on endpoint for completeness.

Indexes are created for timestamp, request_id, provider_id, model, and provider, but endpoint is omitted. If queries filtering by endpoint are expected (e.g., usage reports per endpoint), consider adding it.

♻️ Optional enhancement
 	indexes := []string{
 		"CREATE INDEX IF NOT EXISTS idx_usage_timestamp ON usage(timestamp)",
 		"CREATE INDEX IF NOT EXISTS idx_usage_request_id ON usage(request_id)",
 		"CREATE INDEX IF NOT EXISTS idx_usage_provider_id ON usage(provider_id)",
 		"CREATE INDEX IF NOT EXISTS idx_usage_model ON usage(model)",
 		"CREATE INDEX IF NOT EXISTS idx_usage_provider ON usage(provider)",
+		"CREATE INDEX IF NOT EXISTS idx_usage_endpoint ON usage(endpoint)",
 	}
🤖 Prompt for AI Agents
In `@internal/usage/store_sqlite.go` around lines 58 - 70, The indexes list in the
usage creation snippet (the indexes slice and subsequent loop using db.Exec)
omits an index on the endpoint column; add "CREATE INDEX IF NOT EXISTS
idx_usage_endpoint ON usage(endpoint)" to the indexes slice so the
`for _, idx := range indexes { if _, err := db.Exec(idx); err != nil { ... } }` loop will create it,
ensuring queries that filter by endpoint are indexed (reference the indexes
variable, idx_usage_endpoint, and the db.Exec call).

Comment on lines +49 to +55
// RawData contains provider-specific extended usage data (JSONB)
// Examples:
// OpenAI: {"cached_tokens": 100, "reasoning_tokens": 50}
// Anthropic: {"cache_creation_input_tokens": 200, "cache_read_input_tokens": 150}
// Gemini: {"cached_tokens": 100, "thought_tokens": 75, "tool_use_tokens": 25}
RawData map[string]any `json:"raw_data,omitempty" bson:"raw_data,omitempty"`
}

🛠️ Refactor suggestion | 🟠 Major

Prefer a typed RawData struct instead of map[string]any.

This is persisted payload data; a typed struct improves safety and aligns with the “no map[string]any for payloads” guideline. Consider a struct with optional provider-specific fields.

♻️ Possible refactor
 type UsageEntry struct {
@@
-	RawData map[string]any `json:"raw_data,omitempty" bson:"raw_data,omitempty"`
+	RawData *UsageRawData `json:"raw_data,omitempty" bson:"raw_data,omitempty"`
 }
+
+// UsageRawData captures provider-specific usage extensions.
+type UsageRawData struct {
+	CachedTokens             *int `json:"cached_tokens,omitempty" bson:"cached_tokens,omitempty"`
+	ReasoningTokens          *int `json:"reasoning_tokens,omitempty" bson:"reasoning_tokens,omitempty"`
+	CacheCreationInputTokens *int `json:"cache_creation_input_tokens,omitempty" bson:"cache_creation_input_tokens,omitempty"`
+	CacheReadInputTokens     *int `json:"cache_read_input_tokens,omitempty" bson:"cache_read_input_tokens,omitempty"`
+	ThoughtTokens            *int `json:"thought_tokens,omitempty" bson:"thought_tokens,omitempty"`
+	ToolUseTokens            *int `json:"tool_use_tokens,omitempty" bson:"tool_use_tokens,omitempty"`
+}

Based on learnings, keep payload fields strongly typed.

🤖 Prompt for AI Agents
In `@internal/usage/usage.go` around lines 49 - 55, The RawData field in the Usage
struct currently uses map[string]any; replace it with a typed struct (e.g.,
RawUsageData) that contains optional, provider-specific sub-structs/fields
(e.g., OpenAI, Anthropic, Gemini) with proper json/bson tags and pointer types
to keep fields optional, then update the Usage.RawData field signature to use
this new type; ensure all code paths that read/write Usage.RawData
(serialization/deserialization, DB persistence and any unmarshalling code) are
updated to the new struct shape and handle backward compatibility (e.g.,
implement custom UnmarshalJSON/UnmarshalBSON or a migration helper to convert
existing map payloads into the typed RawUsageData).


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/providers/responses_converter.go (1)

108-130: Extract duplicated response.done construction into a helper method.

The response.done event construction logic is duplicated between the [DONE] handling path (lines 108-130) and the EOF path (lines 174-196). This violates DRY and risks divergence during future changes.

♻️ Proposed refactor
+func (sc *OpenAIResponsesStreamConverter) buildDoneMessage() string {
+	responseData := map[string]interface{}{
+		"id":         sc.responseID,
+		"object":     "response",
+		"status":     "completed",
+		"model":      sc.model,
+		"provider":   sc.provider,
+		"created_at": time.Now().Unix(),
+	}
+	if sc.cachedUsage != nil {
+		responseData["usage"] = sc.cachedUsage
+	}
+	doneEvent := map[string]interface{}{
+		"type":     "response.done",
+		"response": responseData,
+	}
+	jsonData, err := json.Marshal(doneEvent)
+	if err != nil {
+		slog.Error("failed to marshal response.done event", "error", err, "response_id", sc.responseID)
+		return ""
+	}
+	return fmt.Sprintf("event: response.done\ndata: %s\n\ndata: [DONE]\n\n", jsonData)
+}

Then replace both duplicated blocks with:

sc.sentDone = true
doneMsg := sc.buildDoneMessage()
if doneMsg != "" {
    sc.buffer = append(sc.buffer, []byte(doneMsg)...)
}

Also applies to: 174-196

🤖 Fix all issues with AI agents
In `@.claude/settings.local.json`:
- Around line 16-18: The permission entry Bash(mongosh:*) is too broad — replace
it with a least‑privilege alternative: either change Bash(mongosh:*) to require
confirmation (move it to an "ask" scope) or restrict it to explicit
subcommands/targets (e.g., list only the exact mongosh actions needed such as
count/find/aggregate or specific database/collection identifiers) so arbitrary
shell commands cannot be executed; update the settings entry accordingly and add
a brief comment explaining why the narrower scope or "ask" mode was chosen.
♻️ Duplicate comments (8)
CLAUDE.md (1)

208-236: Consider the past feedback about clarifying Docker's optional nature.

The section provides clear manual testing instructions with correct commands and connection details. The previously raised concern about potential confusion with the "No Docker or external dependencies needed for testing" statement (line 205) remains relevant.

internal/providers/anthropic/anthropic.go (1)

447-454: Silent fallback to current time masks data issues.

Returning time.Now().Unix() on parse failure silently masks upstream data problems and can cause unexpected model ordering if models are sorted by creation time. Consider logging the error and returning 0 to make issues visible.

🔧 Suggested fix
 func parseCreatedAt(createdAt string) int64 {
 	t, err := time.Parse(time.RFC3339, createdAt)
 	if err != nil {
-		return time.Now().Unix()
+		slog.Warn("invalid model created_at timestamp", "value", createdAt, "error", err)
+		return 0
 	}
 	return t.Unix()
 }

Note: This will require updating any tests that expect non-zero values for invalid timestamps.
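As a standalone sketch of the suggested behavior (using the standard library's `slog`; the Unix value in the comment corresponds to the literal timestamp shown):

```go
package main

import (
	"fmt"
	"log/slog"
	"time"
)

// parseCreatedAt logs and returns 0 on malformed input instead of silently
// substituting the current time, per the suggested fix.
func parseCreatedAt(createdAt string) int64 {
	t, err := time.Parse(time.RFC3339, createdAt)
	if err != nil {
		slog.Warn("invalid model created_at timestamp", "value", createdAt, "error", err)
		return 0
	}
	return t.Unix()
}

func main() {
	fmt.Println(parseCreatedAt("2026-01-21T00:00:00Z")) // 1768953600
	fmt.Println(parseCreatedAt("not-a-date"))           // 0 (plus a warning on stderr)
}
```

Callers that sort models by creation time can then treat 0 as "unknown" explicitly instead of seeing a spurious "just created" ordering.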

internal/usage/stream_wrapper.go (3)

46-52: Reallocate the rolling buffer when trimming to cap memory.

Line 49 resets the buffer but keeps capacity, so a single large stream can retain a large backing array even after trimming. Re-buffer the tail into a fresh buffer instead.

♻️ Suggested fix
 		if w.buffer.Len() > SSEBufferSize {
 			// Discard old data, keep recent
 			data := w.buffer.Bytes()
-			w.buffer.Reset()
-			if _, errBuf := w.buffer.Write(data[len(data)-SSEBufferSize:]); errBuf != nil {
-				return n, errBuf
-			}
+			tail := append([]byte(nil), data[len(data)-SSEBufferSize:]...)
+			w.buffer = *bytes.NewBuffer(tail)
 		}

105-148: Use typed structs for SSE/usage payload parsing.

Parsing into map[string]interface{} forces float64 casts and bypasses compile-time checks. Define typed structs with json.RawMessage for the usage payload, then unmarshal into a typed usage struct (keeping provider-specific extras in RawData).

♻️ Suggested refactor (outline)
+type sseChunk struct {
+	ID       string          `json:"id"`
+	Model    string          `json:"model"`
+	Type     string          `json:"type"`
+	Usage    json.RawMessage `json:"usage"`
+	Response *struct {
+		ID    string          `json:"id"`
+		Model string          `json:"model"`
+		Usage json.RawMessage `json:"usage"`
+	} `json:"response,omitempty"`
+}
+
+type usagePayload struct {
+	PromptTokens     int `json:"prompt_tokens"`
+	InputTokens      int `json:"input_tokens"`
+	CompletionTokens int `json:"completion_tokens"`
+	OutputTokens     int `json:"output_tokens"`
+	TotalTokens      int `json:"total_tokens"`
+	// add provider-specific fields as needed
+}
...
-	var chunk map[string]interface{}
-	if err := json.Unmarshal(data, &chunk); err != nil {
+	var chunk sseChunk
+	if err := json.Unmarshal(data, &chunk); err != nil {
 		return nil
 	}
As per coding guidelines, avoid map-based payloads for API responses.
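A self-contained sketch of that typed approach (struct and field names follow the outline above, not the project's actual types): the envelope is decoded once, and the usage payload stays as `json.RawMessage` until a typed decode is needed.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sseChunk is a typed envelope; the usage payload is deferred as raw JSON.
type sseChunk struct {
	ID    string          `json:"id"`
	Model string          `json:"model"`
	Usage json.RawMessage `json:"usage"`
}

// usagePayload decodes the deferred usage bytes with compile-time-checked fields.
type usagePayload struct {
	PromptTokens     int `json:"prompt_tokens"`
	CompletionTokens int `json:"completion_tokens"`
	TotalTokens      int `json:"total_tokens"`
}

// parseChunk returns (nil, nil) when the chunk carries no usage data.
func parseChunk(data []byte) (*usagePayload, error) {
	var chunk sseChunk
	if err := json.Unmarshal(data, &chunk); err != nil {
		return nil, err
	}
	if len(chunk.Usage) == 0 {
		return nil, nil
	}
	var u usagePayload
	if err := json.Unmarshal(chunk.Usage, &u); err != nil {
		return nil, err
	}
	return &u, nil
}

func main() {
	raw := []byte(`{"id":"c1","model":"m","usage":{"prompt_tokens":3,"completion_tokens":5,"total_tokens":8}}`)
	u, err := parseChunk(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(u.TotalTokens) // 8
}
```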

150-168: Derive total_tokens when only input/output are present.

If total_tokens is omitted, usage is undercounted even when input/output are present. Derive totalTokens as the sum when missing.

🐛 Suggested fix
 	if v, ok := usageMap["total_tokens"].(float64); ok {
 		totalTokens = int(v)
 	}
+	if totalTokens == 0 && (inputTokens > 0 || outputTokens > 0) {
+		totalTokens = inputTokens + outputTokens
+	}
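The fallback is small enough to isolate in a helper (name illustrative), which also makes the precedence easy to test:

```go
package main

import "fmt"

// deriveTotal trusts a provider-reported total when present and only
// falls back to input+output when total_tokens was omitted.
func deriveTotal(input, output, total int) int {
	if total == 0 && (input > 0 || output > 0) {
		return input + output
	}
	return total
}

func main() {
	fmt.Println(deriveTotal(10, 20, 0))  // 30 (derived)
	fmt.Println(deriveTotal(10, 20, 31)) // 31 (provider-reported value wins)
}
```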
internal/usage/factory.go (3)

23-40: “Safe to call multiple times” can be incorrect if Logger.Close isn’t idempotent.

Line 24 claims idempotency, but if Logger.Close or Storage.Close can panic on double-close, Result.Close isn’t safe. Either remove the claim or guard with a sync.Once/flag.

🐛 Suggested fix (guard Close)
 type Result struct {
 	Logger  LoggerInterface
 	Storage storage.Storage
+	closeOnce sync.Once
 }
 
 func (r *Result) Close() error {
-	var errs []error
-	if r.Logger != nil {
-		if err := r.Logger.Close(); err != nil {
-			errs = append(errs, fmt.Errorf("logger close: %w", err))
-		}
-	}
-	if r.Storage != nil {
-		if err := r.Storage.Close(); err != nil {
-			errs = append(errs, fmt.Errorf("storage close: %w", err))
-		}
-	}
-	if len(errs) > 0 {
-		return fmt.Errorf("close errors: %w", errors.Join(errs...))
-	}
-	return nil
+	var outErr error
+	r.closeOnce.Do(func() {
+		var errs []error
+		if r.Logger != nil {
+			if err := r.Logger.Close(); err != nil {
+				errs = append(errs, fmt.Errorf("logger close: %w", err))
+			}
+		}
+		if r.Storage != nil {
+			if err := r.Storage.Close(); err != nil {
+				errs = append(errs, fmt.Errorf("storage close: %w", err))
+			}
+		}
+		if len(errs) > 0 {
+			outErr = fmt.Errorf("close errors: %w", errors.Join(errs...))
+		}
+	})
+	return outErr
 }

To verify idempotency, inspect Logger.Close:

#!/bin/bash
rg -n 'func \(.*Logger\) Close' internal/usage/logger.go -A20
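The double-close hazard and the `sync.Once` guard can be demonstrated with a stripped-down type (not the project's Logger):

```go
package main

import (
	"fmt"
	"sync"
)

// closer owns a done channel; closing it twice would panic without the guard.
type closer struct {
	done      chan struct{}
	closeOnce sync.Once
}

// Close is idempotent: only the first call closes the channel.
func (c *closer) Close() error {
	c.closeOnce.Do(func() { close(c.done) })
	return nil
}

func main() {
	c := &closer{done: make(chan struct{})}
	fmt.Println(c.Close()) // <nil>
	fmt.Println(c.Close()) // <nil> — second call is a no-op instead of a panic
}
```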

66-71: Don’t drop the cleanup error from store.Close().

If createUsageStore fails, the cleanup error is discarded. Capture and return/log it so cleanup failures aren’t lost.

🐛 Suggested fix
 	usageStore, err := createUsageStore(store, cfg.Usage.RetentionDays)
 	if err != nil {
-		store.Close()
-		return nil, err
+		closeErr := store.Close()
+		if closeErr != nil {
+			return nil, fmt.Errorf("%w (also failed to close storage: %v)", err, closeErr)
+		}
+		return nil, err
 	}

130-139: Avoid hardcoded storage defaults if shared constants exist.

The sqlite path and MongoDB database defaults are literal strings here; consider using shared defaults from the storage/config layer to avoid drift.

SantiagoDePolonia and others added 3 commits January 21, 2026 17:40
The NotFound case in ParseProviderError was dropping the original error,
losing debugging context. Other error branches preserve this information.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add context-based request ID passing via core.WithRequestID/GetRequestID
- Fix middleware to set X-Request-ID on both request and response headers
- Update handlers to enrich context with request ID before calling providers
- Forward request ID to all providers with provider-specific header names:
  - OpenAI, Groq, xAI: X-Request-ID
  - Anthropic: X-Request-Id
  - Gemini: X-Goog-Request-Id

This enables end-to-end request tracing from client through GOModel to
upstream providers. Clients can override the auto-generated ID by passing
their own X-Request-ID header.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
internal/auditlog/middleware.go (1)

47-58: Request ID propagation is skipped when audit logging is disabled.

Because the middleware returns early before this block, requests can lose IDs (and usage entries will miss correlation) when AuditLogger is off. Consider generating/setting the ID before the early-return or moving it to a lightweight middleware.

🔧 Suggested adjustment
-			// Skip if logging is disabled
-			if logger == nil || !logger.Config().Enabled {
-				return next(c)
-			}
-
-			cfg := logger.Config()
-
-			start := time.Now()
-			req := c.Request()
-
-			// Generate request ID if not present
-			requestID := req.Header.Get("X-Request-ID")
+			req := c.Request()
+			// Generate request ID even when audit logging is disabled
+			requestID := req.Header.Get("X-Request-ID")
 			if requestID == "" {
 				requestID = uuid.NewString()
 			}
 			req.Header.Set("X-Request-ID", requestID)
 			c.Response().Header().Set("X-Request-ID", requestID)
+
+			// Skip if logging is disabled
+			if logger == nil || !logger.Config().Enabled {
+				return next(c)
+			}
+
+			cfg := logger.Config()
+			start := time.Now()
internal/providers/groq/groq.go (1)

138-151: Consider strongly-typed input parsing.

Per coding guidelines, request/response payloads should prefer strongly-typed structs over interface{} and map[string]interface{}. While this internal conversion logic handles dynamic ResponsesRequest.Input, the per-request type assertions add runtime overhead and reduce type safety.

If core.ResponsesRequest.Input must remain interface{} for API flexibility, consider centralizing the input parsing logic into a shared utility with well-defined intermediate types, reducing duplicated type assertions across providers.

🤖 Fix all issues with AI agents
In `@internal/providers/gemini/gemini.go`:
- Around line 78-81: The code forwards a context request ID using
core.GetRequestID and sets header via req.Header.Set("X-Goog-Request-Id", ...)
even though Gemini may not accept that header; update the behavior to either (A)
verify Gemini acceptance and keep using "X-Goog-Request-Id" if confirmed, or (B)
change to a standard tracing header by replacing the header key with
"X-Request-Id" (or add both headers) so request tracing works; modify the logic
around core.GetRequestID(req.Context()) and req.Header.Set to implement the
chosen header(s) and ensure tests/logs reflect the header name change.

In `@internal/providers/openai/openai.go`:
- Around line 61-64: Replace the incorrect header "X-Request-ID" with OpenAI's
expected "X-Client-Request-Id" when forwarding the request ID from context
(where core.GetRequestID(req.Context()) is used); additionally validate the
request ID for ASCII-only characters and a maximum length of 512 bytes and only
set the header if the value passes validation (otherwise omit it) to avoid
triggering OpenAI 400 errors.
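One possible shape for that validator, matching the limits described above (printable ASCII only, at most 512 bytes); this is a sketch, and the real constraint set should be confirmed against OpenAI's documentation:

```go
package main

import "fmt"

// isValidClientRequestID accepts only printable ASCII (0x20–0x7E) and
// rejects anything longer than 512 bytes; empty is treated as valid
// (callers simply omit the header in that case).
func isValidClientRequestID(id string) bool {
	if len(id) > 512 {
		return false
	}
	for i := 0; i < len(id); i++ {
		if id[i] < 0x20 || id[i] > 0x7E {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isValidClientRequestID("req-123"))   // true
	fmt.Println(isValidClientRequestID("req-日本語")) // false: non-ASCII bytes
}
```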
♻️ Duplicate comments (1)
internal/providers/anthropic/anthropic.go (1)

452-459: Consider logging parse errors instead of silently falling back to current time.

Returning time.Now().Unix() on parse failure masks upstream data issues and could unexpectedly affect model list ordering. Since slog is already imported, consider logging the error for observability while deciding on an appropriate fallback value.

🔧 Suggested fix
 func parseCreatedAt(createdAt string) int64 {
 	t, err := time.Parse(time.RFC3339, createdAt)
 	if err != nil {
+		slog.Warn("failed to parse model created_at timestamp", "value", createdAt, "error", err)
 		return time.Now().Unix()
 	}
 	return t.Unix()
 }

SantiagoDePolonia and others added 2 commits January 21, 2026 19:15
- Reduce from 272 to 84 lines (~69% reduction)
- Fix incorrect claim about init() registration (now explicit factory.Register())
- Fix incorrect 5-minute refresh interval (actually 1 hour default)
- Add clarification that Docker Compose is for manual testing only
- Update config section to reference .env.template and config/config.yaml

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@CLAUDE.md`:
- Line 81: Update the phrasing on the line that reads "requires initialized
registry—check `ModelCount() > 0`" to use correct grammar; for example change it
to "requires an initialized registry—check `ModelCount() > 0`" or "requires the
registry to be initialized—check `ModelCount() > 0`" so the sentence reads
naturally.
- Line 9: Replace the bare URL "https://github.com/ENTERPILOT/GOModel" in the
Module line with a markdown link; e.g., update the line containing `**Module:**
`gomodel` | **Go:** 1.24.0 | **Repo:** https://github.com/ENTERPILOT/GOModel` to
use a labeled link such as `**Module:** `gomodel` | **Go:** 1.24.0 | **Repo:**
[GOModel](https://github.com/ENTERPILOT/GOModel)` so the URL is formatted as a
proper markdown link and avoids linter warnings.
- Around line 52-62: The fenced code block in the CLAUDE.md snippet (the block
beginning with ```) is missing a language specifier causing a markdown linter
warning; update that fenced block to include a language like text or plaintext
(e.g., change the opening fence from ``` to ```text) so the directory listing
block is properly marked and the linter warning is resolved.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@internal/providers/openai/openai_test.go`:
- Around line 753-809: Add tests for control characters to
TestIsValidClientRequestID by including cases like "\x00", "\t", and "\x7F"
(expecting invalid) to document handling of non-printable ASCII; and add an
integration test for setHeaders that constructs a provider via
NewWithHTTPClient, creates an http.Request with a context set via
core.WithRequestID, calls provider.setHeaders(req) and asserts the
"X-Client-Request-Id" header is set to the request ID for valid IDs and absent
for empty/invalid IDs (reference functions: isValidClientRequestID, setHeaders,
NewWithHTTPClient, core.WithRequestID, header "X-Client-Request-Id").

Comment on lines +753 to +809
func TestIsValidClientRequestID(t *testing.T) {
tests := []struct {
name string
id string
valid bool
}{
{
name: "valid UUID",
id: "123e4567-e89b-12d3-a456-426614174000",
valid: true,
},
{
name: "valid short ID",
id: "req-123",
valid: true,
},
{
name: "valid empty string",
id: "",
valid: true,
},
{
name: "valid 512 chars",
id: strings.Repeat("a", 512),
valid: true,
},
{
name: "invalid - 513 chars (too long)",
id: strings.Repeat("a", 513),
valid: false,
},
{
name: "invalid - non-ASCII character",
id: "req-123-日本語",
valid: false,
},
{
name: "invalid - emoji",
id: "req-123-🎉",
valid: false,
},
{
name: "valid - all printable ASCII",
id: "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.",
valid: true,
},
}

for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := isValidClientRequestID(tt.id)
if got != tt.valid {
t.Errorf("isValidClientRequestID(%q) = %v, want %v", tt.id, got, tt.valid)
}
})
}
}
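The boundary behavior these cases pin down can be sketched as a standalone validator. This is an assumption reconstructed from the test table (at most 512 bytes, printable ASCII only), not the repository's actual `isValidClientRequestID` implementation:

```go
package main

import "fmt"

// maxClientRequestIDLen matches the 512/513 boundary in the test cases above.
const maxClientRequestIDLen = 512

// validClientRequestID is a hypothetical reconstruction: an ID is valid when
// it is at most 512 bytes and every byte is printable ASCII (0x20–0x7E).
// This rejects control characters (\x00, \t), DEL (\x7F), and any multi-byte
// UTF-8 sequence such as 日本語 or emoji.
func validClientRequestID(id string) bool {
	if len(id) > maxClientRequestIDLen {
		return false
	}
	for i := 0; i < len(id); i++ {
		if c := id[i]; c < 0x20 || c > 0x7E {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(validClientRequestID("req-123"))    // true
	fmt.Println(validClientRequestID("\x00bad"))    // false: NUL control byte
	fmt.Println(validClientRequestID("req-日本語")) // false: multi-byte UTF-8
}
```

Iterating over the string by byte index (rather than by rune) is what makes non-ASCII input fail: each byte of a UTF-8 multi-byte sequence is above 0x7E.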

🧹 Nitpick | 🔵 Trivial

Good test coverage for the validation function.

The test cases comprehensively cover boundary conditions and invalid inputs. Consider adding test cases for control characters (e.g., "\x00", "\t", "\x7F") to document the expected behavior for these edge cases.

Additionally, consider adding an integration test for setHeaders that verifies the X-Client-Request-Id header is actually set when a valid request ID is present in the context:

🧪 Suggested test for header integration
func TestSetHeadersWithRequestID(t *testing.T) {
	provider := NewWithHTTPClient("test-api-key", nil, llmclient.Hooks{})
	
	tests := []struct {
		name      string
		requestID string
		expectSet bool
	}{
		{"valid ID", "req-123", true},
		{"empty ID", "", false},
		{"invalid non-ASCII", "req-日本語", false},
	}
	
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			req, _ := http.NewRequest(http.MethodGet, "http://example.com", nil)
			ctx := context.Background()
			if tt.requestID != "" {
				ctx = core.WithRequestID(ctx, tt.requestID)
			}
			req = req.WithContext(ctx)
			
			provider.setHeaders(req)
			
			got := req.Header.Get("X-Client-Request-Id")
			if tt.expectSet && got != tt.requestID {
				t.Errorf("X-Client-Request-Id = %q, want %q", got, tt.requestID)
			}
			if !tt.expectSet && got != "" {
				t.Errorf("X-Client-Request-Id should not be set, got %q", got)
			}
		})
	}
}

Entries are queued asynchronously; if the source map is mutated after
Write, the logged data can be corrupted. Added defensive copy via
cloneRawData helper for all three extractor functions.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
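The defensive-copy idea in that commit can be sketched as follows. The helper name `cloneRawData` comes from the commit message, but its signature and the surrounding types are assumptions for illustration:

```go
package main

import "fmt"

// cloneRawData makes a shallow copy of the usage payload before it is handed
// to the async logger. Because entries are queued and written later, copying
// at enqueue time ensures that if the caller mutates its map after Write
// returns, the logged data is unaffected. Nested values are still shared
// (shallow copy), which is sufficient when the map holds plain scalars.
func cloneRawData(src map[string]any) map[string]any {
	if src == nil {
		return nil
	}
	dst := make(map[string]any, len(src))
	for k, v := range src {
		dst[k] = v
	}
	return dst
}

func main() {
	raw := map[string]any{"prompt_tokens": 12}
	queued := cloneRawData(raw) // copy taken at enqueue time
	raw["prompt_tokens"] = 99   // caller mutates after Write returns
	fmt.Println(queued["prompt_tokens"]) // still 12
}
```

Calling the helper in each extractor before the entry is buffered is the pattern the commit describes; without it, the asynchronous writer would observe whatever state the map is in when the buffer flushes.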
@SantiagoDePolonia SantiagoDePolonia merged commit 4896e97 into main Jan 21, 2026
7 checks passed
@SantiagoDePolonia SantiagoDePolonia deleted the feature/usage-token-tracking branch March 22, 2026 14:23