Conversation
📝 Walkthrough

Adds a token usage tracking subsystem: new usage types, config/env, multi-backend stores (SQLite/Postgres/MongoDB), an async buffered logger, SSE stream extraction/wrapping, request-ID propagation, and wiring into app/server/handlers and provider streams; also moves the storage-type config to the top level.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant Handler
    participant Provider as Provider API
    participant UsageLogger
    participant Buffer
    participant Storage
    Note over Client,Storage: Non-streaming flow (usage write)
    Client->>Handler: POST /chat/completions
    Handler->>Provider: Forward request (ctx includes request-id)
    Provider-->>Handler: Response (includes usage)
    Handler->>UsageLogger: Write(UsageEntry)
    UsageLogger->>Buffer: Enqueue entry
    alt Flush by threshold or timer
        UsageLogger->>Storage: WriteBatch(entries)
        Storage-->>UsageLogger: Ack
    end
    Handler-->>Client: Response (+ usage)
```

```mermaid
sequenceDiagram
    participant Client
    participant Handler
    participant ProviderSSE as Provider SSE
    participant StreamWrapper
    participant UsageLogger
    participant Storage
    Note over Client,Storage: Streaming flow with SSE extraction
    Client->>Handler: POST /chat/completions?stream=true
    Handler->>ProviderSSE: Open provider SSE (wrapped for usage)
    ProviderSSE-->>StreamWrapper: SSE events (data chunks)
    StreamWrapper->>Client: Relay events
    loop streaming
        ProviderSSE->>StreamWrapper: Event (may include usage JSON)
        StreamWrapper->>StreamWrapper: Buffer rolling window
    end
    StreamWrapper->>StreamWrapper: Parse buffered SSE for usage on Close
    StreamWrapper->>UsageLogger: Write(UsageEntry)
    UsageLogger->>Storage: WriteBatch(entries)
    Storage-->>UsageLogger: Ack
```

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 17
🤖 Fix all issues with AI agents
In `@docs/plans/token-usage-tracking.md`:
- Around line 12-36: Fix markdownlint errors by ensuring all fenced code blocks
are surrounded by blank lines and include language tags (e.g., ```go for the
Usage struct block), replace emphasis-used headings (like **1. Core Usage Struct
is Minimal**) with proper heading syntax (e.g., "### Core Usage Struct is
Minimal") and ensure a blank line before/after headings, and run through the
document to correct other lint rules (MD031, MD036, MD040, MD022) so headings
and code blocks follow consistent spacing and language-tag conventions; update
the examples mentioning Usage, StreamLogWrapper and audit log generation to use
proper fenced code blocks and heading markup.
In `@internal/core/errors.go`:
- Around line 158-162: The 404 branch creating NewNotFoundError(message)
currently drops the originalErr; update the case handling so after creating err
:= NewNotFoundError(message) and setting err.Provider = provider you also set
err.OriginalErr = originalErr (matching other branches) before returning to
preserve debugging context; reference NewNotFoundError, err.Provider and
originalErr in your change.
In `@internal/core/responses.go`:
- Around line 73-78: The ResponsesUsage struct currently exposes RawUsage as
map[string]any which weakens payload typing; update ResponsesUsage to replace
RawUsage with json.RawMessage (importing encoding/json) or a dedicated typed
struct to preserve strong typing for provider-specific data, and adjust any
marshaling/unmarshaling code that reads/writes ResponsesUsage (references:
ResponsesUsage, RawUsage) to handle json.RawMessage (or the new typed struct)
instead of a map while keeping the json tag `json:"raw_usage,omitempty"`.
In `@internal/providers/anthropic/anthropic.go`:
- Around line 405-426: Provider.ListModels currently fetches only the first page
(limit=1000) and ignores anthropicModelsResponse.HasMore/LastID; change
ListModels to loop calling p.client.Do repeatedly while anthropicResp.HasMore is
true, passing after_id=anthropicResp.LastID (and limit=1000) on subsequent
requests, appending each page's models (convert with parseCreatedAt into
core.Model) into the same slice, and only return the full core.ModelsResponse
after no more pages remain; use the anthropicModelsResponse struct fields
HasMore and LastID to control pagination and aggregation.
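The pagination loop suggested above can be sketched with a stand-in pager. The `HasMore`/`LastID`/`after_id` shape comes from the review; `fetchPage` is a hypothetical replacement for the real `p.client.Do` call, and the real page/model types differ.

```go
package main

import "fmt"

// modelsPage mirrors the fields of anthropicModelsResponse used for paging.
type modelsPage struct {
	Data    []string
	HasMore bool
	LastID  string
}

// fetchPage stands in for p.client.Do; here it pages over a fixed dataset.
func fetchPage(all []string, afterID string, limit int) modelsPage {
	start := 0
	if afterID != "" {
		for i, id := range all {
			if id == afterID {
				start = i + 1
				break
			}
		}
	}
	end := start + limit
	if end > len(all) {
		end = len(all)
	}
	last := ""
	if end > start {
		last = all[end-1]
	}
	return modelsPage{Data: all[start:end], HasMore: end < len(all), LastID: last}
}

// listAllModels loops while HasMore is true, passing after_id=LastID each
// time and appending every page into one slice.
func listAllModels(all []string, limit int) []string {
	var out []string
	afterID := ""
	for {
		page := fetchPage(all, afterID, limit)
		out = append(out, page.Data...)
		if !page.HasMore {
			return out
		}
		afterID = page.LastID
	}
}

func main() {
	fmt.Println(listAllModels([]string{"a", "b", "c", "d", "e"}, 2))
}
```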
- Around line 434-441: Change parseCreatedAt to log parse errors via the slog
package and return 0 instead of time.Now().Unix() so invalid timestamps are not
silently mapped to current time; specifically, in function
parseCreatedAt(createdAt string) int64 call slog.Error (or slog.With(...).Error)
with a clear message including the invalid createdAt and the parse error, then
return int64(0) on error. Also update the TestParseCreatedAt_InvalidFormat unit
test to expect 0 for invalid/unparsable timestamps.
In `@internal/server/handlers.go`:
- Around line 55-56: The request ID is being read from the response headers
(using c.Response().Header().Get("X-Request-ID")) which is incorrect; change all
occurrences to read from the incoming request headers (use
c.Request().Header.Get("X-Request-ID")) so the value set by the audit log
middleware is returned—update the occurrences around the requestID variable in
handlers.go (and the similar spots referenced at the other two occurrences) to
use c.Request().Header.Get instead of c.Response().Header().Get.
In `@internal/usage/cleanup.go`:
- Around line 8-25: Extract the duplicate RunCleanupLoop and CleanupInterval
into a shared utility function that accepts the interval as a parameter (e.g.,
create common.RunCleanupLoop(interval time.Duration, stop <-chan struct{},
cleanupFn func())), remove the local constant/duplicate RunCleanupLoop in both
packages, and update the callers in usage and auditlog to call
common.RunCleanupLoop(CleanupInterval, stop, cleanupFn) (or pass their existing
interval constant) so both packages use the single shared implementation; ensure
you keep the initial immediate cleanup call and ticker.Stop defer behavior in
the new common function.
In `@internal/usage/extractor.go`:
- Around line 31-34: The code assigns resp.Usage.RawUsage directly to
entry.RawData which can lead to races if the source map is mutated after Write
because entries are enqueued asynchronously; fix by making a defensive deep copy
of the map before assigning (for every occurrence where resp.Usage.RawUsage ->
entry.RawData is set, including the occurrences around the blocks at lines
similar to the ones shown and the other spots referenced) so entry.RawData owns
its own map instance rather than referencing the original resp.Usage.RawUsage.
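A defensive copy helper for that assignment could look like this sketch (the `copyRawUsage` name is illustrative). It assumes the map values are scalars, as token-usage payloads typically are; genuinely nested maps would need a recursive copy.

```go
package main

import "fmt"

// copyRawUsage copies the usage map key by key so the enqueued entry owns
// its own map instance instead of aliasing the source.
func copyRawUsage(src map[string]any) map[string]any {
	if src == nil {
		return nil
	}
	dst := make(map[string]any, len(src))
	for k, v := range src {
		dst[k] = v
	}
	return dst
}

func main() {
	orig := map[string]any{"cached_tokens": 3}
	entry := copyRawUsage(orig)
	orig["cached_tokens"] = 99 // later mutation no longer races with the entry
	fmt.Println(entry["cached_tokens"])
}
```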
In `@internal/usage/logger.go`:
- Around line 51-139: The Write path can still panic if l.buffer is closed
concurrently; make Close idempotent and block until in-flight Write calls finish
by adding a write-tracking WaitGroup and a close guard: introduce e.g.
l.writesWg (sync.WaitGroup) and l.closeOnce (sync.Once) or l.closeMu
(sync.Mutex)/RWMutex, update Write to take a read lock or check under the close
guard, call l.writesWg.Add(1) before attempting to send and defer
l.writesWg.Done(); if closed return without sending; update Close to use the
close guard or closeOnce to only close l.done once, set l.closed, then wait for
l.writesWg.Wait() and then wait for l.wg.Wait() (flushLoop) before closing the
store; reference functions/fields: Write, Close, flushLoop, l.closed, l.buffer,
l.done, l.wg, and add l.writesWg and l.closeOnce/closeMu as needed.
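The close-guard pattern described above can be sketched on a stripped-down logger. Field names (`closed`, `closeMu`, `writesWg`, `buffer`, `done`) follow the review; the flush loop and store are omitted, so this shows only the Write/Close race handling, not the real type.

```go
package main

import (
	"fmt"
	"sync"
)

type Logger struct {
	buffer   chan int
	done     chan struct{}
	closed   bool
	closeMu  sync.RWMutex
	writesWg sync.WaitGroup
}

func NewLogger() *Logger {
	return &Logger{buffer: make(chan int, 16), done: make(chan struct{})}
}

func (l *Logger) Write(v int) {
	// Register the in-flight write under the close guard, then release the
	// lock before blocking on the channel send.
	l.closeMu.RLock()
	if l.closed {
		l.closeMu.RUnlock()
		return // dropped after close instead of panicking on a closed channel
	}
	l.writesWg.Add(1)
	l.closeMu.RUnlock()
	defer l.writesWg.Done()

	select {
	case l.buffer <- v:
	case <-l.done: // unblocks a Write racing Close against a full buffer
	}
}

func (l *Logger) Close() {
	l.closeMu.Lock()
	if l.closed { // idempotent: second Close is a no-op
		l.closeMu.Unlock()
		return
	}
	l.closed = true
	close(l.done)
	l.closeMu.Unlock()
	l.writesWg.Wait() // block until in-flight Writes drain
	// real code would then wait for the flush loop and close the store
}

func main() {
	l := NewLogger()
	l.Write(1)
	l.Close()
	l.Close()  // no panic on double close
	l.Write(2) // dropped, no panic on write-after-close
	fmt.Println("ok")
}
```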
In `@internal/usage/store_mongodb.go`:
- Around line 123-145: The error handling after s.collection.InsertMany should
use errors.As to detect a wrapped mongo.BulkWriteException instead of a direct
type assertion; update the block that currently does "if bulkErr, ok :=
err.(mongo.BulkWriteException); ok { ... }" to use var bulkErr
mongo.BulkWriteException and errors.As(err, &bulkErr) so partial write detection
and logging (slog.Warn, usagePartialWriteFailures.Inc, and returning
&PartialWriteError{... Cause: bulkErr}) still occur for wrapped errors, and add
the "errors" import if missing; apply the same pattern used in
internal/auditlog/store_mongodb.go for consistency.
In `@internal/usage/store_postgresql.go`:
- Around line 67-76: The PostgreSQLStore currently always allocates stopCleanup
even when retentionDays <= 0; change the constructor to only create and assign
stopCleanup when retentionDays > 0 and start the goroutine, otherwise leave
stopCleanup nil (or zero value). Locate the PostgreSQLStore struct instantiation
(PostgreSQLStore) and the RunCleanupLoop call and condition on retentionDays to
perform both the make(chan struct{}) assignment and the go RunCleanupLoop(...)
invocation only when retentionDays > 0 so the unused channel isn't allocated.
- Around line 123-159: writeBatchLarge currently opens a transaction but still
calls tx.Exec per row (tx.Exec loop), which negates batch performance and also
collects insert errors yet proceeds to tx.Commit; change it to perform a true
batch insert using pgx.Batch (create a pgx.Batch, Queue the INSERT for each
UsageEntry, execute via s.pool.SendBatch or tx.SendBatch and read results) or
use tx.CopyFrom with pgx.CopyFromRows to insert all rows efficiently, and ensure
transactional semantics: if any batch/CopyFrom error occurs, call tx.Rollback
and return the error without committing (remove committing when errs were
collected), and replace the per-row error collection logic with handling the
batch/CopyFrom error result. Reference symbols: writeBatchLarge, tx.Exec
(remove), pgx.Batch / tx.CopyFrom, tx.Commit, tx.Rollback.
In `@internal/usage/store_sqlite.go`:
- Around line 167-171: The cleanup query uses time.RFC3339 but entries are
written using time.RFC3339Nano (causing incorrect string comparisons); update
the cutoff generation in the cleanup path (where cutoff is set before
s.db.Exec("DELETE FROM usage WHERE timestamp < ?", cutoff)) to use
time.RFC3339Nano instead of time.RFC3339 so the stored timestamps and the cutoff
use the same nanosecond precision.
In `@internal/usage/stream_wrapper.go`:
- Around line 41-52: The rolling buffer in the Write path currently calls
w.buffer.Reset() which doesn't shrink the backing array, so when trimming to
SSEBufferSize you must reallocate a new smaller buffer to cap memory; modify the
trim branch (where w.buffer.Len() > SSEBufferSize and you compute data :=
w.buffer.Bytes()) to create a fresh buffer of size SSEBufferSize (or a
bytes.NewBuffer with the sliced tail) and replace w.buffer with it (or copy the
tail into a new bytes.Buffer) instead of Reset()+Write(), ensuring capacity is
limited while preserving the last SSEBufferSize bytes.
- Around line 132-191: The code currently leaves totalTokens at 0 when
"total_tokens" is missing, undercounting usage; update the logic after
extracting inputTokens, outputTokens, totalTokens from usageMap to set
totalTokens = inputTokens + outputTokens when totalTokens == 0 and (inputTokens
> 0 || outputTokens > 0). Keep the existing rawData handling and call to
ExtractFromSSEUsage(providerID, inputTokens, outputTokens, totalTokens, rawData,
w.requestID, model, w.provider, w.endpoint) so the derived totalTokens is passed
through.
- Around line 104-130: The extractUsageFromJSON function in StreamUsageWrapper
currently unmarshals into map[string]interface{}; change it to unmarshal into
strongly-typed structs (e.g., SSEChunk struct with ID, Model, Usage
json.RawMessage fields and a UsagePayload struct with Token fields as
json.Number or *int64) inside StreamUsageWrapper.extractUsageFromJSON, then
decode UsagePayload to populate and return a UsageEntry (preserve any
provider-specific remaining data into UsageEntry.RawData as raw JSON); ensure
model override logic still uses SSEChunk.Model when present and convert
json.Number to int safely for token counts.
In `@internal/usage/usage_test.go`:
- Around line 44-85: Replace the fixed time.Sleep in TestLogger with a polling
loop that repeatedly checks store.getEntries() until the expected 5 entries
appear or a timeout elapses; locate the TestLogger test that constructs logger
via NewLogger, writes entries with logger.Write and currently does
time.Sleep(200*time.Millisecond), and change it to poll (using time.After for
overall timeout and time.Tick or short sleeps) to assert len(entries)==5 or fail
if timeout reached, then continue to Close() the logger and verify store.closed
as before.
```go
// parseCreatedAt parses an RFC3339 timestamp string to Unix timestamp
func parseCreatedAt(createdAt string) int64 {
	t, err := time.Parse(time.RFC3339, createdAt)
	if err != nil {
		return time.Now().Unix()
	}
	return t.Unix()
}
```
🧩 Analysis chain
🏁 Scripts executed:

```shell
# First, examine the file structure and the function in question
wc -l internal/providers/anthropic/anthropic.go

# Read the specific section around lines 434-441
sed -n '430,445p' internal/providers/anthropic/anthropic.go

# Check if slog is imported in the file
rg "import|slog" internal/providers/anthropic/anthropic.go | head -30

# Search for usages of parseCreatedAt
rg "parseCreatedAt" internal/providers/anthropic/

# Find and read the parseCreatedAt tests
rg -A 15 "parseCreatedAt.*invalid" internal/providers/anthropic/anthropic_test.go
rg -B 5 -A 20 "func.*Test.*parseCreatedAt" internal/providers/anthropic/anthropic_test.go
rg -B 20 'parseCreatedAt\("invalid' internal/providers/anthropic/anthropic_test.go
rg -B 40 "TestParseCreatedAt_InvalidFormat" internal/providers/anthropic/anthropic_test.go | tail -50

# Check the context where parseCreatedAt is called
rg -B 5 -A 5 "created := parseCreatedAt" internal/providers/anthropic/anthropic.go

# Check how the Created field is used in core.Model and across the codebase
rg "\.Created|Model\{" internal/providers/anthropic/anthropic.go | head -10
rg -A 10 "type Model struct" internal/core/
rg "\.Created" --type go | head -20
```
Log invalid timestamps and return 0 instead of silently mapping to current time.
Returning time.Now().Unix() on parse errors masks upstream data issues and can unexpectedly reorder model lists. The suggested change uses the already-imported slog package to log the error and return 0 for invalid timestamps.
Note: This will require updating the TestParseCreatedAt_InvalidFormat test that currently expects non-zero values for invalid dates.
🔧 Suggested fix

```diff
 func parseCreatedAt(createdAt string) int64 {
 	t, err := time.Parse(time.RFC3339, createdAt)
 	if err != nil {
-		return time.Now().Unix()
+		slog.Warn("invalid model created_at", "value", createdAt, "error", err)
+		return 0
 	}
 	return t.Unix()
 }
```
🤖 Prompt for AI Agents
In `@internal/providers/anthropic/anthropic.go` around lines 434 - 441, Change
parseCreatedAt to log parse errors via the slog package and return 0 instead of
time.Now().Unix() so invalid timestamps are not silently mapped to current time;
specifically, in function parseCreatedAt(createdAt string) int64 call slog.Error
(or slog.With(...).Error) with a clear message including the invalid createdAt
and the parse error, then return int64(0) on error. Also update the
TestParseCreatedAt_InvalidFormat unit test to expect 0 for invalid/unparsable
timestamps.
```go
var inputTokens, outputTokens, totalTokens int
rawData := make(map[string]any)

// Extract standard fields
if v, ok := usageMap["prompt_tokens"].(float64); ok {
	inputTokens = int(v)
}
if v, ok := usageMap["input_tokens"].(float64); ok {
	inputTokens = int(v)
}
if v, ok := usageMap["completion_tokens"].(float64); ok {
	outputTokens = int(v)
}
if v, ok := usageMap["output_tokens"].(float64); ok {
	outputTokens = int(v)
}
if v, ok := usageMap["total_tokens"].(float64); ok {
	totalTokens = int(v)
}

// Extract extended usage data (provider-specific)
extendedFields := []string{
	"cached_tokens", "reasoning_tokens", // OpenAI
	"cache_creation_input_tokens", "cache_read_input_tokens", // Anthropic
	"thought_tokens", "tool_use_tokens", // Gemini
	"image_tokens", // xAI
}
for _, field := range extendedFields {
	if v, ok := usageMap[field].(float64); ok && v > 0 {
		rawData[field] = int(v)
	}
}

// Also check for nested prompt_tokens_details and completion_tokens_details (OpenAI)
if details, ok := usageMap["prompt_tokens_details"].(map[string]interface{}); ok {
	for k, v := range details {
		if fv, ok := v.(float64); ok && fv > 0 {
			rawData["prompt_"+k] = int(fv)
		}
	}
}
if details, ok := usageMap["completion_tokens_details"].(map[string]interface{}); ok {
	for k, v := range details {
		if fv, ok := v.(float64); ok && fv > 0 {
			rawData["completion_"+k] = int(fv)
		}
	}
}

// Only create entry if we found some usage data
if inputTokens > 0 || outputTokens > 0 || totalTokens > 0 {
	if len(rawData) == 0 {
		rawData = nil
	}
	return ExtractFromSSEUsage(
		providerID,
		inputTokens, outputTokens, totalTokens,
		rawData,
		w.requestID, model, w.provider, w.endpoint,
	)
```
Derive total_tokens when only input/output are present.
If a provider omits total_tokens, the current logic emits TotalTokens=0 even when input/output are set, which undercounts usage. Consider deriving totalTokens from input/output when missing.
🐛 Suggested fix

```diff
 if v, ok := usageMap["total_tokens"].(float64); ok {
 	totalTokens = int(v)
 }
+if totalTokens == 0 && (inputTokens > 0 || outputTokens > 0) {
+	totalTokens = inputTokens + outputTokens
+}
+
 // Only create entry if we found some usage data
```

🤖 Prompt for AI Agents
In `@internal/usage/stream_wrapper.go` around lines 132 - 191, The code currently
leaves totalTokens at 0 when "total_tokens" is missing, undercounting usage;
update the logic after extracting inputTokens, outputTokens, totalTokens from
usageMap to set totalTokens = inputTokens + outputTokens when totalTokens == 0
and (inputTokens > 0 || outputTokens > 0). Keep the existing rawData handling
and call to ExtractFromSSEUsage(providerID, inputTokens, outputTokens,
totalTokens, rawData, w.requestID, model, w.provider, w.endpoint) so the derived
totalTokens is passed through.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/auditlog/middleware.go (1)
173-184: Unbounded single write may exceed `MaxBodyCapture`.

The current logic only checks if the buffer length is below `MaxBodyCapture` before writing, but writes the entire slice `b` regardless of its size. A single large write could significantly exceed the intended limit.

♻️ Proposed fix to enforce strict size limit

```diff
 func (r *responseBodyCapture) Write(b []byte) (int, error) {
 	// Write to the capture buffer (limit to MaxBodyCapture to avoid memory issues)
 	if r.body.Len() < int(MaxBodyCapture) {
-		r.body.Write(b)
-		// Check if we just hit the limit
-		if r.body.Len() >= int(MaxBodyCapture) {
+		remaining := int(MaxBodyCapture) - r.body.Len()
+		if len(b) <= remaining {
+			r.body.Write(b)
+		} else {
+			r.body.Write(b[:remaining])
 			r.truncated = true
 		}
 	}
 	// Write to the original response writer
 	return r.ResponseWriter.Write(b)
 }
```
Add a new configuration flag that automatically adds `stream_options: {"include_usage": true}` to streaming requests for OpenAI-compatible providers. This ensures usage data is always returned in streaming responses when the flag is enabled (default: true).

Changes:
- Add StreamOptions struct to core/types.go for the stream_options param
- Add StreamOptions field to ChatRequest and ResponsesRequest
- Update WithStreaming() methods to preserve StreamOptions
- Add EnforceReturningUsageData to usage.Config and config.UsageConfig
- Update handlers to set StreamOptions when the flag is enabled
- Document ENFORCE_RETURNING_USAGE_DATA in .env.template

Co-authored-by: Claude <noreply@anthropic.com>
Use time.After for timeout and time.Ticker for polling instead of a fixed time.Sleep, making the test more reliable and faster. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…d CLOUD.md upgraded
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
docker-compose.yaml (1)
17-24: Align the storage backend comment with the active setting.

The block still says PostgreSQL for audit logs while `STORAGE_TYPE=mongodb` is active. That mismatch is confusing for operators: either update the comment or switch the active backend to match.

✏️ Suggested clarification

```diff
-    # Storage configuration (PostgreSQL for audit logs)
+    # Storage configuration (audit/usage storage backend)
     ...
-    # - STORAGE_TYPE=postgresql
-    - STORAGE_TYPE=mongodb
+    # - STORAGE_TYPE=postgresql  # use PostgreSQL backend
+    - STORAGE_TYPE=mongodb      # active backend
```

config/config.go (1)

266-301: Env var name for enforce-usage is inconsistent with `USAGE_` naming.

`ENFORCE_RETURNING_USAGE_DATA` skips the `USAGE_` prefix, so `USAGE_ENFORCE_RETURNING_USAGE_DATA` won't be honored in either the legacy env path or the override path. Align it with the rest of the usage env vars.

🛠️ Proposed fix

```diff
-		EnforceReturningUsageData: getEnvBoolOrDefault("ENFORCE_RETURNING_USAGE_DATA", true),
+		EnforceReturningUsageData: getEnvBoolOrDefault("USAGE_ENFORCE_RETURNING_USAGE_DATA", true),
@@
-	if enforceUsage := os.Getenv("ENFORCE_RETURNING_USAGE_DATA"); enforceUsage != "" {
+	if enforceUsage := os.Getenv("USAGE_ENFORCE_RETURNING_USAGE_DATA"); enforceUsage != "" {
```

Also applies to: 416-437
🤖 Fix all issues with AI agents
In `@CLAUDE.md`:
- Around line 208-236: Add a single clarifying sentence to the "Manual Testing
with Docker" section stating that Docker Compose is optional and intended only
for manual storage-backend validation—not for running automated E2E or unit
tests—and that automated E2E tests must remain in-process without Docker; insert
this sentence near the examples that show STORAGE_TYPE, POSTGRES_URL and
MONGODB_URL so readers understand the distinction between manual validation and
in-process test suites.
In `@FUTURE_ARCHITECTURE.md`:
- Line 1: Add a top-level heading and specify the fenced-code language: insert a
heading "Future Architecture" at the top of FUTURE_ARCHITECTURE.md and change
the opening fence from an untyped "```" to "```text" and replace the malformed
closing "-```" with a proper closing "```"; apply the same change wherever an
untyped or malformed fenced block occurs (e.g., the other occurrence referenced
as "123-123").
In `@internal/core/types.go`:
- Around line 59-64: The Usage struct exposes RawUsage as map[string]any which
weakens API payload typing; change RawUsage to json.RawMessage to preserve
provider-specific raw bytes while keeping strong typing, update the Usage type
declaration (field RawUsage) to use json.RawMessage and ensure encoding/json is
imported, and then update any functions that marshal/unmarshal or assign to
Usage.RawUsage to handle raw JSON (e.g., pass through or
json.Marshal/json.RawMessage conversions) so existing
serialization/deserialization behavior is preserved.
In `@internal/usage/factory.go`:
- Around line 130-139: The defaults for storage type and paths are hardcoded;
replace the literal values (".cache/gomodel.db" and "gomodel") with shared
constants or getters from the storage/config package and fall back to storage
package defaults where available: update the initialization logic that sets
storageCfg.Type (currently comparing to storage.TypeSQLite),
storageCfg.SQLite.Path, and storageCfg.MongoDB.Database to use exported
constants (e.g., storage.DefaultSQLitePath, storage.DefaultMongoDatabase) or
call a storage.Defaults() helper so defaults remain single-sourced and cannot
drift.
- Around line 66-71: When createUsageStore(store, cfg.Usage.RetentionDays)
returns an error, the subsequent call to store.Close() currently ignores any
error; change the cleanup path in the block that handles the error to capture
the result of store.Close(), then either log it (using the existing logger in
scope) or wrap/aggregate it with the original createUsageStore error before
returning so cleanup failures aren’t lost; update the code around
createUsageStore, usageStore variable handling, and the store.Close() call to
return or log both errors together.
- Around line 23-41: The "Safe to call multiple times" comment is incorrect
because Logger.Close calls close(l.done) and will panic on a second call; either
remove or correct the comment or make Close idempotent: add a guarded closed
flag and mutex to both Result and the Logger implementation (check the closed
flag before calling Logger.Close, Storage.Close or closing l.done, set the flag
under lock after performing close operations) so repeated calls return nil
instead of closing channels twice; update the comment accordingly or remove it
if you implement the guard.
In `@internal/usage/store_sqlite.go`:
- Around line 58-70: The indexes list in the usage creation snippet (the indexes
slice and subsequent loop using db.Exec) omits an index on the endpoint column;
add "CREATE INDEX IF NOT EXISTS idx_usage_endpoint ON usage(endpoint)" to the
indexes slice so the for _, idx := range indexes { if _, err := db.Exec(idx) {
... } } loop will create it, ensuring queries that filter by endpoint are
indexed (reference the indexes variable, idx_usage_endpoint, and the db.Exec
call).
In `@internal/usage/usage.go`:
- Around line 49-55: The RawData field in the Usage struct currently uses
map[string]any; replace it with a typed struct (e.g., RawUsageData) that
contains optional, provider-specific sub-structs/fields (e.g., OpenAI,
Anthropic, Gemini) with proper json/bson tags and pointer types to keep fields
optional, then update the Usage.RawData field signature to use this new type;
ensure all code paths that read/write Usage.RawData
(serialization/deserialization, DB persistence and any unmarshalling code) are
updated to the new struct shape and handle backward compatibility (e.g.,
implement custom UnmarshalJSON/UnmarshalBSON or a migration helper to convert
existing map payloads into the typed RawUsageData).
♻️ Duplicate comments (4)
internal/core/responses.go (1)

76-81: Avoid `map[string]any` in API response payloads.

Same concern as previously flagged: `ResponsesUsage` is part of `ResponsesResponse` (an API payload). Use `json.RawMessage` instead for type safety.

internal/server/handlers.go (3)

54-57: X-Request-ID retrieved from wrong header source.

The `requestID` is read from `c.Response().Header().Get("X-Request-ID")`, but the X-Request-ID header is set on the request by the audit log middleware. This should be `c.Request().Header.Get("X-Request-ID")` to match the pattern used elsewhere in the codebase.

🔧 Suggested fix

```diff
-	requestID := c.Response().Header().Get("X-Request-ID")
+	requestID := c.Request().Header.Get("X-Request-ID")
```

116-123: Same X-Request-ID issue as the streaming path.

Line 118 retrieves `requestID` from response headers instead of request headers; the same one-line change applies.

182-189: Same X-Request-ID issue in the Responses handler.

Line 184 retrieves `requestID` from response headers instead of request headers; the same one-line change applies.
```go
// Create the usage store based on storage type
usageStore, err := createUsageStore(store, cfg.Usage.RetentionDays)
if err != nil {
	store.Close()
	return nil, err
}
```
Error from store.Close() is silently ignored.
When createUsageStore fails, the error from store.Close() is discarded. While this is a cleanup path, logging or aggregating this error would help with debugging resource cleanup failures.
Proposed fix

```diff
 // Create the usage store based on storage type
 usageStore, err := createUsageStore(store, cfg.Usage.RetentionDays)
 if err != nil {
-	store.Close()
-	return nil, err
+	closeErr := store.Close()
+	if closeErr != nil {
+		return nil, fmt.Errorf("%w (also failed to close storage: %v)", err, closeErr)
+	}
+	return nil, err
 }
```
🤖 Prompt for AI Agents
In `@internal/usage/factory.go` around lines 66 - 71, When createUsageStore(store,
cfg.Usage.RetentionDays) returns an error, the subsequent call to store.Close()
currently ignores any error; change the cleanup path in the block that handles
the error to capture the result of store.Close(), then either log it (using the
existing logger in scope) or wrap/aggregate it with the original
createUsageStore error before returning so cleanup failures aren’t lost; update
the code around createUsageStore, usageStore variable handling, and the
store.Close() call to return or log both errors together.
```go
// Apply defaults
if storageCfg.Type == "" {
	storageCfg.Type = storage.TypeSQLite
}
if storageCfg.SQLite.Path == "" {
	storageCfg.SQLite.Path = ".cache/gomodel.db"
}
if storageCfg.MongoDB.Database == "" {
	storageCfg.MongoDB.Database = "gomodel"
}
```
🧹 Nitpick | 🔵 Trivial
Consider using constants for default values.
The default values (e.g., ".cache/gomodel.db", "gomodel" for MongoDB database) are hardcoded here. If these defaults are also defined elsewhere (e.g., in the storage or config package), they could drift out of sync. Consider using shared constants or deferring to the storage package's defaults.
```bash
#!/bin/bash
# Check if these defaults are duplicated elsewhere
echo "=== Checking for duplicate default paths ==="
rg -n '\.cache/gomodel\.db|gomodel\.db' --type go
```

🤖 Prompt for AI Agents
In `@internal/usage/factory.go` around lines 130 - 139, The defaults for storage
type and paths are hardcoded; replace the literal values (".cache/gomodel.db"
and "gomodel") with shared constants or getters from the storage/config package
and fall back to storage package defaults where available: update the
initialization logic that sets storageCfg.Type (currently comparing to
storage.TypeSQLite), storageCfg.SQLite.Path, and storageCfg.MongoDB.Database to
use exported constants (e.g., storage.DefaultSQLitePath,
storage.DefaultMongoDatabase) or call a storage.Defaults() helper so defaults
remain single-sourced and cannot drift.
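Single-sourcing the defaults can be sketched along these lines (constant and helper names such as `DefaultSQLitePath` are assumptions taken from the prompt above, not the project's actual API; in the real codebase the constants would live in the storage package):

```go
package main

import "fmt"

// Exported constants defined in one place so every consumer shares them.
const (
	DefaultSQLitePath    = ".cache/gomodel.db"
	DefaultMongoDatabase = "gomodel"
)

// StorageConfig is a simplified stand-in for the real config struct.
type StorageConfig struct {
	Type          string
	SQLitePath    string
	MongoDatabase string
}

// ApplyDefaults fills any unset fields from the shared defaults, replacing
// the scattered literal strings in the factory.
func ApplyDefaults(c *StorageConfig) {
	if c.Type == "" {
		c.Type = "sqlite"
	}
	if c.SQLitePath == "" {
		c.SQLitePath = DefaultSQLitePath
	}
	if c.MongoDatabase == "" {
		c.MongoDatabase = DefaultMongoDatabase
	}
}

func main() {
	cfg := StorageConfig{Type: "mongodb"}
	ApplyDefaults(&cfg)
	fmt.Println(cfg.Type, cfg.SQLitePath, cfg.MongoDatabase)
}
```

With one `ApplyDefaults` entry point, a changed default only has to be edited in one file.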
```go
// Create indexes for common queries
indexes := []string{
	"CREATE INDEX IF NOT EXISTS idx_usage_timestamp ON usage(timestamp)",
	"CREATE INDEX IF NOT EXISTS idx_usage_request_id ON usage(request_id)",
	"CREATE INDEX IF NOT EXISTS idx_usage_provider_id ON usage(provider_id)",
	"CREATE INDEX IF NOT EXISTS idx_usage_model ON usage(model)",
	"CREATE INDEX IF NOT EXISTS idx_usage_provider ON usage(provider)",
}
for _, idx := range indexes {
	if _, err := db.Exec(idx); err != nil {
		slog.Warn("failed to create index", "error", err)
	}
}
```
🧹 Nitpick | 🔵 Trivial
Consider adding an index on endpoint for completeness.
Indexes are created for timestamp, request_id, provider_id, model, and provider, but endpoint is omitted. If queries filtering by endpoint are expected (e.g., usage reports per endpoint), consider adding it.
♻️ Optional enhancement

```diff
 indexes := []string{
 	"CREATE INDEX IF NOT EXISTS idx_usage_timestamp ON usage(timestamp)",
 	"CREATE INDEX IF NOT EXISTS idx_usage_request_id ON usage(request_id)",
 	"CREATE INDEX IF NOT EXISTS idx_usage_provider_id ON usage(provider_id)",
 	"CREATE INDEX IF NOT EXISTS idx_usage_model ON usage(model)",
 	"CREATE INDEX IF NOT EXISTS idx_usage_provider ON usage(provider)",
+	"CREATE INDEX IF NOT EXISTS idx_usage_endpoint ON usage(endpoint)",
 }
```

📝 Committable suggestion
```go
// Create indexes for common queries
indexes := []string{
	"CREATE INDEX IF NOT EXISTS idx_usage_timestamp ON usage(timestamp)",
	"CREATE INDEX IF NOT EXISTS idx_usage_request_id ON usage(request_id)",
	"CREATE INDEX IF NOT EXISTS idx_usage_provider_id ON usage(provider_id)",
	"CREATE INDEX IF NOT EXISTS idx_usage_model ON usage(model)",
	"CREATE INDEX IF NOT EXISTS idx_usage_provider ON usage(provider)",
	"CREATE INDEX IF NOT EXISTS idx_usage_endpoint ON usage(endpoint)",
}
for _, idx := range indexes {
	if _, err := db.Exec(idx); err != nil {
		slog.Warn("failed to create index", "error", err)
	}
}
```
🤖 Prompt for AI Agents
In `@internal/usage/store_sqlite.go` around lines 58 - 70, The indexes list in the
usage creation snippet (the indexes slice and subsequent loop using db.Exec)
omits an index on the endpoint column; add "CREATE INDEX IF NOT EXISTS
idx_usage_endpoint ON usage(endpoint)" to the indexes slice so the for _, idx :=
range indexes { if _, err := db.Exec(idx) { ... } } loop will create it,
ensuring queries that filter by endpoint are indexed (reference the indexes
variable, idx_usage_endpoint, and the db.Exec call).
```go
// RawData contains provider-specific extended usage data (JSONB)
// Examples:
//   OpenAI:    {"cached_tokens": 100, "reasoning_tokens": 50}
//   Anthropic: {"cache_creation_input_tokens": 200, "cache_read_input_tokens": 150}
//   Gemini:    {"cached_tokens": 100, "thought_tokens": 75, "tool_use_tokens": 25}
RawData map[string]any `json:"raw_data,omitempty" bson:"raw_data,omitempty"`
}
```
🛠️ Refactor suggestion | 🟠 Major
Prefer a typed RawData struct instead of map[string]any.
This is persisted payload data; a typed struct improves safety and aligns with the “no map[string]any for payloads” guideline. Consider a struct with optional provider-specific fields.
♻️ Possible refactor

```diff
 type UsageEntry struct {
@@
-	RawData map[string]any `json:"raw_data,omitempty" bson:"raw_data,omitempty"`
+	RawData *UsageRawData `json:"raw_data,omitempty" bson:"raw_data,omitempty"`
 }
+
+// UsageRawData captures provider-specific usage extensions.
+type UsageRawData struct {
+	CachedTokens             *int `json:"cached_tokens,omitempty" bson:"cached_tokens,omitempty"`
+	ReasoningTokens          *int `json:"reasoning_tokens,omitempty" bson:"reasoning_tokens,omitempty"`
+	CacheCreationInputTokens *int `json:"cache_creation_input_tokens,omitempty" bson:"cache_creation_input_tokens,omitempty"`
+	CacheReadInputTokens     *int `json:"cache_read_input_tokens,omitempty" bson:"cache_read_input_tokens,omitempty"`
+	ThoughtTokens            *int `json:"thought_tokens,omitempty" bson:"thought_tokens,omitempty"`
+	ToolUseTokens            *int `json:"tool_use_tokens,omitempty" bson:"tool_use_tokens,omitempty"`
+}
```

Based on learnings, keep payload fields strongly typed.
📝 Committable suggestion
```go
// RawData contains provider-specific extended usage data (JSONB)
// Examples:
//   OpenAI:    {"cached_tokens": 100, "reasoning_tokens": 50}
//   Anthropic: {"cache_creation_input_tokens": 200, "cache_read_input_tokens": 150}
//   Gemini:    {"cached_tokens": 100, "thought_tokens": 75, "tool_use_tokens": 25}
RawData *UsageRawData `json:"raw_data,omitempty" bson:"raw_data,omitempty"`
}

// UsageRawData captures provider-specific usage extensions.
type UsageRawData struct {
	CachedTokens             *int `json:"cached_tokens,omitempty" bson:"cached_tokens,omitempty"`
	ReasoningTokens          *int `json:"reasoning_tokens,omitempty" bson:"reasoning_tokens,omitempty"`
	CacheCreationInputTokens *int `json:"cache_creation_input_tokens,omitempty" bson:"cache_creation_input_tokens,omitempty"`
	CacheReadInputTokens     *int `json:"cache_read_input_tokens,omitempty" bson:"cache_read_input_tokens,omitempty"`
	ThoughtTokens            *int `json:"thought_tokens,omitempty" bson:"thought_tokens,omitempty"`
	ToolUseTokens            *int `json:"tool_use_tokens,omitempty" bson:"tool_use_tokens,omitempty"`
}
```
🤖 Prompt for AI Agents
In `@internal/usage/usage.go` around lines 49 - 55, The RawData field in the Usage
struct currently uses map[string]any; replace it with a typed struct (e.g.,
RawUsageData) that contains optional, provider-specific sub-structs/fields
(e.g., OpenAI, Anthropic, Gemini) with proper json/bson tags and pointer types
to keep fields optional, then update the Usage.RawData field signature to use
this new type; ensure all code paths that read/write Usage.RawData
(serialization/deserialization, DB persistence and any unmarshalling code) are
updated to the new struct shape and handle backward compatibility (e.g.,
implement custom UnmarshalJSON/UnmarshalBSON or a migration helper to convert
existing map payloads into the typed RawUsageData).
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/providers/responses_converter.go (1)
108-130: Extract duplicated response.done construction into a helper method. The response.done event construction logic is duplicated between the `[DONE]` handling path (lines 108-130) and the EOF path (lines 174-196). This violates DRY and risks divergence during future changes.

♻️ Proposed refactor

```diff
+func (sc *OpenAIResponsesStreamConverter) buildDoneMessage() string {
+	responseData := map[string]interface{}{
+		"id":         sc.responseID,
+		"object":     "response",
+		"status":     "completed",
+		"model":      sc.model,
+		"provider":   sc.provider,
+		"created_at": time.Now().Unix(),
+	}
+	if sc.cachedUsage != nil {
+		responseData["usage"] = sc.cachedUsage
+	}
+	doneEvent := map[string]interface{}{
+		"type":     "response.done",
+		"response": responseData,
+	}
+	jsonData, err := json.Marshal(doneEvent)
+	if err != nil {
+		slog.Error("failed to marshal response.done event", "error", err, "response_id", sc.responseID)
+		return ""
+	}
+	return fmt.Sprintf("event: response.done\ndata: %s\n\ndata: [DONE]\n\n", jsonData)
+}
```

Then replace both duplicated blocks with:

```go
sc.sentDone = true
doneMsg := sc.buildDoneMessage()
if doneMsg != "" {
	sc.buffer = append(sc.buffer, []byte(doneMsg)...)
}
```

Also applies to: 174-196
🤖 Fix all issues with AI agents
In @.claude/settings.local.json:
- Around line 16-18: The permission entry Bash(mongosh:*) is too broad — replace
it with a least‑privilege alternative: either change Bash(mongosh:*) to require
confirmation (move it to an "ask" scope) or restrict it to explicit
subcommands/targets (e.g., list only the exact mongosh actions needed such as
count/find/aggregate or specific database/collection identifiers) so arbitrary
shell commands cannot be executed; update the settings entry accordingly and add
a brief comment explaining why the narrower scope or "ask" mode was chosen.
♻️ Duplicate comments (8)
CLAUDE.md (1)
208-236: Consider the past feedback about clarifying Docker's optional nature. The section provides clear manual testing instructions with correct commands and connection details. The previously raised concern about potential confusion with the "No Docker or external dependencies needed for testing" statement (line 205) remains relevant.
internal/providers/anthropic/anthropic.go (1)
447-454: Silent fallback to current time masks data issues. Returning `time.Now().Unix()` on parse failure silently masks upstream data problems and can cause unexpected model ordering if models are sorted by creation time. Consider logging the error and returning `0` to make issues visible.

🔧 Suggested fix

```diff
 func parseCreatedAt(createdAt string) int64 {
 	t, err := time.Parse(time.RFC3339, createdAt)
 	if err != nil {
-		return time.Now().Unix()
+		slog.Warn("invalid model created_at timestamp", "value", createdAt, "error", err)
+		return 0
 	}
 	return t.Unix()
 }
```

Note: This will require updating any tests that expect non-zero values for invalid timestamps.
internal/usage/stream_wrapper.go (3)
46-52: Reallocate the rolling buffer when trimming to cap memory. Line 49 resets the buffer but keeps capacity, so a single large stream can retain a large backing array even after trimming. Re-buffer the tail into a fresh buffer instead.

♻️ Suggested fix

```diff
 if w.buffer.Len() > SSEBufferSize {
 	// Discard old data, keep recent
 	data := w.buffer.Bytes()
-	w.buffer.Reset()
-	if _, errBuf := w.buffer.Write(data[len(data)-SSEBufferSize:]); errBuf != nil {
-		return n, errBuf
-	}
+	tail := append([]byte(nil), data[len(data)-SSEBufferSize:]...)
+	w.buffer = *bytes.NewBuffer(tail)
 }
```
105-148: Use typed structs for SSE/usage payload parsing. Parsing into `map[string]interface{}` forces float64 casts and bypasses compile-time checks. Define typed structs with `json.RawMessage` for the usage payload, then unmarshal into a typed usage struct (keeping provider-specific extras in RawData). As per coding guidelines, avoid map-based payloads for API responses.

♻️ Suggested refactor (outline)

```diff
+type sseChunk struct {
+	ID       string          `json:"id"`
+	Model    string          `json:"model"`
+	Type     string          `json:"type"`
+	Usage    json.RawMessage `json:"usage"`
+	Response *struct {
+		ID    string          `json:"id"`
+		Model string          `json:"model"`
+		Usage json.RawMessage `json:"usage"`
+	} `json:"response,omitempty"`
+}
+
+type usagePayload struct {
+	PromptTokens     int `json:"prompt_tokens"`
+	InputTokens      int `json:"input_tokens"`
+	CompletionTokens int `json:"completion_tokens"`
+	OutputTokens     int `json:"output_tokens"`
+	TotalTokens      int `json:"total_tokens"`
+	// add provider-specific fields as needed
+}
 ...
-	var chunk map[string]interface{}
-	if err := json.Unmarshal(data, &chunk); err != nil {
+	var chunk sseChunk
+	if err := json.Unmarshal(data, &chunk); err != nil {
 		return nil
 	}
```
150-168: Derive total_tokens when only input/output are present. If total_tokens is omitted, usage is undercounted even when input/output are present. Derive totalTokens as the sum when missing.

🐛 Suggested fix

```diff
 if v, ok := usageMap["total_tokens"].(float64); ok {
 	totalTokens = int(v)
 }
+if totalTokens == 0 && (inputTokens > 0 || outputTokens > 0) {
+	totalTokens = inputTokens + outputTokens
+}
```

internal/usage/factory.go (3)
23-40: “Safe to call multiple times” can be incorrect if Logger.Close isn’t idempotent. Line 24 claims idempotency, but if Logger.Close or Storage.Close can panic on double-close, Result.Close isn’t safe. Either remove the claim or guard with a `sync.Once`/flag.

🐛 Suggested fix (guard Close)

```diff
 type Result struct {
 	Logger  LoggerInterface
 	Storage storage.Storage
+	closeOnce sync.Once
 }

 func (r *Result) Close() error {
-	var errs []error
-	if r.Logger != nil {
-		if err := r.Logger.Close(); err != nil {
-			errs = append(errs, fmt.Errorf("logger close: %w", err))
-		}
-	}
-	if r.Storage != nil {
-		if err := r.Storage.Close(); err != nil {
-			errs = append(errs, fmt.Errorf("storage close: %w", err))
-		}
-	}
-	if len(errs) > 0 {
-		return fmt.Errorf("close errors: %w", errors.Join(errs...))
-	}
-	return nil
+	var outErr error
+	r.closeOnce.Do(func() {
+		var errs []error
+		if r.Logger != nil {
+			if err := r.Logger.Close(); err != nil {
+				errs = append(errs, fmt.Errorf("logger close: %w", err))
+			}
+		}
+		if r.Storage != nil {
+			if err := r.Storage.Close(); err != nil {
+				errs = append(errs, fmt.Errorf("storage close: %w", err))
+			}
+		}
+		if len(errs) > 0 {
+			outErr = fmt.Errorf("close errors: %w", errors.Join(errs...))
+		}
+	})
+	return outErr
 }
```

To verify idempotency, inspect Logger.Close:

```bash
#!/bin/bash
rg -n 'func \(.*Logger\) Close' internal/usage/logger.go -A20
```
66-71: Don’t drop the cleanup error from store.Close(). If createUsageStore fails, the cleanup error is discarded. Capture and return/log it so cleanup failures aren’t lost.

🐛 Suggested fix

```diff
 usageStore, err := createUsageStore(store, cfg.Usage.RetentionDays)
 if err != nil {
-	store.Close()
-	return nil, err
+	closeErr := store.Close()
+	if closeErr != nil {
+		return nil, fmt.Errorf("%w (also failed to close storage: %v)", err, closeErr)
+	}
+	return nil, err
 }
```
130-139: Avoid hardcoded storage defaults if shared constants exist. The SQLite path and MongoDB database defaults are literal strings here; consider using shared defaults from the storage/config layer to avoid drift.
The NotFound case in ParseProviderError was dropping the original error, losing debugging context. Other error branches preserve this information.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add context-based request ID passing via core.WithRequestID/GetRequestID
- Fix middleware to set X-Request-ID on both request and response headers
- Update handlers to enrich context with request ID before calling providers
- Forward request ID to all providers with provider-specific header names:
  - OpenAI, Groq, xAI: X-Request-ID
  - Anthropic: X-Request-Id
  - Gemini: X-Goog-Request-Id

This enables end-to-end request tracing from client through GOModel to upstream providers. Clients can override the auto-generated ID by passing their own X-Request-ID header.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
internal/auditlog/middleware.go (1)
47-58: Request ID propagation is skipped when audit logging is disabled. Because the middleware returns early before this block, requests can lose IDs (and usage entries will miss correlation) when `AuditLogger` is off. Consider generating/setting the ID before the early-return or moving it to a lightweight middleware.

🔧 Suggested adjustment

```diff
-	// Skip if logging is disabled
-	if logger == nil || !logger.Config().Enabled {
-		return next(c)
-	}
-
-	cfg := logger.Config()
-
-	start := time.Now()
-	req := c.Request()
-
-	// Generate request ID if not present
-	requestID := req.Header.Get("X-Request-ID")
+	req := c.Request()
+	// Generate request ID even when audit logging is disabled
+	requestID := req.Header.Get("X-Request-ID")
 	if requestID == "" {
 		requestID = uuid.NewString()
 	}
 	req.Header.Set("X-Request-ID", requestID)
 	c.Response().Header().Set("X-Request-ID", requestID)
+
+	// Skip if logging is disabled
+	if logger == nil || !logger.Config().Enabled {
+		return next(c)
+	}
+
+	cfg := logger.Config()
+	start := time.Now()
```

internal/providers/groq/groq.go (1)
138-151: Consider strongly-typed input parsing. Per coding guidelines, request/response payloads should prefer strongly-typed structs over `interface{}` and `map[string]interface{}`. While this internal conversion logic handles dynamic `ResponsesRequest.Input`, the per-request type assertions add runtime overhead and reduce type safety.

If `core.ResponsesRequest.Input` must remain `interface{}` for API flexibility, consider centralizing the input parsing logic into a shared utility with well-defined intermediate types, reducing duplicated type assertions across providers.
🤖 Fix all issues with AI agents
In `@internal/providers/gemini/gemini.go`:
- Around line 78-81: The code forwards a context request ID using
core.GetRequestID and sets header via req.Header.Set("X-Goog-Request-Id", ...)
even though Gemini may not accept that header; update the behavior to either (A)
verify Gemini acceptance and keep using "X-Goog-Request-Id" if confirmed, or (B)
change to a standard tracing header by replacing the header key with
"X-Request-Id" (or add both headers) so request tracing works; modify the logic
around core.GetRequestID(req.Context()) and req.Header.Set to implement the
chosen header(s) and ensure tests/logs reflect the header name change.
In `@internal/providers/openai/openai.go`:
- Around line 61-64: Replace the incorrect header "X-Request-ID" with OpenAI's
expected "X-Client-Request-Id" when forwarding the request ID from context
(where core.GetRequestID(req.Context()) is used); additionally validate the
request ID for ASCII-only characters and a maximum length of 512 bytes and only
set the header if the value passes validation (otherwise omit it) to avoid
triggering OpenAI 400 errors.
♻️ Duplicate comments (1)
internal/providers/anthropic/anthropic.go (1)
452-459: Consider logging parse errors instead of silently falling back to current time. Returning `time.Now().Unix()` on parse failure masks upstream data issues and could unexpectedly affect model list ordering. Since `slog` is already imported, consider logging the error for observability while deciding on an appropriate fallback value.

🔧 Suggested fix

```diff
 func parseCreatedAt(createdAt string) int64 {
 	t, err := time.Parse(time.RFC3339, createdAt)
 	if err != nil {
+		slog.Warn("failed to parse model created_at timestamp", "value", createdAt, "error", err)
 		return time.Now().Unix()
 	}
 	return t.Unix()
 }
```
- Reduce from 272 to 84 lines (~69% reduction)
- Fix incorrect claim about init() registration (now explicit factory.Register())
- Fix incorrect 5-minute refresh interval (actually 1 hour default)
- Add clarification that Docker Compose is for manual testing only
- Update config section to reference .env.template and config/config.yaml

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@CLAUDE.md`:
- Line 81: Update the phrasing on the line that reads "requires initialized
registry—check `ModelCount() > 0`" to use correct grammar; for example change it
to "requires an initialized registry—check `ModelCount() > 0`" or "requires the
registry to be initialized—check `ModelCount() > 0`" so the sentence reads
naturally.
- Line 9: Replace the bare URL "https://github.com/ENTERPILOT/GOModel" in the
Module line with a markdown link; e.g., update the line containing `**Module:**
`gomodel` | **Go:** 1.24.0 | **Repo:** https://github.com/ENTERPILOT/GOModel` to
use a labeled link such as `**Module:** `gomodel` | **Go:** 1.24.0 | **Repo:**
[GOModel](https://github.com/ENTERPILOT/GOModel)` so the URL is formatted as a
proper markdown link and avoids linter warnings.
- Around line 52-62: The fenced code block in the CLAUDE.md snippet (the block
beginning with ```) is missing a language specifier causing a markdown linter
warning; update that fenced block to include a language like text or plaintext
(e.g., change the opening fence from ``` to ```text) so the directory listing
block is properly marked and the linter warning is resolved.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@internal/providers/openai/openai_test.go`:
- Around line 753-809: Add tests for control characters to
TestIsValidClientRequestID by including cases like "\x00", "\t", and "\x7F"
(expecting invalid) to document handling of non-printable ASCII; and add an
integration test for setHeaders that constructs a provider via
NewWithHTTPClient, creates an http.Request with a context set via
core.WithRequestID, calls provider.setHeaders(req) and asserts the
"X-Client-Request-Id" header is set to the request ID for valid IDs and absent
for empty/invalid IDs (reference functions: isValidClientRequestID, setHeaders,
NewWithHTTPClient, core.WithRequestID, header "X-Client-Request-Id").
```go
func TestIsValidClientRequestID(t *testing.T) {
	tests := []struct {
		name  string
		id    string
		valid bool
	}{
		{
			name:  "valid UUID",
			id:    "123e4567-e89b-12d3-a456-426614174000",
			valid: true,
		},
		{
			name:  "valid short ID",
			id:    "req-123",
			valid: true,
		},
		{
			name:  "valid empty string",
			id:    "",
			valid: true,
		},
		{
			name:  "valid 512 chars",
			id:    strings.Repeat("a", 512),
			valid: true,
		},
		{
			name:  "invalid - 513 chars (too long)",
			id:    strings.Repeat("a", 513),
			valid: false,
		},
		{
			name:  "invalid - non-ASCII character",
			id:    "req-123-日本語",
			valid: false,
		},
		{
			name:  "invalid - emoji",
			id:    "req-123-🎉",
			valid: false,
		},
		{
			name:  "valid - all printable ASCII",
			id:    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.",
			valid: true,
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			got := isValidClientRequestID(tt.id)
			if got != tt.valid {
				t.Errorf("isValidClientRequestID(%q) = %v, want %v", tt.id, got, tt.valid)
			}
		})
	}
}
```
🧹 Nitpick | 🔵 Trivial
Good test coverage for the validation function.
The test cases comprehensively cover boundary conditions and invalid inputs. Consider adding test cases for control characters (e.g., "\x00", "\t", "\x7F") to document the expected behavior for these edge cases.
Additionally, consider adding an integration test for setHeaders that verifies the X-Client-Request-Id header is actually set when a valid request ID is present in the context:
🧪 Suggested test for header integration

```go
func TestSetHeadersWithRequestID(t *testing.T) {
	provider := NewWithHTTPClient("test-api-key", nil, llmclient.Hooks{})
	tests := []struct {
		name      string
		requestID string
		expectSet bool
	}{
		{"valid ID", "req-123", true},
		{"empty ID", "", false},
		{"invalid non-ASCII", "req-日本語", false},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			req, _ := http.NewRequest(http.MethodGet, "http://example.com", nil)
			ctx := context.Background()
			if tt.requestID != "" {
				ctx = core.WithRequestID(ctx, tt.requestID)
			}
			req = req.WithContext(ctx)
			provider.setHeaders(req)
			got := req.Header.Get("X-Client-Request-Id")
			if tt.expectSet && got != tt.requestID {
				t.Errorf("X-Client-Request-Id = %q, want %q", got, tt.requestID)
			}
			if !tt.expectSet && got != "" {
				t.Errorf("X-Client-Request-Id should not be set, got %q", got)
			}
		})
	}
}
```

🤖 Prompt for AI Agents
In `@internal/providers/openai/openai_test.go` around lines 753 - 809, Add tests
for control characters to TestIsValidClientRequestID by including cases like
"\x00", "\t", and "\x7F" (expecting invalid) to document handling of
non-printable ASCII; and add an integration test for setHeaders that constructs
a provider via NewWithHTTPClient, creates an http.Request with a context set via
core.WithRequestID, calls provider.setHeaders(req) and asserts the
"X-Client-Request-Id" header is set to the request ID for valid IDs and absent
for empty/invalid IDs (reference functions: isValidClientRequestID, setHeaders,
NewWithHTTPClient, core.WithRequestID, header "X-Client-Request-Id").
Entries are queued asynchronously; if the source map is mutated after Write, the logged data can be corrupted. Added defensive copy via cloneRawData helper for all three extractor functions.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
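The defensive copy this commit adds can be sketched as a shallow map clone (the helper's exact signature is an assumption; only its name comes from the commit message). A shallow copy protects the queued entry against the caller mutating top-level keys after Write; nested values would still be shared:

```go
package main

import "fmt"

// cloneRawData returns a shallow copy of the provider-specific usage map so
// a later mutation by the caller cannot corrupt the queued log entry.
func cloneRawData(src map[string]any) map[string]any {
	if src == nil {
		return nil
	}
	dst := make(map[string]any, len(src))
	for k, v := range src {
		dst[k] = v
	}
	return dst
}

func main() {
	orig := map[string]any{"cached_tokens": 100}
	queued := cloneRawData(orig) // what the async logger holds
	orig["cached_tokens"] = 999  // caller mutates after enqueue
	fmt.Println(queued["cached_tokens"])
}
```

If providers ever place nested maps or slices into RawData, a deep copy (or the typed-struct refactor suggested earlier in this thread) would be needed for full isolation.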
Summary by CodeRabbit
New Features
Bug Fixes
Configuration
Documentation
Tests