feat: add OpenAI-compatible batch API with usage and audit logging #108
SantiagoDePolonia merged 7 commits into main
Conversation
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (9)
📝 Walkthrough
Adds end-to-end OpenAI-compatible files and batch processing: new core types and provider interfaces, provider implementations and adapters, multi-backend batch storage with factory, server HTTP routes and handlers, guarded per-item processing, usage/audit updates, app lifecycle wiring, and extensive tests.
Sequence Diagram(s)

sequenceDiagram
participant Client
participant HTTP as "HTTP Handler"
participant Router as "Provider Router"
participant Provider as "Concrete Provider"
participant Store as "Batch Store"
participant DB as "Storage Backend"
Client->>HTTP: POST /v1/batches (BatchRequest)
HTTP->>HTTP: validate & determine provider type
HTTP->>Router: CreateBatch(providerType, req)
Router->>Provider: CreateBatch(req)
Provider-->>Router: BatchResponse
Router-->>HTTP: BatchResponse (tagged with provider)
HTTP->>Store: Create(storedBatch)
Store->>DB: persist batch
DB-->>Store: persist ok
Store-->>HTTP: created
HTTP-->>Client: 200 BatchResponse
Client->>HTTP: GET /v1/batches/{id}/results
HTTP->>Store: Get(id)
Store->>DB: query
DB-->>Store: batch payload
Store-->>HTTP: BatchResponse
alt provider supports results
HTTP->>Router: GetBatchResults(providerType, id)
Router->>Provider: GetBatchResults(id)
Provider-->>Router: BatchResults
Router-->>HTTP: BatchResults
HTTP-->>Client: 200 BatchResults
else
HTTP-->>Client: 400/unsupported
end
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 16
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
internal/auditlog/stream_wrapper.go (1)
425-434: ⚠️ Potential issue | 🟡 Minor

Avoid prefix overmatching in model-path detection.

Using `strings.HasPrefix(path, p)` will also match unrelated routes like `/v1/batcheship`, which can incorrectly force audit logging under `OnlyModelInteractions`.

Suggested fix

```diff
 func IsModelInteractionPath(path string) bool {
 	modelPaths := []string{
 		"/v1/chat/completions",
 		"/v1/responses",
 		"/v1/embeddings",
 		"/v1/batches",
 	}
 	for _, p := range modelPaths {
-		if strings.HasPrefix(path, p) {
+		if path == p || strings.HasPrefix(path, p+"/") || strings.HasPrefix(path, p+"?") {
 			return true
 		}
 	}
 	return false
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/auditlog/stream_wrapper.go` around lines 425 - 434, The IsModelInteractionPath function currently uses strings.HasPrefix against modelPaths which overmatches (e.g., "/v1/batcheship"); update IsModelInteractionPath to only accept exact matches or a path segment boundary after the prefix by checking that path == p or that the character immediately following the matched prefix is '/' or '?' (to allow query strings) or the prefix ends the string; locate the modelPaths slice and the loop in IsModelInteractionPath and replace the simple HasPrefix check with this stricter boundary-aware check.

internal/app/app.go (2)
133-137: ⚠️ Potential issue | 🟠 Major

Close batch resources on guardrail initialization failure.

Line 133 error cleanup omits `app.batch.Close()`. If guardrail pipeline creation fails after batch init, batch resources are leaked.

🧩 Suggested fix

```diff
-	closeErr := errors.Join(app.usage.Close(), app.audit.Close(), app.providers.Close())
+	closeErr := errors.Join(
+		app.batch.Close(),
+		app.usage.Close(),
+		app.audit.Close(),
+		app.providers.Close(),
+	)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/app/app.go` around lines 133 - 137, The error cleanup when guardrail initialization fails is missing app.batch.Close(), leaking resources; update the cleanup block (where errors.Join is called with app.usage.Close(), app.audit.Close(), app.providers.Close()) to include app.batch.Close() as well, and ensure the returned error message still wraps the original err and includes the combined closeErr information (i.e., call errors.Join(app.batch.Close(), app.usage.Close(), app.audit.Close(), app.providers.Close()) and propagate the combined close error in the fmt.Errorf).
233-239: ⚠️ Potential issue | 🟡 Minor

Update `Shutdown` doc comment to match current lifecycle steps.

The public comment still documents a 3-step sequence and omits usage/batch behavior added in this change.

As per coding guidelines `**/*.go`: "Maintain Go doc comments on all exported types, functions, and interfaces; update when changing public APIs or function signatures/behavior".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/app/app.go` around lines 233 - 239, The exported doc comment for Shutdown is out of date—update the Shutdown comment (the Go doc for the exported function Shutdown) to precisely describe the current lifecycle and ordering (including the HTTP server shutdown, stopping background refresh goroutine and cache, any usage/batch behavior added, audit logging, and cleanup), note that it is safe to call multiple times (idempotent), and reflect any concurrency/timeout semantics or error handling the implementation now uses so the comment matches the function's actual behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@AGENTS.md`:
- Line 3: The MD022 violations are caused by missing blank lines after several
markdown headings (e.g., the "## Purpose" heading and the other headings at
lines referenced); update the AGENTS.md file by inserting a single blank line
immediately after each heading (such as "## Purpose" and the other top-level
headings referenced) so every heading is followed by one empty line to satisfy
markdownlint-cli2 MD022.
In `@internal/batch/factory.go`:
- Around line 41-43: The New function calls buildStorageConfig(cfg) without
checking cfg for nil which can panic; add the same nil-config guard used in
NewWithSharedStorage: if cfg is nil return a clear error (e.g., fmt.Errorf or
wrapped error) before calling buildStorageConfig, so New validates cfg, avoids
panic, and returns an error when cfg == nil instead of dereferencing it.
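The guard described above can be sketched as follows; `Config` and `buildStorageConfig` here are simplified stand-ins for the gateway's actual types, not its real definitions:

```go
package main

import "errors"

// Config stands in for the gateway's batch configuration type; the real
// field set lives in internal/batch.
type Config struct {
	Backend string
}

func buildStorageConfig(cfg *Config) string {
	// Dereferences cfg, so callers must guarantee it is non-nil.
	return cfg.Backend
}

// New validates cfg before building the storage config, mirroring the
// nil guard already present in NewWithSharedStorage.
func New(cfg *Config) (string, error) {
	if cfg == nil {
		return "", errors.New("batch: config must not be nil")
	}
	return buildStorageConfig(cfg), nil
}
```

Returning an error rather than panicking keeps startup failures reportable through the normal error path.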
In `@internal/batch/store_postgresql.go`:
- Around line 26-47: The schema initialization currently calls
context.Background() which prevents cancellation; change the constructor that
returns *PostgreSQLStore to accept a context.Context parameter (e.g., func
NewPostgreSQLStore(ctx context.Context, pool *pgxpool.Pool) (*PostgreSQLStore,
error)) and replace context.Background() with the passed ctx for all pool.Exec
calls used to CREATE TABLE and CREATE INDEX; also update callers to pass a
cancellable/timeout context so startup DDL can be cancelled or time-limited.
In `@internal/batch/store.go`:
- Around line 51-59: The serializeBatch function currently marshals a
BatchResponse without validating its ID, allowing DB-backed stores to persist
empty IDs; update serializeBatch to return an error if batch is nil or batch.ID
is empty (e.g., zero-length string) so callers like Create/Update and DB-backed
stores cannot serialize/persist records with missing IDs—keep the existing error
wrapping for json.Marshal and ensure the error message clearly states "batch ID
is empty" and references batch.ID in the check.
In `@internal/core/batch.go`:
- Around line 55-62: Change BatchListResponse.Data from []BatchResponse to
[]*BatchResponse to avoid copying large structs; update all call sites that
construct or iterate Data (including any handlers that build responses and loops
over Data) to allocate and append pointers to BatchResponse (use &batch or
new(BatchResponse)) and dereference where necessary when accessing fields, and
ensure JSON marshaling behavior remains correct for the modified slice type.
In `@internal/providers/anthropic/anthropic.go`:
- Around line 1001-1010: The loop that reads scanner.Scan() currently ignores
json.Unmarshal errors for anthropicBatchResultLine, which can silently drop
malformed result lines; update the error path so instead of simply continuing on
err := json.Unmarshal(line, &row), log the failure (include the error and the
raw trimmed line) using the package or function logger, then continue, so
callers can see which lines were malformed; reference the scanner.Scan() loop,
the json.Unmarshal call, and the anthropicBatchResultLine type when making this
change.
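The stricter error path above can be sketched like this; `resultLine` is a trimmed stand-in for `anthropicBatchResultLine`, and the standard-library logger replaces the package logger:

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"log"
)

// resultLine is a simplified stand-in for anthropicBatchResultLine.
type resultLine struct {
	CustomID string `json:"custom_id"`
}

// parseResultLines decodes newline-delimited JSON results, logging
// (rather than silently dropping) lines that fail to unmarshal.
func parseResultLines(data []byte) []resultLine {
	var rows []resultLine
	scanner := bufio.NewScanner(bytes.NewReader(data))
	for scanner.Scan() {
		line := bytes.TrimSpace(scanner.Bytes())
		if len(line) == 0 {
			continue
		}
		var row resultLine
		if err := json.Unmarshal(line, &row); err != nil {
			// Surface the malformed line instead of continuing silently.
			log.Printf("skipping malformed batch result line %q: %v", line, err)
			continue
		}
		rows = append(rows, row)
	}
	return rows
}
```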
- Around line 925-965: In ListBatches, make the pagination mapping explicit and
convert HTTP errors to GatewayErrors: either rename the function parameter
"after" to "before" (or add a clear comment) to reflect that it is mapped to
Anthropic's "before_id" which returns most-recent-first results, and ensure the
request error is passed through core.ParseProviderError (replace "return nil,
err" after p.client.Do(...) with "return nil, core.ParseProviderError(err)") so
provider errors are normalized; refer to ListBatches, the "after" parameter, the
"before_id" query parameter, p.client.Do, and core.ParseProviderError when
making the change.
In `@internal/providers/gemini/gemini.go`:
- Around line 283-291: ListBatches returns core.BatchListResponse items without
normalizing ProviderBatchID; update the ListBatches implementation to iterate
resp.Items before returning and backfill item.ProviderBatchID when empty using
the same logic used in CreateBatch/GetBatch/CancelBatch (e.g., set
ProviderBatchID = item.BatchID or the equivalent source field used elsewhere).
Locate the ListBatches function where resp core.BatchListResponse is populated,
add the normalization loop over resp.Items to set ProviderBatchID for missing
entries, then return &resp.
In `@internal/providers/openai/openai.go`:
- Around line 263-271: ListBatches currently returns core.BatchListResponse
without normalizing missing ProviderBatchID like
CreateBatch/GetBatch/CancelBatch do; after p.client.Do fills resp in
ListBatches, iterate over the response items (resp.Data / resp.Items in
core.BatchListResponse) and apply the same fallback: if an item's
ProviderBatchID is empty, set it to the item's ID. This mirrors the
normalization in CreateBatch/GetBatch/CancelBatch so consumers get consistent
batch identities.
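The backfill pattern shared by the openai, gemini, and xai comments can be factored into one helper; the struct fields below are trimmed stand-ins for the `core` package types:

```go
package main

// BatchResponse and BatchListResponse are trimmed stand-ins for the
// core package's batch types.
type BatchResponse struct {
	ID              string
	ProviderBatchID string
}

type BatchListResponse struct {
	Data []BatchResponse
}

// normalizeBatchIDs backfills ProviderBatchID from ID, matching the
// fallback already applied in CreateBatch/GetBatch/CancelBatch.
func normalizeBatchIDs(resp *BatchListResponse) {
	for i := range resp.Data {
		if resp.Data[i].ProviderBatchID == "" {
			resp.Data[i].ProviderBatchID = resp.Data[i].ID
		}
	}
}
```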
In `@internal/providers/router.go`:
- Around line 207-224: ListBatches currently returns the native provider
response directly; capture the result (e.g., resp := bp.ListBatches(...)), then
set resp.Provider = providerType for consistency with
CreateBatch/GetBatch/CancelBatch and also set the Provider field on each
returned batch item (iterate resp.Items or resp.Batches and assign Provider =
providerType) before returning; use the symbols Router.ListBatches,
providerType, provider, bp and core.BatchListResponse to locate and update the
code.
In `@internal/providers/xai/xai.go`:
- Around line 203-211: ListBatches returns raw items and thus omits the backfill
of ProviderBatchID, causing inconsistent payloads with
CreateBatch/GetBatch/CancelBatch; update the ListBatches handler (function
ListBatches) to iterate resp.Items in the returned core.BatchListResponse and
for each item set item.ProviderBatchID = item.ID when ProviderBatchID is empty
or nil, ensuring the same normalization logic used in
CreateBatch/GetBatch/CancelBatch is applied before returning &resp.
In `@internal/server/handlers.go`:
- Around line 664-674: isNativeBatchResultsPending currently hardcodes Anthropic
behavior; update it to be extensible by replacing the direct
strings.EqualFold(gatewayErr.Provider, "anthropic") check with a lookup against
a configurable set/map of providers that treat 404 as "results pending" (e.g.,
pendingProviders := map[string]bool{"anthropic": true}) or call a new helper
like providerTreats404AsPending(provider string) that encapsulates
provider-specific logic; ensure you still validate the error type
(core.GatewayError) and HTTPStatusCode(), and add a brief comment in
isNativeBatchResultsPending explaining the design choice and where to add other
providers in the future.
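A sketch of the map-based extensibility suggested above; the real handler inspects a `core.GatewayError`, while this standalone version takes the provider name and HTTP status directly to stay self-contained:

```go
package main

import (
	"net/http"
	"strings"
)

// pending404Providers lists providers whose results endpoint returns 404
// until batch processing completes. Anthropic is the only known case;
// add other providers here as they are identified.
var pending404Providers = map[string]bool{
	"anthropic": true,
}

func providerTreats404AsPending(provider string) bool {
	return pending404Providers[strings.ToLower(provider)]
}

// isNativeBatchResultsPending reports whether a 404 from the given
// provider should be read as "results not ready yet" rather than an error.
func isNativeBatchResultsPending(provider string, statusCode int) bool {
	return statusCode == http.StatusNotFound && providerTreats404AsPending(provider)
}
```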
- Around line 642-662: mergeStoredBatchFromUpstream currently replaces
stored.Metadata with upstream.Metadata which discards gateway-injected metadata
(e.g., provider, provider_batch_id) set earlier; change the logic in
mergeStoredBatchFromUpstream so that instead of assigning stored.Metadata =
upstream.Metadata you merge them: if stored.Metadata is nil allocate a map and
copy all keys from upstream.Metadata, otherwise copy keys from upstream.Metadata
into stored.Metadata but preserve existing gateway keys (at minimum "provider"
and "provider_batch_id") by not overwriting them; ensure this merge handles nil
maps safely and retains upstream values for non-gateway keys.
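The merge described above might look like the following; the key set and map types are assumptions for illustration, not the gateway's actual definitions:

```go
package main

// gatewayMetadataKeys are injected by the gateway and must survive a
// refresh from the upstream provider.
var gatewayMetadataKeys = map[string]bool{
	"provider":          true,
	"provider_batch_id": true,
}

// mergeMetadata copies upstream metadata into the stored map without
// overwriting gateway-injected keys; both inputs may be nil.
func mergeMetadata(stored, upstream map[string]string) map[string]string {
	if stored == nil {
		stored = make(map[string]string, len(upstream))
	}
	for k, v := range upstream {
		if gatewayMetadataKeys[k] {
			if _, exists := stored[k]; exists {
				continue // preserve the gateway's value
			}
		}
		stored[k] = v
	}
	return stored
}
```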
- Around line 586-611: When calling h.batchStore.Update after
mergeStoredBatchFromUpstream and after assigning stored.Results, don't ignore
the returned error; capture it (e.g., err :=
h.batchStore.Update(c.Request().Context(), stored)) and log it with context
(batch ID, provider, ProviderBatchID and the error) using the existing logger
(or a suitable logger in scope) so failures are observable; keep the existing
behavior of not failing the response but ensure the log message is descriptive
(e.g., "failed to update batch store for batchID=%s provider=%s
providerBatchID=%s: %v").
In `@internal/server/model_validation.go`:
- Around line 28-33: The current middleware check uses
strings.HasPrefix(c.Request().URL.Path, "/v1/batches") which accidentally
matches routes like "/v1/batchesXYZ"; update the condition in the middleware
that sets the request ID (the block using c.Request().URL.Path,
core.WithRequestID, c.SetRequest and next(c)) to only match the exact collection
root or its subresources by changing the predicate to: path == "/v1/batches" ||
strings.HasPrefix(path, "/v1/batches/"). This ensures only "/v1/batches" and
paths under "/v1/batches/" get the special handling.
In `@internal/usage/extractor.go`:
- Around line 232-233: The current check if !strings.HasPrefix(endpoint,
"/v1/batches") { return pricing } is too broad and can match routes like
"/v1/batcheship"; update the condition to only treat exact "/v1/batches" or
paths under it as batch endpoints by checking endpoint == "/v1/batches" ||
strings.HasPrefix(endpoint, "/v1/batches/") (use the existing endpoint variable
and keep the same return pricing behavior), ensuring batch rates are applied
only for true batch routes.
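The boundary-aware predicate recommended here (and in the similar path-matching comments above) can be factored into a small helper:

```go
package main

import "strings"

// isUnderPath reports whether path is exactly root or a subresource of
// root, so "/v1/batches/abc" matches but "/v1/batcheship" does not.
func isUnderPath(path, root string) bool {
	return path == root || strings.HasPrefix(path, root+"/")
}
```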
---
Outside diff comments:
In `@internal/app/app.go`:
- Around line 133-137: The error cleanup when guardrail initialization fails is
missing app.batch.Close(), leaking resources; update the cleanup block (where
errors.Join is called with app.usage.Close(), app.audit.Close(),
app.providers.Close()) to include app.batch.Close() as well, and ensure the
returned error message still wraps the original err and includes the combined
closeErr information (i.e., call errors.Join(app.batch.Close(),
app.usage.Close(), app.audit.Close(), app.providers.Close()) and propagate the
combined close error in the fmt.Errorf).
- Around line 233-239: The exported doc comment for Shutdown is out of
date—update the Shutdown comment (the Go doc for the exported function Shutdown)
to precisely describe the current lifecycle and ordering (including the HTTP
server shutdown, stopping background refresh goroutine and cache, any
usage/batch behavior added, audit logging, and cleanup), note that it is safe to
call multiple times (idempotent), and reflect any concurrency/timeout semantics
or error handling the implementation now uses so the comment matches the
function's actual behavior.
In `@internal/auditlog/stream_wrapper.go`:
- Around line 425-434: The IsModelInteractionPath function currently uses
strings.HasPrefix against modelPaths which overmatches (e.g., "/v1/batcheship");
update IsModelInteractionPath to only accept exact matches or a path segment
boundary after the prefix by checking that path == p or that the character
immediately following the matched prefix is '/' or '?' (to allow query strings)
or the prefix ends the string; locate the modelPaths slice and the loop in
IsModelInteractionPath and replace the simple HasPrefix check with this stricter
boundary-aware check.
ℹ️ Review info
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (31)
AGENTS.md, README.md, config/config.go, internal/app/app.go, internal/auditlog/auditlog.go, internal/auditlog/auditlog_test.go, internal/auditlog/stream_wrapper.go, internal/batch/factory.go, internal/batch/store.go, internal/batch/store_memory.go, internal/batch/store_memory_test.go, internal/batch/store_mongodb.go, internal/batch/store_postgresql.go, internal/batch/store_sqlite.go, internal/batch/store_sqlite_test.go, internal/core/batch.go, internal/core/interfaces.go, internal/providers/anthropic/anthropic.go, internal/providers/gemini/gemini.go, internal/providers/groq/groq.go, internal/providers/ollama/ollama.go, internal/providers/openai/openai.go, internal/providers/router.go, internal/providers/xai/xai.go, internal/server/handlers.go, internal/server/handlers_test.go, internal/server/http.go, internal/server/model_validation.go, internal/server/model_validation_test.go, internal/usage/extractor.go, internal/usage/extractor_test.go
internal/batch/store_postgresql.go
(Outdated)
```go
ctx := context.Background()
_, err := pool.Exec(ctx, `
	CREATE TABLE IF NOT EXISTS batches (
		id TEXT PRIMARY KEY,
		created_at BIGINT NOT NULL,
		updated_at BIGINT NOT NULL,
		status TEXT NOT NULL,
		data JSONB NOT NULL
	)
`)
if err != nil {
	return nil, fmt.Errorf("failed to create batches table: %w", err)
}

if _, err := pool.Exec(ctx, "CREATE INDEX IF NOT EXISTS idx_batches_created_at ON batches(created_at DESC)"); err != nil {
	return nil, fmt.Errorf("failed to create batches created_at index: %w", err)
}
if _, err := pool.Exec(ctx, "CREATE INDEX IF NOT EXISTS idx_batches_status ON batches(status)"); err != nil {
	return nil, fmt.Errorf("failed to create batches status index: %w", err)
}

return &PostgreSQLStore{pool: pool}, nil
```
🧹 Nitpick | 🔵 Trivial
Consider accepting a context parameter for schema initialization.
The constructor uses context.Background() for table/index creation, which prevents cancellation during startup. For long-running DDL operations or network issues, this could block indefinitely.
♻️ Suggested improvement
-func NewPostgreSQLStore(pool *pgxpool.Pool) (*PostgreSQLStore, error) {
+func NewPostgreSQLStore(ctx context.Context, pool *pgxpool.Pool) (*PostgreSQLStore, error) {
if pool == nil {
return nil, fmt.Errorf("connection pool is required")
}
- ctx := context.Background()
	_, err := pool.Exec(ctx, `

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/batch/store_postgresql.go` around lines 26 - 47, The schema
initialization currently calls context.Background() which prevents cancellation;
change the constructor that returns *PostgreSQLStore to accept a context.Context
parameter (e.g., func NewPostgreSQLStore(ctx context.Context, pool
*pgxpool.Pool) (*PostgreSQLStore, error)) and replace context.Background() with
the passed ctx for all pool.Exec calls used to CREATE TABLE and CREATE INDEX;
also update callers to pass a cancellable/timeout context so startup DDL can be
cancelled or time-limited.
```go
// BatchListResponse is returned by GET /v1/batches.
type BatchListResponse struct {
	Object  string          `json:"object"`
	Data    []BatchResponse `json:"data"`
	HasMore bool            `json:"has_more"`
	FirstID string          `json:"first_id,omitempty"`
	LastID  string          `json:"last_id,omitempty"`
}
```
🧹 Nitpick | 🔵 Trivial
Consider using []*BatchResponse for the Data field to avoid large struct copies.
BatchResponse is a moderately large struct (multiple string fields, maps, slices). Using a value slice means every append/copy duplicates the entire struct. A pointer slice would be more memory-efficient for large batch lists.
♻️ Suggested change
type BatchListResponse struct {
Object string `json:"object"`
- Data []BatchResponse `json:"data"`
+ Data []*BatchResponse `json:"data"`
HasMore bool `json:"has_more"`
FirstID string `json:"first_id,omitempty"`
LastID string `json:"last_id,omitempty"`
}

Note: This would require updating callers that iterate over Data and the handlers that build responses.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/core/batch.go` around lines 55 - 62, Change BatchListResponse.Data
from []BatchResponse to []*BatchResponse to avoid copying large structs; update
all call sites that construct or iterate Data (including any handlers that build
responses and loops over Data) to allocate and append pointers to BatchResponse
(use &batch or new(BatchResponse)) and dereference where necessary when
accessing fields, and ensure JSON marshaling behavior remains correct for the
modified slice type.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
internal/auditlog/stream_wrapper.go (1)
426-435: ⚠️ Potential issue | 🟡 Minor

Use endpoint-boundary matching instead of raw prefix matching.

`strings.HasPrefix` here will also match non-endpoints like `/v1/filesystem` or `/v1/batches-extra`, causing unintended audit logging when `OnlyModelInteractions` is enabled.

🔧 Suggested fix

```diff
 func IsModelInteractionPath(path string) bool {
 	modelPaths := []string{
 		"/v1/chat/completions",
 		"/v1/responses",
 		"/v1/embeddings",
 		"/v1/files",
 		"/v1/batches",
 	}
 	for _, p := range modelPaths {
-		if strings.HasPrefix(path, p) {
+		if path == p || strings.HasPrefix(path, p+"/") || strings.HasPrefix(path, p+"?") {
 			return true
 		}
 	}
 	return false
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/auditlog/stream_wrapper.go` around lines 426 - 435, The current loop over modelPaths uses strings.HasPrefix(path, p) which wrongly matches things like "/v1/filesystem"; locate the loop in internal/auditlog/stream_wrapper.go (the modelPaths variable and the strings.HasPrefix check) and replace the naive prefix check with an endpoint-boundary check: return true only when the request path equals the endpoint (path == p) or the endpoint is followed by a valid boundary character (e.g., path starts with p + "/" or p + "?" or p + "#" or other separator you consider valid), so that "/v1/files" matches but "/v1/filesystem" does not.

internal/app/app.go (1)
243-248: ⚠️ Potential issue | 🟡 Minor

Update `Shutdown` doc comment to match the new lifecycle steps.

The exported `Shutdown` comment still describes the old sequence and omits the new batch-close step, which now changes behavior and operational expectations.

As per coding guidelines `**/*.go`: "Maintain Go doc comments on all exported types, functions, and interfaces; update when changing public APIs or function signatures/behavior".

Also applies to: 287-295
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/app/app.go` around lines 243 - 248, Update the exported Shutdown doc comment for the Shutdown function to reflect the current lifecycle order and include the new batch-close step: list each shutdown phase in order (HTTP server stop accepting new requests, batch close of pending operations/resources, background refresh goroutine and cache teardown, audit logging flush/close, and any other cleanup), and ensure the wording matches the function's behavior and public documentation requirements; also make the same comment update where the old text is repeated (the secondary comment block that mirrors Shutdown's doc) so both doc comments stay consistent with the new lifecycle.
♻️ Duplicate comments (7)
internal/batch/store.go (1)
51-59: ⚠️ Potential issue | 🟠 Major

Validate non-empty `batch.ID` in `serializeBatch`.

`serializeBatch` is used as the shared persistence serialization path. Without checking `batch.ID`, stores can persist invalid records with empty IDs.

🔧 Suggested fix

```diff
 func serializeBatch(batch *core.BatchResponse) ([]byte, error) {
 	if batch == nil {
 		return nil, fmt.Errorf("batch is nil")
 	}
+	if batch.ID == "" {
+		return nil, fmt.Errorf("batch ID is empty")
+	}
 	b, err := json.Marshal(batch)
 	if err != nil {
 		return nil, fmt.Errorf("marshal batch: %w", err)
 	}
 	return b, nil
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/batch/store.go` around lines 51 - 59, serializeBatch currently allows persisting batches with an empty ID; add validation at the start of serializeBatch (which accepts *core.BatchResponse) to return an error if batch.ID is empty or only whitespace (e.g., fmt.Errorf("batch ID is empty")), so stores never serialize records lacking a valid ID; update any callers/tests if they rely on previous behavior.

internal/server/model_validation.go (1)
28-33: ⚠️ Potential issue | 🟡 Minor

Tighten batch/file bypass path matching to exact root + subresource paths.

Line 28 uses broad `HasPrefix` checks, so paths like `/v1/batchesXYZ` or `/v1/filesXYZ` can incorrectly bypass validation.

♻️ Suggested fix

```diff
-	if strings.HasPrefix(c.Request().URL.Path, "/v1/batches") || strings.HasPrefix(c.Request().URL.Path, "/v1/files") {
+	p := c.Request().URL.Path
+	if p == "/v1/batches" || strings.HasPrefix(p, "/v1/batches/") ||
+		p == "/v1/files" || strings.HasPrefix(p, "/v1/files/") {
 		requestID := c.Request().Header.Get("X-Request-ID")
 		ctx := core.WithRequestID(c.Request().Context(), requestID)
 		c.SetRequest(c.Request().WithContext(ctx))
 		return next(c)
 	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/server/model_validation.go` around lines 28 - 33, The path bypass currently uses broad strings.HasPrefix checks on c.Request().URL.Path which lets paths like "/v1/batchesXYZ" bypass validation; update the conditional in the model validation middleware to only allow the exact root or its subpaths by checking equality OR a slash-prefixed prefix (e.g., path == "/v1/batches" || strings.HasPrefix(path, "/v1/batches/") and similarly for "/v1/files"), leaving the rest of the block (requestID extraction, context injection, c.SetRequest, return next) unchanged and targeting the same c.Request().URL.Path expressions.

internal/providers/openai/openai.go (1)
249-272: ⚠️ Potential issue | 🟡 Minor

Apply `ProviderBatchID` fallback for list items too.

`CreateBatch`, `GetBatch`, and `CancelBatch` normalize missing `ProviderBatchID`, but `ListBatches` does not. Consumers can receive inconsistent batch identities depending on the endpoint used.

🔧 Proposed fix

```diff
 	var resp core.BatchListResponse
 	err := p.client.Do(ctx, llmclient.Request{
 		Method:   http.MethodGet,
 		Endpoint: endpoint,
 	}, &resp)
 	if err != nil {
 		return nil, err
 	}
+	for i := range resp.Data {
+		if resp.Data[i].ProviderBatchID == "" {
+			resp.Data[i].ProviderBatchID = resp.Data[i].ID
+		}
+	}
 	return &resp, nil
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/providers/openai/openai.go` around lines 249 - 272, ListBatches currently returns core.BatchListResponse items without applying the same ProviderBatchID normalization used in CreateBatch, GetBatch, and CancelBatch, causing inconsistent batch identities; update Provider.ListBatches to iterate the returned batch items (in the response from p.client.Do) and set each item's ProviderBatchID to the fallback value (e.g., use the existing normalization logic used by CreateBatch/GetBatch/CancelBatch or a helper function) when ProviderBatchID is missing so list results match other endpoints. Ensure you reference and reuse the same normalization behavior used by CreateBatch/GetBatch/CancelBatch to avoid duplication and maintain parity.

internal/server/handlers.go (1)
1366-1376: 🧹 Nitpick | 🔵 Trivial

`isNativeBatchResultsPending` hardcodes Anthropic-specific behavior.

This function only returns `true` for Anthropic providers. Consider adding a comment explaining why, or making it extensible for other providers that may have similar "results not ready" patterns.

```go
// isNativeBatchResultsPending checks if the error indicates batch results aren't ready yet.
// Currently Anthropic-specific: their results endpoint returns 404 until processing completes.
// OpenAI returns the output_file_id in the batch object instead.
// Add other provider checks here as needed.
func isNativeBatchResultsPending(err error) bool {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/server/handlers.go` around lines 1366 - 1376, The isNativeBatchResultsPending function currently hardcodes Anthropic-specific logic; update it to either (a) add a clear explanatory comment above isNativeBatchResultsPending stating that Anthropic's results endpoint returns 404 until ready and that other providers behave differently (e.g., OpenAI returns output_file_id), or (b) make the function extensible by mapping provider names to their "results-pending" semantics (e.g., introduce a providerPendingCheck map and call providerPendingCheck[gatewayErr.Provider]), while keeping the existing checks on gatewayErr.HTTPStatusCode() and errors.As; reference isNativeBatchResultsPending, gatewayErr, Provider, and HTTPStatusCode to locate the code to change.

internal/providers/gemini/gemini.go (1)
269-292: ⚠️ Potential issue | 🟡 Minor

Normalize `ProviderBatchID` in `ListBatches` items as well.

`CreateBatch`, `GetBatch`, and `CancelBatch` backfill `ProviderBatchID` when missing, but `ListBatches` currently does not. This can leave listed entries without a usable provider batch identifier for follow-up operations.

🛠️ Suggested fix

```diff
 	var resp core.BatchListResponse
 	err := p.client.Do(ctx, llmclient.Request{
 		Method:   http.MethodGet,
 		Endpoint: endpoint,
 	}, &resp)
 	if err != nil {
 		return nil, err
 	}
+	for i := range resp.Data {
+		if resp.Data[i].ProviderBatchID == "" {
+			resp.Data[i].ProviderBatchID = resp.Data[i].ID
+		}
+	}
 	return &resp, nil
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/providers/gemini/gemini.go` around lines 269 - 292, ListBatches currently returns items without backfilled ProviderBatchID like CreateBatch/GetBatch/CancelBatch do; after the Do call in Provider.ListBatches, iterate resp.Items and for any item with an empty ProviderBatchID set it using the same normalization used elsewhere (e.g. construct the provider-scoped ID from the Gemini native id the other methods use — reference the Provider.ListBatches method, resp.Items slice and the ProviderBatchID field and apply the same formatting/helper function used by CreateBatch/GetBatch/CancelBatch so listed entries are usable for follow-ups).

internal/providers/xai/xai.go (1)
189-212: ⚠️ Potential issue | 🟡 Minor

Normalize `ProviderBatchID` in `ListBatches` responses for consistency.

`CreateBatch`, `GetBatch`, and `CancelBatch` all backfill `ProviderBatchID` from `ID` when empty, but `ListBatches` returns raw items without this normalization. This creates inconsistent API responses.

🔧 Proposed fix

```diff
 	var resp core.BatchListResponse
 	err := p.client.Do(ctx, llmclient.Request{
 		Method:   http.MethodGet,
 		Endpoint: endpoint,
 	}, &resp)
 	if err != nil {
 		return nil, err
 	}
+	for i := range resp.Data {
+		if resp.Data[i].ProviderBatchID == "" {
+			resp.Data[i].ProviderBatchID = resp.Data[i].ID
+		}
+	}
 	return &resp, nil
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/providers/xai/xai.go` around lines 189 - 212, ListBatches currently returns core.BatchListResponse items without normalizing ProviderBatchID, causing inconsistency with CreateBatch/GetBatch/CancelBatch which backfill ProviderBatchID from ID; update Provider.ListBatches to iterate resp.Items (or the appropriate slice in core.BatchListResponse) after the HTTP Do call and for each batch, if batch.ProviderBatchID is empty set it to batch.ID so all batch-returning methods consistently backfill ProviderBatchID.

internal/providers/router.go (1)
207-224: ⚠️ Potential issue | 🟡 Minor

`ListBatches` should tag `Provider` on response items for consistency.

`CreateBatch`, `GetBatch`, and `CancelBatch` all set `resp.Provider = providerType`, but `ListBatches` returns the provider's response directly. `ListFiles` (lines 307-314) correctly iterates and sets `Provider` on each item—apply the same pattern here.

🔧 Proposed fix

	bp, ok := provider.(core.NativeBatchProvider)
	if !ok {
		return nil, core.NewInvalidRequestError(fmt.Sprintf("%s does not support native batch processing", providerType), nil)
	}
-	return bp.ListBatches(ctx, limit, after)
+	resp, err := bp.ListBatches(ctx, limit, after)
+	if err != nil {
+		return nil, err
+	}
+	if resp != nil {
+		for i := range resp.Data {
+			resp.Data[i].Provider = providerType
+		}
+	}
+	return resp, nil
}
Verify each finding against the current code and only fix it if needed. In `@internal/providers/router.go` around lines 207 - 224, ListBatches currently returns the provider's BatchListResponse directly without tagging each returned batch with the provider type; call bp.ListBatches(ctx, limit, after), then iterate resp.Items and set item.Provider = providerType for each item (same pattern used in ListFiles) before returning; locate this logic in Router.ListBatches and use providerByType/core.NativeBatchProvider and resp.Items to apply the fix.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/providers/batch_results_file_adapter_test.go`:
- Around line 14-49: Add a regression test that verifies
FetchBatchResultsFromOutputFile fails when an output-file contains a malformed
JSONL line: create a new test (or subtest) alongside
TestFetchBatchResultsFromOutputFile that spins up an httptest server returning a
valid /batches/{id} payload but an output file (/files/{id}/content) whose
second (or any) line is intentionally invalid JSON, call
FetchBatchResultsFromOutputFile with the same client and batch id, and assert
that the call returns a non-nil error (and optionally that the error message
indicates a JSON parse/malformed line) instead of silently accepting corrupted
lines. Ensure the test references the existing FetchBatchResultsFromOutputFile
function and mirrors the server handling pattern used in
TestFetchBatchResultsFromOutputFile.
In `@internal/providers/batch_results_file_adapter.go`:
- Around line 40-46: The DoRaw calls (client.DoRaw in the batch fetches)
currently return raw transport errors directly; replace the direct returns of
err with parsed errors using core.ParseProviderError(err) so upstream gateways
get normalized GatewayError types—i.e., after each client.DoRaw(...) check (the
call that assigns batchRaw and the later similar call at the other location),
call parsedErr := core.ParseProviderError(err) and return nil, parsedErr (or
return parsedErr where appropriate) instead of returning err unmodified.
- Around line 113-116: The code currently swallows malformed JSONL by doing `if
err := json.Unmarshal(line, &row); err != nil { continue }`; instead of
continuing, surface the failure: return or accumulate an error that includes the
offending line (or its index) and the json.Unmarshal error so upstream callers
are alerted to corrupt batch output. Update the handling around
`json.Unmarshal(line, &row)` (which populates `openAICompatibleBatchLine row`)
to stop silently dropping rows and propagate a clear error/diagnostic rather
than `continue`.
In `@internal/providers/file_adapter_openai_compat.go`:
- Around line 61-63: Several file-adapter methods are returning raw client
errors (e.g., places that currently do "return nil, err" after decoding into
fileObj); update each of these to normalize HTTP failures via
core.ParseProviderError(err) so they return the GatewayError contract.
Concretely, in internal/providers/file_adapter_openai_compat.go replace
instances like "return nil, err" (the ones after decoding into fileObj at the
shown diff and the similar sites around the other ranges) with "return nil,
core.ParseProviderError(err)" and ensure core is imported if not already; apply
this same change for the other occurrences you noted (around lines 96-98,
118-120, 140-142, 162-164).
In `@internal/providers/groq/groq.go`:
- Around line 173-196: ListBatches returns core.BatchListResponse items without
normalizing ProviderBatchID like CreateBatch/GetBatch/CancelBatch do; update
Provider.ListBatches to iterate over resp.Items after the Do call and for each
item set item.ProviderBatchID = item.ID when ProviderBatchID is empty (preserve
existing value if present) so the response matches the other batch methods'
normalization for ProviderBatchID in core.BatchListResponse items.
In `@internal/providers/router_test.go`:
- Around line 384-428: Replace the magic number comparison in
TestRouterBatchProviderTypeValidation with the http package constant: change the
check that compares gwErr.HTTPStatusCode() != 400 to use http.StatusBadRequest
instead so status code style is consistent with other tests (refer to
TestRouterBatchProviderTypeValidation and the gwErr.HTTPStatusCode() check).
---
Outside diff comments:
In `@internal/app/app.go`:
- Around line 243-248: Update the exported Shutdown doc comment for the Shutdown
function to reflect the current lifecycle order and include the new batch-close
step: list each shutdown phase in order (HTTP server stop accepting new
requests, batch close of pending operations/resources, background refresh
goroutine and cache teardown, audit logging flush/close, and any other cleanup),
and ensure the wording matches the function's behavior and public documentation
requirements; also make the same comment update where the old text is repeated
(the secondary comment block that mirrors Shutdown's doc) so both doc comments
stay consistent with the new lifecycle.
In `@internal/auditlog/stream_wrapper.go`:
- Around line 426-435: The current loop over modelPaths uses
strings.HasPrefix(path, p) which wrongly matches things like "/v1/filesystem";
locate the loop in internal/auditlog/stream_wrapper.go (the modelPaths variable
and the strings.HasPrefix check) and replace the naive prefix check with an
endpoint-boundary check: return true only when the request path equals the
endpoint (path == p) or the endpoint is followed by a valid boundary character
(e.g., path starts with p + "/" or p + "?" or p + "#" or other separator you
consider valid), so that "/v1/files" matches but "/v1/filesystem" does not.
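The boundary check described above can be sketched as a small helper; the name `matchesEndpoint` is hypothetical, and the set of boundary characters (`/`, `?`, `#`) follows the examples in the comment.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesEndpoint reports whether path targets endpoint exactly or one of its
// subresources, so "/v1/files" matches but "/v1/filesystem" does not.
func matchesEndpoint(path, endpoint string) bool {
	if path == endpoint {
		return true
	}
	if strings.HasPrefix(path, endpoint) {
		// Only a valid boundary character after the endpoint counts as a match.
		switch path[len(endpoint)] {
		case '/', '?', '#':
			return true
		}
	}
	return false
}

func main() {
	for _, p := range []string{"/v1/files", "/v1/files/file_1", "/v1/filesystem"} {
		fmt.Println(p, matchesEndpoint(p, "/v1/files"))
	}
}
```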
---
Duplicate comments:
In `@internal/batch/store.go`:
- Around line 51-59: serializeBatch currently allows persisting batches with an
empty ID; add validation at the start of serializeBatch (which accepts
*core.BatchResponse) to return an error if batch.ID is empty or only whitespace
(e.g., fmt.Errorf("batch ID is empty")), so stores never serialize records
lacking a valid ID; update any callers/tests if they rely on previous behavior.
In `@internal/providers/gemini/gemini.go`:
- Around line 269-292: ListBatches currently returns items without backfilled
ProviderBatchID like CreateBatch/GetBatch/CancelBatch do; after the Do call in
Provider.ListBatches, iterate resp.Items and for any item with an empty
ProviderBatchID set it using the same normalization used elsewhere (e.g.
construct the provider-scoped ID from the Gemini native id the other methods use
— reference the Provider.ListBatches method, resp.Items slice and the
ProviderBatchID field and apply the same formatting/helper function used by
CreateBatch/GetBatch/CancelBatch so listed entries are usable for follow-ups).
In `@internal/providers/openai/openai.go`:
- Around line 249-272: ListBatches currently returns core.BatchListResponse
items without applying the same ProviderBatchID normalization used in
CreateBatch, GetBatch, and CancelBatch, causing inconsistent batch identities;
update Provider.ListBatches to iterate the returned batch items (in the response
from p.client.Do) and set each item's ProviderBatchID to the fallback value
(e.g., use the existing normalization logic used by
CreateBatch/GetBatch/CancelBatch or a helper function) when ProviderBatchID is
missing so list results match other endpoints. Ensure you reference and reuse
the same normalization behavior used by CreateBatch/GetBatch/CancelBatch to
avoid duplication and maintain parity.
In `@internal/providers/router.go`:
- Around line 207-224: ListBatches currently returns the provider's
BatchListResponse directly without tagging each returned batch with the provider
type; call bp.ListBatches(ctx, limit, after), then iterate resp.Items and set
item.Provider = providerType for each item (same pattern used in ListFiles)
before returning; locate this logic in Router.ListBatches and use
providerByType/core.NativeBatchProvider and resp.Items to apply the fix.
In `@internal/providers/xai/xai.go`:
- Around line 189-212: ListBatches currently returns core.BatchListResponse
items without normalizing ProviderBatchID, causing inconsistency with
CreateBatch/GetBatch/CancelBatch which backfill ProviderBatchID from ID; update
Provider.ListBatches to iterate resp.Items (or the appropriate slice in
core.BatchListResponse) after the HTTP Do call and for each batch, if
batch.ProviderBatchID is empty set it to batch.ID so all batch-returning methods
consistently backfill ProviderBatchID.
In `@internal/server/handlers.go`:
- Around line 1366-1376: The isNativeBatchResultsPending function currently
hardcodes Anthropic-specific logic; update it to either (a) add a clear
explanatory comment above isNativeBatchResultsPending stating that Anthropic's
results endpoint returns 404 until ready and that other providers behave
differently (e.g., OpenAI returns output_file_id), or (b) make the function
extensible by mapping provider names to their "results-pending" semantics (e.g.,
introduce a providerPendingCheck map and call
providerPendingCheck[gatewayErr.Provider]), while keeping the existing checks on
gatewayErr.HTTPStatusCode() and errors.As; reference
isNativeBatchResultsPending, gatewayErr, Provider, and HTTPStatusCode to locate
the code to change.
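Option (b) above can be sketched as a provider-to-semantics map. The Anthropic 404-until-ready rule comes from the review comment; treating other providers' 404s as real errors (the OpenAI entry) is an illustrative assumption, and the real function would also keep the existing gatewayErr/errors.As checks.

```go
package main

import (
	"fmt"
	"net/http"
)

// providerPendingCheck maps each provider to its "results still pending"
// semantics instead of hardcoding one provider's behaviour.
var providerPendingCheck = map[string]func(status int) bool{
	// Anthropic's results endpoint returns 404 until the batch is ready.
	"anthropic": func(status int) bool { return status == http.StatusNotFound },
	// OpenAI signals readiness via output_file_id, so a 404 is never "pending".
	"openai": func(status int) bool { return false },
}

func isNativeBatchResultsPending(provider string, status int) bool {
	check, ok := providerPendingCheck[provider]
	return ok && check(status)
}

func main() {
	fmt.Println(isNativeBatchResultsPending("anthropic", http.StatusNotFound))
	fmt.Println(isNativeBatchResultsPending("openai", http.StatusNotFound))
}
```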
In `@internal/server/model_validation.go`:
- Around line 28-33: The path bypass currently uses broad strings.HasPrefix
checks on c.Request().URL.Path which lets paths like "/v1/batchesXYZ" bypass
validation; update the conditional in the model validation middleware to only
allow the exact root or its subpaths by checking equality OR a slash-prefixed
prefix (e.g., path == "/v1/batches" || strings.HasPrefix(path, "/v1/batches/")
and similarly for "/v1/files"), leaving the rest of the block (requestID
extraction, context injection, c.SetRequest, return next) unchanged and
targeting the same c.Request().URL.Path expressions.
ℹ️ Review info
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (30)
.env.template
README.md
config/config.example.yaml
config/config.go
config/config_helpers_test.go
config/config_test.go
internal/app/app.go
internal/auditlog/auditlog.go
internal/auditlog/auditlog_test.go
internal/auditlog/stream_wrapper.go
internal/batch/store.go
internal/core/files.go
internal/core/interfaces.go
internal/guardrails/provider.go
internal/guardrails/provider_test.go
internal/llmclient/client.go
internal/providers/batch_results_file_adapter.go
internal/providers/batch_results_file_adapter_test.go
internal/providers/file_adapter_openai_compat.go
internal/providers/gemini/gemini.go
internal/providers/groq/groq.go
internal/providers/openai/openai.go
internal/providers/router.go
internal/providers/router_test.go
internal/providers/xai/xai.go
internal/server/handlers.go
internal/server/handlers_test.go
internal/server/http.go
internal/server/model_validation.go
internal/server/model_validation_test.go
func TestFetchBatchResultsFromOutputFile(t *testing.T) {
	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		switch r.URL.Path {
		case "/batches/batch_1":
			w.WriteHeader(http.StatusOK)
			_, _ = w.Write([]byte(`{"id":"batch_1","status":"completed","output_file_id":"file_1","endpoint":"/v1/chat/completions"}`))
		case "/files/file_1/content":
			w.WriteHeader(http.StatusOK)
			_, _ = w.Write([]byte(
				`{"custom_id":"ok-1","response":{"status_code":200,"url":"/v1/chat/completions","body":{"id":"resp-1","model":"gpt-4o-mini","usage":{"prompt_tokens":1,"completion_tokens":1,"total_tokens":2}}}}` + "\n" +
					`{"custom_id":"err-1","error":{"type":"invalid_request_error","message":"bad request"}}`,
			))
		default:
			http.NotFound(w, r)
		}
	}))
	defer server.Close()

	client := llmclient.NewWithHTTPClient(server.Client(), llmclient.DefaultConfig("openai", server.URL), nil)
	resp, err := FetchBatchResultsFromOutputFile(context.Background(), client, "openai", "batch_1")
	if err != nil {
		t.Fatalf("unexpected error: %v", err)
	}
	if resp.BatchID != "batch_1" {
		t.Fatalf("BatchID = %q, want %q", resp.BatchID, "batch_1")
	}
	if len(resp.Data) != 2 {
		t.Fatalf("len(Data) = %d, want 2", len(resp.Data))
	}
	if resp.Data[0].StatusCode != 200 || resp.Data[0].Model != "gpt-4o-mini" {
		t.Fatalf("unexpected first row: %+v", resp.Data[0])
	}
	if resp.Data[1].Error == nil || resp.Data[1].Error.Type != "invalid_request_error" {
		t.Fatalf("unexpected error row: %+v", resp.Data[1])
	}
}
🧹 Nitpick | 🔵 Trivial
Add a malformed JSONL regression test for output-file parsing.
Please add a case where one output-file line is invalid JSON and assert the function returns an error, so corruption isn’t silently accepted.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/providers/batch_results_file_adapter_test.go` around lines 14 - 49,
Add a regression test that verifies FetchBatchResultsFromOutputFile fails when
an output-file contains a malformed JSONL line: create a new test (or subtest)
alongside TestFetchBatchResultsFromOutputFile that spins up an httptest server
returning a valid /batches/{id} payload but an output file (/files/{id}/content)
whose second (or any) line is intentionally invalid JSON, call
FetchBatchResultsFromOutputFile with the same client and batch id, and assert
that the call returns a non-nil error (and optionally that the error message
indicates a JSON parse/malformed line) instead of silently accepting corrupted
lines. Ensure the test references the existing FetchBatchResultsFromOutputFile
function and mirrors the server handling pattern used in
TestFetchBatchResultsFromOutputFile.
	batchRaw, err := client.DoRaw(ctx, llmclient.Request{
		Method:   http.MethodGet,
		Endpoint: "/batches/" + url.PathEscape(batchID),
	})
	if err != nil {
		return nil, err
	}
Normalize upstream DoRaw failures with core.ParseProviderError() before returning.
Returning raw transport/provider errors here risks inconsistent GatewayError typing and status propagation across provider surfaces.
As per coding guidelines internal/providers/**/*.go: “Use core.ParseProviderError() to convert upstream HTTP errors to the correct GatewayError type in provider implementations”.
Also applies to: 60-66
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/providers/batch_results_file_adapter.go` around lines 40 - 46, The
DoRaw calls (client.DoRaw in the batch fetches) currently return raw transport
errors directly; replace the direct returns of err with parsed errors using
core.ParseProviderError(err) so upstream gateways get normalized GatewayError
types—i.e., after each client.DoRaw(...) check (the call that assigns batchRaw
and the later similar call at the other location), call parsedErr :=
core.ParseProviderError(err) and return nil, parsedErr (or return parsedErr
where appropriate) instead of returning err unmodified.
	var row openAICompatibleBatchLine
	if err := json.Unmarshal(line, &row); err != nil {
		continue
	}
Do not silently drop malformed JSONL rows in batch output.
Skipping invalid lines hides upstream corruption and returns incomplete results without any error signal.
♻️ Suggested fix
func parseBatchOutputFile(raw []byte, fallbackURL, providerName string) ([]core.BatchResultItem, error) {
	scanner := bufio.NewScanner(bytes.NewReader(raw))
	scanner.Buffer(make([]byte, 0, 64*1024), 4*1024*1024)
	results := make([]core.BatchResultItem, 0)
	index := 0
+	lineNo := 0
	for scanner.Scan() {
+		lineNo++
		line := bytes.TrimSpace(scanner.Bytes())
		if len(line) == 0 {
			continue
		}
		var row openAICompatibleBatchLine
		if err := json.Unmarshal(line, &row); err != nil {
-			continue
+			return nil, fmt.Errorf("invalid JSONL in batch output at line %d: %w", lineNo, err)
		}

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
func parseBatchOutputFile(raw []byte, fallbackURL, providerName string) ([]core.BatchResultItem, error) {
	scanner := bufio.NewScanner(bytes.NewReader(raw))
	scanner.Buffer(make([]byte, 0, 64*1024), 4*1024*1024)
	results := make([]core.BatchResultItem, 0)
	index := 0
	lineNo := 0
	for scanner.Scan() {
		lineNo++
		line := bytes.TrimSpace(scanner.Bytes())
		if len(line) == 0 {
			continue
		}
		var row openAICompatibleBatchLine
		if err := json.Unmarshal(line, &row); err != nil {
			return nil, fmt.Errorf("invalid JSONL in batch output at line %d: %w", lineNo, err)
		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/providers/batch_results_file_adapter.go` around lines 113 - 116, The
code currently swallows malformed JSONL by doing `if err := json.Unmarshal(line,
&row); err != nil { continue }`; instead of continuing, surface the failure:
return or accumulate an error that includes the offending line (or its index)
and the json.Unmarshal error so upstream callers are alerted to corrupt batch
output. Update the handling around `json.Unmarshal(line, &row)` (which populates
`openAICompatibleBatchLine row`) to stop silently dropping rows and propagate a
clear error/diagnostic rather than `continue`.
	}, &fileObj); err != nil {
		return nil, err
	}
Normalize provider HTTP failures consistently across all file adapter methods.
Each method currently returns raw client errors directly. Please route these through core.ParseProviderError() so file endpoints preserve the same GatewayError contract as other provider paths.
As per coding guidelines internal/providers/**/*.go: “Use core.ParseProviderError() to convert upstream HTTP errors to the correct GatewayError type in provider implementations”.
Also applies to: 96-98, 118-120, 140-142, 162-164
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/providers/file_adapter_openai_compat.go` around lines 61 - 63,
Several file-adapter methods are returning raw client errors (e.g., places that
currently do "return nil, err" after decoding into fileObj); update each of
these to normalize HTTP failures via core.ParseProviderError(err) so they return
the GatewayError contract. Concretely, in
internal/providers/file_adapter_openai_compat.go replace instances like "return
nil, err" (the ones after decoding into fileObj at the shown diff and the
similar sites around the other ranges) with "return nil,
core.ParseProviderError(err)" and ensure core is imported if not already; apply
this same change for the other occurrences you noted (around lines 96-98,
118-120, 140-142, 162-164).
Actionable comments posted: 10
♻️ Duplicate comments (1)
internal/providers/router_test.go (1)
427-428: 🧹 Nitpick | 🔵 Trivial

Use consistent status code style.

Line 427 uses magic number `400` while line 470 uses `http.StatusBadRequest`. For consistency, prefer using the named constant throughout.

♻️ Suggested fix

-	if gwErr.HTTPStatusCode() != 400 {
-		t.Fatalf("expected status 400, got %d", gwErr.HTTPStatusCode())
+	if gwErr.HTTPStatusCode() != http.StatusBadRequest {
+		t.Fatalf("expected status %d, got %d", http.StatusBadRequest, gwErr.HTTPStatusCode())
Verify each finding against the current code and only fix it if needed. In `@internal/providers/router_test.go` around lines 427 - 428, Replace the magic number 400 in the assertion that checks gwErr.HTTPStatusCode() with the named constant http.StatusBadRequest for consistent status code style; update the test in router_test.go where gwErr.HTTPStatusCode() is compared (the failing t.Fatalf line) to use http.StatusBadRequest and add or ensure the net/http package is imported in that test file if not already.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/app/app.go`:
- Around line 282-288: Update the misleading comment above the batch close block
so it accurately describes the operation being performed: replace "3. Close
usage tracking (flushes pending entries)" with a comment that indicates the code
is closing the batch store (e.g., "3. Close batch store (flushes pending
entries)"); locate the block that references a.batch and calls a.batch.Close()
and change only the comment text to reflect closing the batch store while
leaving the error handling (slog.Error, errs = append(...)) and the call to
a.batch.Close() unchanged.
In `@internal/batch/factory.go`:
- Around line 64-78: The function NewWithSharedStorage currently accepts an
unused cfg parameter; remove cfg from the function signature and its nil check,
leaving NewWithSharedStorage(ctx context.Context, shared storage.Storage)
(*Result, error), keep the createStore call and returned Result unchanged; then
update all call sites that pass a cfg (e.g., the invocations that currently call
NewWithSharedStorage(ctx, cfg, shared) to call NewWithSharedStorage(ctx,
shared)) and remove any now-unused config imports/variables in those callers,
and run tests/build to ensure no remaining references.
In `@internal/batch/store_postgresql.go`:
- Around line 101-109: In the List pagination branch where you fetch the cursor
(the QueryRow call that scans into cursorCreatedAt when after != ""), stop
returning ErrNotFound on pgx.ErrNoRows and instead treat a missing cursor as
"start from beginning": catch errors.Is(err, pgx.ErrNoRows) and set
cursorCreatedAt to 0 (or appropriate zero value) and continue; only return an
error for non-ErrNoRows cases. Update the block around
s.pool.QueryRow(...).Scan(&cursorCreatedAt) and the ErrNotFound path so List
continues normally when the cursor row was deleted.
In `@internal/providers/anthropic/anthropic.go`:
- Around line 853-863: The handler accepts native "/v1/responses" requests (see
respReq and convertResponsesRequestToAnthropic) but later hardcodes results to
"/v1/chat/completions" and uses a chat-completions payload shape; change the
result-emission logic to be endpoint-aware: detect when the batch item
originated from "/v1/responses" and set the emitted result URL and success
payload mapping to "/v1/responses" with the Responses-shaped body rather than
the chat-completions shape. Update the code that currently hardcodes
"/v1/chat/completions" (around the emission block referenced in lines
~1015-1020) to branch on the original request path and produce the correct
Responses-shaped payload for respReq.
- Around line 814-817: buildAnthropicBatchCreateRequest currently dereferences
req without checking for nil which can panic; add a nil guard at the start of
buildAnthropicBatchCreateRequest (before accessing req.Requests) that returns a
typed error (e.g., core.NewInvalidRequestError("request is required for
anthropic batch processing", nil)) when req == nil, preserving the existing
validation for empty req.Requests afterward so callers receive a graceful
gateway error instead of a runtime panic.
- Around line 1009-1012: The log currently emits the raw batch payload via
string(line) which can leak sensitive prompts/completions; change the warning
inside the json.Unmarshal error path (around anthropicBatchResultLine handling)
to avoid logging the payload and instead log only metadata such as the error,
the byte length (len(line)) and the batch item index (introduce or use an
existing index variable like i or idx if present) and any batch/total size
fields; remove string(line) from the slog.Warn call and replace it with those
non-sensitive metadata fields.
In `@internal/server/handlers.go`:
- Around line 267-281: The function applyAfterCursor currently hides
invalid/stale cursors by returning the full items slice when the given after ID
isn't found; change its signature to return ([]core.FileObject, error) (or add a
(bool, error) indicator) so it returns an error when after is non-empty and no
matching items[i].ID == after is found, and update all callers to handle the
error (e.g., propagate a client-facing invalid cursor error or log a warning and
return a 4xx response); keep the existing behavior of returning items[i+1:] when
found and returning items unchanged only when after is empty.
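The signature change described above can be sketched like this; `FileObject` is a minimal stand-in for core.FileObject.

```go
package main

import "fmt"

// FileObject is a minimal stand-in for core.FileObject.
type FileObject struct {
	ID string
}

// applyAfterCursor returns the items after the cursor, or an error when a
// non-empty cursor matches nothing, instead of silently returning everything.
func applyAfterCursor(items []FileObject, after string) ([]FileObject, error) {
	if after == "" {
		return items, nil
	}
	for i := range items {
		if items[i].ID == after {
			return items[i+1:], nil
		}
	}
	return nil, fmt.Errorf("invalid 'after' cursor: %q", after)
}

func main() {
	items := []FileObject{{ID: "a"}, {ID: "b"}, {ID: "c"}}
	rest, err := applyAfterCursor(items, "b")
	fmt.Println(rest, err)
	_, err = applyAfterCursor(items, "missing")
	fmt.Println(err)
}
```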
- Around line 1251-1298: intFromAny currently casts large unsigned integers
(uint64) and other wide numeric types directly to int, which can overflow on
32-bit systems; update intFromAny to check the numeric value range before
converting (specifically for uint64 and uint32/uint types) and return false when
the value is outside the target int range, and for float->int and json.Number
branches also validate the bounds before casting so any out-of-range numeric
input is rejected rather than wrapped/truncated.
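The bounds checking described above can be sketched as follows; only a few of the real function's cases are shown, and the float rule (reject non-integral or out-of-range values) follows the prompt's intent.

```go
package main

import (
	"fmt"
	"math"
)

// intFromAny converts wide numeric values to int only when they fit,
// so nothing silently overflows on 32-bit builds.
func intFromAny(v any) (int, bool) {
	switch n := v.(type) {
	case int:
		return n, true
	case int64:
		if n < math.MinInt || n > math.MaxInt {
			return 0, false
		}
		return int(n), true
	case uint64:
		if n > math.MaxInt {
			return 0, false
		}
		return int(n), true
	case float64:
		// Reject fractional values and anything outside the int range.
		if n != math.Trunc(n) || n < math.MinInt || n >= math.MaxInt {
			return 0, false
		}
		return int(n), true
	}
	return 0, false
}

func main() {
	fmt.Println(intFromAny(uint64(42)))
	fmt.Println(intFromAny(uint64(math.MaxUint64)))
	fmt.Println(intFromAny(3.5))
}
```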
In `@internal/usage/extractor_test.go`:
- Around line 485-583: Add a regression test that verifies the code treats
subpaths under /v1/batches/ as batch pricing: create a test similar to
TestExtractFromChatResponse_WithBatchPricingEndpoint (or
TestExtractFromEmbeddingResponse_WithBatchPricingEndpoint) but call
ExtractFromChatResponse (or ExtractFromEmbeddingResponse) with the endpoint
string "/v1/batches/batch_123" and the same pricing/usage setup, then assert
InputCost/OutputCost/TotalCost match the batch-pricing expectations; this locks
in the HasPrefix("/v1/batches/") behavior.
In `@internal/usage/extractor.go`:
- Around line 228-244: Update the Go doc comments for all exported ExtractFrom*
functions to state that pricing is endpoint-aware: when the endpoint is
"/v1/batches" or starts with "/v1/batches/" the batch-specific fields
BatchInputPerMtok/BatchOutputPerMtok may override InputPerMtok/OutputPerMtok
(the behavior implemented in pricingForEndpoint). Locate the exported
ExtractFrom* declarations and add a short sentence explaining that batch
endpoints use batch rates for input/output mtok values and that
pricingForEndpoint applies these overrides.
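The endpoint-aware override described above can be sketched like this. The `pricing` struct below is an illustrative stand-in (field names follow the comment, not the real config type), and falling back to base rates when a batch rate is zero is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// pricing is an illustrative stand-in mirroring the per-mtok fields
// named in the review comment.
type pricing struct {
	InputPerMtok       float64
	OutputPerMtok      float64
	BatchInputPerMtok  float64
	BatchOutputPerMtok float64
}

// pricingForEndpoint applies batch rates for "/v1/batches" and its subpaths
// when they are set; every other endpoint uses the base rates.
func pricingForEndpoint(p pricing, endpoint string) (in, out float64) {
	in, out = p.InputPerMtok, p.OutputPerMtok
	if endpoint == "/v1/batches" || strings.HasPrefix(endpoint, "/v1/batches/") {
		if p.BatchInputPerMtok > 0 {
			in = p.BatchInputPerMtok
		}
		if p.BatchOutputPerMtok > 0 {
			out = p.BatchOutputPerMtok
		}
	}
	return in, out
}

func main() {
	p := pricing{InputPerMtok: 10, OutputPerMtok: 30, BatchInputPerMtok: 5, BatchOutputPerMtok: 15}
	fmt.Println(pricingForEndpoint(p, "/v1/batches/batch_123"))
	fmt.Println(pricingForEndpoint(p, "/v1/chat/completions"))
}
```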
---
Duplicate comments:
In `@internal/providers/router_test.go`:
- Around line 427-428: Replace the magic number 400 in the assertion that checks
gwErr.HTTPStatusCode() with the named constant http.StatusBadRequest for
consistent status code style; update the test in router_test.go where
gwErr.HTTPStatusCode() is compared (the failing t.Fatalf line) to use
http.StatusBadRequest and add or ensure the net/http package is imported in that
test file if not already.
ℹ️ Review info
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (21)
AGENTS.md
internal/app/app.go
internal/auditlog/auditlog_test.go
internal/auditlog/stream_wrapper.go
internal/batch/factory.go
internal/batch/store.go
internal/batch/store_postgresql.go
internal/batch/store_test.go
internal/providers/anthropic/anthropic.go
internal/providers/gemini/gemini.go
internal/providers/groq/groq.go
internal/providers/openai/openai.go
internal/providers/router.go
internal/providers/router_test.go
internal/providers/xai/xai.go
internal/server/handlers.go
internal/server/handlers_test.go
internal/server/model_validation.go
internal/server/model_validation_test.go
internal/usage/extractor.go
internal/usage/extractor_test.go
@coderabbitai review

✅ Actions performed

Review triggered.
Summary by CodeRabbit
New Features
Documentation
Tests