feat(gom-48): add simple response cache #123
Conversation

Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
📝 Walkthrough

Refactors the cache into nested Cache.Model and Cache.Response configs; adds validation; extracts a modelcache package and a generic Store interface; implements Redis and in-memory stores; adds response-cache middleware and tests; wires the middleware into the app/server; updates docs, env, and Helm manifests.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant EchoServer as "Echo Server"
    participant Middleware as "ResponseCacheMiddleware"
    participant SimpleCache as "simpleCacheMiddleware"
    participant Store as "RedisStore/MapStore"
    participant Handler as "Downstream Handler"
    Client->>EchoServer: POST /v1/chat/completions
    EchoServer->>Middleware: incoming request
    Middleware->>Middleware: read body, compute key = SHA256(path+body)
    Middleware->>SimpleCache: get(key)
    SimpleCache->>Store: Get(ctx, key)
    alt Cache HIT
        Store-->>SimpleCache: cached bytes
        SimpleCache-->>Middleware: return cached response
        Middleware-->>EchoServer: respond (X-Cache: HIT)
        EchoServer-->>Client: 200 cached response
    else Cache MISS
        Store-->>SimpleCache: nil
        SimpleCache-->>Middleware: miss
        Middleware->>Handler: call downstream handler
        Handler-->>Middleware: 200 + body
        Middleware->>SimpleCache: async store(key, body, ttl)
        SimpleCache->>Store: Set(ctx, key, body, ttl)
        Store-->>SimpleCache: success
        Middleware-->>EchoServer: forward response
        EchoServer-->>Client: 200 response
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
…cing-costs-parts-of-the
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/providers/registry.go (1)
Lines 242-275: ⚠️ Potential issue | 🟠 Major

Preserve registration order when loading duplicate model IDs from cache.
Line 253 ranges over `modelCache.Providers`, so when two configured providers expose the same `model.ID`, the provider that wins `newModels[modelID]` depends on Go's randomized map iteration. That breaks this file's own "first provider wins" rule until a network refresh repopulates the registry.

🧭 Proposed fix
```diff
-	r.mu.RLock()
-	nameToProvider := make(map[string]core.Provider, len(r.providerNames))
-	for provider, pName := range r.providerNames {
-		nameToProvider[pName] = provider
-	}
-	r.mu.RUnlock()
+	r.mu.RLock()
+	orderedProviderNames := make([]string, 0, len(r.providers))
+	nameToProvider := make(map[string]core.Provider, len(r.providerNames))
+	for _, provider := range r.providers {
+		pName := r.providerNames[provider]
+		orderedProviderNames = append(orderedProviderNames, pName)
+		nameToProvider[pName] = provider
+	}
+	r.mu.RUnlock()
@@
-	for providerName, cachedProv := range modelCache.Providers {
+	for _, providerName := range orderedProviderNames {
+		cachedProv, ok := modelCache.Providers[providerName]
+		if !ok {
+			continue
+		}
 		provider, ok := nameToProvider[providerName]
 		if !ok {
-			// Provider not configured, skip all its models
 			continue
 		}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/providers/registry.go` around lines 242 - 275, The code currently ranges over modelCache.Providers which yields non-deterministic iteration order and breaks the "first provider wins" rule; change the loop to iterate providers in the registry's configured/registration order (not the cache map order) when populating newModels/newModelsByProvider: build a deterministic ordered list of provider names from the registry's registration info (use the existing registration-order slice if one exists, or record provider names in registration order when populating r.providerNames), then loop over that ordered slice instead of ranging modelCache.Providers so that duplicate model IDs consistently honor the first-registered provider; update the loop that references modelCache.Providers, providerName, cachedProv, newModels, and newModelsByProvider accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@config/config.go`:
- Around line 381-387: The default config currently pre-populates
Cache.Model.Local in buildDefaultConfig(), which causes unmarshaled YAML that
only sets cache.model.redis to incorrectly have both backends set and fail
ValidateCacheConfig; remove the Local: &LocalCacheConfig{CacheDir: ".cache"}
assignment from the ModelCacheConfig defaults so buildDefaultConfig() leaves
Local nil by default, and instead apply the ".cache" fallback only during
runtime initialization when the Local backend is actually selected (e.g., in the
cache initialization path that consumes ModelCacheConfig), setting
Local.CacheDir if Local is non-nil and CacheDir is empty; ensure
ValidateCacheConfig no longer sees a preselected Local and update any tests that
expected Local to be nil.
- Around line 537-542: The applyEnvOverridesValue branch currently skips nil
pointer struct fields (check using field.Type.Kind() == reflect.Ptr &&
!fieldVal.IsNil()), preventing env-only population; update
applyEnvOverridesValue to detect when a nil pointer-to-struct has any descendant
fields with an env tag, allocate a new zero-value struct (set fieldVal to a new
pointer) before recursing into applyEnvOverridesValue(fieldVal.Elem()), and only
leave it nil if no env-tagged descendants exist; add a regression test that
exercises env-only Redis config (e.g., populate REDIS_URL/REDIS_TTL) to ensure
pointer allocation materializes Cache.Response.Simple.Redis when YAML is absent.
- Line 296: Add a Go doc comment above the exported function ValidateCacheConfig
that briefly describes what the function does, its input (the *CacheConfig
parameter), and the returned error behavior; ensure the comment follows Go doc
style (starts with "ValidateCacheConfig ...") and is updated to reflect any
specific validation rules or side effects implemented by ValidateCacheConfig.
In `@internal/app/app.go`:
- Around line 198-205: The failure path for
responsecache.NewResponseCacheMiddleware currently closes app.usage, app.audit,
and app.providers but forgets to close app.batch, leaving batch storage/workers
running; update the error cleanup to also call app.batch.Close() (include it in
errors.Join alongside app.usage.Close(), app.audit.Close(),
app.providers.Close()) so that when response cache initialization fails the
batch resources are properly closed before returning the error from the
initialization block in function where rcm is created.
In `@internal/cache/modelcache/redis.go`:
- Around line 24-55: The wrapper currently always closes the underlying
cache.Store even when callers inject a shared store; update the redisModelCache
struct to track ownership (e.g., an owned bool flag), set owned=true in
NewRedisModelCache and owned=false in NewRedisModelCacheWithStore, and modify
the redisModelCache.Close method to only call store.Close when owned is true;
reference redisModelCache, NewRedisModelCache, NewRedisModelCacheWithStore, and
Close to locate the change.
In `@internal/cache/redis.go`:
- Around line 31-40: In NewRedisStore, the redis client created by
redis.NewClient(opts) must be closed if startup Ping fails to avoid leaking the
client's connection pool; update the error path after client.Ping(ctx).Err() to
call client.Close() (or defer a close only after successful init) before
returning the fmt.Errorf("failed to connect to redis: %w", err) error so the
client is properly cleaned up; reference symbols: NewRedisStore,
redis.NewClient, client.Ping, client.Close.
In `@internal/providers/init.go`:
- Around line 160-165: The startup log currently prints the Redis connection
string directly (slog.Info call using m.Redis.URL), which can leak credentials;
update the code that prepares the log (the block using variables key :=
m.Redis.Key and calling slog.Info("using redis cache", "url", m.Redis.URL,
"key", key)) to redact sensitive parts of m.Redis.URL before logging—parse the
URL (or strip userinfo) and replace credentials with a placeholder (e.g.,
"REDACTED") or log only the host/port, then pass the redacted value to slog.Info
so secrets are not emitted.
In `@internal/responsecache/simple.go`:
- Around line 75-80: The current code launches a detached goroutine inside the
cache path (the anonymous func that calls m.store.Set with storeCtx) which drops
the error and is not tracked by shutdown; replace the fire-and-forget write with
a tracked background worker or inline blocking write so Close() can wait for
pending writes: add a small bounded worker queue (e.g., a buffered channel of
jobs) and a sync.WaitGroup that increments when enqueuing a job for m.store.Set
and decrements after the Set returns (include context timeout handling and
propagate/log errors instead of ignoring them), or perform the Set synchronously
in the request goroutine; ensure the Cache type has Close() that waits on the
WaitGroup and drains the worker channel before closing m.store so no write races
store.Close().
- Around line 102-106: The isStreamingRequest function currently byte-matches
for `"stream":true` which misses variants like `"stream" : true`, multiline JSON
and can false-positive on prompt text; change isStreamingRequest to parse the
request body as JSON into a small struct (e.g., type payload struct { Stream
*bool `json:"stream"` }) and return true only if Stream != nil && *Stream ==
true (still return false for path "/v1/embeddings"); on JSON unmarshal errors or
non-boolean/missing stream fields treat as non-streaming to avoid false
positives that break responseCapture and SSE handling.
In `@internal/server/http.go`:
- Around line 236-241: The shutdown sequence in Server.Shutdown is closing
s.responseCacheMiddleware before stopping s.echo, risking handlers hitting a
closed store and swallowing Close errors; change the order to call
s.echo.Shutdown(ctx) first, capture its error, then call
s.responseCacheMiddleware.Close(), capture that error, and return a combined
error (use errors.Join) so both errors are propagated; also add the errors
import.
---
Outside diff comments:
In `@internal/providers/registry.go`:
- Around line 242-275: The code currently ranges over modelCache.Providers which
yields non-deterministic iteration order and breaks the "first provider wins"
rule; change the loop to iterate providers in the registry's
configured/registration order (not the cache map order) when populating
newModels/newModelsByProvider: build a deterministic ordered list of provider
names from the registry's registration info (use the existing registration-order
slice if one exists, or record provider names in registration order when
populating r.providerNames), then loop over that ordered slice instead of
ranging modelCache.Providers so that duplicate model IDs consistently honor the
first-registered provider; update the loop that references modelCache.Providers,
providerName, cachedProv, newModels, and newModelsByProvider accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: b69869b7-dd07-43bc-85f5-b6bdc06c9bbf
📒 Files selected for processing (21)
- config/cache_validation_test.go
- config/config.go
- config/config_helpers_test.go
- config/config_test.go
- internal/app/app.go
- internal/cache/modelcache/local.go
- internal/cache/modelcache/local_test.go
- internal/cache/modelcache/modelcache.go
- internal/cache/modelcache/redis.go
- internal/cache/modelcache/redis_test.go
- internal/cache/redis.go
- internal/cache/store.go
- internal/providers/init.go
- internal/providers/registry.go
- internal/providers/registry_cache_test.go
- internal/responsecache/middleware_test.go
- internal/responsecache/responsecache.go
- internal/responsecache/simple.go
- internal/server/http.go
- tests/integration/setup_test.go
- tests/stress/stress_test.go
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
docs/advanced/configuration.mdx (2)
Lines 46-56: ⚠️ Potential issue | 🟡 Minor

Add `CACHE_REFRESH_INTERVAL` to the Cache configuration table.

The configuration option `CACHE_REFRESH_INTERVAL` (YAML: `cache.refresh_interval`) is supported and controls how often models auto-refresh. It defaults to 3600 seconds (1 hour) but is missing from the Cache configuration table. Add this row for completeness:

| Variable | Description | Default |
| --- | --- | --- |
| `CACHE_REFRESH_INTERVAL` | How often to refresh the model registry (seconds) | `3600` (1h) |

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/advanced/configuration.mdx` around lines 46 - 56, Add a new row to the Cache configuration table documenting CACHE_REFRESH_INTERVAL: the YAML key is cache.refresh_interval, it controls how often the model registry auto-refreshes (seconds) and defaults to 3600 (1h); update the table near the existing CACHE_TYPE/GOMODEL_CACHE_DIR/REDIS_* entries to include the row with Variable `CACHE_REFRESH_INTERVAL`, Description `How often to refresh the model registry (seconds)`, and Default ``3600` (1h)` to keep the docs complete and consistent.
Lines 179-183: ⚠️ Potential issue | 🟡 Minor

Update YAML example to show per-category cache configuration.

The cache configuration supports separate settings for model registry and response caching via `cache.model` and `cache.response` sections. The current example only shows `cache.type` and `cache.redis.url`, which doesn't reflect the actual structure. Update it to demonstrate both categories, for example showing `cache.model.redis` and `cache.response.simple.redis` configurations.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/advanced/configuration.mdx` around lines 179 - 183, Update the YAML example to show per-category cache configuration rather than a single top-level cache block: replace the current cache keys with separate sections for cache.model and cache.response (e.g., cache.model.redis.url = "redis://..." and cache.response.simple.redis.url = "redis://..." or similar), preserving the indicated types (redis/simple) so readers can see both model registry and response-cache settings; edit the example around the existing cache block to use cache.model and cache.response keys instead of cache.type and cache.redis.
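As a hedged illustration of what that per-category YAML might look like (the `key` and `ttl` field names and values are assumptions inferred from the review, not confirmed by the diff):

```yaml
cache:
  model:
    redis:
      url: "redis://localhost:6379/0"
      key: "gomodel:models"
      ttl: 3600
  response:
    simple:
      redis:
        url: "redis://localhost:6379/0"
        key: "gomodel:response:"
        ttl: 300
```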
♻️ Duplicate comments (3)
config/config.go (3)
Line 305: ⚠️ Potential issue | 🟡 Minor

Add a Go doc comment for `ValidateCacheConfig`.

The new exported helper is undocumented. As per coding guidelines, "Maintain Go doc comments on all exported types, functions, and interfaces; update when changing public APIs or function signatures/behavior".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@config/config.go` at line 305, Add a Go doc comment for the exported function ValidateCacheConfig that begins with the function name and succinctly describes its purpose, expected input (the CacheConfig pointer), and what kind of error it returns on invalid configuration; update the comment to mention any important validation rules or side effects performed by ValidateCacheConfig so callers know when and why it might return an error.
Lines 390-396: ⚠️ Potential issue | 🟠 Major

Don't hard-select the model-cache backend in defaults.

Seeding `Cache.Model.Local` here makes a YAML file that only sets `cache.model.redis` fail validation with both backends present, which is why the updated tests now need `local: null`. Resolve backend selection after YAML/env processing instead of baking it into `buildDefaultConfig()`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@config/config.go` around lines 390 - 396, The default config currently seeds Cache.Model.Local inside buildDefaultConfig() which forces both backends to appear and breaks validation when users only configure cache.model.redis; remove the hard-coded Local assignment (and any explicit nil/placeholder for Redis) from the ModelCacheConfig defaults in buildDefaultConfig(), leaving both backend fields absent/unset by default, and instead implement backend selection/normalization after YAML/env loading (e.g., a dedicated normalizeCacheModel or finalizeConfig step) that decides and fills Local or Redis based on user input; target the ModelCacheConfig struct and buildDefaultConfig() function to make this change.
Lines 546-551: ⚠️ Potential issue | 🟠 Major

Env-only Redis config still can't materialize nested structs.

This branch skips nil pointer fields, so `REDIS_URL`/`REDIS_TTL_RESPONSES` can never create `Cache.Model.Redis` or `Cache.Response.Simple.Redis` unless YAML already built those structs. That leaves the current .env, Helm, and Compose Redis path silently falling back to local/no-op caching. Allocate pointer-to-struct fields when one of their descendant env vars is present, and add a regression test for env-only Redis config.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@config/config.go` around lines 546 - 551, The pointer-branch in applyEnvOverridesValue currently skips nil pointer fields so env-only settings (e.g., REDIS_URL / REDIS_TTL_RESPONSES) can’t create nested structs; change the logic so when field.Type.Kind() == reflect.Ptr and fieldVal.IsNil() you detect whether any descendant env var for that pointer-to-struct exists, and if so allocate a new value with reflect.New(field.Type.Elem()), set it into the parent (fieldVal.Set(newVal)) and then call applyEnvOverridesValue(newVal.Elem()) to populate it; also add a regression test that sets REDIS_URL and REDIS_TTL_RESPONSES (or equivalent env names used by Cache.Model.Redis and Cache.Response.Simple.Redis) and asserts the config now materializes those nested Redis structs instead of leaving them nil.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@config/config.go`:
- Around line 281-288: The response cache currently ignores
RedisResponseConfig.Key (REDIS_KEY_RESPONSES) and always uses the hard-coded
responseCachePrefix; update the response-cache middleware where the Redis store
is created to use the configured key (e.g., cfg.RedisResponse.Key) as the
namespace/prefix (falling back to responseCachePrefix when empty) instead of the
constant, so operators' REDIS_KEY_RESPONSES takes effect; alternatively remove
the Key field from RedisResponseConfig and docs if you prefer not to support
per-response namespaces.
In `@docs/advanced/configuration.mdx`:
- Around line 53-56: Add a migration note explaining that the env vars were
renamed: map old names REDIS_KEY -> REDIS_KEY_MODELS and REDIS_TTL ->
REDIS_TTL_MODELS so users upgrading should set the new vars (or mention
backward-compatibility steps), and add a short clarifying sentence that
REDIS_KEY_RESPONSES intentionally includes a trailing colon
(`gomodel:response:`) because it’s used as a key prefix for response entries
whereas REDIS_KEY_MODELS (`gomodel:models`) is a discrete key; place this note
immediately after the Cache table and reference the exact env var names
(REDIS_KEY, REDIS_TTL, REDIS_KEY_MODELS, REDIS_TTL_MODELS, REDIS_KEY_RESPONSES)
so readers can map old→new.
In `@helm/templates/deployment.yaml`:
- Around line 69-78: The deployment Helm template is missing environment
variables for REDIS_TTL_MODELS and REDIS_TTL_RESPONSES so pods never receive TTL
settings; add two env entries (names REDIS_TTL_MODELS and REDIS_TTL_RESPONSES)
alongside REDIS_KEY_MODELS/RESPONSES using valueFrom configMapKeyRef with the
same {{ include "gomodel.fullname" . }} name and corresponding keys
REDIS_TTL_MODELS and REDIS_TTL_RESPONSES, and also update the ConfigMap template
to expose those two keys so values from chart values can flow through into the
pod.
---
Outside diff comments:
In `@docs/advanced/configuration.mdx`:
- Around line 46-56: Add a new row to the Cache configuration table documenting
CACHE_REFRESH_INTERVAL: the YAML key is cache.refresh_interval, it controls how
often the model registry auto-refreshes (seconds) and defaults to 3600 (1h);
update the table near the existing CACHE_TYPE/GOMODEL_CACHE_DIR/REDIS_* entries
to include the row with Variable `CACHE_REFRESH_INTERVAL`, Description `How
often to refresh the model registry (seconds)`, and Default ``3600` (1h)` to
keep the docs complete and consistent.
- Around line 179-183: Update the YAML example to show per-category cache
configuration rather than a single top-level cache block: replace the current
cache keys with separate sections for cache.model and cache.response (e.g.,
cache.model.redis.url = "redis://..." and cache.response.simple.redis.url =
"redis://..." or similar), preserving the indicated types (redis/simple) so
readers can see both model registry and response-cache settings; edit the
example around the existing cache block to use cache.model and cache.response
keys instead of cache.type and cache.redis.
---
Duplicate comments:
In `@config/config.go`:
- Line 305: Add a Go doc comment for the exported function ValidateCacheConfig
that begins with the function name and succinctly describes its purpose,
expected input (the CacheConfig pointer), and what kind of error it returns on
invalid configuration; update the comment to mention any important validation
rules or side effects performed by ValidateCacheConfig so callers know when and
why it might return an error.
- Around line 390-396: The default config currently seeds Cache.Model.Local
inside buildDefaultConfig() which forces both backends to appear and breaks
validation when users only configure cache.model.redis; remove the hard-coded
Local assignment (and any explicit nil/placeholder for Redis) from the
ModelCacheConfig defaults in buildDefaultConfig(), leaving both backend fields
absent/unset by default, and instead implement backend selection/normalization
after YAML/env loading (e.g., a dedicated normalizeCacheModel or finalizeConfig
step) that decides and fills Local or Redis based on user input; target the
ModelCacheConfig struct and buildDefaultConfig() function to make this change.
- Around line 546-551: The pointer-branch in applyEnvOverridesValue currently
skips nil pointer fields so env-only settings (e.g., REDIS_URL /
REDIS_TTL_RESPONSES) can’t create nested structs; change the logic so when
field.Type.Kind() == reflect.Ptr and fieldVal.IsNil() you detect whether any
descendant env var for that pointer-to-struct exists, and if so allocate a new
value with reflect.New(field.Type.Elem()), set it into the parent
(fieldVal.Set(newVal)) and then call applyEnvOverridesValue(newVal.Elem()) to
populate it; also add a regression test that sets REDIS_URL and
REDIS_TTL_RESPONSES (or equivalent env names used by Cache.Model.Redis and
Cache.Response.Simple.Redis) and asserts the config now materializes those
nested Redis structs instead of leaving them nil.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 690bbe32-d86d-42fe-b9b9-fedb4899d21d
📒 Files selected for processing (9)
- .env.template
- CLAUDE.md
- config/cache_validation_test.go
- config/config.go
- config/config_test.go
- docker-compose.yaml
- docs/advanced/configuration.mdx
- helm/templates/configmap.yaml
- helm/templates/deployment.yaml
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
https://github.com//pull/123#discussion_r2898119497 Made-with: Cursor
Actionable comments posted: 1
♻️ Duplicate comments (2)
internal/responsecache/simple.go (2)
Lines 102-106: ⚠️ Potential issue | 🟠 Major

Parse `stream` from JSON instead of byte-matching it.

The current check misses valid payloads like `"stream" : true` or multiline JSON, and it can false-positive when prompt text contains the same substring. A missed streaming request falls through to `responseCapture`, which does not preserve flushing/SSE semantics and can cache streamed bytes.

Suggested fix
```diff
 import (
 	"bytes"
 	"context"
 	"crypto/sha256"
+	"encoding/json"
 	"encoding/hex"
 	"io"
 	"net/http"
 	"strings"
 	"time"
@@
 func isStreamingRequest(path string, body []byte) bool {
 	if path == "/v1/embeddings" {
 		return false
 	}
-	return bytes.Contains(body, []byte(`"stream":true`)) || bytes.Contains(body, []byte(`"stream": true`))
+	var req struct {
+		Stream *bool `json:"stream"`
+	}
+	if err := json.Unmarshal(body, &req); err != nil {
+		return false
+	}
+	return req.Stream != nil && *req.Stream
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/responsecache/simple.go` around lines 102 - 106, isStreamingRequest currently byte-matches the body and can miss variants or false-positive in prompt text; change isStreamingRequest to parse the JSON body (e.g., json.Unmarshal into a map[string]interface{} or a small struct with Stream *bool) and return true only when the parsed "stream" field is true, keep the existing path == "/v1/embeddings" early-return, and on JSON unmarshal errors conservatively return false so responseCapture handling (which doesn't preserve streaming/SSE) is not used for real streamed requests.
Lines 75-80: ⚠️ Potential issue | 🟠 Major

Track background cache writes instead of detaching them.

This still spawns one goroutine per cacheable 200 response, drops the `Set` error, and gives shutdown no way to drain pending writes. A slow backend can silently lose entries or race store teardown. Either do the `Set` inline or push it through a bounded worker/`WaitGroup` that `Close()` waits on.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/responsecache/simple.go` around lines 75 - 80, The code currently detaches cache writes in an anonymous goroutine (checking capture.Response().Status and capture.body.Len()) which drops Set errors and cannot be drained on shutdown; change this to track background writes: either perform m.store.Set inline (replace the goroutine with a direct call using a context.WithTimeout) or add a bounded worker/WaitGroup that Close() waits on. If using WaitGroup, call m.wg.Add(1) before starting the goroutine, defer m.wg.Done() inside it, capture and log/return the error from m.store.Set, and ensure Close() waits (with a timeout) for m.wg to finish; alternatively implement a buffered channel and a fixed number of worker goroutines that read write requests (key, capture.body.Bytes(), m.ttl) and call m.store.Set, and have Close() close the channel and wait for workers to exit. Ensure you keep the same checks for capture.Response().Status == http.StatusOK and capture.body.Len() > 0 and reuse the same context timeout pattern for Set calls.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/responsecache/simple.go`:
- Around line 46-49: The handler currently calls io.ReadAll(c.Request().Body)
and on error falls back to return next(c), which lets downstream handlers see a
partially-consumed request body; instead, when io.ReadAll returns an error you
should propagate that error (return the read error) rather than calling next(c),
so replace the fallback to next(c) with returning the read error from the
middleware surrounding the io.ReadAll call (ensure the error from io.ReadAll is
returned to the caller so the request is not forwarded with a truncated body).
---
Duplicate comments:
In `@internal/responsecache/simple.go`:
- Around line 102-106: isStreamingRequest currently byte-matches the body and
can miss variants or false-positive in prompt text; change isStreamingRequest to
parse the JSON body (e.g., json.Unmarshal into a map[string]interface{} or a
small struct with Stream *bool) and return true only when the parsed "stream"
field is true, keep the existing path == "/v1/embeddings" early-return, and on
JSON unmarshal errors conservatively return false so responseCapture handling
(which doesn't preserve streaming/SSE) is not used for real streamed requests.
- Around line 75-80: The code currently detaches cache writes in an anonymous
goroutine (checking capture.Response().Status and capture.body.Len()) which
drops Set errors and cannot be drained on shutdown; change this to track
background writes: either perform m.store.Set inline (replace the goroutine with
a direct call using a context.WithTimeout) or add a bounded worker/WaitGroup
that Close() waits on. If using WaitGroup, call m.wg.Add(1) before starting the
goroutine, defer m.wg.Done() inside it, capture and log/return the error from
m.store.Set, and ensure Close() waits (with a timeout) for m.wg to finish;
alternatively implement a buffered channel and a fixed number of worker
goroutines that read write requests (key, capture.body.Bytes(), m.ttl) and call
m.store.Set, and have Close() close the channel and wait for workers to exit.
Ensure you keep the same checks for capture.Response().Status == http.StatusOK
and capture.body.Len() > 0 and reuse the same context timeout pattern for Set
calls.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 03b77bab-ced7-4289-9110-ebd016c3cc4d
📒 Files selected for processing (1)
internal/responsecache/simple.go
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
https://github.com//pull/123#discussion_r2898119528 Made-with: Cursor
When Ping fails, the client's connection pool was leaked. Close the client before returning the error so the pool is properly cleaned up. https://github.com//pull/123#discussion_r2898119510 Made-with: Cursor
NewResponseCacheMiddleware was ignoring RedisResponseConfig.Key and always using the hardcoded responseCachePrefix. Use the configured key as the Redis prefix, falling back to the constant when empty, so REDIS_KEY_RESPONSES takes effect. https://github.com//pull/123#discussion_r2900706634 Made-with: Cursor
Add the two missing TTL env vars to deployment.yaml and configmap.yaml so pods receive TTL settings. Also align values.yaml: replace the old single cache.redis.key/ttl with the split keyModels/ttlModels and keyResponses/ttlResponses fields that the ConfigMap template already expected. https://github.com//pull/123#discussion_r2900706643 Made-with: Cursor
…equest

When io.ReadAll failed, the middleware called next(c) with a partially consumed body. Return the error instead so the request is never forwarded with a truncated body. https://github.com//pull/123#discussion_r2900732735 Made-with: Cursor
SantiagoDePolonia left a comment:
Generally, looks good. But please change the two places I marked in the comments, and feel free to merge.
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@config/config.go`:
- Around line 281-288: The struct comment for RedisResponseConfig should
explicitly state that the URL field shares the same env var (REDIS_URL) as
RedisModelConfig, so a single REDIS_URL configures the Redis instance for both
response and model caches while Key and TTL can be set independently via
REDIS_KEY_RESPONSES and REDIS_TTL_RESPONSES; update the doc comment above
RedisResponseConfig to mention RedisModelConfig, call out the shared REDIS_URL
behavior, and note that Key and TTL are separate env vars allowing independent
cache keys/expirations.
In `@helm/values.yaml`:
- Around line 93-100: Add the four new Redis cache fields to the Helm values
schema by extending the cache.redis object in values.schema.json to include
keyModels (string), ttlModels (integer), keyResponses (string), and ttlResponses
(integer); ensure each property has an appropriate type (string for
keyModels/keyResponses, integer for ttlModels/ttlResponses) and include brief
descriptions/defaults consistent with existing cache.redis properties (e.g.,
match naming and units for TTLs) so users see and validate these inputs in the
chart schema UI.
In `@internal/cache/modelcache/redis_test.go`:
- Around line 67-88: The test currently only verifies Set/Get succeed but
doesn't confirm that NewRedisModelCacheWithStore applied defaults; update
TestRedisModelCache_DefaultKeyAndTTL to either type-assert the returned cache to
*redisModelCache and assert its key and ttl fields equal DefaultRedisKey and
cache.DefaultRedisTTL, or inspect the backing store (store from
cache.NewMapStore) to verify the entry was written under DefaultRedisKey and
expires with cache.DefaultRedisTTL; reference NewRedisModelCacheWithStore,
redisModelCache, DefaultRedisKey, cache.DefaultRedisTTL, Set/Get and the store
object to locate and implement the assertions.
- Around line 90-119: Replace the MapStore usage in both tests with a small spy
implementation of cache.Store that records whether Close() was called; in
TestRedisModelCacheWithStore_CloseDoesNotCloseSharedStore create the spy, pass
it to NewRedisModelCacheWithStore, call c.Close(), and assert the spy’s Close
was not called (owned=false), and in TestRedisModelCache_CloseClosesOwnedStore
create the spy, construct redisModelCache with owned=true, call c.Close() and
assert the spy’s Close was called exactly once and that a second c.Close() is
safe (idempotent); reference symbols:
TestRedisModelCacheWithStore_CloseDoesNotCloseSharedStore,
TestRedisModelCache_CloseClosesOwnedStore, NewRedisModelCacheWithStore,
redisModelCache, cache.Store, Close.
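The spy described in the prompt can be sketched as follows; the `Store` method set here is assumed from the review text, and `spyStore` is an illustrative name, not the repo's actual test helper:

```go
package main

import (
	"context"
	"time"
)

// Store mirrors the generic interface from internal/cache; the exact
// method set is assumed from the review text above.
type Store interface {
	Get(ctx context.Context, key string) ([]byte, error)
	Set(ctx context.Context, key string, value []byte, ttl time.Duration) error
	Close() error
}

// spyStore counts Close calls so ownership semantics can be asserted
// explicitly instead of relying on MapStore's no-op Close.
type spyStore struct {
	closeCalls int
}

func (s *spyStore) Get(ctx context.Context, key string) ([]byte, error) { return nil, nil }
func (s *spyStore) Set(ctx context.Context, key string, value []byte, ttl time.Duration) error {
	return nil
}
func (s *spyStore) Close() error {
	s.closeCalls++
	return nil
}

func main() {
	var st Store = &spyStore{}
	spy := st.(*spyStore)
	// An owned cache should forward Close; each call must be recorded.
	st.Close()
	st.Close()
	if spy.closeCalls != 2 {
		panic("expected 2 recorded Close calls")
	}
}
```

A test over a shared (unowned) store would instead assert `closeCalls == 0` after closing the cache.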
In `@internal/cache/modelcache/redis.go`:
- Around line 13-15: Add a Go doc comment for the exported constant
DefaultRedisKey: place a single-line comment immediately above the const
declaration that begins with "DefaultRedisKey" and succinctly describes its
purpose (the default Redis key/namespace used for storing model entries) and its
default value ("gomodel:models"); ensure the comment follows Go doc style
(starts with the symbol name) so the package surface is properly documented.
In `@internal/responsecache/simple.go`:
- Around line 138-153: The responseCapture currently wraps http.ResponseWriter
but does not implement http.Flusher, so any downstream Flush() calls (e.g., SSE)
will be ignored; add a Flush method on responseCapture that type-asserts
r.ResponseWriter to http.Flusher and, if present, calls its Flush() (no-op
otherwise) so flushes are forwarded; update the responseCapture type to include
this method alongside Write to preserve existing behavior.
In `@internal/server/http.go`:
- Around line 49-50: Add a Go doc comment for the exported
ResponseCacheMiddleware field on the server.Config struct: describe that it
allows injecting a *responsecache.ResponseCacheMiddleware instance to
enable/override response caching behavior for HTTP responses (and note its
effect or expected lifecycle if relevant). Ensure the comment is a single-line
prefix comment placed immediately above the ResponseCacheMiddleware field in the
server.Config type to match surrounding field docs.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: fcabf2e7-fafc-4bf7-975d-fef4d09c19af
📒 Files selected for processing (15)
- config/config.go
- config/config_test.go
- docker-compose.yaml
- helm/templates/configmap.yaml
- helm/templates/deployment.yaml
- helm/values.yaml
- internal/app/app.go
- internal/cache/modelcache/redis.go
- internal/cache/modelcache/redis_test.go
- internal/cache/redis.go
- internal/providers/init.go
- internal/responsecache/middleware_test.go
- internal/responsecache/responsecache.go
- internal/responsecache/simple.go
- internal/server/http.go
💤 Files with no reviewable changes (1)
- docker-compose.yaml
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Replace stale cache.redis.key/ttl with the four new properties: keyModels (string), ttlModels (integer), keyResponses (string), ttlResponses (integer), each with descriptions and consistent with values.yaml defaults. https://github.com//pull/123#discussion_r2902322581 Made-with: Cursor
…ltKeyAndTTL Type-assert the returned Cache to *redisModelCache and verify its key and ttl fields equal DefaultRedisKey and cache.DefaultRedisTTL so the test actually confirms defaults were applied. https://github.com//pull/123#discussion_r2902322582 Made-with: Cursor
…tics Replace MapStore with a spyStore that counts Close calls. The unowned test now asserts closeCalls==0 and the owned test asserts closeCalls==1 after the first Close and ==2 after the second, making the ownership contract explicit rather than relying on MapStore's no-op behaviour. https://github.com//pull/123#discussion_r2902322583 Made-with: Cursor
…cing-costs-parts-of-the # Conflicts: # config/config_test.go # internal/server/http.go
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
config/config.go (1)
466-483: ⚠️ Potential issue | 🟠 Major — Env-selected model cache backend still loses to YAML instead of overriding it.

If YAML sets `cache.model.local` and the process sets `REDIS_URL`/`REDIS_KEY_MODELS`/`REDIS_TTL_MODELS`, `applyEnvOverrides()` materializes `Cache.Model.Redis` but leaves `Local` intact, so lines 481-483 fail validation instead of letting env win. The same happens in reverse with `GOMODEL_CACHE_DIR` over YAML Redis. Resolve the mutually exclusive backend after the env overlay by clearing the competing side when only one backend is selected via env.

Suggested fix

```diff
 if err := applyEnvOverrides(cfg); err != nil {
 	return nil, err
 }

+modelLocalEnv := os.Getenv("GOMODEL_CACHE_DIR") != ""
+modelRedisEnv := os.Getenv("REDIS_URL") != "" ||
+	os.Getenv("REDIS_KEY_MODELS") != "" ||
+	os.Getenv("REDIS_TTL_MODELS") != ""
+
+switch {
+case modelLocalEnv && !modelRedisEnv:
+	cfg.Cache.Model.Redis = nil
+case modelRedisEnv && !modelLocalEnv:
+	cfg.Cache.Model.Local = nil
+}
+
 // When no model cache backend was specified at all, default to local.
 if cfg.Cache.Model.Local == nil && cfg.Cache.Model.Redis == nil {
 	cfg.Cache.Model.Local = &LocalCacheConfig{}
 }
```

As per coding guidelines, "Configuration loading must follow the cascade: code defaults → config/config.yaml (supports `${VAR}` and `${VAR:-default}` expansion) → environment variables (always win)".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@config/config.go` around lines 466 - 483, After applyEnvOverrides(cfg) returns, ensure the env-selected mutually-exclusive model cache backend wins by detecting which backend-specific environment variables were used and clearing the other backend on cfg.Cache.Model; specifically, check for Redis env vars (e.g., REDIS_URL, REDIS_KEY_MODELS, REDIS_TTL_MODELS) and if any present set cfg.Cache.Model.Local = nil, or check for the local env var (e.g., GOMODEL_CACHE_DIR) and if present set cfg.Cache.Model.Redis = nil, so that the subsequent defaulting/validation (the block that sets Local when both nil and the call to ValidateCacheConfig) honors environment overrides rather than the YAML value.
♻️ Duplicate comments (1)
internal/responsecache/simple.go (1)
79-89: ⚠️ Potential issue | 🟠 Major — Bound cache writes instead of spawning one goroutine per miss.

`wg` fixes shutdown ordering, but lines 81-89 still create an unbounded goroutine backlog when the store slows down or stalls. A burst of cache misses can turn Redis latency into memory pressure on the request path. Use a bounded worker/semaphore or write inline so cache traffic cannot grow without limit.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/responsecache/simple.go` around lines 79 - 89, The current code spawns an unbounded goroutine per cache-miss (m.wg, m.store.Set) which can backlog if the store is slow; fix by bounding concurrent cache writers: add a semaphore field (e.g., m.writeSem chan struct{} with a fixed capacity), acquire from m.writeSem before launching the writer goroutine (or acquire synchronously and fall back to doing the Set inline if the sem is full), then release the sem after Set and still use m.wg.Done; update the code referencing capture.body, key and m.ttl to perform the Set under the semaphore to prevent unbounded goroutine growth.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@config/config.go`:
- Around line 306-325: ValidateCacheConfig currently dereferences c (m :=
&c.Model) and panics when called with nil; add a guard at the top of
ValidateCacheConfig to return a descriptive error if c == nil (e.g.,
fmt.Errorf("cache: config is nil")) before any dereference, then proceed with
the existing checks on c.Model, c.Model.Local and c.Model.Redis (and the
m.Redis.URL check) so the function safely handles nil input.
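A minimal sketch of the guard, with `CacheConfig` reduced to a stub since only the nil check is at issue:

```go
package main

import "fmt"

// CacheConfig is a stub; the real struct carries Model/Response sections.
type CacheConfig struct {
	// Model, Response, ... elided
}

// ValidateCacheConfig returns a descriptive error instead of panicking
// when called with a nil config.
func ValidateCacheConfig(c *CacheConfig) error {
	if c == nil {
		return fmt.Errorf("cache: config is nil")
	}
	// ... the existing checks on c.Model.Local / c.Model.Redis follow here
	return nil
}

func main() {
	if err := ValidateCacheConfig(nil); err == nil {
		panic("expected an error for a nil config")
	}
	fmt.Println(ValidateCacheConfig(&CacheConfig{}) == nil) // prints "true"
}
```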
In `@helm/values.schema.json`:
- Around line 191-200: The redis properties object in the JSON schema currently
allows unknown properties so removed fields like cache.redis.key/cache.redis.ttl
will still validate but be ignored; update the "redis" schema block (the object
named "redis" with properties
"url","keyModels","ttlModels","keyResponses","ttlResponses") to either add
"additionalProperties": false to make it strict, or add explicit
backward-compatibility aliases "key" and "ttl" (with descriptions marking them
deprecated and mapping guidance) so old values files are validated and the
templates can read them; implement one of these fixes in the
helm/values.schema.json redis object.
In `@internal/cache/modelcache/redis.go`:
- Around line 18-23: The exported type RedisModelCacheConfig and its fields lack
proper Go doc comments and the existing top comment is misleading; update the
comment on RedisModelCacheConfig to state that it is the configuration used by
NewRedisModelCache (which creates the Redis store) and document each exported
field (URL, Key, TTL) with their meaning, defaults (if any), and the TTL units
(e.g., time.Duration in seconds), so go doc consumers understand expected values
and behavior; ensure the package-level comment remains accurate and that
NewRedisModelCache is referenced in the type comment.
- Around line 77-82: The Set method on redisModelCache should reject nil
ModelCache payloads to avoid writing JSON "null" to Redis; modify
redisModelCache.Set to check if mc == nil and return a descriptive error (e.g.,
"nil ModelCache") before calling json.Marshal and c.store.Set (referencing
redisModelCache.Set, ModelCache, c.store.Set, c.key, c.ttl), and add a
regression test in a *_test.go that calls redisModelCache.Set(ctx, nil) and
asserts an error is returned and nothing is written to the store.
---
Outside diff comments:
In `@config/config.go`:
- Around line 466-483: After applyEnvOverrides(cfg) returns, ensure the
env-selected mutually-exclusive model cache backend wins by detecting which
backend-specific environment variables were used and clearing the other backend
on cfg.Cache.Model; specifically, check for Redis env vars (e.g., REDIS_URL,
REDIS_KEY_MODELS, REDIS_TTL_MODELS) and if any present set cfg.Cache.Model.Local
= nil, or check for the local env var (e.g., GOMODEL_CACHE_DIR) and if present
set cfg.Cache.Model.Redis = nil, so that the subsequent defaulting/validation
(the block that sets Local when both nil and the call to ValidateCacheConfig)
honors environment overrides rather than the YAML value.
---
Duplicate comments:
In `@internal/responsecache/simple.go`:
- Around line 79-89: The current code spawns an unbounded goroutine per
cache-miss (m.wg, m.store.Set) which can backlog if the store is slow; fix by
bounding concurrent cache writers: add a semaphore field (e.g., m.writeSem chan
struct{} with a fixed capacity), acquire from m.writeSem before launching the
writer goroutine (or acquire synchronously and fall back to doing the Set inline
if the sem is full), then release the sem after Set and still use m.wg.Done;
update the code referencing capture.body, key and m.ttl to perform the Set under
the semaphore to prevent unbounded goroutine growth.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 4b453ad0-8175-47cb-bd1d-492ec64edef1
📒 Files selected for processing (6)
- config/config.go
- helm/values.schema.json
- internal/cache/modelcache/redis.go
- internal/cache/modelcache/redis_test.go
- internal/responsecache/simple.go
- internal/server/http.go
In Echo v5, graceful HTTP shutdown is handled entirely by cancelling the context passed to StartConfig.Start; the Echo type no longer has a Shutdown method.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
…elds Replace the misleading "Caller uses cache.RedisStore" comment with a proper doc comment referencing NewRedisModelCache, and document each field (URL, Key, TTL) with its meaning, default, and units. https://github.com//pull/123#discussion_r2904148144 Made-with: Cursor
json.Marshal(nil) produces "null" which would be silently stored in Redis. Guard against this by returning an error before marshalling. Extend spyStore with a setCalls counter and add a regression test that asserts both an error is returned and store.Set is never called. https://github.com//pull/123#discussion_r2904148149 Made-with: Cursor
post-merge leftovers
Refactor: Separate `internal/cache/` into layered packages

Summary

Split the flat `internal/cache/` package into a generic key-value store layer (`internal/cache/`) and a model-specific domain layer (`internal/cache/modelcache/`). Each cache consumer now depends on the correct abstraction level. No behavioral changes.

Problem

The original `internal/cache/` package mixed three unrelated concerns in a single namespace:

- generic store primitives (`Store`, `MapStore`, `RedisStore`) — used by the response cache middleware
- model-cache domain types (`ModelCache`, `CachedProvider`, `CachedModel`, `Cache`, `LocalCache`, `redisModelCache`) — used by the provider registry
- Redis client construction (`NewRedisStore`) — shared infrastructure

This caused:

- `cache.go` defined both the generic `Store` interface and model-specific types. The name conveyed nothing about its actual contents.
- The response cache imported `internal/cache` to get `Store` and `RedisStore`, but that package was conceptually the "model cache" package.
- `ResponseCachePrefix` ("gomodel:response:") lived in `redis.go` next to `DefaultRedisKey` ("gomodel:models:"). A response-cache constant had no business in the model-cache package.
- Because `RedisStore` construction was tangled with model-cache logic, there was no clean seam to share a single Redis connection between consumers.

Solution

Dependency direction (after):

Both `responsecache` and `modelcache` depend on the generic `cache` layer. Neither depends on the other.

Changes
New files

- `internal/cache/store.go`: `Store` interface, `MapStore` (from `cache.go`)
- `internal/cache/modelcache/modelcache.go`: `ModelCache`, `CachedProvider`, `CachedModel`, `Cache` interface (from `cache.go`)
- `internal/cache/modelcache/local.go`: `LocalCache` file-based impl (from `local.go`)
- `internal/cache/modelcache/redis.go`: `redisModelCache`, `DefaultRedisKey` (from `model_redis.go`)
- `internal/cache/modelcache/redis_test.go` (from `model_redis_test.go`)
- `internal/cache/modelcache/local_test.go` (from `cache_test.go`)

Deleted files

- `internal/cache/cache.go` → `store.go` + `modelcache/modelcache.go`
- `internal/cache/local.go` → `modelcache/local.go`
- `internal/cache/model_redis.go` → `modelcache/redis.go`
- `internal/cache/model_redis_test.go` → `modelcache/redis_test.go`
- `internal/cache/cache_test.go` → `modelcache/local_test.go`

Modified files

- `internal/cache/redis.go`: removed the `DefaultRedisKey` and `ResponseCachePrefix` constants
- `internal/responsecache/responsecache.go`: added a local `responseCachePrefix` constant; removed the `cache.ResponseCachePrefix` reference
- `internal/providers/init.go`: imports `modelcache`; `cache.Cache` → `modelcache.Cache`, `cache.NewRedisModelCache` → `modelcache.NewRedisModelCache`, `cache.NewLocalCache` → `modelcache.NewLocalCache`, `cache.DefaultRedisKey` → `modelcache.DefaultRedisKey`
- `internal/providers/registry.go`: imports `modelcache`; all `cache.ModelCache`/`cache.CachedProvider`/`cache.CachedModel` → `modelcache.*`
- `internal/providers/registry_cache_test.go`: updated alongside `registry.go`
- `tests/stress/stress_test.go`: imports `modelcache`; the `emptyCache` mock uses `modelcache.ModelCache`

Unchanged

- `internal/responsecache/simple.go`, `middleware_test.go`: already imported `cache.Store` correctly
- `internal/app/app.go`, `internal/server/http.go`: no direct cache type references
- `config/`: config DTOs don't depend on cache packages

What did NOT change

- `modelcache` imports only its parent `internal/cache` and stdlib.

Future: shared Redis connection

With the generic `cache.Store` layer separated, sharing a single Redis connection becomes a small localized change: create one `RedisStore` in `app.go` at startup and pass it into both `modelcache.NewRedisModelCacheWithStore(sharedStore, ...)` and `responsecache.NewResponseCacheMiddlewareWithStore(sharedStore, ...)`. Both constructor-with-store variants already exist.
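The shared-connection wiring can be sketched like this; the types and constructors below are stand-ins for the ones named above, and the real signatures in the repo may differ:

```go
package main

import "fmt"

// RedisStore stands in for the real cache.RedisStore.
type RedisStore struct{ url string }

func NewRedisStore(url string) *RedisStore { return &RedisStore{url: url} }

type redisModelCache struct{ store *RedisStore }
type responseCacheMiddleware struct{ store *RedisStore }

// Stand-ins for the constructor-with-store variants named above.
func NewRedisModelCacheWithStore(s *RedisStore) *redisModelCache {
	return &redisModelCache{store: s}
}
func NewResponseCacheMiddlewareWithStore(s *RedisStore) *responseCacheMiddleware {
	return &responseCacheMiddleware{store: s}
}

func main() {
	// One RedisStore created at startup (in app.go), shared by both consumers.
	shared := NewRedisStore("redis://localhost:6379/0")
	mc := NewRedisModelCacheWithStore(shared)
	rc := NewResponseCacheMiddlewareWithStore(shared)
	fmt.Println(mc.store == rc.store) // prints "true": one connection pool
}
```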