Skip to content

feat(fallback): add model fallback routing#179

Merged
SantiagoDePolonia merged 10 commits intomainfrom
feature/fallback
Mar 28, 2026
Merged

feat(fallback): add model fallback routing#179
SantiagoDePolonia merged 10 commits intomainfrom
feature/fallback

Conversation

@SantiagoDePolonia
Copy link
Copy Markdown
Contributor

@SantiagoDePolonia SantiagoDePolonia commented Mar 26, 2026

Summary

  • add translated-route model fallback with off, manual, and auto modes
  • load manual fallback chains from JSON and resolve auto candidates from model ranking metadata
  • harden the feature by skipping cache writes for fallback-served responses and disabling embeddings fallback until vector compatibility is modeled

Testing

  • go test ./...
  • pre-commit hooks during git commit (fmt, unit tests, hot-path guard, golangci-lint)

Summary by CodeRabbit

  • New Features

    • Configurable model fallback: global default, per-model overrides, optional manual-rule file; auto/manual/off modes with ranked, provider-aware candidates
    • Runtime fallback for Chat and Responses (streaming and non-streaming); Embeddings remain without fallback
    • Requests flagged when fallback is used so cache inserts are skipped
  • Documentation

    • Terminology updated: “failover” → “fallback”
  • Tests

    • Expanded coverage for fallback config, resolver, ranking, provider metadata, and cache-skipping behavior

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 26, 2026

📝 Walkthrough

Walkthrough

Adds a configurable fallback subsystem: new fallback config and example JSON, config parsing/validation and manual-rule loading, a Resolver that produces ordered fallback selectors (manual or auto-ranked), wiring into server handlers and execution plans, per-request fallback tracking, and cache-write skipping when fallback was used.

Changes

Cohort / File(s) Summary
Configuration & Examples
config/config.example.yaml, config/config.go, config/config_test.go, config/fallback.example.json, config/failover.yaml
Add top-level fallback config (default_mode, manual_rules_path, overrides), validation/normalization, manual-rules JSON loading and parsing, example file and tests; remove commented failover snippet.
Fallback resolver & tests
internal/fallback/resolver.go, internal/fallback/resolver_test.go
New Resolver + Registry interface; supports manual and auto modes, manual-rule usage, auto candidate generation from registry, multi-criteria ranking, dedupe/truncate, and comprehensive unit tests.
Server wiring & interfaces
internal/server/http.go, internal/server/handlers.go, internal/server/request_model_resolution.go
Introduce RequestFallbackResolver interface; add FallbackResolver to server config and handler, thread resolver into handler construction.
Translated inference & runtime fallback
internal/server/translated_inference_service.go, internal/server/fallback_test.go, internal/server/handlers_test.go, internal/server/translated_inference_service_test.go
Integrate fallback into Chat/Responses (streaming + non-streaming) with selector-specific retries, request cloning, mark request when fallback used, disable embeddings fallback, refactor streaming helpers, and add/update tests.
Application & feature flags
internal/app/app.go, internal/app/app_test.go, internal/executionplans/..., internal/core/execution_plan.go
Initialize fallback resolver in app setup; compute global fallback-enabled feature and propagate into execution plan inputs and runtime caps; add ExecutionFeatures.Fallback and compile/runtime mappings/tests.
Request context & cache behavior
internal/core/context.go, internal/responsecache/simple.go, internal/responsecache/semantic.go, internal/responsecache/handle_request_test.go
Add fallback-used request flag and helpers; skip simple and semantic cache writes when fallback used; tests validate cache-write bypass.
Model metadata & rankings
internal/core/types.go, internal/modeldata/types.go, internal/modeldata/merger.go, internal/modeldata/merger_test.go, internal/modeldata/fetcher_test.go
Add ModelRanking and ModelMetadata.Rankings; extend ranking parsing (Elo, AsOf), map rankings into metadata, adjust provider-model actual ID handling, and add tests.
Providers registry & cache
internal/providers/registry.go, internal/providers/registry_cache_test.go, internal/providers/registry_test.go
Persist ProviderName/ProviderType on ModelInfo, backfill/enrich provider-type when loading cache, prefer stored provider metadata when listing models, refactor enrichment helper, and add tests.
Response handling tests & docs
internal/server/fallback_test.go, docs/adr/0004-capability-model-and-provider-attempts.md
Add in-memory server handler tests for fallback flows; rename “failover”→“fallback” in ADR documentation.

Sequence Diagram

sequenceDiagram
    participant Client as Client
    participant Handler as Server Handler
    participant Fallback as Fallback Resolver
    participant Registry as Model Registry
    participant Primary as Primary Provider
    participant Alternate as Alternate Provider

    Client->>Handler: Send request with model selector
    Handler->>Primary: Execute primary (Chat/Responses)
    alt Primary success
        Primary-->>Handler: Success
        Handler->>Client: Return 200
    else Provider-unavailable error
        Handler->>Fallback: ResolveFallbacks(resolution, op)
        Fallback->>Registry: GetModel / ListModelsWithProvider
        Note right of Fallback: Build candidates (manual or auto), score, dedupe, truncate
        Fallback-->>Handler: Fallback selector(s)
        Handler->>Alternate: Retry with fallback selector
        Alternate-->>Handler: Success
        Handler->>Handler: MarkFallbackUsed(ctx)
        Handler->>Client: Return 200
    else Non-availability error
        Handler->>Client: Return error (no fallback)
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

release:internal

Poem

🐰 I hopped when the primary lost its pace,

I read the list and found a second place.
Manual maps or auto-ranked trail,
I flagged the request and skipped the cache tale.
A little rabbit retry — neat and hale.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 15.24% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the main change: adding model fallback routing as a feature. It is specific, non-vague, and directly reflects the substantial fallback infrastructure (configuration, resolver, execution logic) added throughout the codebase.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feature/fallback

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/providers/registry.go (1)

245-251: ⚠️ Potential issue | 🟠 Major

Backfill ProviderType when reading older cache entries.

Caches written before this field existed will load with cachedProv.ProviderType == "". Since the new GetProviderType() and ListModelsWithProvider() paths now trust info.ProviderType, cache-backed startup loses provider classification until a full network refresh succeeds.

💡 Suggested fix
 r.mu.RLock()
 nameToProvider := make(map[string]core.Provider, len(r.providerNames))
+providerTypes := make(map[core.Provider]string, len(r.providerTypes))
 for provider, pName := range r.providerNames {
 	nameToProvider[pName] = provider
 }
+maps.Copy(providerTypes, r.providerTypes)
 r.mu.RUnlock()
@@
+providerType := cachedProv.ProviderType
+if providerType == "" {
+	providerType = providerTypes[provider]
+}
 info := &ModelInfo{
 	Model: core.Model{
 		ID:      cached.ID,
 		Object:  "model",
 		OwnedBy: cachedProv.OwnedBy,
 		Created: cached.Created,
 	},
 	Provider:     provider,
 	ProviderName: providerName,
-	ProviderType: cachedProv.ProviderType,
+	ProviderType: providerType,
 }

Also applies to: 264-273

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/registry.go` around lines 245 - 251, When constructing the
nameToProvider map and when loading cached provider entries, detect entries
where cachedProv.ProviderType == "" and backfill it by calling the registry's
GetProviderType() (or using r.providerNames mapping) so the ProviderType is
populated for older cache records; update the code paths that build
nameToProvider (using r.providerNames and the nameToProvider map) and the
cache-read logic to set cachedProv.ProviderType before trusting
info.ProviderType in ListModelsWithProvider() and other consumers. Ensure the
backfill happens immediately after reading the cached provider and before
inserting into nameToProvider so subsequent calls use the filled ProviderType.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@config/config_test.go`:
- Around line 268-276: The test embeds manualRulesPath directly into a
double-quoted YAML string (the yaml variable) which makes it platform-dependent
on Windows; instead construct the YAML via a Go value and yaml.Marshal (or
otherwise quote/escape the path safely) so backslashes aren't treated as YAML
escapes. Change the code that builds the yaml string (referencing
manualRulesPath and the yaml variable) to create a struct representing the
fallback config and call yaml.Marshal to produce the bytes, then write those
bytes with os.WriteFile (the same call that writes filepath.Join(dir,
"config.yaml")).

In `@config/config.go`:
- Around line 734-744: The code currently allows an empty fallback mode to pass
by normalizing it to ""—update the validation in the loop over cfg.Overrides
(working with FallbackModelOverride and normalizeFallbackMode) to reject missing
modes: after calling override.Mode = normalizeFallbackMode(override.Mode) check
if override.Mode == "" and return a config error (e.g.,
"fallback.overrides[%q].mode must be one of: auto, manual, off" or
"fallback.overrides[%q].mode cannot be empty"), otherwise continue to call
override.Mode.Valid() as before so entries like fallback.overrides.gpt-4o: {}
fail to load.

In `@config/fallback.yaml`:
- Around line 1-8: The fallback settings in config/fallback.yaml are never read
because config.Load() only reads config/config.yaml or config.yaml; update the
config loading logic (the config.Load function) to also detect and merge
config/fallback.yaml (or make loader scan for a fallback file name) and merge
its "fallback" section into the main configuration, or alternatively add a note
and copy the fallback keys into the primary config file; ensure the codepath
uses the "fallback" key from the merged config so entries like
"fallback.default_mode" and "fallback.overrides" (and specific overrides such as
"gpt-4o" / "claude-sonnet-4") are applied at runtime.

In `@internal/fallback/resolver.go`:
- Around line 387-395: sameFamily currently trims to check emptiness but
compares the original untrimmed source.Family and candidate.Family, causing
mismatches when whitespace differs; update sameFamily to trim both values into
local variables (e.g., srcFam := strings.TrimSpace(source.Family), candFam :=
strings.TrimSpace(candidate.Family)), validate those for non-empty, and then
compare srcFam == candFam so equality uses the trimmed strings.

In `@internal/responsecache/handle_request_test.go`:
- Around line 125-131: The test races with asynchronous cache writes scheduled
by run(false) because defer store.Close() can close the store before those
goroutines finish; modify the test (or run) to wait for the background
cache-population to complete before asserting: add a synchronization point
(e.g., expose and call a wait method on the fake store like
WaitForPendingWrites(), or have run(false) block until its scheduled cache write
goroutine signals completion) and then perform the checks on
rec2.Header().Get("X-Cache") and handlerCalls so defer store.Close() no longer
races with the async work.

---

Outside diff comments:
In `@internal/providers/registry.go`:
- Around line 245-251: When constructing the nameToProvider map and when loading
cached provider entries, detect entries where cachedProv.ProviderType == "" and
backfill it by calling the registry's GetProviderType() (or using
r.providerNames mapping) so the ProviderType is populated for older cache
records; update the code paths that build nameToProvider (using r.providerNames
and the nameToProvider map) and the cache-read logic to set
cachedProv.ProviderType before trusting info.ProviderType in
ListModelsWithProvider() and other consumers. Ensure the backfill happens
immediately after reading the cached provider and before inserting into
nameToProvider so subsequent calls use the filled ProviderType.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f040d3d2-81b8-469e-84cb-e300a20fc00e

📥 Commits

Reviewing files that changed from the base of the PR and between ab9d4c1 and 128fd48.

📒 Files selected for processing (26)
  • config/config.example.yaml
  • config/config.go
  • config/config_test.go
  • config/failover.yaml
  • config/fallback.example.json
  • config/fallback.yaml
  • docs/adr/0004-capability-model-and-provider-attempts.md
  • internal/app/app.go
  • internal/core/context.go
  • internal/core/types.go
  • internal/fallback/resolver.go
  • internal/fallback/resolver_test.go
  • internal/modeldata/fetcher_test.go
  • internal/modeldata/merger.go
  • internal/modeldata/merger_test.go
  • internal/modeldata/types.go
  • internal/providers/registry.go
  • internal/responsecache/handle_request_test.go
  • internal/responsecache/semantic.go
  • internal/responsecache/simple.go
  • internal/server/fallback_test.go
  • internal/server/handlers.go
  • internal/server/handlers_test.go
  • internal/server/http.go
  • internal/server/request_model_resolution.go
  • internal/server/translated_inference_service.go
💤 Files with no reviewable changes (1)
  • config/failover.yaml

Comment on lines +1 to +8
fallback:
default_mode: auto
manual_rules_path: "config/fallback.example.json"
overrides:
"gpt-4o":
mode: manual
"claude-sonnet-4":
mode: off
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

config/fallback.yaml is not loaded by the current config loader.

Line 1 introduces a standalone fallback config, but config.Load() only reads config/config.yaml or config.yaml. As written, this file’s values will be ignored unless duplicated in one of those files.

Suggested fix options
# Option A (preferred): move this fallback block into config/config.yaml
# so it is definitely loaded by applyYAML().
# Option B: explicitly extend config loading to merge config/fallback.yaml
# in config/config.go instead of first-found single-file behavior.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@config/fallback.yaml` around lines 1 - 8, The fallback settings in
config/fallback.yaml are never read because config.Load() only reads
config/config.yaml or config.yaml; update the config loading logic (the
config.Load function) to also detect and merge config/fallback.yaml (or make
loader scan for a fallback file name) and merge its "fallback" section into the
main configuration, or alternatively add a note and copy the fallback keys into
the primary config file; ensure the codepath uses the "fallback" key from the
merged config so entries like "fallback.default_mode" and "fallback.overrides"
(and specific overrides such as "gpt-4o" / "claude-sonnet-4") are applied at
runtime.

Comment on lines +125 to +131
rec2 := run(false)
if rec2.Header().Get("X-Cache") != "" {
t.Fatalf("fallback-served response should not populate cache, got X-Cache=%q", rec2.Header().Get("X-Cache"))
}
if handlerCalls != 2 {
t.Fatalf("expected second request to execute handler again, got %d calls", handlerCalls)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Wait for the second async cache population before returning.

run(false) schedules simple/semantic cache writes, but the test exits immediately afterward. defer store.Close() can race those goroutines and make this test flaky.

💡 Suggested fix
 rec2 := run(false)
 if rec2.Header().Get("X-Cache") != "" {
 	t.Fatalf("fallback-served response should not populate cache, got X-Cache=%q", rec2.Header().Get("X-Cache"))
 }
 if handlerCalls != 2 {
 	t.Fatalf("expected second request to execute handler again, got %d calls", handlerCalls)
 }
+
+m.simple.wg.Wait()
+m.semantic.wg.Wait()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/responsecache/handle_request_test.go` around lines 125 - 131, The
test races with asynchronous cache writes scheduled by run(false) because defer
store.Close() can close the store before those goroutines finish; modify the
test (or run) to wait for the background cache-population to complete before
asserting: add a synchronization point (e.g., expose and call a wait method on
the fake store like WaitForPendingWrites(), or have run(false) block until its
scheduled cache write goroutine signals completion) and then perform the checks
on rec2.Header().Get("X-Cache") and handlerCalls so defer store.Close() no
longer races with the async work.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/providers/registry.go (1)

264-279: ⚠️ Potential issue | 🟠 Major

Prefer the configured provider type over the serialized cache value.

At this point provider already comes from the current providerName -> provider registration map. If a cache file survives a provider-type change for that same name, this path will attach models to the new provider instance but keep the old cached ProviderType, so GetProviderType() / ListModelsWithProvider() expose stale metadata until the first refresh. Using the configured type as the source of truth avoids that split-brain state.

🩹 Proposed fix
-		providerType := strings.TrimSpace(cachedProv.ProviderType)
-		if providerType == "" {
-			providerType = strings.TrimSpace(nameToProviderType[providerName])
-		}
+		providerType := strings.TrimSpace(nameToProviderType[providerName])
+		if providerType == "" {
+			providerType = strings.TrimSpace(cachedProv.ProviderType)
+		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/registry.go` around lines 264 - 279, The code currently
prefers cachedProv.ProviderType over the configured provider type, which can
leave stale metadata attached to a new provider instance; change the assignment
so providerType is taken from the configured provider (the current provider
instance) rather than the cache: after computing providerType from cachedProv,
overwrite it with the provider's configured type (use the provider's accessor
such as provider.Type() or provider.GetType()/GetProviderType() as appropriate
in this codebase) so that providerType reflects the live provider for provider,
providerName, cachedProv and nameToProviderType usage.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/fallback/resolver.go`:
- Around line 75-95: ResolveFallbacks currently builds manualSelectorsFor and
may enable autoSelectorsFor even for embedding requests; detect embedding
operations early by checking op (core.OperationEmbeddings) or computing
requiredCategoryForOperation(op) and if it equals core.CategoryEmbedding, return
nil before calling manualSelectorsFor or autoSelectorsFor so no fallback
candidates (manual or auto) are produced for embeddings; update ResolveFallbacks
to perform this early-return check and keep references to manualSelectorsFor,
autoSelectorsFor, ResolveFallbacks, requiredCategoryForOperation,
core.OperationEmbeddings and core.CategoryEmbedding to locate the change.
- Around line 298-325: In Resolver.matchKeys the generated key order puts bare
model keys before provider-qualified keys so generic entries like
fallback.overrides["gpt-4o"] shadow provider-specific entries; change the key
construction so that when source != nil and source.ProviderName != "" and
source.Model.ID != "" you insert the provider-qualified key
(source.ProviderName+"/"+source.Model.ID) before the bare keys
(resolution.Requested.Model and resolution.ResolvedSelector.Model) so
provider-qualified keys are emitted earlier; keep the same trimming/duplication
filtering logic and the existing calls resolution.RequestedQualifiedModel(),
resolution.ResolvedQualifiedModel(), resolution.Requested.Model and
resolution.ResolvedSelector.Model to locate where to reorder without altering
the duplicate removal loop.

---

Outside diff comments:
In `@internal/providers/registry.go`:
- Around line 264-279: The code currently prefers cachedProv.ProviderType over
the configured provider type, which can leave stale metadata attached to a new
provider instance; change the assignment so providerType is taken from the
configured provider (the current provider instance) rather than the cache: after
computing providerType from cachedProv, overwrite it with the provider's
configured type (use the provider's accessor such as provider.Type() or
provider.GetType()/GetProviderType() as appropriate in this codebase) so that
providerType reflects the live provider for provider, providerName, cachedProv
and nameToProviderType usage.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 04a9a469-ae36-4ac4-af2f-69c8889cbe5c

📥 Commits

Reviewing files that changed from the base of the PR and between 128fd48 and 1ac74cb.

📒 Files selected for processing (6)
  • config/config.go
  • config/config_test.go
  • internal/fallback/resolver.go
  • internal/fallback/resolver_test.go
  • internal/providers/registry.go
  • internal/providers/registry_cache_test.go

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
internal/fallback/resolver.go (1)

302-341: ⚠️ Potential issue | 🟠 Major

Don't infer selector specificity from the rendered string.

selectorHasProvider() treats any foo/bar value as provider-qualified, but the registry deliberately supports bare model IDs containing /. For requests like meta-llama/Meta-Llama-3-70B, matchKeys() will still put the bare key ahead of source.ProviderName+"/"+source.Model.ID, so provider-specific manual rules and overrides remain shadowed. Drive this ordering from the explicit provider fields on the request/resolved selectors (or the resolved ModelInfo) instead of reparsing the string form.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/fallback/resolver.go` around lines 302 - 341, matchKeys currently
infers provider qualification by reparsing the rendered selector strings via
selectorHasProvider(core.ParseModelSelector), which misorders keys for bare
model IDs that contain "/" (e.g. meta-llama/Meta-Llama-3-70B). Fix by driving
ordering from explicit provider fields instead: stop using selectorHasProvider
and instead check the parsed/provider fields on the request/resolved selector
objects (e.g. resolution.Requested.Provider,
resolution.ResolvedSelector.Provider) and the resolved source
(source.ProviderName and source.Model.ID) when deciding to prepend
provider-qualified keys; ensure you still include
resolution.RequestedQualifiedModel(), resolution.ResolvedQualifiedModel(),
resolution.Requested.Model and resolution.ResolvedSelector.Model but only treat
them as provider-qualified if the corresponding explicit Provider fields are
non-empty.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/providers/registry.go`:
- Around line 734-742: In ListModelsWithProviderByCategory(), stop
reconstructing provider metadata from r.providerTypes and the map key and
instead use the stored values on the info object: set
ModelWithProvider.ProviderType = info.ProviderType and ProviderName =
info.ProviderName (and use qualifyPublicModelID(info.ProviderName, modelID) for
Selector) so the category-filtered listing returns the same
provider_type/provider_name as ListModelsWithProvider(); update any variables
like publicProviderName and selector construction to derive from info rather
than the registry maps.

---

Duplicate comments:
In `@internal/fallback/resolver.go`:
- Around line 302-341: matchKeys currently infers provider qualification by
reparsing the rendered selector strings via
selectorHasProvider(core.ParseModelSelector), which misorders keys for bare
model IDs that contain "/" (e.g. meta-llama/Meta-Llama-3-70B). Fix by driving
ordering from explicit provider fields instead: stop using selectorHasProvider
and instead check the parsed/provider fields on the request/resolved selector
objects (e.g. resolution.Requested.Provider,
resolution.ResolvedSelector.Provider) and the resolved source
(source.ProviderName and source.Model.ID) when deciding to prepend
provider-qualified keys; ensure you still include
resolution.RequestedQualifiedModel(), resolution.ResolvedQualifiedModel(),
resolution.Requested.Model and resolution.ResolvedSelector.Model but only treat
them as provider-qualified if the corresponding explicit Provider fields are
non-empty.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 33d74382-eb82-4738-a0e2-730a518e1c70

📥 Commits

Reviewing files that changed from the base of the PR and between 1ac74cb and d8861c1.

📒 Files selected for processing (4)
  • internal/fallback/resolver.go
  • internal/fallback/resolver_test.go
  • internal/providers/registry.go
  • internal/providers/registry_cache_test.go

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
internal/providers/registry.go (2)

157-177: ⚠️ Potential issue | 🟠 Major

Enrich every provider-specific ModelInfo, not just the first winner.

Each provider copy is inserted into newModelsByProvider, but only the first one is added to newModels, which is the only map passed into modeldata.Enrich(). For duplicate model IDs, alternate providers therefore keep Metadata == nil unless ListModels() already returned it. ListModelsWithProvider() now feeds auto fallback selection, and internal/fallback/resolver.go skips nil-metadata candidates, so same-model cross-provider fallbacks disappear in the common “shared model ID” case.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/registry.go` around lines 157 - 177, The code only adds
the first provider's ModelInfo to newModels (the map passed to
modeldata.Enrich), so alternate provider-specific ModelInfo entries in
newModelsByProvider remain unenriched (Metadata == nil); change the enrichment
step to include every provider-specific ModelInfo by flattening
newModelsByProvider (or iterating each provider map) and passing all ModelInfo
instances to modeldata.Enrich (or calling modeldata.Enrich per provider map)
instead of only using newModels, ensuring every ModelInfo (from
newModelsByProvider) is enriched before being kept/returned.

264-279: ⚠️ Potential issue | 🟠 Major

Propagate cached ProviderType beyond the load step.

LoadFromCache() now backfills info.ProviderType, but registryAccessor.GetProviderType() and SaveToCache() still re-derive the type from r.providerTypes[info.Provider]. If the configured type is blank, this new backfill never reaches modeldata.Enrich(), and a later cache save can drop that provider group entirely. Please make the stored field the source of truth once it is populated.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/registry.go` around lines 264 - 279, LoadFromCache
populates ModelInfo.ProviderType but registryAccessor.GetProviderType and
SaveToCache still derive type from r.providerTypes[info.Provider], which can
overwrite or drop the backfilled value; update GetProviderType and SaveToCache
to prefer the stored ProviderType on the ModelInfo (or cachedProv.ProviderType)
when non-empty before falling back to r.providerTypes[info.Provider], and ensure
SaveToCache writes the ProviderType from info.ProviderType so modeldata.Enrich
sees the backfilled value as the source of truth.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@internal/providers/registry.go`:
- Around line 157-177: The code only adds the first provider's ModelInfo to
newModels (the map passed to modeldata.Enrich), so alternate provider-specific
ModelInfo entries in newModelsByProvider remain unenriched (Metadata == nil);
change the enrichment step to include every provider-specific ModelInfo by
flattening newModelsByProvider (or iterating each provider map) and passing all
ModelInfo instances to modeldata.Enrich (or calling modeldata.Enrich per
provider map) instead of only using newModels, ensuring every ModelInfo (from
newModelsByProvider) is enriched before being kept/returned.
- Around line 264-279: LoadFromCache populates ModelInfo.ProviderType but
registryAccessor.GetProviderType and SaveToCache still derive type from
r.providerTypes[info.Provider], which can overwrite or drop the backfilled
value; update GetProviderType and SaveToCache to prefer the stored ProviderType
on the ModelInfo (or cachedProv.ProviderType) when non-empty before falling back
to r.providerTypes[info.Provider], and ensure SaveToCache writes the
ProviderType from info.ProviderType so modeldata.Enrich sees the backfilled
value as the source of truth.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 4fbc5c2b-971d-4c62-a319-3fdb85f24c7b

📥 Commits

Reviewing files that changed from the base of the PR and between d8861c1 and 453a422.

📒 Files selected for processing (4)
  • internal/fallback/resolver.go
  • internal/fallback/resolver_test.go
  • internal/providers/registry.go
  • internal/providers/registry_test.go

@SantiagoDePolonia
Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 26, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@SantiagoDePolonia
Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 26, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@config/config.go`:
- Around line 780-795: The loop building the manual map allows distinct decoded
keys that collide after strings.TrimSpace to silently overwrite each other;
update the code that iterates over decoded (the block using variables manual,
key, models, normalized) to detect if the trimmed key already exists in manual
and return a descriptive error (e.g., "duplicate manual rule key after trimming:
%q") instead of overwriting, preserving the existing behavior for empty keys and
empty model entries.
- Around line 745-761: When normalizing cfg.Overrides, detect collisions caused
by trimming keys: build a temporary set of trimmed keys and if a trimmed key
already exists (e.g., original keys like "gpt-4o" vs " gpt-4o "), return an
error instead of silently overwriting; modify the loop that currently trims keys
and assigns to normalized (referencing cfg.Overrides, normalized, and
FallbackModelOverride/override.Mode and normalizeFallbackMode) to check for
duplicates after strings.TrimSpace(key) and fail with a clear error message
indicating the duplicate trimmed key.

In `@internal/providers/registry_test.go`:
- Around line 1199-1234: The test currently asserts equality of selectors
between ListModelsWithProvider() and ListModelsWithProviderByCategory(), which
relies on implementation-derived values; change it to assert the explicit
expected selector value instead. In
TestListModelsWithProviderByCategory_UsesStoredProviderMetadata, replace the
comparison of filtered[0].Selector == allModels[0].Selector with a direct check
against the expected string (e.g., "public-openai/gpt-4o") computed from
ModelInfo.ProviderName and the model key, keeping the rest of the assertions for
ProviderName and ProviderType; this uses the same symbols:
ListModelsWithProvider, ListModelsWithProviderByCategory, and
ModelInfo.ProviderName.

In `@internal/server/fallback_test.go`:
- Around line 103-294: Add streaming equivalents of the existing fallback tests:
reproduce TestChatCompletion_FallsBackToAlternateModel and
TestResponses_FallsBackToAlternateModel but call the streaming endpoints (the
stream-handling functions: streamChatCompletion / streamResponses or their
handler methods) with stream=true in the request, assert the provider was
attempted first on the primary model then the alternate provider (check
provider.chatCalls or provider.responsesCalls length and ordering), assert the
streamed output contains the fallback content, and assert
core.GetFallbackUsed(c.Request().Context()) is set; also add a test mirroring
TestChatCompletion_DoesNotFallbackWhenExecutionPolicyDisablesFallback to ensure
the streaming fast-path bypass respects the ExecutionPolicy.Fallback=false flag
(i.e., only primary model attempted and appropriate status code returned).

In `@internal/server/translated_inference_service.go`:
- Around line 526-535: The helper providerTypeForSelector currently uses the
primary plan fallback when registry lookup returns empty, which misattributes
fallback traffic; update providerTypeForSelector to first prefer the selector's
explicit provider (e.g., call selector.Provider() or the selector field that
exposes the manually-specified provider) if non-empty, then try
s.provider.GetProviderType(selector.QualifiedModel()) and return that if
non-empty, and only finally return the incoming fallback parameter (after
trimming). Ensure you reference providerTypeForSelector,
selector.QualifiedModel(), selector.Provider() (or the selector method/field
that holds the explicit provider), and s.provider.GetProviderType(...) when
making these checks.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 674eb3fc-83b5-456b-b7ca-2422f0969732

📥 Commits

Reviewing files that changed from the base of the PR and between 453a422 and b5b1a64.

📒 Files selected for processing (19)
  • config/config.example.yaml
  • config/config.go
  • config/config_test.go
  • internal/app/app.go
  • internal/app/app_test.go
  • internal/core/execution_plan.go
  • internal/executionplans/compiler.go
  • internal/executionplans/compiler_test.go
  • internal/executionplans/types.go
  • internal/executionplans/types_test.go
  • internal/providers/registry.go
  • internal/providers/registry_cache_test.go
  • internal/providers/registry_test.go
  • internal/responsecache/simple.go
  • internal/server/fallback_test.go
  • internal/server/handlers.go
  • internal/server/handlers_test.go
  • internal/server/http.go
  • internal/server/translated_inference_service.go

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds translated-route model fallback routing (off/manual/auto) so chat/responses requests can automatically retry on alternate model selectors when the primary is unavailable, while avoiding caching of fallback-served responses and deferring embeddings fallback until vector compatibility is modeled.

Changes:

  • Introduces translated fallback execution in the translated inference service (chat/responses), with error-based eligibility and request context marking for cache skipping.
  • Adds a configurable fallback resolver (manual chains + auto candidates based on model ranking metadata) and wires it through app/server config.
  • Extends model registry enrichment/caching to preserve per-provider metadata (including rankings) needed for auto fallback selection, and adds execution-plan feature flagging for fallback.

Reviewed changes

Copilot reviewed 33 out of 33 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
internal/server/translated_inference_service.go Adds fallback execution helpers for chat/responses, disables embeddings fallback, and marks fallback-used requests for cache skipping.
internal/server/request_model_resolution.go Defines RequestFallbackResolver interface for translated-route fallback selection.
internal/server/http.go Plumbs fallback resolver into server handler construction.
internal/server/handlers.go Stores fallback resolver on Handler and injects into translated inference service.
internal/server/handlers_test.go Updates handler constructor calls to new signature.
internal/server/fallback_test.go Adds handler-level tests for fallback behavior across chat/responses/embeddings.
internal/responsecache/simple.go Skips cache writes when request context indicates fallback was used.
internal/responsecache/semantic.go Skips semantic cache writes when request context indicates fallback was used.
internal/responsecache/handle_request_test.go Adds test ensuring fallback-used responses do not populate caches.
internal/providers/registry.go Persists provider name/type in ModelInfo and enriches provider-specific model metadata.
internal/providers/registry_test.go Adds tests for provider-specific metadata enrichment and category listing behavior.
internal/providers/registry_cache_test.go Adds tests for provider-type behavior when loading/saving registry cache.
internal/modeldata/types.go Improves reverse-index building and extends ranking schema (elo/as_of).
internal/modeldata/merger.go Maps model-list ranking data into core.ModelMetadata.
internal/modeldata/merger_test.go Adds test verifying ranking mapping into metadata.
internal/modeldata/fetcher_test.go Adds test for reverse-index building from provider_model_id and ranking parsing.
internal/fallback/resolver.go Implements fallback resolver (manual + auto via rankings and capabilities).
internal/fallback/resolver_test.go Adds coverage for fallback resolver modes, overrides, embeddings exclusion, and scoring.
internal/executionplans/types.go Adds fallback flag (default true when unset) and runtime conversion helper.
internal/executionplans/compiler.go Uses FeatureFlags.runtimeFeatures() when compiling plans.
internal/executionplans/compiler_test.go Adds tests for fallback defaulting and feature caps behavior.
internal/executionplans/types_test.go Adds test asserting fallback defaults to enabled at runtime.
internal/core/execution_plan.go Adds ExecutionFeatures.Fallback and ExecutionPlan.FallbackEnabled().
internal/core/context.go Adds context marker for “fallback used” to coordinate with cache middleware.
internal/core/types.go Adds ModelMetadata.Rankings and ModelRanking type for rankings data.
internal/app/app.go Wires fallback resolver + adds global fallback feature-cap logic into plan defaults/caps.
internal/app/app_test.go Tests fallback feature caps and default plan input behavior.
config/config.go Adds fallback config schema, env overrides, and JSON loading for manual fallback rules.
config/config_test.go Adds config tests for fallback loading, validation, and example config support.
config/config.example.yaml Documents fallback config options.
config/fallback.example.json Adds example manual fallback chains.
config/failover.yaml Removes obsolete failover config stub.
docs/adr/0004-capability-model-and-provider-attempts.md Updates terminology from “failover” to “fallback”.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +812 to +813
if status >= http.StatusInternalServerError || status == http.StatusTooManyRequests || status == http.StatusNotFound {
return true
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldAttemptFallback will attempt fallback on any 404 (status == http.StatusNotFound) regardless of whether the error is actually model-availability related. Provider 404s can also represent missing endpoints/resources or other misconfiguration, and falling back in those cases can mask the real error and route traffic unexpectedly. Consider removing the unconditional 404 case and relying on the existing code/message heuristics (or gating 404 behind a model-specific heuristic) before attempting fallback.

Copilot uses AI. Check for mistakes.
Comment on lines +159 to +161
Elo: new(1287.0),
Rank: new(3),
AsOf: new("2026-02-01"),
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test uses new(...) with values (e.g., new(1287.0), new(3), new("2026-02-01")). The built-in new requires a type argument (new(T)), so these calls won’t compile unless there is a package-level helper named new (none appears to be defined in this package). Use explicit pointer helpers (e.g., local vars / small ptr helpers) instead of new(...) here.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@config/config_test.go`:
- Around line 547-555: The test currently writes the example config to
filepath.Join(dir, "config.yaml") but the documented and exercised setup expects
config/config.yaml; update the test so the example config is written to
filepath.Join(dir, "config", "config.yaml") (adjust the os.WriteFile call that
creates the exampleData) and ensure the fallback example file remains at
filepath.Join(dir, "config", "fallback.example.json"); change any corresponding
path assumptions in this test so the loader exercises the documented
config/config.yaml resolution (update the os.WriteFile and any related
filepath.Join uses in the test).

In `@config/config.example.yaml`:
- Around line 110-117: The example YAML uses unquoted fallback mode values
(e.g., default_mode: off, mode: manual) which can be parsed as booleans by some
YAML tools; update the example to quote all mode values under the fallback block
(reference keys: fallback, default_mode, overrides, and the per-model mode
entries like "gpt-4o" -> mode) so values are explicit strings and safe to
copy/paste across editors and linters.

In `@config/config.go`:
- Around line 737-770: cfg currently allows DefaultMode or any Overrides to be
set to "manual" even when ManualRulesPath is empty which leads to a resolver
with empty chains at runtime; after normalizing cfg.DefaultMode and each
override (uses normalizeFallbackMode, FallbackModeOff, FallbackModelOverride and
override.Mode.Valid()) add a validation step that if
strings.TrimSpace(cfg.ManualRulesPath) == "" and (cfg.DefaultMode ==
FallbackModeManual or any override.Mode == FallbackModeManual) then return an
error like "fallback.manual_rules_path must be set when fallback.default_mode or
any fallback.overrides[].mode is 'manual'"; only set cfg.Manual = nil and return
nil after this validation passes.

In `@internal/executionplans/types.go`:
- Around line 35-49: The plan hash differs when FeatureFlags.Fallback is omitted
because runtimeFeatures() treats nil as true but normalizePayload() hashes the
raw payload; fix by canonicalizing the fallback before hashing: add a short
canonicalization step (e.g., a FeatureFlags.Canonicalize() or adjust
normalizePayload()) that ensures FeatureFlags.Fallback is set to true when nil
and use that canonicalized FeatureFlags (or the ExecutionFeatures returned by
runtimeFeatures()) when computing plan_hash so semantically identical plans
produce the same hash; reference FeatureFlags.runtimeFeatures(),
normalizePayload(), FeatureFlags, and plan_hash when locating where to apply the
change.

In `@internal/server/fallback_test.go`:
- Around line 343-392: Add an assertion in
TestResponses_FallsBackToAlternateModel to verify that the fallback flag was
set: after calling handler.Responses(c) and checking the status, call
core.GetFallbackUsed(c.Request().Context()) and assert it is true; this mirrors
other tests (e.g., TestChatCompletion_FallsBackToAlternateModel) and ensures
dispatchResponses/markRequestFallbackUsed was invoked when falling back to the
alternate model.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 660f682a-b509-4b8c-9284-28dbb91e02f1

📥 Commits

Reviewing files that changed from the base of the PR and between 453a422 and 5a0f290.

📒 Files selected for processing (20)
  • config/config.example.yaml
  • config/config.go
  • config/config_test.go
  • internal/app/app.go
  • internal/app/app_test.go
  • internal/core/execution_plan.go
  • internal/executionplans/compiler.go
  • internal/executionplans/compiler_test.go
  • internal/executionplans/types.go
  • internal/executionplans/types_test.go
  • internal/providers/registry.go
  • internal/providers/registry_cache_test.go
  • internal/providers/registry_test.go
  • internal/responsecache/simple.go
  • internal/server/fallback_test.go
  • internal/server/handlers.go
  • internal/server/handlers_test.go
  • internal/server/http.go
  • internal/server/translated_inference_service.go
  • internal/server/translated_inference_service_test.go

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
internal/server/fallback_test.go (1)

389-394: ⚠️ Potential issue | 🟡 Minor

Strengthen non-stream Responses fallback assertions.

Line 389–394 verifies attempt count and fallback flag, but it still doesn’t prove the fallback selector order or returned fallback payload for this path.

💚 Proposed test hardening
 	if len(provider.responsesCalls) != 2 {
 		t.Fatalf("responses calls = %v, want 2 attempts", provider.responsesCalls)
 	}
+	if provider.responsesCalls[0] != "gpt-4o" || provider.responsesCalls[1] != "azure/gpt-4o" {
+		t.Fatalf("responses calls = %v, want [gpt-4o azure/gpt-4o]", provider.responsesCalls)
+	}
+	if !strings.Contains(rec.Body.String(), "fallback response") {
+		t.Fatalf("response body = %s, want fallback content", rec.Body.String())
+	}
 	if !core.GetFallbackUsed(c.Request().Context()) {
 		t.Fatal("expected request context to be marked as fallback-used")
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/server/fallback_test.go` around lines 389 - 394, The test currently
only checks responsesCalls and core.GetFallbackUsed but should also verify the
fallback selector order and the actual payload returned for the non-stream
Responses path: after the call, assert that the provider recorded the selector
invocation sequence (e.g., check provider.requestedSelectors or
provider.selectorCalls contains the expected primary-then-fallback order), and
assert the HTTP response body/parsed Responses payload equals the expected
fallback payload (and optionally content-type/status) to ensure the fallback
return is correct; update the assertions around provider.responsesCalls and
core.GetFallbackUsed(c.Request().Context()) to also validate the selector call
sequence and the exact returned payload.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@config/config.example.yaml`:
- Around line 112-115: The comment for manual_rules_path is misleading about
optionality; update the example comment and/or text so it states that the JSON
file is optional only if no model override uses manual mode, and that when any
override sets overrides:"gpt-4o".mode (or any overrides.*.mode) to "manual" a
valid manual_rules_path must be provided; change the inline note next to
manual_rules_path to clearly state this requirement and/or add a brief sentence
under overrides indicating that manual mode requires a rules file.

In `@config/config.go`:
- Around line 788-810: The current json.Unmarshal into variable decoded loses
earlier duplicate JSON keys; before unmarshalling, use a json.Decoder to stream
the top-level object from expandString(string(raw)) and detect duplicate raw
keys so you can reject the file early. Concretely: create a json.Decoder over
strings.NewReader(expandString(string(raw))), require the first token is '{',
iterate by reading each key token (as string), check a seenKeys
map[string]struct{} for duplicates and return an error if a key repeats (before
trimming), then decode the corresponding value into a temporary []string (or raw
json.RawMessage then unmarshal) and populate the existing decoded/manual logic
(retain trimming and duplicate-after-trim checks) using the same identifiers
decoded, manual, raw, path so the rest of the function can remain unchanged.

---

Duplicate comments:
In `@internal/server/fallback_test.go`:
- Around line 389-394: The test currently only checks responsesCalls and
core.GetFallbackUsed but should also verify the fallback selector order and the
actual payload returned for the non-stream Responses path: after the call,
assert that the provider recorded the selector invocation sequence (e.g., check
provider.requestedSelectors or provider.selectorCalls contains the expected
primary-then-fallback order), and assert the HTTP response body/parsed Responses
payload equals the expected fallback payload (and optionally
content-type/status) to ensure the fallback return is correct; update the
assertions around provider.responsesCalls and
core.GetFallbackUsed(c.Request().Context()) to also validate the selector call
sequence and the exact returned payload.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 0ff29072-187d-4139-948a-f24ebeae1ad2

📥 Commits

Reviewing files that changed from the base of the PR and between 5a0f290 and 26ff193.

📒 Files selected for processing (6)
  • config/config.example.yaml
  • config/config.go
  • config/config_test.go
  • internal/executionplans/types.go
  • internal/executionplans/types_test.go
  • internal/server/fallback_test.go

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@config/config.go`:
- Around line 817-821: Decoding into a []string accepts JSON null and yields a
nil slice, which silently disables the rule; in the fallback.manual_rules_path
parsing (where decoder.Decode(&models) is called and result assigned to
decoded[key]) explicitly reject null by first decoding into an interface{} or
using decoder.Token/RawMessage to detect a JSON null, and if null return a
fmt.Errorf indicating "fallback.manual_rules_path: null not allowed for %q"
(including path); only after confirming the value is a JSON array decode/convert
it into []string and assign to decoded[key].

In `@internal/server/fallback_test.go`:
- Around line 411-453: Add two negative tests for the Responses endpoint
mirroring existing ChatCompletion tests: create
TestResponses_DoesNotFallbackOnNonAvailabilityError and
TestResponses_DoesNotFallbackWhenExecutionPolicyDisablesFallback that follow the
pattern of TestResponses_StreamFallsBackToAlternateModel; use the same
fallbackProvider/test handler setup but configure responsesErrors to return a
non-availability error (not core.NewNotFoundError) for the primary model in the
first test and assert that the handler does not attempt the alternate model and
that fallback is not marked in context, and in the second test configure the
execution policy to disable fallback (use the fallbackResolverStub or policy
stub used elsewhere) and assert no fallback attempt and no fallback-used context
flag; reference handler.Responses, fallbackProvider, fallbackResolverStub and
core.GetFallbackUsed to locate and structure the tests.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 328c1e1c-622a-4e6c-9026-caedb87a4730

📥 Commits

Reviewing files that changed from the base of the PR and between 26ff193 and 378521d.

📒 Files selected for processing (4)
  • config/config.example.yaml
  • config/config.go
  • config/config_test.go
  • internal/server/fallback_test.go

Comment on lines +817 to +821
var models []string
if err := decoder.Decode(&models); err != nil {
return fmt.Errorf("fallback.manual_rules_path: failed to parse %q: %w", path, err)
}
decoded[key] = models
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

In Go's encoding/json package, does decoding JSON nullinto a[]string return a nil slice without an error?

💡 Result:

Yes, decoding JSON null into a []string using Go's encoding/json package (via Unmarshal or Decode) sets the slice to nil and returns no error.

Citations:


🏁 Script executed:

cat -n config/config.go | sed -n '800,850p'

Repository: ENTERPILOT/GOModel

Length of output: 2158


🏁 Script executed:

# Find the function containing lines 817-821
rg -n "func " config/config.go | head -20

Repository: ENTERPILOT/GOModel

Length of output: 940


🏁 Script executed:

# Check if core.GatewayError is used elsewhere in this file
rg -n "core\.GatewayError\|GatewayError" config/config.go | head -10

Repository: ENTERPILOT/GOModel

Length of output: 44


🏁 Script executed:

# Check how decoded is used after this code
rg -n "decoded\[" config/config.go | head -20

Repository: ENTERPILOT/GOModel

Length of output: 89


🏁 Script executed:

# Find the function signature and see how it's called
rg -B5 "^func loadFallbackConfig" config/config.go | head -20

Repository: ENTERPILOT/GOModel

Length of output: 147


🏁 Script executed:

# Find where loadFallbackConfig is called from
rg -n "loadFallbackConfig" config/config.go

Repository: ENTERPILOT/GOModel

Length of output: 181


🏁 Script executed:

# Check the Load() function which likely handles overall config loading
sed -n '649,750p' config/config.go | cat -n

Repository: ENTERPILOT/GOModel

Length of output: 3314


🏁 Script executed:

# Check imports at the top of config.go
head -50 config/config.go | grep -E "^import|^$"

Repository: ENTERPILOT/GOModel

Length of output: 77


🏁 Script executed:

# Verify what happens at lines 850-870 with the decoded models
sed -n '850,870p' config/config.go | cat -n

Repository: ENTERPILOT/GOModel

Length of output: 785


🏁 Script executed:

# Check if a nil/empty models slice causes actual issues downstream
sed -n '841,865p' config/config.go | cat -n

Repository: ENTERPILOT/GOModel

Length of output: 991


Reject null manual-rule values to avoid silent fallback disablement.

Decoding directly into []string accepts JSON null without error, resulting in a nil slice. This nil slice then produces an empty normalized array downstream, silently disabling that rule instead of rejecting the configuration.

💡 Proposed fix
-		var models []string
-		if err := decoder.Decode(&models); err != nil {
-			return fmt.Errorf("fallback.manual_rules_path: failed to parse %q: %w", path, err)
-		}
-		decoded[key] = models
+		var rawModels json.RawMessage
+		if err := decoder.Decode(&rawModels); err != nil {
+			return fmt.Errorf("fallback.manual_rules_path: failed to parse %q: %w", path, err)
+		}
+
+		trimmedModels := strings.TrimSpace(string(rawModels))
+		if trimmedModels == "null" || !strings.HasPrefix(trimmedModels, "[") {
+			return fmt.Errorf("fallback.manual_rules_path: key %q in %q must map to an array of strings", key, path)
+		}
+
+		var models []string
+		if err := json.Unmarshal(rawModels, &models); err != nil {
+			return fmt.Errorf("fallback.manual_rules_path: key %q in %q must map to an array of strings: %w", key, path, err)
+		}
+		decoded[key] = models
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@config/config.go` around lines 817 - 821, Decoding into a []string accepts
JSON null and yields a nil slice, which silently disables the rule; in the
fallback.manual_rules_path parsing (where decoder.Decode(&models) is called and
result assigned to decoded[key]) explicitly reject null by first decoding into
an interface{} or using decoder.Token/RawMessage to detect a JSON null, and if
null return a fmt.Errorf indicating "fallback.manual_rules_path: null not
allowed for %q" (including path); only after confirming the value is a JSON
array decode/convert it into []string and assign to decoded[key].

Comment on lines +411 to +453
func TestResponses_StreamFallsBackToAlternateModel(t *testing.T) {
provider := &fallbackProvider{
responsesStreams: map[string]string{
"azure/gpt-4o": "data: {\"type\":\"response.output_text.delta\",\"delta\":\"fallback response\"}\n\ndata: [DONE]\n\n",
},
responsesErrors: map[string]error{
"gpt-4o": core.NewNotFoundError("model not found"),
},
supportedModels: map[string]string{
"gpt-4o": "openai",
"azure/gpt-4o": "azure",
},
}

handler := newHandler(provider, nil, nil, nil, nil, nil, fallbackResolverStub{
selectors: []core.ModelSelector{{Provider: "azure", Model: "gpt-4o"}},
}, nil)

e := echo.New()
req := httptest.NewRequest(http.MethodPost, "/v1/responses", strings.NewReader(`{"model":"gpt-4o","stream":true,"input":"hi"}`))
req.Header.Set("Content-Type", "application/json")
rec := httptest.NewRecorder()
c := e.NewContext(req, rec)

if err := handler.Responses(c); err != nil {
t.Fatalf("handler.Responses() error = %v", err)
}
if rec.Code != http.StatusOK {
t.Fatalf("status = %d, want %d body=%s", rec.Code, http.StatusOK, rec.Body.String())
}
if len(provider.responsesCalls) != 2 {
t.Fatalf("responses calls = %v, want 2 attempts", provider.responsesCalls)
}
if provider.responsesCalls[0] != "gpt-4o" || provider.responsesCalls[1] != "azure/gpt-4o" {
t.Fatalf("responses calls = %v, want [gpt-4o azure/gpt-4o]", provider.responsesCalls)
}
if !strings.Contains(rec.Body.String(), "fallback response") {
t.Fatalf("response body = %s, want fallback stream content", rec.Body.String())
}
if !core.GetFallbackUsed(c.Request().Context()) {
t.Fatal("expected request context to be marked as fallback-used")
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Consider adding negative tests for Responses endpoint.

The ChatCompletion tests include TestChatCompletion_DoesNotFallbackOnNonAvailabilityError and TestChatCompletion_DoesNotFallbackWhenExecutionPolicyDisablesFallback, but there are no equivalent tests for the Responses endpoint. Adding these would ensure consistent coverage across both endpoints.

Missing test scenarios:

  • TestResponses_DoesNotFallbackOnNonAvailabilityError
  • TestResponses_DoesNotFallbackWhenExecutionPolicyDisablesFallback
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/server/fallback_test.go` around lines 411 - 453, Add two negative
tests for the Responses endpoint mirroring existing ChatCompletion tests: create
TestResponses_DoesNotFallbackOnNonAvailabilityError and
TestResponses_DoesNotFallbackWhenExecutionPolicyDisablesFallback that follow the
pattern of TestResponses_StreamFallsBackToAlternateModel; use the same
fallbackProvider/test handler setup but configure responsesErrors to return a
non-availability error (not core.NewNotFoundError) for the primary model in the
first test and assert that the handler does not attempt the alternate model and
that fallback is not marked in context, and in the second test configure the
execution policy to disable fallback (use the fallbackResolverStub or policy
stub used elsewhere) and assert no fallback attempt and no fallback-used context
flag; reference handler.Responses, fallbackProvider, fallbackResolverStub and
core.GetFallbackUsed to locate and structure the tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants