implementation(PR-98): model aliases#99
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds model-alias support: configuration and docs, a new AliasResolver that maps aliases to actual models and discovers endpoints, propagation of endpoint→model maps in request context, and request-body rewriting of the top-level "model" field before proxying to backend endpoints. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Handler as Application/Handler
participant Resolver as AliasResolver
participant Proxy as ProxyService
participant Rewriter as core.RewriteModelForAlias
participant Backend as ModelEndpoint
Client->>Handler: POST /v1/chat/completions (model: alias)
Handler->>Resolver: ResolveEndpoints(alias)
Resolver-->>Handler: map(endpointURL -> actualModel)
Handler->>Proxy: forward request (with alias-map in context)
Proxy->>Rewriter: RewriteModelForAlias(ctx, req, endpoint)
rect rgba(200,230,255,0.5)
Rewriter-->>Proxy: modified request body (model: actualModel)
end
Proxy->>Backend: proxied request with actualModel
Backend-->>Proxy: response
Proxy-->>Client: response
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@internal/adapter/proxy/core/model_rewrite_test.go`:
- Around line 247-267: The test TestRewriteModelForAlias_BodyRereadable
overpromises by only reading the request body once after calling
RewriteModelForAlias; either rename the test to indicate it only verifies the
body is readable after rewrite (e.g.,
TestRewriteModelForAlias_BodyReadableAfterRewrite) or actually perform a second
read by resetting the body position (use the bytes.Reader created in
RewriteModelForAlias via io.NopCloser(bytes.NewReader(...)) — obtain a new
bytes.Reader or type-assert to enable Seek or re-wrap the same bytes into a new
io.NopCloser before the second read) so the test truly verifies multiple reads.
Ensure you update the test name or add the second read and assertions
accordingly while referencing TestRewriteModelForAlias_BodyRereadable and
RewriteModelForAlias.
🧹 Nitpick comments (4)
internal/config/types.go (1)
24-35: Consider adding validation forModelAliasesentries.Currently there's no config-level validation — an alias with an empty model list (e.g.,
gpt-oss-120b: []) or an empty alias key would silently pass through. AValidate()method onConfig(or within the existing validation flow) could catch misconfigurations early and provide clear error messages at startup.internal/app/handlers/handler_proxy.go (1)
406-455: Alias resolution routing logic looks correct and well-structured.The fallback to standard routing when alias resolution yields no endpoints (lines 415-429) is a sensible resilience pattern. The method correctly filters candidates against resolved endpoints and stores the rewrite map for downstream use. Comments use appropriate Australian English ("recognises").
One minor note: on line 416, when
erris nil butendpointToModelis empty, the log message includes"error", errwhich will emit"error", <nil>. This is cosmetic but slightly noisy.Optional: omit error from log when nil
- if err != nil || len(endpointToModel) == 0 { - logger.Warn("Model alias resolved to no endpoints, falling back to standard routing", - "alias", aliasName, - "error", err) + if err != nil || len(endpointToModel) == 0 { + logFields := []any{"alias", aliasName} + if err != nil { + logFields = append(logFields, "error", err) + } + logger.Warn("Model alias resolved to no endpoints, falling back to standard routing", logFields...)internal/adapter/proxy/core/model_rewrite.go (1)
56-83: JSON re-serialisation reorders keys and strips formatting.
rewriteModelFieldunmarshals intomap[string]json.RawMessageand re-marshals, which produces compact JSON with keys in an undefined order. This is functionally correct for the model field rewrite, but it does alter the byte-level representation of the entire request body. If any downstream component or backend is sensitive to key ordering or whitespace, this could cause subtle issues.This is likely acceptable given that all consumers are JSON parsers, but worth noting.
internal/adapter/registry/alias.go (1)
13-19: Remove the unusedreverseIndexfield fromAliasResolver.The
reverseIndexfield is populated in the constructor but never read by any method in the type. None of the methods (IsAlias,GetActualModels,ResolveEndpoints) reference it—they all use only thealiasesmap. Removing this field will reduce the struct size and eliminate dead code that serves no purpose.
| // Ensure the http.Request's Body can be read multiple times after rewrite | ||
| func TestRewriteModelForAlias_BodyRereadable(t *testing.T) { | ||
| aliasMap := map[string]string{ | ||
| "http://ollama:11434": "rewritten-model", | ||
| } | ||
| ctx := context.WithValue(context.Background(), constants.ContextModelAliasMapKey, aliasMap) | ||
| body := `{"model": "original-model"}` | ||
| r, _ := http.NewRequestWithContext(ctx, "POST", "/v1/chat/completions", bytes.NewBufferString(body)) | ||
| r.ContentLength = int64(len(body)) | ||
|
|
||
| endpoint := &domain.Endpoint{URLString: "http://ollama:11434"} | ||
| RewriteModelForAlias(ctx, r, endpoint) | ||
|
|
||
| // Read body first time | ||
| body1, err := io.ReadAll(r.Body) | ||
| if err != nil { | ||
| t.Fatalf("first read failed: %v", err) | ||
| } | ||
| if len(body1) == 0 { | ||
| t.Error("first read should return non-empty body") | ||
| } |
There was a problem hiding this comment.
Test name overpromises — body is only read once.
TestRewriteModelForAlias_BodyRereadable claims to verify the body can be read multiple times, but it only performs a single read. Since RewriteModelForAlias sets the body to io.NopCloser(bytes.NewReader(...)), a bytes.Reader does support Seek, but it's not exposed through the io.ReadCloser interface. Either rename the test to reflect what it actually checks (e.g., _BodyReadableAfterRewrite) or add a second read after seeking.
🤖 Prompt for AI Agents
In `@internal/adapter/proxy/core/model_rewrite_test.go` around lines 247 - 267,
The test TestRewriteModelForAlias_BodyRereadable overpromises by only reading
the request body once after calling RewriteModelForAlias; either rename the test
to indicate it only verifies the body is readable after rewrite (e.g.,
TestRewriteModelForAlias_BodyReadableAfterRewrite) or actually perform a second
read by resetting the body position (use the bytes.Reader created in
RewriteModelForAlias via io.NopCloser(bytes.NewReader(...)) — obtain a new
bytes.Reader or type-assert to enable Seek or re-wrap the same bytes into a new
io.NopCloser before the second read) so the test truly verifies multiple reads.
Ensure you update the test name or add the second read and assertions
accordingly while referencing TestRewriteModelForAlias_BodyRereadable and
RewriteModelForAlias.
- Rename BodyRereadable test to BodyReadableAfterRewrite (test only reads once) - Remove unused reverseIndex field from AliasResolver - Omit nil error from log when alias resolves to no endpoints
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@internal/app/handlers/handler_proxy.go`:
- Around line 406-458: The alias-success path in resolveAliasEndpoints doesn't
set profile.RoutingDecision, so downstream code (executeProxyRequest,
logRequestResult) lacks routing metadata; before returning aliasEndpoints,
synthesize and assign a lightweight domain.RoutingDecision to
profile.RoutingDecision (e.g., populate the selected endpoint/URL, a strategy or
reason like "alias", and any minimal fields used by logging/metrics), reusing
domain.RoutingDecision type and keys (and keep
profile.SetInspectionMeta(endpointToModel) as-is) so observability consumers
receive consistent routing info for alias-routed requests.
- Around line 435-440: The lookup inconsistency can cause silent mismatches:
update all map key uses and comparisons to consistently use the getter rather
than direct field access—replace any direct accesses to endpoint.URLString with
endpoint.GetURLString() (including where endpointToModel is populated and where
it's queried, e.g., the aliasEndpoints loop and the other occurrence around the
earlier reported URLString use) so both map keys and lookups use the same
normalized value provided by GetURLString().
🧹 Nitpick comments (1)
internal/adapter/proxy/core/model_rewrite_test.go (1)
154-199: Consider adding a table case where the new model name contains characters that need JSON escaping.Model names with characters like
"or\would exercise whetherrewriteModelFieldproduces valid JSON after substitution. Not blocking, just a hardening suggestion.
| var aliasEndpoints []*domain.Endpoint | ||
| for _, endpoint := range candidates { | ||
| if _, ok := endpointToModel[endpoint.GetURLString()]; ok { | ||
| aliasEndpoints = append(aliasEndpoints, endpoint) | ||
| } | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if GetURLString() simply returns URLString or does something else
ast-grep --pattern $'func ($_ *Endpoint) GetURLString() string {
$$$
}'Repository: thushan/olla
Length of output: 228
🏁 Script executed:
#!/bin/bash
# Find ResolveEndpoints function and how endpointToModel is built
rg -A 30 "ResolveEndpoints" internal/app/handlers/handler_proxy.go | head -80Repository: thushan/olla
Length of output: 1170
🏁 Script executed:
#!/bin/bash
# Find the ResolveEndpoints method implementation
rg -B 5 "func.*ResolveEndpoints.*{" --type go | head -50Repository: thushan/olla
Length of output: 169
🏁 Script executed:
#!/bin/bash
# Find ResolveEndpoints implementation
rg "func.*ResolveEndpoints" --type go -A 20 | head -100Repository: thushan/olla
Length of output: 5997
🏁 Script executed:
#!/bin/bash
# Get full ResolveEndpoints implementation
sed -n '1,500p' internal/adapter/registry/alias.go | grep -A 50 "func.*ResolveEndpoints"Repository: thushan/olla
Length of output: 1209
🏁 Script executed:
#!/bin/bash
# Find GetEndpointsForModel implementation
rg "func.*GetEndpointsForModel" --type go -A 15Repository: thushan/olla
Length of output: 15046
🏁 Script executed:
#!/bin/bash
# Get more context from MemoryModelRegistry
rg -A 30 "func.*MemoryModelRegistry.*GetEndpointsForModel" internal/adapter/registry/memory_registry.goRepository: thushan/olla
Length of output: 890
🏁 Script executed:
#!/bin/bash
# Find RegisterModel implementation
rg -A 40 "func.*MemoryModelRegistry.*RegisterModel" internal/adapter/registry/memory_registry.go | head -60Repository: thushan/olla
Length of output: 2041
🏁 Script executed:
#!/bin/bash
# Get full RegisterModel function
sed -n '/func.*MemoryModelRegistry.*RegisterModel/,/^func/p' internal/adapter/registry/memory_registry.go | head -100Repository: thushan/olla
Length of output: 1859
🏁 Script executed:
#!/bin/bash
# Find calls to RegisterModel that might use endpoint.URLString or GetURLString
rg "RegisterModel.*endpoint\." --type go -B 2 -A 2 | head -50Repository: thushan/olla
Length of output: 919
🏁 Script executed:
#!/bin/bash
# Check line 566 and surrounding context in handler_proxy.go
sed -n '560,575p' internal/app/handlers/handler_proxy.goRepository: thushan/olla
Length of output: 550
🏁 Script executed:
#!/bin/bash
# Find Endpoint struct definition
rg -A 20 "type Endpoint struct" internal/core/domain/endpoint.goRepository: thushan/olla
Length of output: 739
🏁 Script executed:
#!/bin/bash
# Find where URLString is assigned
rg "URLString\s*=" internal/core/domain/endpoint.go -B 2 -A 1Repository: thushan/olla
Length of output: 38
🏁 Script executed:
#!/bin/bash
# Search for Endpoint construction
rg "URLString\s*=" --type go -B 2 -A 1 | grep -A 3 -B 3 "URLString"Repository: thushan/olla
Length of output: 2248
🏁 Script executed:
#!/bin/bash
# Search for Endpoint construction
rg "&domain.Endpoint\{" --type go -A 10 | head -100Repository: thushan/olla
Length of output: 6932
🏁 Script executed:
#!/bin/bash
# Search for modifications to endpoint.URL
rg "endpoint\.URL\s*=" --type go -B 2 -A 1Repository: thushan/olla
Length of output: 459
Standardise URL lookups to prevent silent endpoint mismatches.
The code currently uses both endpoint.GetURLString() and endpoint.URLString inconsistently—some places call the getter method (line 437 here) whilst others access the field directly (line 566). Whilst GetURLString() presently returns the field unchanged, this pattern is fragile. If the getter is later modified to apply normalisation (e.g., trailing slash handling, scheme normalisation), the lookup into endpointToModel would silently miss endpoints. Ensure consistent use of one approach throughout the codebase, preferably via the getter method for maintainability.
🤖 Prompt for AI Agents
In `@internal/app/handlers/handler_proxy.go` around lines 435 - 440, The lookup
inconsistency can cause silent mismatches: update all map key uses and
comparisons to consistently use the getter rather than direct field
access—replace any direct accesses to endpoint.URLString with
endpoint.GetURLString() (including where endpointToModel is populated and where
it's queried, e.g., the aliasEndpoints loop and the other occurrence around the
earlier reported URLString use) so both map keys and lookups use the same
normalized value provided by GetURLString().
- Set RoutingDecision on alias happy path for response headers/metrics - Standardise URL lookups: endpoint.URLString → endpoint.GetURLString()
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@docs/content/concepts/model-aliases.md`:
- Around line 193-198: Replace the inaccurate troubleshooting bullet "Endpoint
URL in the alias mapping doesn't match the discovered endpoint" with wording
that reflects alias → model-name mapping (e.g., "The actual model name in the
alias list doesn't exactly match the model name reported by the backend") so
readers understand aliases map names to models rather than endpoint URLs; update
the sentence in the model-aliases.md section where that bullet appears.
- Around line 40-65: The fenced ASCII-art block starting with the line
containing Client request: "model": "my-llama" should include a language
specifier to satisfy markdownlint MD040; update the opening triple-backtick that
begins that diagram to ```text so the block becomes a text-fenced code block
(leave the diagram content unchanged and only modify the opening fence).
🧹 Nitpick comments (1)
internal/app/handlers/handler_proxy.go (1)
362-365: Comment describes "what" rather than "why".The coding guidelines ask comments to explain why rather than what. Consider something like:
// aliases need separate resolution because they map to multiple actual model names across backends.As per coding guidelines,
**/*.go: "Comment on why rather than what in code".✏️ Suggested comment rewording
- // check for model alias before standard routing + // aliases map one name to multiple backend-specific models, so they + // need dedicated resolution rather than the standard single-model path if a.aliasResolver != nil && a.aliasResolver.IsAlias(profile.ModelName) {
- Add language specifier to ASCII-art code block (```text) - Fix inaccurate troubleshooting cause: aliases map model names, not endpoint URLs
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@docs/content/concepts/model-aliases.md`:
- Around line 196-197: Update the first bullet text in the list that currently
reads "Actual model name in the alias list doesn't exactly match the model name
reported by the backend" to include the missing article at the start so it reads
"The actual model name in the alias list doesn’t exactly match the model name
reported by the backend"; locate and edit the exact bullet string in
docs/content/concepts/model-aliases.md (the list entry "Actual model name in the
alias list doesn't exactly match the model name reported by the backend") and
replace it with the corrected sentence.
- Around line 32-34: Replace the phrase "the exact same model" with the more
concise "the same model" in the paragraph that reads "Without aliases, a client
request for `llama3.1:8b` would only match the Ollama endpoint — even though the
other backends have the exact same model." Update that sentence so it reads
"...even though the other backends have the same model," preserving the
surrounding wording and inline code `llama3.1:8b`.
| - Actual model name in the alias list doesn't exactly match the model name reported by the backend | ||
| - Request body is not JSON |
There was a problem hiding this comment.
Add missing article for readability.
Suggested: “The actual model name in the alias list doesn’t exactly match the model name reported by the backend.”
🧰 Tools
🪛 LanguageTool
[uncategorized] ~196-~196: Possible missing article found.
Context: ...al model name. Possible Causes: - Actual model name in the alias list doesn't ex...
(AI_HYDRA_LEO_MISSING_THE)
🤖 Prompt for AI Agents
In `@docs/content/concepts/model-aliases.md` around lines 196 - 197, Update the
first bullet text in the list that currently reads "Actual model name in the
alias list doesn't exactly match the model name reported by the backend" to
include the missing article at the start so it reads "The actual model name in
the alias list doesn’t exactly match the model name reported by the backend";
locate and edit the exact bullet string in
docs/content/concepts/model-aliases.md (the list entry "Actual model name in the
alias list doesn't exactly match the model name reported by the backend") and
replace it with the corrected sentence.
|
Will it at some point stop complaining? 😁 |
|
Great work on the PR, haven't had a chance for a full review yet (on holidays till tomorrow!). To make it clean, thinking of putting these in the models.yaml might be better/cleaner, thoughts? I'll review this tomorrow once I have a dev environment, thanks again, I'll make a fix for the CI in main you can rebase on tomorrow too (for |
|
ah, if you run better align on your branch and get it to fix it for you, and push, you'll be fine. Alternatively, there's the |
I'm feeling like the models.yaml is somewhat of an internal mapping config; e.g. I don't use it and only have the config.yaml mapped outside of my docker container. Moving the logic into the models.yaml would mean that everybody that wants to use it needs to copy the existing one from https://github.com/thushan/olla/blob/main/config/models.yaml and adjust it to their specific needs, right? Otherwise they would lose all "default" settings, aren't they?
I'll have a look this evening and get back to you :) |
Apologies I misunderstood, I thought you were going to add default aliases to common models across HF etc (that we'd ship). This was a royal pain to keep up to date. I agree in that case because users will have their own mappings. I've given this to our existing pool of test boxes to use and setup aliases, i think one of the folks will set this up tomorrow and we'll use them over the weekend as an initial test across multiple backends (and lots of models). Great contribution, thanks. |
|
Hey @thushan, sorry for the late reply. I did the betteralign recommendations. Tell me, once your test boxes were successful, how to proceed, thanks :) |
|
No worries, nothing left to do on our part, we'll let the tests happen over the weekend and hopefully be ready for a release on Monday with some of the other PRs. |
|
This is looking good, couple of suggestions:
thinking something like: that we can call at startup. From my brief view (I will have a proper test tonight after this was raised), it feels like because we unmarshal and marshal again (to substitute the model name), the top level keys are reordered and whitespace is removed (so compact JSON). It would be semantically identical but not byte identical (ignoring model name change) to the original payload? I say this because we logged requests and the JSONL's came in very different (so our internal validator rated this very different threshold wise vs 5-8% difference, which we expect with model name change). |
…JSON rewrite - Add ValidateModelAliases() to Config with startup validation in main.go - Add handler_proxy_alias_test.go with 6 tests for resolveAliasEndpoints - Rewrite rewriteModelField to use regex substitution preserving JSON formatting - Add byte-identity and validation tests
|
Thanks for the feedback @thushan! Went through your suggestions and tackled all three. Tests for Added
New method on Byte-identical JSON rewrite Good catch on the key reordering issue. Swapped the unmarshal/marshal approach for a regex that only touches the |
|
Thanks, I noticed some failures with this new round when I tried this yesterday (mostly fuzz tests for model matching). The regex might be the cause, let's see if the CodeRabbit hops onto it, I will dig it up otherwise. @coderabbitai review |
🧩 Analysis chain
🏁 Script executed: Length of output: 49772 🏁 Script executed: Length of output: 12180
The ProblemThe regex approach has a subtle but critical issue: byte-stream ordering vs. JSON structure mismatch. The code validates that var parsed map[string]json.RawMessage
if err := json.Unmarshal(body, &parsed); err != nil {
return body
}
if _, hasModel := parsed["model"]; !hasModel {
return body
}But then the regex Failure case:{
"messages": [{"role": "system", "content": "Use model: \"gpt-4\""}],
"model": "actual-alias"
}
Additional fuzz-test triggers:
The FixReplace the regex with a stateful byte-scanner that finds the top-level func rewriteModelField(body []byte, newModel string) []byte {
// Validate JSON structure first
var parsed map[string]json.RawMessage
if err := json.Unmarshal(body, &parsed); err != nil || parsed["model"] == nil {
return body
}
// Locate the top-level "model" key in the byte stream
// by scanning with a simple state machine that tracks nesting depth
const (
stateKey = 0
stateValue = 1
stateString = 2
)
var (
state = stateKey
depth = 0
inString = false
escaped = false
modelStart = -1
modelEnd = -1
keyStart = -1
)
for i := 0; i < len(body); i++ {
b := body[i]
if escaped {
escaped = false
continue
}
if b == '\\' && inString {
escaped = true
continue
}
if b == '"' {
inString = !inString
if depth == 1 && !inString && state == stateKey {
// Just finished reading a key at depth 1
if keyStart >= 0 && string(body[keyStart:i+1]) == `"model"` {
// Found top-level "model" key, now find its value
state = stateValue
}
keyStart = -1
} else if depth == 1 && inString && state == stateKey {
keyStart = i
} else if depth == 1 && !inString && state == stateValue {
// Just finished reading the model value
modelEnd = i + 1
break
} else if depth == 1 && inString && state == stateValue {
modelStart = i
}
continue
}
if inString {
continue
}
switch b {
case '{', '[':
depth++
if depth == 1 {
state = stateKey
}
case '}', ']':
depth--
case ':':
if depth == 1 && state == stateKey {
state = stateValue
}
case ',':
if depth == 1 {
state = stateKey
}
}
}
if modelStart < 0 || modelEnd < 0 {
return body
}
// Build the rewritten body
var buf bytes.Buffer
buf.Write(body[:modelStart])
buf.WriteByte('"')
escapedModel, _ := json.Marshal(newModel)
buf.Write(escapedModel[1 : len(escapedModel)-1]) // strip quotes
buf.WriteByte('"')
buf.Write(body[modelEnd:])
return buf.Bytes()
}This approach:
Simpler AlternativeIf byte-identical formatting isn't critical, just use standard JSON marshal/unmarshal: func rewriteModelField(body []byte, newModel string) []byte {
var parsed map[string]interface{}
if err := json.Unmarshal(body, &parsed); err != nil {
return body
}
if _, ok := parsed["model"]; !ok {
return body
}
parsed["model"] = newModel
rewritten, err := json.Marshal(parsed)
if err != nil {
return body
}
return rewritten
}This loses whitespace/formatting but is bulletproof. Given the JSONL validator threshold issue you mentioned (expecting 5-8% diff but seeing more), preserving formatting might not be working anyway since Let me know which direction you prefer and I can provide a complete drop-in replacement with tests. ✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (3)
internal/adapter/proxy/core/model_rewrite_test.go (1)
154-199: Consider adding test cases for model names containing special JSON characters.The
jsonEscapeStringhelper handles RFC 8259 escaping, but there are no tests exercising it with adversarial input — e.g., model names containing quotes (my"model), backslashes (path\to\model), or control characters. A few extra table entries would increase confidence in the escaping path.Suggested additional test cases
{ name: "preserves other fields", body: `{"model": "old", "temperature": 0.7, "messages": []}`, newModel: "new", wantModel: "new", }, + { + name: "model name with quotes", + body: `{"model": "old-model"}`, + newModel: `my"model`, + wantModel: `my"model`, + }, + { + name: "model name with backslash", + body: `{"model": "old-model"}`, + newModel: `path\to\model`, + wantModel: `path\to\model`, + },🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/adapter/proxy/core/model_rewrite_test.go` around lines 154 - 199, Add test cases to TestRewriteModelField that exercise jsonEscapeString by including model names with special JSON characters (e.g., quotes like my"model, backslashes like path\to\model, and control characters). For each new table entry, set body to include the original model value, set newModel to the adversarial string, and assert via the existing rewriteModelField -> json.Unmarshal parsed["model"] path that the final modelName equals the expected escaped/unescaped value; reference the test table in TestRewriteModelField and the jsonEscapeString/rewriteModelField helpers to locate where to add these entries.internal/adapter/proxy/core/model_rewrite.go (1)
46-48:ContentLength == 0may be too broad — consider also checking for-1(unknown length) bodies.When
ContentLengthis0, the early return is correct. However, note that chunked or streaming request bodies may arrive withContentLength == -1. The current code handles this correctly (the condition is false, so we proceed to read), but the comment on Line 33 says "The request has no body" — it would be worth noting thatContentLength == 0specifically means an empty body, not an absent one (that's ther.Body == nilbranch).This is a documentation nit — the logic itself is correct.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/adapter/proxy/core/model_rewrite.go` around lines 46 - 48, Update the explanatory comment around the early-return check that uses r.Body and r.ContentLength to clearly distinguish the cases: state that r.Body == nil means no body present, r.ContentLength == 0 means an explicitly empty body, and r.ContentLength == -1 denotes unknown/chunked streaming bodies which should be processed (i.e., the current code path that proceeds to read). Keep the existing condition (if r.Body == nil || r.ContentLength == 0 { return }) unchanged; only modify the comment text near that check to reflect these distinctions and avoid implying that ContentLength == 0 covers all "no body" cases.internal/config/types.go (1)
386-391: Consider accepting a logger parameter instead of using the globalsloglogger.Calling
slog.Warndirectly couples this config type to the global logger, making it harder to test log output or silence warnings in tests. An optional refactor would be to accept a*slog.Loggerparameter (or use a method receiver that holds one).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/config/types.go` around lines 386 - 391, The code currently calls slog.Warn directly (see slog.Warn, aliasName, modelName, seen) which couples this logic to the global logger; update the function or method that contains this duplicate-check (the block using seen[modelName]) to accept a logger parameter (e.g., *slog.Logger) or make the enclosing type hold a logger receiver, and replace slog.Warn with logger.Warn; ensure the parameter can be nil by falling back to the global slog only when logger == nil to preserve current behavior for callers that don't provide one.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/adapter/proxy/core/model_rewrite_test.go`:
- Around line 202-248: Add a unit test (e.g.,
TestRewriteModelField_NestedModelKey) that constructs JSON with a nested "model"
key inside a sub-object and a top-level "model", calls
rewriteModelField([]byte(body), "rewritten"), unmarshals the result into
map[string]json.RawMessage, decodes parsed["model"] into a string and asserts it
equals "rewritten" while ensuring the nested metadata.model remains unchanged;
this documents the current regex limitation and will fail until
rewriteModelField is changed to only target the top-level "model" key.
In `@internal/adapter/proxy/core/model_rewrite.go`:
- Around line 15-19: The current regex modelFieldPattern and its replacement
logic can mistakenly match a nested "model" key; update the rewrite logic that
uses modelFieldPattern (and the replaced flag) to either (A) implement a
depth-aware byte scanner that tracks JSON object brace depth and only
matches/replaces the "model" key when depth == 1 (top-level), preserving
formatting around the matched quoted value, or (B) replace the ad-hoc regex
approach with a proper unmarshal -> modify top-level map[string]interface{} ->
marshal round-trip to reliably change the top-level "model" value; ensure
whichever path you pick still respects escaped quotes and update/remove
modelFieldPattern accordingly.
In `@internal/config/types.go`:
- Around line 379-393: The loop over actualModels should validate and normalize
each model name for leading/trailing whitespace: for each modelName in
actualModels (in the alias validation block using aliasName), trim whitespace
(e.g., strings.TrimSpace) into trimmedName, reject if trimmedName == "" (return
an error like "model alias %q has empty model name at position %d") or if
trimmedName != modelName (return an error about unexpected surrounding
whitespace), and use trimmedName when checking for duplicates in the seen map
and logging; update the duplicate warning to reference trimmedName.
---
Duplicate comments:
In `@internal/adapter/proxy/core/model_rewrite_test.go`:
- Around line 296-315: The test
TestRewriteModelForAlias_BodyReadableAfterRewrite claims body re-readability but
only reads once; update the test to perform a second io.ReadAll(r.Body) after
the first read and assert the second read is non-empty and equals the first (or
equals the original body string) to validate re-readability of the request body
after calling RewriteModelForAlias; reference the test function
TestRewriteModelForAlias_BodyReadableAfterRewrite, the RewriteModelForAlias
call, and r.Body when adding the second read and assertions.
---
Nitpick comments:
In `@internal/adapter/proxy/core/model_rewrite_test.go`:
- Around line 154-199: Add test cases to TestRewriteModelField that exercise
jsonEscapeString by including model names with special JSON characters (e.g.,
quotes like my"model, backslashes like path\to\model, and control characters).
For each new table entry, set body to include the original model value, set
newModel to the adversarial string, and assert via the existing
rewriteModelField -> json.Unmarshal parsed["model"] path that the final
modelName equals the expected escaped/unescaped value; reference the test table
in TestRewriteModelField and the jsonEscapeString/rewriteModelField helpers to
locate where to add these entries.
In `@internal/adapter/proxy/core/model_rewrite.go`:
- Around line 46-48: Update the explanatory comment around the early-return
check that uses r.Body and r.ContentLength to clearly distinguish the cases:
state that r.Body == nil means no body present, r.ContentLength == 0 means an
explicitly empty body, and r.ContentLength == -1 denotes unknown/chunked
streaming bodies which should be processed (i.e., the current code path that
proceeds to read). Keep the existing condition (if r.Body == nil ||
r.ContentLength == 0 { return }) unchanged; only modify the comment text near
that check to reflect these distinctions and avoid implying that ContentLength
== 0 covers all "no body" cases.
In `@internal/config/types.go`:
- Around line 386-391: The code currently calls slog.Warn directly (see
slog.Warn, aliasName, modelName, seen) which couples this logic to the global
logger; update the function or method that contains this duplicate-check (the
block using seen[modelName]) to accept a logger parameter (e.g., *slog.Logger)
or make the enclosing type hold a logger receiver, and replace slog.Warn with
logger.Warn; ensure the parameter can be nil by falling back to the global slog
only when logger == nil to preserve current behavior for callers that don't
provide one.
| // Check each model name in the list | ||
| seen := make(map[string]bool) | ||
| for i, modelName := range actualModels { | ||
| if modelName == "" { | ||
| return fmt.Errorf("model alias %q has empty model name at position %d", aliasName, i) | ||
| } | ||
|
|
||
| // Duplicate model names (warning, not a fatal error) | ||
| if seen[modelName] { | ||
| slog.Warn("Duplicate model name in alias configuration", | ||
| "alias", aliasName, | ||
| "model", modelName) | ||
| } | ||
| seen[modelName] = true | ||
| } |
There was a problem hiding this comment.
Individual model names are not checked for leading/trailing whitespace.
Alias names are validated for whitespace (Line 370), but the individual model names within the list are not. A model name like " gpt-oss:120b " would pass validation but silently fail to match any endpoint at runtime.
Proposed fix
for i, modelName := range actualModels {
if modelName == "" {
return fmt.Errorf("model alias %q has empty model name at position %d", aliasName, i)
}
+ if strings.TrimSpace(modelName) != modelName {
+ return fmt.Errorf("model alias %q has model name with leading/trailing whitespace at position %d: %q", aliasName, i, modelName)
+ }
// Duplicate model names (warning, not a fatal error)
if seen[modelName] {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Check each model name in the list | |
| seen := make(map[string]bool) | |
| for i, modelName := range actualModels { | |
| if modelName == "" { | |
| return fmt.Errorf("model alias %q has empty model name at position %d", aliasName, i) | |
| } | |
| // Duplicate model names (warning, not a fatal error) | |
| if seen[modelName] { | |
| slog.Warn("Duplicate model name in alias configuration", | |
| "alias", aliasName, | |
| "model", modelName) | |
| } | |
| seen[modelName] = true | |
| } | |
| // Check each model name in the list | |
| seen := make(map[string]bool) | |
| for i, modelName := range actualModels { | |
| if modelName == "" { | |
| return fmt.Errorf("model alias %q has empty model name at position %d", aliasName, i) | |
| } | |
| if strings.TrimSpace(modelName) != modelName { | |
| return fmt.Errorf("model alias %q has model name with leading/trailing whitespace at position %d: %q", aliasName, i, modelName) | |
| } | |
| // Duplicate model names (warning, not a fatal error) | |
| if seen[modelName] { | |
| slog.Warn("Duplicate model name in alias configuration", | |
| "alias", aliasName, | |
| "model", modelName) | |
| } | |
| seen[modelName] = true | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/config/types.go` around lines 379 - 393, The loop over actualModels
should validate and normalize each model name for leading/trailing whitespace:
for each modelName in actualModels (in the alias validation block using
aliasName), trim whitespace (e.g., strings.TrimSpace) into trimmedName, reject
if trimmedName == "" (return an error like "model alias %q has empty model name
at position %d") or if trimmedName != modelName (return an error about
unexpected surrounding whitespace), and use trimmedName when checking for
duplicates in the seen map and logging; update the duplicate warning to
reference trimmedName.
|
Wowser, it did detect the issue but I wouldn't write a full on parser. There there code-rabbit... This is from one of the attempts earlier for model patching that might be useful. because it keeps to depth == 1, we avoid too much overhead and avoids that double Unmarshal'ing. |
|
Hey mate, we'll get 0.0.23 released and target this for 0.0.24 so it gives you time to refactor etc. |
The regex approach could falsely match the word 'model' inside nested JSON string values (e.g. message content like "Use model: gpt-4"), causing incorrect rewrites and failing fuzz tests. Replace with a streaming json.Decoder that walks only top-level object keys via InputOffset to precisely locate the value's byte range, then splices in the replacement — preserving all surrounding whitespace, key ordering, and formatting. Also removes the regexp import. Add tests covering: model name in nested message content, model key in nested objects, and model names with special characters requiring JSON escaping.
|
Sorry for the delay! Pushed a fix for the regex issue you flagged — swapped it out for a proper json.Decoder token scanner that walks only top-level keys, so nested content like message strings with the word "model" in them can't accidentally get rewritten anymore. Also added test cases covering exactly those scenarios. Should be solid now! |
|
Amazing work, there are some conflicts, if you can resolve those i'll get this reviewed tomorrow nad hopefully merged 🥳 |
|
Great work @dnnspaul, everythings green, now the pressure's on me to review 😰 will update soon. |
thushan
left a comment
There was a problem hiding this comment.
Excellent work, i think youv'e done an amazing job! thank-you for your contribution and especially for writing/including docs.
We'll merge this and do a full test and prepare a new release hopefully before Easter.
Feature: Model Aliases
Related Issue
Closes #98
What This Does
Hey @thushan 👋
This PR implements the model alias feature requested in #98. When using multiple backends (Ollama, LM Studio, llamacpp, etc.), the same model often has different names across platforms. Model aliases let users define a single virtual name that maps to all the platform-specific variants, so any backend that has the model can serve the request — regardless of naming differences.
Example Configuration
In
config.yaml:When a request arrives with
"model": "gpt-oss-120b", olla will:"model"field in the request body to match what the selected backend actually expects (e.g.gpt-oss:120bfor Ollama)Self-referencing aliases work correctly
If the alias name is also a real model name on some backend, it still works:
The alias always takes priority during routing. The self-referencing entry ensures the endpoint serving
gpt-oss-120bnatively is included in the candidate pool, and the request body keepsgpt-oss-120bfor that backend (no unnecessary rewrite).What's NOT included
Per the discussion in the issue, the
only_alias: truesetting (to hide non-aliased models from/v1/models) is not part of this PR. Happy to add it as a follow-up if desired.How It Works (Architecture)
The implementation follows olla's hexagonal architecture and touches minimal surface area:
Config.ModelAliasesmap[string][]stringfield in config, parsed frommodel_aliasesin YAMLAliasResolverhandler_proxy.gofilterEndpointsByProfile), passes the rewrite map through contextcore.RewriteModelForAlias()"model"field in the JSON body right before the request is sent to the selected backendThe alias resolution hooks into the existing model routing pipeline without changing the
ModelRegistryinterface or any existing routing strategies.Files Changed
New files:
internal/adapter/registry/alias.go— AliasResolverinternal/adapter/registry/alias_test.go— Testsinternal/adapter/proxy/core/model_rewrite.go— Request body model rewritinginternal/adapter/proxy/core/model_rewrite_test.go— TestsModified files:
internal/config/types.go— AddedModelAliasesfield toConfiginternal/config/config_test.go— Config parsing tests for aliasesinternal/core/constants/context.go— Context key for alias rewrite mapinternal/app/handlers/application.go— Wires up the AliasResolverinternal/app/handlers/handler_proxy.go— Alias-aware endpoint filtering + context passinginternal/adapter/proxy/olla/service.go— Model rewrite call (legacy path)internal/adapter/proxy/olla/service_retry.go— Model rewrite call (retry path)internal/adapter/proxy/sherpa/service_retry.go— Model rewrite callconfig/config.yaml— Documented example (commented out)config.yaml— SameTesting
go test ./internal/...)model_aliasesgo build ./...,go vet ./internal/...)Summary by CodeRabbit
New Features
Configuration
Documentation
Tests