Conversation
fix tests update path stripping tests hmm streaming settings seem to be invalid, fix with isolating for Olla swapped the wrong default settings update benchmark tests
WalkthroughAdds a provider-agnostic BaseConverter and ConversionHelper for model conversion, refactors converters to embed and use it, introduces a unified proxy config package with defaults and Olla/Sherpa embedding, and updates many tests to new config/constructor initialization patterns. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor Caller
participant Conv as Converter (e.g., Ollama/LMStudio)
participant Base as BaseConverter
participant Help as ConversionHelper
participant Model as UnifiedModel
participant EP as SourceEndpoint
Caller->>Conv: convertModel(Model)
Conv->>Base: NewConversionHelper(Model)
Base->>Help: construct helper (alias, endpoint)
Help-->>Conv: helper
alt No provider alias
Conv->>Help: ShouldSkip()
Help-->>Conv: true
Conv-->>Caller: skip model
else Alias found
Conv->>Help: GetModelType/GetState/GetDiskSize
Help->>EP: resolve endpoint-aware fallbacks
Help-->>Conv: values
Conv->>Help: GetMetadataString/Int/Bool
Help-->>Conv: metadata values
Conv-->>Caller: formatted model record
end
sequenceDiagram
autonumber
actor App
participant Factory as Proxy Factory
participant CfgPkg as config (unified)
participant SherpaCfg as sherpa.Configuration
participant OllaCfg as olla.Configuration
participant ServiceS as Sherpa Service
participant ServiceO as Olla Service
App->>Factory: Create Sherpa/Olla proxies
Factory->>SherpaCfg: new Configuration{} + field assignments
Factory->>OllaCfg: new Configuration{} + field assignments
SherpaCfg->>CfgPkg: use BaseProxyConfig getters for defaults
OllaCfg->>CfgPkg: use OllaConfig getters/overrides for defaults
Factory->>ServiceS: NewService(Sherpa config)
Factory->>ServiceO: NewService(Olla config)
ServiceS-->>App: proxy instance
ServiceO-->>App: proxy instance
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
internal/adapter/converter/ollama_converter.go (1)
87-113: Bug: default branch does not uppercase unknown quantisation; also make lookup case‑insensitiveThe comment says we uppercase by default, but the code returns the input unchanged. Additionally, if callers pass mixed‑case values, the map won’t match. Fix by normalising to lower case before lookup and uppercasing on fallback.
Apply this diff:
@@ -func denormalizeQuantization(quant string) string { - // Reverse the normalization mappings +func denormalizeQuantization(quant string) string { + // Reverse the normalisation mappings (inputs may be mixed case) + quantLower := strings.ToLower(strings.TrimSpace(quant)) mappings := map[string]string{ @@ - if denormalized, exists := mappings[quant]; exists { + if denormalized, exists := mappings[quantLower]; exists { return denormalized } - // Default: uppercase the quantization - return quant + // Default: uppercase the quantisation + return strings.ToUpper(quantLower) }And add the missing import:
@@ -import ( - "time" +import ( + "time" + "strings" )Optional: lift the mappings map to a package‑level var to avoid repeated allocations on hot paths.
internal/adapter/proxy/sherpa/service_leak_stress_test.go (1)
103-127: Data race and loop‑variable capture bug in concurrent stress test
- A single shared state is mutated by 100 goroutines — racy.
- The loop variable i is captured by the goroutine, causing nondeterministic branching (classic Go gotcha).
Fix by using a per‑goroutine state and capturing i as a parameter.
Apply these diffs:
@@ - state := &streamState{} - logger := createTestLogger() + logger := createTestLogger() @@ - for i := 0; i < numConcurrent; i++ { + for i := 0; i < numConcurrent; i++ { wg.Add(1) - go func() { + go func(i int) { defer wg.Done() ctx := context.Background() + // Avoid shared mutable state across goroutines + localState := &streamState{} @@ - if i%2 == 0 { + if i%2 == 0 { reader := &delayedReader{delay: 100 * time.Millisecond} localBuffer := make([]byte, 1024) - s.performTimedRead(ctx, reader, localBuffer, 50*time.Millisecond, state, logger) + s.performTimedRead(ctx, reader, localBuffer, 50*time.Millisecond, localState, logger) } else { reader := strings.NewReader("quick data") localBuffer := make([]byte, 1024) - s.performTimedRead(ctx, reader, localBuffer, 50*time.Millisecond, state, logger) + s.performTimedRead(ctx, reader, localBuffer, 50*time.Millisecond, localState, logger) } - }() + }(i) }internal/adapter/proxy/olla/service_leak_test.go (1)
72-83: Fixed sleeps make cleanup assertions flaky; poll with timeout insteadThe ticker and GC scheduling can vary under CI load. Replace fixed sleeps with a bounded wait loop to avoid intermittent failures.
-// Wait for cleanup to run -time.Sleep(200 * time.Millisecond) - -// Verify pools were cleaned up -poolCount = 0 -s.endpointPools.Range(func(k string, v *connectionPool) bool { - poolCount++ - return true -}) -if poolCount != 0 { - t.Errorf("Expected 0 pools after cleanup, got %d", poolCount) -} +deadline := time.Now().Add(2 * time.Second) +for { + poolCount = 0 + s.endpointPools.Range(func(k string, v *connectionPool) bool { + poolCount++ + return true + }) + if poolCount == 0 || time.Now().After(deadline) { + break + } + time.Sleep(20 * time.Millisecond) +} +if poolCount != 0 { + t.Errorf("Expected 0 pools after cleanup, got %d", poolCount) +}internal/adapter/proxy/proxy_integration_test.go (1)
288-289: Use a valid but unreachable endpoint instead of an invalid port"http://localhost:99999" has an invalid port and can yield parsing errors rather than the intended connection-refused path, making the error assertions brittle. Use a valid, closed port to deterministically exercise dial failures.
Apply:
- endpoint = createTestEndpoint("test", "http://localhost:99999", domain.StatusHealthy) + // Use a valid but closed port to force a connection refusal deterministically. + endpoint = createTestEndpoint("test", "http://127.0.0.1:1", domain.StatusHealthy)internal/adapter/proxy/olla/service.go (1)
686-701: Bug: UpdateConfig discards caller-supplied Olla pooling settings; also minor naming shadowing
UpdateConfigcurrently preservesMaxIdleConns,IdleConnTimeout, andMaxConnsPerHostfrom the old config, ignoring any updated values provided by the caller. The shared test builds a new*olla.Configurationwith different values (e.g.MaxIdleConns=400), which won’t take effect. This makes live reconfiguration ineffective.Also, parameter name
configshadows the importedconfigpackage elsewhere in the file. Renaming tocfgavoids mental overhead.Fix: if the incoming
ports.ProxyConfigurationis actually*olla.Configuration, copy Olla-specific fields; otherwise, keep the current preserve-old behaviour.-func (s *Service) UpdateConfig(config ports.ProxyConfiguration) { - newConfig := &Configuration{} - newConfig.ProxyPrefix = config.GetProxyPrefix() - newConfig.ConnectionTimeout = config.GetConnectionTimeout() - newConfig.ConnectionKeepAlive = config.GetConnectionKeepAlive() - newConfig.ResponseTimeout = config.GetResponseTimeout() - newConfig.ReadTimeout = config.GetReadTimeout() - newConfig.StreamBufferSize = config.GetStreamBufferSize() - newConfig.Profile = config.GetProxyProfile() - // Preserve Olla-specific settings - newConfig.MaxIdleConns = s.configuration.MaxIdleConns - newConfig.IdleConnTimeout = s.configuration.IdleConnTimeout - newConfig.MaxConnsPerHost = s.configuration.MaxConnsPerHost +func (s *Service) UpdateConfig(cfg ports.ProxyConfiguration) { + newConfig := &Configuration{} + newConfig.ProxyPrefix = cfg.GetProxyPrefix() + newConfig.ConnectionTimeout = cfg.GetConnectionTimeout() + newConfig.ConnectionKeepAlive = cfg.GetConnectionKeepAlive() + newConfig.ResponseTimeout = cfg.GetResponseTimeout() + newConfig.ReadTimeout = cfg.GetReadTimeout() + newConfig.StreamBufferSize = cfg.GetStreamBufferSize() + newConfig.Profile = cfg.GetProxyProfile() + // Apply Olla-specific settings if provided by caller; otherwise preserve old values + if incoming, ok := cfg.(*Configuration); ok && incoming != nil { + if incoming.MaxIdleConns != 0 { + newConfig.MaxIdleConns = incoming.MaxIdleConns + } else { + newConfig.MaxIdleConns = s.configuration.MaxIdleConns + } + if incoming.IdleConnTimeout != 0 { + newConfig.IdleConnTimeout = incoming.IdleConnTimeout + } else { + newConfig.IdleConnTimeout = s.configuration.IdleConnTimeout + } + if incoming.MaxConnsPerHost != 0 { + newConfig.MaxConnsPerHost = incoming.MaxConnsPerHost + } else { + newConfig.MaxConnsPerHost = s.configuration.MaxConnsPerHost + } + } else { + newConfig.MaxIdleConns = s.configuration.MaxIdleConns + newConfig.IdleConnTimeout = s.configuration.IdleConnTimeout + newConfig.MaxConnsPerHost = s.configuration.MaxConnsPerHost + } // Update configuration atomically s.configuration = newConfig }Optional (follow-up): existing per-endpoint
connectionPool.transportinstances won’t reflect updated pooling limits until they’re recreated. You could force-refresh by closing pools on config change to lazily rebuild with new settings, or update fields in-place where safe (e.g.MaxIdleConnsPerHost). Happy to sketch a scoped approach if you want to enable live updates.
🧹 Nitpick comments (40)
internal/adapter/converter/lmstudio_converter.go (1)
68-75: Normalise model type for LM Studio responses
helper.GetModelType("llm")may return metadata-driven values like"chat"or"completion". If LM Studio expects coarse-grained types (e.g.,"llm","vlm","embeddings"), returning"chat"could be incompatible.Recommend normalising here to ensure predictable output for LM Studio clients.
Would you confirm LM Studio’s accepted values for the
typefield? If it only accepts"llm","vlm","embeddings", consider this minimal normalisation:- modelType := helper.GetModelType("llm") + modelType := helper.GetModelType("llm") + // Normalise finer-grained types to LM Studio's expected categories. + if modelType == "chat" || modelType == "completion" { + modelType = "llm" + } + if modelType == "embedding" { + modelType = "embeddings" + }internal/adapter/converter/base_converter.go (4)
7-17: Doc comments should explain “why”, and use Australian EnglishThe implementation is fine. Per guidelines, prefer comments that justify decisions rather than restate what the code already shows. A small wording tweak improves clarity.
Apply:
-// BaseConverter provides common functionality for all converters +// BaseConverter centralises provider‑agnostic logic so individual converters stay thin and consistent, +// reducing duplication and the risk of divergent behaviour. -// NewBaseConverter creates a new base converter +// NewBaseConverter records the provider type so helper lookups are properly provider‑scoped.
29-38: Endpoint lookup by NativeName is reasonable; minor noteMatching
providerNametoNativeNamemirrors how aliases map to platform names. If some providers useEndpointNamein routing, consider a future extension to check both. No change required now.
40-57: Metadata extractors: consider json.Number and string numericsCurrent int extraction covers
intandfloat64. Some JSON paths may usejson.Numberor numeric strings depending on deserialisation options. Supporting these would make the helpers more robust.+import ( + "github.com/thushan/olla/internal/core/domain" + "encoding/json" +) @@ func (b *BaseConverter) ExtractMetadataInt(metadata map[string]interface{}, key string) int { if val, ok := metadata[key].(int); ok { return val } if val, ok := metadata[key].(float64); ok { return int(val) } + if val, ok := metadata[key].(json.Number); ok { + if i64, err := val.Int64(); err == nil { + return int(i64) + } + } + if s, ok := metadata[key].(string); ok { + if f, err := strconv.ParseFloat(s, 64); err == nil { + return int(f) + } + } return 0 }Note: this adds
encoding/jsonandstrconvimports.
88-105: Type determination: verify metadata semantics across providersReturning
metadata["type"]verbatim (e.g.,"chat") gives flexibility, but downstream formatters (e.g., LM Studio) may need canonical categories. Consider centralising a normalisation step here, but be aware it will affect tests asserting"chat".Would you like a
NormaliseModelType()helper that maps:
"chat","completion","llm"→"llm""vision","vlm"→"vlm""embedding","embeddings"→"embeddings"If so, I can propose a patch and update affected tests.
internal/adapter/converter/openai_converter.go (2)
54-62: Prefer provider‑specific alias over “first alias”; fix typoUsing the first alias is pragmatic but can be ambiguous. Since this converter now embeds
BaseConverter, prefer its alias resolution first, then fall back to the first alias for backwards compatibility. Also fix “alas” → “alias”.func (c *OpenAIConverter) convertModel(model *domain.UnifiedModel) OpenAIModelData { - // OLLA-85: [Unification] Models with different digests fail to unify correctly. - // we need to use first alas as ID for routing compatibility - // to make sure the returned model ID can be used for requests + // OLLA-85: [Unification] Models with different digests fail to unify correctly. + // Use the provider‑specific alias when present to avoid routing ambiguity; + // fall back to the first alias to preserve historical compatibility. modelID := model.ID - if len(model.Aliases) > 0 { - modelID = model.Aliases[0].Name - } + if c.BaseConverter != nil { + if helper := c.BaseConverter.NewConversionHelper(model); helper != nil && helper.Alias != "" { + modelID = helper.Alias + } else if len(model.Aliases) > 0 { + modelID = model.Aliases[0].Name + } + } else if len(model.Aliases) > 0 { + modelID = model.Aliases[0].Name + }
63-69: Created timestamp source: consider stable time for deterministic outputsUsing
time.Now().Unix()makes responses time‑variant, which can complicate caching and test determinism. Ifmodel.LastSeenis meaningful, prefer that when set, falling back totime.Now().- return OpenAIModelData{ - ID: modelID, - Object: "model", - Created: time.Now().Unix(), - OwnedBy: "olla", - } + created := time.Now().Unix() + if !model.LastSeen.IsZero() { + created = model.LastSeen.Unix() + } + return OpenAIModelData{ + ID: modelID, + Object: "model", + Created: created, + OwnedBy: "olla", + }internal/adapter/converter/unified_converter.go (2)
82-85: Fix typo in comment: “alias”Minor spelling issue.
-// we need to use first alas as ID for routing compatibility +// we need to use the first alias as ID for routing compatibility
183-194: Use the embedded BaseConverter instead of creating a new oneA new base with empty provider type is unnecessary and slightly obscures intent. Using
c.BaseConverterkeeps behaviour consistent and future‑proof.- // Use base converter to determine model type - base := NewBaseConverter("") - modelType := base.DetermineModelType(model, "") + // Use the embedded base converter to determine model type (provider‑agnostic here) + modelType := c.BaseConverter.DetermineModelType(model, "")internal/adapter/converter/base_converter_test.go (2)
149-182: Metadata extractor tests are solid; consider adding json.Number caseGreat coverage. If you adopt json.Number/string support in the extractor, add corresponding tests.
266-334: Model type determination tests are valuable; watch for future normalisationIf you later normalise metadata types (e.g., mapping
"chat"to"llm"), update these expectations accordingly. For now, behaviour is consistent with the current design.internal/adapter/proxy/sherpa/service_streaming.go (1)
80-86: Treat non‑positive read timeouts as “use default”Defensive tweak: if a negative duration slips through config, we should still fall back to the default. Micro change, clearer semantics.
Apply this diff:
func (s *Service) getReadTimeout() time.Duration { readTimeout := s.configuration.GetReadTimeout() - if readTimeout == 0 { + if readTimeout <= 0 { readTimeout = config.DefaultReadTimeout } return readTimeout }internal/adapter/proxy/sherpa/service_leak_test.go (1)
150-220: Unify leak tests across Sherpa and Olla enginesPer the repository guideline to maintain a shared proxy test suite running against both Sherpa and Olla, consider table‑driving these leak tests to execute for both engines behind a common interface. This ensures parity and prevents regressions diverging between engines.
I can sketch a small test harness that instantiates each engine and runs the same scenarios if helpful.
internal/adapter/proxy/sherpa/service_retry_test.go (2)
81-167: Reduce duplication by extracting a small test config helperThe three tests duplicate identical configuration blocks. A tiny helper reduces repetition and keeps changes local if defaults move again.
You can add this helper (outside the changed hunks) and use it in all three tests:
// test helper (near top of file) func newTestConfig() *Configuration { c := &Configuration{} c.ProxyPrefix = "/olla/" c.ConnectionTimeout = time.Second c.ConnectionKeepAlive = time.Second c.ResponseTimeout = time.Second c.ReadTimeout = time.Second c.StreamBufferSize = 8192 return c }Then replace each block with:
config := newTestConfig()Also applies to: 220-295, 296-387
81-167: Consider executing retry tests against both enginesTo uphold the shared‑suite guideline, table‑drive these tests to run with Sherpa and Olla services where feasible (construct via a common factory or interface). This helps catch engine‑specific regressions early.
internal/adapter/proxy/proxy_sherpa_test.go (5)
146-148: Avoid incidental flakiness in disconnect test by relaxing read timeoutA 200ms ReadTimeout is close to the test’s 100ms chunk cadence and 250ms cancel, introducing a second failure mode (read timeout vs client cancel). Bump it so cancellation is the primary trigger.
- config.ReadTimeout = 200 * time.Millisecond + config.ReadTimeout = 2 * time.Second
158-160: Mirror timeout relaxation for Olla caseSame rationale as for Sherpa to keep the test deterministic.
- config.ReadTimeout = 200 * time.Millisecond + config.ReadTimeout = 2 * time.Second
261-263: Buffer pooling test asserts only through side‑effectsThe test relies on “no failures across multiple requests” as a proxy for pooling correctness. Consider minimally asserting response sizes or using b.ReportAllocs in a benchmark variant to validate that allocations remain stable with the chosen StreamBufferSize.
163-166: Use the canonical context key in proxy testsThe dynamic
tc.proxyPrefixandtc.proxyConfigPrefixkeys injected in tests aren’t read by the proxy implementations (which exclusively useconstants.ContextRoutePrefixKey). To keep tests aligned with production behaviour and avoid brittle overrides, update or remove these uses.• internal/adapter/proxy/proxy_path_stripping_test.go:164
• internal/adapter/proxy/proxy_path_stripping_test.go:321Suggested diff:
- // dynamic key that proxy impls ignore - ctx := context.WithValue(req.Context(), tc.proxyPrefix, tc.contextPrefix) + // use the stable key recognised by the proxy implementation + ctx := context.WithValue(req.Context(), constants.ContextRoutePrefixKey, tc.contextPrefix) - // dynamic key that proxy impls ignore - ctx := context.WithValue(req.Context(), tc.proxyConfigPrefix, tc.handlerPrefix) + ctx := context.WithValue(req.Context(), constants.ContextRoutePrefixKey, tc.handlerPrefix)
27-30: Extract a shared Sherpa test config builder to reduce duplicationThe explicit field assignments for
ResponseTimeout,ReadTimeoutandStreamBufferSizerecur throughout your Sherpa proxy tests (e.g.proxy_sherpa_test.go:27–30, as well as inservice_retry_test.go,service_leak_test.go, etc.). Defining a small helper function will localise these defaults, keep tests DRY, and make future changes to the defaults impossible to miss.It’s safe to introduce a builder because the Sherpa service always uses the getters—and falls back to its own defaults only when you haven’t explicitly set a field. For example:
- In
service_streaming.go,getReadTimeout()callss.configuration.GetReadTimeout()and only if that yields zero does it useconfig.DefaultReadTimeout.- In
service.go, both the buffer and cloned configs are initialised viaconfiguration.GetStreamBufferSize()andconfig.GetReadTimeout()/GetResponseTimeout().Implementing a test-only helper won’t affect production behaviour, and will clean up over a dozen lines of boilerplate across your test suite.
Suggested diff:
+// newSherpaTestConfig returns a Configuration pre-populated with the +// proxy tests’ standard timeouts and buffer size, allowing optional +// overrides. +func newSherpaTestConfig(overrides func(*sherpa.Configuration)) *sherpa.Configuration { + c := &sherpa.Configuration{ + ResponseTimeout: 30 * time.Second, + ReadTimeout: 10 * time.Second, + StreamBufferSize: 1024, + } + if overrides != nil { + overrides(c) + } + return c +}Then, in your tests:
- config := &sherpa.Configuration{} - config.ResponseTimeout = 30 * time.Second - config.ReadTimeout = 10 * time.Second - config.StreamBufferSize = 1024 + config := newSherpaTestConfig(nil)This keeps all defaults in one place, and any test needing non-standard values can simply pass an override function.
internal/adapter/converter/vllm_converter_test.go (1)
210-210: Decouple tests from unexported VLLMConverter helpersYour tests in
internal/adapter/converter/vllm_converter_test.godirectly cast the result ofNewVLLMConverter()to*VLLMConverterand invoke two unexported methods:
- Line ≈210 —
converter.determineOwner(…)- Line ≈248 —
converter.findVLLMNativeName(…)This tight coupling makes the tests fragile to any future renames, embedding changes or constructor tweaks. Consider one of the following lightweight approaches to decouple:
• Export thin wrappers on
BaseConverter
– e.g. addfunc (b *BaseConverter) DetermineOwnerFromID(id string) stringand
func (b *BaseConverter) FindProviderAliasName(model *domain.UnifiedModel) (string, bool)
– Then in tests callconverter := NewVLLMConverter(); converter.DetermineOwnerFromID(...)without a type assertion.• Introduce a package-private test shim in a
_test.gofile
– e.g. definefunc DetermineOwner(c ports.ModelResponseConverter, id string) stringthat performs the cast internally
– Keeps your tests free of repeated.(*VLLMConverter)assertions.• Or simply verify equivalent behaviour via the exported
ConvertToFormatsurface
– Assert that theOwnedByfield of the resultingVLLMModelResponsematches expectations, covering both helper functions indirectly.These changes will make your suite more maintainable and resilient to refactors.
internal/adapter/proxy/proxy_path_stripping_test.go (4)
140-147: Config construction: suggest a tiny factory to avoid repetitionYou set ProxyPrefix in multiple places; a helper reduces duplication and makes intent clearer.
- sherpaConfig := &sherpa.Configuration{} - sherpaConfig.ProxyPrefix = tc.proxyPrefix + sherpaConfig := &sherpa.Configuration{ } + sherpaConfig.ProxyPrefix = tc.proxyPrefix config = sherpaConfigOptionally:
+func newSherpaConfigWithPrefix(prefix string) *sherpa.Configuration { + c := &sherpa.Configuration{} + c.ProxyPrefix = prefix + return c +}
216-218: Same for Sherpa in double‑strip prevention testApply the same small factory for consistency and readability.
- sherpaConfig := &sherpa.Configuration{} - sherpaConfig.ProxyPrefix = "/olla" + sherpaConfig := newSherpaConfigWithPrefix("/olla")
305-307: Prefer helper for Olla configuration as wellKeeps both implementations aligned and reduces test boilerplate.
- config := &olla.Configuration{} - config.ProxyPrefix = tc.proxyConfigPrefix + config := &olla.Configuration{} + config.ProxyPrefix = tc.proxyConfigPrefixOr via helper:
+func newOllaConfigWithPrefix(prefix string) *olla.Configuration { + c := &olla.Configuration{} + c.ProxyPrefix = prefix + return c +}
163-166: Use the stable context key constant in proxy_path_stripping_test.goThe test at lines 163–166 currently injects the route prefix into the request context using
tc.proxyPrefixas the key, but:
- All production code and other proxy tests (e.g.
proxy_routing_test.go,config.go) consistently useconstants.ContextRoutePrefixKeyas the context key.- Mixing in
tc.proxyPrefixrisks the test silently passing without actually exercising any real lookup in the proxy implementation.To align this test with the rest of the codebase, either:
- Update the block to use the constant key, or
- Remove it entirely to reflect that proxy implementations defensively ignore context-based overrides.
Suggested optional refactor:
- if tc.contextPrefix != "" { - ctx := context.WithValue(req.Context(), tc.proxyPrefix, tc.contextPrefix) - req = req.WithContext(ctx) - } + // Context-based route‐prefix overrides are ignored by proxy implementations. + // To test/document this, you could inject it with the stable key: + // ctx := context.WithValue(req.Context(), constants.ContextRoutePrefixKey, tc.contextPrefix) + // req = req.WithContext(ctx)internal/adapter/proxy/benchmark_comparison_test.go (1)
58-61: Reduce duplication with small config builder helpersThese two setup blocks are identical to patterns repeated across the suite. Consider local helpers to keep benchmarks focused and reduce future drift when defaults evolve.
Example replacement within the changed ranges:
- config := &sherpa.Configuration{} - config.ResponseTimeout = 30 * time.Second - config.ReadTimeout = 10 * time.Second - config.StreamBufferSize = 8192 + config := newSherpaCfg(30*time.Second, 10*time.Second, 8192)- config := &olla.Configuration{} - config.ResponseTimeout = 30 * time.Second - config.ReadTimeout = 10 * time.Second - config.StreamBufferSize = 8192 - config.MaxIdleConns = 200 - config.IdleConnTimeout = 90 * time.Second - config.MaxConnsPerHost = 50 + config := newOllaCfg(30*time.Second, 10*time.Second, 8192, 200, 90*time.Second, 50)Add these helpers near the top of the file:
func newSherpaCfg(resp, read time.Duration, buf int) *sherpa.Configuration { c := &sherpa.Configuration{} c.ResponseTimeout = resp c.ReadTimeout = read c.StreamBufferSize = buf return c } func newOllaCfg(resp, read time.Duration, buf int, maxIdle int, idleTO time.Duration, maxPerHost int) *olla.Configuration { c := &olla.Configuration{} c.ResponseTimeout = resp c.ReadTimeout = read c.StreamBufferSize = buf c.MaxIdleConns = maxIdle c.IdleConnTimeout = idleTO c.MaxConnsPerHost = maxPerHost return c }Also applies to: 71-77
internal/adapter/proxy/proxy_streaming_profiles_test.go (2)
222-226: DRY up repeated config initialisation across suitesThe Sherpa/Olla config assembly is copy-pasted six times here. Factor into suite-aware helpers (e.g., makeConfig(profile string)) to improve readability and keep behaviour aligned if defaults change.
Illustrative replacement within the changed blocks:
- config := &sherpa.Configuration{} - config.Profile = tt.profile - config.ResponseTimeout = 5 * time.Second - config.ReadTimeout = 5 * time.Second - config.StreamBufferSize = 8192 + config := newSherpaCfg(tt.profile, 5*time.Second, 5*time.Second, 8192)- config := &olla.Configuration{} - config.Profile = tt.profile - config.ResponseTimeout = 5 * time.Second - config.ReadTimeout = 5 * time.Second - config.StreamBufferSize = 8192 - config.MaxIdleConns = 10 - config.IdleConnTimeout = 90 * time.Second - config.MaxConnsPerHost = 5 + config := newOllaCfg(tt.profile, 5*time.Second, 5*time.Second, 8192, 10, 90*time.Second, 5)Add minimal helpers once in this file:
func newSherpaCfg(profile string, resp, read time.Duration, buf int) *sherpa.Configuration { c := &sherpa.Configuration{} c.Profile, c.ResponseTimeout, c.ReadTimeout, c.StreamBufferSize = profile, resp, read, buf return c } func newOllaCfg(profile string, resp, read time.Duration, buf, maxIdle, maxPerHost int, idleTO time.Duration) *olla.Configuration { c := &olla.Configuration{} c.Profile, c.ResponseTimeout, c.ReadTimeout, c.StreamBufferSize = profile, resp, read, buf c.MaxIdleConns, c.MaxConnsPerHost, c.IdleConnTimeout = maxIdle, maxPerHost, idleTO return c }Also applies to: 229-236, 311-316, 318-326, 510-515, 517-525
192-275: Parallelise subtests carefully or leave as-is to avoid loop variable capture trapsIf you plan to add t.Parallel() to these subtests, copy the loop variable (tt := tt) inside the loop before t.Run; otherwise parallel execution will race on tt. As written (serial), it’s fine.
internal/adapter/proxy/proxy_olla_test.go (2)
24-31: Consolidate Olla configuration setup and assert UpdateConfig semantics
- Config initialisation is repeated across the file; introduce a small helper to create a tuned Olla config for tests to reduce duplication and drift.
- In TestOllaProxyService_UpdateConfig, newConfig does not set pooling fields; if UpdateConfig replaces the whole struct, these may be reset to zero and then defaulted. Either set them explicitly in newConfig or assert that previous values are preserved by the update path.
Minimal changes within the changed ranges:
- config := &olla.Configuration{} - config.ResponseTimeout = 30 * time.Second - config.ReadTimeout = 10 * time.Second - config.StreamBufferSize = 1024 - config.MaxIdleConns = 200 - config.IdleConnTimeout = 90 * time.Second - config.MaxConnsPerHost = 50 + config := newOllaTestCfg(30*time.Second, 10*time.Second, 1024, 200, 90*time.Second, 50)And in UpdateConfig:
- newConfig := &olla.Configuration{} - newConfig.ResponseTimeout = 20 * time.Second - newConfig.ReadTimeout = 10 * time.Second - newConfig.StreamBufferSize = 8192 + newConfig := &olla.Configuration{} + newConfig.ResponseTimeout = 20 * time.Second + newConfig.ReadTimeout = 10 * time.Second + newConfig.StreamBufferSize = 8192 + // Preserve pooling where not under test to avoid accidental regression from zeroing. + newConfig.MaxIdleConns = initialConfig.MaxIdleConns + newConfig.IdleConnTimeout = initialConfig.IdleConnTimeout + newConfig.MaxConnsPerHost = initialConfig.MaxConnsPerHostAdd once in this file:
func newOllaTestCfg(resp, read time.Duration, buf, maxIdle int, idleTO time.Duration, maxPerHost int) *olla.Configuration { c := &olla.Configuration{} c.ResponseTimeout, c.ReadTimeout, c.StreamBufferSize = resp, read, buf c.MaxIdleConns, c.IdleConnTimeout, c.MaxConnsPerHost = maxIdle, idleTO, maxPerHost return c }Also applies to: 51-57, 117-123, 199-205, 257-263, 478-485, 493-496, 529-535
449-469: Avoid magic numbers for defaults; assert against config constantsIn TestOllaProxyService_ConfigDefaults you’re asserting specific numeric defaults (e.g., 64*1024, 90s). Prefer asserting against the actual default constants to keep tests resilient to tuning changes.
Example:
import proxycfg "github.com/thushan/olla/internal/adapter/proxy/config" if config.StreamBufferSize != proxycfg.OllaDefaultStreamBufferSize { ... } if config.MaxConnsPerHost != proxycfg.OllaDefaultMaxConnsPerHost { ... } if config.IdleConnTimeout != proxycfg.OllaDefaultIdleConnTimeout { ... }internal/adapter/proxy/proxy_integration_test.go (2)
174-178: Config initialisation repetition; introduce shared builders to keep parity tests tidyEach test case re-assembles Sherpa and Olla configs with the same fields. Consolidate into small builder helpers to avoid drift and emphasise parity logic.
Illustrative within changed ranges:
- sherpaConfig := &sherpa.Configuration{} - sherpaConfig.ResponseTimeout = 30 * time.Second - sherpaConfig.ReadTimeout = 10 * time.Second - sherpaConfig.StreamBufferSize = 8192 + sherpaConfig := newSherpaCfg(30*time.Second, 10*time.Second, 8192) - ollaConfig := &olla.Configuration{} - ollaConfig.ResponseTimeout = 30 * time.Second - ollaConfig.ReadTimeout = 10 * time.Second - ollaConfig.StreamBufferSize = 8192 - ollaConfig.MaxIdleConns = 200 - ollaConfig.IdleConnTimeout = 90 * time.Second - ollaConfig.MaxConnsPerHost = 50 + ollaConfig := newOllaCfg(30*time.Second, 10*time.Second, 8192, 200, 90*time.Second, 50)Add once in this file:
func newSherpaCfg(resp, read time.Duration, buf int) *sherpa.Configuration { c := &sherpa.Configuration{} c.ResponseTimeout, c.ReadTimeout, c.StreamBufferSize = resp, read, buf return c } func newOllaCfg(resp, read time.Duration, buf, maxIdle int, idleTO time.Duration, maxPerHost int) *olla.Configuration { c := &olla.Configuration{} c.ResponseTimeout, c.ReadTimeout, c.StreamBufferSize = resp, read, buf c.MaxIdleConns, c.IdleConnTimeout, c.MaxConnsPerHost = maxIdle, idleTO, maxPerHost return c }Also applies to: 179-186, 201-205, 206-213, 242-246, 247-254, 417-423, 462-465, 589-594, 595-603
498-571: Performance test average latency maths is acceptable, but consider reporting p95Dividing wall-clock by request count under concurrency gives a coarse average; adding a simple p95 sample (tracking per-request durations) would better capture tail behaviour without much complexity. Optional.
internal/adapter/proxy/olla/config.go (1)
4-5: LGTM: Embed OllaConfig; add compile-time interface assertion for safetyEmbedding
config.OllaConfigkeeps the surface unified. Consider adding an interface assertion to catch drift at compile time.package olla import ( "github.com/thushan/olla/internal/adapter/proxy/config" + "github.com/thushan/olla/internal/core/ports" ) // Configuration holds Olla proxy settings type Configuration struct { config.OllaConfig } + +// Ensure Configuration satisfies the proxy configuration interface +var _ ports.ProxyConfiguration = (*Configuration)(nil)Also applies to: 7-10
internal/adapter/proxy/sherpa/config.go (1)
8-11: Prefer embedding config.SherpaConfig (not BaseProxyConfig) for intent + forward‑compatibilitySherpa now has its own wrapper type in the shared package. Embedding that keeps semantics self‑documenting and protects you if Sherpa‑specific fields are added later.
Apply:
-// Configuration holds Sherpa proxy settings +// Configuration holds Sherpa proxy settings. We embed config.SherpaConfig +// (rather than the bare BaseProxyConfig) so future Sherpa‑specific fields +// and defaults flow through without touching this package. type Configuration struct { - config.BaseProxyConfig + config.SherpaConfig }internal/adapter/proxy/config/unified.go (6)
27-40: Clarify zero‑value semantics in the interface docDocumenting how zero values are interpreted helps implementers avoid surprises and keeps behaviour consistent across proxies.
Apply:
-// ProxyConfig defines the interface for all proxy configurations +// ProxyConfig defines the interface for all proxy configurations. +// Zero‑value semantics: +// - Durations of 0 adopt package defaults unless otherwise noted. +// - ResponseTimeout of 0 means “no response timeout”. +// - StreamBufferSize of 0 adopts the proxy‑specific default. type ProxyConfig interface {
61-67: Avoid product‑specific defaults in Base; push “/olla/” into OllaConfigBaking “/olla/” into the base ties all proxy implementations to Olla’s path. Prefer a neutral base default and override in OllaConfig.
Suggested changes:
func (c *BaseProxyConfig) GetProxyPrefix() string { if c.ProxyPrefix == "" { - return "/olla/" + return "/" } return c.ProxyPrefix }And add an Olla override:
+// GetProxyPrefix returns the proxy prefix, defaulting to "/olla/" for Olla. +func (c *OllaConfig) GetProxyPrefix() string { + if c.ProxyPrefix == "" { + return "/olla/" + } + return c.ProxyPrefix +}
53-59: Call out that implementations may override profile defaultsMake the override pattern explicit to avoid confusion when packages (like Sherpa) diverge from “auto”.
Apply:
-// GetProxyProfile returns the proxy profile, defaulting to "auto" if not set +// GetProxyProfile returns the proxy profile, defaulting to "auto" if not set. +// Implementations may override this to reflect typical usage (e.g. Sherpa +// defaults to "streaming" to optimise interactive behaviour). func (c *BaseProxyConfig) GetProxyProfile() string {
106-109: Implement real validation (non‑negative durations, sensible buffer size)Returning nil hides misconfiguration that can lead to surprising runtime behaviour.
Apply:
+import ( + "time" + "github.com/thushan/olla/internal/core/constants" + "fmt" +) @@ func (c *BaseProxyConfig) Validate() error { - return nil + if c.StreamBufferSize < 0 { + return fmt.Errorf("stream buffer size must be >= 0 (got %d)", c.StreamBufferSize) + } + if c.ConnectionTimeout < 0 || c.ConnectionKeepAlive < 0 || c.ResponseTimeout < 0 || c.ReadTimeout < 0 { + return fmt.Errorf("timeouts must be >= 0 (got ct=%v, ka=%v, rt=%v, read=%v)", + c.ConnectionTimeout, c.ConnectionKeepAlive, c.ResponseTimeout, c.ReadTimeout) + } + // Optional: require a leading slash for prefixes to avoid path bugs. + if c.ProxyPrefix != "" && c.ProxyPrefix[0] != '/' { + return fmt.Errorf("proxy prefix must start with '/', got %q", c.ProxyPrefix) + } + return nil }
9-25: Tighten comments and prefer Australian English (“behaviour”)Minor doc polish to explain why these defaults exist and match the project’s style guide.
Apply:
-// Default values for proxy settings +// Default values for proxy settings — chosen to provide sensible out‑of‑the‑box behaviour. @@ - // Olla-specific defaults for high-performance + // Olla‑specific defaults tuned for high‑throughput streaming workloads @@ - // Olla uses 30s timeouts for faster failure detection in AI workloads + // Olla uses 30 s timeouts for faster failure detection in AI workloads
111-115: Dead code: remove unusedSherpaConfigA repository-wide search shows
SherpaConfigis only ever declared in
internal/adapter/proxy/config/unified.goand never referenced elsewhere.
To reduce cognitive overhead, you can safely delete this dead type.• File:
internal/adapter/proxy/config/unified.go(lines 111–115)
• Remove the block:- // SherpaConfig extends BaseProxyConfig with Sherpa-specific configuration - type SherpaConfig struct { - BaseProxyConfig - }• If you meant to have a distinct wrapper for Sherpa configuration, consider merging it with the existing Sherpa config type (e.g. in
pkg/sherpa/config) instead of keeping two parallel definitions.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (27)
internal/adapter/converter/base_converter.go(1 hunks)internal/adapter/converter/base_converter_test.go(1 hunks)internal/adapter/converter/lmstudio_converter.go(2 hunks)internal/adapter/converter/ollama_converter.go(2 hunks)internal/adapter/converter/openai_converter.go(1 hunks)internal/adapter/converter/unified_converter.go(2 hunks)internal/adapter/converter/vllm_converter.go(2 hunks)internal/adapter/converter/vllm_converter_test.go(2 hunks)internal/adapter/proxy/benchmark_comparison_test.go(2 hunks)internal/adapter/proxy/benchmark_refactor_test.go(1 hunks)internal/adapter/proxy/config/unified.go(1 hunks)internal/adapter/proxy/factory.go(1 hunks)internal/adapter/proxy/olla/config.go(1 hunks)internal/adapter/proxy/olla/service.go(4 hunks)internal/adapter/proxy/olla/service_leak_test.go(2 hunks)internal/adapter/proxy/proxy_integration_test.go(6 hunks)internal/adapter/proxy/proxy_olla_test.go(7 hunks)internal/adapter/proxy/proxy_path_stripping_test.go(5 hunks)internal/adapter/proxy/proxy_sherpa_test.go(7 hunks)internal/adapter/proxy/proxy_streaming_profiles_test.go(3 hunks)internal/adapter/proxy/proxy_test.go(3 hunks)internal/adapter/proxy/sherpa/config.go(1 hunks)internal/adapter/proxy/sherpa/service.go(2 hunks)internal/adapter/proxy/sherpa/service_leak_stress_test.go(2 hunks)internal/adapter/proxy/sherpa/service_leak_test.go(2 hunks)internal/adapter/proxy/sherpa/service_retry_test.go(3 hunks)internal/adapter/proxy/sherpa/service_streaming.go(2 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{go,md}
📄 CodeRabbit inference engine (CLAUDE.md)
Use Australian English in comments and documentation; write comments explaining why rather than what
Files:
internal/adapter/proxy/proxy_path_stripping_test.gointernal/adapter/proxy/sherpa/service_streaming.gointernal/adapter/proxy/sherpa/service_retry_test.gointernal/adapter/proxy/proxy_test.gointernal/adapter/proxy/benchmark_comparison_test.gointernal/adapter/proxy/sherpa/service_leak_stress_test.gointernal/adapter/proxy/factory.gointernal/adapter/proxy/proxy_sherpa_test.gointernal/adapter/converter/base_converter_test.gointernal/adapter/proxy/olla/service_leak_test.gointernal/adapter/proxy/benchmark_refactor_test.gointernal/adapter/converter/lmstudio_converter.gointernal/adapter/proxy/sherpa/config.gointernal/adapter/proxy/sherpa/service_leak_test.gointernal/adapter/converter/vllm_converter_test.gointernal/adapter/proxy/proxy_integration_test.gointernal/adapter/proxy/proxy_streaming_profiles_test.gointernal/adapter/proxy/proxy_olla_test.gointernal/adapter/proxy/olla/config.gointernal/adapter/converter/base_converter.gointernal/adapter/converter/openai_converter.gointernal/adapter/proxy/config/unified.gointernal/adapter/converter/vllm_converter.gointernal/adapter/proxy/olla/service.gointernal/adapter/converter/ollama_converter.gointernal/adapter/proxy/sherpa/service.gointernal/adapter/converter/unified_converter.go
internal/adapter/proxy/**/*_test.go
📄 CodeRabbit inference engine (CLAUDE.md)
Maintain a shared test suite that validates both Sherpa and Olla proxy engines
Files:
internal/adapter/proxy/proxy_path_stripping_test.gointernal/adapter/proxy/sherpa/service_retry_test.gointernal/adapter/proxy/proxy_test.gointernal/adapter/proxy/benchmark_comparison_test.gointernal/adapter/proxy/sherpa/service_leak_stress_test.gointernal/adapter/proxy/proxy_sherpa_test.gointernal/adapter/proxy/olla/service_leak_test.gointernal/adapter/proxy/benchmark_refactor_test.gointernal/adapter/proxy/sherpa/service_leak_test.gointernal/adapter/proxy/proxy_integration_test.gointernal/adapter/proxy/proxy_streaming_profiles_test.gointernal/adapter/proxy/proxy_olla_test.go
🧠 Learnings (2)
📚 Learning: 2025-08-21T11:42:29.496Z
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-21T11:42:29.496Z
Learning: Applies to internal/adapter/proxy/**/*_test.go : Maintain a shared test suite that validates both Sherpa and Olla proxy engines
Applied to files:
internal/adapter/proxy/proxy_path_stripping_test.gointernal/adapter/proxy/sherpa/service_streaming.gointernal/adapter/proxy/sherpa/service_retry_test.gointernal/adapter/proxy/proxy_test.gointernal/adapter/proxy/benchmark_comparison_test.gointernal/adapter/proxy/sherpa/service_leak_stress_test.gointernal/adapter/proxy/factory.gointernal/adapter/proxy/proxy_sherpa_test.gointernal/adapter/proxy/benchmark_refactor_test.gointernal/adapter/proxy/proxy_integration_test.gointernal/adapter/proxy/proxy_streaming_profiles_test.gointernal/adapter/proxy/proxy_olla_test.go
📚 Learning: 2025-08-21T11:42:29.496Z
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-21T11:42:29.496Z
Learning: Applies to internal/app/handlers/handler_proxy.go : Serve proxy routes under the /olla/ prefix
Applied to files:
internal/adapter/proxy/proxy_path_stripping_test.gointernal/adapter/proxy/sherpa/service_streaming.gointernal/adapter/proxy/proxy_sherpa_test.gointernal/adapter/proxy/proxy_olla_test.gointernal/adapter/proxy/olla/service.go
🧬 Code graph analysis (27)
internal/adapter/proxy/proxy_path_stripping_test.go (3)
internal/adapter/proxy/olla/config.go (1)
Configuration(8-10)internal/adapter/proxy/sherpa/config.go (1)
Configuration(9-11)internal/adapter/proxy/config.go (1)
Configuration(28-42)
internal/adapter/proxy/sherpa/service_streaming.go (2)
internal/adapter/proxy/config/unified.go (1)
DefaultReadTimeout(11-11)internal/adapter/proxy/config.go (1)
DefaultReadTimeout(14-14)
internal/adapter/proxy/sherpa/service_retry_test.go (1)
internal/adapter/proxy/sherpa/config.go (1)
Configuration(9-11)
internal/adapter/proxy/proxy_test.go (3)
internal/adapter/proxy/olla/config.go (1)
Configuration(8-10)internal/adapter/proxy/sherpa/config.go (1)
Configuration(9-11)internal/adapter/proxy/config.go (1)
Configuration(28-42)
internal/adapter/proxy/benchmark_comparison_test.go (2)
internal/adapter/proxy/olla/config.go (1)
Configuration(8-10)internal/adapter/proxy/sherpa/config.go (1)
Configuration(9-11)
internal/adapter/proxy/sherpa/service_leak_stress_test.go (1)
internal/adapter/proxy/sherpa/config.go (1)
Configuration(9-11)
internal/adapter/proxy/factory.go (6)
internal/adapter/proxy/olla/config.go (1)
Configuration(8-10)internal/adapter/proxy/sherpa/config.go (1)
Configuration(9-11)internal/adapter/proxy/config.go (1)
Configuration(28-42)internal/adapter/proxy/olla/service.go (1)
NewService(136-213)internal/adapter/proxy/sherpa/service.go (1)
NewService(70-123)internal/core/ports/proxy.go (2)
DiscoveryService(72-77)ProxyConfiguration(31-39)
internal/adapter/proxy/proxy_sherpa_test.go (3)
internal/adapter/proxy/olla/config.go (1)
Configuration(8-10)internal/adapter/proxy/sherpa/config.go (1)
Configuration(9-11)internal/adapter/proxy/config.go (1)
Configuration(28-42)
internal/adapter/converter/base_converter_test.go (2)
internal/adapter/converter/base_converter.go (1)
NewBaseConverter(13-17)internal/core/domain/unified_model.go (3)
UnifiedModel(15-31)AliasEntry(9-12)SourceEndpoint(34-44)
internal/adapter/proxy/olla/service_leak_test.go (1)
internal/adapter/proxy/olla/config.go (1)
Configuration(8-10)
internal/adapter/proxy/benchmark_refactor_test.go (4)
internal/adapter/proxy/olla/config.go (1)
Configuration(8-10)internal/adapter/proxy/sherpa/config.go (1)
Configuration(9-11)internal/adapter/proxy/olla/service.go (1)
NewService(136-213)internal/adapter/proxy/sherpa/service.go (1)
NewService(70-123)
internal/adapter/converter/lmstudio_converter.go (3)
internal/adapter/converter/base_converter.go (2)
BaseConverter(8-10)NewBaseConverter(13-17)internal/core/ports/model_converter.go (1)
ModelResponseConverter(10-15)internal/core/constants/providers.go (1)
ProviderPrefixLMStudio1(17-17)
internal/adapter/proxy/sherpa/config.go (1)
internal/adapter/proxy/config/unified.go (1)
BaseProxyConfig(43-51)
internal/adapter/proxy/sherpa/service_leak_test.go (1)
internal/adapter/proxy/sherpa/config.go (1)
Configuration(9-11)
internal/adapter/converter/vllm_converter_test.go (1)
internal/adapter/converter/vllm_converter.go (2)
NewVLLMConverter(24-28)VLLMConverter(19-21)
internal/adapter/proxy/proxy_integration_test.go (3)
internal/adapter/proxy/olla/config.go (1)
Configuration(8-10)internal/adapter/proxy/sherpa/config.go (1)
Configuration(9-11)internal/adapter/proxy/config.go (1)
Configuration(28-42)
internal/adapter/proxy/proxy_streaming_profiles_test.go (4)
internal/adapter/proxy/olla/config.go (1)
Configuration(8-10)internal/adapter/proxy/sherpa/config.go (1)
Configuration(9-11)internal/adapter/proxy/config.go (1)
Configuration(28-42)internal/core/constants/config.go (1)
ConfigurationProxyProfileAuto(4-4)
internal/adapter/proxy/proxy_olla_test.go (3)
internal/adapter/proxy/olla/config.go (1)
Configuration(8-10)internal/adapter/proxy/config.go (1)
Configuration(28-42)internal/adapter/proxy/olla/service.go (1)
NewService(136-213)
internal/adapter/proxy/olla/config.go (1)
internal/adapter/proxy/config/unified.go (1)
OllaConfig(117-124)
internal/adapter/converter/base_converter.go (1)
internal/core/domain/unified_model.go (2)
UnifiedModel(15-31)SourceEndpoint(34-44)
internal/adapter/converter/openai_converter.go (2)
internal/adapter/converter/base_converter.go (2)
BaseConverter(8-10)NewBaseConverter(13-17)internal/core/ports/model_converter.go (1)
ModelResponseConverter(10-15)
internal/adapter/proxy/config/unified.go (2)
pkg/format/format.go (1)
Duration(31-46)internal/core/constants/config.go (1)
ConfigurationProxyProfileAuto(4-4)
internal/adapter/converter/vllm_converter.go (3)
internal/adapter/converter/base_converter.go (2)
BaseConverter(8-10)NewBaseConverter(13-17)internal/core/ports/model_converter.go (1)
ModelResponseConverter(10-15)internal/core/constants/providers.go (1)
ProviderTypeVLLM(8-8)
internal/adapter/proxy/olla/service.go (3)
internal/adapter/proxy/config/unified.go (5)
OllaDefaultStreamBufferSize(17-17)OllaDefaultMaxIdleConns(18-18)OllaDefaultMaxConnsPerHost(19-19)OllaDefaultIdleConnTimeout(20-20)DefaultReadTimeout(11-11)internal/adapter/proxy/config.go (5)
OllaDefaultMaxIdleConns(23-23)OllaDefaultMaxConnsPerHost(24-24)OllaDefaultIdleConnTimeout(25-25)DefaultReadTimeout(14-14)Configuration(28-42)internal/adapter/proxy/olla/config.go (1)
Configuration(8-10)
internal/adapter/converter/ollama_converter.go (3)
internal/adapter/converter/base_converter.go (2)
BaseConverter(8-10)NewBaseConverter(13-17)internal/core/ports/model_converter.go (1)
ModelResponseConverter(10-15)internal/core/constants/providers.go (1)
ProviderTypeOllama(4-4)
internal/adapter/proxy/sherpa/service.go (2)
internal/adapter/proxy/sherpa/config.go (1)
Configuration(9-11)internal/adapter/proxy/config.go (1)
Configuration(28-42)
internal/adapter/converter/unified_converter.go (2)
internal/adapter/converter/base_converter.go (2)
BaseConverter(8-10)NewBaseConverter(13-17)internal/core/ports/model_converter.go (1)
ModelResponseConverter(10-15)
internal/adapter/proxy/factory.go
Outdated
| ollaConfig := &olla.Configuration{} | ||
| ollaConfig.ProxyPrefix = config.GetProxyPrefix() | ||
| ollaConfig.ConnectionTimeout = config.GetConnectionTimeout() | ||
| ollaConfig.ConnectionKeepAlive = config.GetConnectionKeepAlive() | ||
| ollaConfig.ResponseTimeout = config.GetResponseTimeout() | ||
| ollaConfig.ReadTimeout = config.GetReadTimeout() | ||
| ollaConfig.StreamBufferSize = config.GetStreamBufferSize() | ||
| ollaConfig.Profile = config.GetProxyProfile() | ||
| // Olla-specific fields | ||
| ollaConfig.MaxIdleConns = 200 | ||
| ollaConfig.IdleConnTimeout = 90 * time.Second | ||
| ollaConfig.MaxConnsPerHost = 50 |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Don’t hard-code Olla-specific pooling defaults in the factory
Hard-coding MaxIdleConns/IdleConnTimeout/MaxConnsPerHost here duplicates defaults already applied in olla.NewService and risks divergence if those defaults are tuned later. Either:
- leave these zero to let olla.NewService apply its canonical defaults, or
- propagate them only when supplied by a richer configuration that exposes Olla-specific getters.
Replace the hard-coded assignments:
- // Olla-specific fields
- ollaConfig.MaxIdleConns = 200
- ollaConfig.IdleConnTimeout = 90 * time.Second
- ollaConfig.MaxConnsPerHost = 50
+ // Olla-specific fields: keep zero to defer to service defaults, or map via optional getters if present.
+ if oc, ok := config.(interface {
+ GetMaxIdleConns() int
+ GetIdleConnTimeout() time.Duration
+ GetMaxConnsPerHost() int
+ }); ok {
+ ollaConfig.MaxIdleConns = oc.GetMaxIdleConns()
+ ollaConfig.IdleConnTimeout = oc.GetIdleConnTimeout()
+ ollaConfig.MaxConnsPerHost = oc.GetMaxConnsPerHost()
+ }Why: avoids config drift and keeps the factory focussed on translation rather than policy.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ollaConfig := &olla.Configuration{} | |
| ollaConfig.ProxyPrefix = config.GetProxyPrefix() | |
| ollaConfig.ConnectionTimeout = config.GetConnectionTimeout() | |
| ollaConfig.ConnectionKeepAlive = config.GetConnectionKeepAlive() | |
| ollaConfig.ResponseTimeout = config.GetResponseTimeout() | |
| ollaConfig.ReadTimeout = config.GetReadTimeout() | |
| ollaConfig.StreamBufferSize = config.GetStreamBufferSize() | |
| ollaConfig.Profile = config.GetProxyProfile() | |
| // Olla-specific fields | |
| ollaConfig.MaxIdleConns = 200 | |
| ollaConfig.IdleConnTimeout = 90 * time.Second | |
| ollaConfig.MaxConnsPerHost = 50 | |
| ollaConfig := &olla.Configuration{} | |
| ollaConfig.ProxyPrefix = config.GetProxyPrefix() | |
| ollaConfig.ConnectionTimeout = config.GetConnectionTimeout() | |
| ollaConfig.ConnectionKeepAlive = config.GetConnectionKeepAlive() | |
| ollaConfig.ResponseTimeout = config.GetResponseTimeout() | |
| ollaConfig.ReadTimeout = config.GetReadTimeout() | |
| ollaConfig.StreamBufferSize = config.GetStreamBufferSize() | |
| ollaConfig.Profile = config.GetProxyProfile() | |
| // Olla-specific fields: keep zero to defer to service defaults, or map via optional getters if present. | |
| if oc, ok := config.(interface { | |
| GetMaxIdleConns() int | |
| GetIdleConnTimeout() time.Duration | |
| GetMaxConnsPerHost() int | |
| }); ok { | |
| ollaConfig.MaxIdleConns = oc.GetMaxIdleConns() | |
| ollaConfig.IdleConnTimeout = oc.GetIdleConnTimeout() | |
| ollaConfig.MaxConnsPerHost = oc.GetMaxConnsPerHost() | |
| } |
🤖 Prompt for AI Agents
In internal/adapter/proxy/factory.go around lines 55 to 66, the factory
currently hard-codes Olla-specific pooling defaults (MaxIdleConns,
IdleConnTimeout, MaxConnsPerHost); remove those hard-coded assignments so the
values remain zeroed and let olla.NewService apply its canonical defaults, or
alternatively only assign them from a dedicated Olla-specific config getter if
such richer configuration is available; update the file to stop duplicating
policy-level defaults and rely on the Olla constructor or explicit config
propagation instead.
| newConfig := &Configuration{} | ||
| newConfig.ProxyPrefix = config.GetProxyPrefix() | ||
| newConfig.ConnectionTimeout = config.GetConnectionTimeout() | ||
| newConfig.ConnectionKeepAlive = config.GetConnectionKeepAlive() | ||
| newConfig.ResponseTimeout = config.GetResponseTimeout() | ||
| newConfig.ReadTimeout = config.GetReadTimeout() | ||
| newConfig.StreamBufferSize = config.GetStreamBufferSize() | ||
| newConfig.Profile = config.GetProxyProfile() | ||
|
|
There was a problem hiding this comment.
Config updates don’t take effect for new dials and buffer size—transport and pool not refreshed
NewService’s transport.DialContext closure captures the original configuration pointer. After UpdateConfig, s.configuration points to a new object, but the DialContext still reads timeouts from the stale pointer. Similarly, the buffer pool size depends on StreamBufferSize but isn’t reinitialised.
Impact: subsequent connections won’t respect updated connection/read/keepalive timeouts; streaming may continue with the old buffer size. This is subtle and hard to detect in tests.
Suggest updating the transport’s DialContext after swapping the config, and (optionally) rebuilding the buffer pool when the size changes.
Patch (minimal change inside UpdateConfig to refresh DialContext):
func (s *Service) UpdateConfig(config ports.ProxyConfiguration) {
newConfig := &Configuration{}
newConfig.ProxyPrefix = config.GetProxyPrefix()
newConfig.ConnectionTimeout = config.GetConnectionTimeout()
newConfig.ConnectionKeepAlive = config.GetConnectionKeepAlive()
newConfig.ResponseTimeout = config.GetResponseTimeout()
newConfig.ReadTimeout = config.GetReadTimeout()
newConfig.StreamBufferSize = config.GetStreamBufferSize()
newConfig.Profile = config.GetProxyProfile()
- s.configuration = newConfig
+ s.configuration = newConfig
+ if s.transport != nil {
+ s.transport.DialContext = func(ctx context.Context, network, addr string) (net.Conn, error) {
+ dialer := &net.Dialer{
+ Timeout: s.configuration.GetConnectionTimeout(),
+ KeepAlive: s.configuration.GetConnectionKeepAlive(),
+ }
+ conn, err := dialer.DialContext(ctx, network, addr)
+ if err != nil {
+ return nil, err
+ }
+ if tcpConn, ok := conn.(*net.TCPConn); ok {
+ if terr := tcpConn.SetNoDelay(DefaultSetNoDelay); terr != nil {
+ s.Logger.Warn("failed to set NoDelay", "err", terr)
+ }
+ }
+ return conn, nil
+ }
+ }
}Optional: rebuild the buffer pool if the size changed:
+ // Optional: rebuild buffer pool if size changed
+ if s.bufferPool != nil && newConfig.StreamBufferSize != 0 {
+ bufferPool, err := pool.NewLitePool(func() *[]byte {
+ buf := make([]byte, newConfig.StreamBufferSize)
+ return &buf
+ })
+ if err == nil {
+ s.bufferPool = bufferPool
+ } else {
+ s.Logger.Warn("failed to rebuild buffer pool", "err", err)
+ }
+ }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| newConfig := &Configuration{} | |
| newConfig.ProxyPrefix = config.GetProxyPrefix() | |
| newConfig.ConnectionTimeout = config.GetConnectionTimeout() | |
| newConfig.ConnectionKeepAlive = config.GetConnectionKeepAlive() | |
| newConfig.ResponseTimeout = config.GetResponseTimeout() | |
| newConfig.ReadTimeout = config.GetReadTimeout() | |
| newConfig.StreamBufferSize = config.GetStreamBufferSize() | |
| newConfig.Profile = config.GetProxyProfile() | |
| func (s *Service) UpdateConfig(config ports.ProxyConfiguration) { | |
| newConfig := &Configuration{} | |
| newConfig.ProxyPrefix = config.GetProxyPrefix() | |
| newConfig.ConnectionTimeout = config.GetConnectionTimeout() | |
| newConfig.ConnectionKeepAlive = config.GetConnectionKeepAlive() | |
| newConfig.ResponseTimeout = config.GetResponseTimeout() | |
| newConfig.ReadTimeout = config.GetReadTimeout() | |
| newConfig.StreamBufferSize = config.GetStreamBufferSize() | |
| newConfig.Profile = config.GetProxyProfile() | |
| s.configuration = newConfig | |
| if s.transport != nil { | |
| s.transport.DialContext = func(ctx context.Context, network, addr string) (net.Conn, error) { | |
| dialer := &net.Dialer{ | |
| Timeout: s.configuration.GetConnectionTimeout(), | |
| KeepAlive: s.configuration.GetConnectionKeepAlive(), | |
| } | |
| conn, err := dialer.DialContext(ctx, network, addr) | |
| if err != nil { | |
| return nil, err | |
| } | |
| if tcpConn, ok := conn.(*net.TCPConn); ok { | |
| if terr := tcpConn.SetNoDelay(DefaultSetNoDelay); terr != nil { | |
| s.Logger.Warn("failed to set NoDelay", "err", terr) | |
| } | |
| } | |
| return conn, nil | |
| } | |
| } | |
| } |
🤖 Prompt for AI Agents
internal/adapter/proxy/sherpa/service.go around lines 147 to 155: the
UpdateConfig implementation replaces s.configuration but the transport's
DialContext closure still closes over the old configuration pointer and the
buffer pool size is never reinitialized, so new timeouts and buffer sizes don't
take effect; to fix, after swapping s.configuration acquire the same lock used
for updates, recreate or update s.transport.DialContext so it reads
timeouts/keepalive from the updated s.configuration (or rebuild the transport
with the new values) and, if the new StreamBufferSize differs from the old,
reinitialize the buffer pool to the new size (ensuring any pool swap is done
under lock to avoid races).
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/adapter/proxy/olla/service.go (1)
215-244: Wrong transport field mapped: MaxConnsPerHost not setMaxConnsPerHost from config is being assigned to MaxIdleConnsPerHost, leaving MaxConnsPerHost at default (unbounded). This changes connection limiting semantics and can cause unexpected load patterns.
Apply this diff to map fields correctly (optionally keep the idle-per-host setting if intended):
return &http.Transport{ MaxIdleConns: config.MaxIdleConns, - MaxIdleConnsPerHost: config.MaxConnsPerHost, + MaxConnsPerHost: config.MaxConnsPerHost, + // Optional: keep per-host idle aligned with the connection cap + MaxIdleConnsPerHost: config.MaxConnsPerHost, IdleConnTimeout: config.IdleConnTimeout, TLSHandshakeTimeout: DefaultTLSHandshakeTimeout, DisableCompression: true, ForceAttemptHTTP2: true,
🧹 Nitpick comments (1)
internal/adapter/proxy/olla/service.go (1)
686-709: Also accept Olla-specific getters in UpdateConfigCurrently we only copy Olla-specific fields when the incoming config is exactly *olla.Configuration. Mirror the factory’s optional-getters path so unified configs carrying these values flow through.
Apply this diff:
// we try to get Olla-specific fields from incoming config if it's an *olla.Configuration - if ollaConfig, ok := config.(*Configuration); ok && ollaConfig != nil { + if oc, ok := config.(interface { + GetMaxIdleConns() int + GetIdleConnTimeout() time.Duration + GetMaxConnsPerHost() int + }); ok { + newConfig.MaxIdleConns = oc.GetMaxIdleConns() + newConfig.IdleConnTimeout = oc.GetIdleConnTimeout() + newConfig.MaxConnsPerHost = oc.GetMaxConnsPerHost() + } else if ollaConfig, ok := config.(*Configuration); ok && ollaConfig != nil { newConfig.MaxIdleConns = ollaConfig.MaxIdleConns newConfig.IdleConnTimeout = ollaConfig.IdleConnTimeout newConfig.MaxConnsPerHost = ollaConfig.MaxConnsPerHost } else {
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
internal/adapter/proxy/factory.go(1 hunks)internal/adapter/proxy/olla/service.go(4 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
internal/adapter/proxy/**/service.go
📄 CodeRabbit inference engine (CLAUDE.md)
Proxy services must include response headers:
X-Olla-Endpoint,X-Olla-Model,X-Olla-Backend-Type,X-Olla-Request-ID,X-Olla-Response-Time
Files:
internal/adapter/proxy/olla/service.go
🧠 Learnings (2)
📚 Learning: 2025-09-23T08:30:20.348Z
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-23T08:30:20.348Z
Learning: Applies to internal/adapter/proxy/**/service.go : Proxy services must include response headers: `X-Olla-Endpoint`, `X-Olla-Model`, `X-Olla-Backend-Type`, `X-Olla-Request-ID`, `X-Olla-Response-Time`
Applied to files:
internal/adapter/proxy/factory.gointernal/adapter/proxy/olla/service.go
📚 Learning: 2025-09-23T08:30:20.348Z
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-23T08:30:20.348Z
Learning: Applies to internal/app/handlers/handler_proxy.go : All proxied routes must use the `/olla/` URL prefix
Applied to files:
internal/adapter/proxy/olla/service.go
🧬 Code graph analysis (2)
internal/adapter/proxy/factory.go (4)
internal/adapter/proxy/olla/config.go (1)
Configuration(8-10)internal/adapter/proxy/config.go (1)
Configuration(28-42)internal/adapter/proxy/olla/service.go (1)
NewService(136-213)internal/core/ports/proxy.go (2)
DiscoveryService(72-77)ProxyConfiguration(31-39)
internal/adapter/proxy/olla/service.go (3)
internal/adapter/proxy/config/unified.go (5)
OllaDefaultStreamBufferSize(17-17)OllaDefaultMaxIdleConns(18-18)OllaDefaultMaxConnsPerHost(19-19)OllaDefaultIdleConnTimeout(20-20)DefaultReadTimeout(11-11)internal/adapter/proxy/config.go (5)
OllaDefaultMaxIdleConns(23-23)OllaDefaultMaxConnsPerHost(24-24)OllaDefaultIdleConnTimeout(25-25)DefaultReadTimeout(14-14)Configuration(28-42)internal/adapter/proxy/olla/config.go (1)
Configuration(8-10)
🔇 Additional comments (6)
internal/adapter/proxy/factory.go (3)
42-49: Sherpa config mapping looks goodField-by-field translation from ProxyConfiguration is clear and consistent.
55-63: Olla config mapping looks goodGeneric fields and Profile are correctly propagated.
64-77: Remove hard‑coded Olla pooling defaults; rely on service defaults or explicit gettersThis reintroduces divergent defaults (e.g. MaxIdleConns=200 here vs 100 in config constants) and duplicates policy the service already applies. Keep zero values so olla.NewService applies canonical defaults, or only set via the optional getters.
Apply this diff to drop the fallback block:
if ollaSpecific, ok := config.(interface { GetMaxIdleConns() int GetIdleConnTimeout() time.Duration GetMaxConnsPerHost() int }); ok { ollaConfig.MaxIdleConns = ollaSpecific.GetMaxIdleConns() ollaConfig.IdleConnTimeout = ollaSpecific.GetIdleConnTimeout() ollaConfig.MaxConnsPerHost = ollaSpecific.GetMaxConnsPerHost() - } else { - // fallback option with defaults - ollaConfig.MaxIdleConns = 200 - ollaConfig.IdleConnTimeout = 90 * time.Second - ollaConfig.MaxConnsPerHost = 50 }internal/adapter/proxy/olla/service.go (3)
43-44: Correct package importSwitching to shared proxy config source is appropriate.
146-159: Defaulting now aligns with centralised configGood move to source defaults from config constants rather than literals.
561-562: Confirmed: SetResponseHeaders sets all required X-Olla headersIt sets X-Olla-Request-ID, X-Olla-Response-Time, X-Olla-Endpoint, X-Olla-Backend-Type and X-Olla-Model. Conditions: Request-ID set when stats.RequestID != ""; Response-Time set when stats.StartTime is non-zero; Model set when stats.Model != ""; Endpoint and Backend-Type set when endpoint != nil.
This PR creates a unified proxy configuration structure to avoid duplication across proxies.
Summary by CodeRabbit
New Features
Refactor
Tests