refactor: Proxy Configurations by thushan · Pull Request #59 · thushan/olla

thushan · 2025-08-23T09:09:21Z

This PR creates a unified proxy configuration structure to avoid duplication across proxies.

Summary by CodeRabbit

New Features
- Consistent, provider-agnostic model conversion with improved type/state detection; ID, publisher, type and state derivation updated.
- Quantisation level now included in Ollama output.
- Unified proxy configuration with sensible defaults and profile support.
Refactor
- Converters (Ollama, LM Studio, OpenAI, vLLM, Unified) share a common conversion helper to centralise logic.
- Proxy implementations migrated to a centralised config layer for consistent defaults and behaviour.
Tests
- Extensive new and updated unit and integration tests for converters and proxies.

fix tests update path stripping tests hmm streaming settings seem to be invalid, fix with isolating for Olla swapped the wrong default settings update benchmark tests

coderabbitai · 2025-08-23T09:09:29Z

Walkthrough

Adds a provider-agnostic BaseConverter and ConversionHelper for model conversion, refactors converters to embed and use it, introduces a unified proxy config package with defaults and Olla/Sherpa embedding, and updates many tests to new config/constructor initialization patterns.

Changes

Cohort / File(s)	Summary
Converter core utilities `internal/adapter/converter/base_converter.go`, `internal/adapter/converter/base_converter_test.go`	New BaseConverter and ConversionHelper: alias/endpoint resolution, metadata extractors (string/int/bool), disk-size/state/type resolution, capability checks; comprehensive unit tests.
Converters refactor (embed BaseConverter) `internal/adapter/converter/lmstudio_converter.go`, `internal/adapter/converter/ollama_converter.go`, `internal/adapter/converter/openai_converter.go`, `internal/adapter/converter/unified_converter.go`, `internal/adapter/converter/vllm_converter.go`, `internal/adapter/converter/vllm_converter_test.go`	Converters now embed `BaseConverter`; conversion flows use NewConversionHelper and helper accessors (ShouldSkip, GetModelType/GetState/GetDiskSize/GetMetadata). Ollama adds quantization denormalisation. Tests updated to use constructors.
Unified proxy config package `internal/adapter/proxy/config/unified.go`	New ProxyConfig interface, BaseProxyConfig, Sherpa/Olla concrete config types, and default constants/getters for timeouts, buffer sizes and Olla pooling.
Proxy services/config embedding & wiring `internal/adapter/proxy/sherpa/config.go`, `internal/adapter/proxy/olla/config.go`, `internal/adapter/proxy/sherpa/service.go`, `internal/adapter/proxy/olla/service.go`, `internal/adapter/proxy/sherpa/service_streaming.go`, `internal/adapter/proxy/factory.go`	Replace per-impl fields with embedded shared configs, centralise defaults via config package, add Profile field propagation, change config construction to zero-value + field assignments, and adapt service factory to accept unified getters.
Proxy tests and benchmarks init style `internal/adapter/proxy/benchmark_comparison_test.go`, `internal/adapter/proxy/benchmark_refactor_test.go`, `internal/adapter/proxy/proxy_integration_test.go`, `internal/adapter/proxy/proxy_olla_test.go`, `internal/adapter/proxy/proxy_path_stripping_test.go`, `internal/adapter/proxy/proxy_sherpa_test.go`, `internal/adapter/proxy/proxy_streaming_profiles_test.go`, `internal/adapter/proxy/proxy_test.go`, `internal/adapter/proxy/sherpa/service_leak_test.go`, `internal/adapter/proxy/sherpa/service_leak_stress_test.go`, `internal/adapter/proxy/sherpa/service_retry_test.go`, `internal/adapter/proxy/olla/service_leak_test.go`	Replace composite-literal config construction with zero-value + field assignments across tests; preserve values and behaviour; minor test constructor changes.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Caller
  participant Conv as Converter (e.g., Ollama/LMStudio)
  participant Base as BaseConverter
  participant Help as ConversionHelper
  participant Model as UnifiedModel
  participant EP as SourceEndpoint

  Caller->>Conv: convertModel(Model)
  Conv->>Base: NewConversionHelper(Model)
  Base->>Help: construct helper (alias, endpoint)
  Help-->>Conv: helper
  alt No provider alias
    Conv->>Help: ShouldSkip()
    Help-->>Conv: true
    Conv-->>Caller: skip model
  else Alias found
    Conv->>Help: GetModelType/GetState/GetDiskSize
    Help->>EP: resolve endpoint-aware fallbacks
    Help-->>Conv: values
    Conv->>Help: GetMetadataString/Int/Bool
    Help-->>Conv: metadata values
    Conv-->>Caller: formatted model record
  end

sequenceDiagram
  autonumber
  actor App
  participant Factory as Proxy Factory
  participant CfgPkg as config (unified)
  participant SherpaCfg as sherpa.Configuration
  participant OllaCfg as olla.Configuration
  participant ServiceS as Sherpa Service
  participant ServiceO as Olla Service

  App->>Factory: Create Sherpa/Olla proxies
  Factory->>SherpaCfg: new Configuration{} + field assignments
  Factory->>OllaCfg: new Configuration{} + field assignments
  SherpaCfg->>CfgPkg: use BaseProxyConfig getters for defaults
  OllaCfg->>CfgPkg: use OllaConfig getters/overrides for defaults
  Factory->>ServiceS: NewService(Sherpa config)
  Factory->>ServiceO: NewService(Olla config)
  ServiceS-->>App: proxy instance
  ServiceO-->>App: proxy instance

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

chore: Consolidate Converters #58 — Adds the same BaseConverter and ConversionHelper and refactors converters; directly related at code level.
feat: proxy consolidation #34 — Unifies proxy configuration and updates Sherpa/Olla usage; overlaps with the proxy/config refactor here.
feat: API Redesign #33 — Related provider constants/prefix changes and API redesign that this converter refactor depends on.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The title "refactor: Proxy Configurations" succinctly and accurately describes the primary change in the PR, which centralises and refactors proxy configuration handling into a unified configuration API; it is concise, on-topic, and clear to reviewers scanning the history.
Docstring Coverage	✅ Passed	No functions found in the changes. Docstring coverage check skipped.

✨ Finishing touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch feature/proxy-configuration

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (5)

internal/adapter/converter/ollama_converter.go (1)
87-113: Bug: default branch does not uppercase unknown quantisation; also make lookup case‑insensitive

The comment says we uppercase by default, but the code returns the input unchanged. Additionally, if callers pass mixed‑case values, the map won’t match. Fix by normalising to lower case before lookup and uppercasing on fallback.

Apply this diff:
@@
-func denormalizeQuantization(quant string) string {
-	// Reverse the normalization mappings
+func denormalizeQuantization(quant string) string {
+	// Reverse the normalisation mappings (inputs may be mixed case)
+	quantLower := strings.ToLower(strings.TrimSpace(quant))
 	mappings := map[string]string{
@@
-	if denormalized, exists := mappings[quant]; exists {
+	if denormalized, exists := mappings[quantLower]; exists {
 		return denormalized
 	}
 
-	// Default: uppercase the quantization
-	return quant
+	// Default: uppercase the quantisation
+	return strings.ToUpper(quantLower)
 }
And add the missing import:
@@
-import (
-	"time"
+import (
+	"time"
+	"strings"
 )
Optional: lift the mappings map to a package‑level var to avoid repeated allocations on hot paths.
internal/adapter/proxy/sherpa/service_leak_stress_test.go (1)
103-127: Data race and loop‑variable capture bug in concurrent stress test

A single shared state is mutated by 100 goroutines — racy.

The loop variable i is captured by the goroutine, causing nondeterministic branching (classic Go gotcha).

Fix by using a per‑goroutine state and capturing i as a parameter.

Apply these diffs:
@@
-    state := &streamState{}
-    logger := createTestLogger()
+    logger := createTestLogger()
@@
-  for i := 0; i < numConcurrent; i++ {
+  for i := 0; i < numConcurrent; i++ {
     wg.Add(1)
-    go func() {
+    go func(i int) {
       defer wg.Done()
       ctx := context.Background()
+      // Avoid shared mutable state across goroutines
+      localState := &streamState{}
@@
-      if i%2 == 0 {
+      if i%2 == 0 {
         reader := &delayedReader{delay: 100 * time.Millisecond}
         localBuffer := make([]byte, 1024)
-        s.performTimedRead(ctx, reader, localBuffer, 50*time.Millisecond, state, logger)
+        s.performTimedRead(ctx, reader, localBuffer, 50*time.Millisecond, localState, logger)
       } else {
         reader := strings.NewReader("quick data")
         localBuffer := make([]byte, 1024)
-        s.performTimedRead(ctx, reader, localBuffer, 50*time.Millisecond, state, logger)
+        s.performTimedRead(ctx, reader, localBuffer, 50*time.Millisecond, localState, logger)
       }
-    }()
+    }(i)
   }
internal/adapter/proxy/olla/service_leak_test.go (1)
72-83: Fixed sleeps make cleanup assertions flaky; poll with timeout instead

The ticker and GC scheduling can vary under CI load. Replace fixed sleeps with a bounded wait loop to avoid intermittent failures.
-// Wait for cleanup to run
-time.Sleep(200 * time.Millisecond)
-
-// Verify pools were cleaned up
-poolCount = 0
-s.endpointPools.Range(func(k string, v *connectionPool) bool {
-	poolCount++
-	return true
-})
-if poolCount != 0 {
-	t.Errorf("Expected 0 pools after cleanup, got %d", poolCount)
-}
+deadline := time.Now().Add(2 * time.Second)
+for {
+	poolCount = 0
+	s.endpointPools.Range(func(k string, v *connectionPool) bool {
+		poolCount++
+		return true
+	})
+	if poolCount == 0 || time.Now().After(deadline) {
+		break
+	}
+	time.Sleep(20 * time.Millisecond)
+}
+if poolCount != 0 {
+	t.Errorf("Expected 0 pools after cleanup, got %d", poolCount)
+}
internal/adapter/proxy/proxy_integration_test.go (1)
288-289: Use a valid but unreachable endpoint instead of an invalid port

"http://localhost:99999" has an invalid port and can yield parsing errors rather than the intended connection-refused path, making the error assertions brittle. Use a valid, closed port to deterministically exercise dial failures.

Apply:
- endpoint = createTestEndpoint("test", "http://localhost:99999", domain.StatusHealthy)
+ // Use a valid but closed port to force a connection refusal deterministically.
+ endpoint = createTestEndpoint("test", "http://127.0.0.1:1", domain.StatusHealthy)
internal/adapter/proxy/olla/service.go (1)
686-701: Bug: UpdateConfig discards caller-supplied Olla pooling settings; also minor naming shadowing

UpdateConfig currently preserves MaxIdleConns, IdleConnTimeout, and MaxConnsPerHost from the old config, ignoring any updated values provided by the caller. The shared test builds a new *olla.Configuration with different values (e.g. MaxIdleConns=400), which won’t take effect. This makes live reconfiguration ineffective.

Also, parameter name config shadows the imported config package elsewhere in the file. Renaming to cfg avoids mental overhead.

Fix: if the incoming ports.ProxyConfiguration is actually *olla.Configuration, copy Olla-specific fields; otherwise, keep the current preserve-old behaviour.
-func (s *Service) UpdateConfig(config ports.ProxyConfiguration) {
-  newConfig := &Configuration{}
-  newConfig.ProxyPrefix = config.GetProxyPrefix()
-  newConfig.ConnectionTimeout = config.GetConnectionTimeout()
-  newConfig.ConnectionKeepAlive = config.GetConnectionKeepAlive()
-  newConfig.ResponseTimeout = config.GetResponseTimeout()
-  newConfig.ReadTimeout = config.GetReadTimeout()
-  newConfig.StreamBufferSize = config.GetStreamBufferSize()
-  newConfig.Profile = config.GetProxyProfile()
-  // Preserve Olla-specific settings
-  newConfig.MaxIdleConns = s.configuration.MaxIdleConns
-  newConfig.IdleConnTimeout = s.configuration.IdleConnTimeout
-  newConfig.MaxConnsPerHost = s.configuration.MaxConnsPerHost
+func (s *Service) UpdateConfig(cfg ports.ProxyConfiguration) {
+  newConfig := &Configuration{}
+  newConfig.ProxyPrefix = cfg.GetProxyPrefix()
+  newConfig.ConnectionTimeout = cfg.GetConnectionTimeout()
+  newConfig.ConnectionKeepAlive = cfg.GetConnectionKeepAlive()
+  newConfig.ResponseTimeout = cfg.GetResponseTimeout()
+  newConfig.ReadTimeout = cfg.GetReadTimeout()
+  newConfig.StreamBufferSize = cfg.GetStreamBufferSize()
+  newConfig.Profile = cfg.GetProxyProfile()
+  // Apply Olla-specific settings if provided by caller; otherwise preserve old values
+  if incoming, ok := cfg.(*Configuration); ok && incoming != nil {
+    if incoming.MaxIdleConns != 0 {
+      newConfig.MaxIdleConns = incoming.MaxIdleConns
+    } else {
+      newConfig.MaxIdleConns = s.configuration.MaxIdleConns
+    }
+    if incoming.IdleConnTimeout != 0 {
+      newConfig.IdleConnTimeout = incoming.IdleConnTimeout
+    } else {
+      newConfig.IdleConnTimeout = s.configuration.IdleConnTimeout
+    }
+    if incoming.MaxConnsPerHost != 0 {
+      newConfig.MaxConnsPerHost = incoming.MaxConnsPerHost
+    } else {
+      newConfig.MaxConnsPerHost = s.configuration.MaxConnsPerHost
+    }
+  } else {
+    newConfig.MaxIdleConns = s.configuration.MaxIdleConns
+    newConfig.IdleConnTimeout = s.configuration.IdleConnTimeout
+    newConfig.MaxConnsPerHost = s.configuration.MaxConnsPerHost
+  }
 
   // Update configuration atomically
   s.configuration = newConfig
 }
Optional (follow-up): existing per-endpoint connectionPool.transport instances won’t reflect updated pooling limits until they’re recreated. You could force-refresh by closing pools on config change to lazily rebuild with new settings, or update fields in-place where safe (e.g. MaxIdleConnsPerHost). Happy to sketch a scoped approach if you want to enable live updates.

🧹 Nitpick comments (40)

internal/adapter/converter/lmstudio_converter.go (1)
68-75: Normalise model type for LM Studio responses

helper.GetModelType("llm") may return metadata-driven values like "chat" or "completion". If LM Studio expects coarse-grained types (e.g., "llm", "vlm", "embeddings"), returning "chat" could be incompatible.

Recommend normalising here to ensure predictable output for LM Studio clients.

Would you confirm LM Studio’s accepted values for the type field? If it only accepts "llm", "vlm", "embeddings", consider this minimal normalisation:
-	modelType := helper.GetModelType("llm")
+	modelType := helper.GetModelType("llm")
+	// Normalise finer-grained types to LM Studio's expected categories.
+	if modelType == "chat" || modelType == "completion" {
+		modelType = "llm"
+	}
+	if modelType == "embedding" {
+		modelType = "embeddings"
+	}
internal/adapter/converter/base_converter.go (4)
7-17: Doc comments should explain “why”, and use Australian English

The implementation is fine. Per guidelines, prefer comments that justify decisions rather than restate what the code already shows. A small wording tweak improves clarity.

Apply:
-// BaseConverter provides common functionality for all converters
+// BaseConverter centralises provider‑agnostic logic so individual converters stay thin and consistent,
+// reducing duplication and the risk of divergent behaviour.
 
-// NewBaseConverter creates a new base converter
+// NewBaseConverter records the provider type so helper lookups are properly provider‑scoped.
29-38: Endpoint lookup by NativeName is reasonable; minor note

Matching providerName to NativeName mirrors how aliases map to platform names. If some providers use EndpointName in routing, consider a future extension to check both. No change required now.

40-57: Metadata extractors: consider json.Number and string numerics

Current int extraction covers int and float64. Some JSON paths may use json.Number or numeric strings depending on deserialisation options. Supporting these would make the helpers more robust.
+import (
+	"github.com/thushan/olla/internal/core/domain"
+	"encoding/json"
+)
@@
 func (b *BaseConverter) ExtractMetadataInt(metadata map[string]interface{}, key string) int {
 	if val, ok := metadata[key].(int); ok {
 		return val
 	}
 	if val, ok := metadata[key].(float64); ok {
 		return int(val)
 	}
+	if val, ok := metadata[key].(json.Number); ok {
+		if i64, err := val.Int64(); err == nil {
+			return int(i64)
+		}
+	}
+	if s, ok := metadata[key].(string); ok {
+		if f, err := strconv.ParseFloat(s, 64); err == nil {
+			return int(f)
+		}
+	}
 	return 0
 }
Note: this adds encoding/json and strconv imports.

88-105: Type determination: verify metadata semantics across providers

Returning metadata["type"] verbatim (e.g., "chat") gives flexibility, but downstream formatters (e.g., LM Studio) may need canonical categories. Consider centralising a normalisation step here, but be aware it will affect tests asserting "chat".

Would you like a NormaliseModelType() helper that maps:

"chat", "completion", "llm" → "llm"

"vision", "vlm" → "vlm"

"embedding", "embeddings" → "embeddings"

If so, I can propose a patch and update affected tests.
internal/adapter/converter/openai_converter.go (2)
54-62: Prefer provider‑specific alias over “first alias”; fix typo

Using the first alias is pragmatic but can be ambiguous. Since this converter now embeds BaseConverter, prefer its alias resolution first, then fall back to the first alias for backwards compatibility. Also fix “alas” → “alias”.
 func (c *OpenAIConverter) convertModel(model *domain.UnifiedModel) OpenAIModelData {
-	// OLLA-85: [Unification] Models with different digests fail to unify correctly.
-	// we need to use first alas as ID for routing compatibility
-	// to make sure the returned model ID can be used for requests
+	// OLLA-85: [Unification] Models with different digests fail to unify correctly.
+	// Use the provider‑specific alias when present to avoid routing ambiguity;
+	// fall back to the first alias to preserve historical compatibility.
 	modelID := model.ID
-	if len(model.Aliases) > 0 {
-		modelID = model.Aliases[0].Name
-	}
+	if c.BaseConverter != nil {
+		if helper := c.BaseConverter.NewConversionHelper(model); helper != nil && helper.Alias != "" {
+			modelID = helper.Alias
+		} else if len(model.Aliases) > 0 {
+			modelID = model.Aliases[0].Name
+		}
+	} else if len(model.Aliases) > 0 {
+		modelID = model.Aliases[0].Name
+	}
63-69: Created timestamp source: consider stable time for deterministic outputs

Using time.Now().Unix() makes responses time‑variant, which can complicate caching and test determinism. If model.LastSeen is meaningful, prefer that when set, falling back to time.Now().
-	return OpenAIModelData{
-		ID:      modelID,
-		Object:  "model",
-		Created: time.Now().Unix(),
-		OwnedBy: "olla",
-	}
+	created := time.Now().Unix()
+	if !model.LastSeen.IsZero() {
+		created = model.LastSeen.Unix()
+	}
+	return OpenAIModelData{
+		ID:      modelID,
+		Object:  "model",
+		Created: created,
+		OwnedBy: "olla",
+	}
internal/adapter/converter/unified_converter.go (2)
82-85: Fix typo in comment: “alias”

Minor spelling issue.
-// we need to use first alas as ID for routing compatibility
+// we need to use the first alias as ID for routing compatibility
183-194: Use the embedded BaseConverter instead of creating a new one

A new base with empty provider type is unnecessary and slightly obscures intent. Using c.BaseConverter keeps behaviour consistent and future‑proof.
-	// Use base converter to determine model type
-	base := NewBaseConverter("")
-	modelType := base.DetermineModelType(model, "")
+	// Use the embedded base converter to determine model type (provider‑agnostic here)
+	modelType := c.BaseConverter.DetermineModelType(model, "")
internal/adapter/converter/base_converter_test.go (2)

149-182: Metadata extractor tests are solid; consider adding json.Number case

Great coverage. If you adopt json.Number/string support in the extractor, add corresponding tests.

266-334: Model type determination tests are valuable; watch for future normalisation

If you later normalise metadata types (e.g., mapping "chat" to "llm"), update these expectations accordingly. For now, behaviour is consistent with the current design.
internal/adapter/proxy/sherpa/service_streaming.go (1)
80-86: Treat non‑positive read timeouts as “use default”

Defensive tweak: if a negative duration slips through config, we should still fall back to the default. Micro change, clearer semantics.

Apply this diff:
 func (s *Service) getReadTimeout() time.Duration {
   readTimeout := s.configuration.GetReadTimeout()
-  if readTimeout == 0 {
+  if readTimeout <= 0 {
     readTimeout = config.DefaultReadTimeout
   }
   return readTimeout
 }
internal/adapter/proxy/sherpa/service_leak_test.go (1)

150-220: Unify leak tests across Sherpa and Olla engines

Per the repository guideline to maintain a shared proxy test suite running against both Sherpa and Olla, consider table‑driving these leak tests to execute for both engines behind a common interface. This ensures parity and prevents regressions diverging between engines.

I can sketch a small test harness that instantiates each engine and runs the same scenarios if helpful.
internal/adapter/proxy/sherpa/service_retry_test.go (2)
81-167: Reduce duplication by extracting a small test config helper

The three tests duplicate identical configuration blocks. A tiny helper reduces repetition and keeps changes local if defaults move again.

You can add this helper (outside the changed hunks) and use it in all three tests:
// test helper (near top of file)
func newTestConfig() *Configuration {
	c := &Configuration{}
	c.ProxyPrefix = "/olla/"
	c.ConnectionTimeout = time.Second
	c.ConnectionKeepAlive = time.Second
	c.ResponseTimeout = time.Second
	c.ReadTimeout = time.Second
	c.StreamBufferSize = 8192
	return c
}
Then replace each block with:
config := newTestConfig()
Also applies to: 220-295, 296-387

81-167: Consider executing retry tests against both engines

To uphold the shared‑suite guideline, table‑drive these tests to run with Sherpa and Olla services where feasible (construct via a common factory or interface). This helps catch engine‑specific regressions early.
internal/adapter/proxy/proxy_sherpa_test.go (5)
146-148: Avoid incidental flakiness in disconnect test by relaxing read timeout

A 200ms ReadTimeout is close to the test’s 100ms chunk cadence and 250ms cancel, introducing a second failure mode (read timeout vs client cancel). Bump it so cancellation is the primary trigger.
-				config.ReadTimeout = 200 * time.Millisecond
+				config.ReadTimeout = 2 * time.Second
158-160: Mirror timeout relaxation for Olla case

Same rationale as for Sherpa to keep the test deterministic.
-				config.ReadTimeout = 200 * time.Millisecond
+				config.ReadTimeout = 2 * time.Second
261-263: Buffer pooling test asserts only through side‑effects

The test relies on “no failures across multiple requests” as a proxy for pooling correctness. Consider minimally asserting response sizes or using b.ReportAllocs in a benchmark variant to validate that allocations remain stable with the chosen StreamBufferSize.

163-166: Use the canonical context key in proxy tests

The dynamic tc.proxyPrefix and tc.proxyConfigPrefix keys injected in tests aren’t read by the proxy implementations (which exclusively use constants.ContextRoutePrefixKey). To keep tests aligned with production behaviour and avoid brittle overrides, update or remove these uses.

• internal/adapter/proxy/proxy_path_stripping_test.go:164
• internal/adapter/proxy/proxy_path_stripping_test.go:321

Suggested diff:
-   // dynamic key that proxy impls ignore
-   ctx := context.WithValue(req.Context(), tc.proxyPrefix, tc.contextPrefix)
+   // use the stable key recognised by the proxy implementation
+   ctx := context.WithValue(req.Context(), constants.ContextRoutePrefixKey, tc.contextPrefix)

-   // dynamic key that proxy impls ignore
-   ctx := context.WithValue(req.Context(), tc.proxyConfigPrefix, tc.handlerPrefix)
+   ctx := context.WithValue(req.Context(), constants.ContextRoutePrefixKey, tc.handlerPrefix)
27-30: Extract a shared Sherpa test config builder to reduce duplication

The explicit field assignments for ResponseTimeout, ReadTimeout and StreamBufferSize recur throughout your Sherpa proxy tests (e.g. proxy_sherpa_test.go:27–30, as well as in service_retry_test.go, service_leak_test.go, etc.). Defining a small helper function will localise these defaults, keep tests DRY, and make future changes to the defaults impossible to miss.

It’s safe to introduce a builder because the Sherpa service always uses the getters—and falls back to its own defaults only when you haven’t explicitly set a field. For example:

In service_streaming.go, getReadTimeout() calls s.configuration.GetReadTimeout() and only if that yields zero does it use config.DefaultReadTimeout.

In service.go, both the buffer and cloned configs are initialised via configuration.GetStreamBufferSize() and config.GetReadTimeout()/GetResponseTimeout().

Implementing a test-only helper won’t affect production behaviour, and will clean up over a dozen lines of boilerplate across your test suite.

Suggested diff:
+// newSherpaTestConfig returns a Configuration pre-populated with the
+// proxy tests’ standard timeouts and buffer size, allowing optional
+// overrides.
+func newSherpaTestConfig(overrides func(*sherpa.Configuration)) *sherpa.Configuration {
+    c := &sherpa.Configuration{
+        ResponseTimeout: 30 * time.Second,
+        ReadTimeout:     10 * time.Second,
+        StreamBufferSize: 1024,
+    }
+    if overrides != nil {
+        overrides(c)
+    }
+    return c
+}
Then, in your tests:
-   config := &sherpa.Configuration{}
-   config.ResponseTimeout = 30 * time.Second
-   config.ReadTimeout = 10 * time.Second
-   config.StreamBufferSize = 1024
+   config := newSherpaTestConfig(nil)
This keeps all defaults in one place, and any test needing non-standard values can simply pass an override function.
internal/adapter/converter/vllm_converter_test.go (1)

210-210: Decouple tests from unexported VLLMConverter helpers

Your tests in internal/adapter/converter/vllm_converter_test.go directly cast the result of NewVLLMConverter() to *VLLMConverter and invoke two unexported methods:

Line ≈210 — converter.determineOwner(…)

Line ≈248 — converter.findVLLMNativeName(…)

This tight coupling makes the tests fragile to any future renames, embedding changes or constructor tweaks. Consider one of the following lightweight approaches to decouple:

• Export thin wrappers on BaseConverter
– e.g. add func (b *BaseConverter) DetermineOwnerFromID(id string) string and
func (b *BaseConverter) FindProviderAliasName(model *domain.UnifiedModel) (string, bool)
– Then in tests call converter := NewVLLMConverter(); converter.DetermineOwnerFromID(...) without a type assertion.

• Introduce a package-private test shim in a _test.go file
– e.g. define func DetermineOwner(c ports.ModelResponseConverter, id string) string that performs the cast internally
– Keeps your tests free of repeated .(*VLLMConverter) assertions.

• Or simply verify equivalent behaviour via the exported ConvertToFormat surface
– Assert that the OwnedBy field of the resulting VLLMModelResponse matches expectations, covering both helper functions indirectly.

These changes will make your suite more maintainable and resilient to refactors.
internal/adapter/proxy/proxy_path_stripping_test.go (4)
140-147: Config construction: suggest a tiny factory to avoid repetition

You set ProxyPrefix in multiple places; a helper reduces duplication and makes intent clearer.
-						sherpaConfig := &sherpa.Configuration{}
-						sherpaConfig.ProxyPrefix = tc.proxyPrefix
+						sherpaConfig := &sherpa.Configuration{ }
+						sherpaConfig.ProxyPrefix = tc.proxyPrefix
 						config = sherpaConfig
Optionally:
+func newSherpaConfigWithPrefix(prefix string) *sherpa.Configuration {
+	c := &sherpa.Configuration{}
+	c.ProxyPrefix = prefix
+	return c
+}
216-218: Same for Sherpa in double‑strip prevention test

Apply the same small factory for consistency and readability.
-	sherpaConfig := &sherpa.Configuration{}
-	sherpaConfig.ProxyPrefix = "/olla"
+	sherpaConfig := newSherpaConfigWithPrefix("/olla")
305-307: Prefer helper for Olla configuration as well

Keeps both implementations aligned and reduces test boilerplate.
-			config := &olla.Configuration{}
-			config.ProxyPrefix = tc.proxyConfigPrefix
+			config := &olla.Configuration{}
+			config.ProxyPrefix = tc.proxyConfigPrefix
Or via helper:
+func newOllaConfigWithPrefix(prefix string) *olla.Configuration {
+	c := &olla.Configuration{}
+	c.ProxyPrefix = prefix
+	return c
+}
163-166: Use the stable context key constant in proxy_path_stripping_test.go

The test at lines 163–166 currently injects the route prefix into the request context using tc.proxyPrefix as the key, but:

All production code and other proxy tests (e.g. proxy_routing_test.go, config.go) consistently use constants.ContextRoutePrefixKey as the context key.

Mixing in tc.proxyPrefix risks the test silently passing without actually exercising any real lookup in the proxy implementation.

To align this test with the rest of the codebase, either:

Update the block to use the constant key, or

Remove it entirely to reflect that proxy implementations defensively ignore context-based overrides.

Suggested optional refactor:
-                   if tc.contextPrefix != "" {
-                       ctx := context.WithValue(req.Context(), tc.proxyPrefix, tc.contextPrefix)
-                       req = req.WithContext(ctx)
-                   }
+                   // Context-based route‐prefix overrides are ignored by proxy implementations.
+                   // To test/document this, you could inject it with the stable key:
+                   // ctx := context.WithValue(req.Context(), constants.ContextRoutePrefixKey, tc.contextPrefix)
+                   // req = req.WithContext(ctx)
internal/adapter/proxy/benchmark_comparison_test.go (1)
58-61: Reduce duplication with small config builder helpers

These two setup blocks are identical to patterns repeated across the suite. Consider local helpers to keep benchmarks focused and reduce future drift when defaults evolve.

Example replacement within the changed ranges:
-	config := &sherpa.Configuration{}
-	config.ResponseTimeout = 30 * time.Second
-	config.ReadTimeout = 10 * time.Second
-	config.StreamBufferSize = 8192
+	config := newSherpaCfg(30*time.Second, 10*time.Second, 8192)
-	config := &olla.Configuration{}
-	config.ResponseTimeout = 30 * time.Second
-	config.ReadTimeout = 10 * time.Second
-	config.StreamBufferSize = 8192
-	config.MaxIdleConns = 200
-	config.IdleConnTimeout = 90 * time.Second
-	config.MaxConnsPerHost = 50
+	config := newOllaCfg(30*time.Second, 10*time.Second, 8192, 200, 90*time.Second, 50)
Add these helpers near the top of the file:
func newSherpaCfg(resp, read time.Duration, buf int) *sherpa.Configuration {
	c := &sherpa.Configuration{}
	c.ResponseTimeout = resp
	c.ReadTimeout = read
	c.StreamBufferSize = buf
	return c
}
func newOllaCfg(resp, read time.Duration, buf int, maxIdle int, idleTO time.Duration, maxPerHost int) *olla.Configuration {
	c := &olla.Configuration{}
	c.ResponseTimeout = resp
	c.ReadTimeout = read
	c.StreamBufferSize = buf
	c.MaxIdleConns = maxIdle
	c.IdleConnTimeout = idleTO
	c.MaxConnsPerHost = maxPerHost
	return c
}
Also applies to: 71-77
internal/adapter/proxy/proxy_streaming_profiles_test.go (2)
222-226: DRY up repeated config initialisation across suites

The Sherpa/Olla config assembly is copy-pasted six times here. Factor into suite-aware helpers (e.g., makeConfig(profile string)) to improve readability and keep behaviour aligned if defaults change.

Illustrative replacement within the changed blocks:
- config := &sherpa.Configuration{}
- config.Profile = tt.profile
- config.ResponseTimeout = 5 * time.Second
- config.ReadTimeout = 5 * time.Second
- config.StreamBufferSize = 8192
+ config := newSherpaCfg(tt.profile, 5*time.Second, 5*time.Second, 8192)
- config := &olla.Configuration{}
- config.Profile = tt.profile
- config.ResponseTimeout = 5 * time.Second
- config.ReadTimeout = 5 * time.Second
- config.StreamBufferSize = 8192
- config.MaxIdleConns = 10
- config.IdleConnTimeout = 90 * time.Second
- config.MaxConnsPerHost = 5
+ config := newOllaCfg(tt.profile, 5*time.Second, 5*time.Second, 8192, 10, 90*time.Second, 5)
Add minimal helpers once in this file:
func newSherpaCfg(profile string, resp, read time.Duration, buf int) *sherpa.Configuration {
	c := &sherpa.Configuration{}
	c.Profile, c.ResponseTimeout, c.ReadTimeout, c.StreamBufferSize = profile, resp, read, buf
	return c
}
func newOllaCfg(profile string, resp, read time.Duration, buf, maxIdle, maxPerHost int, idleTO time.Duration) *olla.Configuration {
	c := &olla.Configuration{}
	c.Profile, c.ResponseTimeout, c.ReadTimeout, c.StreamBufferSize = profile, resp, read, buf
	c.MaxIdleConns, c.MaxConnsPerHost, c.IdleConnTimeout = maxIdle, maxPerHost, idleTO
	return c
}
Also applies to: 229-236, 311-316, 318-326, 510-515, 517-525

192-275: Parallelise subtests carefully or leave as-is to avoid loop variable capture traps

If you plan to add t.Parallel() to these subtests, copy the loop variable (tt := tt) inside the loop before t.Run; otherwise parallel execution will race on tt. As written (serial), it’s fine.
internal/adapter/proxy/proxy_olla_test.go (2)
24-31: Consolidate Olla configuration setup and assert UpdateConfig semantics

Config initialisation is repeated across the file; introduce a small helper to create a tuned Olla config for tests to reduce duplication and drift.

In TestOllaProxyService_UpdateConfig, newConfig does not set pooling fields; if UpdateConfig replaces the whole struct, these may be reset to zero and then defaulted. Either set them explicitly in newConfig or assert that previous values are preserved by the update path.

Minimal changes within the changed ranges:
- config := &olla.Configuration{}
- config.ResponseTimeout = 30 * time.Second
- config.ReadTimeout = 10 * time.Second
- config.StreamBufferSize = 1024
- config.MaxIdleConns = 200
- config.IdleConnTimeout = 90 * time.Second
- config.MaxConnsPerHost = 50
+ config := newOllaTestCfg(30*time.Second, 10*time.Second, 1024, 200, 90*time.Second, 50)
And in UpdateConfig:
- newConfig := &olla.Configuration{}
- newConfig.ResponseTimeout = 20 * time.Second
- newConfig.ReadTimeout = 10 * time.Second
- newConfig.StreamBufferSize = 8192
+ newConfig := &olla.Configuration{}
+ newConfig.ResponseTimeout = 20 * time.Second
+ newConfig.ReadTimeout = 10 * time.Second
+ newConfig.StreamBufferSize = 8192
+ // Preserve pooling where not under test to avoid accidental regression from zeroing.
+ newConfig.MaxIdleConns = initialConfig.MaxIdleConns
+ newConfig.IdleConnTimeout = initialConfig.IdleConnTimeout
+ newConfig.MaxConnsPerHost = initialConfig.MaxConnsPerHost
Add once in this file:
func newOllaTestCfg(resp, read time.Duration, buf, maxIdle int, idleTO time.Duration, maxPerHost int) *olla.Configuration {
	c := &olla.Configuration{}
	c.ResponseTimeout, c.ReadTimeout, c.StreamBufferSize = resp, read, buf
	c.MaxIdleConns, c.IdleConnTimeout, c.MaxConnsPerHost = maxIdle, idleTO, maxPerHost
	return c
}
Also applies to: 51-57, 117-123, 199-205, 257-263, 478-485, 493-496, 529-535

449-469: Avoid magic numbers for defaults; assert against config constants

In TestOllaProxyService_ConfigDefaults you’re asserting specific numeric defaults (e.g., 64*1024, 90s). Prefer asserting against the actual default constants to keep tests resilient to tuning changes.

Example:
import proxycfg "github.com/thushan/olla/internal/adapter/proxy/config"

if config.StreamBufferSize != proxycfg.OllaDefaultStreamBufferSize { ... }
if config.MaxConnsPerHost != proxycfg.OllaDefaultMaxConnsPerHost { ... }
if config.IdleConnTimeout != proxycfg.OllaDefaultIdleConnTimeout { ... }
internal/adapter/proxy/proxy_integration_test.go (2)
174-178: Config initialisation repetition; introduce shared builders to keep parity tests tidy

Each test case re-assembles Sherpa and Olla configs with the same fields. Consolidate into small builder helpers to avoid drift and emphasise parity logic.

Illustrative within changed ranges:
- sherpaConfig := &sherpa.Configuration{}
- sherpaConfig.ResponseTimeout = 30 * time.Second
- sherpaConfig.ReadTimeout = 10 * time.Second
- sherpaConfig.StreamBufferSize = 8192
+ sherpaConfig := newSherpaCfg(30*time.Second, 10*time.Second, 8192)

- ollaConfig := &olla.Configuration{}
- ollaConfig.ResponseTimeout = 30 * time.Second
- ollaConfig.ReadTimeout = 10 * time.Second
- ollaConfig.StreamBufferSize = 8192
- ollaConfig.MaxIdleConns = 200
- ollaConfig.IdleConnTimeout = 90 * time.Second
- ollaConfig.MaxConnsPerHost = 50
+ ollaConfig := newOllaCfg(30*time.Second, 10*time.Second, 8192, 200, 90*time.Second, 50)
Add once in this file:
func newSherpaCfg(resp, read time.Duration, buf int) *sherpa.Configuration {
	c := &sherpa.Configuration{}
	c.ResponseTimeout, c.ReadTimeout, c.StreamBufferSize = resp, read, buf
	return c
}
func newOllaCfg(resp, read time.Duration, buf, maxIdle int, idleTO time.Duration, maxPerHost int) *olla.Configuration {
	c := &olla.Configuration{}
	c.ResponseTimeout, c.ReadTimeout, c.StreamBufferSize = resp, read, buf
	c.MaxIdleConns, c.IdleConnTimeout, c.MaxConnsPerHost = maxIdle, idleTO, maxPerHost
	return c
}
Also applies to: 179-186, 201-205, 206-213, 242-246, 247-254, 417-423, 462-465, 589-594, 595-603

498-571: Performance test average latency maths is acceptable, but consider reporting p95

Dividing wall-clock by request count under concurrency gives a coarse average; adding a simple p95 sample (tracking per-request durations) would better capture tail behaviour without much complexity. Optional.
internal/adapter/proxy/olla/config.go (1)
4-5: LGTM: Embed OllaConfig; add compile-time interface assertion for safety

Embedding config.OllaConfig keeps the surface unified. Consider adding an interface assertion to catch drift at compile time.
 package olla

 import (
   "github.com/thushan/olla/internal/adapter/proxy/config"
+  "github.com/thushan/olla/internal/core/ports"
 )

 // Configuration holds Olla proxy settings
 type Configuration struct {
   config.OllaConfig
 }
+
+// Ensure Configuration satisfies the proxy configuration interface
+var _ ports.ProxyConfiguration = (*Configuration)(nil)
Also applies to: 7-10
internal/adapter/proxy/sherpa/config.go (1)
8-11: Prefer embedding config.SherpaConfig (not BaseProxyConfig) for intent + forward‑compatibility

Sherpa now has its own wrapper type in the shared package. Embedding that keeps semantics self‑documenting and protects you if Sherpa‑specific fields are added later.

Apply:
-// Configuration holds Sherpa proxy settings
+// Configuration holds Sherpa proxy settings. We embed config.SherpaConfig
+// (rather than the bare BaseProxyConfig) so future Sherpa‑specific fields
+// and defaults flow through without touching this package.
 type Configuration struct {
-	config.BaseProxyConfig
+	config.SherpaConfig
 }
internal/adapter/proxy/config/unified.go (6)
27-40: Clarify zero‑value semantics in the interface doc

Documenting how zero values are interpreted helps implementers avoid surprises and keeps behaviour consistent across proxies.

Apply:
-// ProxyConfig defines the interface for all proxy configurations
+// ProxyConfig defines the interface for all proxy configurations.
+// Zero‑value semantics:
+//  - Durations of 0 adopt package defaults unless otherwise noted.
+//  - ResponseTimeout of 0 means “no response timeout”.
+//  - StreamBufferSize of 0 adopts the proxy‑specific default.
 type ProxyConfig interface {
61-67: Avoid product‑specific defaults in Base; push “/olla/” into OllaConfig

Baking “/olla/” into the base ties all proxy implementations to Olla’s path. Prefer a neutral base default and override in OllaConfig.

Suggested changes:
 func (c *BaseProxyConfig) GetProxyPrefix() string {
   if c.ProxyPrefix == "" {
-    return "/olla/"
+    return "/"
   }
   return c.ProxyPrefix
 }
And add an Olla override:
+// GetProxyPrefix returns the proxy prefix, defaulting to "/olla/" for Olla.
+func (c *OllaConfig) GetProxyPrefix() string {
+  if c.ProxyPrefix == "" {
+    return "/olla/"
+  }
+  return c.ProxyPrefix
+}
53-59: Call out that implementations may override profile defaults

Make the override pattern explicit to avoid confusion when packages (like Sherpa) diverge from “auto”.

Apply:
-// GetProxyProfile returns the proxy profile, defaulting to "auto" if not set
+// GetProxyProfile returns the proxy profile, defaulting to "auto" if not set.
+// Implementations may override this to reflect typical usage (e.g. Sherpa
+// defaults to "streaming" to optimise interactive behaviour).
 func (c *BaseProxyConfig) GetProxyProfile() string {
106-109: Implement real validation (non‑negative durations, sensible buffer size)

Returning nil hides misconfiguration that can lead to surprising runtime behaviour.

Apply:
+import (
+  "time"
+  "github.com/thushan/olla/internal/core/constants"
+  "fmt"
+)
@@
 func (c *BaseProxyConfig) Validate() error {
-  return nil
+  if c.StreamBufferSize < 0 {
+    return fmt.Errorf("stream buffer size must be >= 0 (got %d)", c.StreamBufferSize)
+  }
+  if c.ConnectionTimeout < 0 || c.ConnectionKeepAlive < 0 || c.ResponseTimeout < 0 || c.ReadTimeout < 0 {
+    return fmt.Errorf("timeouts must be >= 0 (got ct=%v, ka=%v, rt=%v, read=%v)",
+      c.ConnectionTimeout, c.ConnectionKeepAlive, c.ResponseTimeout, c.ReadTimeout)
+  }
+  // Optional: require a leading slash for prefixes to avoid path bugs.
+  if c.ProxyPrefix != "" && c.ProxyPrefix[0] != '/' {
+    return fmt.Errorf("proxy prefix must start with '/', got %q", c.ProxyPrefix)
+  }
+  return nil
 }
9-25: Tighten comments and prefer Australian English (“behaviour”)

Minor doc polish to explain why these defaults exist and match the project’s style guide.

Apply:
-// Default values for proxy settings
+// Default values for proxy settings — chosen to provide sensible out‑of‑the‑box behaviour.
@@
-  // Olla-specific defaults for high-performance
+  // Olla‑specific defaults tuned for high‑throughput streaming workloads
@@
-  // Olla uses 30s timeouts for faster failure detection in AI workloads
+  // Olla uses 30 s timeouts for faster failure detection in AI workloads
111-115: Dead code: remove unused SherpaConfig

A repository-wide search shows SherpaConfig is only ever declared in
internal/adapter/proxy/config/unified.go and never referenced elsewhere.
To reduce cognitive overhead, you can safely delete this dead type.

• File: internal/adapter/proxy/config/unified.go (lines 111–115)
• Remove the block:
- // SherpaConfig extends BaseProxyConfig with Sherpa-specific configuration
- type SherpaConfig struct {
-     BaseProxyConfig
- }
• If you meant to have a distinct wrapper for Sherpa configuration, consider merging it with the existing Sherpa config type (e.g. in pkg/sherpa/config) instead of keeping two parallel definitions.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 33c2a04 and dbf6dee.

📒 Files selected for processing (27)

internal/adapter/converter/base_converter.go (1 hunks)
internal/adapter/converter/base_converter_test.go (1 hunks)
internal/adapter/converter/lmstudio_converter.go (2 hunks)
internal/adapter/converter/ollama_converter.go (2 hunks)
internal/adapter/converter/openai_converter.go (1 hunks)
internal/adapter/converter/unified_converter.go (2 hunks)
internal/adapter/converter/vllm_converter.go (2 hunks)
internal/adapter/converter/vllm_converter_test.go (2 hunks)
internal/adapter/proxy/benchmark_comparison_test.go (2 hunks)
internal/adapter/proxy/benchmark_refactor_test.go (1 hunks)
internal/adapter/proxy/config/unified.go (1 hunks)
internal/adapter/proxy/factory.go (1 hunks)
internal/adapter/proxy/olla/config.go (1 hunks)
internal/adapter/proxy/olla/service.go (4 hunks)
internal/adapter/proxy/olla/service_leak_test.go (2 hunks)
internal/adapter/proxy/proxy_integration_test.go (6 hunks)
internal/adapter/proxy/proxy_olla_test.go (7 hunks)
internal/adapter/proxy/proxy_path_stripping_test.go (5 hunks)
internal/adapter/proxy/proxy_sherpa_test.go (7 hunks)
internal/adapter/proxy/proxy_streaming_profiles_test.go (3 hunks)
internal/adapter/proxy/proxy_test.go (3 hunks)
internal/adapter/proxy/sherpa/config.go (1 hunks)
internal/adapter/proxy/sherpa/service.go (2 hunks)
internal/adapter/proxy/sherpa/service_leak_stress_test.go (2 hunks)
internal/adapter/proxy/sherpa/service_leak_test.go (2 hunks)
internal/adapter/proxy/sherpa/service_retry_test.go (3 hunks)
internal/adapter/proxy/sherpa/service_streaming.go (2 hunks)

🧰 Additional context used

📓 Path-based instructions (2)

**/*.{go,md}

📄 CodeRabbit inference engine (CLAUDE.md)

Use Australian English in comments and documentation; write comments explaining why rather than what

Files:

internal/adapter/proxy/proxy_path_stripping_test.go
internal/adapter/proxy/sherpa/service_streaming.go
internal/adapter/proxy/sherpa/service_retry_test.go
internal/adapter/proxy/proxy_test.go
internal/adapter/proxy/benchmark_comparison_test.go
internal/adapter/proxy/sherpa/service_leak_stress_test.go
internal/adapter/proxy/factory.go
internal/adapter/proxy/proxy_sherpa_test.go
internal/adapter/converter/base_converter_test.go
internal/adapter/proxy/olla/service_leak_test.go
internal/adapter/proxy/benchmark_refactor_test.go
internal/adapter/converter/lmstudio_converter.go
internal/adapter/proxy/sherpa/config.go
internal/adapter/proxy/sherpa/service_leak_test.go
internal/adapter/converter/vllm_converter_test.go
internal/adapter/proxy/proxy_integration_test.go
internal/adapter/proxy/proxy_streaming_profiles_test.go
internal/adapter/proxy/proxy_olla_test.go
internal/adapter/proxy/olla/config.go
internal/adapter/converter/base_converter.go
internal/adapter/converter/openai_converter.go
internal/adapter/proxy/config/unified.go
internal/adapter/converter/vllm_converter.go
internal/adapter/proxy/olla/service.go
internal/adapter/converter/ollama_converter.go
internal/adapter/proxy/sherpa/service.go
internal/adapter/converter/unified_converter.go

internal/adapter/proxy/**/*_test.go

📄 CodeRabbit inference engine (CLAUDE.md)

Maintain a shared test suite that validates both Sherpa and Olla proxy engines

Files:

internal/adapter/proxy/proxy_path_stripping_test.go
internal/adapter/proxy/sherpa/service_retry_test.go
internal/adapter/proxy/proxy_test.go
internal/adapter/proxy/benchmark_comparison_test.go
internal/adapter/proxy/sherpa/service_leak_stress_test.go
internal/adapter/proxy/proxy_sherpa_test.go
internal/adapter/proxy/olla/service_leak_test.go
internal/adapter/proxy/benchmark_refactor_test.go
internal/adapter/proxy/sherpa/service_leak_test.go
internal/adapter/proxy/proxy_integration_test.go
internal/adapter/proxy/proxy_streaming_profiles_test.go
internal/adapter/proxy/proxy_olla_test.go

🧠 Learnings (2)

📚 Learning: 2025-08-21T11:42:29.496Z

Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-21T11:42:29.496Z
Learning: Applies to internal/adapter/proxy/**/*_test.go : Maintain a shared test suite that validates both Sherpa and Olla proxy engines

Applied to files:

internal/adapter/proxy/proxy_path_stripping_test.go
internal/adapter/proxy/sherpa/service_streaming.go
internal/adapter/proxy/sherpa/service_retry_test.go
internal/adapter/proxy/proxy_test.go
internal/adapter/proxy/benchmark_comparison_test.go
internal/adapter/proxy/sherpa/service_leak_stress_test.go
internal/adapter/proxy/factory.go
internal/adapter/proxy/proxy_sherpa_test.go
internal/adapter/proxy/benchmark_refactor_test.go
internal/adapter/proxy/proxy_integration_test.go
internal/adapter/proxy/proxy_streaming_profiles_test.go
internal/adapter/proxy/proxy_olla_test.go

📚 Learning: 2025-08-21T11:42:29.496Z

Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-21T11:42:29.496Z
Learning: Applies to internal/app/handlers/handler_proxy.go : Serve proxy routes under the /olla/ prefix

Applied to files:

internal/adapter/proxy/proxy_path_stripping_test.go
internal/adapter/proxy/sherpa/service_streaming.go
internal/adapter/proxy/proxy_sherpa_test.go
internal/adapter/proxy/proxy_olla_test.go
internal/adapter/proxy/olla/service.go

🧬 Code graph analysis (27)

internal/adapter/proxy/proxy_path_stripping_test.go (3)

internal/adapter/proxy/olla/config.go (1)

Configuration (8-10)

internal/adapter/proxy/sherpa/config.go (1)

Configuration (9-11)

internal/adapter/proxy/config.go (1)

Configuration (28-42)

internal/adapter/proxy/sherpa/service_streaming.go (2)

internal/adapter/proxy/config/unified.go (1)

DefaultReadTimeout (11-11)

internal/adapter/proxy/config.go (1)

DefaultReadTimeout (14-14)

internal/adapter/proxy/sherpa/service_retry_test.go (1)

internal/adapter/proxy/sherpa/config.go (1)

Configuration (9-11)

internal/adapter/proxy/proxy_test.go (3)

internal/adapter/proxy/olla/config.go (1)

Configuration (8-10)

internal/adapter/proxy/sherpa/config.go (1)

Configuration (9-11)

internal/adapter/proxy/config.go (1)

Configuration (28-42)

internal/adapter/proxy/benchmark_comparison_test.go (2)

internal/adapter/proxy/olla/config.go (1)

Configuration (8-10)

internal/adapter/proxy/sherpa/config.go (1)

Configuration (9-11)

internal/adapter/proxy/sherpa/service_leak_stress_test.go (1)

internal/adapter/proxy/sherpa/config.go (1)

Configuration (9-11)

internal/adapter/proxy/factory.go (6)

internal/adapter/proxy/olla/config.go (1)

Configuration (8-10)

internal/adapter/proxy/sherpa/config.go (1)

Configuration (9-11)

internal/adapter/proxy/config.go (1)

Configuration (28-42)

internal/adapter/proxy/olla/service.go (1)

NewService (136-213)

internal/adapter/proxy/sherpa/service.go (1)

NewService (70-123)

internal/core/ports/proxy.go (2)

DiscoveryService (72-77)

ProxyConfiguration (31-39)

internal/adapter/proxy/proxy_sherpa_test.go (3)

internal/adapter/proxy/olla/config.go (1)

Configuration (8-10)

internal/adapter/proxy/sherpa/config.go (1)

Configuration (9-11)

internal/adapter/proxy/config.go (1)

Configuration (28-42)

internal/adapter/converter/base_converter_test.go (2)

internal/adapter/converter/base_converter.go (1)

NewBaseConverter (13-17)

internal/core/domain/unified_model.go (3)

UnifiedModel (15-31)

AliasEntry (9-12)

SourceEndpoint (34-44)

internal/adapter/proxy/olla/service_leak_test.go (1)

internal/adapter/proxy/olla/config.go (1)

Configuration (8-10)

internal/adapter/proxy/benchmark_refactor_test.go (4)

internal/adapter/proxy/olla/config.go (1)

Configuration (8-10)

internal/adapter/proxy/sherpa/config.go (1)

Configuration (9-11)

internal/adapter/proxy/olla/service.go (1)

NewService (136-213)

internal/adapter/proxy/sherpa/service.go (1)

NewService (70-123)

internal/adapter/converter/lmstudio_converter.go (3)

internal/adapter/converter/base_converter.go (2)

BaseConverter (8-10)

NewBaseConverter (13-17)

internal/core/ports/model_converter.go (1)

ModelResponseConverter (10-15)

internal/core/constants/providers.go (1)

ProviderPrefixLMStudio1 (17-17)

internal/adapter/proxy/sherpa/config.go (1)

internal/adapter/proxy/config/unified.go (1)

BaseProxyConfig (43-51)

internal/adapter/proxy/sherpa/service_leak_test.go (1)

internal/adapter/proxy/sherpa/config.go (1)

Configuration (9-11)

internal/adapter/converter/vllm_converter_test.go (1)

internal/adapter/converter/vllm_converter.go (2)

NewVLLMConverter (24-28)

VLLMConverter (19-21)

internal/adapter/proxy/proxy_integration_test.go (3)

internal/adapter/proxy/olla/config.go (1)

Configuration (8-10)

internal/adapter/proxy/sherpa/config.go (1)

Configuration (9-11)

internal/adapter/proxy/config.go (1)

Configuration (28-42)

internal/adapter/proxy/proxy_streaming_profiles_test.go (4)

internal/adapter/proxy/olla/config.go (1)

Configuration (8-10)

internal/adapter/proxy/sherpa/config.go (1)

Configuration (9-11)

internal/adapter/proxy/config.go (1)

Configuration (28-42)

internal/core/constants/config.go (1)

ConfigurationProxyProfileAuto (4-4)

internal/adapter/proxy/proxy_olla_test.go (3)

internal/adapter/proxy/olla/config.go (1)

Configuration (8-10)

internal/adapter/proxy/config.go (1)

Configuration (28-42)

internal/adapter/proxy/olla/service.go (1)

NewService (136-213)

internal/adapter/proxy/olla/config.go (1)

internal/adapter/proxy/config/unified.go (1)

OllaConfig (117-124)

internal/adapter/converter/base_converter.go (1)

internal/core/domain/unified_model.go (2)

UnifiedModel (15-31)

SourceEndpoint (34-44)

internal/adapter/converter/openai_converter.go (2)

internal/adapter/converter/base_converter.go (2)

BaseConverter (8-10)

NewBaseConverter (13-17)

internal/core/ports/model_converter.go (1)

ModelResponseConverter (10-15)

internal/adapter/proxy/config/unified.go (2)

pkg/format/format.go (1)

Duration (31-46)

internal/core/constants/config.go (1)

ConfigurationProxyProfileAuto (4-4)

internal/adapter/converter/vllm_converter.go (3)

internal/adapter/converter/base_converter.go (2)

BaseConverter (8-10)

NewBaseConverter (13-17)

internal/core/ports/model_converter.go (1)

ModelResponseConverter (10-15)

internal/core/constants/providers.go (1)

ProviderTypeVLLM (8-8)

internal/adapter/proxy/olla/service.go (3)

internal/adapter/proxy/config/unified.go (5)

OllaDefaultStreamBufferSize (17-17)

OllaDefaultMaxIdleConns (18-18)

OllaDefaultMaxConnsPerHost (19-19)

OllaDefaultIdleConnTimeout (20-20)

DefaultReadTimeout (11-11)

internal/adapter/proxy/config.go (5)

OllaDefaultMaxIdleConns (23-23)

OllaDefaultMaxConnsPerHost (24-24)

OllaDefaultIdleConnTimeout (25-25)

DefaultReadTimeout (14-14)

Configuration (28-42)

internal/adapter/proxy/olla/config.go (1)

Configuration (8-10)

internal/adapter/converter/ollama_converter.go (3)

internal/adapter/converter/base_converter.go (2)

BaseConverter (8-10)

NewBaseConverter (13-17)

internal/core/ports/model_converter.go (1)

ModelResponseConverter (10-15)

internal/core/constants/providers.go (1)

ProviderTypeOllama (4-4)

internal/adapter/proxy/sherpa/service.go (2)

internal/adapter/proxy/sherpa/config.go (1)

Configuration (9-11)

internal/adapter/proxy/config.go (1)

Configuration (28-42)

internal/adapter/converter/unified_converter.go (2)

internal/adapter/converter/base_converter.go (2)

BaseConverter (8-10)

NewBaseConverter (13-17)

internal/core/ports/model_converter.go (1)

ModelResponseConverter (10-15)

coderabbitai · 2025-08-23T09:24:13Z

internal/adapter/proxy/factory.go

+		ollaConfig := &olla.Configuration{}
+		ollaConfig.ProxyPrefix = config.GetProxyPrefix()
+		ollaConfig.ConnectionTimeout = config.GetConnectionTimeout()
+		ollaConfig.ConnectionKeepAlive = config.GetConnectionKeepAlive()
+		ollaConfig.ResponseTimeout = config.GetResponseTimeout()
+		ollaConfig.ReadTimeout = config.GetReadTimeout()
+		ollaConfig.StreamBufferSize = config.GetStreamBufferSize()
+		ollaConfig.Profile = config.GetProxyProfile()
+		// Olla-specific fields
+		ollaConfig.MaxIdleConns = 200
+		ollaConfig.IdleConnTimeout = 90 * time.Second
+		ollaConfig.MaxConnsPerHost = 50


🛠️ Refactor suggestion

Don’t hard-code Olla-specific pooling defaults in the factory

Hard-coding MaxIdleConns/IdleConnTimeout/MaxConnsPerHost here duplicates defaults already applied in olla.NewService and risks divergence if those defaults are tuned later. Either:

leave these zero to let olla.NewService apply its canonical defaults, or

propagate them only when supplied by a richer configuration that exposes Olla-specific getters.

Replace the hard-coded assignments:

- // Olla-specific fields - ollaConfig.MaxIdleConns = 200 - ollaConfig.IdleConnTimeout = 90 * time.Second - ollaConfig.MaxConnsPerHost = 50 + // Olla-specific fields: keep zero to defer to service defaults, or map via optional getters if present. + if oc, ok := config.(interface { + GetMaxIdleConns() int + GetIdleConnTimeout() time.Duration + GetMaxConnsPerHost() int + }); ok { + ollaConfig.MaxIdleConns = oc.GetMaxIdleConns() + ollaConfig.IdleConnTimeout = oc.GetIdleConnTimeout() + ollaConfig.MaxConnsPerHost = oc.GetMaxConnsPerHost() + }

Why: avoids config drift and keeps the factory focussed on translation rather than policy.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

ollaConfig := &olla.Configuration{}

ollaConfig.ProxyPrefix = config.GetProxyPrefix()

ollaConfig.ConnectionTimeout = config.GetConnectionTimeout()

ollaConfig.ConnectionKeepAlive = config.GetConnectionKeepAlive()

ollaConfig.ResponseTimeout = config.GetResponseTimeout()

ollaConfig.ReadTimeout = config.GetReadTimeout()

ollaConfig.StreamBufferSize = config.GetStreamBufferSize()

ollaConfig.Profile = config.GetProxyProfile()

// Olla-specific fields

ollaConfig.MaxIdleConns = 200

ollaConfig.IdleConnTimeout = 90 * time.Second

ollaConfig.MaxConnsPerHost = 50

ollaConfig := &olla.Configuration{}

ollaConfig.ProxyPrefix = config.GetProxyPrefix()

ollaConfig.ConnectionTimeout = config.GetConnectionTimeout()

ollaConfig.ConnectionKeepAlive = config.GetConnectionKeepAlive()

ollaConfig.ResponseTimeout = config.GetResponseTimeout()

ollaConfig.ReadTimeout = config.GetReadTimeout()

ollaConfig.StreamBufferSize = config.GetStreamBufferSize()

ollaConfig.Profile = config.GetProxyProfile()

// Olla-specific fields: keep zero to defer to service defaults, or map via optional getters if present.

if oc, ok := config.(interface {

GetMaxIdleConns() int

GetIdleConnTimeout() time.Duration

GetMaxConnsPerHost() int

}); ok {

ollaConfig.MaxIdleConns = oc.GetMaxIdleConns()

ollaConfig.IdleConnTimeout = oc.GetIdleConnTimeout()

ollaConfig.MaxConnsPerHost = oc.GetMaxConnsPerHost()

}

🤖 Prompt for AI Agents

In internal/adapter/proxy/factory.go around lines 55 to 66, the factory currently hard-codes Olla-specific pooling defaults (MaxIdleConns, IdleConnTimeout, MaxConnsPerHost); remove those hard-coded assignments so the values remain zeroed and let olla.NewService apply its canonical defaults, or alternatively only assign them from a dedicated Olla-specific config getter if such richer configuration is available; update the file to stop duplicating policy-level defaults and rely on the Olla constructor or explicit config propagation instead.

internal/adapter/proxy/proxy_test.go

coderabbitai · 2025-08-23T09:24:14Z

internal/adapter/proxy/sherpa/service.go

+	newConfig := &Configuration{}
+	newConfig.ProxyPrefix = config.GetProxyPrefix()
+	newConfig.ConnectionTimeout = config.GetConnectionTimeout()
+	newConfig.ConnectionKeepAlive = config.GetConnectionKeepAlive()
+	newConfig.ResponseTimeout = config.GetResponseTimeout()
+	newConfig.ReadTimeout = config.GetReadTimeout()
+	newConfig.StreamBufferSize = config.GetStreamBufferSize()
+	newConfig.Profile = config.GetProxyProfile()



⚠️ Potential issue

Config updates don’t take effect for new dials and buffer size—transport and pool not refreshed

NewService’s transport.DialContext closure captures the original configuration pointer. After UpdateConfig, s.configuration points to a new object, but the DialContext still reads timeouts from the stale pointer. Similarly, the buffer pool size depends on StreamBufferSize but isn’t reinitialised.

Impact: subsequent connections won’t respect updated connection/read/keepalive timeouts; streaming may continue with the old buffer size. This is subtle and hard to detect in tests.

Suggest updating the transport’s DialContext after swapping the config, and (optionally) rebuilding the buffer pool when the size changes.

Patch (minimal change inside UpdateConfig to refresh DialContext):

func (s *Service) UpdateConfig(config ports.ProxyConfiguration) { newConfig := &Configuration{} newConfig.ProxyPrefix = config.GetProxyPrefix() newConfig.ConnectionTimeout = config.GetConnectionTimeout() newConfig.ConnectionKeepAlive = config.GetConnectionKeepAlive() newConfig.ResponseTimeout = config.GetResponseTimeout() newConfig.ReadTimeout = config.GetReadTimeout() newConfig.StreamBufferSize = config.GetStreamBufferSize() newConfig.Profile = config.GetProxyProfile() - s.configuration = newConfig + s.configuration = newConfig + if s.transport != nil { + s.transport.DialContext = func(ctx context.Context, network, addr string) (net.Conn, error) { + dialer := &net.Dialer{ + Timeout: s.configuration.GetConnectionTimeout(), + KeepAlive: s.configuration.GetConnectionKeepAlive(), + } + conn, err := dialer.DialContext(ctx, network, addr) + if err != nil { + return nil, err + } + if tcpConn, ok := conn.(*net.TCPConn); ok { + if terr := tcpConn.SetNoDelay(DefaultSetNoDelay); terr != nil { + s.Logger.Warn("failed to set NoDelay", "err", terr) + } + } + return conn, nil + } + } }

Optional: rebuild the buffer pool if the size changed:

+ // Optional: rebuild buffer pool if size changed + if s.bufferPool != nil && newConfig.StreamBufferSize != 0 { + bufferPool, err := pool.NewLitePool(func() *[]byte { + buf := make([]byte, newConfig.StreamBufferSize) + return &buf + }) + if err == nil { + s.bufferPool = bufferPool + } else { + s.Logger.Warn("failed to rebuild buffer pool", "err", err) + } + }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

newConfig := &Configuration{}

newConfig.ProxyPrefix = config.GetProxyPrefix()

newConfig.ConnectionTimeout = config.GetConnectionTimeout()

newConfig.ConnectionKeepAlive = config.GetConnectionKeepAlive()

newConfig.ResponseTimeout = config.GetResponseTimeout()

newConfig.ReadTimeout = config.GetReadTimeout()

newConfig.StreamBufferSize = config.GetStreamBufferSize()

newConfig.Profile = config.GetProxyProfile()

func (s *Service) UpdateConfig(config ports.ProxyConfiguration) {

newConfig := &Configuration{}

newConfig.ProxyPrefix = config.GetProxyPrefix()

newConfig.ConnectionTimeout = config.GetConnectionTimeout()

newConfig.ConnectionKeepAlive = config.GetConnectionKeepAlive()

newConfig.ResponseTimeout = config.GetResponseTimeout()

newConfig.ReadTimeout = config.GetReadTimeout()

newConfig.StreamBufferSize = config.GetStreamBufferSize()

newConfig.Profile = config.GetProxyProfile()

s.configuration = newConfig

if s.transport != nil {

s.transport.DialContext = func(ctx context.Context, network, addr string) (net.Conn, error) {

dialer := &net.Dialer{

Timeout: s.configuration.GetConnectionTimeout(),

KeepAlive: s.configuration.GetConnectionKeepAlive(),

}

conn, err := dialer.DialContext(ctx, network, addr)

if err != nil {

return nil, err

}

if tcpConn, ok := conn.(*net.TCPConn); ok {

if terr := tcpConn.SetNoDelay(DefaultSetNoDelay); terr != nil {

s.Logger.Warn("failed to set NoDelay", "err", terr)

}

}

return conn, nil

}

}

}

🤖 Prompt for AI Agents

internal/adapter/proxy/sherpa/service.go around lines 147 to 155: the UpdateConfig implementation replaces s.configuration but the transport's DialContext closure still closes over the old configuration pointer and the buffer pool size is never reinitialized, so new timeouts and buffer sizes don't take effect; to fix, after swapping s.configuration acquire the same lock used for updates, recreate or update s.transport.DialContext so it reads timeouts/keepalive from the updated s.configuration (or rebuild the transport with the new values) and, if the new StreamBufferSize differs from the old, reinitialize the buffer pool to the new size (ensuring any pool swap is done under lock to avoid races).

coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

internal/adapter/proxy/olla/service.go (1)
215-244: Wrong transport field mapped: MaxConnsPerHost not set

MaxConnsPerHost from config is being assigned to MaxIdleConnsPerHost, leaving MaxConnsPerHost at default (unbounded). This changes connection limiting semantics and can cause unexpected load patterns.

Apply this diff to map fields correctly (optionally keep the idle-per-host setting if intended):
 	return &http.Transport{
 		MaxIdleConns:        config.MaxIdleConns,
-		MaxIdleConnsPerHost: config.MaxConnsPerHost,
+		MaxConnsPerHost:     config.MaxConnsPerHost,
+		// Optional: keep per-host idle aligned with the connection cap
+		MaxIdleConnsPerHost: config.MaxConnsPerHost,
 		IdleConnTimeout:     config.IdleConnTimeout,
 		TLSHandshakeTimeout: DefaultTLSHandshakeTimeout,
 		DisableCompression:  true,
 		ForceAttemptHTTP2:   true,

🧹 Nitpick comments (1)

internal/adapter/proxy/olla/service.go (1)

686-709: Also accept Olla-specific getters in UpdateConfig

Currently we only copy Olla-specific fields when the incoming config is exactly *olla.Configuration. Mirror the factory’s optional-getters path so unified configs carrying these values flow through.

Apply this diff:

 	// we try to get Olla-specific fields from incoming config if it's an *olla.Configuration
-	if ollaConfig, ok := config.(*Configuration); ok && ollaConfig != nil {
+	if oc, ok := config.(interface {
+		GetMaxIdleConns() int
+		GetIdleConnTimeout() time.Duration
+		GetMaxConnsPerHost() int
+	}); ok {
+		newConfig.MaxIdleConns = oc.GetMaxIdleConns()
+		newConfig.IdleConnTimeout = oc.GetIdleConnTimeout()
+		newConfig.MaxConnsPerHost = oc.GetMaxConnsPerHost()
+	} else if ollaConfig, ok := config.(*Configuration); ok && ollaConfig != nil {
 		newConfig.MaxIdleConns = ollaConfig.MaxIdleConns
 		newConfig.IdleConnTimeout = ollaConfig.IdleConnTimeout
 		newConfig.MaxConnsPerHost = ollaConfig.MaxConnsPerHost
 	} else {

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dbf6dee and d7bec85.

📒 Files selected for processing (2)

internal/adapter/proxy/factory.go (1 hunks)
internal/adapter/proxy/olla/service.go (4 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

internal/adapter/proxy/**/service.go

📄 CodeRabbit inference engine (CLAUDE.md)

Proxy services must include response headers: X-Olla-Endpoint, X-Olla-Model, X-Olla-Backend-Type, X-Olla-Request-ID, X-Olla-Response-Time

Files:

internal/adapter/proxy/olla/service.go

🧠 Learnings (2)

📚 Learning: 2025-09-23T08:30:20.348Z

Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-23T08:30:20.348Z
Learning: Applies to internal/adapter/proxy/**/service.go : Proxy services must include response headers: `X-Olla-Endpoint`, `X-Olla-Model`, `X-Olla-Backend-Type`, `X-Olla-Request-ID`, `X-Olla-Response-Time`

Applied to files:

internal/adapter/proxy/factory.go
internal/adapter/proxy/olla/service.go

📚 Learning: 2025-09-23T08:30:20.348Z

Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-23T08:30:20.348Z
Learning: Applies to internal/app/handlers/handler_proxy.go : All proxied routes must use the `/olla/` URL prefix

Applied to files:

internal/adapter/proxy/olla/service.go

🧬 Code graph analysis (2)

internal/adapter/proxy/factory.go (4)

internal/adapter/proxy/olla/config.go (1)

Configuration (8-10)

internal/adapter/proxy/config.go (1)

Configuration (28-42)

internal/adapter/proxy/olla/service.go (1)

NewService (136-213)

internal/core/ports/proxy.go (2)

DiscoveryService (72-77)

ProxyConfiguration (31-39)

internal/adapter/proxy/olla/service.go (3)

internal/adapter/proxy/config/unified.go (5)

OllaDefaultStreamBufferSize (17-17)

OllaDefaultMaxIdleConns (18-18)

OllaDefaultMaxConnsPerHost (19-19)

OllaDefaultIdleConnTimeout (20-20)

DefaultReadTimeout (11-11)

internal/adapter/proxy/config.go (5)

OllaDefaultMaxIdleConns (23-23)

OllaDefaultMaxConnsPerHost (24-24)

OllaDefaultIdleConnTimeout (25-25)

DefaultReadTimeout (14-14)

Configuration (28-42)

internal/adapter/proxy/olla/config.go (1)

Configuration (8-10)

🔇 Additional comments (6)

internal/adapter/proxy/factory.go (3)
42-49: Sherpa config mapping looks good

Field-by-field translation from ProxyConfiguration is clear and consistent.

55-63: Olla config mapping looks good

Generic fields and Profile are correctly propagated.

64-77: Remove hard‑coded Olla pooling defaults; rely on service defaults or explicit getters

This reintroduces divergent defaults (e.g. MaxIdleConns=200 here vs 100 in config constants) and duplicates policy the service already applies. Keep zero values so olla.NewService applies canonical defaults, or only set via the optional getters.

Apply this diff to drop the fallback block:
 		if ollaSpecific, ok := config.(interface {
 			GetMaxIdleConns() int
 			GetIdleConnTimeout() time.Duration
 			GetMaxConnsPerHost() int
 		}); ok {
 			ollaConfig.MaxIdleConns = ollaSpecific.GetMaxIdleConns()
 			ollaConfig.IdleConnTimeout = ollaSpecific.GetIdleConnTimeout()
 			ollaConfig.MaxConnsPerHost = ollaSpecific.GetMaxConnsPerHost()
-		} else {
-			// fallback option with defaults
-			ollaConfig.MaxIdleConns = 200
-			ollaConfig.IdleConnTimeout = 90 * time.Second
-			ollaConfig.MaxConnsPerHost = 50
 		}
internal/adapter/proxy/olla/service.go (3)

43-44: Correct package import

Switching to shared proxy config source is appropriate.

146-159: Defaulting now aligns with centralised config

Good move to source defaults from config constants rather than literals.

561-562: Confirmed: SetResponseHeaders sets all required X-Olla headers

It sets X-Olla-Request-ID, X-Olla-Response-Time, X-Olla-Endpoint, X-Olla-Backend-Type and X-Olla-Model. Conditions: Request-ID set when stats.RequestID != ""; Response-Time set when stats.StartTime is non-zero; Model set when stats.Model != ""; Endpoint and Backend-Type set when endpoint != nil.

initial consolidation of Proxy Configuration

dbf6dee

fix tests update path stripping tests hmm streaming settings seem to be invalid, fix with isolating for Olla swapped the wrong default settings update benchmark tests

thushan added the refactor Refactoring of code label Aug 23, 2025

coderabbitai bot reviewed Aug 23, 2025

View reviewed changes

thushan added 2 commits September 23, 2025 21:34

use the specific settings and fallback if unavailable

a83c209

update the olla service config and fallback too

d7bec85

coderabbitai bot reviewed Sep 23, 2025

View reviewed changes

Merge branch 'main' into feature/proxy-configuration

3d0f39f

thushan merged commit 4377c5e into main Sep 23, 2025
5 of 6 checks passed

thushan deleted the feature/proxy-configuration branch September 23, 2025 11:54

chenrui333 mentioned this pull request Sep 23, 2025

olla 0.0.18 chenrui333/homebrew-tap#1685

Closed

coderabbitai bot mentioned this pull request Sep 26, 2025

feat: backend/sglang #69

Merged

This was referenced Oct 9, 2025

feat: backend/lemonade #70

Merged

feat: Backend llamacpp #73

Merged

coderabbitai bot mentioned this pull request Oct 20, 2025

feat: Anthropic Message format Support #76

Merged

This was referenced Feb 20, 2026

feature: backend/vllm-mlx #109

Merged

feature: Anthropic agent fixes and improvements #113

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

refactor: Proxy Configurations#59

refactor: Proxy Configurations#59
thushan merged 4 commits intomainfrom
feature/proxy-configuration

thushan commented Aug 23, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Aug 23, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Aug 23, 2025

Uh oh!

Uh oh!

coderabbitai bot Aug 23, 2025

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

thushan commented Aug 23, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Aug 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Aug 23, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Aug 23, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

thushan commented Aug 23, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Aug 23, 2025 •

edited

Loading