- …ntblock) add thinking to payload to avoid Claude Code breakage
- add validator, we may accidentally send invalid top_/temps otherwise
- enable path rewriting
- oops avoid olla in path
- add 10mb limit for payload
- add tests
- update tests
- fix broken tests
- lint issue
- fix broken tests
- update handler_translation
- update handler_translation x2
- fix tests

- maxtokens validation
- streaming malformed chunking logs and continues now
- weird behaviour for system ContentBlock message
- tool argument errors
- messageid format
- compatibility with Claude Code v2 and v1.6 releases

Also adds better error logging

- refactor tests and add tests for configuration of translators
- fix broken tests across translator
- lint issues
- update handler and app for config

- initial examples
- rewrite some bits
- refactor examples
- add test scripts for examples
- add one for lmstudio
- update handlers for experimental features
Walkthrough

Adds an Anthropic Messages API translator (request/response/streaming, token counting), a translator registry with dynamic route registration, inspector logging, many tests and examples, and replaces direct use of http.Flusher with http.ResponseController in streaming code paths.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client as Anthropic client
    participant Olla as Olla proxy
    participant Translator as Anthropic Translator
    participant Backend as OpenAI-compatible backend
    Client->>Olla: POST /olla/anthropic/v1/messages (Anthropic JSON)
    Olla->>Translator: TransformRequest(ctx, http.Request)
    Translator->>Translator: Convert messages/tools/system -> OpenAI format
    Translator-->>Olla: Transformed OpenAI request
    Olla->>Backend: Forward request
    Backend-->>Olla: OpenAI response (or SSE stream)
    Olla->>Translator: TransformResponse / TransformStreamingResponse
    Translator->>Translator: Map choices, tool_calls, finish reasons
    Translator-->>Olla: Anthropic-format response (or SSE events)
    Olla-->>Client: HTTP 200 / SSE stream (Anthropic format)
```
```mermaid
sequenceDiagram
    participant Backend as OpenAI backend (SSE)
    participant Translator as Anthropic Translator
    participant Client as Anthropic client
    Backend-->>Translator: SSE chunks (deltas, tool_calls, usage)
    Translator->>Translator: Buffer state, assemble blocks, parse partial JSON
    alt text delta
        Translator-->>Client: content_block_delta events
    else tool call
        Translator-->>Client: input_json_delta / content_block_start/stop
    end
    Translator->>Translator: finalize -> message_delta + message_stop
    Translator-->>Client: final events with usage
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Actionable comments posted: 27
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
examples/crush-vllm/README.md (2)
721-721: Minor grammar: missing preposition. Consider revising "List models from Olla" to "List models available from Olla" for improved clarity.
1049-1049: Minor grammar: missing article. "Configure highest priority for local backends" should include the article "the": "Configure the highest priority for local backends".
🧹 Nitpick comments (74)
examples/crush-vllm/test.sh (1)
40-223: Consider adding timeouts to curl commands. The curl commands throughout the test script don't specify timeouts. While acceptable for local development, adding `--max-time 10` would prevent the script from hanging indefinitely if services become unresponsive, improving the developer experience in CI or automated testing scenarios.

internal/adapter/translator/anthropic/translator_test.go (1)
1-158: Consider adding benchmarks for critical translation paths. The translator performs performance-critical operations (JSON marshaling, buffer pooling, request/response transformation), and the PR documentation mentions "<5ms translation latency". Adding benchmark tests would help verify this claim and catch performance regressions.
As per coding guidelines, would you like me to generate benchmark tests for the translation operations? I can create benchmarks for:
- WriteError performance
- Request translation (if not covered in request_test.go)
- Response translation (if not covered in response_test.go)
- Streaming translation (if not covered in streaming_test.go)
internal/config/types.go (2)
20-21: Wire-up looks good; consider `omitempty` to keep serialised config minimal. Optional: tag `Translators` with `yaml:"translators,omitempty"` so empty blocks aren't written back out.
157-167: Make units explicit and centralise defaults/limits. Rename `MaxMessageSize` to `MaxMessageSizeBytes` (or document "bytes") and hoist shared constants so translator and config can't drift. Apply constants inside this file:

```diff
+// Anthropic translator sizing (bytes)
+const (
+	DefaultAnthropicMessageSize = 10 << 20  // 10 MiB
+	MaxAnthropicMessageSize     = 100 << 20 // 100 MiB
+)
```

examples/claude-code-llamacpp/test.sh (3)
4-17: Harden script: fail early, check deps, allow env overrides.
- Use `set -Eeuo pipefail` plus a small dependency check for `curl` and `jq`.
- Let env override URLs.

```diff
-set -e
+set -Eeuo pipefail
+
+need() { command -v "$1" >/dev/null 2>&1 || { echo "Missing dependency: $1" >&2; exit 1; }; }
+need curl; need jq
-
-OLLA_URL="http://localhost:40114"
-ANTHROPIC_URL="${OLLA_URL}/olla/anthropic/v1"
-LLAMACPP_URL="http://localhost:8080"
+OLLA_URL="${OLLA_URL:-http://localhost:40114}"
+ANTHROPIC_URL="${ANTHROPIC_URL:-${OLLA_URL}/olla/anthropic/v1}"
+LLAMACPP_URL="${LLAMACPP_URL:-http://localhost:8080}"
```
60-73: Send Anthropic version header for parity with real clients.Add
anthropic-versionto the non‑streaming request.-RESPONSE=$(curl -s -X POST "${ANTHROPIC_URL}/messages" \ +RESPONSE=$(curl -s -X POST "${ANTHROPIC_URL}/messages" \ -H "Content-Type: application/json" \ + -H "anthropic-version: 2023-06-01" \ -d "{
85-95: Likewise for streaming. Include `anthropic-version` on the streaming request for consistency.

```diff
-STREAM_OUTPUT=$(curl -sN -X POST "${ANTHROPIC_URL}/messages" \
+STREAM_OUTPUT=$(curl -sN -X POST "${ANTHROPIC_URL}/messages" \
   -H "Content-Type: application/json" \
+  -H "anthropic-version: 2023-06-01" \
   -d "{
```

examples/opencode-lmstudio/test.sh (3)
5-13: Stronger bash safety and dependency checks. Enable strict mode and verify the required tools are present.

```diff
-set -e
+set -Eeuo pipefail
+need() { command -v "$1" >/dev/null 2>&1 || { echo "Missing dependency: $1" >&2; exit 1; }; }
+need curl; need jq; need docker
```
55-63: Container check can be brittle; prefer compose or a name match. Use a more robust check to avoid false negatives if the container name differs.

```diff
-if docker ps | grep -q "olla"; then
+if docker compose ps --services --status running 2>/dev/null | grep -qx "olla"; then
     print_result 0 "Olla container is running"
 else
     print_result 1 "Olla container is not running" "Run: docker compose up -d"
     exit 1
 fi
```
158-165: Add Anthropic version header to match API expectations. Include the header on the Anthropic messages call.

```diff
-ANTHROPIC_RESPONSE=$(curl -s -f -X POST "http://${OLLA_HOST}:${OLLA_PORT}/olla/anthropic/v1/messages" \
+ANTHROPIC_RESPONSE=$(curl -s -f -X POST "http://${OLLA_HOST}:${OLLA_PORT}/olla/anthropic/v1/messages" \
   -H "Content-Type: application/json" \
+  -H "anthropic-version: 2023-06-01" \
   -d "{
```

examples/claude-code-ollama/test.sh (1)
4-17: Adopt strict mode and pre-flight checks. Reduce heisenbugs by enabling strict bash options and verifying `curl`/`jq` exist.

```diff
-set -e  # Exit on error
+set -Eeuo pipefail  # Exit on error, unset, and pipeline failures
+need() { command -v "$1" >/dev/null 2>&1 || { echo "Missing dependency: $1" >&2; exit 1; }; }
+need curl; need jq
```

internal/adapter/translator/anthropic/request_test.go (5)
16-21: Test logger creation looks fine; add `t.Parallel()` where safe. Most tests instantiate isolated translators; consider `t.Parallel()` to speed up suite runtime.
23-61: Great coverage of the happy path. Factor small helpers to reduce repetition: create helpers for marshalling and extracting messages to keep tests terse.

```diff
+// test helpers
+func mustTransform(t *testing.T, tr *Translator, v any) *TransformedRequest {
+	t.Helper()
+	b, err := json.Marshal(v)
+	require.NoError(t, err)
+	req := &http.Request{Body: io.NopCloser(bytes.NewReader(b))}
+	res, err := tr.TransformRequest(context.Background(), req)
+	require.NoError(t, err)
+	return res
+}
```
580-621: Consider a benchmark for the critical path. Add a micro-benchmark for `TransformRequest` with a representative payload. As per coding guidelines.

```go
func BenchmarkTransformRequest_SimpleMessage(b *testing.B) {
	tr := NewTranslator(createTestLogger(), createTestConfig())
	req := AnthropicRequest{
		Model:     "claude-3-5-sonnet-20241022",
		MaxTokens: 256,
		Messages:  []AnthropicMessage{{Role: "user", Content: "Hi"}},
	}
	body, _ := json.Marshal(req)
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		// Rebuild the request each iteration: the body reader is drained on first use.
		httpReq := &http.Request{Body: io.NopCloser(bytes.NewReader(body))}
		_, _ = tr.TransformRequest(context.Background(), httpReq)
	}
}
```
710-759: Nice: structured tool_result serialisation validated. Add an image content case for completeness: a test covering `image` content blocks would confirm the conversion/ignoring behaviour is intentional.
214-276: Tool choice mapping is well covered. Add a negative case for an unknown tool name in object form. If a specific tool name is requested but not declared, assert the translator behaviour (error or fallback).
internal/adapter/translator/types.go (1)
28-37: Clarify the TargetPath comment. Line 34's comment states "proxy handles /olla prefix", which may be unclear to developers unfamiliar with the proxy routing logic. Consider clarifying that this should be the backend-relative path without any Olla-specific prefixes.
Apply this diff to improve clarity:
```diff
-	TargetPath string // Target API path for the backend (e.g., "/v1/chat/completions" - proxy handles /olla prefix)
+	TargetPath string // Backend-relative API path (e.g., "/v1/chat/completions") - excludes Olla routing prefixes
```

docs/content/usage.md (1)
203-232: Consider using proper Markdown headings. Lines 203, 221, and 227 use bold text (emphasis) for section labels. Using proper Markdown headings (e.g., `####`) would improve document structure and accessibility. Apply this diff to use proper headings:
```diff
-**1. Enable Translation**
+#### 1. Enable Translation

-**2. Start Olla**
+#### 2. Start Olla

-**3. Configure Claude Code**
+#### 3. Configure Claude Code
```

internal/config/config.go (1)
317-327: Consider clarifying the comment. The comment on line 317 mentions "config for request size limits + streaming", but the code block only handles Anthropic translator environment overrides. Consider updating the comment to be more specific.
Apply this diff:
```diff
-	// config for request size limits + streaming
+	// Anthropic translator configuration overrides
 	if val := os.Getenv("OLLA_TRANSLATORS_ANTHROPIC_ENABLED"); val != "" {
```

examples/crush-vllm/compose.yaml (1)
4-4: Pin the image tag (avoid `latest`). Unpinned images hurt reproducibility and can break the example unexpectedly. Pin a version or digest.
Apply:
```diff
-  image: vllm/vllm-openai:latest
+  image: vllm/vllm-openai:0.6.2 # example - pin to a known-good tag or digest
```

internal/config/config_test.go (2)
586-604: Use `t.Setenv` to simplify env handling. Reduces boilerplate and avoids leaks if a test panics.
```diff
-	// Set env vars
-	for key, value := range testEnvVars {
-		os.Setenv(key, value)
-	}
-
-	// Clean up after test
-	defer func() {
-		for key := range testEnvVars {
-			os.Unsetenv(key)
-		}
-	}()
+	for k, v := range testEnvVars {
+		t.Setenv(k, v)
+	}
```
635-648: Don't reimplement substring search. Use `strings.Contains` for clarity.
strings.Containsfor clarity.-func contains(s, substr string) bool { - return len(s) >= len(substr) && (s == substr || len(substr) == 0 || - (len(s) > 0 && len(substr) > 0 && stringContains(s, substr))) -} - -func stringContains(s, substr string) bool { - for i := 0; i <= len(s)-len(substr); i++ { - if s[i:i+len(substr)] == substr { - return true - } - } - return false -} +import "strings" + +func contains(s, substr string) bool { + return strings.Contains(s, substr) +}internal/adapter/translator/anthropic/path_translation_test.go (2)
21-49: Also assert streaming flag propagation. You cover path mapping; add a check that `stream` on input produces `stream: true` in the OpenAI payload.
streamon input →stream: truein the OpenAI payload.@@ func TestPathTranslation(t *testing.T) { - for _, tt := range tests { + for _, tt := range tests { t.Run(tt.name, func(t *testing.T) { @@ - // Verify target path is set correctly + // Verify target path is set correctly assert.Equal(t, tt.expectedTargetPath, transformed.TargetPath, "Target path should be set to OpenAI chat completions endpoint") @@ - // Verify target path is not empty + // Verify target path is not empty assert.NotEmpty(t, transformed.TargetPath, "Target path must be populated for proper routing") + + // If streaming input, ensure streaming is propagated + if tt.anthropicReq.Stream { + v, ok := transformed.OpenAIRequest["stream"].(bool) + require.True(t, ok, "stream should be boolean in OpenAI request") + assert.True(t, v, "stream must be true when Anthropic request streams") + } }) }Also applies to: 84-91
1-167: Add a micro-benchmark for `TransformRequest`. Helps catch regressions and meets our test guideline for benchmarks under internal/*_test.go.

```go
func BenchmarkTransformRequest_Small(b *testing.B) {
	tr := NewTranslator(createTestLogger(), createTestConfig())
	reqBody := []byte(`{"model":"claude-3-5","max_tokens":64,"messages":[{"role":"user","content":"hi"}]}`)
	for i := 0; i < b.N; i++ {
		req := httptest.NewRequest("POST", "/olla/anthropic/v1/messages", bytes.NewReader(reqBody))
		if _, err := tr.TransformRequest(context.Background(), req); err != nil {
			b.Fatal(err)
		}
	}
}
```

[Based on learnings]
internal/app/handlers/application.go (1)
90-179: Optional: extract translator wiring into a helper. Keeps `NewApplication` lean and easier to unit test.

```diff
+func initTranslators(log logger.StyledLogger, cfg *config.Config) (*translator.Registry, error) {
+	reg := translator.NewRegistry(log)
+	if cfg.Translators.Anthropic.Enabled {
+		if err := cfg.Translators.Anthropic.Validate(); err != nil {
+			return nil, fmt.Errorf("invalid Anthropic translator config: %w", err)
+		}
+		tr := anthropic.NewTranslator(log, cfg.Translators.Anthropic)
+		if err := reg.Register("anthropic", tr); err != nil {
+			return nil, err
+		}
+		log.Info("Registered Anthropic translator", "max_message_size", cfg.Translators.Anthropic.MaxMessageSize)
+	} else {
+		log.Info("Anthropic translator disabled via configuration")
+	}
+	return reg, nil
+}
```

Then call `translatorRegistry, err := initTranslators(logger, cfg)`.

internal/app/handlers/server_routes.go (3)
108-120: Tidy logging; avoid an odd message, and log outside the conditional. The `InfoWithEndpoint(" ", ...)` message looks off, and it only logs when TokenCounter is implemented. Suggest a consistent info log regardless of TokenCounter, and keep Debug for paths.
Apply:
```diff
-	a.logger.InfoWithEndpoint(" ", name, "path", path)
+	a.logger.Info("Registered translator", "translator", name, "path", path)
```

Optionally move this info log outside the TokenCounter branch so it always emits.
Also applies to: 114-115
95-99: Docstring mismatch for the token count path. The comment says message path + "_count_tokens", but the path is "/count_tokens". Update the comment for accuracy.
```diff
-	// Uses message path + "_count_tokens" (e.g., /olla/anthropic/v1/messages/count_tokens)
+	// Uses message path + "/count_tokens" (e.g., /olla/anthropic/v1/messages/count_tokens)
```
76-81: Prefer `http.Method*` constants for consistency. Minor style: replace hard-coded "GET"/"POST" with `http.MethodGet`/`http.MethodPost`.
Also applies to: 88-93, 101-106
examples/opencode-lmstudio/README.md (4)
7-15: Add a language to the ASCII diagram code fence (markdownlint MD040). Specify a language (e.g. `text`) to satisfy linters.

````diff
-```
+```text
 ...
-```
+```
````
41-41: Grammar: avoid a colon after a preposition. "Download from:" should read "Download from".

```diff
-  - Download from: https://lmstudio.ai/
+  - Download from https://lmstudio.ai/
```
263-263: Hyphenate "up-to-date". Minor language fix.

```diff
-4. **Mac/Windows users**: Ensure Docker Desktop is up to date for `host.docker.internal` support
+4. **Mac/Windows users**: Ensure Docker Desktop is up-to-date for `host.docker.internal` support
```
324-332: Anthropic curl example: consider specifying the version header. Some clients expect an `Anthropic-Version` header; adding it improves interoperability with tooling.

```diff
 curl http://localhost:40114/olla/anthropic/v1/messages \
   -H "Content-Type: application/json" \
+  -H "Anthropic-Version: 2023-06-01" \
   -d '{
     "model": "qwen2.5-coder:7b",
     "max_tokens": 1024,
     "messages": [{"role": "user", "content": "Hello!"}]
   }'
```

Confirm the exact version your translator expects (or whether it ignores this header) and update the docs accordingly.

examples/claude-code-ollama/compose.yaml (2)
examples/claude-code-ollama/compose.yaml (2)
3-3: Avoid `:latest` tags for reproducibility. Pin images to a version or digest to prevent surprise upgrades.

Example:

```yaml
image: ollama/ollama:0.5.0 # or @sha256:<digest>
...
image: ghcr.io/thushan/olla:vX.Y.Z # or @sha256:<digest>
```

Also applies to: 26-26
38-42: Healthcheck tool availability. `wget` may not be present in minimal images. Prefer `curl -f` if available, or use CMD-SHELL with busybox wget. Keep checks fast and quiet.

```diff
-      test: ["CMD", "wget", "--quiet", "--tries=1", "--spider", "http://localhost:40114/internal/health"]
+      test: ["CMD", "curl", "-fsS", "http://localhost:40114/internal/health"]
```

If curl isn't present, switch to:

```yaml
test: ["CMD-SHELL", "wget -q --spider http://localhost:40114/internal/health || exit 1"]
```

internal/app/handlers/handler_translator_models_test.go (1)
62-66: Tighten assertions and add error‑path test; consider testify.
- Use require/assert from stretchr/testify for clearer failures.
- Add a test where GetUnifiedModels returns an error to verify 5xx and error body.
- Consider a small benchmark for this handler per guidelines.
Example changes:
```diff
+import "github.com/stretchr/testify/require"
...
-	if w.Code != http.StatusOK {
-		t.Errorf("expected status 200, got %d", w.Code)
-	}
+	require.Equal(t, http.StatusOK, w.Code)
```

Add:
```go
func TestTranslatorModelsHandler_Error(t *testing.T) {
	app := &Application{ /* inject a registry that returns error */ }
	trans := anthropic.NewTranslator(app.logger, config.AnthropicTranslatorConfig{Enabled: true})
	req := httptest.NewRequest(http.MethodGet, "/olla/anthropic/v1/models", nil)
	w := httptest.NewRecorder()
	app.translatorModelsHandler(trans)(w, req)
	if w.Code < 500 || w.Code >= 600 {
		t.Fatalf("expected 5xx, got %d", w.Code)
	}
}
```

Benchmark (minimal):
```go
func BenchmarkTranslatorModelsHandler(b *testing.B) {
	// set up app/trans with N models
	req := httptest.NewRequest(http.MethodGet, "/olla/anthropic/v1/models", nil)
	for i := 0; i < b.N; i++ {
		w := httptest.NewRecorder()
		handler(w, req)
	}
}
```

Based on learnings.
Also applies to: 122-126, 182-214
examples/claude-code-llamacpp/compose.yaml (2)
3-3: Pin image tags for predictability. Use a specific version or digest for the llama.cpp server and olla images.

```yaml
image: ghcr.io/ggml-org/llama.cpp:server-<ver> # or @sha256:<digest>
...
image: ghcr.io/thushan/olla:vX.Y.Z # or @sha256:<digest>
```

Also applies to: 41-41
26-31: Healthcheck tool availability and timing. Ensure the images include curl; if not, switch to a `wget` CMD-SHELL. Given model load time, consider a longer `start_period` for olla when backends are cold.

```diff
-      start_period: 10s
+      start_period: 30s
```

And if curl is missing:

```yaml
test: ["CMD-SHELL", "wget -q --spider http://localhost:8080/health || exit 1"]
```

Also applies to: 53-57
docs/content/integrations/frontend/opencode.md (1)
170-186: Verify LM Studio Docker availability. The documentation notes that LM Studio may not have an official Docker image and references an example directory for alternatives. Consider providing a more concrete alternative (e.g., using Ollama) in the quick start to avoid user confusion.
Apply this diff to provide a clearer fallback option:
````diff
 > **Note**: LM Studio may not have an official Docker image. See the [example directory](https://github.com/thushan/olla/tree/main/examples/opencode-lmstudio/) for alternative setup methods, or substitute with Ollama if preferred.
+>
+> **Quick Alternative**: Replace the `lmstudio` service with Ollama:
+> ```yaml
+> ollama:
+>   image: ollama/ollama:latest
+>   container_name: ollama
+>   restart: unless-stopped
+>   ports:
+>     - "11434:11434"
+> ```
````

examples/crush-vllm/crush-config.example.json (1)
59-114: Consider consolidating duplicate model definitions. Both providers define identical models with nearly identical metadata. Consider documenting a way to reference shared model definitions to reduce duplication and potential inconsistencies.
For example, in a more maintainable structure, you could note in comments that both providers share models and any updates should be applied to both, or reference Crush CLI documentation if it supports shared model definitions:
```diff
 "olla-anthropic": {
   "type": "anthropic",
   "base_url": "http://localhost:40114/olla/anthropic/v1",
   "api_key": "not-required",
-  "description": "Local vLLM via Olla (Anthropic format)",
+  "description": "Local vLLM via Olla (Anthropic format). Note: Models mirror olla-openai provider.",
```

internal/adapter/translator/anthropic/extended_features_test.go (1)
293-373: Add benchmarks for system character counting. The `countSystemChars` function is called during token estimation for every request. Consider adding benchmarks to ensure this operation remains performant, especially for large system prompt arrays. As per coding guidelines, add a benchmark:
```go
func BenchmarkCountSystemChars(b *testing.B) {
	system := []interface{}{
		map[string]interface{}{"type": "text", "text": "You are a helpful assistant with expertise in multiple domains. "},
		map[string]interface{}{"type": "text", "text": "Provide detailed and accurate responses. "},
		map[string]interface{}{"type": "text", "text": "Consider edge cases and best practices."},
	}
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		_ = countSystemChars(system)
	}
}
```

Based on coding guidelines.

internal/adapter/translator/anthropic/token_count.go (1)
internal/adapter/translator/anthropic/token_count.go (1)
37-57: Optional: improve character counting fairness for Unicode.
`len(string)` counts bytes, not runes. If you want a closer heuristic across languages, consider `utf8.RuneCountInString`. This will slightly change totals; gate it behind a flag if you need stability. Also applies to: 85-112, 113-156, 157-183
internal/app/handlers/handler_translation_bench_test.go (2)
99-111: Report bytes/op and assert required proxy headers.
- Call `b.SetBytes(int64(len(bodyBytes)))` to get bytes/op in reports.
- Assert presence of mandated headers to prevent regressions: X-Olla-Endpoint, X-Olla-Model, X-Olla-Backend-Type, X-Olla-Request-ID, X-Olla-Response-Time. As per coding guidelines.
Apply:
```diff
 b.ResetTimer()
 b.ReportAllocs()
+b.SetBytes(int64(len(bodyBytes)))
 for i := 0; i < b.N; i++ {
@@
 	if rec.Code != http.StatusOK {
 		b.Fatalf("Expected status 200, got %d", rec.Code)
 	}
+	// Verify required headers are set by the handler
+	required := []string{
+		"X-Olla-Endpoint",
+		"X-Olla-Model",
+		"X-Olla-Backend-Type",
+		constants.HeaderXOllaRequestID,
+		"X-Olla-Response-Time",
+	}
+	for _, h := range required {
+		if rec.Header().Get(h) == "" {
+			b.Fatalf("missing required header %q", h)
+		}
+	}
 }
```

As per coding guidelines.
178-191: Apply the same improvements to the streaming benchmark. Mirror `b.SetBytes` and the header assertions in the streaming case to validate SSE path behaviour as well.

Apply:

```diff
 b.ResetTimer()
 b.ReportAllocs()
+b.SetBytes(int64(len(bodyBytes)))
@@
 if rec.Code != http.StatusOK {
 	b.Fatalf("Expected status 200, got %d", rec.Code)
 }
+required := []string{
+	"X-Olla-Endpoint",
+	"X-Olla-Model",
+	"X-Olla-Backend-Type",
+	constants.HeaderXOllaRequestID,
+	"X-Olla-Response-Time",
+}
+for _, h := range required {
+	if rec.Header().Get(h) == "" {
+		b.Fatalf("missing required header %q", h)
+	}
+}
```

As per coding guidelines.

internal/adapter/translator/registry_test.go (3)
internal/adapter/translator/registry_test.go (3)
80-85: Fix test case name to match behaviour. The case name says "register multiple translators" but registers one. Rename for clarity.
Apply:
```diff
-	name: "register multiple translators",
+	name: "register single translator",
```
118-146: Strengthen the overwrite assertion. Verify the registry actually holds the second instance by pointer equality.
Apply:
```diff
@@
 retrieved, err := registry.Get("anthropic")
 if err != nil {
 	t.Fatalf("Expected no error, got: %v", err)
 }
-// Simple check that it's a different instance (in real code you'd use pointer comparison)
-if retrieved.Name() != "anthropic" {
-	t.Errorf("Expected overwritten translator name to be anthropic, got %s", retrieved.Name())
-}
+if mt, ok := retrieved.(*mockTranslator); !ok || mt != second {
+	t.Errorf("Expected registry to hold the second translator instance after overwrite")
+}
```
40-56: Run tests in parallel to reduce wall time. Call `t.Parallel()` at the start of each top-level test and within subtests where safe. Also applies to: 57-116, 148-214, 216-270, 272-295, 296-348, 350-394
internal/adapter/translator/anthropic/translator.go (2)
32-36: Avoid panicking on pool init; prefer returning an error. Panics in constructors make integration harder. Consider `NewTranslator(log, cfg) (*Translator, error)` and bubble the error; if you want a hard-fail helper, provide `MustNewTranslator(...)`. The current code path is unlikely to fail, but this change improves resilience.
98-104: Prefer a typed error payload for stability. Using a struct with JSON tags avoids accidental shape drift from map usage and improves IDE support.
Example:
```go
type anthropicError struct {
	Type  string `json:"type"`
	Error struct {
		Type    string `json:"type"`
		Message string `json:"message"`
	} `json:"error"`
}
```

internal/adapter/translator/anthropic/security_test.go (3)
16-59: Mark tests parallel and set HTTP method/headers. Add `t.Parallel()` to speed the suite, and initialise requests with `Method: "POST"` and a `Content-Type: application/json` header to better mirror real traffic. Apply pattern:
```diff
 func TestRequestValidation_RequiredFields(t *testing.T) {
+	t.Parallel()
@@
-	req := &http.Request{
-		Body: io.NopCloser(bytes.NewReader(body)),
-	}
+	req := &http.Request{
+		Method: "POST",
+		Header: http.Header{"Content-Type": []string{"application/json"}},
+		Body:   io.NopCloser(bytes.NewReader(body)),
+	}
```

Repeat similarly across tests in this file. Based on learnings (testify works fine with parallel tests; avoid suite parallelisation).
298-321: Trim the oversize payload to "limit+1" to reduce memory pressure. 11 MiB is larger than necessary. Use `(10<<20)+1` to assert the limit breach with less heap churn. Apply:
```diff
-	largeContent := strings.Repeat("A", 11*1024*1024) // 11mb of 'a's
+	largeContent := strings.Repeat("A", (10<<20)+1) // limit+1 bytes
```
16-416: Optional: factor request builders to reduce duplication. Introduce small helpers to build requests and marshal payloads to cut repetition and potential inconsistencies.
internal/adapter/translator/anthropic/tools.go (2)
10-26: Validate tool definitions before emitting OpenAI functions. Consider guarding against invalid tools (empty name, non-object schema) to avoid backend 400s. At minimum: trim/validate the name, ensure `schema.type == "object"`, and optionally log skipped tools instead of passing them through.
Apply a lightweight validation:
```diff
 func (t *Translator) convertTools(anthropicTools []AnthropicTool) []map[string]interface{} {
 	openaiTools := make([]map[string]interface{}, 0, len(anthropicTools))
 	for _, tool := range anthropicTools {
+		if strings.TrimSpace(tool.Name) == "" {
+			t.logger.Warn("skipping tool with empty name")
+			continue
+		}
+		if schemaType, _ := tool.InputSchema["type"].(string); schemaType == "" {
+			t.logger.Warn("tool schema missing 'type', defaulting to object", "tool", tool.Name)
+			if tool.InputSchema == nil {
+				tool.InputSchema = map[string]interface{}{}
+			}
+			tool.InputSchema["type"] = "object"
+		}
 		openaiTool := map[string]interface{}{
 			"type": openAITypeFunction,
 			"function": map[string]interface{}{
 				"name":        tool.Name,
 				"description": tool.Description,
 				"parameters":  tool.InputSchema, // Direct mapping - both use JSON Schema
 			},
 		}
 		openaiTools = append(openaiTools, openaiTool)
 	}
 	return openaiTools
 }
```
33-81: Tool choice mapping looks correct; add warnings on unknown inputs. Semantics for "auto"/"any"/specific tool are good. For unknown string/object values, defaulting to auto is fine, but add a debug/warn log to aid troubleshooting. Also trim and validate the specific tool name before returning.
```diff
 func (t *Translator) convertToolChoice(toolChoice interface{}) (interface{}, error) {
 	// Handle string form
 	if choiceStr, ok := toolChoice.(string); ok {
+		choiceStr = strings.TrimSpace(choiceStr)
 		switch choiceStr {
 		case toolChoiceAuto:
 			return openAIToolChoiceAuto, nil
 		case toolChoiceAny:
 			// Anthropic's "any" means require a tool call
 			// OpenAI uses "required" for this behaviour
 			return openAIToolChoiceRequired, nil
 		case toolChoiceNone:
 			return openAIToolChoiceNone, nil
 		default:
-			// Unknown string, default to auto
+			t.logger.Warn("unknown tool_choice string; defaulting to auto", "value", choiceStr)
 			return openAIToolChoiceAuto, nil
 		}
 	}
 	// Handle object form
 	if choiceMap, ok := toolChoice.(map[string]interface{}); ok {
 		choiceType, _ := choiceMap["type"].(string)
 		switch choiceType {
 		case toolChoiceAuto:
 			return openAIToolChoiceAuto, nil
 		case toolChoiceAny:
 			// Semantic mapping: Anthropic "any" -> OpenAI "required"
 			return openAIToolChoiceRequired, nil
 		case toolChoiceTool:
 			// Force specific tool selection
 			toolName, ok := choiceMap["name"].(string)
-			if !ok {
+			if !ok || strings.TrimSpace(toolName) == "" {
 				return nil, fmt.Errorf("tool_choice type 'tool' requires 'name' field")
 			}
 			return map[string]interface{}{
 				"type": openAITypeFunction,
 				"function": map[string]interface{}{
 					"name": toolName,
 				},
 			}, nil
 		default:
-			// Unknown type, default to auto
+			t.logger.Warn("unknown tool_choice object; defaulting to auto", "type", choiceType)
 			return openAIToolChoiceAuto, nil
 		}
 	}
 	// Safe default for unknown formats
 	return openAIToolChoiceAuto, nil
 }
```

docs/content/concepts/api-translation.md (2)
166-176: Specify languages on fenced code blocks to satisfy MD040 and improve rendering. Add explicit languages to code fences (e.g., `text` for SSE; mermaid/json are already done elsewhere).
````diff
-```
+```text
 data: {"choices":[{"delta":{"content":"Hello"}}]}
 data: {"choices":[{"delta":{"content":"!"}}]}
 data: [DONE]
````

````diff
-```
+```text
 event: message_start
 data: {"type":"message_start","message":{"id":"msg_01ABC","role":"assistant","model":"llama4.0:latest"}}
 ...
 event: message_stop
````

Also applies to: 174-196

225-226: Minor wording: "Full with" reads awkwardly. Slight readability improvement.

```diff
-| `"any"` | `"required"` | ✅ Full with semantic mapping |
+| `"any"` | `"required"` | ✅ Full — with semantic mapping |
```

internal/adapter/translator/anthropic/integration_test.go (2)
38-49: Set Content-Type for fidelity. Add a JSON content type header to mirror real requests and avoid future parser changes breaking tests.
```diff
 func createHTTPRequest(t *testing.T, anthropicReq AnthropicRequest) *http.Request {
 	t.Helper()
 	body, err := json.Marshal(anthropicReq)
 	require.NoError(t, err, "Failed to marshal Anthropic request")
 	req, err := http.NewRequest("POST", "/v1/messages", bytes.NewReader(body))
 	require.NoError(t, err, "Failed to create HTTP request")
+	req.Header.Set("Content-Type", "application/json")
 	return req
 }
```
654-706: Add a negative test for tool_choice `{type:"tool"}` missing name

Exercise the error path in `convertToolChoice` and ensure `TransformRequest` surfaces it.
```diff
@@ t.Run("tool_choice_variations", func(t *testing.T) {
 		testCases := []struct {
 			name               string
 			toolChoice         interface{}
 			expectedOpenAI     interface{}
 			expectedShouldFail bool
 		}{
@@
 			},
+			{
+				"specific_tool_missing_name",
+				map[string]interface{}{"type": "tool"},
+				nil,
+				true,
+			},
 		}
```

internal/adapter/proxy/sherpa/service_streaming.go (1)
212-235: Flush errors are ignored (good), but ensure headers are committed early

Call `WriteHeader(200)` before the first chunk if headers haven’t been sent, to guarantee clients see headers even if the first write/flush fails.
```diff
 func (s *Service) writeData(w http.ResponseWriter, data []byte, rc *http.ResponseController, state *streamState, rlog logger.StyledLogger) error {
@@
-	if !state.clientDisconnected {
+	if !state.clientDisconnected {
+		// Ensure headers committed before streaming
+		rc.Flush() // harmless if already written; commits headers if not
 		written, writeErr := w.Write(data)
```

internal/adapter/translator/anthropic/response_test.go (1)
628-665: Strengthen ID prefix assertion

Use a prefix check rather than contains to avoid false positives.
```diff
-assert.Contains(t, anthropicResp.ID, "msg_01", "Message ID should start with msg_01 prefix")
+assert.True(t, strings.HasPrefix(anthropicResp.ID, "msg_01"), "Message ID should start with msg_01 prefix")
```

internal/adapter/translator/registry.go (1)
87-96: Minor: clarify locking expectations in helper
`getAvailableNames` depends on the caller holding the lock. Consider renaming to `unsafeGetAvailableNames` or amend the comment to explicitly state it must be called with `mu` already held (RLock or Lock).

docs/content/api-reference/anthropic.md (1)
592-603: Doc ↔ implementation mismatch: “All responses include standard Olla headers”

The docs state all responses include `X-Olla-*` headers. The new translator models endpoint (`GET /olla/anthropic/v1/models`) currently writes only `Content-Type` and status, not the `X-Olla-*` headers.

Please either:
- Update the handler to emit at least `X-Olla-Request-ID` and `X-Olla-Response-Time` (and any available others), or
- Narrow the wording to “proxy responses” to match current behaviour.
I recommend emitting `X-Olla-Response-Time` everywhere for consistency.

internal/app/handlers/handler_translator_models.go (3)
49-51: Emit standard Olla headers (at least response time) for consistency with docs

Set `X-Olla-Response-Time` (and `X-Olla-Request-ID` if available from context/middleware) before writing the response.

```diff
 func (a *Application) translatorModelsHandler(trans translator.RequestTranslator) http.HandlerFunc {
 	return func(w http.ResponseWriter, r *http.Request) {
-		ctx := r.Context()
+		start := time.Now()
+		ctx := r.Context()
@@
-		w.Header().Set(constants.HeaderContentType, constants.ContentTypeJSON)
+		w.Header().Set(constants.HeaderContentType, constants.ContentTypeJSON)
+		// Best-effort response time header
+		w.Header().Set(constants.HeaderXOllaResponseTime, time.Since(start).String())
 		w.WriteHeader(http.StatusOK)
-		json.NewEncoder(w).Encode(response)
+		_ = json.NewEncoder(w).Encode(response)
 	}
 }
```

If you already have a request ID in context, also set `X-Olla-Request-ID`. [Based on learnings]
96-106: Align error payload with documented Anthropic format by including top-level type

Add `"type": "error"` at the top level to match the documented error schema.

```diff
-	errorResp := map[string]interface{}{
-		"error": map[string]interface{}{
-			"message": message,
-			"type":    "models_error",
-		},
-	}
+	errorResp := map[string]interface{}{
+		"type": "error",
+		"error": map[string]interface{}{
+			"message": message,
+			"type":    "models_error",
+		},
+	}
```
66-76: Prefer model timestamps when available

Instead of `time.Now().Unix()` per entry, use `model.LastSeen.Unix()` if non‑zero to provide a stable, meaningful `created` value. Fall back to `time.Now()` only when `LastSeen` is zero.

internal/app/handlers/handler_translation_test.go (1)
1-204: Add microbenchmarks for translation paths

Per guidelines, include benchmarks for critical paths (non‑streaming and streaming). Example stubs:

```go
func BenchmarkTranslationHandler_NonStreaming(b *testing.B) {
	app := &Application{ /* init with mocks as in TestTranslationHandler_NonStreaming */ }
	h := app.translationHandler(&mockTranslator{name: "bench"})
	body, _ := json.Marshal(map[string]any{
		"model":    "m",
		"messages": []any{map[string]any{"role": "user", "content": "hi"}},
	})
	for i := 0; i < b.N; i++ {
		// recreate the request each iteration so the body can be re-read
		req := httptest.NewRequest("POST", "/test", bytes.NewReader(body))
		req.Header.Set(constants.HeaderContentType, constants.ContentTypeJSON)
		rec := httptest.NewRecorder()
		h.ServeHTTP(rec, req)
	}
}

func BenchmarkTransformStreamingResponse(b *testing.B) {
	// create translator under test and a representative stream
	chunks := []string{
		"data: {\"id\":\"x\",\"model\":\"m\",\"choices\":[{\"delta\":{\"content\":\"Hello\"},\"index\":0}]}\n\n",
		"data: {\"id\":\"x\",\"choices\":[{\"delta\":{},\"index\":0,\"finish_reason\":\"stop\"}]}\n\n",
		"data: [DONE]\n\n",
	}
	stream := strings.Join(chunks, "")
	for i := 0; i < b.N; i++ {
		tr := &mockTranslator{name: "bench-stream"}
		rec := httptest.NewRecorder()
		_ = tr.TransformStreamingResponse(context.Background(), strings.NewReader(stream), rec, nil)
	}
}
```
442-482: Add a benchmark for streaming translation

Given this path is performance‑critical, add a benchmark to quantify overhead.

```go
func BenchmarkAnthropic_TransformStreamingResponse(b *testing.B) {
	tr := NewTranslator(createStreamingTestLogger(), createTestConfig())
	chunks := []string{
		"data: {\"id\":\"chatcmpl-bench\",\"model\":\"claude-3-5-sonnet-20241022\",\"choices\":[{\"delta\":{\"content\":\"Test\"},\"index\":0}]}\n\n",
		"data: {\"id\":\"chatcmpl-bench\",\"choices\":[{\"delta\":{},\"index\":0,\"finish_reason\":\"stop\"}]}\n\n",
		"data: [DONE]\n\n",
	}
	payload := strings.Join(chunks, "")
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		rec := httptest.NewRecorder()
		_ = tr.TransformStreamingResponse(context.Background(), strings.NewReader(payload), rec, nil)
	}
}
```
23-29: Optional: simulate real streaming with io.Pipe for stronger signal

`createMockOpenAIStream` writes a full buffer at once. For closer parity with runtime, consider an `io.Pipe` that writes chunks with small delays to surface any read/flush timing bugs.

internal/adapter/translator/anthropic/response.go (2)
115-125: Handle `content_filter` finish_reason.

OpenAI may emit `content_filter`. Map it explicitly to a sensible stop reason (e.g., `end_turn`) to avoid ambiguity.

```diff
 func mapFinishReasonToStopReason(finishReason string) string {
 	switch finishReason {
 	case "stop":
 		return "end_turn"
 	case "tool_calls":
 		return contentTypeToolUse
 	case "length":
 		return "max_tokens"
+	case "content_filter":
+		return "end_turn"
 	default:
 		return "end_turn"
 	}
 }
```
191-200: Strengthen ID fallback uniqueness.

If `crypto/rand` fails, the current fallback can return identical IDs (`msg_01fallback0`). Mix in time to ensure uniqueness.

```diff
 import (
 	"context"
 	"crypto/rand"
 	"encoding/json"
 	"fmt"
 	"math/big"
 	"net/http"
+	"time"
 )
@@
 	if _, err := rand.Read(randomBytes); err != nil {
 		// crypto/rand shouldn't fail but just in case
 		t.logger.Warn("Failed to generate random bytes for message ID", "error", err)
-		return fmt.Sprintf("msg_01fallback%d", big.NewInt(0).SetBytes(randomBytes[:8]).Uint64())
+		ts := time.Now().UnixNano()
+		return fmt.Sprintf("msg_01fallback%x", ts)
 	}
```

internal/adapter/translator/anthropic/request.go (2)
209-213: Preserve user text boundaries.

Joining with `""` collapses spacing. Prefer a newline (or space) to retain readability.

```diff
-	"content": strings.Join(textParts, ""),
+	"content": strings.Join(textParts, "\n"),
```
39-41: Set Content-Type for rewritten request body.

After marshalling the OpenAI body, set the JSON content type to avoid backend heuristics.

```diff
 r.Body = io.NopCloser(bytes.NewReader(openaiBody))
 r.ContentLength = int64(len(openaiBody))
+r.Header.Set("Content-Type", "application/json")
```

internal/adapter/translator/anthropic/types.go (1)
35-38: Consider allowing MaxTokens=0 as “use default”.Strictly requiring ≥1 may break clients that omit or set 0 expecting server defaults. Option: accept 0 and inject a configured default during translation.
internal/adapter/translator/anthropic/streaming.go (1)
31-35: Add `X-Accel-Buffering: no` for SSE behind proxies.

Helps disable buffering on Nginx to keep events flowing.

```diff
 w.Header().Set(constants.HeaderContentType, "text/event-stream")
 w.Header().Set("Cache-Control", "no-cache")
 w.Header().Set("Connection", "keep-alive")
+w.Header().Set("X-Accel-Buffering", "no")
```

internal/app/handlers/handler_translation.go (1)
39-41: Set Content-Type after rewriting request body (defensive).

Helps avoid backend surprises if Content-Type wasn’t JSON.

```diff
 r.Body = io.NopCloser(bytes.NewReader(openaiBody))
 r.ContentLength = int64(len(openaiBody))
+r.Header.Set(constants.HeaderContentType, constants.ContentTypeJSON)
```
```text
Content-Type: application/json
anthropic-version: 2023-06-01  # Optional, for compatibility
```
|
|
Add languages to fenced code blocks to satisfy MD040 and improve readability
Specify a language for the following fences:

- Lines 80–84 (headers) → use `text`
- Lines 198–240 (SSE example) → use `text`
- Lines 470–474 (headers) → use `text`
- Lines 568–589 (full SSE sequence) → use `text`
Example:
````diff
-```
+```text
Content-Type: application/json
anthropic-version: 2023-06-01 # Optional, for compatibility
```
````

Also applies to: 198-240, 470-474, 568-589
<details>
<summary>🧰 Tools</summary>
<details>
<summary>🪛 markdownlint-cli2 (0.18.1)</summary>
80-80: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
</details>
</details>
<details>
<summary>🤖 Prompt for AI Agents</summary>
In docs/content/api-reference/anthropic.md around lines 80-84, 198-240, 470-474, and 568-589, the fenced code blocks lack a language tag, which triggers MD040 and reduces readability; update each opening fence to include the language identifier "text" (i.e. replace the bare `` ``` `` fence with `` ```text `` for those blocks) so the headers, SSE examples, and full SSE sequence are marked as plain text.
</details>
<!-- This is an auto-generated comment by CodeRabbit -->
| Anthropic | OpenAI | Behaviour |
|-----------|--------|-----------|
| `"auto"` | `"auto"` | Let model decide whether to use tools |
| `"any"` | `"required"` | Force model to use a tool (any tool) |
| `"none"` | `"none"` | Disable tool use |
| `{"type":"tool","name":"X"}` | `{"type":"function","function":{"name":"X"}}` | Force specific tool |
Hyphenate “Force-specific” for correctness
Update table text to “Force‑specific tool”.
```diff
-| `{"type":"tool","name":"X"}` | `{"type":"function","function":{"name":"X"}}` | Force specific tool |
+| `{"type":"tool","name":"X"}` | `{"type":"function","function":{"name":"X"}}` | Force-specific tool |
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| Anthropic | OpenAI | Behaviour |
|-----------|--------|-----------|
| `"auto"` | `"auto"` | Let model decide whether to use tools |
| `"any"` | `"required"` | Force model to use a tool (any tool) |
| `"none"` | `"none"` | Disable tool use |
| `{"type":"tool","name":"X"}` | `{"type":"function","function":{"name":"X"}}` | Force-specific tool |
🧰 Tools
🪛 LanguageTool
[uncategorized] ~242-~242: When ‘Force-specific’ is used as a modifier, it is usually spelled with a hyphen.
Context: ...:"function","function":{"name":"X"}}` | Force specific tool | --- ## Architecture ### Trans...
(SPECIFIC_HYPHEN)
🤖 Prompt for AI Agents
docs/content/concepts/api-translation.md around lines 237 to 244: the table row
label in the Behaviour column reads "Force specific tool" and needs to be
hyphenated; update that cell text to "Force-specific tool" so it correctly uses
a hyphen.
1. Request arrives at `/olla/anthropic/v1/messages`
2. Handler invokes Anthropic translator
3. Translator converts request to OpenAI format
4. Proxy routes to backend (standard Olla routing)
5. Backend responds in OpenAI format
6. Translator converts response to Anthropic format
7. Response returned to client
|
|
Fix duplicate word in the numbered “Process” list
There appears to be a repeated word; remove the duplicate.
```diff
-1. Request arrives at `/olla/anthropic/v1/messages`
-2. Handler invokes Anthropic translator
-3. Translator Translator converts request to OpenAI format
+1. Request arrives at `/olla/anthropic/v1/messages`
+2. Handler invokes Anthropic translator
+3. Translator converts request to OpenAI format
```

Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 LanguageTool
[duplication] ~295-~295: Possible typo: you repeated a word.
Context: .../messages` 2. Handler invokes Anthropic translator 3. Translator converts request to OpenAI format 4. Pr...
(ENGLISH_WORD_REPEAT_RULE)
🤖 Prompt for AI Agents
In docs/content/concepts/api-translation.md around lines 294 to 301, a word is
duplicated in the numbered “Process” list; edit the affected list item to remove
the repeated word so the sentence reads cleanly and grammatically (e.g., fix the
duplicated token in the translator/convert step), keeping the numbering and
overall steps unchanged.
```yaml
translators:
  anthropic:
    enabled: true
    # Anthropic-specific configuration
    version: "2023-06-01"
    # Map Anthropic system messages to OpenAI format
    system_message_handling: "prepend" # Options: prepend, append, merge
    # Handle max_tokens requirement (Anthropic requires it)
    default_max_tokens: 4096
```
|
|
🧩 Analysis chain
Verify Anthropic translator configuration fields.
The configuration includes fields (version, system_message_handling, default_max_tokens) that appear to be documentation placeholders rather than implemented features. Ensure these configuration options are actually supported by the Anthropic translator implementation before including them in examples.
Run the following script to verify if these configuration fields are used in the translator code:
🏁 Script executed:
```shell
#!/bin/bash
# Search for references to these configuration fields in the translator implementation
rg -n --type=go -C3 'version.*2023-06-01|system_message_handling|default_max_tokens' internal/adapter/translator/anthropic/
```

Length of output: 0
I'll conduct broader searches to understand the codebase structure and verify whether these configuration fields are actually implemented.
---
🏁 Script executed:
```shell
#!/bin/bash
# Find files related to Anthropic or translators
fd -t f -i anthropic
```

Length of output: 94
🏁 Script executed:
```shell
#!/bin/bash
# Search for translator-related files and directories
fd -t d translator
```

Length of output: 28
🏁 Script executed:
```shell
#!/bin/bash
# Search the entire codebase for these config field names
rg -n 'system_message_handling|default_max_tokens' --type=go --type=yaml
```

Length of output: 168
🏁 Script executed:
```shell
#!/bin/bash
# Check if the example YAML file is validated or parsed anywhere
rg -n 'olla\.yaml|crush-vllm' --type=go
```

Length of output: 0
I need to check what translators are actually implemented and review the documentation.
---
🏁 Script executed:
```shell
#!/bin/bash
# List translator implementations
ls -la internal/adapter/translator/
```

Length of output: 361
🏁 Script executed:
```shell
#!/bin/bash
# Check if Anthropic translator exists
ls -la internal/adapter/translator/ | grep -i anthropic
```

Length of output: 59
🏁 Script executed:
```shell
#!/bin/bash
# View the Anthropic documentation to see if these fields are documented
cat docs/content/integrations/api-translation/anthropic.md
```

Length of output: 22724
🏁 Script executed:
```shell
#!/bin/bash
# Search for YAML config parsing and validation logic
rg -n 'yaml|config|unmarshal' --type=go -C2 | head -50
```

Length of output: 2483
---
🏁 Script executed:
```shell
#!/bin/bash
# List files in Anthropic translator implementation
ls -la internal/adapter/translator/anthropic/
```

Length of output: 1382
🏁 Script executed:
```shell
#!/bin/bash
# Search for struct definitions or config handling in Anthropic translator
rg -n 'type.*Config|Unmarshal|version|system_message' internal/adapter/translator/anthropic/ --type=go
```

Length of output: 2248
Remove undocumented configuration fields from Anthropic translator example.
The Anthropic translator example includes three unsupported configuration options (version, system_message_handling, default_max_tokens) that are not documented or implemented. The official documentation lists only enabled (boolean) and max_message_size (integer) as supported configuration options, and these fields appear nowhere in the translator implementation. Remove lines 46, 48, and 50 to prevent user confusion.
🤖 Prompt for AI Agents
In examples/crush-vllm/olla.yaml around lines 42 to 51, the Anthropic translator
example contains undocumented/unsupported fields (version,
system_message_handling, default_max_tokens); remove the three lines (currently
at lines 46, 48, and 50) so the block only contains supported options (enabled
and max_message_size) to match the translator implementation and docs.
```go
a.routeRegistry.RegisterWithMethod(
	path,
	handler,
	name+" Messages API",
	"POST",
)

// Register models endpoint for translators
// Uses base path from translator + "/models" (e.g., /olla/anthropic/v1/models)
modelsPath := extractBasePath(path) + "/models"
modelsHandler := a.translatorModelsHandler(trans)

a.routeRegistry.RegisterWithMethod(
	modelsPath,
	modelsHandler,
	name+" Models API",
	"GET",
)
```
🧩 Analysis chain
Ensure proxy headers are set for translator endpoints.
Per guidelines for internal/app/handlers/*.go, proxy responses must set: X-Olla-Endpoint, X-Olla-Model, X-Olla-Backend-Type, X-Olla-Request-ID, X-Olla-Response-Time. Please confirm translationHandler and translatorModelsHandler/tokenCountHandler set these (or are wrapped by middleware that does).
Run:
If missing, consider a lightweight wrapper:
```diff
- handler := a.translationHandler(trans)
+ handler := a.withProxyHeaders(a.translationHandler(trans))
```

…and implement `withProxyHeaders` once in this package.
As per coding guidelines.
Also applies to: 101-107
🏁 Script executed:
```shell
#!/bin/bash
# Verify required proxy headers are set by handlers or middleware.
rg -n --type=go -C2 $'X-Olla-(Endpoint|Model|Backend-Type|Request-ID|Response-Time)' internal/app/handlers || true
# Inspect translation handler for header writes
ast-grep --pattern $'func ($_ $_) translationHandler($_) $_ { $$$ }' || true
rg -n --type=go -C3 'translationHandler\(' internal/app/handlers || true
# Check if routeRegistry injects headers automatically for RegisterWithMethod/RegisterProxyRoute
rg -n --type=go -C3 'RegisterWithMethod|RegisterProxyRoute' internal || true
```

Length of output: 31618
Expected output would show where headers are set in the actual handlers, confirming whether they're being set for translator endpoints.
---
🏁 Script executed:
```shell
#!/bin/bash
# Find executeTranslatedStreamingRequest and executeTranslatedNonStreamingRequest implementations
rg -n --type=go 'executeTranslatedStreamingRequest|executeTranslatedNonStreamingRequest' internal/app/handlers -A 20 | head -100
# Find translatorModelsHandler and tokenCountHandler implementations
rg -n --type=go -B2 'func.*translatorModelsHandler|func.*tokenCountHandler' internal/app/handlers | head -80
# Search for where X-Olla headers are written (w.Header().Set with X-Olla)
rg -n --type=go 'w\.Header\(\)\.Set.*X-Olla' internal/app/handlers | head -30
```

Length of output: 185
---
🏁 Script executed:
```shell
#!/bin/bash
# Find executeTranslatedStreamingRequest and executeTranslatedNonStreamingRequest implementations
rg -n --type=go 'func.*executeTranslatedStreamingRequest|func.*executeTranslatedNonStreamingRequest' internal/app/handlers -A 25 | head -150
# Find translatorModelsHandler and tokenCountHandler implementations
rg -n --type=go 'func.*translatorModelsHandler|func.*tokenCountHandler' internal/app/handlers -A 20 | head -100
# Search for where X-Olla headers are actually written
rg -n --type=go 'w\.Header\(\)\.Set.*constants\.HeaderXOlla|w\.Header\(\)\.Set.*"X-Olla' internal/app/handlers -B2 -A2 | head -80
```

Length of output: 14776
---
🏁 Script executed:
```shell
#!/bin/bash
# Find writeTranslatorModelsError implementation
rg -n --type=go 'func.*writeTranslatorModelsError' internal/app/handlers -A 15
# Find writeTranslatorError implementation
rg -n --type=go 'func.*writeTranslatorError' internal/app/handlers -A 15
# Find ProxyRequestToEndpoints implementation (likely in adapter/proxy)
rg -n --type=go 'func.*ProxyRequestToEndpoints' internal -A 15
# Check if there's middleware wrapping the handlers
rg -n --type=go 'withProxyHeaders|setProxyHeaders' internal -B2 -A5
```

Length of output: 22713
---
🏁 Script executed:
```shell
#!/bin/bash
# Find ProxyRequestToEndpointsWithRetry implementation in olla
rg -n --type=go 'func.*ProxyRequestToEndpointsWithRetry' internal/adapter/proxy/olla/service_retry.go -A 50 | head -100
# Find actual response header writing in proxy service
rg -n --type=go 'w\.Header\(\)\.Set.*HeaderXOlla' internal/adapter/proxy -B2 -A2 | head -80
# Check if there's any middleware or wrapper that adds headers
rg -n --type=go 'HeaderXOlla' internal -B3 -A3 | grep -E '(func|\.Set)' | head -40
```

Length of output: 8131
---
🏁 Script executed:
```shell
#!/bin/bash
# Find the function containing the header-setting code in common.go
rg -n --type=go --context 20 'h\.Set\(constants\.HeaderXOllaRequestID' internal/adapter/proxy/core/common.go | head -80
# Find what calls this function
rg -n --type=go 'SetProxyResponseHeaders|WriteProxyHeaders|copyResponseHeaders' internal/adapter/proxy -B3 -A3 | head -60
```

Length of output: 1432
---
🏁 Script executed:
```shell
#!/bin/bash
# Find all calls to SetResponseHeaders
rg -n --type=go 'SetResponseHeaders\(' internal -B3 -A3
# Check if executeTranslatedNonStreamingRequest or executeTranslatedStreamingRequest call it
rg -n --type=go 'executeTranslatedNonStreamingRequest|executeTranslatedStreamingRequest' internal/app/handlers/handler_translation.go -A 50 | grep -E '(SetResponseHeaders|w\.Header)'
# Check if proxyService implementations call SetResponseHeaders
rg -n --type=go 'proxyToSingleEndpoint|retryHandler\.ExecuteWithRetry' internal/adapter/proxy -A 30 | grep -E '(SetResponseHeaders|w\.Header)'
```

Length of output: 4130
---
🏁 Script executed:
```shell
#!/bin/bash
# Find ExecuteWithRetry implementation to see if it calls SetResponseHeaders
rg -n --type=go 'func.*ExecuteWithRetry' internal/adapter/proxy -A 40 | head -120
# Also check for any wrapper or middleware between handler and proxy service
rg -n --type=go 'translationHandler|translatorModelsHandler|tokenCountHandler' internal/app/handlers/server_routes.go -B2 -A2
```

Length of output: 3500
---
🏁 Script executed:
```shell
#!/bin/bash
# Full implementation of translatorModelsHandler to see if it sets headers
rg -n --type=go 'func \(a \*Application\) translatorModelsHandler' internal/app/handlers -A 80 | head -150
# Full implementation of tokenCountHandler
rg -n --type=go 'func \(a \*Application\) tokenCountHandler' internal/app/handlers -A 80 | head -150
```

Length of output: 13357
Set X-Olla proxy headers on translator Models and Token Count endpoints.
Per guidelines, translator endpoints must set response headers: X-Olla-Endpoint, X-Olla-Model, X-Olla-Backend-Type, X-Olla-Request-ID, X-Olla-Response-Time.
- `translationHandler` (lines 76–80): ✓ Already compliant via `proxyService.ProxyRequestToEndpoints()` → `core.SetResponseHeaders()`
- `translatorModelsHandler` (lines 85–87): ✗ Missing header calls
- `tokenCountHandler` (lines 101–107): ✗ Missing header calls
Add header setting for the Models and Token Count handlers. Use the existing `copyOllaHeaders` utility if available, or call `core.SetResponseHeaders()` with appropriate stats/endpoint context.
- fix typos in configuration for Claude Code
- Refactor Streaming tests and use helpers
- update handler tests
- fix trailing slashes in paths causing issues when not escaped
- update logging to avoid PII data leak
- add additional tests for token counts, configuration
Actionable comments posted: 3
♻️ Duplicate comments (3)
internal/adapter/translator/anthropic/response.go (2)
205-206: Fix typo in comment.

“ambigious” → “ambiguous”.

Apply:

```diff
-// encode bytes to base58, shorter and less ambigious than hex
+// encode bytes to base58, shorter and less ambiguous than hex
```
141-151: Reduce PII risk in tool-args logging.

Don’t log the preview string under a name suggesting it’s the full value; prefer length only.

Apply:

```diff
-	t.logger.Warn("Failed to parse tool arguments, using empty input",
+	t.logger.Warn("Failed to parse tool arguments; using empty input",
 		"tool", name,
 		"tool_id", id,
 		"error", err,
-		"raw_arguments", util.TruncateString(argsStr, util.DefaultTruncateLengthPII),
-		"raw_arguments_len", len(argsStr))
+		"raw_arguments_len", len(argsStr))
```

internal/app/handlers/handler_translation.go (1)
76-79: Fix typo in comment.

“make shure” → “make sure”.

Apply:

```diff
-	// make shure that we have at least one endpoint available
+	// make sure that we have at least one endpoint available
```
🧹 Nitpick comments (14)
docs/content/integrations/frontend/claude-code.md (3)
101-115: Add a language to the ASCII diagram code fence (markdownlint MD040).

Specify a fence language to satisfy linters and improve rendering.

````diff
-```
+```text
 ...
-```
+```
````
11-12: Prefer headings over bold for section titles (markdownlint MD036).

Use proper headings for structure and navigation.

```diff
-**Set in Claude Code:**
+## Set in Claude Code

-**Option A: Environment Variables (Recommended)**
+### Option A: Environment Variables (Recommended)

-**Option B: Configuration File**
+### Option B: Configuration File
```

Also applies to: 275-276, 290-296
537-541: Minor wording: “multi‑task” → “multitask”.

Improve consistency with common usage.

```diff
-  - `qwen3:32b` - Strong multi-task performance
+  - `qwen3:32b` - Strong multitask performance
```

internal/adapter/translator/anthropic/token_count.go (2)
120-146: Support pointer to typed content blocks in message.Content.

Messages can mirror system’s typed forms. Add `*[]ContentBlock` handling to avoid undercounting.

```diff
 func countMessageChars(msg *AnthropicMessage) int {
 	totalChars := 0
 	switch content := msg.Content.(type) {
 	case string:
@@
 	case []ContentBlock:
 		// typed block arrays
 		for _, block := range content {
 			totalChars += countTypedContentBlockChars(&block)
 		}
+	case *[]ContentBlock:
+		if content != nil {
+			for _, block := range *content {
+				totalChars += countTypedContentBlockChars(&block)
+			}
+		}
 	}
 	return totalChars
 }
```
51-71: Optional: count runes, not bytes, for string lengths.

Using `len(s)` counts bytes, so it overestimates non‑ASCII text. Consider `utf8.RuneCountInString` for more language‑agnostic estimates. Minimal change: introduce a small helper and use it in the string paths (text, names; serialized JSON remains byte‑based).
Example:

```go
import "unicode/utf8"

func runeLen(s string) int { return utf8.RuneCountInString(s) }
```

Then replace `len(...)` on user‑visible text fields with `runeLen(...)`. No functional change required now.
Also applies to: 73-119, 148-190, 192-230
internal/adapter/translator/anthropic/integration_test.go (1)
92-98: Run top‑level tests in parallel to speed CI (safe here).

Each Test* creates its own translator/config; mark them parallel.

```diff
 func TestAnthropicToOpenAIToAnthropic_RoundTrip(t *testing.T) {
 	translator := NewTranslator(createIntegrationTestLogger(), createTestConfig())
 	ctx := context.Background()
+	t.Parallel()
 }
@@
 func TestAnthropicToolCalling_RoundTrip(t *testing.T) {
 	translator := NewTranslator(createIntegrationTestLogger(), createTestConfig())
 	ctx := context.Background()
+	t.Parallel()
 }
@@
 func TestAnthropicEdgeCases_RoundTrip(t *testing.T) {
 	translator := NewTranslator(createIntegrationTestLogger(), createTestConfig())
 	ctx := context.Background()
+	t.Parallel()
 }
@@
 func TestAnthropicModelPreservation(t *testing.T) {
 	translator := NewTranslator(createIntegrationTestLogger(), createTestConfig())
 	ctx := context.Background()
+	t.Parallel()
 }
@@
 func TestAnthropicUsageTracking(t *testing.T) {
 	translator := NewTranslator(createIntegrationTestLogger(), createTestConfig())
 	ctx := context.Background()
+	t.Parallel()
 }
```

Also applies to: 374-380, 827-833, 1033-1039, 1074-1081
internal/adapter/translator/registry.go (1)
72-85: Optional: expose Unregister for dynamic reconfiguration.

If translators can be reloaded/replaced at runtime, an `Unregister(name)` method (with lock) prevents stale entries during reinit.
No change required now.
Also applies to: 87-105
internal/adapter/translator/anthropic/response.go (2)
79-85: Handle OpenAI message.content arrays (multi‑part).

Support `[]interface{}` with text parts to avoid dropping content from newer APIs.

Apply:

```diff
-	// Handle text content
-	if textContent, ok := message["content"].(string); ok && textContent != "" {
-		content = append(content, ContentBlock{
-			Type: contentTypeText,
-			Text: textContent,
-		})
-	}
+	// Handle text content
+	switch c := message["content"].(type) {
+	case string:
+		if c != "" {
+			content = append(content, ContentBlock{Type: contentTypeText, Text: c})
+		}
+	case []interface{}:
+		// Concatenate text parts; ignore non-text here (images not supported in non-stream path yet)
+		var b strings.Builder
+		for _, part := range c {
+			if m, ok := part.(map[string]interface{}); ok {
+				if t, _ := m["type"].(string); t == "text" {
+					if txt, _ := m["text"].(string); txt != "" {
+						b.WriteString(txt)
+					}
+				}
+			}
+		}
+		if b.Len() > 0 {
+			content = append(content, ContentBlock{Type: contentTypeText, Text: b.String()})
+		}
+	}
```

Add import:

```diff
+	"strings"
```
117-127: Map ‘content_filter’ finish_reason.

Add explicit mapping to a sensible default to avoid surprises.

Apply:

```diff
 switch finishReason {
 case "stop":
 	return "end_turn"
 case "tool_calls":
 	return contentTypeToolUse
 case "length":
 	return "max_tokens"
+case "content_filter":
+	return "end_turn"
 default:
 	return "end_turn"
 }
```

internal/adapter/translator/anthropic/streaming_test.go (3)
14-59: Mark tests parallel where safe.

Add `t.Parallel()` at the start of each TestTransformStreamingResponse_* to speed CI.

Example:

```diff
 func TestTransformStreamingResponse_SimpleText(t *testing.T) {
+	t.Parallel()
 	translator := newTestTranslator()
```

Repeat for other tests in this file.
523-544: Add boundary test for max_message_size.

Include a case exceeding configured MaxMessageSize to assert the translator returns an error/status rather than hanging.
Would you like me to draft the test using a >10MiB single-arg chunk?
296-338: Finish-reason default case.

Add a case asserting unknown finish_reason maps to “end_turn” to pin the expected default.
internal/app/handlers/handler_translation.go (1)
153-165: Avoid string keys in context.Use a typed, unexported key to prevent collisions.
Apply:
func (a *Application) prepareProxyContext(ctx context.Context, r *http.Request, pr *proxyRequest) (context.Context, *http.Request) { - if pr.model != "" { - ctx = context.WithValue(ctx, "model", pr.model) - r = r.WithContext(ctx) - } + if pr.model != "" { + ctx = context.WithValue(ctx, ctxKeyModel, pr.model) + r = r.WithContext(ctx) + }Add near the top of the file:
```go
// context keys
type ctxKey string

const ctxKeyModel ctxKey = "model"
```

internal/adapter/translator/anthropic/streaming_test_helpers.go (1)
137-151: Optionally ignore non‑JSON data lines.
If future streams include “data: [DONE]”, skip parsing those lines to avoid false errors in helpers.
Example:
```diff
 } else if strings.HasPrefix(line, "data: ") {
     dataStr := strings.TrimPrefix(line, "data: ")
+    if dataStr == "[DONE]" {
+        continue
+    }
     var data map[string]interface{}
     if err := json.Unmarshal([]byte(dataStr), &data); err != nil {
         return nil, err
     }
```
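A standalone version of the skip-[DONE] idea, runnable in isolation (parseSSEData is an illustrative helper, not the repo's actual test helper):

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// parseSSEData collects the JSON payload of each "data:" line, skipping the
// OpenAI-style "[DONE]" sentinel instead of failing to unmarshal it.
func parseSSEData(stream string) ([]map[string]interface{}, error) {
	var events []map[string]interface{}
	sc := bufio.NewScanner(strings.NewReader(stream))
	for sc.Scan() {
		line := sc.Text()
		if !strings.HasPrefix(line, "data: ") {
			continue
		}
		payload := strings.TrimPrefix(line, "data: ")
		if payload == "[DONE]" {
			continue
		}
		var ev map[string]interface{}
		if err := json.Unmarshal([]byte(payload), &ev); err != nil {
			return nil, err
		}
		events = append(events, ev)
	}
	return events, sc.Err()
}

func main() {
	stream := "data: {\"type\":\"ping\"}\n\ndata: [DONE]\n"
	events, err := parseSSEData(stream)
	fmt.Println(len(events), err) // 1 <nil>
}
```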
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
- docs/content/integrations/frontend/claude-code.md (1 hunks)
- examples/claude-code-ollama/olla.yaml (1 hunks)
- examples/crush-vllm/olla.yaml (1 hunks)
- internal/adapter/translator/anthropic/integration_test.go (1 hunks)
- internal/adapter/translator/anthropic/response.go (1 hunks)
- internal/adapter/translator/anthropic/response_test.go (1 hunks)
- internal/adapter/translator/anthropic/streaming.go (1 hunks)
- internal/adapter/translator/anthropic/streaming_test.go (1 hunks)
- internal/adapter/translator/anthropic/streaming_test_helpers.go (1 hunks)
- internal/adapter/translator/anthropic/token_count.go (1 hunks)
- internal/adapter/translator/anthropic/token_count_test.go (1 hunks)
- internal/adapter/translator/registry.go (1 hunks)
- internal/app/handlers/handler_translation.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- internal/adapter/translator/anthropic/token_count_test.go
- examples/claude-code-ollama/olla.yaml
- internal/adapter/translator/anthropic/streaming.go
🧰 Additional context used
📓 Path-based instructions (2)
{internal,pkg}/**/*_test.go
📄 CodeRabbit inference engine (CLAUDE.md)
Include Go benchmarks (Benchmark* functions) for critical paths, proxy engine comparisons, pooling efficiency, and circuit breaker behaviour
Files:
- internal/adapter/translator/anthropic/streaming_test.go
- internal/adapter/translator/anthropic/response_test.go
- internal/adapter/translator/anthropic/integration_test.go
internal/app/handlers/*.go
📄 CodeRabbit inference engine (CLAUDE.md)
Set response headers on proxy responses:
X-Olla-Endpoint, X-Olla-Model, X-Olla-Backend-Type, X-Olla-Request-ID, X-Olla-Response-Time
Files:
internal/app/handlers/handler_translation.go
🧠 Learnings (3)
📚 Learning: 2025-09-23T08:30:20.366Z
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-23T08:30:20.366Z
Learning: Applies to internal/app/handlers/*.go : Set response headers on proxy responses: `X-Olla-Endpoint`, `X-Olla-Model`, `X-Olla-Backend-Type`, `X-Olla-Request-ID`, `X-Olla-Response-Time`
Applied to files:
internal/app/handlers/handler_translation.go
📚 Learning: 2025-09-23T08:30:20.366Z
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-23T08:30:20.366Z
Learning: Applies to internal/adapter/proxy/**/service.go : Proxy services must include response headers: `X-Olla-Endpoint`, `X-Olla-Model`, `X-Olla-Backend-Type`, `X-Olla-Request-ID`, `X-Olla-Response-Time`
Applied to files:
internal/app/handlers/handler_translation.go
📚 Learning: 2025-09-23T08:30:20.366Z
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-23T08:30:20.366Z
Learning: Applies to {internal,pkg}/**/*_test.go : Include Go benchmarks (Benchmark* functions) for critical paths, proxy engine comparisons, pooling efficiency, and circuit breaker behaviour
Applied to files:
internal/adapter/translator/anthropic/integration_test.go
🧬 Code graph analysis (7)
internal/adapter/translator/anthropic/token_count.go (3)
- internal/adapter/translator/anthropic/translator.go (1): Translator (16-21)
- internal/adapter/translator/types.go (1): TokenCountResponse (58-62)
- internal/adapter/translator/anthropic/types.go (3): AnthropicRequest (7-21), ContentBlock (60-69), AnthropicMessage (53-56)

internal/adapter/translator/anthropic/response.go (3)
- internal/adapter/translator/anthropic/translator.go (1): Translator (16-21)
- internal/adapter/translator/anthropic/types.go (3): AnthropicResponse (106-115), ContentBlock (60-69), AnthropicUsage (118-121)
- internal/util/parsing.go (2): TruncateString (23-34), DefaultTruncateLengthPII (8-8)

internal/adapter/translator/anthropic/response_test.go (5)
- internal/logger/styled.go (1): StyledLogger (12-35)
- internal/config/types.go (1): Config (12-21)
- internal/logger/styled_plain.go (1): NewPlainStyledLogger (18-22)
- internal/adapter/translator/anthropic/translator.go (1): NewTranslator (26-51)
- internal/adapter/translator/anthropic/types.go (2): AnthropicResponse (106-115), ContentBlock (60-69)

internal/app/handlers/handler_translation.go (6)
- internal/app/handlers/application.go (1): Application (71-88)
- internal/adapter/translator/types.go (3): RequestTranslator (11-26), ErrorWriter (47-49), TokenCounter (53-55)
- internal/util/request.go (1): StripPrefix (74-83)
- internal/core/constants/endpoint.go (1): DefaultOllaProxyPathPrefix (5-5)
- internal/core/constants/content.go (10): HeaderContentType (57-57), ContentTypeJSON (7-7), HeaderXOllaModel (104-104), HeaderXOllaRequestID (101-101), HeaderXOllaEndpoint (102-102), HeaderXOllaBackendType (103-103), HeaderXOllaResponseTime (105-105), HeaderXOllaRoutingStrategy (106-106), HeaderXOllaRoutingDecision (107-107), HeaderXOllaRoutingReason (108-108)
- internal/core/domain/endpoint.go (1): Endpoint (19-39)

internal/adapter/translator/anthropic/integration_test.go (5)
- internal/logger/styled.go (1): StyledLogger (12-35)
- internal/config/types.go (2): Config (12-21), AnthropicTranslatorConfig (176-179)
- internal/logger/styled_plain.go (1): NewPlainStyledLogger (18-22)
- internal/adapter/translator/anthropic/types.go (5): AnthropicRequest (7-21), AnthropicMessage (53-56), AnthropicResponse (106-115), AnthropicTool (82-86), ContentBlock (60-69)
- internal/adapter/translator/anthropic/translator.go (1): NewTranslator (26-51)

internal/adapter/translator/registry.go (2)
- internal/adapter/translator/types.go (1): RequestTranslator (11-26)
- internal/logger/styled.go (1): StyledLogger (12-35)

internal/adapter/translator/anthropic/streaming_test_helpers.go (3)
- internal/adapter/translator/anthropic/translator.go (2): Translator (16-21), NewTranslator (26-51)
- internal/config/types.go (2): Config (12-21), AnthropicTranslatorConfig (176-179)
- internal/logger/styled_plain.go (1): NewPlainStyledLogger (18-22)
🪛 LanguageTool
docs/content/integrations/frontend/claude-code.md
[grammar] ~342-~342: A determiner may be missing.
Context: ...t-connections**: Routes to backend with fewest active requests **Timeout Configuratio...
(THE_SUPERLATIVE)
[misspelling] ~539-~539: This word is normally spelled as one.
Context: ...soning abilities - qwen3:32b - Strong multi-task performance **Performance vs Quality T...
(EN_COMPOUNDS_MULTI_TASK)
[uncategorized] ~591-~591: Possible missing preposition found.
Context: ...og ### No Models Available **List models from Olla**:bash curl http://localh...
(AI_HYDRA_LEO_MISSING_OF)
[grammar] ~845-~845: A determiner may be missing.
Context: ... Workflow - Local-first: Configure highest priority for local backends - **Fallbac...
(THE_SUPERLATIVE)
🪛 markdownlint-cli2 (0.18.1)
docs/content/integrations/frontend/claude-code.md
32-32: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
101-101: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
275-275: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
290-290: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
🔇 Additional comments (8)
internal/adapter/translator/anthropic/integration_test.go (1)
1150-1598: Benchmarks for hot paths look good.
Good coverage for typical request/response shapes; alloc tracking and reset timing are correct. LGTM.
internal/adapter/translator/anthropic/response_test.go (2)
901-915: Benchmark added for TransformResponse hot path — aligns with repo guidance.
Good to have allocation tracking here. LGTM.
7-15: Testify version requirement satisfied.
The go.mod file confirms testify is pinned to v1.11.1, which meets the ≥ v1.11.1 requirement. No further action needed.
internal/adapter/translator/registry.go (1)
26-56: Registration guards and overwrite warning look solid.
Nil translator and empty-name cases handled; sorted name reporting is helpful. LGTM.
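For readers outside the diff, the guards praised here can be sketched in a minimal, self-contained form (the names and func() value type are illustrative stand-ins, not the repo's RequestTranslator API):

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Registry rejects nil and empty-name registrations and reports overwrites
// so the caller can log a warning.
type Registry struct {
	mu          sync.RWMutex
	translators map[string]func() // stand-in for a translator interface
}

func NewRegistry() *Registry {
	return &Registry{translators: make(map[string]func())}
}

func (r *Registry) Register(name string, t func()) (overwrote bool, err error) {
	if name == "" {
		return false, fmt.Errorf("translator name must not be empty")
	}
	if t == nil {
		return false, fmt.Errorf("translator %q must not be nil", name)
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	_, overwrote = r.translators[name]
	r.translators[name] = t
	return overwrote, nil
}

func (r *Registry) Names() []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	names := make([]string, 0, len(r.translators))
	for n := range r.translators {
		names = append(names, n)
	}
	sort.Strings(names) // sorted names keep log output deterministic
	return names
}

func main() {
	reg := NewRegistry()
	reg.Register("anthropic", func() {})
	overwrote, _ := reg.Register("anthropic", func() {})
	fmt.Println(overwrote, reg.Names()) // true [anthropic]
}
```

Sorting the names before reporting is what makes the registry's log lines stable across runs, which is the detail the review calls out as helpful.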
docs/content/integrations/frontend/claude-code.md (1)
14-24: The suggested environment variable names do not align with Claude Code's actual configuration.
Based on the web search results, the original review comment contains inaccurate recommendations:
- ANTHROPIC_API_BASE_URL: Claude Code does not support this variable name. For proxy providers, use provider-specific variables such as ANTHROPIC_BEDROCK_BASE_URL or ANTHROPIC_VERTEX_BASE_URL instead.
- ANTHROPIC_AUTH_TOKEN: This variable is valid for proxy authentication and remains appropriate for proxy scenarios.
- ANTHROPIC_API_KEY: This is the correct variable name for configuring the API key.
The standardisation suggested in the review comment should not be applied as written. Before making changes, verify what configuration the examples in the documentation are meant to demonstrate—whether they target the main Anthropic API or a specific proxy provider.
Likely an incorrect or invalid review comment.
examples/crush-vllm/olla.yaml (1)
41-47: Anthropic translator config looks correct.
Only supported fields are present and match the code config struct.
internal/adapter/translator/anthropic/streaming_test_helpers.go (1)
22-37: LGTM – concise test translator config.
Helper uses minimal, realistic config and error‑level logging.
internal/app/handlers/handler_translation.go (1)
39-41: Request body reuse across retries is correctly handled.
The concern in your review is valid in principle: bytes.NewReader() cannot be re-read after EOF without seeking. However, the retry implementation already addresses this.
When the request reaches the proxy service, the RetryHandler explicitly preserves the request body by reading it into memory (io.ReadAll) during the first call to preserveRequestBody(). For each retry attempt, resetRequestBodyForRetry() creates a fresh bytes.NewReader from the preserved byte slice, ensuring each attempt starts with a new readable reader at position 0.
The code at handler_translation.go lines 39–41 correctly creates the initial body as io.NopCloser(bytes.NewReader(openaiBody)), and the retry logic in core/retry.go properly handles body preservation and recreation across all retry attempts.
```go
	randomBytes := make([]byte, 16)
	if _, err := rand.Read(randomBytes); err != nil {
		// crypto/rand shouldn't fail but just in case
		t.logger.Warn("Failed to generate random bytes for message ID", "error", err)
		return fmt.Sprintf("msg_01fallback%d", big.NewInt(0).SetBytes(randomBytes[:8]).Uint64())
	}

	encoded := encodeBase58(randomBytes)
	return fmt.Sprintf("msg_01%s", encoded)
}
```
Avoid ID collisions in fallback path.
If crypto/rand fails, current fallback can repeat across requests. Use time-based uniqueness.
Apply:
```diff
 func (t *Translator) generateMessageID() string {
     randomBytes := make([]byte, 16)
     if _, err := rand.Read(randomBytes); err != nil {
         // crypto/rand shouldn't fail but just in case
         t.logger.Warn("Failed to generate random bytes for message ID", "error", err)
-        return fmt.Sprintf("msg_01fallback%d", big.NewInt(0).SetBytes(randomBytes[:8]).Uint64())
+        return fmt.Sprintf("msg_01fallback_%d", time.Now().UnixNano())
     }
     encoded := encodeBase58(randomBytes)
     return fmt.Sprintf("msg_01%s", encoded)
 }
```

Also add import:
```diff
+    "time"
```

🤖 Prompt for AI Agents
internal/adapter/translator/anthropic/response.go around lines 194-203: the
fallback path when crypto/rand fails can produce repeating IDs (not unique);
change the fallback to include a time-based uniqueness (e.g., append or combine
time.Now().UnixNano() or a timestamp-encoded value with the existing
bytes/encoding) so each fallback ID is unique across requests, and add the time
import to the file's imports.
```go
	// bounded read to prevent OOM attacks
	// read up to maxMessageSize + 1 to detect oversized requests
	limitedReader := io.LimitReader(r.Body, t.maxMessageSize+1)
	body, err := io.ReadAll(limitedReader)
	if err != nil {
		// close the original body even on error
		_ = r.Body.Close()
		return nil, fmt.Errorf("failed to read request body: %w", err)
	}

	// detect oversized requests
	if int64(len(body)) > t.maxMessageSize {
		_ = r.Body.Close()
		return nil, fmt.Errorf("request body exceeds maximum size of %d bytes", t.maxMessageSize)
	}

	// reset body for downstream handlers to re-read
	r.Body = io.NopCloser(bytes.NewReader(body))
```
Close the original request body before replacing it to avoid leaking the underlying reader.
On success, the original r.Body is never closed, then replaced with a new NopCloser. That can leak the underlying connection/FD.
Apply:
```diff
 func (t *Translator) CountTokens(ctx context.Context, r *http.Request) (*translator.TokenCountResponse, error) {
     // bounded read to prevent OOM attacks
     // read up to maxMessageSize + 1 to detect oversized requests
-    limitedReader := io.LimitReader(r.Body, t.maxMessageSize+1)
+    origBody := r.Body
+    limitedReader := io.LimitReader(origBody, t.maxMessageSize+1)
     body, err := io.ReadAll(limitedReader)
     if err != nil {
-        // close the original body even on error
-        _ = r.Body.Close()
+        _ = origBody.Close()
         return nil, fmt.Errorf("failed to read request body: %w", err)
     }
     // detect oversized requests
     if int64(len(body)) > t.maxMessageSize {
-        _ = r.Body.Close()
+        _ = origBody.Close()
         return nil, fmt.Errorf("request body exceeds maximum size of %d bytes", t.maxMessageSize)
     }
     // reset body for downstream handlers to re-read
-    r.Body = io.NopCloser(bytes.NewReader(body))
+    _ = origBody.Close() // release original before replacing
+    r.Body = io.NopCloser(bytes.NewReader(body))
```
internal/adapter/translator/anthropic/token_count.go around lines 16 to 34:
after successfully reading the bounded body you must close the original r.Body
before replacing it to avoid leaking the underlying reader; change the flow so
that once io.ReadAll succeeds and the size check passes you call _ =
r.Body.Close() (or defer a close right after a successful read and before early
returns) and then set r.Body = io.NopCloser(bytes.NewReader(body)); ensure the
original body is closed on all success and error paths (already closed on
errors) so no file descriptor/connection is leaked.
```go
	w.Header().Set(constants.HeaderContentType, constants.ContentTypeJSON)
	a.copyOllaHeaders(recorder, w)
```
🛠️ Refactor suggestion | 🟠 Major
Always set X‑Olla‑Model on success.
Mirror error/streaming paths and set the fallback header here too. As per coding guidelines.
Apply:
```diff
 w.Header().Set(constants.HeaderContentType, constants.ContentTypeJSON)
 a.copyOllaHeaders(recorder, w)
+a.setModelHeaderIfMissing(w, pr.model)
```

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In internal/app/handlers/handler_translation.go around lines 260-262, the
success path sets Content-Type and copies Olla headers but fails to ensure the
X-Olla-Model header is present; mirror the error/streaming paths by explicitly
setting X-Olla-Model on the response: check the recorder or copied headers for
an X-Olla-Model value and if empty set w.Header().Set(constants.HeaderOllaModel,
constants.OllaModelFallback) (or the existing fallback constant used elsewhere),
so the response always includes X-Olla-Model on success.
update token_count and translator strategies
feat: anthropic / message logger (development only)
Actionable comments posted: 3
♻️ Duplicate comments (4)
internal/adapter/translator/anthropic/response.go (3)
151-162: Still logging raw args preview; normalise/rename to avoid PII leakage.
You reintroduced a preview under key raw_arguments. Either drop it entirely or rename to raw_arguments_preview and normalise whitespace. Example change below keeps only a short, normalised preview plus length. This avoids accidental full-arg logging and newline injection.
```diff
@@
-        t.logger.Warn("Failed to parse tool arguments, using empty input",
-            "tool", name,
-            "tool_id", id,
-            "error", err,
-            "raw_arguments", util.TruncateString(argsStr, util.DefaultTruncateLengthPII),
-            "raw_arguments_len", len(argsStr))
+        // Normalise newlines and log only a short redacted preview + length
+        preview := util.TruncateString(strings.ReplaceAll(strings.ReplaceAll(argsStr, "\n", " "), "\r", " "), util.DefaultTruncateLengthPII)
+        t.logger.Warn("Failed to parse tool arguments; using empty input",
+            "tool", name,
+            "tool_id", id,
+            "error", err,
+            "raw_arguments_preview", preview,
+            "raw_arguments_len", len(argsStr))
```

Also add the import:
```diff
 import (
     "context"
     "crypto/rand"
     "encoding/json"
     "fmt"
     "math/big"
     "net/http"
+    "strings"
```
215-215: Typo in comment.
“ambigious” → “ambiguous”.
```diff
-// encode bytes to base58, shorter and less ambigious than hex
+// encode bytes to base58, shorter and less ambiguous than hex
```
204-213: Fallback message IDs can collide; make the fallback time-based.
If crypto/rand fails, the current path can emit the same ID across requests. Use a timestamp-based fallback.
```diff
 import (
     "context"
     "crypto/rand"
     "encoding/json"
     "fmt"
     "math/big"
     "net/http"
+    "time"
@@
     if _, err := rand.Read(randomBytes); err != nil {
         // crypto/rand shouldn't fail but just in case
         t.logger.Warn("Failed to generate random bytes for message ID", "error", err)
-        return fmt.Sprintf("msg_01fallback%d", big.NewInt(0).SetBytes(randomBytes[:8]).Uint64())
+        return fmt.Sprintf("msg_01fallback_%d", time.Now().UnixNano())
     }
```

internal/adapter/translator/anthropic/token_count.go (1)
16-34: Close original body before resetting to avoid leaks.
On success, the original r.Body isn’t closed before replacement.
```diff
-    limitedReader := io.LimitReader(r.Body, t.maxMessageSize+1)
-    body, err := io.ReadAll(limitedReader)
+    origBody := r.Body
+    limitedReader := io.LimitReader(origBody, t.maxMessageSize+1)
+    body, err := io.ReadAll(limitedReader)
     if err != nil {
-        // close the original body even on error
-        _ = r.Body.Close()
+        _ = origBody.Close()
         return nil, fmt.Errorf("failed to read request body: %w", err)
     }
     // detect oversized requests
     if int64(len(body)) > t.maxMessageSize {
-        _ = r.Body.Close()
+        _ = origBody.Close()
         return nil, fmt.Errorf("request body exceeds maximum size of %d bytes", t.maxMessageSize)
     }
     // reset body for downstream handlers to re-read
-    r.Body = io.NopCloser(bytes.NewReader(body))
+    _ = origBody.Close()
+    r.Body = io.NopCloser(bytes.NewReader(body))
```
🧹 Nitpick comments (17)
docs/content/notes/anthropic-inspector.md (4)
52-64: Align config snippet with shipped defaults (path and safety).
Docs show output_dir "logs/inspector", while config/config.yaml uses "logs/inspector/anthropic" and inspector is off by default. Align to avoid confusion and accidental enablement.
Apply:
```diff
-inspector:
-  enabled: true
-  output_dir: "logs/inspector"
-  session_header: "X-Session-ID"
+inspector:
+  enabled: false
+  output_dir: "logs/inspector/anthropic"
+  session_header: "X-Session-ID"
```
81-85: Use the same output_dir in examples.
Examples reference logs/inspector/... but config uses logs/inspector/anthropic. Unify to prevent users tailing the wrong path.
Apply:
```diff
-cat logs/inspector/2025-10-21/debug-session.jsonl | jq 'select(.type == "request") | .body'
+cat logs/inspector/anthropic/2025-10-21/debug-session.jsonl | jq 'select(.type == "request") | .body'
-cat logs/inspector/2025-10-21/debug-session.jsonl | jq 'select(.type == "response") | .body.content[0].text'
+cat logs/inspector/anthropic/2025-10-21/debug-session.jsonl | jq 'select(.type == "response") | .body.content[0].text'
-jq 'select(.type == "response") | .body.content[0].text' logs/inspector/2025-10-21/compare-qwen.jsonl > qwen.txt
-jq 'select(.type == "response") | .body.content[0].text' logs/inspector/2025-10-21/compare-glm.jsonl > glm.txt
+jq 'select(.type == "response") | .body.content[0].text' logs/inspector/anthropic/2025-10-21/compare-qwen.jsonl > qwen.txt
+jq 'select(.type == "response") | .body.content[0].text' logs/inspector/anthropic/2025-10-21/compare-glm.jsonl > glm.txt
-chmod 700 logs/inspector/
-chown olla:olla logs/inspector/
+chmod 700 logs/inspector/anthropic/
+chown olla:olla logs/inspector/anthropic/
```

Also applies to: 90-98, 102-106, 112-115, 119-121
27-49: Add log retention/rotation guidance.
Inspector dumps sensitive JSONL; recommend retention and rotation to reduce risk.
Consider adding:
- Retention: delete files older than N days.
- Rotation: daily directories already used; suggest logrotate/systemd-tmpfiles snippet.
- Encryption at rest if logs leave dev machines.
I can add a short section with sample logrotate and tmpfiles rules on request.
20-26: Calibrate “Minimal performance impact”.
Disk I/O + JSON encoding + serialised writes add overhead. Suggest “Low additional overhead for development; not optimised for prod.” to set expectations.
internal/adapter/inspector/simple.go (4)
122-127: Emit UTC timestamps to match docs and ease analysis.
Current code uses local time; docs and many tools expect Zulu.
Apply:
```diff
-    Timestamp: time.Now().Format(time.RFC3339),
+    Timestamp: time.Now().UTC().Format(time.RFC3339),
```

and

```diff
-    Timestamp: time.Now().Format(time.RFC3339),
+    Timestamp: time.Now().UTC().Format(time.RFC3339),
```

Also applies to: 140-144
94-101: Strengthen ancestor check; avoid string HasPrefix.
Using Clean+HasPrefix is brittle for path ancestry checks. Prefer filepath.Rel.
Apply:
```diff
-    // try and ensure the resolved path is within the output directory
-    // uses Clean to normalise paths and prevent ../ bypasses
-    if !strings.HasPrefix(filepath.Clean(absTestPath), filepath.Clean(absOutputDir)) {
-        return "", fmt.Errorf("session ID would escape output directory")
-    }
+    // ensure resolved path stays within outputDir
+    rel, err := filepath.Rel(absOutputDir, absTestPath)
+    if err != nil || filepath.IsAbs(rel) || strings.HasPrefix(rel, "..") {
+        return "", fmt.Errorf("session ID would escape output directory")
+    }
```
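Why filepath.Rel is the safer primitive can be shown with a standalone check (within is an illustrative helper; paths below assume a Unix separator):

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// within reports whether target resolves inside root. A naive string
// HasPrefix check wrongly treats "/tmp/out-evil" as inside "/tmp/out";
// filepath.Rel cleans both paths and exposes escapes as "..".
func within(root, target string) bool {
	rel, err := filepath.Rel(root, target)
	if err != nil || filepath.IsAbs(rel) {
		return false
	}
	return rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

func main() {
	fmt.Println(within("/tmp/out", "/tmp/out/2025/session.jsonl")) // true
	fmt.Println(within("/tmp/out", "/tmp/out/../secrets"))         // false
	fmt.Println(within("/tmp/out", "/tmp/out-evil/x"))             // false
}
```

The third case is the one the string-prefix version gets wrong, which is exactly the "../ bypass" class of bug the review is guarding against.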
168-173: Typo in comment.
“owtner” → “owner”. No behaviour change.
Apply:
```diff
-    // Use 0700 permissions - owtner only access (not world-readable)
+    // Use 0700 permissions - owner only access (not world-readable)
```
46-57: Optional: guard against nil logger.
If log is nil, methods will panic. Default to a no‑op/plain logger.
I can wire this with slog.Default() if desired; let me push a tiny patch on request.
internal/adapter/inspector/simple_test.go (2)
1-509: Add micro-benchmarks for inspector logging (per guidelines).
Tests are good; please include Benchmark* for logging hot paths.
Append:
```diff
+func BenchmarkSimpleInspector_LogRequest(b *testing.B) {
+    tmpDir := b.TempDir()
+    log := logger.NewPlainStyledLogger(slog.Default())
+    inspector := NewSimple(true, tmpDir, "X-Session-ID", log)
+    body := []byte(`{"bench":"req"}`)
+    b.ResetTimer()
+    for i := 0; i < b.N; i++ {
+        _ = inspector.LogRequest("bench-session", "bench-model", body)
+    }
+}
+
+func BenchmarkSimpleInspector_LogResponse(b *testing.B) {
+    tmpDir := b.TempDir()
+    log := logger.NewPlainStyledLogger(slog.Default())
+    inspector := NewSimple(true, tmpDir, "X-Session-ID", log)
+    body := []byte(`{"bench":"resp"}`)
+    b.ResetTimer()
+    for i := 0; i < b.N; i++ {
+        _ = inspector.LogResponse("bench-session", body)
+    }
+}
```

As per coding guidelines.
471-509: Minor: avoid mutating private fields in tests.
Directly toggling warningLogged couples tests to internals. Better to assert one‑time behaviour without resetting, or expose a test‑only getter.
Happy to adjust tests after the thread‑safe warning change.
internal/adapter/translator/anthropic/response.go (2)
88-95: Handle array-style message content from OpenAI.
OpenAI may return content as an array of parts. Concatenate text parts so you don’t drop content.
```diff
@@ func (t *Translator) convertResponseContent(message map[string]interface{}, finishReason string) ([]ContentBlock, string) {
     // Handle text content
     if textContent, ok := message["content"].(string); ok && textContent != "" {
         content = append(content, ContentBlock{
             Type: contentTypeText,
             Text: textContent,
         })
     }
+    // Handle array content parts (pick up text blocks)
+    if parts, ok := message["content"].([]interface{}); ok && len(parts) > 0 {
+        var sb strings.Builder
+        for _, p := range parts {
+            if pm, ok := p.(map[string]interface{}); ok {
+                if pt, _ := pm["type"].(string); pt == contentTypeText {
+                    if tx, _ := pm["text"].(string); tx != "" {
+                        sb.WriteString(tx)
+                    }
+                }
+            }
+        }
+        if sb.Len() > 0 {
+            content = append(content, ContentBlock{Type: contentTypeText, Text: sb.String()})
+        }
+    }
```
125-137: Cover additional finish reasons.
Consider mapping content_filter (and similar) explicitly to end_turn to avoid surprises.
```diff
 switch finishReason {
 case "stop":
     return "end_turn"
 case "tool_calls":
     return contentTypeToolUse
 case "length":
     return "max_tokens"
+case "content_filter":
+    return "end_turn"
 default:
     return "end_turn"
 }
```

internal/adapter/translator/anthropic/token_count.go (1)
35-38: Prefer strict JSON decoding for parity with TransformRequest.
Use json.Decoder with DisallowUnknownFields to keep behaviour consistent and surface unknown-field errors.
```diff
-    var req AnthropicRequest
-    if err := json.Unmarshal(body, &req); err != nil {
+    var req AnthropicRequest
+    dec := json.NewDecoder(bytes.NewReader(body))
+    dec.DisallowUnknownFields()
+    if err := dec.Decode(&req); err != nil {
         return nil, fmt.Errorf("failed to parse request: %w", err)
     }
```
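The behavioural difference is easy to demonstrate in isolation (anthropicRequest below is a trimmed stand-in for the real AnthropicRequest struct):

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

type anthropicRequest struct { // trimmed stand-in, not the repo's type
	Model     string `json:"model"`
	MaxTokens int    `json:"max_tokens"`
}

// decodeStrict surfaces unknown fields as errors, where plain json.Unmarshal
// would silently drop them.
func decodeStrict(body []byte) (*anthropicRequest, error) {
	var req anthropicRequest
	dec := json.NewDecoder(bytes.NewReader(body))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&req); err != nil {
		return nil, fmt.Errorf("failed to parse request: %w", err)
	}
	return &req, nil
}

func main() {
	// the misspelled "tempreature" is rejected instead of being swallowed
	_, err := decodeStrict([]byte(`{"model":"claude","max_tokens":100,"tempreature":0.7}`))
	fmt.Println(err != nil) // true
}
```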
16-27: Improve oversize detection UX.
LimitReader will truncate and yield a JSON parse error (“unexpected EOF”) rather than a clear “too large” error. If Content-Length is present, reject early; otherwise use a LimitedReader with limit+1 and check post-read.
```diff
-    limitedBody := io.LimitReader(r.Body, t.maxMessageSize)
+    if r.ContentLength > 0 && r.ContentLength > t.maxMessageSize {
+        return nil, fmt.Errorf("request body exceeds maximum size of %d bytes", t.maxMessageSize)
+    }
+    limitedBody := io.LimitReader(r.Body, t.maxMessageSize+1)
```
263-264: Consider explicit handling of missing index.
The type assertion silently defaults to 0 if the index field is missing or not a float64. Whilst this might work in practice, it could mask malformed tool call deltas.

```diff
-    index, _ := toolCall["index"].(float64)
-    toolIndex := int(index)
+    index, ok := toolCall["index"].(float64)
+    if !ok {
+        return nil, false // require valid index
+    }
+    toolIndex := int(index)
```
335-335: Inconsistent error wrapping for flush operations.
Lines 335 and 353 return rc.Flush() directly, whilst other locations (e.g., lines 60, 176, 199, 222, 238, 298, 401, 450, 460) wrap flush errors with context using fmt.Errorf("flush failed: %w", err). Consistent error messages improve debuggability.
Apply this diff to lines 335 and 353:
```diff
-    return rc.Flush()
+    if err := rc.Flush(); err != nil {
+        return fmt.Errorf("flush failed: %w", err)
+    }
+    return nil
```

Also applies to: 353-353
407-431: Add bounds checking for defensive programming.
Line 413 accesses state.contentBlocks[blockIndex] without verifying that blockIndex is within bounds. Whilst the internal mapping should ensure this is safe, adding a bounds check would prevent potential panics if the state becomes inconsistent.

```diff
 for toolIndex, builder := range state.toolCallBuffers {
     argsJSON := builder.String()
     if argsJSON != "" {
         var input map[string]interface{}
         if err := json.Unmarshal([]byte(argsJSON), &input); err == nil {
             // use mapping to find the correct block, avoids linear search
             if blockIndex, found := state.toolIndexToBlock[toolIndex]; found {
+                // bounds check before accessing slice
+                if blockIndex < 0 || blockIndex >= len(state.contentBlocks) {
+                    t.logger.Error("Block index out of bounds",
+                        "tool_index", toolIndex,
+                        "block_index", blockIndex,
+                        "blocks_len", len(state.contentBlocks))
+                    continue
+                }
                 // validate block type before updating to catch any state inconsistencies
                 if state.contentBlocks[blockIndex].Type != contentTypeToolUse {
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
config/config.yaml(2 hunks)docs/content/notes/anthropic-inspector.md(1 hunks)docs/content/notes/overview.md(1 hunks)docs/mkdocs.yml(5 hunks)internal/adapter/inspector/simple.go(1 hunks)internal/adapter/inspector/simple_test.go(1 hunks)internal/adapter/translator/anthropic/constants.go(1 hunks)internal/adapter/translator/anthropic/request.go(1 hunks)internal/adapter/translator/anthropic/response.go(1 hunks)internal/adapter/translator/anthropic/streaming.go(1 hunks)internal/adapter/translator/anthropic/token_count.go(1 hunks)internal/adapter/translator/anthropic/translator.go(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- docs/content/notes/overview.md
🧰 Additional context used
📓 Path-based instructions (1)
{internal,pkg}/**/*_test.go
📄 CodeRabbit inference engine (CLAUDE.md)
Include Go benchmarks (Benchmark* functions) for critical paths, proxy engine comparisons, pooling efficiency, and circuit breaker behaviour
Files:
internal/adapter/inspector/simple_test.go
🧬 Code graph analysis (7)
internal/adapter/translator/anthropic/token_count.go (3)
- internal/adapter/translator/anthropic/translator.go (1): Translator (17-23)
- internal/adapter/translator/types.go (1): TokenCountResponse (58-62)
- internal/adapter/translator/anthropic/types.go (3): AnthropicRequest (7-21), ContentBlock (60-69), AnthropicMessage (53-56)

internal/adapter/inspector/simple_test.go (2)
- internal/logger/styled_plain.go (1): NewPlainStyledLogger (18-22)
- internal/adapter/inspector/simple.go (2): NewSimple (47-58), Entry (17-22)

internal/adapter/translator/anthropic/request.go (3)
- internal/adapter/translator/anthropic/translator.go (1): Translator (17-23)
- internal/adapter/translator/types.go (1): TransformedRequest (30-37)
- internal/adapter/translator/anthropic/types.go (3): AnthropicRequest (7-21), AnthropicMessage (53-56), ContentBlock (60-69)

internal/adapter/translator/anthropic/translator.go (6)
- internal/logger/styled.go (1): StyledLogger (12-35)
- pkg/pool/lite_pool.go (2): Pool (39-42), NewLitePool (44-68)
- internal/adapter/inspector/simple.go (2): Simple (26-33), NewSimple (47-58)
- internal/config/types.go (1): AnthropicTranslatorConfig (179-183)
- internal/adapter/translator/anthropic/types.go (1): ContentBlock (60-69)
- internal/core/constants/content.go (2): HeaderContentType (57-57), ContentTypeJSON (7-7)

internal/adapter/inspector/simple.go (1)
- internal/logger/styled.go (1): StyledLogger (12-35)

internal/adapter/translator/anthropic/response.go (3)
- internal/adapter/translator/anthropic/translator.go (1): Translator (17-23)
- internal/adapter/translator/anthropic/types.go (3): AnthropicResponse (106-115), ContentBlock (60-69), AnthropicUsage (118-121)
- internal/util/parsing.go (2): TruncateString (23-34), DefaultTruncateLengthPII (8-8)

internal/adapter/translator/anthropic/streaming.go (4)
- internal/adapter/translator/anthropic/types.go (3): ContentBlock (60-69), AnthropicResponse (106-115), AnthropicUsage (118-121)
- internal/adapter/translator/anthropic/translator.go (1): Translator (17-23)
- internal/core/constants/content.go (1): HeaderContentType (57-57)
- internal/util/parsing.go (2): TruncateString (23-34), DefaultTruncateLengthPII (8-8)
🪛 LanguageTool
docs/content/notes/anthropic-inspector.md
[style] ~12-~12: Using many exclamation marks might seem excessive (in this case: 3 exclamation marks for a text that’s 1754 characters long)
Context: ...s and troubleshoot integration issues. !!! danger "DEVELOPMENT USE ONLY - DO NOT E...
(EN_EXCESSIVE_EXCLAMATION)
🪛 markdownlint-cli2 (0.18.1)
docs/content/notes/anthropic-inspector.md
29-29: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
54-54: Code block style
Expected: indented; Actual: fenced
(MD046, code-block-style)
70-70: Code block style
Expected: indented; Actual: fenced
(MD046, code-block-style)
81-81: Code block style
Expected: indented; Actual: fenced
(MD046, code-block-style)
90-90: Code block style
Expected: indented; Actual: fenced
(MD046, code-block-style)
96-96: Code block style
Expected: indented; Actual: fenced
(MD046, code-block-style)
102-102: Code block style
Expected: indented; Actual: fenced
(MD046, code-block-style)
112-112: Code block style
Expected: indented; Actual: fenced
(MD046, code-block-style)
119-119: Code block style
Expected: indented; Actual: fenced
(MD046, code-block-style)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Cross-Platform Build Validation (windows-latest)
🔇 Additional comments (10)
docs/mkdocs.yml (1)
111-195: Nav targets verified: all files present. Verification confirmed that all 63 nav entries in `docs/mkdocs.yml` reference existing files in `docs/content/`. No broken links or missing documentation files detected.

internal/adapter/translator/anthropic/translator.go (1)

84-90: `GetSessionHeader()` is properly exposed and exists. The inspector's `GetSessionHeader()` method is publicly defined in `internal/adapter/inspector/simple.go` (line 202) and returns the configured session header. This is not tight coupling to internals; it is legitimate use of a public API. The code compiles and functions correctly. The verification concern does not apply.

Likely an incorrect or invalid review comment.
internal/adapter/translator/anthropic/streaming.go (8)
16-29: LGTM! The `StreamingState` structure is well-designed and addresses past review feedback. Using `map[int]` for tool call buffers and maintaining the `toolIndexToBlock` mapping ensures correct tool argument association.
32-71: LGTM! The entry point correctly sets up SSE headers, initializes state, and ensures `message_start` is sent even for empty streams. Good defensive programming.
74-100: Good scanner configuration and resilient error handling.The increased buffer size (1 MiB max) correctly addresses the "token too long" issue from past reviews. Continuing on line processing errors (line 91) is a reasonable trade-off for streaming resilience, though it means some stream corruption may go unnoticed by the caller.
103-167: LGTM! Line processing correctly addresses the privacy concern from past reviews by logging truncated data and length. Token extraction and routing logic are sound.
184-246: LGTM! Content delta handling correctly manages block lifecycle, closing previous blocks of different types before starting new ones. State updates are consistent.
357-389: LGTM! Tool call delta handling correctly initializes buffers before use (lines 369-371), ensuring subsequent operations are safe. Block initialization and argument buffering logic are sound.
392-471: Finalisation logic correctly addresses past review feedback. The use of the `toolIndexToBlock` mapping (lines 413-422) efficiently locates the correct block without linear search. Type validation and comprehensive error logging help catch state inconsistencies.
473-541: LGTM! Inspector logging correctly reconstructs the complete response from streaming state. SSE event formatting follows the standard and matches Anthropic's expected format.
anthropic:
  enabled: true
  max_message_size: 10485760 # 10MB - Anthropic API limit
  # !! WARNING: Do not enable inspector in production without reviewing data privacy !!
  # Anthropic messages may contain sensitive user data.
  # This release is quite taxing and unoptimised, so use with caution & only for debug.
  inspector:
    enabled: false # Enable request/response logging
    output_dir: "logs/inspector/anthropic" # Where to write logs
    session_header: "X-Session-ID" # Header to use for session grouping
Default experimental translator to disabled; unify inspector path.
Enabling Anthropic translation by default can surprise existing deployments. Safer to keep it opt‑in while experimental.
Apply:
anthropic:
- enabled: true
+ enabled: false
max_message_size: 10485760 # 10MB - Anthropic API limit
@@
inspector:
- enabled: false # Enable request/response logging
- output_dir: "logs/inspector/anthropic" # Where to write logs
+ enabled: false # Enable request/response logging
+ output_dir: "logs/inspector/anthropic" # Where to write logs
  session_header: "X-Session-ID" # Header to use for session grouping

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
anthropic:
  enabled: false
  max_message_size: 10485760 # 10MB - Anthropic API limit
  # !! WARNING: Do not enable inspector in production without reviewing data privacy !!
  # Anthropic messages may contain sensitive user data.
  # This release is quite taxing and unoptimised, so use with caution & only for debug.
  inspector:
    enabled: false # Enable request/response logging
    output_dir: "logs/inspector/anthropic" # Where to write logs
    session_header: "X-Session-ID" # Header to use for session grouping
🤖 Prompt for AI Agents
config/config.yaml lines 137-146: the Anthropic translator is enabled by
default and the inspector path is inconsistent; change anthropic.enabled to
false to make the experimental translator opt-in, and set inspector.output_dir
to a unified inspector path (e.g., logs/inspector/translator/anthropic) so all
translator inspectors share a consistent base directory; update the YAML values
accordingly and keep inspector.enabled false by default.
OLLA-287: skip empty assistant messages
Actionable comments posted: 2
🧹 Nitpick comments (8)
docs/content/integrations/frontend/claude-code.md (6)
32-40: Migrate emphasis to proper markdown heading. Line 32 uses emphasis (`**What you get via Olla**`) as a heading. For better markdown structure and consistency, convert to an actual heading.

-**What you get via Olla**
+## What you get via Olla
101-101: Specify language for fenced code block. The architecture diagram code block (line 101) lacks a language identifier, which impairs syntax highlighting and markdown compliance.

-```
+```text
275-275: Migrate emphasis to proper markdown heading. Line 275 uses emphasis (`**Option A: Environment Variables (Recommended)**`) as a heading. Convert to an actual markdown heading for consistency.

-**Option A: Environment Variables (Recommended)**
+### Option A: Environment Variables (Recommended)
290-290: Migrate emphasis to proper markdown heading. Line 290 uses emphasis (`**Option B: Configuration File**`) as a heading. Convert to an actual markdown heading for consistency.

-**Option B: Configuration File**
+### Option B: Configuration File
342-342: Add missing determiner for grammatical correctness. Line 342 reads "Routes to backend with fewest active requests"; it should include "the" before "fewest" for proper grammar.

-- **least-connections**: Routes to backend with fewest active requests
+- **least-connections**: Routes to backend with the fewest active requests
845-845: Add missing determiner for grammatical correctness. Line 845 reads "Configure highest priority for local backends"; it should include "the" before "highest" for proper grammar.

-- **Local-first**: Configure highest priority for local backends
+- **Local-first**: Configure the highest priority for local backends

internal/adapter/translator/anthropic/request.go (2)
33-37: Consider preserving original request bytes to avoid re-marshaling overhead. Re-marshaling the decoded request creates additional processing overhead. If the original request body could be read and preserved (e.g., using `io.TeeReader` or reading into a buffer first), this re-marshaling step could be eliminated.

However, this is a minor optimisation that can be deferred, as the current implementation prioritises strict validation via the decoder's `DisallowUnknownFields` setting.
227-253: Optimise string concatenation in text content accumulation. Line 245 uses `textContent += text` in a loop, which is inefficient for multiple text blocks as it creates intermediate string allocations.

Apply this diff to use `strings.Builder`:

 func (t *Translator) convertAssistantMessage(blocks []interface{}) map[string]interface{} {
 	msg := map[string]interface{}{
 		"role": "assistant",
 	}
-	var textContent string
+	var textBuilder strings.Builder
 	var toolCalls []map[string]interface{}
 	for _, block := range blocks {
 		blockMap, ok := block.(map[string]interface{})
 		if !ok {
 			continue
 		}
 		blockType, _ := blockMap["type"].(string)
 		switch blockType {
 		case contentTypeText:
 			if text, ok := blockMap["text"].(string); ok {
-				textContent += text
+				textBuilder.WriteString(text)
 			}
 		case contentTypeToolUse:
 			toolCall := t.convertToolUse(blockMap)
 			if toolCall != nil {
 				toolCalls = append(toolCalls, toolCall)
 			}
 		}
 	}
+
+	textContent := textBuilder.String()
 	// openai wants null content when only tool calls present
 	if textContent != "" {
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- config/config.yaml (2 hunks)
- docs/content/integrations/frontend/claude-code.md (1 hunks)
- internal/adapter/inspector/simple.go (1 hunks)
- internal/adapter/translator/anthropic/request.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- config/config.yaml
🧰 Additional context used
🧬 Code graph analysis (2)
internal/adapter/translator/anthropic/request.go (3)
  internal/adapter/translator/anthropic/translator.go (1)
    Translator (17-23)
  internal/adapter/translator/types.go (1)
    TransformedRequest (30-37)
  internal/adapter/translator/anthropic/types.go (3)
    AnthropicRequest (7-21)
    AnthropicMessage (53-56)
    ContentBlock (60-69)

internal/adapter/inspector/simple.go (1)
  internal/logger/styled.go (1)
    StyledLogger (12-35)
🪛 LanguageTool
docs/content/integrations/frontend/claude-code.md
[uncategorized] ~34-~34: The preposition ‘for’ seems more likely in this position.
Context: ...** * Consistent Anthropic Messages API at /olla/anthropic/v1/* * Load-balancing...
(AI_HYDRA_LEO_REPLACE_AT_FOR)
[grammar] ~342-~342: A determiner may be missing.
Context: ...t-connections**: Routes to backend with fewest active requests **Timeout Configuratio...
(THE_SUPERLATIVE)
[misspelling] ~539-~539: This word is normally spelled as one.
Context: ...soning abilities - qwen3:32b - Strong multi-task performance **Performance vs Quality T...
(EN_COMPOUNDS_MULTI_TASK)
[uncategorized] ~591-~591: Possible missing preposition found.
Context: ...og ### No Models Available **List models from Olla**:bash curl http://localh...
(AI_HYDRA_LEO_MISSING_OF)
[grammar] ~845-~845: A determiner may be missing.
Context: ... Workflow - Local-first: Configure highest priority for local backends - **Fallbac...
(THE_SUPERLATIVE)
🪛 markdownlint-cli2 (0.18.1)
docs/content/integrations/frontend/claude-code.md
32-32: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
101-101: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
275-275: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
290-290: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test (1.24)
🔇 Additional comments (6)
internal/adapter/translator/anthropic/request.go (6)
107-130: Well-structured message conversion with proper system prompt injection. The implementation correctly handles system prompt injection as the first OpenAI message and properly pre-allocates capacity for the common case. Error propagation is clean.

133-171: Clean message conversion with proper role-specific handling. The implementation correctly handles both string and block-based content, with a sensible fallback for single-map content. Empty messages are filtered out appropriately.

174-224: Robust user message conversion with proper separation of text and tool results. The implementation correctly splits text content and tool results into separate OpenAI messages as required by the OpenAI API. Tool result content is appropriately handled for both string and structured formats.

266-270: Excellent fix for empty assistant messages. The OLLA-287 fix correctly guards against emitting empty assistant messages when there's neither text content nor tool calls. This directly addresses the concern raised in the previous review.

274-298: Solid tool_use to tool_call conversion with appropriate error handling. The implementation correctly maps Anthropic's tool_use format to OpenAI's tool_call structure, with sensible fallback to an empty object when input marshaling fails.

301-319: The callback function passed to `forEachSystemContentBlock` always returns `nil` and never propagates errors. The `forEachSystemBlock` implementation only returns non-nil errors when the callback itself returns an error, which cannot occur here. Discarding the error is appropriate in this context.

Likely an incorrect or invalid review comment.
This PR brings in Anthropic support so you can use tools like Claude Code to route through to OpenAI models locally.

It uses a new Translator concept in Olla (from Scout) that takes an incoming Anthropic request, converts it to OpenAI format, forwards it through the Olla pipeline (just like the `openai` endpoint), then, once the backend returns an OpenAI message, converts it back to Anthropic format.

This has been quite a tricky endeavour and is very much an experimental release.
Summary by CodeRabbit
New Features
Documentation
Other