- …ntblock) add thinking to payload to avoid Claude Code breakage
- add validator, we may accidentally send invalid top_/temps otherwise
- enable path rewriting
- oops avoid olla in path
- add 10mb limit for payload
- add tests
- update tests
- fix broken tests
- lint issue
- fix broken tests
- update handler_translation
- update handler_translation x2
- fix tests

- maxtokens validation
- streaming malformed chunking logs and continues now
- weird behaviour for system ContentBlock message
- tool argument errors
- messageid format
- compatibility with Claude Code v2 and v1.6 releases

Also adds better error logging

- refactor tests and add tests for configuration of translators
- fix broken tests across translator
- lint issues
- update handler and app for config

- initial examples
- rewrite some bits
- refactor examples
- add test scripts for examples
- add one for lmstudio
- update handlers for experimental features
Walkthrough

Adds an Anthropic Messages API translator (request/response/streaming, token counting), a translator registry with dynamic route registration, inspector logging, many tests and examples, and replaces direct use of http.Flusher with http.ResponseController in streaming code paths.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client as Anthropic client
    participant Olla as Olla proxy
    participant Translator as Anthropic Translator
    participant Backend as OpenAI-compatible backend
    Client->>Olla: POST /olla/anthropic/v1/messages (Anthropic JSON)
    Olla->>Translator: TransformRequest(ctx, http.Request)
    Translator->>Translator: Convert messages/tools/system -> OpenAI format
    Translator-->>Olla: Transformed OpenAI request
    Olla->>Backend: Forward request
    Backend-->>Olla: OpenAI response (or SSE stream)
    Olla->>Translator: TransformResponse / TransformStreamingResponse
    Translator->>Translator: Map choices, tool_calls, finish reasons
    Translator-->>Olla: Anthropic-format response (or SSE events)
    Olla-->>Client: HTTP 200 / SSE stream (Anthropic format)
```
```mermaid
sequenceDiagram
    participant Backend as OpenAI backend (SSE)
    participant Translator as Anthropic Translator
    participant Client as Anthropic client
    Backend-->>Translator: SSE chunks (deltas, tool_calls, usage)
    Translator->>Translator: Buffer state, assemble blocks, parse partial JSON
    alt text delta
        Translator-->>Client: content_block_delta events
    else tool call
        Translator-->>Client: input_json_delta / content_block_start/stop
    end
    Translator->>Translator: finalize -> message_delta + message_stop
    Translator-->>Client: final events with usage
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Actionable comments posted: 27
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
examples/crush-vllm/README.md (2)
721-721: Minor grammar: missing preposition. Consider revising "List models from Olla" to "List models available from Olla" for improved clarity.
1049-1049: Minor grammar: missing article. "Configure highest priority for local backends" should include the article "the": "Configure the highest priority for local backends".
🧹 Nitpick comments (74)
examples/crush-vllm/test.sh (1)
40-223: Consider adding timeouts to curl commands. The curl commands throughout the test script don't specify timeouts. While acceptable for local development, adding `--max-time 10` would prevent the script from hanging indefinitely if services become unresponsive, improving the developer experience in CI or automated testing scenarios.

internal/adapter/translator/anthropic/translator_test.go (1)
1-158: Consider adding benchmarks for critical translation paths. The translator performs performance-critical operations (JSON marshaling, buffer pooling, request/response transformation), and the PR documentation mentions "<5ms translation latency". Adding benchmark tests would help verify this claim and catch performance regressions.
As per coding guidelines, would you like me to generate benchmark tests for the translation operations? I can create benchmarks for:
- WriteError performance
- Request translation (if not covered in request_test.go)
- Response translation (if not covered in response_test.go)
- Streaming translation (if not covered in streaming_test.go)
internal/config/types.go (2)
20-21: Wire-up looks good; consider `omitempty` to keep serialised config minimal. Optional: tag `Translators` with `yaml:"translators,omitempty"` so empty blocks aren't written back out.
157-167: Make units explicit and centralise defaults/limits. Rename `MaxMessageSize` to `MaxMessageSizeBytes` (or document "bytes") and hoist shared constants so translator and config can't drift. Apply constants inside this file:

```diff
+// Anthropic translator sizing (bytes)
+const (
+	DefaultAnthropicMessageSize = 10 << 20  // 10 MiB
+	MaxAnthropicMessageSize     = 100 << 20 // 100 MiB
+)
```

examples/claude-code-llamacpp/test.sh (3)
4-17: Harden script: fail early, check deps, allow env overrides.
- Use `set -Eeuo pipefail` plus a small dependency check for `curl` and `jq`.
- Let env override URLs.

```diff
-set -e
+set -Eeuo pipefail
+
+need() { command -v "$1" >/dev/null 2>&1 || { echo "Missing dependency: $1" >&2; exit 1; }; }
+need curl; need jq
-
-OLLA_URL="http://localhost:40114"
-ANTHROPIC_URL="${OLLA_URL}/olla/anthropic/v1"
-LLAMACPP_URL="http://localhost:8080"
+OLLA_URL="${OLLA_URL:-http://localhost:40114}"
+ANTHROPIC_URL="${ANTHROPIC_URL:-${OLLA_URL}/olla/anthropic/v1}"
+LLAMACPP_URL="${LLAMACPP_URL:-http://localhost:8080}"
```
60-73: Send Anthropic version header for parity with real clients.Add
anthropic-versionto the non‑streaming request.-RESPONSE=$(curl -s -X POST "${ANTHROPIC_URL}/messages" \ +RESPONSE=$(curl -s -X POST "${ANTHROPIC_URL}/messages" \ -H "Content-Type: application/json" \ + -H "anthropic-version: 2023-06-01" \ -d "{
85-95: Likewise for streaming. Include `anthropic-version` on the streaming request for consistency.

```diff
-STREAM_OUTPUT=$(curl -sN -X POST "${ANTHROPIC_URL}/messages" \
+STREAM_OUTPUT=$(curl -sN -X POST "${ANTHROPIC_URL}/messages" \
   -H "Content-Type: application/json" \
+  -H "anthropic-version: 2023-06-01" \
   -d "{
```

examples/opencode-lmstudio/test.sh (3)
5-13: Stronger bash safety and dependency checks. Enable strict mode and verify the required tools are present.

```diff
-set -e
+set -Eeuo pipefail
+need() { command -v "$1" >/dev/null 2>&1 || { echo "Missing dependency: $1" >&2; exit 1; }; }
+need curl; need jq; need docker
```
55-63: Container check can be brittle; prefer compose or a name match. Use a more robust check to avoid false negatives if the container name differs.

```diff
-if docker ps | grep -q "olla"; then
+if docker compose ps --services --status running 2>/dev/null | grep -qx "olla"; then
     print_result 0 "Olla container is running"
 else
     print_result 1 "Olla container is not running" "Run: docker compose up -d"
     exit 1
 fi
```
158-165: Add Anthropic version header to match API expectations. Include the header on the Anthropic messages call.

```diff
-ANTHROPIC_RESPONSE=$(curl -s -f -X POST "http://${OLLA_HOST}:${OLLA_PORT}/olla/anthropic/v1/messages" \
+ANTHROPIC_RESPONSE=$(curl -s -f -X POST "http://${OLLA_HOST}:${OLLA_PORT}/olla/anthropic/v1/messages" \
   -H "Content-Type: application/json" \
+  -H "anthropic-version: 2023-06-01" \
   -d "{
```

examples/claude-code-ollama/test.sh (1)
4-17: Adopt strict mode and pre-flight checks. Reduce heisenbugs by enabling strict bash options and verifying `curl`/`jq` exist.

```diff
-set -e  # Exit on error
+set -Eeuo pipefail  # Exit on error, unset, and pipeline failures
+need() { command -v "$1" >/dev/null 2>&1 || { echo "Missing dependency: $1" >&2; exit 1; }; }
+need curl; need jq
```

internal/adapter/translator/anthropic/request_test.go (5)
16-21: Test logger creation looks fine; add `t.Parallel()` where safe. Most tests instantiate isolated translators; consider `t.Parallel()` to speed up suite runtime.
23-61: Great coverage of the happy path. Factor small helpers to reduce repetition: create helpers for marshalling and extracting messages to keep tests terse.

```diff
+// test helpers
+func mustTransform(t *testing.T, tr *Translator, v any) *TransformedRequest {
+	t.Helper()
+	b, err := json.Marshal(v)
+	require.NoError(t, err)
+	req := &http.Request{Body: io.NopCloser(bytes.NewReader(b))}
+	res, err := tr.TransformRequest(context.Background(), req)
+	require.NoError(t, err)
+	return res
+}
```
580-621: Consider a benchmark for the critical path. Add a micro-benchmark for `TransformRequest` with a representative payload. As per coding guidelines.

```go
func BenchmarkTransformRequest_SimpleMessage(b *testing.B) {
	tr := NewTranslator(createTestLogger(), createTestConfig())
	req := AnthropicRequest{
		Model:     "claude-3-5-sonnet-20241022",
		MaxTokens: 256,
		Messages:  []AnthropicMessage{{Role: "user", Content: "Hi"}},
	}
	body, _ := json.Marshal(req)
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		// Rebuild the request each iteration: the body reader is drained on first use.
		httpReq := &http.Request{Body: io.NopCloser(bytes.NewReader(body))}
		_, _ = tr.TransformRequest(context.Background(), httpReq)
	}
}
```
710-759: Nice: structured tool_result serialisation validated. Add an image content case for completeness: a test covering `image` content blocks would confirm the conversion/ignoring behaviour is intentional.
214-276: Tool choice mapping is well covered. Add a negative case for an unknown tool name in object form. If a specific tool name is requested but not declared, assert the translator behaviour (error or fallback).
internal/adapter/translator/types.go (1)
28-37: Clarify the TargetPath comment. Line 34's comment states "proxy handles /olla prefix", which may be unclear to developers unfamiliar with the proxy routing logic. Consider clarifying that this should be the backend-relative path without any Olla-specific prefixes.
Apply this diff to improve clarity:
```diff
-	TargetPath string // Target API path for the backend (e.g., "/v1/chat/completions" - proxy handles /olla prefix)
+	TargetPath string // Backend-relative API path (e.g., "/v1/chat/completions") - excludes Olla routing prefixes
```

docs/content/usage.md (1)
203-232: Consider using proper Markdown headings. Lines 203, 221, and 227 use bold text (emphasis) for section labels. Using proper Markdown headings (e.g., `####`) would improve document structure and accessibility. Apply this diff to use proper headings:
```diff
-**1. Enable Translation**
+#### 1. Enable Translation

-**2. Start Olla**
+#### 2. Start Olla

-**3. Configure Claude Code**
+#### 3. Configure Claude Code
```

internal/config/config.go (1)
317-327: Consider clarifying the comment. The comment on line 317 mentions "config for request size limits + streaming", but the code block only handles Anthropic translator environment overrides. Consider updating the comment to be more specific.
Apply this diff:
```diff
-	// config for request size limits + streaming
+	// Anthropic translator configuration overrides
 	if val := os.Getenv("OLLA_TRANSLATORS_ANTHROPIC_ENABLED"); val != "" {
```

examples/crush-vllm/compose.yaml (1)
4-4: Pin the image tag (avoid `latest`). Unpinned images hurt reproducibility and can break the example unexpectedly. Pin a version or digest.
Apply:
```diff
-  image: vllm/vllm-openai:latest
+  image: vllm/vllm-openai:0.6.2 # example - pin to a known-good tag or digest
```

internal/config/config_test.go (2)
586-604: Use `t.Setenv` to simplify env handling. Reduces boilerplate and avoids leaks if a test panics.
```diff
-	// Set env vars
-	for key, value := range testEnvVars {
-		os.Setenv(key, value)
-	}
-
-	// Clean up after test
-	defer func() {
-		for key := range testEnvVars {
-			os.Unsetenv(key)
-		}
-	}()
+	for k, v := range testEnvVars {
+		t.Setenv(k, v)
+	}
```
635-648: Don't reimplement substring search. Use `strings.Contains` for clarity.
strings.Containsfor clarity.-func contains(s, substr string) bool { - return len(s) >= len(substr) && (s == substr || len(substr) == 0 || - (len(s) > 0 && len(substr) > 0 && stringContains(s, substr))) -} - -func stringContains(s, substr string) bool { - for i := 0; i <= len(s)-len(substr); i++ { - if s[i:i+len(substr)] == substr { - return true - } - } - return false -} +import "strings" + +func contains(s, substr string) bool { + return strings.Contains(s, substr) +}internal/adapter/translator/anthropic/path_translation_test.go (2)
21-49: Also assert streaming flag propagation. You cover path mapping; add a check that `stream` on input produces `stream: true` in the OpenAI payload.
streamon input →stream: truein the OpenAI payload.@@ func TestPathTranslation(t *testing.T) { - for _, tt := range tests { + for _, tt := range tests { t.Run(tt.name, func(t *testing.T) { @@ - // Verify target path is set correctly + // Verify target path is set correctly assert.Equal(t, tt.expectedTargetPath, transformed.TargetPath, "Target path should be set to OpenAI chat completions endpoint") @@ - // Verify target path is not empty + // Verify target path is not empty assert.NotEmpty(t, transformed.TargetPath, "Target path must be populated for proper routing") + + // If streaming input, ensure streaming is propagated + if tt.anthropicReq.Stream { + v, ok := transformed.OpenAIRequest["stream"].(bool) + require.True(t, ok, "stream should be boolean in OpenAI request") + assert.True(t, v, "stream must be true when Anthropic request streams") + } }) }Also applies to: 84-91
1-167: Add a micro-benchmark for `TransformRequest`. Helps catch regressions and meets our test guideline for benchmarks under internal/*_test.go.

```go
func BenchmarkTransformRequest_Small(b *testing.B) {
	tr := NewTranslator(createTestLogger(), createTestConfig())
	reqBody := []byte(`{"model":"claude-3-5","max_tokens":64,"messages":[{"role":"user","content":"hi"}]}`)
	for i := 0; i < b.N; i++ {
		req := httptest.NewRequest("POST", "/olla/anthropic/v1/messages", bytes.NewReader(reqBody))
		if _, err := tr.TransformRequest(context.Background(), req); err != nil {
			b.Fatal(err)
		}
	}
}
```

[Based on learnings]
internal/app/handlers/application.go (1)
90-179: Optional: extract translator wiring into a helper. Keeps `NewApplication` lean and easier to unit test.

```diff
+func initTranslators(log logger.StyledLogger, cfg *config.Config) (*translator.Registry, error) {
+	reg := translator.NewRegistry(log)
+	if cfg.Translators.Anthropic.Enabled {
+		if err := cfg.Translators.Anthropic.Validate(); err != nil {
+			return nil, fmt.Errorf("invalid Anthropic translator config: %w", err)
+		}
+		tr := anthropic.NewTranslator(log, cfg.Translators.Anthropic)
+		if err := reg.Register("anthropic", tr); err != nil {
+			return nil, err
+		}
+		log.Info("Registered Anthropic translator", "max_message_size", cfg.Translators.Anthropic.MaxMessageSize)
+	} else {
+		log.Info("Anthropic translator disabled via configuration")
+	}
+	return reg, nil
+}
```

Then call `translatorRegistry, err := initTranslators(logger, cfg)`.

internal/app/handlers/server_routes.go (3)
108-120: Tidy logging; avoid an odd message, and log outside the conditional. The `InfoWithEndpoint(" ", ...)` message looks off, and it only logs when TokenCounter is implemented. Suggest a consistent info log regardless of TokenCounter, and keep Debug for paths.
Apply:
```diff
-	a.logger.InfoWithEndpoint(" ", name, "path", path)
+	a.logger.Info("Registered translator", "translator", name, "path", path)
```

Optionally move this info log outside the TokenCounter branch so it always emits.
Also applies to: 114-115
95-99: Docstring mismatch for the token count path. The comment says message path + "_count_tokens", but the path is "/count_tokens". Update the comment for accuracy.
```diff
-	// Uses message path + "_count_tokens" (e.g., /olla/anthropic/v1/messages/count_tokens)
+	// Uses message path + "/count_tokens" (e.g., /olla/anthropic/v1/messages/count_tokens)
```
76-81: Prefer `http.Method*` constants for consistency. Minor style: replace hard-coded "GET"/"POST" with `http.MethodGet`/`http.MethodPost`.
Also applies to: 88-93, 101-106
examples/opencode-lmstudio/README.md (4)
7-15: Add a language to the ASCII diagram code fence (markdownlint MD040). Specify a language (e.g. `text`) to satisfy linters.

````diff
-```
+```text
 ...
-```
+```
````
41-41: Grammar: avoid a colon after a preposition. "Download from:" should read "Download from".

```diff
-  - Download from: https://lmstudio.ai/
+  - Download from https://lmstudio.ai/
```
263-263: Hyphenate "up-to-date". Minor language fix.

```diff
-4. **Mac/Windows users**: Ensure Docker Desktop is up to date for `host.docker.internal` support
+4. **Mac/Windows users**: Ensure Docker Desktop is up-to-date for `host.docker.internal` support
```
324-332: Anthropic curl example: consider specifying the version header. Some clients expect an `Anthropic-Version` header; adding it improves interoperability with tooling.

```diff
 curl http://localhost:40114/olla/anthropic/v1/messages \
   -H "Content-Type: application/json" \
+  -H "Anthropic-Version: 2023-06-01" \
   -d '{
     "model": "qwen2.5-coder:7b",
     "max_tokens": 1024,
     "messages": [{"role": "user", "content": "Hello!"}]
   }'
```

Confirm the exact version your translator expects (or whether it ignores this header) and update the docs accordingly.

examples/claude-code-ollama/compose.yaml (2)
examples/claude-code-ollama/compose.yaml (2)
3-3: Avoid `:latest` tags for reproducibility. Pin images to a version or digest to prevent surprise upgrades.

Example:

```yaml
image: ollama/ollama:0.5.0 # or @sha256:<digest>
...
image: ghcr.io/thushan/olla:vX.Y.Z # or @sha256:<digest>
```

Also applies to: 26-26
38-42: Healthcheck tool availability. `wget` may not be present in minimal images. Prefer `curl -f` if available, or use CMD-SHELL with busybox wget. Keep checks fast and quiet.

```diff
-      test: ["CMD", "wget", "--quiet", "--tries=1", "--spider", "http://localhost:40114/internal/health"]
+      test: ["CMD", "curl", "-fsS", "http://localhost:40114/internal/health"]
```

If curl isn't present, switch to:

```yaml
test: ["CMD-SHELL", "wget -q --spider http://localhost:40114/internal/health || exit 1"]
```

internal/app/handlers/handler_translator_models_test.go (1)
62-66: Tighten assertions and add error‑path test; consider testify.
- Use require/assert from stretchr/testify for clearer failures.
- Add a test where GetUnifiedModels returns an error to verify 5xx and error body.
- Consider a small benchmark for this handler per guidelines.
Example changes:
```diff
+import "github.com/stretchr/testify/require"
...
-	if w.Code != http.StatusOK {
-		t.Errorf("expected status 200, got %d", w.Code)
-	}
+	require.Equal(t, http.StatusOK, w.Code)
```

Add:
```go
func TestTranslatorModelsHandler_Error(t *testing.T) {
	app := &Application{ /* inject a registry that returns error */ }
	trans := anthropic.NewTranslator(app.logger, config.AnthropicTranslatorConfig{Enabled: true})
	req := httptest.NewRequest(http.MethodGet, "/olla/anthropic/v1/models", nil)
	w := httptest.NewRecorder()
	app.translatorModelsHandler(trans)(w, req)
	if w.Code < 500 || w.Code >= 600 {
		t.Fatalf("expected 5xx, got %d", w.Code)
	}
}
```

Benchmark (minimal):
```go
func BenchmarkTranslatorModelsHandler(b *testing.B) {
	// set up app/trans with N models
	req := httptest.NewRequest(http.MethodGet, "/olla/anthropic/v1/models", nil)
	for i := 0; i < b.N; i++ {
		w := httptest.NewRecorder()
		handler(w, req)
	}
}
```

Based on learnings.
Also applies to: 122-126, 182-214
examples/claude-code-llamacpp/compose.yaml (2)
3-3: Pin image tags for predictability. Use a specific version or digest for the llama.cpp server and olla images.

```yaml
image: ghcr.io/ggml-org/llama.cpp:server-<ver> # or @sha256:<digest>
...
image: ghcr.io/thushan/olla:vX.Y.Z # or @sha256:<digest>
```

Also applies to: 41-41
26-31: Healthcheck tool availability and timing. Ensure the images include curl; if not, switch to a `wget` CMD-SHELL. Given model load time, consider a longer `start_period` for olla when backends are cold.

```diff
-      start_period: 10s
+      start_period: 30s
```

And if curl is missing:

```yaml
test: ["CMD-SHELL", "wget -q --spider http://localhost:8080/health || exit 1"]
```

Also applies to: 53-57
docs/content/integrations/frontend/opencode.md (1)
170-186: Verify LM Studio Docker availability. The documentation notes that LM Studio may not have an official Docker image and references an example directory for alternatives. Consider providing a more concrete alternative (e.g., using Ollama) in the quick start to avoid user confusion.
Apply this diff to provide a clearer fallback option:
````diff
 > **Note**: LM Studio may not have an official Docker image. See the [example directory](https://github.com/thushan/olla/tree/main/examples/opencode-lmstudio/) for alternative setup methods, or substitute with Ollama if preferred.
+>
+> **Quick Alternative**: Replace the `lmstudio` service with Ollama:
+> ```yaml
+> ollama:
+>   image: ollama/ollama:latest
+>   container_name: ollama
+>   restart: unless-stopped
+>   ports:
+>     - "11434:11434"
+> ```
````

examples/crush-vllm/crush-config.example.json (1)
59-114: Consider consolidating duplicate model definitions. Both providers define identical models with nearly identical metadata. Consider documenting a way to reference shared model definitions to reduce duplication and potential inconsistencies.
For example, in a more maintainable structure, you could note in comments that both providers share models and any updates should be applied to both, or reference Crush CLI documentation if it supports shared model definitions:
```diff
 "olla-anthropic": {
   "type": "anthropic",
   "base_url": "http://localhost:40114/olla/anthropic/v1",
   "api_key": "not-required",
-  "description": "Local vLLM via Olla (Anthropic format)",
+  "description": "Local vLLM via Olla (Anthropic format). Note: Models mirror olla-openai provider.",
```

internal/adapter/translator/anthropic/extended_features_test.go (1)
293-373: Add benchmarks for system character counting. The `countSystemChars` function is called during token estimation for every request. Consider adding benchmarks to ensure this operation remains performant, especially for large system prompt arrays. As per coding guidelines, add a benchmark:
```go
func BenchmarkCountSystemChars(b *testing.B) {
	system := []interface{}{
		map[string]interface{}{"type": "text", "text": "You are a helpful assistant with expertise in multiple domains. "},
		map[string]interface{}{"type": "text", "text": "Provide detailed and accurate responses. "},
		map[string]interface{}{"type": "text", "text": "Consider edge cases and best practices."},
	}
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		_ = countSystemChars(system)
	}
}
```

Based on coding guidelines.

internal/adapter/translator/anthropic/token_count.go (1)
internal/adapter/translator/anthropic/token_count.go (1)
37-57: Optional: improve character counting fairness for Unicode.
`len(string)` counts bytes, not runes. If you want a closer heuristic across languages, consider `utf8.RuneCountInString`. This will slightly change totals; gate it behind a flag if you need stability. Also applies to: 85-112, 113-156, 157-183
internal/app/handlers/handler_translation_bench_test.go (2)
99-111: Report bytes/op and assert required proxy headers.
- Call `b.SetBytes(int64(len(bodyBytes)))` to get bytes/op in reports.
- Assert presence of mandated headers to prevent regressions: X-Olla-Endpoint, X-Olla-Model, X-Olla-Backend-Type, X-Olla-Request-ID, X-Olla-Response-Time. As per coding guidelines.
Apply:
```diff
 b.ResetTimer()
 b.ReportAllocs()
+b.SetBytes(int64(len(bodyBytes)))
 for i := 0; i < b.N; i++ {
@@
 	if rec.Code != http.StatusOK {
 		b.Fatalf("Expected status 200, got %d", rec.Code)
 	}
+	// Verify required headers are set by the handler
+	required := []string{
+		"X-Olla-Endpoint",
+		"X-Olla-Model",
+		"X-Olla-Backend-Type",
+		constants.HeaderXOllaRequestID,
+		"X-Olla-Response-Time",
+	}
+	for _, h := range required {
+		if rec.Header().Get(h) == "" {
+			b.Fatalf("missing required header %q", h)
+		}
+	}
 }
```

As per coding guidelines.
178-191: Apply the same improvements to the streaming benchmark. Mirror `b.SetBytes` and the header assertions in the streaming case to validate SSE path behaviour as well.

Apply:

```diff
 b.ResetTimer()
 b.ReportAllocs()
+b.SetBytes(int64(len(bodyBytes)))
@@
 if rec.Code != http.StatusOK {
 	b.Fatalf("Expected status 200, got %d", rec.Code)
 }
+required := []string{
+	"X-Olla-Endpoint",
+	"X-Olla-Model",
+	"X-Olla-Backend-Type",
+	constants.HeaderXOllaRequestID,
+	"X-Olla-Response-Time",
+}
+for _, h := range required {
+	if rec.Header().Get(h) == "" {
+		b.Fatalf("missing required header %q", h)
+	}
+}
```

As per coding guidelines.

internal/adapter/translator/registry_test.go (3)
internal/adapter/translator/registry_test.go (3)
80-85: Fix test case name to match behaviour. The case name says "register multiple translators" but registers one. Rename for clarity.
Apply:
```diff
-	name: "register multiple translators",
+	name: "register single translator",
```
118-146: Strengthen the overwrite assertion. Verify the registry actually holds the second instance by pointer equality.
Apply:
```diff
@@
 retrieved, err := registry.Get("anthropic")
 if err != nil {
 	t.Fatalf("Expected no error, got: %v", err)
 }
-// Simple check that it's a different instance (in real code you'd use pointer comparison)
-if retrieved.Name() != "anthropic" {
-	t.Errorf("Expected overwritten translator name to be anthropic, got %s", retrieved.Name())
-}
+if mt, ok := retrieved.(*mockTranslator); !ok || mt != second {
+	t.Errorf("Expected registry to hold the second translator instance after overwrite")
+}
```
40-56: Run tests in parallel to reduce wall time. Call `t.Parallel()` at the start of each top-level test and within subtests where safe. Also applies to: 57-116, 148-214, 216-270, 272-295, 296-348, 350-394
internal/adapter/translator/anthropic/translator.go (2)
32-36: Avoid panicking on pool init; prefer returning an error. Panics in constructors make integration harder. Consider `NewTranslator(log, cfg) (*Translator, error)` and bubble the error; if you want a hard-fail helper, provide `MustNewTranslator(...)`. The current code path is unlikely to fail, but this change improves resilience.
98-104: Prefer a typed error payload for stability. Using a struct with JSON tags avoids accidental shape drift from map usage and improves IDE support.
Example:
```go
type anthropicError struct {
	Type  string `json:"type"`
	Error struct {
		Type    string `json:"type"`
		Message string `json:"message"`
	} `json:"error"`
}
```

internal/adapter/translator/anthropic/security_test.go (3)
16-59: Mark tests parallel and set HTTP method/headers. Add `t.Parallel()` to speed the suite, and initialise requests with `Method: "POST"` and a `Content-Type: application/json` header to better mirror real traffic. Apply pattern:
```diff
 func TestRequestValidation_RequiredFields(t *testing.T) {
+	t.Parallel()
@@
-	req := &http.Request{
-		Body: io.NopCloser(bytes.NewReader(body)),
-	}
+	req := &http.Request{
+		Method: "POST",
+		Header: http.Header{"Content-Type": []string{"application/json"}},
+		Body:   io.NopCloser(bytes.NewReader(body)),
+	}
```

Repeat similarly across tests in this file. Based on learnings (testify works fine with parallel tests; avoid suite parallelisation).
298-321: Trim the oversize payload to "limit+1" to reduce memory pressure. 11 MiB is larger than necessary. Use `(10<<20)+1` to assert the limit breach with less heap churn. Apply:
```diff
-	largeContent := strings.Repeat("A", 11*1024*1024) // 11mb of 'a's
+	largeContent := strings.Repeat("A", (10<<20)+1) // limit+1 bytes
```
16-416: Optional: factor request builders to reduce duplication. Introduce small helpers to build requests and marshal payloads to cut repetition and potential inconsistencies.
internal/adapter/translator/anthropic/tools.go (2)
10-26: Validate tool definitions before emitting OpenAI functions. Consider guarding against invalid tools (empty name, non-object schema) to avoid backend 400s. At minimum: trim/validate the name, ensure `schema.type == "object"`, and optionally log skipped tools instead of passing them through.
Apply a lightweight validation:
```diff
 func (t *Translator) convertTools(anthropicTools []AnthropicTool) []map[string]interface{} {
 	openaiTools := make([]map[string]interface{}, 0, len(anthropicTools))
 	for _, tool := range anthropicTools {
+		if strings.TrimSpace(tool.Name) == "" {
+			t.logger.Warn("skipping tool with empty name")
+			continue
+		}
+		if schemaType, _ := tool.InputSchema["type"].(string); schemaType == "" {
+			t.logger.Warn("tool schema missing 'type', defaulting to object", "tool", tool.Name)
+			if tool.InputSchema == nil {
+				tool.InputSchema = map[string]interface{}{}
+			}
+			tool.InputSchema["type"] = "object"
+		}
 		openaiTool := map[string]interface{}{
 			"type": openAITypeFunction,
 			"function": map[string]interface{}{
 				"name":        tool.Name,
 				"description": tool.Description,
 				"parameters":  tool.InputSchema, // Direct mapping - both use JSON Schema
 			},
 		}
 		openaiTools = append(openaiTools, openaiTool)
 	}
 	return openaiTools
 }
```
33-81: Tool choice mapping looks correct; add warnings on unknown inputs. Semantics for "auto"/"any"/specific tool are good. For unknown string/object values, defaulting to auto is fine, but add a debug/warn log to aid troubleshooting. Also trim and validate the specific tool name before returning.
```diff
 func (t *Translator) convertToolChoice(toolChoice interface{}) (interface{}, error) {
 	// Handle string form
 	if choiceStr, ok := toolChoice.(string); ok {
+		choiceStr = strings.TrimSpace(choiceStr)
 		switch choiceStr {
 		case toolChoiceAuto:
 			return openAIToolChoiceAuto, nil
 		case toolChoiceAny:
 			// Anthropic's "any" means require a tool call
 			// OpenAI uses "required" for this behaviour
 			return openAIToolChoiceRequired, nil
 		case toolChoiceNone:
 			return openAIToolChoiceNone, nil
 		default:
-			// Unknown string, default to auto
+			t.logger.Warn("unknown tool_choice string; defaulting to auto", "value", choiceStr)
 			return openAIToolChoiceAuto, nil
 		}
 	}
 	// Handle object form
 	if choiceMap, ok := toolChoice.(map[string]interface{}); ok {
 		choiceType, _ := choiceMap["type"].(string)
 		switch choiceType {
 		case toolChoiceAuto:
 			return openAIToolChoiceAuto, nil
 		case toolChoiceAny:
 			// Semantic mapping: Anthropic "any" -> OpenAI "required"
 			return openAIToolChoiceRequired, nil
 		case toolChoiceTool:
 			// Force specific tool selection
 			toolName, ok := choiceMap["name"].(string)
-			if !ok {
+			if !ok || strings.TrimSpace(toolName) == "" {
 				return nil, fmt.Errorf("tool_choice type 'tool' requires 'name' field")
 			}
 			return map[string]interface{}{
 				"type": openAITypeFunction,
 				"function": map[string]interface{}{
 					"name": toolName,
 				},
 			}, nil
 		default:
-			// Unknown type, default to auto
+			t.logger.Warn("unknown tool_choice object; defaulting to auto", "type", choiceType)
 			return openAIToolChoiceAuto, nil
 		}
 	}
 	// Safe default for unknown formats
 	return openAIToolChoiceAuto, nil
 }
```

docs/content/concepts/api-translation.md (2)
166-176: Specify languages on fenced code blocks to satisfy MD040 and improve rendering. Add explicit languages to code fences (e.g., `text` for SSE; mermaid/json are already done elsewhere).
````diff
-```
+```text
 data: {"choices":[{"delta":{"content":"Hello"}}]}
 data: {"choices":[{"delta":{"content":"!"}}]}
 data: [DONE]
````

````diff
-```
+```text
 event: message_start
 data: {"type":"message_start","message":{"id":"msg_01ABC","role":"assistant","model":"llama4.0:latest"}}
 ...
 event: message_stop
````

Also applies to: 174-196

225-226: Minor wording: "Full with" reads awkwardly. Slight readability improvement.

```diff
-| `"any"` | `"required"` | ✅ Full with semantic mapping |
+| `"any"` | `"required"` | ✅ Full — with semantic mapping |
```

internal/adapter/translator/anthropic/integration_test.go (2)
38-49: Set Content-Type for fidelity. Add a JSON content type header to mirror real requests and avoid future parser changes breaking tests.
```diff
 func createHTTPRequest(t *testing.T, anthropicReq AnthropicRequest) *http.Request {
 	t.Helper()
 	body, err := json.Marshal(anthropicReq)
 	require.NoError(t, err, "Failed to marshal Anthropic request")
 	req, err := http.NewRequest("POST", "/v1/messages", bytes.NewReader(body))
 	require.NoError(t, err, "Failed to create HTTP request")
+	req.Header.Set("Content-Type", "application/json")
 	return req
 }
```
654-706: Add a negative test for tool_choice `{type:"tool"}` missing name

Exercise the error path in `convertToolChoice` and ensure `TransformRequest` surfaces it.
```diff
@@ t.Run("tool_choice_variations", func(t *testing.T) {
 		testCases := []struct {
 			name               string
 			toolChoice         interface{}
 			expectedOpenAI     interface{}
 			expectedShouldFail bool
 		}{
@@
 			},
+			{
+				"specific_tool_missing_name",
+				map[string]interface{}{"type": "tool"},
+				nil,
+				true,
+			},
 		}
```

internal/adapter/proxy/sherpa/service_streaming.go (1)
212-235: Flush errors are ignored (good), but ensure headers are committed early

Call `WriteHeader(200)` before the first chunk if headers haven’t been sent, to guarantee clients see headers even if the first write/flush fails.
```diff
 func (s *Service) writeData(w http.ResponseWriter, data []byte, rc *http.ResponseController, state *streamState, rlog logger.StyledLogger) error {
@@
-	if !state.clientDisconnected {
+	if !state.clientDisconnected {
+		// Ensure headers committed before streaming
+		rc.Flush() // harmless if already written; commits headers if not
 		written, writeErr := w.Write(data)
```

internal/adapter/translator/anthropic/response_test.go (1)
628-665: Strengthen ID prefix assertion

Use a prefix check rather than contains to avoid false positives.
```diff
-assert.Contains(t, anthropicResp.ID, "msg_01", "Message ID should start with msg_01 prefix")
+assert.True(t, strings.HasPrefix(anthropicResp.ID, "msg_01"), "Message ID should start with msg_01 prefix")
```

internal/adapter/translator/registry.go (1)
87-96: Minor: clarify locking expectations in helper
`getAvailableNames` depends on the caller holding the lock. Consider renaming to `unsafeGetAvailableNames` or amend the comment to explicitly state it must be called with `mu` already held (RLock or Lock).

docs/content/api-reference/anthropic.md (1)
592-603: Doc ↔ implementation mismatch: “All responses include standard Olla headers”

The docs state all responses include `X-Olla-*` headers. The new translator models endpoint (`GET /olla/anthropic/v1/models`) currently writes only `Content-Type` and status, not the `X-Olla-*` headers.

Please either:
- Update the handler to emit at least `X-Olla-Request-ID` and `X-Olla-Response-Time` (and any available others), or
- Narrow the wording to “proxy responses” to match current behaviour.
I recommend emitting `X-Olla-Response-Time` everywhere for consistency.

internal/app/handlers/handler_translator_models.go (3)
49-51: Emit standard Olla headers (at least response time) for consistency with docs

Set `X-Olla-Response-Time` (and `X-Olla-Request-ID` if available from context/middleware) before writing the response.

```diff
 func (a *Application) translatorModelsHandler(trans translator.RequestTranslator) http.HandlerFunc {
 	return func(w http.ResponseWriter, r *http.Request) {
-		ctx := r.Context()
+		start := time.Now()
+		ctx := r.Context()
@@
-		w.Header().Set(constants.HeaderContentType, constants.ContentTypeJSON)
+		w.Header().Set(constants.HeaderContentType, constants.ContentTypeJSON)
+		// Best-effort response time header
+		w.Header().Set(constants.HeaderXOllaResponseTime, time.Since(start).String())
 		w.WriteHeader(http.StatusOK)
-		json.NewEncoder(w).Encode(response)
+		_ = json.NewEncoder(w).Encode(response)
 	}
 }
```

If you already have a request ID in context, also set `X-Olla-Request-ID`. [Based on learnings]
96-106: Align error payload with documented Anthropic format by including top-level type

Add `"type": "error"` at the top level to match the documented error schema.

```diff
-	errorResp := map[string]interface{}{
-		"error": map[string]interface{}{
-			"message": message,
-			"type":    "models_error",
-		},
-	}
+	errorResp := map[string]interface{}{
+		"type": "error",
+		"error": map[string]interface{}{
+			"message": message,
+			"type":    "models_error",
+		},
+	}
```
66-76: Prefer model timestamps when available

Instead of `time.Now().Unix()` per entry, use `model.LastSeen.Unix()` if non‑zero to provide a stable, meaningful `created` value. Fall back to `time.Now()` only when `LastSeen` is zero.

internal/app/handlers/handler_translation_test.go (1)
1-204: Add microbenchmarks for translation paths

Per guidelines, include benchmarks for critical paths (non‑streaming and streaming). Example stubs:

```go
func BenchmarkTranslationHandler_NonStreaming(b *testing.B) {
	app := &Application{ /* init with mocks as in TestTranslationHandler_NonStreaming */ }
	h := app.translationHandler(&mockTranslator{name: "bench"})
	body, _ := json.Marshal(map[string]any{
		"model":    "m",
		"messages": []any{map[string]any{"role": "user", "content": "hi"}},
	})
	for i := 0; i < b.N; i++ {
		// recreate the request each iteration so the body can be re-read
		req := httptest.NewRequest("POST", "/test", bytes.NewReader(body))
		req.Header.Set(constants.HeaderContentType, constants.ContentTypeJSON)
		rec := httptest.NewRecorder()
		h.ServeHTTP(rec, req)
	}
}

func BenchmarkTransformStreamingResponse(b *testing.B) {
	// create translator under test and a representative stream
	chunks := []string{
		"data: {\"id\":\"x\",\"model\":\"m\",\"choices\":[{\"delta\":{\"content\":\"Hello\"},\"index\":0}]}\n\n",
		"data: {\"id\":\"x\",\"choices\":[{\"delta\":{},\"index\":0,\"finish_reason\":\"stop\"}]}\n\n",
		"data: [DONE]\n\n",
	}
	stream := strings.Join(chunks, "")
	for i := 0; i < b.N; i++ {
		tr := &mockTranslator{name: "bench-stream"}
		rec := httptest.NewRecorder()
		_ = tr.TransformStreamingResponse(context.Background(), strings.NewReader(stream), rec, nil)
	}
}
```
442-482: Add a benchmark for streaming translation

Given this path is performance‑critical, add a benchmark to quantify overhead.

```go
func BenchmarkAnthropic_TransformStreamingResponse(b *testing.B) {
	tr := NewTranslator(createStreamingTestLogger(), createTestConfig())
	chunks := []string{
		"data: {\"id\":\"chatcmpl-bench\",\"model\":\"claude-3-5-sonnet-20241022\",\"choices\":[{\"delta\":{\"content\":\"Test\"},\"index\":0}]}\n\n",
		"data: {\"id\":\"chatcmpl-bench\",\"choices\":[{\"delta\":{},\"index\":0,\"finish_reason\":\"stop\"}]}\n\n",
		"data: [DONE]\n\n",
	}
	payload := strings.Join(chunks, "")
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		rec := httptest.NewRecorder()
		_ = tr.TransformStreamingResponse(context.Background(), strings.NewReader(payload), rec, nil)
	}
}
```
23-29: Optional: simulate real streaming with io.Pipe for stronger signal

`createMockOpenAIStream` writes a full buffer at once. For closer parity with runtime, consider an `io.Pipe` that writes chunks with small delays to surface any read/flush timing bugs.

internal/adapter/translator/anthropic/response.go (2)
115-125: Handle `content_filter` finish_reason.

OpenAI may emit `content_filter`. Map it explicitly to a sensible stop reason (e.g., `end_turn`) to avoid ambiguity.

```diff
 func mapFinishReasonToStopReason(finishReason string) string {
 	switch finishReason {
 	case "stop":
 		return "end_turn"
 	case "tool_calls":
 		return contentTypeToolUse
 	case "length":
 		return "max_tokens"
+	case "content_filter":
+		return "end_turn"
 	default:
 		return "end_turn"
 	}
 }
```
191-200: Strengthen ID fallback uniqueness.

If `crypto/rand` fails, the current fallback can return identical IDs (`msg_01fallback0`). Mix in time to ensure uniqueness.

```diff
 import (
 	"context"
 	"crypto/rand"
 	"encoding/json"
 	"fmt"
 	"math/big"
 	"net/http"
+	"time"
 )
@@
 	if _, err := rand.Read(randomBytes); err != nil {
 		// crypto/rand shouldn't fail but just in case
 		t.logger.Warn("Failed to generate random bytes for message ID", "error", err)
-		return fmt.Sprintf("msg_01fallback%d", big.NewInt(0).SetBytes(randomBytes[:8]).Uint64())
+		ts := time.Now().UnixNano()
+		return fmt.Sprintf("msg_01fallback%x", ts)
 	}
```

internal/adapter/translator/anthropic/request.go (2)
209-213: Preserve user text boundaries.

Joining with `""` collapses spacing. Prefer a newline (or space) to retain readability.

```diff
-	"content": strings.Join(textParts, ""),
+	"content": strings.Join(textParts, "\n"),
```
39-41: Set Content-Type for rewritten request body.

After marshalling the OpenAI body, set the JSON content type to avoid backend heuristics.

```diff
 r.Body = io.NopCloser(bytes.NewReader(openaiBody))
 r.ContentLength = int64(len(openaiBody))
+r.Header.Set("Content-Type", "application/json")
```

internal/adapter/translator/anthropic/types.go (1)
35-38: Consider allowing MaxTokens=0 as “use default”.Strictly requiring ≥1 may break clients that omit or set 0 expecting server defaults. Option: accept 0 and inject a configured default during translation.
internal/adapter/translator/anthropic/streaming.go (1)
31-35: Add `X-Accel-Buffering: no` for SSE behind proxies.

Helps disable buffering on Nginx to keep events flowing.

```diff
 w.Header().Set(constants.HeaderContentType, "text/event-stream")
 w.Header().Set("Cache-Control", "no-cache")
 w.Header().Set("Connection", "keep-alive")
+w.Header().Set("X-Accel-Buffering", "no")
```

internal/app/handlers/handler_translation.go (1)
39-41: Set Content-Type after rewriting request body (defensive).

Helps avoid backend surprises if Content-Type wasn’t JSON.

```diff
 r.Body = io.NopCloser(bytes.NewReader(openaiBody))
 r.ContentLength = int64(len(openaiBody))
+r.Header.Set(constants.HeaderContentType, constants.ContentTypeJSON)
```
```text
Content-Type: application/json
anthropic-version: 2023-06-01  # Optional, for compatibility
```
|
|
Add languages to fenced code blocks to satisfy MD040 and improve readability
Specify a language for the following fences:

- Lines 80–84 (headers) → use `text`
- Lines 198–240 (SSE example) → use `text`
- Lines 470–474 (headers) → use `text`
- Lines 568–589 (full SSE sequence) → use `text`
Example:
````diff
-```
+```text
Content-Type: application/json
anthropic-version: 2023-06-01 # Optional, for compatibility
```
````

Also applies to: 198-240, 470-474, 568-589
<details>
<summary>🧰 Tools</summary>
<details>
<summary>🪛 markdownlint-cli2 (0.18.1)</summary>
80-80: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
</details>
</details>
<details>
<summary>🤖 Prompt for AI Agents</summary>
In docs/content/api-reference/anthropic.md around lines 80-84, 198-240, 470-474, and 568-589, the fenced code blocks lack a language tag, which triggers MD040 and reduces readability; update each opening fence to include the language identifier "text" (i.e. replace the bare `` ``` `` fence with `` ```text `` for those blocks) so the headers, SSE examples, and full SSE sequence are marked as plain text.
</details>
<!-- This is an auto-generated comment by CodeRabbit -->
| Anthropic | OpenAI | Behaviour |
|-----------|--------|-----------|
| `"auto"` | `"auto"` | Let model decide whether to use tools |
| `"any"` | `"required"` | Force model to use a tool (any tool) |
| `"none"` | `"none"` | Disable tool use |
| `{"type":"tool","name":"X"}` | `{"type":"function","function":{"name":"X"}}` | Force specific tool |
Hyphenate “Force-specific” for correctness
Update table text to “Force‑specific tool”.
```diff
-| `{"type":"tool","name":"X"}` | `{"type":"function","function":{"name":"X"}}` | Force specific tool |
+| `{"type":"tool","name":"X"}` | `{"type":"function","function":{"name":"X"}}` | Force-specific tool |
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| Anthropic | OpenAI | Behaviour |
|-----------|--------|-----------|
| `"auto"` | `"auto"` | Let model decide whether to use tools |
| `"any"` | `"required"` | Force model to use a tool (any tool) |
| `"none"` | `"none"` | Disable tool use |
| `{"type":"tool","name":"X"}` | `{"type":"function","function":{"name":"X"}}` | Force-specific tool |
🧰 Tools
🪛 LanguageTool
[uncategorized] ~242-~242: When ‘Force-specific’ is used as a modifier, it is usually spelled with a hyphen.
Context: ...:"function","function":{"name":"X"}}` | Force specific tool | --- ## Architecture ### Trans...
(SPECIFIC_HYPHEN)
🤖 Prompt for AI Agents
docs/content/concepts/api-translation.md around lines 237 to 244: the table row
label in the Behaviour column reads "Force specific tool" and needs to be
hyphenated; update that cell text to "Force-specific tool" so it correctly uses
a hyphen.
1. Request arrives at `/olla/anthropic/v1/messages`
2. Handler invokes Anthropic translator
3. Translator converts request to OpenAI format
4. Proxy routes to backend (standard Olla routing)
5. Backend responds in OpenAI format
6. Translator converts response to Anthropic format
7. Response returned to client
|
|
Fix duplicate word in the numbered “Process” list
There appears to be a repeated word; remove the duplicate.
```diff
-1. Request arrives at `/olla/anthropic/v1/messages`
-2. Handler invokes Anthropic translator
-3. Translator Translator converts request to OpenAI format
+1. Request arrives at `/olla/anthropic/v1/messages`
+2. Handler invokes Anthropic translator
+3. Translator converts request to OpenAI format
```

Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 LanguageTool
[duplication] ~295-~295: Possible typo: you repeated a word.
Context: .../messages` 2. Handler invokes Anthropic translator 3. Translator converts request to OpenAI format 4. Pr...
(ENGLISH_WORD_REPEAT_RULE)
🤖 Prompt for AI Agents
In docs/content/concepts/api-translation.md around lines 294 to 301, a word is
duplicated in the numbered “Process” list; edit the affected list item to remove
the repeated word so the sentence reads cleanly and grammatically (e.g., fix the
duplicated token in the translator/convert step), keeping the numbering and
overall steps unchanged.
```yaml
translators:
  anthropic:
    enabled: true
    # Anthropic-specific configuration
    version: "2023-06-01"
    # Map Anthropic system messages to OpenAI format
    system_message_handling: "prepend" # Options: prepend, append, merge
    # Handle max_tokens requirement (Anthropic requires it)
    default_max_tokens: 4096
```
|
|
🧩 Analysis chain
Verify Anthropic translator configuration fields.
The configuration includes fields (version, system_message_handling, default_max_tokens) that appear to be documentation placeholders rather than implemented features. Ensure these configuration options are actually supported by the Anthropic translator implementation before including them in examples.
Run the following script to verify if these configuration fields are used in the translator code:
🏁 Script executed:
```shell
#!/bin/bash
# Search for references to these configuration fields in the translator implementation
rg -n --type=go -C3 'version.*2023-06-01|system_message_handling|default_max_tokens' internal/adapter/translator/anthropic/
```

Length of output: 0
I'll conduct broader searches to understand the codebase structure and verify whether these configuration fields are actually implemented.
---
🏁 Script executed:
```shell
#!/bin/bash
# Find files related to Anthropic or translators
fd -t f -i anthropic
```

Length of output: 94
🏁 Script executed:
```shell
#!/bin/bash
# Search for translator-related files and directories
fd -t d translator
```

Length of output: 28
🏁 Script executed:
```shell
#!/bin/bash
# Search the entire codebase for these config field names
rg -n 'system_message_handling|default_max_tokens' --type=go --type=yaml
```

Length of output: 168
🏁 Script executed:
```shell
#!/bin/bash
# Check if the example YAML file is validated or parsed anywhere
rg -n 'olla\.yaml|crush-vllm' --type=go
```

Length of output: 0
I need to check what translators are actually implemented and review the documentation.
---
🏁 Script executed:
```shell
#!/bin/bash
# List translator implementations
ls -la internal/adapter/translator/
```

Length of output: 361
🏁 Script executed:
```shell
#!/bin/bash
# Check if Anthropic translator exists
ls -la internal/adapter/translator/ | grep -i anthropic
```

Length of output: 59
🏁 Script executed:
```shell
#!/bin/bash
# View the Anthropic documentation to see if these fields are documented
cat docs/content/integrations/api-translation/anthropic.md
```

Length of output: 22724
🏁 Script executed:
```shell
#!/bin/bash
# Search for YAML config parsing and validation logic
rg -n 'yaml|config|unmarshal' --type=go -C2 | head -50
```

Length of output: 2483
---
🏁 Script executed:
```shell
#!/bin/bash
# List files in Anthropic translator implementation
ls -la internal/adapter/translator/anthropic/
```

Length of output: 1382
🏁 Script executed:
```shell
#!/bin/bash
# Search for struct definitions or config handling in Anthropic translator
rg -n 'type.*Config|Unmarshal|version|system_message' internal/adapter/translator/anthropic/ --type=go
```

Length of output: 2248
Remove undocumented configuration fields from Anthropic translator example.
The Anthropic translator example includes three unsupported configuration options (version, system_message_handling, default_max_tokens) that are not documented or implemented. The official documentation lists only enabled (boolean) and max_message_size (integer) as supported configuration options, and these fields appear nowhere in the translator implementation. Remove lines 46, 48, and 50 to prevent user confusion.
🤖 Prompt for AI Agents
In examples/crush-vllm/olla.yaml around lines 42 to 51, the Anthropic translator
example contains undocumented/unsupported fields (version,
system_message_handling, default_max_tokens); remove the three lines (currently
at lines 46, 48, and 50) so the block only contains supported options (enabled
and max_message_size) to match the translator implementation and docs.
```go
a.routeRegistry.RegisterWithMethod(
	path,
	handler,
	name+" Messages API",
	"POST",
)

// Register models endpoint for translators
// Uses base path from translator + "/models" (e.g., /olla/anthropic/v1/models)
modelsPath := extractBasePath(path) + "/models"
modelsHandler := a.translatorModelsHandler(trans)

a.routeRegistry.RegisterWithMethod(
	modelsPath,
	modelsHandler,
	name+" Models API",
	"GET",
)
```
🧩 Analysis chain
Ensure proxy headers are set for translator endpoints.
Per guidelines for internal/app/handlers/*.go, proxy responses must set: X-Olla-Endpoint, X-Olla-Model, X-Olla-Backend-Type, X-Olla-Request-ID, X-Olla-Response-Time. Please confirm translationHandler and translatorModelsHandler/tokenCountHandler set these (or are wrapped by middleware that does).
Run:
If missing, consider a lightweight wrapper:
```diff
- handler := a.translationHandler(trans)
+ handler := a.withProxyHeaders(a.translationHandler(trans))
```

…and implement `withProxyHeaders` once in this package.
As per coding guidelines.
Also applies to: 101-107
🏁 Script executed:
```shell
#!/bin/bash
# Verify required proxy headers are set by handlers or middleware.
rg -n --type=go -C2 $'X-Olla-(Endpoint|Model|Backend-Type|Request-ID|Response-Time)' internal/app/handlers || true
# Inspect translation handler for header writes
ast-grep --pattern $'func ($_ $_) translationHandler($_) $_ { $$$ }' || true
rg -n --type=go -C3 'translationHandler\(' internal/app/handlers || true
# Check if routeRegistry injects headers automatically for RegisterWithMethod/RegisterProxyRoute
rg -n --type=go -C3 'RegisterWithMethod|RegisterProxyRoute' internal || true
```

Length of output: 31618
Expected output would show where headers are set in the actual handlers, confirming whether they're being set for translator endpoints.
---
🏁 Script executed:
```shell
#!/bin/bash
# Find executeTranslatedStreamingRequest and executeTranslatedNonStreamingRequest implementations
rg -n --type=go 'executeTranslatedStreamingRequest|executeTranslatedNonStreamingRequest' internal/app/handlers -A 20 | head -100
# Find translatorModelsHandler and tokenCountHandler implementations
rg -n --type=go -B2 'func.*translatorModelsHandler|func.*tokenCountHandler' internal/app/handlers | head -80
# Search for where X-Olla headers are written (w.Header().Set with X-Olla)
rg -n --type=go 'w\.Header\(\)\.Set.*X-Olla' internal/app/handlers | head -30
```

Length of output: 185
---
🏁 Script executed:
```shell
#!/bin/bash
# Find executeTranslatedStreamingRequest and executeTranslatedNonStreamingRequest implementations
rg -n --type=go 'func.*executeTranslatedStreamingRequest|func.*executeTranslatedNonStreamingRequest' internal/app/handlers -A 25 | head -150
# Find translatorModelsHandler and tokenCountHandler implementations
rg -n --type=go 'func.*translatorModelsHandler|func.*tokenCountHandler' internal/app/handlers -A 20 | head -100
# Search for where X-Olla headers are actually written
rg -n --type=go 'w\.Header\(\)\.Set.*constants\.HeaderXOlla|w\.Header\(\)\.Set.*"X-Olla' internal/app/handlers -B2 -A2 | head -80
```

Length of output: 14776
---
🏁 Script executed:
```shell
#!/bin/bash
# Find writeTranslatorModelsError implementation
rg -n --type=go 'func.*writeTranslatorModelsError' internal/app/handlers -A 15
# Find writeTranslatorError implementation
rg -n --type=go 'func.*writeTranslatorError' internal/app/handlers -A 15
# Find ProxyRequestToEndpoints implementation (likely in adapter/proxy)
rg -n --type=go 'func.*ProxyRequestToEndpoints' internal -A 15
# Check if there's middleware wrapping the handlers
rg -n --type=go 'withProxyHeaders|setProxyHeaders' internal -B2 -A5
```

Length of output: 22713
---
🏁 Script executed:
```shell
#!/bin/bash
# Find ProxyRequestToEndpointsWithRetry implementation in olla
rg -n --type=go 'func.*ProxyRequestToEndpointsWithRetry' internal/adapter/proxy/olla/service_retry.go -A 50 | head -100
# Find actual response header writing in proxy service
rg -n --type=go 'w\.Header\(\)\.Set.*HeaderXOlla' internal/adapter/proxy -B2 -A2 | head -80
# Check if there's any middleware or wrapper that adds headers
rg -n --type=go 'HeaderXOlla' internal -B3 -A3 | grep -E '(func|\.Set)' | head -40
```

Length of output: 8131
---
🏁 Script executed:
```shell
#!/bin/bash
# Find the function containing the header-setting code in common.go
rg -n --type=go --context 20 'h\.Set\(constants\.HeaderXOllaRequestID' internal/adapter/proxy/core/common.go | head -80
# Find what calls this function
rg -n --type=go 'SetProxyResponseHeaders|WriteProxyHeaders|copyResponseHeaders' internal/adapter/proxy -B3 -A3 | head -60
```

Length of output: 1432
---
🏁 Script executed:
```shell
#!/bin/bash
# Find all calls to SetResponseHeaders
rg -n --type=go 'SetResponseHeaders\(' internal -B3 -A3
# Check if executeTranslatedNonStreamingRequest or executeTranslatedStreamingRequest call it
rg -n --type=go 'executeTranslatedNonStreamingRequest|executeTranslatedStreamingRequest' internal/app/handlers/handler_translation.go -A 50 | grep -E '(SetResponseHeaders|w\.Header)'
# Check if proxyService implementations call SetResponseHeaders
rg -n --type=go 'proxyToSingleEndpoint|retryHandler\.ExecuteWithRetry' internal/adapter/proxy -A 30 | grep -E '(SetResponseHeaders|w\.Header)'
```

Length of output: 4130
---
🏁 Script executed:
```shell
#!/bin/bash
# Find ExecuteWithRetry implementation to see if it calls SetResponseHeaders
rg -n --type=go 'func.*ExecuteWithRetry' internal/adapter/proxy -A 40 | head -120
# Also check for any wrapper or middleware between handler and proxy service
rg -n --type=go 'translationHandler|translatorModelsHandler|tokenCountHandler' internal/app/handlers/server_routes.go -B2 -A2
```

Length of output: 3500
---
🏁 Script executed:
```shell
#!/bin/bash
# Full implementation of translatorModelsHandler to see if it sets headers
rg -n --type=go 'func \(a \*Application\) translatorModelsHandler' internal/app/handlers -A 80 | head -150
# Full implementation of tokenCountHandler
rg -n --type=go 'func \(a \*Application\) tokenCountHandler' internal/app/handlers -A 80 | head -150
```

Length of output: 13357
Set X-Olla proxy headers on translator Models and Token Count endpoints.
Per guidelines, translator endpoints must set response headers: X-Olla-Endpoint, X-Olla-Model, X-Olla-Backend-Type, X-Olla-Request-ID, X-Olla-Response-Time.
- `translationHandler` (lines 76–80): ✓ Already compliant via `proxyService.ProxyRequestToEndpoints()` → `core.SetResponseHeaders()`
- `translatorModelsHandler` (lines 85–87): ✗ Missing header calls
- `tokenCountHandler` (lines 101–107): ✗ Missing header calls
Add header setting for the Models and Token Count handlers. Use the existing `copyOllaHeaders` utility if available, or call `core.SetResponseHeaders()` with appropriate stats/endpoint context.
- fix typos in configuration for Claude Code
- Refactor Streaming tests and use helpers
- update handler tests
- fix trailing slashes in paths causing issues when not escaped
- update logging to avoid PII data leak
- add additional tests for token counts, configuration
Actionable comments posted: 3
♻️ Duplicate comments (3)
internal/adapter/translator/anthropic/response.go (2)
205-206: Fix typo in comment.

“ambigious” → “ambiguous”.

Apply:

```diff
-// encode bytes to base58, shorter and less ambigious than hex
+// encode bytes to base58, shorter and less ambiguous than hex
```
141-151: Reduce PII risk in tool-args logging.

Don’t log the preview string under a name suggesting it’s the full value; prefer length only.

Apply:

```diff
-	t.logger.Warn("Failed to parse tool arguments, using empty input",
+	t.logger.Warn("Failed to parse tool arguments; using empty input",
 		"tool", name,
 		"tool_id", id,
 		"error", err,
-		"raw_arguments", util.TruncateString(argsStr, util.DefaultTruncateLengthPII),
-		"raw_arguments_len", len(argsStr))
+		"raw_arguments_len", len(argsStr))
```

internal/app/handlers/handler_translation.go (1)
76-79: Fix typo in comment.

“make shure” → “make sure”.

Apply:

```diff
-	// make shure that we have at least one endpoint available
+	// make sure that we have at least one endpoint available
```
🧹 Nitpick comments (14)
docs/content/integrations/frontend/claude-code.md (3)
101-115: Add a language to the ASCII diagram code fence (markdownlint MD040).

Specify a fence language to satisfy linters and improve rendering.

````diff
-```
+```text
 ...
-```
+```
````
11-12: Prefer headings over bold for section titles (markdownlint MD036).

Use proper headings for structure and navigation.

```diff
-**Set in Claude Code:**
+## Set in Claude Code

-**Option A: Environment Variables (Recommended)**
+### Option A: Environment Variables (Recommended)

-**Option B: Configuration File**
+### Option B: Configuration File
```

Also applies to: 275-276, 290-296
537-541: Minor wording: “multi‑task” → “multitask”.

Improve consistency with common usage.

```diff
-  - `qwen3:32b` - Strong multi-task performance
+  - `qwen3:32b` - Strong multitask performance
```

internal/adapter/translator/anthropic/token_count.go (2)
120-146: Support pointer to typed content blocks in message.Content.

Messages can mirror system’s typed forms. Add `*[]ContentBlock` handling to avoid undercounting.

```diff
 func countMessageChars(msg *AnthropicMessage) int {
 	totalChars := 0
 	switch content := msg.Content.(type) {
 	case string:
@@
 	case []ContentBlock:
 		// typed block arrays
 		for _, block := range content {
 			totalChars += countTypedContentBlockChars(&block)
 		}
+	case *[]ContentBlock:
+		if content != nil {
+			for _, block := range *content {
+				totalChars += countTypedContentBlockChars(&block)
+			}
+		}
 	}
 	return totalChars
 }
```
51-71: Optional: count runes, not bytes, for string lengths.

Using `len(s)` counts bytes, so it overestimates non‑ASCII text. Consider `utf8.RuneCountInString` for more language‑agnostic estimates. Minimal change: introduce a small helper and use it in the string paths (text, names; serialized JSON remains byte‑based).
Example:

```go
import "unicode/utf8"

func runeLen(s string) int { return utf8.RuneCountInString(s) }
```

Then replace `len(...)` on user‑visible text fields with `runeLen(...)`. No functional change required now.
Also applies to: 73-119, 148-190, 192-230
internal/adapter/translator/anthropic/integration_test.go (1)
92-98: Run top‑level tests in parallel to speed CI (safe here).

Each Test* creates its own translator/config; mark them parallel.

```diff
 func TestAnthropicToOpenAIToAnthropic_RoundTrip(t *testing.T) {
 	translator := NewTranslator(createIntegrationTestLogger(), createTestConfig())
 	ctx := context.Background()
+	t.Parallel()
 }
@@
 func TestAnthropicToolCalling_RoundTrip(t *testing.T) {
 	translator := NewTranslator(createIntegrationTestLogger(), createTestConfig())
 	ctx := context.Background()
+	t.Parallel()
 }
@@
 func TestAnthropicEdgeCases_RoundTrip(t *testing.T) {
 	translator := NewTranslator(createIntegrationTestLogger(), createTestConfig())
 	ctx := context.Background()
+	t.Parallel()
 }
@@
 func TestAnthropicModelPreservation(t *testing.T) {
 	translator := NewTranslator(createIntegrationTestLogger(), createTestConfig())
 	ctx := context.Background()
+	t.Parallel()
 }
@@
 func TestAnthropicUsageTracking(t *testing.T) {
 	translator := NewTranslator(createIntegrationTestLogger(), createTestConfig())
 	ctx := context.Background()
+	t.Parallel()
 }
```

Also applies to: 374-380, 827-833, 1033-1039, 1074-1081
internal/adapter/translator/registry.go (1)
72-85: Optional: expose Unregister for dynamic reconfiguration.

If translators can be reloaded/replaced at runtime, an `Unregister(name)` method (with lock) prevents stale entries during reinit.
No change required now.
Also applies to: 87-105
internal/adapter/translator/anthropic/response.go (2)
79-85: Handle OpenAI message.content arrays (multi‑part).

Support `[]interface{}` with text parts to avoid dropping content from newer APIs.

Apply:

```diff
-	// Handle text content
-	if textContent, ok := message["content"].(string); ok && textContent != "" {
-		content = append(content, ContentBlock{
-			Type: contentTypeText,
-			Text: textContent,
-		})
-	}
+	// Handle text content
+	switch c := message["content"].(type) {
+	case string:
+		if c != "" {
+			content = append(content, ContentBlock{Type: contentTypeText, Text: c})
+		}
+	case []interface{}:
+		// Concatenate text parts; ignore non-text here (images not supported in non-stream path yet)
+		var b strings.Builder
+		for _, part := range c {
+			if m, ok := part.(map[string]interface{}); ok {
+				if t, _ := m["type"].(string); t == "text" {
+					if txt, _ := m["text"].(string); txt != "" {
+						b.WriteString(txt)
+					}
+				}
+			}
+		}
+		if b.Len() > 0 {
+			content = append(content, ContentBlock{Type: contentTypeText, Text: b.String()})
+		}
+	}
```

Add import:

```diff
+	"strings"
```
117-127: Map ‘content_filter’ finish_reason.

Add explicit mapping to a sensible default to avoid surprises.

Apply:

```diff
 switch finishReason {
 case "stop":
 	return "end_turn"
 case "tool_calls":
 	return contentTypeToolUse
 case "length":
 	return "max_tokens"
+case "content_filter":
+	return "end_turn"
 default:
 	return "end_turn"
 }
```

internal/adapter/translator/anthropic/streaming_test.go (3)
14-59: Mark tests parallel where safe.

Add `t.Parallel()` at the start of each TestTransformStreamingResponse_* to speed CI.

Example:

```diff
 func TestTransformStreamingResponse_SimpleText(t *testing.T) {
+	t.Parallel()
 	translator := newTestTranslator()
```

Repeat for other tests in this file.
523-544: Add boundary test for max_message_size.

Include a case exceeding configured MaxMessageSize to assert the translator returns an error/status rather than hanging.
Would you like me to draft the test using a >10MiB single-arg chunk?
296-338: Finish-reason default case.

Add a case asserting unknown finish_reason maps to “end_turn” to pin the expected default.
internal/app/handlers/handler_translation.go (1)
153-165: Avoid string keys in context.Use a typed, unexported key to prevent collisions.
Apply:
func (a *Application) prepareProxyContext(ctx context.Context, r *http.Request, pr *proxyRequest) (context.Context, *http.Request) { - if pr.model != "" { - ctx = context.WithValue(ctx, "model", pr.model) - r = r.WithContext(ctx) - } + if pr.model != "" { + ctx = context.WithValue(ctx, ctxKeyModel, pr.model) + r = r.WithContext(ctx) + }Add near the top of the file:
```go
// context keys
type ctxKey string

const ctxKeyModel ctxKey = "model"
```

internal/adapter/translator/anthropic/streaming_test_helpers.go (1)
137-151: Optionally ignore non‑JSON data lines.
If future streams include “data: [DONE]”, skip parsing those lines to avoid false errors in helpers.
Example:
```diff
 } else if strings.HasPrefix(line, "data: ") {
     dataStr := strings.TrimPrefix(line, "data: ")
+    if dataStr == "[DONE]" {
+        continue
+    }
     var data map[string]interface{}
     if err := json.Unmarshal([]byte(dataStr), &data); err != nil {
         return nil, err
     }
```
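A standalone version of the skip-[DONE] idea, runnable in isolation (parseSSEData is an illustrative helper, not the repo's actual test helper):

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// parseSSEData collects the JSON payload of each "data:" line, skipping the
// OpenAI-style "[DONE]" sentinel instead of failing to unmarshal it.
func parseSSEData(stream string) ([]map[string]interface{}, error) {
	var events []map[string]interface{}
	sc := bufio.NewScanner(strings.NewReader(stream))
	for sc.Scan() {
		line := sc.Text()
		if !strings.HasPrefix(line, "data: ") {
			continue
		}
		payload := strings.TrimPrefix(line, "data: ")
		if payload == "[DONE]" {
			continue
		}
		var ev map[string]interface{}
		if err := json.Unmarshal([]byte(payload), &ev); err != nil {
			return nil, err
		}
		events = append(events, ev)
	}
	return events, sc.Err()
}

func main() {
	stream := "data: {\"type\":\"ping\"}\n\ndata: [DONE]\n"
	events, err := parseSSEData(stream)
	fmt.Println(len(events), err) // 1 <nil>
}
```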
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
- docs/content/integrations/frontend/claude-code.md (1 hunks)
- examples/claude-code-ollama/olla.yaml (1 hunks)
- examples/crush-vllm/olla.yaml (1 hunks)
- internal/adapter/translator/anthropic/integration_test.go (1 hunks)
- internal/adapter/translator/anthropic/response.go (1 hunks)
- internal/adapter/translator/anthropic/response_test.go (1 hunks)
- internal/adapter/translator/anthropic/streaming.go (1 hunks)
- internal/adapter/translator/anthropic/streaming_test.go (1 hunks)
- internal/adapter/translator/anthropic/streaming_test_helpers.go (1 hunks)
- internal/adapter/translator/anthropic/token_count.go (1 hunks)
- internal/adapter/translator/anthropic/token_count_test.go (1 hunks)
- internal/adapter/translator/registry.go (1 hunks)
- internal/app/handlers/handler_translation.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- internal/adapter/translator/anthropic/token_count_test.go
- examples/claude-code-ollama/olla.yaml
- internal/adapter/translator/anthropic/streaming.go
🧰 Additional context used
📓 Path-based instructions (2)
{internal,pkg}/**/*_test.go
📄 CodeRabbit inference engine (CLAUDE.md)
Include Go benchmarks (Benchmark* functions) for critical paths, proxy engine comparisons, pooling efficiency, and circuit breaker behaviour
Files:
- internal/adapter/translator/anthropic/streaming_test.go
- internal/adapter/translator/anthropic/response_test.go
- internal/adapter/translator/anthropic/integration_test.go
internal/app/handlers/*.go
📄 CodeRabbit inference engine (CLAUDE.md)
Set response headers on proxy responses:
X-Olla-Endpoint, X-Olla-Model, X-Olla-Backend-Type, X-Olla-Request-ID, X-Olla-Response-Time
Files:
internal/app/handlers/handler_translation.go
🧠 Learnings (3)
📚 Learning: 2025-09-23T08:30:20.366Z
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-23T08:30:20.366Z
Learning: Applies to internal/app/handlers/*.go : Set response headers on proxy responses: `X-Olla-Endpoint`, `X-Olla-Model`, `X-Olla-Backend-Type`, `X-Olla-Request-ID`, `X-Olla-Response-Time`
Applied to files:
internal/app/handlers/handler_translation.go
📚 Learning: 2025-09-23T08:30:20.366Z
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-23T08:30:20.366Z
Learning: Applies to internal/adapter/proxy/**/service.go : Proxy services must include response headers: `X-Olla-Endpoint`, `X-Olla-Model`, `X-Olla-Backend-Type`, `X-Olla-Request-ID`, `X-Olla-Response-Time`
Applied to files:
internal/app/handlers/handler_translation.go
📚 Learning: 2025-09-23T08:30:20.366Z
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-23T08:30:20.366Z
Learning: Applies to {internal,pkg}/**/*_test.go : Include Go benchmarks (Benchmark* functions) for critical paths, proxy engine comparisons, pooling efficiency, and circuit breaker behaviour
Applied to files:
internal/adapter/translator/anthropic/integration_test.go
🧬 Code graph analysis (7)
internal/adapter/translator/anthropic/token_count.go (3)
- internal/adapter/translator/anthropic/translator.go (1): Translator (16-21)
- internal/adapter/translator/types.go (1): TokenCountResponse (58-62)
- internal/adapter/translator/anthropic/types.go (3): AnthropicRequest (7-21), ContentBlock (60-69), AnthropicMessage (53-56)

internal/adapter/translator/anthropic/response.go (3)
- internal/adapter/translator/anthropic/translator.go (1): Translator (16-21)
- internal/adapter/translator/anthropic/types.go (3): AnthropicResponse (106-115), ContentBlock (60-69), AnthropicUsage (118-121)
- internal/util/parsing.go (2): TruncateString (23-34), DefaultTruncateLengthPII (8-8)

internal/adapter/translator/anthropic/response_test.go (5)
- internal/logger/styled.go (1): StyledLogger (12-35)
- internal/config/types.go (1): Config (12-21)
- internal/logger/styled_plain.go (1): NewPlainStyledLogger (18-22)
- internal/adapter/translator/anthropic/translator.go (1): NewTranslator (26-51)
- internal/adapter/translator/anthropic/types.go (2): AnthropicResponse (106-115), ContentBlock (60-69)

internal/app/handlers/handler_translation.go (6)
- internal/app/handlers/application.go (1): Application (71-88)
- internal/adapter/translator/types.go (3): RequestTranslator (11-26), ErrorWriter (47-49), TokenCounter (53-55)
- internal/util/request.go (1): StripPrefix (74-83)
- internal/core/constants/endpoint.go (1): DefaultOllaProxyPathPrefix (5-5)
- internal/core/constants/content.go (10): HeaderContentType (57-57), ContentTypeJSON (7-7), HeaderXOllaModel (104-104), HeaderXOllaRequestID (101-101), HeaderXOllaEndpoint (102-102), HeaderXOllaBackendType (103-103), HeaderXOllaResponseTime (105-105), HeaderXOllaRoutingStrategy (106-106), HeaderXOllaRoutingDecision (107-107), HeaderXOllaRoutingReason (108-108)
- internal/core/domain/endpoint.go (1): Endpoint (19-39)

internal/adapter/translator/anthropic/integration_test.go (5)
- internal/logger/styled.go (1): StyledLogger (12-35)
- internal/config/types.go (2): Config (12-21), AnthropicTranslatorConfig (176-179)
- internal/logger/styled_plain.go (1): NewPlainStyledLogger (18-22)
- internal/adapter/translator/anthropic/types.go (5): AnthropicRequest (7-21), AnthropicMessage (53-56), AnthropicResponse (106-115), AnthropicTool (82-86), ContentBlock (60-69)
- internal/adapter/translator/anthropic/translator.go (1): NewTranslator (26-51)

internal/adapter/translator/registry.go (2)
- internal/adapter/translator/types.go (1): RequestTranslator (11-26)
- internal/logger/styled.go (1): StyledLogger (12-35)

internal/adapter/translator/anthropic/streaming_test_helpers.go (3)
- internal/adapter/translator/anthropic/translator.go (2): Translator (16-21), NewTranslator (26-51)
- internal/config/types.go (2): Config (12-21), AnthropicTranslatorConfig (176-179)
- internal/logger/styled_plain.go (1): NewPlainStyledLogger (18-22)
🪛 LanguageTool
docs/content/integrations/frontend/claude-code.md
[grammar] ~342-~342: A determiner may be missing.
Context: ...t-connections**: Routes to backend with fewest active requests **Timeout Configuratio...
(THE_SUPERLATIVE)
[misspelling] ~539-~539: This word is normally spelled as one.
Context: ...soning abilities - qwen3:32b - Strong multi-task performance **Performance vs Quality T...
(EN_COMPOUNDS_MULTI_TASK)
[uncategorized] ~591-~591: Possible missing preposition found.
Context: ...og ### No Models Available **List models from Olla**:bash curl http://localh...
(AI_HYDRA_LEO_MISSING_OF)
[grammar] ~845-~845: A determiner may be missing.
Context: ... Workflow - Local-first: Configure highest priority for local backends - **Fallbac...
(THE_SUPERLATIVE)
🪛 markdownlint-cli2 (0.18.1)
docs/content/integrations/frontend/claude-code.md
32-32: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
101-101: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
275-275: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
290-290: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
🔇 Additional comments (8)
internal/adapter/translator/anthropic/integration_test.go (1)
1150-1598: Benchmarks for hot paths look good.
Good coverage for typical request/response shapes; alloc tracking and reset timing are correct. LGTM.
internal/adapter/translator/anthropic/response_test.go (2)
901-915: Benchmark added for TransformResponse hot path — aligns with repo guidance.
Good to have allocation tracking here. LGTM.
7-15: Testify version requirement satisfied.
The go.mod file confirms testify is pinned to v1.11.1, which meets the ≥ v1.11.1 requirement. No further action needed.
internal/adapter/translator/registry.go (1)
26-56: Registration guards and overwrite warning look solid.
Nil translator and empty-name cases handled; sorted name reporting is helpful. LGTM.
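For readers outside the diff, the guards praised here can be sketched in a minimal, self-contained form (the names and func() value type are illustrative stand-ins, not the repo's RequestTranslator API):

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Registry rejects nil and empty-name registrations and reports overwrites
// so the caller can log a warning.
type Registry struct {
	mu          sync.RWMutex
	translators map[string]func() // stand-in for a translator interface
}

func NewRegistry() *Registry {
	return &Registry{translators: make(map[string]func())}
}

func (r *Registry) Register(name string, t func()) (overwrote bool, err error) {
	if name == "" {
		return false, fmt.Errorf("translator name must not be empty")
	}
	if t == nil {
		return false, fmt.Errorf("translator %q must not be nil", name)
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	_, overwrote = r.translators[name]
	r.translators[name] = t
	return overwrote, nil
}

func (r *Registry) Names() []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	names := make([]string, 0, len(r.translators))
	for n := range r.translators {
		names = append(names, n)
	}
	sort.Strings(names) // sorted names keep log output deterministic
	return names
}

func main() {
	reg := NewRegistry()
	reg.Register("anthropic", func() {})
	overwrote, _ := reg.Register("anthropic", func() {})
	fmt.Println(overwrote, reg.Names()) // true [anthropic]
}
```

Sorting the names before reporting is what makes the registry's log lines stable across runs, which is the detail the review calls out as helpful.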
docs/content/integrations/frontend/claude-code.md (1)
14-24: The suggested environment variable names do not align with Claude Code's actual configuration.
Based on the web search results, the original review comment contains inaccurate recommendations:
- ANTHROPIC_API_BASE_URL: Claude Code does not support this variable name. For proxy providers, use provider-specific variables such as ANTHROPIC_BEDROCK_BASE_URL or ANTHROPIC_VERTEX_BASE_URL instead.
- ANTHROPIC_AUTH_TOKEN: This variable is valid for proxy authentication and remains appropriate for proxy scenarios.
- ANTHROPIC_API_KEY: This is the correct variable name for configuring the API key.
The standardisation suggested in the review comment should not be applied as written. Before making changes, verify what configuration the examples in the documentation are meant to demonstrate—whether they target the main Anthropic API or a specific proxy provider.
Likely an incorrect or invalid review comment.
examples/crush-vllm/olla.yaml (1)
41-47: Anthropic translator config looks correct.
Only supported fields are present and match the code config struct.
internal/adapter/translator/anthropic/streaming_test_helpers.go (1)
22-37: LGTM – concise test translator config.
Helper uses minimal, realistic config and error‑level logging.
internal/app/handlers/handler_translation.go (1)
39-41: Request body reuse across retries is correctly handled.
The concern in your review is valid in principle: bytes.NewReader() cannot be re-read after EOF without seeking. However, the retry implementation already addresses this.
When the request reaches the proxy service, the RetryHandler explicitly preserves the request body by reading it into memory (io.ReadAll) during the first call to preserveRequestBody(). For each retry attempt, resetRequestBodyForRetry() creates a fresh bytes.NewReader from the preserved byte slice, ensuring each attempt starts with a new readable reader at position 0.
The code at handler_translation.go lines 39–41 correctly creates the initial body as io.NopCloser(bytes.NewReader(openaiBody)), and the retry logic in core/retry.go properly handles body preservation and recreation across all retry attempts.
```go
	randomBytes := make([]byte, 16)
	if _, err := rand.Read(randomBytes); err != nil {
		// crypto/rand shouldn't fail but just in case
		t.logger.Warn("Failed to generate random bytes for message ID", "error", err)
		return fmt.Sprintf("msg_01fallback%d", big.NewInt(0).SetBytes(randomBytes[:8]).Uint64())
	}

	encoded := encodeBase58(randomBytes)
	return fmt.Sprintf("msg_01%s", encoded)
}
```
Avoid ID collisions in fallback path.
If crypto/rand fails, current fallback can repeat across requests. Use time-based uniqueness.
Apply:
```diff
 func (t *Translator) generateMessageID() string {
     randomBytes := make([]byte, 16)
     if _, err := rand.Read(randomBytes); err != nil {
         // crypto/rand shouldn't fail but just in case
         t.logger.Warn("Failed to generate random bytes for message ID", "error", err)
-        return fmt.Sprintf("msg_01fallback%d", big.NewInt(0).SetBytes(randomBytes[:8]).Uint64())
+        return fmt.Sprintf("msg_01fallback_%d", time.Now().UnixNano())
     }
     encoded := encodeBase58(randomBytes)
     return fmt.Sprintf("msg_01%s", encoded)
 }
```

Also add import:
```diff
+    "time"
```

🤖 Prompt for AI Agents
internal/adapter/translator/anthropic/response.go around lines 194-203: the
fallback path when crypto/rand fails can produce repeating IDs (not unique);
change the fallback to include a time-based uniqueness (e.g., append or combine
time.Now().UnixNano() or a timestamp-encoded value with the existing
bytes/encoding) so each fallback ID is unique across requests, and add the time
import to the file's imports.
```go
	// bounded read to prevent OOM attacks
	// read up to maxMessageSize + 1 to detect oversized requests
	limitedReader := io.LimitReader(r.Body, t.maxMessageSize+1)
	body, err := io.ReadAll(limitedReader)
	if err != nil {
		// close the original body even on error
		_ = r.Body.Close()
		return nil, fmt.Errorf("failed to read request body: %w", err)
	}

	// detect oversized requests
	if int64(len(body)) > t.maxMessageSize {
		_ = r.Body.Close()
		return nil, fmt.Errorf("request body exceeds maximum size of %d bytes", t.maxMessageSize)
	}

	// reset body for downstream handlers to re-read
	r.Body = io.NopCloser(bytes.NewReader(body))
```
Close the original request body before replacing it to avoid leaking the underlying reader.
On success, the original r.Body is never closed, then replaced with a new NopCloser. That can leak the underlying connection/FD.
Apply:
```diff
 func (t *Translator) CountTokens(ctx context.Context, r *http.Request) (*translator.TokenCountResponse, error) {
     // bounded read to prevent OOM attacks
     // read up to maxMessageSize + 1 to detect oversized requests
-    limitedReader := io.LimitReader(r.Body, t.maxMessageSize+1)
+    origBody := r.Body
+    limitedReader := io.LimitReader(origBody, t.maxMessageSize+1)
     body, err := io.ReadAll(limitedReader)
     if err != nil {
-        // close the original body even on error
-        _ = r.Body.Close()
+        _ = origBody.Close()
         return nil, fmt.Errorf("failed to read request body: %w", err)
     }
     // detect oversized requests
     if int64(len(body)) > t.maxMessageSize {
-        _ = r.Body.Close()
+        _ = origBody.Close()
         return nil, fmt.Errorf("request body exceeds maximum size of %d bytes", t.maxMessageSize)
     }
     // reset body for downstream handlers to re-read
-    r.Body = io.NopCloser(bytes.NewReader(body))
+    _ = origBody.Close() // release original before replacing
+    r.Body = io.NopCloser(bytes.NewReader(body))
```
internal/adapter/translator/anthropic/token_count.go around lines 16 to 34:
after successfully reading the bounded body you must close the original r.Body
before replacing it to avoid leaking the underlying reader; change the flow so
that once io.ReadAll succeeds and the size check passes you call _ =
r.Body.Close() (or defer a close right after a successful read and before early
returns) and then set r.Body = io.NopCloser(bytes.NewReader(body)); ensure the
original body is closed on all success and error paths (already closed on
errors) so no file descriptor/connection is leaked.
```go
	w.Header().Set(constants.HeaderContentType, constants.ContentTypeJSON)
	a.copyOllaHeaders(recorder, w)
```
🛠️ Refactor suggestion | 🟠 Major
Always set X‑Olla‑Model on success.
Mirror error/streaming paths and set the fallback header here too. As per coding guidelines.
Apply:
```diff
 w.Header().Set(constants.HeaderContentType, constants.ContentTypeJSON)
 a.copyOllaHeaders(recorder, w)
+a.setModelHeaderIfMissing(w, pr.model)
```

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In internal/app/handlers/handler_translation.go around lines 260-262, the
success path sets Content-Type and copies Olla headers but fails to ensure the
X-Olla-Model header is present; mirror the error/streaming paths by explicitly
setting X-Olla-Model on the response: check the recorder or copied headers for
an X-Olla-Model value and if empty set w.Header().Set(constants.HeaderOllaModel,
constants.OllaModelFallback) (or the existing fallback constant used elsewhere),
so the response always includes X-Olla-Model on success.
update token_count and translator strategies
feat: anthropic / message logger (development only)
Actionable comments posted: 3
♻️ Duplicate comments (4)
internal/adapter/translator/anthropic/response.go (3)
151-162: Still logging raw args preview; normalise/rename to avoid PII leakage.
You reintroduced a preview under key raw_arguments. Either drop it entirely or rename to raw_arguments_preview and normalise whitespace. Example change below keeps only a short, normalised preview plus length. This avoids accidental full-arg logging and newline injection.
```diff
@@
-        t.logger.Warn("Failed to parse tool arguments, using empty input",
-            "tool", name,
-            "tool_id", id,
-            "error", err,
-            "raw_arguments", util.TruncateString(argsStr, util.DefaultTruncateLengthPII),
-            "raw_arguments_len", len(argsStr))
+        // Normalise newlines and log only a short redacted preview + length
+        preview := util.TruncateString(strings.ReplaceAll(strings.ReplaceAll(argsStr, "\n", " "), "\r", " "), util.DefaultTruncateLengthPII)
+        t.logger.Warn("Failed to parse tool arguments; using empty input",
+            "tool", name,
+            "tool_id", id,
+            "error", err,
+            "raw_arguments_preview", preview,
+            "raw_arguments_len", len(argsStr))
```

Also add the import:
```diff
 import (
     "context"
     "crypto/rand"
     "encoding/json"
     "fmt"
     "math/big"
     "net/http"
+    "strings"
```
215-215: Typo in comment.
“ambigious” → “ambiguous”.
```diff
-// encode bytes to base58, shorter and less ambigious than hex
+// encode bytes to base58, shorter and less ambiguous than hex
```
204-213: Fallback message IDs can collide; make the fallback time-based.
If crypto/rand fails, the current path can emit the same ID across requests. Use a timestamp-based fallback.
```diff
 import (
     "context"
     "crypto/rand"
     "encoding/json"
     "fmt"
     "math/big"
     "net/http"
+    "time"
@@
     if _, err := rand.Read(randomBytes); err != nil {
         // crypto/rand shouldn't fail but just in case
         t.logger.Warn("Failed to generate random bytes for message ID", "error", err)
-        return fmt.Sprintf("msg_01fallback%d", big.NewInt(0).SetBytes(randomBytes[:8]).Uint64())
+        return fmt.Sprintf("msg_01fallback_%d", time.Now().UnixNano())
     }
```

internal/adapter/translator/anthropic/token_count.go (1)
16-34: Close original body before resetting to avoid leaks.
On success, the original r.Body isn’t closed before replacement.
```diff
-    limitedReader := io.LimitReader(r.Body, t.maxMessageSize+1)
-    body, err := io.ReadAll(limitedReader)
+    origBody := r.Body
+    limitedReader := io.LimitReader(origBody, t.maxMessageSize+1)
+    body, err := io.ReadAll(limitedReader)
     if err != nil {
-        // close the original body even on error
-        _ = r.Body.Close()
+        _ = origBody.Close()
         return nil, fmt.Errorf("failed to read request body: %w", err)
     }
     // detect oversized requests
     if int64(len(body)) > t.maxMessageSize {
-        _ = r.Body.Close()
+        _ = origBody.Close()
         return nil, fmt.Errorf("request body exceeds maximum size of %d bytes", t.maxMessageSize)
     }
     // reset body for downstream handlers to re-read
-    r.Body = io.NopCloser(bytes.NewReader(body))
+    _ = origBody.Close()
+    r.Body = io.NopCloser(bytes.NewReader(body))
```
🧹 Nitpick comments (17)
docs/content/notes/anthropic-inspector.md (4)
52-64: Align config snippet with shipped defaults (path and safety).
Docs show output_dir "logs/inspector", while config/config.yaml uses "logs/inspector/anthropic" and inspector is off by default. Align to avoid confusion and accidental enablement.
Apply:
```diff
-inspector:
-  enabled: true
-  output_dir: "logs/inspector"
-  session_header: "X-Session-ID"
+inspector:
+  enabled: false
+  output_dir: "logs/inspector/anthropic"
+  session_header: "X-Session-ID"
```
81-85: Use the same output_dir in examples.
Examples reference logs/inspector/... but config uses logs/inspector/anthropic. Unify to prevent users tailing the wrong path.
Apply:
```diff
-cat logs/inspector/2025-10-21/debug-session.jsonl | jq 'select(.type == "request") | .body'
+cat logs/inspector/anthropic/2025-10-21/debug-session.jsonl | jq 'select(.type == "request") | .body'
-cat logs/inspector/2025-10-21/debug-session.jsonl | jq 'select(.type == "response") | .body.content[0].text'
+cat logs/inspector/anthropic/2025-10-21/debug-session.jsonl | jq 'select(.type == "response") | .body.content[0].text'
-jq 'select(.type == "response") | .body.content[0].text' logs/inspector/2025-10-21/compare-qwen.jsonl > qwen.txt
-jq 'select(.type == "response") | .body.content[0].text' logs/inspector/2025-10-21/compare-glm.jsonl > glm.txt
+jq 'select(.type == "response") | .body.content[0].text' logs/inspector/anthropic/2025-10-21/compare-qwen.jsonl > qwen.txt
+jq 'select(.type == "response") | .body.content[0].text' logs/inspector/anthropic/2025-10-21/compare-glm.jsonl > glm.txt
-chmod 700 logs/inspector/
-chown olla:olla logs/inspector/
+chmod 700 logs/inspector/anthropic/
+chown olla:olla logs/inspector/anthropic/
```

Also applies to: 90-98, 102-106, 112-115, 119-121
27-49: Add log retention/rotation guidance.
Inspector dumps sensitive JSONL; recommend retention and rotation to reduce risk.
Consider adding:
- Retention: delete files older than N days.
- Rotation: daily directories already used; suggest logrotate/systemd-tmpfiles snippet.
- Encryption at rest if logs leave dev machines.
I can add a short section with sample logrotate and tmpfiles rules on request.
20-26: Calibrate “Minimal performance impact”.
Disk I/O + JSON encoding + serialised writes add overhead. Suggest “Low additional overhead for development; not optimised for prod.” to set expectations.
internal/adapter/inspector/simple.go (4)
122-127: Emit UTC timestamps to match docs and ease analysis.
Current code uses local time; docs and many tools expect Zulu.
Apply:
```diff
-    Timestamp: time.Now().Format(time.RFC3339),
+    Timestamp: time.Now().UTC().Format(time.RFC3339),
```

and

```diff
-    Timestamp: time.Now().Format(time.RFC3339),
+    Timestamp: time.Now().UTC().Format(time.RFC3339),
```

Also applies to: 140-144
94-101: Strengthen ancestor check; avoid string HasPrefix.
Using Clean+HasPrefix is brittle for path ancestry checks. Prefer filepath.Rel.
Apply:
```diff
-    // try and ensure the resolved path is within the output directory
-    // uses Clean to normalise paths and prevent ../ bypasses
-    if !strings.HasPrefix(filepath.Clean(absTestPath), filepath.Clean(absOutputDir)) {
-        return "", fmt.Errorf("session ID would escape output directory")
-    }
+    // ensure resolved path stays within outputDir
+    rel, err := filepath.Rel(absOutputDir, absTestPath)
+    if err != nil || filepath.IsAbs(rel) || strings.HasPrefix(rel, "..") {
+        return "", fmt.Errorf("session ID would escape output directory")
+    }
```
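Why filepath.Rel is the safer primitive can be shown with a standalone check (within is an illustrative helper; paths below assume a Unix separator):

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// within reports whether target resolves inside root. A naive string
// HasPrefix check wrongly treats "/tmp/out-evil" as inside "/tmp/out";
// filepath.Rel cleans both paths and exposes escapes as "..".
func within(root, target string) bool {
	rel, err := filepath.Rel(root, target)
	if err != nil || filepath.IsAbs(rel) {
		return false
	}
	return rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

func main() {
	fmt.Println(within("/tmp/out", "/tmp/out/2025/session.jsonl")) // true
	fmt.Println(within("/tmp/out", "/tmp/out/../secrets"))         // false
	fmt.Println(within("/tmp/out", "/tmp/out-evil/x"))             // false
}
```

The third case is the one the string-prefix version gets wrong, which is exactly the "../ bypass" class of bug the review is guarding against.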
168-173: Typo in comment.
“owtner” → “owner”. No behaviour change.
Apply:
```diff
-    // Use 0700 permissions - owtner only access (not world-readable)
+    // Use 0700 permissions - owner only access (not world-readable)
```
46-57: Optional: guard against nil logger.
If log is nil, methods will panic. Default to a no‑op/plain logger.
I can wire this with slog.Default() if desired; let me push a tiny patch on request.
internal/adapter/inspector/simple_test.go (2)
1-509: Add micro-benchmarks for inspector logging (per guidelines).
Tests are good; please include Benchmark* for logging hot paths.
Append:
```diff
+func BenchmarkSimpleInspector_LogRequest(b *testing.B) {
+    tmpDir := b.TempDir()
+    log := logger.NewPlainStyledLogger(slog.Default())
+    inspector := NewSimple(true, tmpDir, "X-Session-ID", log)
+    body := []byte(`{"bench":"req"}`)
+    b.ResetTimer()
+    for i := 0; i < b.N; i++ {
+        _ = inspector.LogRequest("bench-session", "bench-model", body)
+    }
+}
+
+func BenchmarkSimpleInspector_LogResponse(b *testing.B) {
+    tmpDir := b.TempDir()
+    log := logger.NewPlainStyledLogger(slog.Default())
+    inspector := NewSimple(true, tmpDir, "X-Session-ID", log)
+    body := []byte(`{"bench":"resp"}`)
+    b.ResetTimer()
+    for i := 0; i < b.N; i++ {
+        _ = inspector.LogResponse("bench-session", body)
+    }
+}
```

As per coding guidelines.
471-509: Minor: avoid mutating private fields in tests.
Directly toggling warningLogged couples tests to internals. Better to assert one‑time behaviour without resetting, or expose a test‑only getter.
Happy to adjust tests after the thread‑safe warning change.
internal/adapter/translator/anthropic/response.go (2)
88-95: Handle array-style message content from OpenAI.
OpenAI may return content as an array of parts. Concatenate text parts so you don’t drop content.
```diff
@@ func (t *Translator) convertResponseContent(message map[string]interface{}, finishReason string) ([]ContentBlock, string) {
     // Handle text content
     if textContent, ok := message["content"].(string); ok && textContent != "" {
         content = append(content, ContentBlock{
             Type: contentTypeText,
             Text: textContent,
         })
     }
+    // Handle array content parts (pick up text blocks)
+    if parts, ok := message["content"].([]interface{}); ok && len(parts) > 0 {
+        var sb strings.Builder
+        for _, p := range parts {
+            if pm, ok := p.(map[string]interface{}); ok {
+                if pt, _ := pm["type"].(string); pt == contentTypeText {
+                    if tx, _ := pm["text"].(string); tx != "" {
+                        sb.WriteString(tx)
+                    }
+                }
+            }
+        }
+        if sb.Len() > 0 {
+            content = append(content, ContentBlock{Type: contentTypeText, Text: sb.String()})
+        }
+    }
```
125-137: Cover additional finish reasons.
Consider mapping content_filter (and similar) explicitly to end_turn to avoid surprises.
```diff
 switch finishReason {
 case "stop":
     return "end_turn"
 case "tool_calls":
     return contentTypeToolUse
 case "length":
     return "max_tokens"
+case "content_filter":
+    return "end_turn"
 default:
     return "end_turn"
 }
```

internal/adapter/translator/anthropic/token_count.go (1)
35-38: Prefer strict JSON decoding for parity with TransformRequest.
Use json.Decoder with DisallowUnknownFields to keep behaviour consistent and surface unknown-field errors.
```diff
-    var req AnthropicRequest
-    if err := json.Unmarshal(body, &req); err != nil {
+    var req AnthropicRequest
+    dec := json.NewDecoder(bytes.NewReader(body))
+    dec.DisallowUnknownFields()
+    if err := dec.Decode(&req); err != nil {
         return nil, fmt.Errorf("failed to parse request: %w", err)
     }
```
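The behavioural difference is easy to demonstrate in isolation (anthropicRequest below is a trimmed stand-in for the real AnthropicRequest struct):

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

type anthropicRequest struct { // trimmed stand-in, not the repo's type
	Model     string `json:"model"`
	MaxTokens int    `json:"max_tokens"`
}

// decodeStrict surfaces unknown fields as errors, where plain json.Unmarshal
// would silently drop them.
func decodeStrict(body []byte) (*anthropicRequest, error) {
	var req anthropicRequest
	dec := json.NewDecoder(bytes.NewReader(body))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&req); err != nil {
		return nil, fmt.Errorf("failed to parse request: %w", err)
	}
	return &req, nil
}

func main() {
	// the misspelled "tempreature" is rejected instead of being swallowed
	_, err := decodeStrict([]byte(`{"model":"claude","max_tokens":100,"tempreature":0.7}`))
	fmt.Println(err != nil) // true
}
```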
16-27: Improve oversize detection UX.
LimitReader will truncate and yield a JSON parse error (“unexpected EOF”) rather than a clear “too large” error. If Content-Length is present, reject early; otherwise use a LimitedReader with limit+1 and check post-read.
```diff
-    limitedBody := io.LimitReader(r.Body, t.maxMessageSize)
+    if r.ContentLength > 0 && r.ContentLength > t.maxMessageSize {
+        return nil, fmt.Errorf("request body exceeds maximum size of %d bytes", t.maxMessageSize)
+    }
+    limitedBody := io.LimitReader(r.Body, t.maxMessageSize+1)
```
263-264: Consider explicit handling of missing index.
The type assertion silently defaults to 0 if the index field is missing or not a float64. Whilst this might work in practice, it could mask malformed tool call deltas.

```diff
-    index, _ := toolCall["index"].(float64)
-    toolIndex := int(index)
+    index, ok := toolCall["index"].(float64)
+    if !ok {
+        return nil, false // require valid index
+    }
+    toolIndex := int(index)
```
335-335: Inconsistent error wrapping for flush operations.
Lines 335 and 353 return rc.Flush() directly, whilst other locations (e.g., lines 60, 176, 199, 222, 238, 298, 401, 450, 460) wrap flush errors with context using fmt.Errorf("flush failed: %w", err). Consistent error messages improve debuggability.
Apply this diff to lines 335 and 353:
```diff
-    return rc.Flush()
+    if err := rc.Flush(); err != nil {
+        return fmt.Errorf("flush failed: %w", err)
+    }
+    return nil
```

Also applies to: 353-353
407-431: Add bounds checking for defensive programming.
Line 413 accesses state.contentBlocks[blockIndex] without verifying that blockIndex is within bounds. Whilst the internal mapping should ensure this is safe, adding a bounds check would prevent potential panics if the state becomes inconsistent.

```diff
 for toolIndex, builder := range state.toolCallBuffers {
     argsJSON := builder.String()
     if argsJSON != "" {
         var input map[string]interface{}
         if err := json.Unmarshal([]byte(argsJSON), &input); err == nil {
             // use mapping to find the correct block, avoids linear search
             if blockIndex, found := state.toolIndexToBlock[toolIndex]; found {
+                // bounds check before accessing slice
+                if blockIndex < 0 || blockIndex >= len(state.contentBlocks) {
+                    t.logger.Error("Block index out of bounds",
+                        "tool_index", toolIndex,
+                        "block_index", blockIndex,
+                        "blocks_len", len(state.contentBlocks))
+                    continue
+                }
                 // validate block type before updating to catch any state inconsistencies
                 if state.contentBlocks[blockIndex].Type != contentTypeToolUse {
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
config/config.yaml(2 hunks)docs/content/notes/anthropic-inspector.md(1 hunks)docs/content/notes/overview.md(1 hunks)docs/mkdocs.yml(5 hunks)internal/adapter/inspector/simple.go(1 hunks)internal/adapter/inspector/simple_test.go(1 hunks)internal/adapter/translator/anthropic/constants.go(1 hunks)internal/adapter/translator/anthropic/request.go(1 hunks)internal/adapter/translator/anthropic/response.go(1 hunks)internal/adapter/translator/anthropic/streaming.go(1 hunks)internal/adapter/translator/anthropic/token_count.go(1 hunks)internal/adapter/translator/anthropic/translator.go(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- docs/content/notes/overview.md
🧰 Additional context used
📓 Path-based instructions (1)
{internal,pkg}/**/*_test.go
📄 CodeRabbit inference engine (CLAUDE.md)
Include Go benchmarks (Benchmark* functions) for critical paths, proxy engine comparisons, pooling efficiency, and circuit breaker behaviour
Files:
internal/adapter/inspector/simple_test.go
🧬 Code graph analysis (7)
internal/adapter/translator/anthropic/token_count.go (3)
- internal/adapter/translator/anthropic/translator.go (1): Translator (17-23)
- internal/adapter/translator/types.go (1): TokenCountResponse (58-62)
- internal/adapter/translator/anthropic/types.go (3): AnthropicRequest (7-21), ContentBlock (60-69), AnthropicMessage (53-56)

internal/adapter/inspector/simple_test.go (2)
- internal/logger/styled_plain.go (1): NewPlainStyledLogger (18-22)
- internal/adapter/inspector/simple.go (2): NewSimple (47-58), Entry (17-22)

internal/adapter/translator/anthropic/request.go (3)
- internal/adapter/translator/anthropic/translator.go (1): Translator (17-23)
- internal/adapter/translator/types.go (1): TransformedRequest (30-37)
- internal/adapter/translator/anthropic/types.go (3): AnthropicRequest (7-21), AnthropicMessage (53-56), ContentBlock (60-69)

internal/adapter/translator/anthropic/translator.go (6)
- internal/logger/styled.go (1): StyledLogger (12-35)
- pkg/pool/lite_pool.go (2): Pool (39-42), NewLitePool (44-68)
- internal/adapter/inspector/simple.go (2): Simple (26-33), NewSimple (47-58)
- internal/config/types.go (1): AnthropicTranslatorConfig (179-183)
- internal/adapter/translator/anthropic/types.go (1): ContentBlock (60-69)
- internal/core/constants/content.go (2): HeaderContentType (57-57), ContentTypeJSON (7-7)

internal/adapter/inspector/simple.go (1)
- internal/logger/styled.go (1): StyledLogger (12-35)

internal/adapter/translator/anthropic/response.go (3)
- internal/adapter/translator/anthropic/translator.go (1): Translator (17-23)
- internal/adapter/translator/anthropic/types.go (3): AnthropicResponse (106-115), ContentBlock (60-69), AnthropicUsage (118-121)
- internal/util/parsing.go (2): TruncateString (23-34), DefaultTruncateLengthPII (8-8)

internal/adapter/translator/anthropic/streaming.go (4)
- internal/adapter/translator/anthropic/types.go (3): ContentBlock (60-69), AnthropicResponse (106-115), AnthropicUsage (118-121)
- internal/adapter/translator/anthropic/translator.go (1): Translator (17-23)
- internal/core/constants/content.go (1): HeaderContentType (57-57)
- internal/util/parsing.go (2): TruncateString (23-34), DefaultTruncateLengthPII (8-8)
🪛 LanguageTool
docs/content/notes/anthropic-inspector.md
[style] ~12-~12: Using many exclamation marks might seem excessive (in this case: 3 exclamation marks for a text that’s 1754 characters long)
Context: ...s and troubleshoot integration issues. !!! danger "DEVELOPMENT USE ONLY - DO NOT E...
(EN_EXCESSIVE_EXCLAMATION)
🪛 markdownlint-cli2 (0.18.1)
docs/content/notes/anthropic-inspector.md
29-29: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
54-54: Code block style
Expected: indented; Actual: fenced
(MD046, code-block-style)
70-70: Code block style
Expected: indented; Actual: fenced
(MD046, code-block-style)
81-81: Code block style
Expected: indented; Actual: fenced
(MD046, code-block-style)
90-90: Code block style
Expected: indented; Actual: fenced
(MD046, code-block-style)
96-96: Code block style
Expected: indented; Actual: fenced
(MD046, code-block-style)
102-102: Code block style
Expected: indented; Actual: fenced
(MD046, code-block-style)
112-112: Code block style
Expected: indented; Actual: fenced
(MD046, code-block-style)
119-119: Code block style
Expected: indented; Actual: fenced
(MD046, code-block-style)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Cross-Platform Build Validation (windows-latest)
🔇 Additional comments (10)
docs/mkdocs.yml (1)
111-195: Nav targets verified: all files present. Verification confirmed that all 63 nav entries in `docs/mkdocs.yml` reference existing files in `docs/content/`. No broken links or missing documentation files detected.

internal/adapter/translator/anthropic/translator.go (1)

84-90: `GetSessionHeader()` is properly exposed and exists. The inspector's `GetSessionHeader()` method is publicly defined in `internal/adapter/inspector/simple.go` (line 202) and returns the configured session header. This is not tight coupling to internals; it is legitimate use of a public API. The code compiles and functions correctly. The verification concern does not apply.

Likely an incorrect or invalid review comment.
internal/adapter/translator/anthropic/streaming.go (8)
16-29: LGTM! The `StreamingState` structure is well-designed and addresses past review feedback. Using `map[int]` for tool call buffers and maintaining the `toolIndexToBlock` mapping ensures correct tool argument association.
32-71: LGTM! The entry point correctly sets up SSE headers, initializes state, and ensures `message_start` is sent even for empty streams. Good defensive programming.
74-100: Good scanner configuration and resilient error handling.The increased buffer size (1 MiB max) correctly addresses the "token too long" issue from past reviews. Continuing on line processing errors (line 91) is a reasonable trade-off for streaming resilience, though it means some stream corruption may go unnoticed by the caller.
103-167: LGTM! Line processing correctly addresses the privacy concern from past reviews by logging truncated data and length. Token extraction and routing logic are sound.
184-246: LGTM! Content delta handling correctly manages block lifecycle, closing previous blocks of different types before starting new ones. State updates are consistent.
357-389: LGTM! Tool call delta handling correctly initializes buffers before use (lines 369-371), ensuring subsequent operations are safe. Block initialization and argument buffering logic are sound.
392-471: Finalisation logic correctly addresses past review feedback. The use of the `toolIndexToBlock` mapping (lines 413-422) efficiently locates the correct block without linear search. Type validation and comprehensive error logging help catch state inconsistencies.
473-541: LGTM! Inspector logging correctly reconstructs the complete response from streaming state. SSE event formatting follows the standard and matches Anthropic's expected format.
anthropic:
  enabled: true
  max_message_size: 10485760 # 10MB - Anthropic API limit
  # !! WARNING: Do not enable inspector in production without reviewing data privacy !!
  # Anthropic messages may contain sensitive user data.
  # This release is quite taxing and unoptimised, so use with caution & only for debug.
  inspector:
    enabled: false # Enable request/response logging
    output_dir: "logs/inspector/anthropic" # Where to write logs
    session_header: "X-Session-ID" # Header to use for session grouping
Default experimental translator to disabled; unify inspector path.
Enabling Anthropic translation by default can surprise existing deployments. Safer to keep it opt‑in while experimental.
Apply:
anthropic:
- enabled: true
+ enabled: false
max_message_size: 10485760 # 10MB - Anthropic API limit
@@
inspector:
- enabled: false # Enable request/response logging
- output_dir: "logs/inspector/anthropic" # Where to write logs
+ enabled: false # Enable request/response logging
+ output_dir: "logs/inspector/anthropic" # Where to write logs
  session_header: "X-Session-ID" # Header to use for session grouping

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
anthropic:
  enabled: false
  max_message_size: 10485760 # 10MB - Anthropic API limit
  # !! WARNING: Do not enable inspector in production without reviewing data privacy !!
  # Anthropic messages may contain sensitive user data.
  # This release is quite taxing and unoptimised, so use with caution & only for debug.
  inspector:
    enabled: false # Enable request/response logging
    output_dir: "logs/inspector/anthropic" # Where to write logs
    session_header: "X-Session-ID" # Header to use for session grouping
🤖 Prompt for AI Agents
config/config.yaml lines 137-146: the Anthropic translator is enabled by
default and the inspector path is inconsistent; change anthropic.enabled to
false to make the experimental translator opt-in, and set inspector.output_dir
to a unified inspector path (e.g., logs/inspector/translator/anthropic) so all
translator inspectors share a consistent base directory; update the YAML values
accordingly and keep inspector.enabled false by default.
OLLA-287: skip empty assistant messages
Actionable comments posted: 2
🧹 Nitpick comments (8)
docs/content/integrations/frontend/claude-code.md (6)
32-40: Migrate emphasis to proper markdown heading. Line 32 uses emphasis (`**What you get via Olla**`) as a heading. For better markdown structure and consistency, convert to an actual heading.

-**What you get via Olla**
+## What you get via Olla
101-101: Specify language for fenced code block. The architecture diagram code block (line 101) lacks a language identifier, which impairs syntax highlighting and markdown compliance.

-```
+```text
275-275: Migrate emphasis to proper markdown heading. Line 275 uses emphasis (`**Option A: Environment Variables (Recommended)**`) as a heading. Convert to an actual markdown heading for consistency.

-**Option A: Environment Variables (Recommended)**
+### Option A: Environment Variables (Recommended)
290-290: Migrate emphasis to proper markdown heading. Line 290 uses emphasis (`**Option B: Configuration File**`) as a heading. Convert to an actual markdown heading for consistency.

-**Option B: Configuration File**
+### Option B: Configuration File
342-342: Add missing determiner for grammatical correctness. Line 342 reads "Routes to backend with fewest active requests"; it should include "the" before "fewest" for proper grammar.

-- **least-connections**: Routes to backend with fewest active requests
+- **least-connections**: Routes to backend with the fewest active requests
845-845: Add missing determiner for grammatical correctness. Line 845 reads "Configure highest priority for local backends"; it should include "the" before "highest" for proper grammar.

-- **Local-first**: Configure highest priority for local backends
+- **Local-first**: Configure the highest priority for local backends

internal/adapter/translator/anthropic/request.go (2)
33-37: Consider preserving original request bytes to avoid re-marshaling overhead. Re-marshaling the decoded request creates additional processing overhead. If the original request body could be read and preserved (e.g., using `io.TeeReader` or reading into a buffer first), this re-marshaling step could be eliminated.

However, this is a minor optimisation that can be deferred, as the current implementation prioritises strict validation via the decoder's `DisallowUnknownFields` setting.
227-253: Optimise string concatenation in text content accumulation. Line 245 uses `textContent += text` in a loop, which is inefficient for multiple text blocks as it creates intermediate string allocations.

Apply this diff to use `strings.Builder`:

 func (t *Translator) convertAssistantMessage(blocks []interface{}) map[string]interface{} {
 	msg := map[string]interface{}{
 		"role": "assistant",
 	}
-	var textContent string
+	var textBuilder strings.Builder
 	var toolCalls []map[string]interface{}
 	for _, block := range blocks {
 		blockMap, ok := block.(map[string]interface{})
 		if !ok {
 			continue
 		}
 		blockType, _ := blockMap["type"].(string)
 		switch blockType {
 		case contentTypeText:
 			if text, ok := blockMap["text"].(string); ok {
-				textContent += text
+				textBuilder.WriteString(text)
 			}
 		case contentTypeToolUse:
 			toolCall := t.convertToolUse(blockMap)
 			if toolCall != nil {
 				toolCalls = append(toolCalls, toolCall)
 			}
 		}
 	}
+
+	textContent := textBuilder.String()
 	// openai wants null content when only tool calls present
 	if textContent != "" {
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- config/config.yaml (2 hunks)
- docs/content/integrations/frontend/claude-code.md (1 hunks)
- internal/adapter/inspector/simple.go (1 hunks)
- internal/adapter/translator/anthropic/request.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- config/config.yaml
🧰 Additional context used
🧬 Code graph analysis (2)
internal/adapter/translator/anthropic/request.go (3)
  internal/adapter/translator/anthropic/translator.go (1)
    Translator (17-23)
  internal/adapter/translator/types.go (1)
    TransformedRequest (30-37)
  internal/adapter/translator/anthropic/types.go (3)
    AnthropicRequest (7-21)
    AnthropicMessage (53-56)
    ContentBlock (60-69)

internal/adapter/inspector/simple.go (1)
  internal/logger/styled.go (1)
    StyledLogger (12-35)
🪛 LanguageTool
docs/content/integrations/frontend/claude-code.md
[uncategorized] ~34-~34: The preposition ‘for’ seems more likely in this position.
Context: ...** * Consistent Anthropic Messages API at /olla/anthropic/v1/* * Load-balancing...
(AI_HYDRA_LEO_REPLACE_AT_FOR)
[grammar] ~342-~342: A determiner may be missing.
Context: ...t-connections**: Routes to backend with fewest active requests **Timeout Configuratio...
(THE_SUPERLATIVE)
[misspelling] ~539-~539: This word is normally spelled as one.
Context: ...soning abilities - qwen3:32b - Strong multi-task performance **Performance vs Quality T...
(EN_COMPOUNDS_MULTI_TASK)
[uncategorized] ~591-~591: Possible missing preposition found.
Context: ...og ### No Models Available **List models from Olla**:bash curl http://localh...
(AI_HYDRA_LEO_MISSING_OF)
[grammar] ~845-~845: A determiner may be missing.
Context: ... Workflow - Local-first: Configure highest priority for local backends - **Fallbac...
(THE_SUPERLATIVE)
🪛 markdownlint-cli2 (0.18.1)
docs/content/integrations/frontend/claude-code.md
32-32: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
101-101: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
275-275: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
290-290: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test (1.24)
🔇 Additional comments (6)
internal/adapter/translator/anthropic/request.go (6)
107-130: Well-structured message conversion with proper system prompt injection. The implementation correctly handles system prompt injection as the first OpenAI message and properly pre-allocates capacity for the common case. Error propagation is clean.

133-171: Clean message conversion with proper role-specific handling. The implementation correctly handles both string and block-based content, with a sensible fallback for single-map content. Empty messages are filtered out appropriately.

174-224: Robust user message conversion with proper separation of text and tool results. The implementation correctly splits text content and tool results into separate OpenAI messages as required by the OpenAI API. Tool result content is appropriately handled for both string and structured formats.

266-270: Excellent fix for empty assistant messages. The OLLA-287 fix correctly guards against emitting empty assistant messages when there's neither text content nor tool calls. This directly addresses the concern raised in the previous review.

274-298: Solid tool_use to tool_call conversion with appropriate error handling. The implementation correctly maps Anthropic's tool_use format to OpenAI's tool_call structure, with sensible fallback to an empty object when input marshaling fails.

301-319: The callback function passed to `forEachSystemContentBlock` always returns `nil` and never propagates errors. The `forEachSystemBlock` implementation only returns non-nil errors when the callback itself returns an error, which cannot occur here. Discarding the error is appropriate in this context.

Likely an incorrect or invalid review comment.
This PR brings in Anthropic support so you can use tools like Claude Code to route through to OpenAI models locally.

It uses a new Translator concept in Olla (from Scout) that takes an incoming Anthropic request, converts it to OpenAI format, forwards it through the Olla pipeline (just like the `openai` endpoint), then, once the backend returns an OpenAI message, converts it back to Anthropic format.

This has been quite a tricky endeavour and is very much an experimental release.
Summary by CodeRabbit
New Features
Documentation
Other