Conversation

> **Note: Reviews paused.** It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the review settings.
📝 Walkthrough

Adds end-to-end embeddings: new core embedding types and `Provider.Embeddings`, provider implementations/wrappers, a router and `POST /v1/embeddings` route and handler, model-validation middleware, usage extraction for embeddings, many tests/mocks updated, and docs/Makefile tooling additions.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client as Client
    participant HTTP as HTTP/Handler
    participant Middleware as ModelValidation
    participant Router as Router
    participant Provider as Provider
    participant External as ExternalAPI
    Client->>HTTP: POST /v1/embeddings (EmbeddingRequest)
    HTTP->>Middleware: request enters (body)
    Middleware->>Middleware: extract model, validate, set provider type in context
    Middleware-->>HTTP: pass with context
    HTTP->>Router: Router.Embeddings(ctx, req)
    Router->>Router: lookup provider for model
    Router->>Provider: Provider.Embeddings(ctx, req)
    Provider->>External: POST /embeddings (provider-specific)
    External-->>Provider: EmbeddingResponse
    Provider-->>Router: EmbeddingResponse
    Router-->>HTTP: EmbeddingResponse
    HTTP->>HTTP: ExtractFromEmbeddingResponse -> usage entry (optional cost calc)
    HTTP-->>Client: JSON EmbeddingResponse
```
Estimated Code Review Effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
Makefile (1)
1-16: ⚠️ Potential issue | 🟠 Major

Set an explicit default goal; plain `make` now runs tool installation.

At line 13, `install-tools` is the first concrete target, so running `make` can execute global tool installs by default, which is a risky behavior change for CI/dev workflows.

🔧 Proposed fix
```diff
-.PHONY: build run clean tidy test test-e2e test-integration test-contract test-all lint lint-fix record-api swagger install-tools
+.DEFAULT_GOAL := all
+.PHONY: all build run clean tidy test test-e2e test-integration test-contract test-all lint lint-fix record-api swagger install-tools
+
+all: build
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Makefile` around lines 1 - 16: The Makefile's default target is currently the first concrete target (install-tools), causing plain `make` to run tool installation; change the default goal to a safe build target by adding a .DEFAULT_GOAL directive (for example: set .DEFAULT_GOAL := build) near the top of the Makefile so that `make` runs the intended safe target (e.g., build) instead of `install-tools`; keep existing .PHONY and target names (install-tools, build, etc.) intact and place the .DEFAULT_GOAL line before target definitions.

tests/integration/setup_test.go (1)
279-299: 🧹 Nitpick | 🔵 Trivial

`MockLLMServer` has no handler for `/v1/embeddings`.

`TestProvider.Embeddings` currently returns an error before reaching the mock, so existing tests pass. However, any future integration test that exercises the embeddings path through a mock-backed provider will receive a `404 Not Found` from the mock server's default `http.NotFound` handler, likely producing a confusing error. Adding a stub handler now would keep the mock consistent with the real server surface.

💡 Suggested addition
```diff
 case "/v1/responses":
 	handleResponses(w, r, body)
 case "/v1/models":
 	handleModels(w)
+case "/v1/embeddings":
+	handleEmbeddings(w, r, body)
 default:
 	http.NotFound(w, r)
 }
```

```go
func handleEmbeddings(w http.ResponseWriter, _ *http.Request, body []byte) {
	var req struct {
		Model string `json:"model"`
	}
	_ = json.Unmarshal(body, &req)
	if req.Model == "" {
		req.Model = "text-embedding-3-small"
	}
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(http.StatusOK)
	response := fmt.Sprintf(`{
		"object": "list",
		"data": [{"object": "embedding", "embedding": [0.1, 0.2, 0.3], "index": 0}],
		"model": %q,
		"usage": {"prompt_tokens": 5, "total_tokens": 5}
	}`, req.Model)
	_, _ = w.Write([]byte(response))
}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration/setup_test.go` around lines 279 - 299: NewMockLLMServer currently lacks a handler for the embeddings endpoint, causing 404s; add a new function handleEmbeddings(w http.ResponseWriter, r *http.Request, body []byte) that unmarshals the request to read a "model" field (default to "text-embedding-3-small" if empty), sets Content-Type to application/json and returns a 200 JSON payload with a simple embedding array, model name and usage fields; then update the switch in NewMockLLMServer's handler to route "/v1/embeddings" to handleEmbeddings so tests exercising embeddings hit the mock instead of the default NotFound.

internal/providers/gemini/gemini.go (1)
183-198: ⚠️ Potential issue | 🟠 Major

Embedding-only models are now reachable for `ChatCompletion`, which will fail.

The updated condition `(supportsGenerate || supportsEmbed) && strings.HasPrefix(modelID, "gemini-")` registers models that support `embedContent` but not `generateContent` (e.g., `gemini-embedding-exp-03-07`) into the model registry. Because the router routes all `Provider` methods, including `ChatCompletion` and `StreamChatCompletion`, to the same registered provider, a user who sends a chat request to an embedding-only model will receive an upstream API error instead of a clear gateway-level `invalid_request` error.

Consider keeping the registration predicate stricter and instead explicitly routing embedding-capable models only for the `Embeddings` method. If that's architecturally complex, at minimum the `ChatCompletion` implementation should detect incompatible models early and return a clear `GatewayError`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/providers/gemini/gemini.go` around lines 183 - 198: The current registration logic in internal/providers/gemini/gemini.go uses (supportsGenerate || supportsEmbed) with modelID to register Gemini models, which causes embedding-only models (detected via gm.SupportedMethods / supportsEmbed) to be exposed to ChatCompletion/StreamChatCompletion and fail; change the predicate so only models that support generateContent (supportsGenerate) are registered for general chat providers, and avoid registering embedding-only models for ChatCompletion/StreamChatCompletion (or alternatively register them only for Embeddings); additionally, update the ChatCompletion/StreamChatCompletion handlers to check model capabilities by inspecting gm.SupportedMethods or model metadata at request time and return a clear GatewayError/invalid_request when a chat request targets an embedding-only model.

internal/providers/router_test.go (1)
125-169: 🧹 Nitpick | 🔵 Trivial

`TestRouterEmptyLookup` is missing an `Embeddings` sub-test.

All other operations (`ChatCompletion`, `StreamChatCompletion`, `ListModels`, `Responses`, `StreamResponses`) are exercised inside `TestRouterEmptyLookup`. `Embeddings` is covered separately in `TestRouterEmbeddings_EmptyLookup`, but keeping parity here would make the empty-registry contract explicit and discoverable alongside the others.

♻️ Suggested addition inside `TestRouterEmptyLookup`
```diff
 	t.Run("StreamResponses returns error", func(t *testing.T) {
 		_, err := router.StreamResponses(context.Background(), &core.ResponsesRequest{Model: "any"})
 		if !errors.Is(err, ErrRegistryNotInitialized) {
 			t.Errorf("expected ErrRegistryNotInitialized, got: %v", err)
 		}
 	})
+
+	t.Run("Embeddings returns error", func(t *testing.T) {
+		_, err := router.Embeddings(context.Background(), &core.EmbeddingRequest{Model: "any"})
+		if !errors.Is(err, ErrRegistryNotInitialized) {
+			t.Errorf("expected ErrRegistryNotInitialized, got: %v", err)
+		}
+	})
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/providers/router_test.go` around lines 125 - 169: Add an Embeddings sub-test (and optional StreamEmbeddings sub-test for parity) inside TestRouterEmptyLookup that calls router.Embeddings(context.Background(), &core.EmbeddingsRequest{Model: "any"}) (and router.StreamEmbeddings(...) for streaming) and asserts the returned error is ErrRegistryNotInitialized; update TestRouterEmptyLookup to include t.Run("Embeddings returns error", ...) (and t.Run("StreamEmbeddings returns error", ...) if adding the stream check) so the empty-registry contract for embeddings is explicitly verified alongside the other operations.

internal/providers/ollama/ollama.go (1)
54-63: ⚠️ Potential issue | 🔴 Critical

Embeddings can use an invalid or wrong base URL in non-default setups.

`NewWithHTTPClient` does not initialize `nativeBaseURL`, and `SetBaseURL` does not keep `nativeBaseURL` in sync. `Embeddings` then builds a client from `p.nativeBaseURL`, which can be empty or stale, causing failed or misrouted requests.

🔧 Suggested fix

```diff
 import (
 	"context"
 	"io"
 	"net/http"
+	"strings"
 	"time"
@@
 type Provider struct {
 	client        *llmclient.Client
+	nativeClient  *llmclient.Client
 	apiKey        string // Accepted but ignored by Ollama
 	hooks         llmclient.Hooks
 	nativeBaseURL string
 }
@@
 func New(apiKey string, opts providers.ProviderOptions) core.Provider {
 	p := &Provider{
 		apiKey:        apiKey,
 		hooks:         opts.Hooks,
 		nativeBaseURL: defaultNativeBaseURL,
 	}
@@
 	p.client = llmclient.New(cfg, p.setHeaders)
+	nativeCfg := llmclient.DefaultConfig("ollama", p.nativeBaseURL)
+	nativeCfg.Hooks = opts.Hooks
+	p.nativeClient = llmclient.New(nativeCfg, p.setHeaders)
 	return p
 }
@@
 func NewWithHTTPClient(apiKey string, httpClient *http.Client, hooks llmclient.Hooks) *Provider {
@@
-	p := &Provider{apiKey: apiKey}
+	p := &Provider{
+		apiKey:        apiKey,
+		hooks:         hooks,
+		nativeBaseURL: defaultNativeBaseURL,
+	}
 	cfg := llmclient.DefaultConfig("ollama", defaultBaseURL)
 	cfg.Hooks = hooks
 	p.client = llmclient.NewWithHTTPClient(httpClient, cfg, p.setHeaders)
+	nativeCfg := llmclient.DefaultConfig("ollama", p.nativeBaseURL)
+	nativeCfg.Hooks = hooks
+	p.nativeClient = llmclient.NewWithHTTPClient(httpClient, nativeCfg, p.setHeaders)
 	return p
 }
@@
 func (p *Provider) SetBaseURL(url string) {
 	p.client.SetBaseURL(url)
+	base := strings.TrimSuffix(strings.TrimSuffix(url, "/"), "/v1")
+	p.nativeBaseURL = base
+	if p.nativeClient != nil {
+		p.nativeClient.SetBaseURL(base)
+	}
 }
@@
 func (p *Provider) Embeddings(ctx context.Context, req *core.EmbeddingRequest) (*core.EmbeddingResponse, error) {
-	nativeCfg := llmclient.DefaultConfig("ollama", p.nativeBaseURL)
-	nativeCfg.Hooks = p.hooks
-	nativeClient := llmclient.New(nativeCfg, p.setHeaders)
-
 	ollamaReq := ollamaEmbedRequest{
 		Model: req.Model,
 		Input: req.Input,
 	}
@@
-	err := nativeClient.Do(ctx, llmclient.Request{
+	err := p.nativeClient.Do(ctx, llmclient.Request{
 		Method:   http.MethodPost,
 		Endpoint: "/api/embed",
 		Body:     ollamaReq,
 	}, &ollamaResp)
```

Also applies to: 66-68, 157-160
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/providers/ollama/ollama.go` around lines 54 - 63, NewWithHTTPClient fails to initialize Provider.nativeBaseURL (and SetBaseURL doesn't keep it in sync), causing Embeddings to build clients with an empty/stale base URL; update NewWithHTTPClient to set p.nativeBaseURL = defaultBaseURL (or the passed base URL if present) when creating the Provider, and modify SetBaseURL to update both p.baseURL and p.nativeBaseURL together so Embeddings (which uses p.nativeBaseURL) always gets the correct URL; reference NewWithHTTPClient, SetBaseURL, Provider.nativeBaseURL and Embeddings when making these changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@GETTING_STARTED.md`:
- Around line 560-568: Add a new "Create Embeddings" example and a brief note
about the /v1/embeddings endpoint to GETTING_STARTED.md: insert a curl example
showing POST to /v1/embeddings with JSON body containing "model" (e.g.,
"text-embedding-3-small") and "input" (text), and add a new tip entry in the
Tips list (e.g., "Embeddings: Use /v1/embeddings with provider-supported
embedding models. Unsupported providers/models return a clear error response")
so the docs surface the new API; update the surrounding numbering/sequence to
include this new example and ensure references to model names and the
/v1/embeddings endpoint match the gateway's naming used elsewhere.
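For reference, here is a sketch of the request body the new docs example would send. The endpoint and model name come from the review comment above; the field set is an assumption based on the OpenAI-compatible embeddings schema, not the gateway's actual types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// embeddingRequest is a stand-in for the JSON body POSTed to /v1/embeddings;
// the gateway's real request type may carry more fields.
type embeddingRequest struct {
	Model string `json:"model"`
	Input string `json:"input"`
}

// buildEmbeddingBody renders the JSON the curl example would send.
func buildEmbeddingBody(model, input string) string {
	b, _ := json.Marshal(embeddingRequest{Model: model, Input: input})
	return string(b)
}

func main() {
	fmt.Println(buildEmbeddingBody("text-embedding-3-small", "The quick brown fox"))
}
```

The equivalent curl in GETTING_STARTED.md would POST this JSON to the gateway's `/v1/embeddings` route.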
In `@internal/core/types.go`:
- Around line 233-242: Add Go doc comments for the exported types EmbeddingData
and EmbeddingUsage: place a brief comment directly above each type definition
that starts with the type name and describes its purpose and key fields (e.g.,
EmbeddingData holds a single embedding vector with its index and object type;
EmbeddingUsage tracks token usage counts). Ensure comments follow Go doc style
(start with the type name, concise sentence) so linters and documentation
generation are satisfied.
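As an illustration of the requested doc comments, a minimal sketch follows. The field names and types here are assumptions, since the actual struct bodies are not shown in the diff; only the comment style matters.

```go
package main

// EmbeddingData holds a single embedding vector together with its index in
// the request batch and its object type tag.
type EmbeddingData struct {
	Object    string    `json:"object"`
	Embedding []float64 `json:"embedding"`
	Index     int       `json:"index"`
}

// EmbeddingUsage tracks token usage for an embeddings call; embeddings have
// no completion tokens, so only prompt and total counts are reported.
type EmbeddingUsage struct {
	PromptTokens int `json:"prompt_tokens"`
	TotalTokens  int `json:"total_tokens"`
}

func main() {}
```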
In `@internal/providers/gemini/gemini.go`:
- Around line 212-225: The Embeddings method currently returns resp without
ensuring resp.Model is set; mirror the ChatCompletion behavior by adding a
fallback that sets resp.Model = req.Model when resp.Model is empty. Modify the
Provider.Embeddings function to check resp.Model after the p.client.Do call and
assign req.Model if needed (similar to the logic in ChatCompletion) so usage
entries won't have an empty model field.
In `@internal/providers/groq/groq.go`:
- Around line 122-135: The Embeddings method in Provider does not apply the same
defensive fallback as ChatCompletion: if resp.Model is empty it should be set to
req.Model to avoid recording empty model usage; update the Embeddings function
to check resp.Model (after the p.client.Do call and before returning) and assign
req.Model when resp.Model == "" (mirroring the logic in ChatCompletion) so usage
entries always have a model.
In `@internal/providers/openai/openai.go`:
- Around line 199-212: The Embeddings handler (Provider.Embeddings) currently
returns resp without setting resp.Model when the upstream omits it; add the same
fallback used in ChatCompletion and Responses by assigning resp.Model =
req.Model before returning so callers receive the request model if the provider
response lacks a model field.
In `@internal/providers/xai/xai.go`:
- Around line 139-152: The Embeddings method currently returns resp.Model empty
when xAI omits it; update Provider.Embeddings to mirror ChatCompletion and
Responses by checking if resp.Model == "" after the Do call and, if so, set
resp.Model = req.Model before setting resp.Provider and returning; reference the
Embeddings method, resp.Model, and req.Model to locate and implement the
fallback.
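The four provider fixes above (gemini, groq, openai, xai) share the same one-line fallback. A minimal sketch of the pattern, using stand-in types rather than the gateway's real `core` types:

```go
package main

import "fmt"

// Stand-ins for core.EmbeddingRequest / core.EmbeddingResponse.
type embeddingRequest struct{ Model string }
type embeddingResponse struct{ Model string }

// applyModelFallback mirrors the defensive step ChatCompletion already
// takes: if the upstream response omits the model, fall back to the
// requested one so usage entries never record an empty model.
func applyModelFallback(req *embeddingRequest, resp *embeddingResponse) {
	if resp.Model == "" {
		resp.Model = req.Model
	}
}

func main() {
	req := &embeddingRequest{Model: "text-embedding-3-small"}
	resp := &embeddingResponse{} // upstream omitted "model"
	applyModelFallback(req, resp)
	fmt.Println(resp.Model)
}
```

In each provider the check sits after the `p.client.Do` call and before the response is returned.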
In `@internal/server/handlers_test.go`:
- Line 551: The tests call NewHandler with the wrong arity (3 args) while
NewHandler now requires 4; update every test call site (e.g., the handler :=
NewHandler(mock, nil, nil) line and the other similar invocations) to pass the
new fourth parameter (use nil or an appropriate mock/stub) so each NewHandler
call supplies four arguments matching the constructor signature.
In `@internal/usage/extractor.go`:
- Around line 154-171: The ExtractFromEmbeddingResponse function currently omits
pricing and thus leaves cost fields nil; modify ExtractFromEmbeddingResponse to
accept a variadic pricing ...*core.ModelPricing parameter (matching other
extractors) and invoke CalculateGranularCost(resp.Model,
resp.Usage.PromptTokens, pricing...) (or the existing signature used elsewhere)
to populate the cost-related fields on the returned *UsageEntry so embedding
usage entries include computed costs; update the returned UsageEntry to include
the calculated cost fields.
- Around line 154-171: ExtractFromEmbeddingResponse currently omits the pricing
parameter required to compute cost; update the function signature to accept
pricing (e.g., pricing ...*core.ModelPricing) like
ExtractFromChatResponse/ExtractFromResponsesResponse/ExtractFromSSEUsage, thread
that pricing into the created UsageEntry (compute Cost using the same helper or
logic used by the other extractors) and ensure the function uses pricing to
populate the UsageEntry pricing/cost fields before returning; reference function
name ExtractFromEmbeddingResponse, the UsageEntry struct, and core.ModelPricing
when making these changes.
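A sketch of the variadic-pricing shape these two prompts describe, with simplified stand-ins for `core.ModelPricing`, `UsageEntry`, and the cost helper; the real signatures and pricing units are assumptions.

```go
package main

import "fmt"

// ModelPricing is a simplified stand-in for core.ModelPricing.
type ModelPricing struct {
	InputPerMTok float64 // assumed unit: USD per million input tokens
}

// UsageEntry is a simplified stand-in for the extractor's UsageEntry.
type UsageEntry struct {
	Model        string
	PromptTokens int
	Cost         float64
}

// extractFromEmbeddingResponse sketches the proposed signature: pricing is
// variadic, so existing call sites keep compiling, while the embeddings
// handler can pass a resolver-supplied *ModelPricing to populate Cost.
func extractFromEmbeddingResponse(model string, promptTokens int, pricing ...*ModelPricing) *UsageEntry {
	e := &UsageEntry{Model: model, PromptTokens: promptTokens}
	if len(pricing) > 0 && pricing[0] != nil {
		e.Cost = float64(promptTokens) / 1_000_000 * pricing[0].InputPerMTok
	}
	return e
}

func main() {
	p := &ModelPricing{InputPerMTok: 0.02}
	e := extractFromEmbeddingResponse("text-embedding-3-small", 5, p)
	fmt.Println(e.Model, e.PromptTokens)
}
```

Callers that have no pricing simply omit the argument and get a zero-cost entry, matching the other extractors' behavior.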
---
Outside diff comments:
In `@internal/providers/gemini/gemini.go`:
- Around line 183-198: The current registration logic in
internal/providers/gemini/gemini.go uses (supportsGenerate || supportsEmbed)
with modelID to register Gemini models, which causes embedding-only models
(detected via gm.SupportedMethods / supportsEmbed) to be exposed to
ChatCompletion/StreamChatCompletion and fail; change the predicate so only
models that support generateContent (supportsGenerate) are registered for
general chat providers, and avoid registering embedding-only models for
ChatCompletion/StreamChatCompletion (or alternatively register them only for
Embeddings); additionally, update the ChatCompletion/StreamChatCompletion
handlers to check model capabilities by inspecting gm.SupportedMethods or model
metadata at request time and return a clear GatewayError/invalid_request when a
chat request targets an embedding-only model.
In `@internal/providers/ollama/ollama.go`:
- Around line 54-63: NewWithHTTPClient fails to initialize
Provider.nativeBaseURL (and SetBaseURL doesn't keep it in sync), causing
Embeddings to build clients with an empty/stale base URL; update
NewWithHTTPClient to set p.nativeBaseURL = defaultBaseURL (or the passed base
URL if present) when creating the Provider, and modify SetBaseURL to update both
p.baseURL and p.nativeBaseURL together so Embeddings (which uses
p.nativeBaseURL) always gets the correct URL; reference NewWithHTTPClient,
SetBaseURL, Provider.nativeBaseURL and Embeddings when making these changes.
In `@internal/providers/router_test.go`:
- Around line 125-169: Add an Embeddings sub-test (and optional StreamEmbeddings
sub-test for parity) inside TestRouterEmptyLookup that calls
router.Embeddings(context.Background(), &core.EmbeddingsRequest{Model: "any"})
(and router.StreamEmbeddings(...) for streaming) and asserts the returned error
is ErrRegistryNotInitialized; update TestRouterEmptyLookup to include
t.Run("Embeddings returns error", ...) (and t.Run("StreamEmbeddings returns
error", ...) if adding the stream check) so the empty-registry contract for
embeddings is explicitly verified alongside the other operations.
In `@Makefile`:
- Around line 1-16: The Makefile's default target is currently the first
concrete target (install-tools), causing plain `make` to run tool installation;
change the default goal to a safe build target by adding a .DEFAULT_GOAL
directive (for example: set .DEFAULT_GOAL := build) near the top of the Makefile
so that `make` runs the intended safe target (e.g., build) instead of
`install-tools`; keep existing .PHONY and target names (install-tools, build,
etc.) intact and place the .DEFAULT_GOAL line before target definitions.
In `@tests/integration/setup_test.go`:
- Around line 279-299: NewMockLLMServer currently lacks a handler for the
embeddings endpoint, causing 404s; add a new function handleEmbeddings(w
http.ResponseWriter, r *http.Request, body []byte) that unmarshals the request
to read a "model" field (default to "text-embedding-3-small" if empty), sets
Content-Type to application/json and returns a 200 JSON payload with a simple
embedding array, model name and usage fields; then update the switch in
NewMockLLMServer's handler to route "/v1/embeddings" to handleEmbeddings so
tests exercising embeddings hit the mock instead of the default NotFound.
ℹ️ Review info
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (28)
`DEVELOPMENT.md`, `GETTING_STARTED.md`, `Makefile`, `internal/admin/handler_test.go`, `internal/auditlog/stream_wrapper.go`, `internal/core/interfaces.go`, `internal/core/types.go`, `internal/guardrails/provider.go`, `internal/guardrails/provider_test.go`, `internal/providers/anthropic/anthropic.go`, `internal/providers/anthropic/anthropic_test.go`, `internal/providers/factory_test.go`, `internal/providers/gemini/gemini.go`, `internal/providers/groq/groq.go`, `internal/providers/ollama/ollama.go`, `internal/providers/openai/openai.go`, `internal/providers/registry_race_test.go`, `internal/providers/registry_test.go`, `internal/providers/router.go`, `internal/providers/router_test.go`, `internal/providers/xai/xai.go`, `internal/server/handlers.go`, `internal/server/handlers_test.go`, `internal/server/http.go`, `internal/usage/extractor.go`, `tests/e2e/main_test.go`, `tests/integration/setup_test.go`, `tests/stress/stress_test.go`
add embeddings
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@README.md`:
- Line 79: Change the table cell that currently contains "—" to an explicit
unsupported marker (e.g., "🚫" or "❌") so the README reflects that Anthropic
embeddings return an invalid_request_error rather than being "Not applicable";
update the legend wording where that symbol is defined to use "unsupported" (or
equivalent) instead of "not-applicable" and ensure any caption or note
referencing the /v1/embeddings surface mentions Anthropic is unsupported at
runtime.
SantiagoDePolonia left a comment

Looks good! A few comments left.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@DEVELOPMENT.md`:
- Around line 66-70: Remove the duplicate "pre-commit install" instruction in
the later section and keep the canonical setup instructions already documented
earlier; specifically, delete or merge the repeated code block containing the
literal command "pre-commit install" so the documentation only mentions this
step once (refer to the existing canonical pre-commit setup block and ensure any
surrounding context points to that single entry).
In `@Makefile`:
- Line 1: The .PHONY list in the Makefile is missing the required all target
which triggers checkmake's minphony warning; update the .PHONY declaration (the
line defining .PHONY) to include all among the other phony targets (e.g., add
the symbol all to the existing .PHONY list) so the Makefile declares the default
target as phony.
ℹ️ Review info
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (5)
`DEVELOPMENT.md`, `Makefile`, `internal/core/interfaces.go`, `internal/server/handlers_test.go`, `internal/usage/extractor.go`
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/providers/ollama/ollama.go (1)
54-63: ⚠️ Potential issue | 🟠 Major

`NewWithHTTPClient` doesn't initialize `hooks` and `nativeBaseURL`.

The `New()` constructor initializes `hooks` and `nativeBaseURL`, but `NewWithHTTPClient()` leaves them at zero values. This will cause `Embeddings()` to fail or behave unexpectedly when called on a provider created via `NewWithHTTPClient()`.

🐛 Proposed fix
```diff
 func NewWithHTTPClient(apiKey string, httpClient *http.Client, hooks llmclient.Hooks) *Provider {
 	if httpClient == nil {
 		httpClient = http.DefaultClient
 	}
-	p := &Provider{apiKey: apiKey}
+	p := &Provider{
+		apiKey:        apiKey,
+		hooks:         hooks,
+		nativeBaseURL: defaultNativeBaseURL,
+	}
 	cfg := llmclient.DefaultConfig("ollama", defaultBaseURL)
 	cfg.Hooks = hooks
 	p.client = llmclient.NewWithHTTPClient(httpClient, cfg, p.setHeaders)
 	return p
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/providers/ollama/ollama.go` around lines 54 - 63, NewWithHTTPClient currently leaves Provider.hooks and Provider.nativeBaseURL unset; set p.hooks = hooks and p.nativeBaseURL = defaultBaseURL (same initialization as New) before creating the llmclient so Embeddings() and other methods that rely on those fields work correctly—update the NewWithHTTPClient function to initialize those two fields (and keep cfg.Hooks = hooks and p.setHeaders usage as-is).
♻️ Duplicate comments (1)
internal/providers/ollama/ollama.go (1)
21-24: 🧹 Nitpick | 🔵 Trivial

Extract duplicated root URL to a single constant.

Both constants share `http://localhost:11434`. Extracting the root reduces duplication and simplifies future host/port changes.

♻️ Proposed fix

```diff
 const (
-	defaultBaseURL       = "http://localhost:11434/v1"
-	defaultNativeBaseURL = "http://localhost:11434"
+	defaultRootURL       = "http://localhost:11434"
+	defaultBaseURL       = defaultRootURL + "/v1"
+	defaultNativeBaseURL = defaultRootURL
 )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/providers/ollama/ollama.go` around lines 21 - 24, Two constants defaultBaseURL and defaultNativeBaseURL duplicate the same root "http://localhost:11434"; introduce a single root constant (e.g., defaultRootURL or defaultHostPort) and build defaultBaseURL and defaultNativeBaseURL from it (e.g., defaultBaseURL = defaultRootURL + "/v1", defaultNativeBaseURL = defaultRootURL) to remove duplication and simplify future changes; update any references to defaultBaseURL/defaultNativeBaseURL in this file to use the new composed constants.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@GETTING_STARTED.md`:
- Around line 418-455: Add an optional "Provider-Specific Models" subsection
under the Embeddings section showing non-OpenAI examples (e.g., Gemini model
"text-embedding-004" and Ollama model "nomic-embed-text") so readers see
multi-provider usage; insert this new subsection immediately after the "With
Custom Dimensions" example and before the "Supported by:" line, using the same
curl format as the existing examples to demonstrate model and input fields.
In `@internal/providers/ollama/ollama.go`:
- Around line 157-159: The Embeddings() method currently creates a new
llmclient.Client each call (nativeCfg := llmclient.DefaultConfig("ollama",
p.nativeBaseURL); nativeCfg.Hooks = p.hooks; nativeClient :=
llmclient.New(nativeCfg, p.setHeaders)), which is inefficient; modify the
Provider struct to add a nativeClient field, initialize that nativeClient once
in New() using the same DefaultConfig and llmclient.New logic (mirroring how
p.client is created), and then update Embeddings() to reuse p.nativeClient
(lazy-initialize in New() or on first use if preferred) instead of constructing
a new client per request.
In `@README.md`:
- Line 107: Remove the unused legend entry "— Not applicable" from the README
legend line that currently reads "✅ Supported 🚧 Coming soon ❌ Unsupported —
Not applicable"; edit the legend to only include the symbols actually used in
the provider capabilities table (e.g., "✅ Supported 🚧 Coming soon ❌
Unsupported") so the legend matches the table content.
---
Outside diff comments:
In `@internal/providers/ollama/ollama.go`:
- Around line 54-63: NewWithHTTPClient currently leaves Provider.hooks and
Provider.nativeBaseURL unset; set p.hooks = hooks and p.nativeBaseURL =
defaultBaseURL (same initialization as New) before creating the llmclient so
Embeddings() and other methods that rely on those fields work correctly—update
the NewWithHTTPClient function to initialize those two fields (and keep
cfg.Hooks = hooks and p.setHeaders usage as-is).
---
Duplicate comments:
In `@internal/providers/ollama/ollama.go`:
- Around line 21-24: Two constants defaultBaseURL and defaultNativeBaseURL
duplicate the same root "http://localhost:11434"; introduce a single root
constant (e.g., defaultRootURL or defaultHostPort) and build defaultBaseURL and
defaultNativeBaseURL from it (e.g., defaultBaseURL = defaultRootURL + "/v1",
defaultNativeBaseURL = defaultRootURL) to remove duplication and simplify future
changes; update any references to defaultBaseURL/defaultNativeBaseURL in this
file to use the new composed constants.
ℹ️ Review info
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (8)
`GETTING_STARTED.md`, `README.md`, `internal/core/types.go`, `internal/providers/gemini/gemini.go`, `internal/providers/groq/groq.go`, `internal/providers/ollama/ollama.go`, `internal/providers/openai/openai.go`, `internal/providers/xai/xai.go`
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/providers/factory.go (1)
80-89: ⚠️ Potential issue | 🟠 Major

Guard against nil providers before wrapping.

At line 88, `newProviderWrapper` can wrap a nil provider if a constructor returns nil, causing a runtime panic on first method call. Fail fast in `Create` when `p == nil`.

🐛 Proposed fix
```diff
 	p := builder(cfg.APIKey, opts)
+	if p == nil {
+		return nil, fmt.Errorf("provider constructor returned nil for type: %s", cfg.Type)
+	}
 	if cfg.BaseURL != "" {
 		if setter, ok := p.(interface{ SetBaseURL(string) }); ok {
 			setter.SetBaseURL(cfg.BaseURL)
 		}
 	}
 	return newProviderWrapper(p, cfg.Type), nil
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/providers/factory.go` around lines 80 - 89, After calling the constructor builder(cfg.APIKey, opts) in Create, check whether the returned provider value p is nil and return a non-nil error instead of proceeding to newProviderWrapper; specifically, after p := builder(cfg.APIKey, opts) add a nil-guard that returns an error (with context including cfg.Type) when p == nil so you never call newProviderWrapper(nil, cfg.Type); keep the existing BaseURL setter logic but only run it after the nil check.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/providers/gemini/gemini.go`:
- Line 190: The conditional that gates Gemini model registration only allows
modelIDs with the "gemini-" prefix: update the check around supportsGenerate,
supportsEmbed and modelID so that embedding models without the "gemini-" prefix
are allowed when supportsEmbed is true. Concretely, change the condition that
uses strings.HasPrefix(modelID, "gemini-") so it permits either
strings.HasPrefix(modelID, "gemini-") OR (supportsEmbed && the modelID matches
your embedding model naming convention, e.g., "text-embedding-"), keeping the
existing behavior for supportsGenerate and for gemini-prefixed models (refer to
supportsGenerate, supportsEmbed, modelID, and strings.HasPrefix in gemini.go).
In `@internal/providers/ollama/ollama.go`:
- Around line 75-80: The native base URL derivation in Provider.SetBaseURL must
handle inputs with an extra trailing slash (e.g., "http://host:11434/v1/");
update the logic in SetBaseURL so that after calling p.client.SetBaseURL(url)
you normalize the input (trim any trailing slashes first, then remove a trailing
"/v1" if present) and pass that normalized root URL into
p.nativeClient.SetBaseURL; modify the code around Provider.SetBaseURL
(referencing p.client.SetBaseURL and p.nativeClient.SetBaseURL) to perform
trim+trim-suffix instead of only stripping the exact "/v1" suffix.
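The trim-then-trim-suffix normalization the comment asks for can be sketched as follows; `nativeRoot` is a hypothetical helper standing in for the logic inside `Provider.SetBaseURL`:

```go
package main

import (
	"fmt"
	"strings"
)

// nativeRoot derives the native Ollama root URL from an OpenAI-compatible
// base URL: strip trailing slashes first, then a trailing "/v1" if present.
func nativeRoot(url string) string {
	root := strings.TrimRight(url, "/")
	return strings.TrimSuffix(root, "/v1")
}

func main() {
	fmt.Println(nativeRoot("http://host:11434/v1"))  // http://host:11434
	fmt.Println(nativeRoot("http://host:11434/v1/")) // http://host:11434
	fmt.Println(nativeRoot("http://host:11434"))     // http://host:11434
}
```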
In `@internal/server/handlers.go`:
- Around line 234-236: The embeddings usage log is created without resolving
pricing so cost fields stay empty; update the call to
usage.ExtractFromEmbeddingResponse to pass the configured pricing resolver (e.g.
usage.ExtractFromEmbeddingResponse(resp, requestID, providerType,
"/v1/embeddings", h.usageLogger.Config().PricingResolver)) so the returned
usageEntry has pricing populated before writing with h.usageLogger; locate the
call around h.usageLogger and replace the 4-arg call with the 5-arg variant
using the logger's pricing resolver.
In `@internal/server/model_validation.go`:
- Around line 19-69: Add Go doc comments for the exported symbols
ModelValidation, GetProviderType, and ModelCtx: document ModelValidation as an
echo middleware that validates model interactions (reads and preserves request
body, peeks "model" from JSON, rejects missing/unsupported models via
core.NewInvalidRequestError, sets providerType in the context under
providerTypeKey, enriches auditlog entry, and attaches X-Request-ID into the
request context); document GetProviderType as returning the provider type string
stored on the echo.Context (or empty string if missing); and document ModelCtx
as returning the request context and the provider type string for handlers. Keep
comments concise, start with the function name, and describe parameters/returns
and observable side effects.
In `@Makefile`:
- Around line 15-18: Update the install-tools Makefile rule to use a portable
check and pinned versions: replace uses of which with command -v and avoid
installing golangci-lint@latest and unversioned pre-commit; instead pin specific
versions for golangci-lint and pre-commit in the install commands so installs
are reproducible (refer to the install-tools target and the golangci-lint and
pre-commit install commands).
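Under those constraints, the target might look like the following sketch; the version pins are placeholders for the maintainers to choose:

```make
GOLANGCI_LINT_VERSION ?= v1.64.8
PRE_COMMIT_VERSION    ?= 3.8.0

install-tools:
	@command -v golangci-lint >/dev/null 2>&1 || \
		go install github.com/golangci/golangci-lint/cmd/golangci-lint@$(GOLANGCI_LINT_VERSION)
	@command -v pre-commit >/dev/null 2>&1 || \
		pip install pre-commit==$(PRE_COMMIT_VERSION)
	@echo "All tools are ready"
```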
---
Outside diff comments:
In `@internal/providers/factory.go`:
- Around line 80-89: After calling the constructor builder(cfg.APIKey, opts) in
Create, check whether the returned provider value p is nil and return a non-nil
error instead of proceeding to newProviderWrapper; specifically, after p :=
builder(cfg.APIKey, opts) add a nil-guard that returns an error (with context
including cfg.Type) when p == nil so you never call newProviderWrapper(nil,
cfg.Type); keep the existing BaseURL setter logic but only run it after the nil
check.
ℹ️ Review info
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (18)
- `DEVELOPMENT.md`
- `Makefile`
- `README.md`
- `internal/providers/anthropic/anthropic.go`
- `internal/providers/factory.go`
- `internal/providers/factory_test.go`
- `internal/providers/gemini/gemini.go`
- `internal/providers/groq/groq.go`
- `internal/providers/ollama/ollama.go`
- `internal/providers/ollama/ollama_test.go`
- `internal/providers/openai/openai.go`
- `internal/providers/wrapper.go`
- `internal/providers/xai/xai.go`
- `internal/server/handlers.go`
- `internal/server/handlers_test.go`
- `internal/server/http.go`
- `internal/server/model_validation.go`
- `internal/server/model_validation_test.go`
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
addresses an issue with unmarshalling
fixes ollama initialization issue
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/providers/ollama/ollama.go`:
- Around line 186-194: The json.Marshal error in the loop over
ollamaResp.Embeddings is being ignored; update the loop that builds
core.EmbeddingData (iterating ollamaResp.Embeddings) to capture the error (raw,
err := json.Marshal(emb)), and on error return or propagate a descriptive error
(or log and return) instead of discarding it so callers know embedding
serialization failed; ensure the change affects the function that constructs the
data slice (the block creating data := make([]core.EmbeddingData, ...)) and
consistently handles the error path.
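The error-propagating loop this comment asks for could look like the sketch below; the types are simplified stand-ins for `ollamaResp.Embeddings` and `core.EmbeddingData`:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// EmbeddingData is a simplified stand-in for core.EmbeddingData, carrying
// the JSON-encoded vector.
type EmbeddingData struct {
	Embedding json.RawMessage
	Index     int
}

// buildData converts native embeddings into EmbeddingData, surfacing any
// marshal failure to the caller instead of discarding it.
func buildData(embeddings [][]float64) ([]EmbeddingData, error) {
	data := make([]EmbeddingData, 0, len(embeddings))
	for i, emb := range embeddings {
		raw, err := json.Marshal(emb)
		if err != nil {
			return nil, fmt.Errorf("marshal embedding %d: %w", i, err)
		}
		data = append(data, EmbeddingData{Embedding: raw, Index: i})
	}
	return data, nil
}

func main() {
	data, err := buildData([][]float64{{0.1, 0.2}, {0.3}})
	fmt.Println(len(data), err) // 2 <nil>
}
```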
In `@internal/server/handlers_test.go`:
- Around line 513-604: The embeddings tests do not cover the branch where
usageLogger.Enabled is true and a pricingResolver is used, so add a new test
(e.g., TestEmbeddings_WithUsageAndPricing) that constructs a mockProvider and a
Handler with a usageLogger.Enabled=true and a mock pricingResolver that returns
known price data, send a valid embeddings POST to handler.Embeddings and assert
the response and/or logs include the populated usage/pricing fields (or assert
the pricingResolver mock was invoked) to ensure cost population is exercised;
reference the existing TestEmbeddings/TestEmbeddings_ProviderReturnsError setup
and use the same request shape and mockProvider embeddingResponse to locate
where to plug in usageLogger and pricingResolver.
In `@internal/server/handlers.go`:
- Around line 103-104: After switching to ModelCtx(c) the provider call contexts
lost the request correlation; retrieve the request ID from
c.Request().Header.Get("X-Request-ID") (as already done into requestID) and
re-wrap the context returned by ModelCtx(c) with core.WithRequestID(requestID)
before passing it to provider calls in ChatCompletion, Responses, and Embeddings
so provider-side calls regain per-request correlation (i.e., replace usages of
ctx from ModelCtx(c) with core.WithRequestID(ctx, requestID) when invoking the
provider).
In `@Makefile`:
- Around line 15-18: The Makefile's install-tools recipe uses backticked
prefixes "`@command`" and "`@echo`" which invoke command substitution and break
the target; update the install-tools target to remove backticks so the prefixes
are literal (use `@command` -v ... || (...) and `@echo` "All tools are ready")
ensuring the existing checks/install flows for golangci-lint and pre-commit run
as intended; look for the install-tools target and the occurrences of `@command`
and `@echo` to correct.
ℹ️ Review info
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (11)
- `Makefile`
- `internal/core/types.go`
- `internal/providers/factory_test.go`
- `internal/providers/gemini/gemini.go`
- `internal/providers/ollama/ollama.go`
- `internal/providers/ollama/ollama_test.go`
- `internal/providers/router.go`
- `internal/providers/router_test.go`
- `internal/server/handlers.go`
- `internal/server/handlers_test.go`
- `internal/server/model_validation.go`
Add `/v1/embeddings` endpoint with per-provider support

Adds an OpenAI-compatible embeddings endpoint to the gateway. Each provider implements `Embeddings()` on the `core.Provider` interface — providers with embedding models forward the request upstream, while providers without support (Anthropic) return `400 invalid_request_error`.

Supported providers
| Provider | Upstream endpoint | Embedding models |
| --- | --- | --- |
| OpenAI | `/embeddings` | `text-embedding-3-small`, `text-embedding-3-large` |
| Gemini | — | `gemini-embedding-001`, `text-embedding-004` |
| Groq | — | `nomic-embed-text-v1_5` |
| Ollama | `/api/embed` | `nomic-embed-text`, `all-minilm` |

What's included
- `core.Provider` interface — `Embeddings()` as a first-class method alongside `ChatCompletion` and `Responses`
- `core` types — `EmbeddingRequest`, `EmbeddingResponse`, `EmbeddingData`, `EmbeddingUsage`
- `POST /v1/embeddings` with model validation, audit logging, and usage tracking
- `ListModels()` now includes `embedContent` models in the registry
- `ExtractFromEmbeddingResponse()` for prompt token tracking
- `/v1/embeddings` recognized as a model interaction path
- `docs/EMBEDDINGS_ARCHITECTURE.md`
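For reference, an OpenAI-compatible request body for `POST /v1/embeddings` looks like the following; the model shown is one of the OpenAI entries listed above, and the response carries `data[].embedding` plus a `usage` block with prompt token counts:

```json
{
  "model": "text-embedding-3-small",
  "input": ["hello world", "second document"]
}
```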
New Features
POST /v1/embeddings) for text embedding generation.Documentation
Chores