
feat(gom-47): implement embeddings#104

Merged
giorgi-pro merged 32 commits into main from me/gom-47-embedding-endpoints-anthropic-and-openai
Feb 28, 2026

Conversation

@giorgi-pro (Collaborator) commented Feb 25, 2026

Add /v1/embeddings endpoint with per-provider support

Adds an OpenAI-compatible embeddings endpoint to the gateway. Each provider implements Embeddings() on the core.Provider interface — providers with embedding models forward the request upstream, while providers without support (Anthropic) return 400 invalid_request_error.

Supported providers

| Provider | Mechanism | Example models |
| --- | --- | --- |
| OpenAI | Pass-through `/embeddings` | `text-embedding-3-small`, `text-embedding-3-large` |
| Gemini | Pass-through (OpenAI-compatible) | `gemini-embedding-001`, `text-embedding-004` |
| Groq | Pass-through | `nomic-embed-text-v1_5` |
| xAI | Pass-through | xAI embedding models |
| Ollama | Format conversion via `/api/embed` | `nomic-embed-text`, `all-minilm` |
| Anthropic | Returns 400 | N/A — suggests Voyage AI |
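The Ollama row is the only one that needs format conversion: Ollama's native `/api/embed` returns a bare `embeddings` array, which must be mapped onto the OpenAI-style `data` list. A rough sketch of that mapping, using simplified stand-in types rather than the gateway's actual definitions:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Ollama's native /api/embed response shape (per Ollama's public API docs).
type ollamaEmbedResponse struct {
	Model      string      `json:"model"`
	Embeddings [][]float32 `json:"embeddings"`
}

// OpenAI-compatible response returned to the client (simplified).
type EmbeddingData struct {
	Object    string    `json:"object"`
	Index     int       `json:"index"`
	Embedding []float32 `json:"embedding"`
}

type EmbeddingResponse struct {
	Object string          `json:"object"`
	Model  string          `json:"model"`
	Data   []EmbeddingData `json:"data"`
}

// toOpenAIFormat maps the native Ollama response onto the OpenAI list shape,
// assigning each vector its positional index.
func toOpenAIFormat(resp ollamaEmbedResponse) EmbeddingResponse {
	out := EmbeddingResponse{Object: "list", Model: resp.Model}
	for i, e := range resp.Embeddings {
		out.Data = append(out.Data, EmbeddingData{Object: "embedding", Index: i, Embedding: e})
	}
	return out
}

func main() {
	native := ollamaEmbedResponse{Model: "nomic-embed-text", Embeddings: [][]float32{{0.1, 0.2}}}
	b, _ := json.Marshal(toOpenAIFormat(native))
	fmt.Println(string(b))
}
```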

What's included

  • core.Provider interface — Embeddings() as a first-class method alongside ChatCompletion and Responses
  • core types — EmbeddingRequest, EmbeddingResponse, EmbeddingData, EmbeddingUsage
  • HTTP route — POST /v1/embeddings with model validation, audit logging, and usage tracking
  • 6 provider implementations — pass-through for OpenAI/Gemini/Groq/xAI, format conversion for Ollama, 400 stub for Anthropic
  • Gemini model discovery — ListModels() now includes embedContent models in the registry
  • Guardrails — pass-through (no guardrails applied to embedding requests)
  • Usage tracking — ExtractFromEmbeddingResponse() for prompt token tracking
  • Audit logging — /v1/embeddings recognized as a model interaction path
  • 10 tests — across handler, router, guardrails, and Anthropic provider
  • Architecture doc — docs/EMBEDDINGS_ARCHITECTURE.md

Summary by CodeRabbit

New Features

  • Added embeddings API endpoint (POST /v1/embeddings) for text embedding generation.
  • Expanded embeddings support across multiple AI providers with provider-specific availability.

Documentation

  • Enhanced development setup documentation with simplified tool installation and testing commands.
  • Updated getting started guide with embeddings usage examples and expanded provider capabilities documentation.
  • Updated provider support matrix to reflect embeddings availability.

Chores

  • Improved build tooling with automated tool installation and version metadata tracking.

coderabbitai bot (Contributor) commented Feb 25, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Adds end-to-end embeddings support: new core embedding types and Provider.Embeddings, provider implementations and wrappers, a POST /v1/embeddings route and handler wired through the router, model-validation middleware, usage extraction for embeddings, many updated tests and mocks, and docs/Makefile tooling additions.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Docs & Tooling**<br>`DEVELOPMENT.md`, `GETTING_STARTED.md`, `Makefile`, `README.md` | Docs reformatted/expanded, added Embeddings coverage, new `make install-tools` and tool-install guidance, and README updated with `POST /v1/embeddings`. |
| **Core types & interfaces**<br>`internal/core/types.go`, `internal/core/interfaces.go` | Added `EmbeddingRequest`/`EmbeddingResponse`/`EmbeddingData`/`EmbeddingUsage` types and `Provider.Embeddings(ctx, *EmbeddingRequest) (*EmbeddingResponse, error)`. |
| **Server: routes & handlers**<br>`internal/server/http.go`, `internal/server/handlers.go`, `internal/server/handlers_test.go`, `internal/server/model_validation.go`, `internal/server/model_validation_test.go` | Registered `/v1/embeddings`, added `Handler.Embeddings`, introduced ModelValidation middleware and helper funcs (`GetProviderType`/`ModelCtx`); tests added/updated for validation and embeddings. |
| **Routing, guardrails & audit**<br>`internal/providers/router.go`, `internal/providers/router_test.go`, `internal/guardrails/provider.go`, `internal/guardrails/provider_test.go`, `internal/auditlog/stream_wrapper.go` | Router routes Embeddings, GuardedProvider delegates Embeddings to the inner provider, audit wrapper marks `/v1/embeddings` as a model interaction; tests updated. |
| **Provider implementations & factory**<br>`internal/providers/openai/openai.go`, `internal/providers/groq/groq.go`, `internal/providers/gemini/gemini.go`, `internal/providers/ollama/ollama.go`, `internal/providers/xai/xai.go`, `internal/providers/anthropic/anthropic.go`, `internal/providers/factory.go`, `internal/providers/wrapper.go` | Added Embeddings implementations (OpenAI, Groq, Gemini, Ollama native, xAI); Anthropic returns an explicit unsupported error; providerFactory now wraps providers with a providerWrapper that annotates `Response.Provider`; the wrapper forwards methods and sets Provider on Chat/Responses/Embeddings results. |
| **Usage tracking**<br>`internal/usage/extractor.go` | Added `ExtractFromEmbeddingResponse` to convert an `EmbeddingResponse` into a `UsageEntry` and compute costs when pricing is provided. |
| **Tests & mocks**<br>`internal/admin/handler_test.go`, `internal/providers/*_test.go`, `internal/providers/registry*.go`, `tests/e2e/main_test.go`, `tests/integration/setup_test.go`, `tests/stress/stress_test.go` | Many test mocks updated to implement Embeddings (often returning unsupported errors or stubbed responses); new embedding-focused tests across router, handler, guardrails, and providers (Ollama integration tests added). |
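The usage-tracking cohort above converts embedding responses into usage entries. A minimal sketch of that conversion, using simplified stand-in types (the gateway's actual `UsageEntry` and pricing logic are richer than this):

```go
package main

import "fmt"

// Simplified stand-ins for the gateway's usage types (hypothetical shapes).
type EmbeddingUsage struct {
	PromptTokens int
	TotalTokens  int
}

type EmbeddingResponse struct {
	Model string
	Usage EmbeddingUsage
}

type UsageEntry struct {
	Model        string
	PromptTokens int
	TotalTokens  int
}

// extractFromEmbeddingResponse mirrors the idea of ExtractFromEmbeddingResponse:
// embeddings produce no completion tokens, so only prompt/total counts are recorded.
func extractFromEmbeddingResponse(resp *EmbeddingResponse) UsageEntry {
	return UsageEntry{
		Model:        resp.Model,
		PromptTokens: resp.Usage.PromptTokens,
		TotalTokens:  resp.Usage.TotalTokens,
	}
}

func main() {
	entry := extractFromEmbeddingResponse(&EmbeddingResponse{
		Model: "text-embedding-3-small",
		Usage: EmbeddingUsage{PromptTokens: 5, TotalTokens: 5},
	})
	fmt.Printf("%+v\n", entry)
}
```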

Sequence Diagram

```mermaid
sequenceDiagram
    participant Client as Client
    participant HTTP as HTTP/Handler
    participant Middleware as ModelValidation
    participant Router as Router
    participant Provider as Provider
    participant External as ExternalAPI

    Client->>HTTP: POST /v1/embeddings (EmbeddingRequest)
    HTTP->>Middleware: request enters (body)
    Middleware->>Middleware: extract model, validate, set provider type in context
    Middleware-->>HTTP: pass with context
    HTTP->>Router: Router.Embeddings(ctx, req)
    Router->>Router: lookup provider for model
    Router->>Provider: Provider.Embeddings(ctx, req)
    Provider->>External: POST /embeddings (provider-specific)
    External-->>Provider: EmbeddingResponse
    Provider-->>Router: EmbeddingResponse
    Router-->>HTTP: EmbeddingResponse
    HTTP->>HTTP: ExtractFromEmbeddingResponse -> usage entry (optional cost calc)
    HTTP-->>Client: JSON EmbeddingResponse
```

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested Labels

enhancement

Poem

🐰 I stitched new vectors into the trail,

POSTs hop in, providers set sail,
Router guides, handler hums the tune,
Usage counts beneath the moon,
Hop, embed, and ship — code blooms soon!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 23.08%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped — CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: implementing embeddings support. It is specific, directly related to the primary objective of the PR, and follows conventional commit format. |


coderabbitai bot (Contributor) left a comment

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (5)
Makefile (1)

1-16: ⚠️ Potential issue | 🟠 Major

Set an explicit default goal; plain make now runs tool installation.

At Line 13, install-tools is the first concrete target, so running make can execute global tool installs by default, which is a risky behavior change for CI/dev workflows.

🔧 Proposed fix
```diff
-.PHONY: build run clean tidy test test-e2e test-integration test-contract test-all lint lint-fix record-api swagger install-tools
+.DEFAULT_GOAL := all
+.PHONY: all build run clean tidy test test-e2e test-integration test-contract test-all lint lint-fix record-api swagger install-tools
+
+all: build
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` around lines 1 - 16, The Makefile's default target is currently the
first concrete target (install-tools), causing plain `make` to run tool
installation; change the default goal to a safe build target by adding a
.DEFAULT_GOAL directive (for example: set .DEFAULT_GOAL := build) near the top
of the Makefile so that `make` runs the intended safe target (e.g., build)
instead of `install-tools`; keep existing .PHONY and target names
(install-tools, build, etc.) intact and place the .DEFAULT_GOAL line before
target definitions.
tests/integration/setup_test.go (1)

279-299: 🧹 Nitpick | 🔵 Trivial

MockLLMServer has no handler for /v1/embeddings.

TestProvider.Embeddings currently returns an error before reaching the mock, so existing tests pass. However, any future integration test that exercises the embeddings path through a mock-backed provider will receive a 404 Not Found from the mock server's default http.NotFound handler, likely producing a confusing error. Adding a stub handler now would keep the mock consistent with the real server surface.

💡 Suggested addition
```diff
 		case "/v1/responses":
 			handleResponses(w, r, body)
 		case "/v1/models":
 			handleModels(w)
+		case "/v1/embeddings":
+			handleEmbeddings(w, r, body)
 		default:
 			http.NotFound(w, r)
 		}
```

```go
func handleEmbeddings(w http.ResponseWriter, _ *http.Request, body []byte) {
	var req struct {
		Model string `json:"model"`
	}
	_ = json.Unmarshal(body, &req)
	if req.Model == "" {
		req.Model = "text-embedding-3-small"
	}
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(http.StatusOK)
	response := fmt.Sprintf(`{
		"object": "list",
		"data": [{"object": "embedding", "embedding": [0.1, 0.2, 0.3], "index": 0}],
		"model": %q,
		"usage": {"prompt_tokens": 5, "total_tokens": 5}
	}`, req.Model)
	_, _ = w.Write([]byte(response))
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration/setup_test.go` around lines 279 - 299, NewMockLLMServer
currently lacks a handler for the embeddings endpoint, causing 404s; add a new
function handleEmbeddings(w http.ResponseWriter, r *http.Request, body []byte)
that unmarshals the request to read a "model" field (default to
"text-embedding-3-small" if empty), sets Content-Type to application/json and
returns a 200 JSON payload with a simple embedding array, model name and usage
fields; then update the switch in NewMockLLMServer's handler to route
"/v1/embeddings" to handleEmbeddings so tests exercising embeddings hit the mock
instead of the default NotFound.
internal/providers/gemini/gemini.go (1)

183-198: ⚠️ Potential issue | 🟠 Major

Embedding-only models are now reachable for ChatCompletion, which will fail.

The updated condition (supportsGenerate || supportsEmbed) && strings.HasPrefix(modelID, "gemini-") registers models that support embedContent but not generateContent (e.g., gemini-embedding-exp-03-07) into the model registry. Because the router routes all Provider methods — including ChatCompletion and StreamChatCompletion — to the same registered provider, a user who sends a chat request to an embedding-only model will receive an upstream API error instead of a clear gateway-level invalid_request error.

Consider keeping the registration predicate stricter and instead explicitly routing embedding-capable models only for the Embeddings method. If that's architecturally complex, at minimum the ChatCompletion implementation should detect incompatible models early and return a clear GatewayError.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/gemini/gemini.go` around lines 183 - 198, The current
registration logic in internal/providers/gemini/gemini.go uses (supportsGenerate
|| supportsEmbed) with modelID to register Gemini models, which causes
embedding-only models (detected via gm.SupportedMethods / supportsEmbed) to be
exposed to ChatCompletion/StreamChatCompletion and fail; change the predicate so
only models that support generateContent (supportsGenerate) are registered for
general chat providers, and avoid registering embedding-only models for
ChatCompletion/StreamChatCompletion (or alternatively register them only for
Embeddings); additionally, update the ChatCompletion/StreamChatCompletion
handlers to check model capabilities by inspecting gm.SupportedMethods or model
metadata at request time and return a clear GatewayError/invalid_request when a
chat request targets an embedding-only model.
internal/providers/router_test.go (1)

125-169: 🧹 Nitpick | 🔵 Trivial

TestRouterEmptyLookup is missing an Embeddings sub-test.

All other operations (ChatCompletion, StreamChatCompletion, ListModels, Responses, StreamResponses) are exercised inside TestRouterEmptyLookup. Embeddings is covered separately in TestRouterEmbeddings_EmptyLookup, but keeping parity here would make the empty-registry contract explicit and discoverable alongside the others.

♻️ Suggested addition inside `TestRouterEmptyLookup`
```diff
 	t.Run("StreamResponses returns error", func(t *testing.T) {
 		_, err := router.StreamResponses(context.Background(), &core.ResponsesRequest{Model: "any"})
 		if !errors.Is(err, ErrRegistryNotInitialized) {
 			t.Errorf("expected ErrRegistryNotInitialized, got: %v", err)
 		}
 	})
+
+	t.Run("Embeddings returns error", func(t *testing.T) {
+		_, err := router.Embeddings(context.Background(), &core.EmbeddingRequest{Model: "any"})
+		if !errors.Is(err, ErrRegistryNotInitialized) {
+			t.Errorf("expected ErrRegistryNotInitialized, got: %v", err)
+		}
+	})
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/router_test.go` around lines 125 - 169, Add an Embeddings
sub-test (and optional StreamEmbeddings sub-test for parity) inside
TestRouterEmptyLookup that calls router.Embeddings(context.Background(),
&core.EmbeddingsRequest{Model: "any"}) (and router.StreamEmbeddings(...) for
streaming) and asserts the returned error is ErrRegistryNotInitialized; update
TestRouterEmptyLookup to include t.Run("Embeddings returns error", ...) (and
t.Run("StreamEmbeddings returns error", ...) if adding the stream check) so the
empty-registry contract for embeddings is explicitly verified alongside the
other operations.
internal/providers/ollama/ollama.go (1)

54-63: ⚠️ Potential issue | 🔴 Critical

Embeddings can use an invalid or wrong base URL in non-default setups.

NewWithHTTPClient does not initialize nativeBaseURL, and SetBaseURL does not keep nativeBaseURL in sync. Embeddings then builds a client from p.nativeBaseURL, which can be empty or stale, causing failed or misrouted requests.

🔧 Suggested fix
```diff
 import (
 	"context"
 	"io"
 	"net/http"
+	"strings"
 	"time"
@@
 type Provider struct {
 	client        *llmclient.Client
+	nativeClient  *llmclient.Client
 	apiKey        string // Accepted but ignored by Ollama
 	hooks         llmclient.Hooks
 	nativeBaseURL string
 }
@@
 func New(apiKey string, opts providers.ProviderOptions) core.Provider {
 	p := &Provider{
 		apiKey:        apiKey,
 		hooks:         opts.Hooks,
 		nativeBaseURL: defaultNativeBaseURL,
 	}
@@
 	p.client = llmclient.New(cfg, p.setHeaders)
+	nativeCfg := llmclient.DefaultConfig("ollama", p.nativeBaseURL)
+	nativeCfg.Hooks = opts.Hooks
+	p.nativeClient = llmclient.New(nativeCfg, p.setHeaders)
 	return p
 }
@@
 func NewWithHTTPClient(apiKey string, httpClient *http.Client, hooks llmclient.Hooks) *Provider {
@@
-	p := &Provider{apiKey: apiKey}
+	p := &Provider{
+		apiKey:        apiKey,
+		hooks:         hooks,
+		nativeBaseURL: defaultNativeBaseURL,
+	}
 	cfg := llmclient.DefaultConfig("ollama", defaultBaseURL)
 	cfg.Hooks = hooks
 	p.client = llmclient.NewWithHTTPClient(httpClient, cfg, p.setHeaders)
+	nativeCfg := llmclient.DefaultConfig("ollama", p.nativeBaseURL)
+	nativeCfg.Hooks = hooks
+	p.nativeClient = llmclient.NewWithHTTPClient(httpClient, nativeCfg, p.setHeaders)
 	return p
 }
@@
 func (p *Provider) SetBaseURL(url string) {
 	p.client.SetBaseURL(url)
+	base := strings.TrimSuffix(strings.TrimSuffix(url, "/"), "/v1")
+	p.nativeBaseURL = base
+	if p.nativeClient != nil {
+		p.nativeClient.SetBaseURL(base)
+	}
 }
@@
 func (p *Provider) Embeddings(ctx context.Context, req *core.EmbeddingRequest) (*core.EmbeddingResponse, error) {
-	nativeCfg := llmclient.DefaultConfig("ollama", p.nativeBaseURL)
-	nativeCfg.Hooks = p.hooks
-	nativeClient := llmclient.New(nativeCfg, p.setHeaders)
-
 	ollamaReq := ollamaEmbedRequest{
 		Model: req.Model,
 		Input: req.Input,
 	}
@@
-	err := nativeClient.Do(ctx, llmclient.Request{
+	err := p.nativeClient.Do(ctx, llmclient.Request{
 		Method:   http.MethodPost,
 		Endpoint: "/api/embed",
 		Body:     ollamaReq,
 	}, &ollamaResp)
```

Also applies to: 66-68, 157-160

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/ollama/ollama.go` around lines 54 - 63, NewWithHTTPClient
fails to initialize Provider.nativeBaseURL (and SetBaseURL doesn't keep it in
sync), causing Embeddings to build clients with an empty/stale base URL; update
NewWithHTTPClient to set p.nativeBaseURL = defaultBaseURL (or the passed base
URL if present) when creating the Provider, and modify SetBaseURL to update both
p.baseURL and p.nativeBaseURL together so Embeddings (which uses
p.nativeBaseURL) always gets the correct URL; reference NewWithHTTPClient,
SetBaseURL, Provider.nativeBaseURL and Embeddings when making these changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@GETTING_STARTED.md`:
- Around line 560-568: Add a new "Create Embeddings" example and a brief note
about the /v1/embeddings endpoint to GETTING_STARTED.md: insert a curl example
showing POST to /v1/embeddings with JSON body containing "model" (e.g.,
"text-embedding-3-small") and "input" (text), and add a new tip entry in the
Tips list (e.g., "Embeddings: Use /v1/embeddings with provider-supported
embedding models. Unsupported providers/models return a clear error response")
so the docs surface the new API; update the surrounding numbering/sequence to
include this new example and ensure references to model names and the
/v1/embeddings endpoint match the gateway's naming used elsewhere.

In `@internal/core/types.go`:
- Around line 233-242: Add Go doc comments for the exported types EmbeddingData
and EmbeddingUsage: place a brief comment directly above each type definition
that starts with the type name and describes its purpose and key fields (e.g.,
EmbeddingData holds a single embedding vector with its index and object type;
EmbeddingUsage tracks token usage counts). Ensure comments follow Go doc style
(start with the type name, concise sentence) so linters and documentation
generation are satisfied.

In `@internal/providers/gemini/gemini.go`:
- Around line 212-225: The Embeddings method currently returns resp without
ensuring resp.Model is set; mirror the ChatCompletion behavior by adding a
fallback that sets resp.Model = req.Model when resp.Model is empty. Modify the
Provider.Embeddings function to check resp.Model after the p.client.Do call and
assign req.Model if needed (similar to the logic in ChatCompletion) so usage
entries won't have an empty model field.

In `@internal/providers/groq/groq.go`:
- Around line 122-135: The Embeddings method in Provider does not apply the same
defensive fallback as ChatCompletion: if resp.Model is empty it should be set to
req.Model to avoid recording empty model usage; update the Embeddings function
to check resp.Model (after the p.client.Do call and before returning) and assign
req.Model when resp.Model == "" (mirroring the logic in ChatCompletion) so usage
entries always have a model.

In `@internal/providers/openai/openai.go`:
- Around line 199-212: The Embeddings handler (Provider.Embeddings) currently
returns resp without setting resp.Model when the upstream omits it; add the same
fallback used in ChatCompletion and Responses by assigning resp.Model =
req.Model before returning so callers receive the request model if the provider
response lacks a model field.

In `@internal/providers/xai/xai.go`:
- Around line 139-152: The Embeddings method currently returns resp.Model empty
when xAI omits it; update Provider.Embeddings to mirror ChatCompletion and
Responses by checking if resp.Model == "" after the Do call and, if so, set
resp.Model = req.Model before setting resp.Provider and returning; reference the
Embeddings method, resp.Model, and req.Model to locate and implement the
fallback.

In `@internal/server/handlers_test.go`:
- Line 551: The tests call NewHandler with the wrong arity (3 args) while
NewHandler now requires 4; update every test call site (e.g., the handler :=
NewHandler(mock, nil, nil) line and the other similar invocations) to pass the
new fourth parameter (use nil or an appropriate mock/stub) so each NewHandler
call supplies four arguments matching the constructor signature.

In `@internal/usage/extractor.go`:
- Around line 154-171: The ExtractFromEmbeddingResponse function currently omits
pricing and thus leaves cost fields nil; modify ExtractFromEmbeddingResponse to
accept a variadic pricing ...*core.ModelPricing parameter (matching other
extractors) and invoke CalculateGranularCost(resp.Model,
resp.Usage.PromptTokens, pricing...) (or the existing signature used elsewhere)
to populate the cost-related fields on the returned *UsageEntry so embedding
usage entries include computed costs; update the returned UsageEntry to include
the calculated cost fields.
- Around line 154-171: ExtractFromEmbeddingResponse currently omits the pricing
parameter required to compute cost; update the function signature to accept
pricing (e.g., pricing ...*core.ModelPricing) like
ExtractFromChatResponse/ExtractFromResponsesResponse/ExtractFromSSEUsage, thread
that pricing into the created UsageEntry (compute Cost using the same helper or
logic used by the other extractors) and ensure the function uses pricing to
populate the UsageEntry pricing/cost fields before returning; reference function
name ExtractFromEmbeddingResponse, the UsageEntry struct, and core.ModelPricing
when making these changes.

---

Outside diff comments:
In `@internal/providers/gemini/gemini.go`:
- Around line 183-198: The current registration logic in
internal/providers/gemini/gemini.go uses (supportsGenerate || supportsEmbed)
with modelID to register Gemini models, which causes embedding-only models
(detected via gm.SupportedMethods / supportsEmbed) to be exposed to
ChatCompletion/StreamChatCompletion and fail; change the predicate so only
models that support generateContent (supportsGenerate) are registered for
general chat providers, and avoid registering embedding-only models for
ChatCompletion/StreamChatCompletion (or alternatively register them only for
Embeddings); additionally, update the ChatCompletion/StreamChatCompletion
handlers to check model capabilities by inspecting gm.SupportedMethods or model
metadata at request time and return a clear GatewayError/invalid_request when a
chat request targets an embedding-only model.

In `@internal/providers/ollama/ollama.go`:
- Around line 54-63: NewWithHTTPClient fails to initialize
Provider.nativeBaseURL (and SetBaseURL doesn't keep it in sync), causing
Embeddings to build clients with an empty/stale base URL; update
NewWithHTTPClient to set p.nativeBaseURL = defaultBaseURL (or the passed base
URL if present) when creating the Provider, and modify SetBaseURL to update both
p.baseURL and p.nativeBaseURL together so Embeddings (which uses
p.nativeBaseURL) always gets the correct URL; reference NewWithHTTPClient,
SetBaseURL, Provider.nativeBaseURL and Embeddings when making these changes.

In `@internal/providers/router_test.go`:
- Around line 125-169: Add an Embeddings sub-test (and optional StreamEmbeddings
sub-test for parity) inside TestRouterEmptyLookup that calls
router.Embeddings(context.Background(), &core.EmbeddingsRequest{Model: "any"})
(and router.StreamEmbeddings(...) for streaming) and asserts the returned error
is ErrRegistryNotInitialized; update TestRouterEmptyLookup to include
t.Run("Embeddings returns error", ...) (and t.Run("StreamEmbeddings returns
error", ...) if adding the stream check) so the empty-registry contract for
embeddings is explicitly verified alongside the other operations.

In `@Makefile`:
- Around line 1-16: The Makefile's default target is currently the first
concrete target (install-tools), causing plain `make` to run tool installation;
change the default goal to a safe build target by adding a .DEFAULT_GOAL
directive (for example: set .DEFAULT_GOAL := build) near the top of the Makefile
so that `make` runs the intended safe target (e.g., build) instead of
`install-tools`; keep existing .PHONY and target names (install-tools, build,
etc.) intact and place the .DEFAULT_GOAL line before target definitions.

In `@tests/integration/setup_test.go`:
- Around line 279-299: NewMockLLMServer currently lacks a handler for the
embeddings endpoint, causing 404s; add a new function handleEmbeddings(w
http.ResponseWriter, r *http.Request, body []byte) that unmarshals the request
to read a "model" field (default to "text-embedding-3-small" if empty), sets
Content-Type to application/json and returns a 200 JSON payload with a simple
embedding array, model name and usage fields; then update the switch in
NewMockLLMServer's handler to route "/v1/embeddings" to handleEmbeddings so
tests exercising embeddings hit the mock instead of the default NotFound.

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 872e8d2 and a5787e6.

📒 Files selected for processing (28)
  • DEVELOPMENT.md
  • GETTING_STARTED.md
  • Makefile
  • internal/admin/handler_test.go
  • internal/auditlog/stream_wrapper.go
  • internal/core/interfaces.go
  • internal/core/types.go
  • internal/guardrails/provider.go
  • internal/guardrails/provider_test.go
  • internal/providers/anthropic/anthropic.go
  • internal/providers/anthropic/anthropic_test.go
  • internal/providers/factory_test.go
  • internal/providers/gemini/gemini.go
  • internal/providers/groq/groq.go
  • internal/providers/ollama/ollama.go
  • internal/providers/openai/openai.go
  • internal/providers/registry_race_test.go
  • internal/providers/registry_test.go
  • internal/providers/router.go
  • internal/providers/router_test.go
  • internal/providers/xai/xai.go
  • internal/server/handlers.go
  • internal/server/handlers_test.go
  • internal/server/http.go
  • internal/usage/extractor.go
  • tests/e2e/main_test.go
  • tests/integration/setup_test.go
  • tests/stress/stress_test.go

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@README.md`:
- Line 79: Change the table cell that currently contains "—" to an explicit
unsupported marker (e.g., "🚫" or "❌") so the README reflects that Anthropic
embeddings return an invalid_request_error rather than being "Not applicable";
update the legend wording where that symbol is defined to use "unsupported" (or
equivalent) instead of "not-applicable" and ensure any caption or note
referencing the /v1/embeddings surface mentions Anthropic is unsupported at
runtime.

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a5787e6 and bd64a19.

📒 Files selected for processing (1)
  • README.md

@SantiagoDePolonia (Contributor) left a comment

Looks good! A few comments left.

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@DEVELOPMENT.md`:
- Around line 66-70: Remove the duplicate "pre-commit install" instruction in
the later section and keep the canonical setup instructions already documented
earlier; specifically, delete or merge the repeated code block containing the
literal command "pre-commit install" so the documentation only mentions this
step once (refer to the existing canonical pre-commit setup block and ensure any
surrounding context points to that single entry).

In `@Makefile`:
- Line 1: The .PHONY list in the Makefile is missing the required all target
which triggers checkmake's minphony warning; update the .PHONY declaration (the
line defining .PHONY) to include all among the other phony targets (e.g., add
the symbol all to the existing .PHONY list) so the Makefile declares the default
target as phony.

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bd64a19 and 35ec805.

📒 Files selected for processing (5)
  • DEVELOPMENT.md
  • Makefile
  • internal/core/interfaces.go
  • internal/server/handlers_test.go
  • internal/usage/extractor.go

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/providers/ollama/ollama.go (1)

54-63: ⚠️ Potential issue | 🟠 Major

NewWithHTTPClient doesn't initialize hooks and nativeBaseURL.

The New() constructor initializes hooks and nativeBaseURL, but NewWithHTTPClient() leaves them at zero values. This will cause Embeddings() to fail or behave unexpectedly when called on a provider created via NewWithHTTPClient().

🐛 Proposed fix
 func NewWithHTTPClient(apiKey string, httpClient *http.Client, hooks llmclient.Hooks) *Provider {
 	if httpClient == nil {
 		httpClient = http.DefaultClient
 	}
-	p := &Provider{apiKey: apiKey}
+	p := &Provider{
+		apiKey:        apiKey,
+		hooks:         hooks,
+		nativeBaseURL: defaultNativeBaseURL,
+	}
 	cfg := llmclient.DefaultConfig("ollama", defaultBaseURL)
 	cfg.Hooks = hooks
 	p.client = llmclient.NewWithHTTPClient(httpClient, cfg, p.setHeaders)
 	return p
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/ollama/ollama.go` around lines 54 - 63, NewWithHTTPClient
currently leaves Provider.hooks and Provider.nativeBaseURL unset; set p.hooks =
hooks and p.nativeBaseURL = defaultNativeBaseURL (same initialization as New)
before creating the llmclient so Embeddings() and other methods that rely on
those fields work correctly; update the NewWithHTTPClient function to initialize
those two fields (and keep cfg.Hooks = hooks and p.setHeaders usage as-is).
♻️ Duplicate comments (1)
internal/providers/ollama/ollama.go (1)

21-24: 🧹 Nitpick | 🔵 Trivial

Extract duplicated root URL to a single constant.

Both constants share http://localhost:11434. Extracting the root reduces duplication and simplifies future host/port changes.

♻️ Proposed fix
 const (
-	defaultBaseURL       = "http://localhost:11434/v1"
-	defaultNativeBaseURL = "http://localhost:11434"
+	defaultRootURL       = "http://localhost:11434"
+	defaultBaseURL       = defaultRootURL + "/v1"
+	defaultNativeBaseURL = defaultRootURL
 )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/ollama/ollama.go` around lines 21 - 24, Two constants
defaultBaseURL and defaultNativeBaseURL duplicate the same root
"http://localhost:11434"; introduce a single root constant (e.g., defaultRootURL
or defaultHostPort) and build defaultBaseURL and defaultNativeBaseURL from it
(e.g., defaultBaseURL = defaultRootURL + "/v1", defaultNativeBaseURL =
defaultRootURL) to remove duplication and simplify future changes; update any
references to defaultBaseURL/defaultNativeBaseURL in this file to use the new
composed constants.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@GETTING_STARTED.md`:
- Around line 418-455: Add an optional "Provider-Specific Models" subsection
under the Embeddings section showing non-OpenAI examples (e.g., Gemini model
"text-embedding-004" and Ollama model "nomic-embed-text") so readers see
multi-provider usage; insert this new subsection immediately after the "With
Custom Dimensions" example and before the "Supported by:" line, using the same
curl format as the existing examples to demonstrate model and input fields.

In `@internal/providers/ollama/ollama.go`:
- Around line 157-159: The Embeddings() method currently creates a new
llmclient.Client each call (nativeCfg := llmclient.DefaultConfig("ollama",
p.nativeBaseURL); nativeCfg.Hooks = p.hooks; nativeClient :=
llmclient.New(nativeCfg, p.setHeaders)), which is inefficient; modify the
Provider struct to add a nativeClient field, initialize that nativeClient once
in New() using the same DefaultConfig and llmclient.New logic (mirroring how
p.client is created), and then update Embeddings() to reuse p.nativeClient
(lazy-initialize in New() or on first use if preferred) instead of constructing
a new client per request.
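The reuse this comment asks for can be sketched with a lazily built field on the provider. The `client` type and `native()` helper below are illustrative stand-ins for `llmclient.Client` and its construction, not the gateway's actual code:

```go
package main

import (
	"fmt"
	"sync"
)

// client stands in for llmclient.Client; the real type lives in the gateway.
type client struct{ baseURL string }

// Provider sketches the suggested change: hold the native client on the
// struct and build it at most once, instead of once per Embeddings() call.
type Provider struct {
	nativeBaseURL string
	nativeOnce    sync.Once
	nativeClient  *client
}

// native lazily initializes the native client on first use and reuses it
// on every subsequent call.
func (p *Provider) native() *client {
	p.nativeOnce.Do(func() {
		p.nativeClient = &client{baseURL: p.nativeBaseURL}
	})
	return p.nativeClient
}

func main() {
	p := &Provider{nativeBaseURL: "http://localhost:11434"}
	a, b := p.native(), p.native()
	fmt.Println(a == b) // same instance reused across calls
}
```

Eager initialization in `New()` (mirroring how `p.client` is created) works equally well; `sync.Once` is only needed if the lazy variant is preferred.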

In `@README.md`:
- Line 107: Remove the unused legend entry "— Not applicable" from the README
legend line that currently reads "✅ Supported  🚧 Coming soon  ❌ Unsupported  —
Not applicable"; edit the legend to only include the symbols actually used in
the provider capabilities table (e.g., "✅ Supported  🚧 Coming soon  ❌
Unsupported") so the legend matches the table content.

---

Outside diff comments:
In `@internal/providers/ollama/ollama.go`:
- Around line 54-63: NewWithHTTPClient currently leaves Provider.hooks and
Provider.nativeBaseURL unset; set p.hooks = hooks and p.nativeBaseURL =
defaultNativeBaseURL (same initialization as New) before creating the llmclient
so Embeddings() and other methods that rely on those fields work correctly;
update the NewWithHTTPClient function to initialize those two fields (and keep
cfg.Hooks = hooks and p.setHeaders usage as-is).

---

Duplicate comments:
In `@internal/providers/ollama/ollama.go`:
- Around line 21-24: Two constants defaultBaseURL and defaultNativeBaseURL
duplicate the same root "http://localhost:11434"; introduce a single root
constant (e.g., defaultRootURL or defaultHostPort) and build defaultBaseURL and
defaultNativeBaseURL from it (e.g., defaultBaseURL = defaultRootURL + "/v1",
defaultNativeBaseURL = defaultRootURL) to remove duplication and simplify future
changes; update any references to defaultBaseURL/defaultNativeBaseURL in this
file to use the new composed constants.

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 35ec805 and ba735a4.

📒 Files selected for processing (8)
  • GETTING_STARTED.md
  • README.md
  • internal/core/types.go
  • internal/providers/gemini/gemini.go
  • internal/providers/groq/groq.go
  • internal/providers/ollama/ollama.go
  • internal/providers/openai/openai.go
  • internal/providers/xai/xai.go

giorgi-pro and others added 7 commits February 26, 2026 22:01
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/providers/factory.go (1)

80-89: ⚠️ Potential issue | 🟠 Major

Guard against nil providers before wrapping.

At Line 88, newProviderWrapper can wrap a nil provider if a constructor returns nil, causing a runtime panic on first method call. Fail fast in Create when p == nil.

🐛 Proposed fix
 	p := builder(cfg.APIKey, opts)
+	if p == nil {
+		return nil, fmt.Errorf("provider constructor returned nil for type: %s", cfg.Type)
+	}
 
 	if cfg.BaseURL != "" {
 		if setter, ok := p.(interface{ SetBaseURL(string) }); ok {
 			setter.SetBaseURL(cfg.BaseURL)
 		}
 	}
 
 	return newProviderWrapper(p, cfg.Type), nil
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/factory.go` around lines 80 - 89, After calling the
constructor builder(cfg.APIKey, opts) in Create, check whether the returned
provider value p is nil and return a non-nil error instead of proceeding to
newProviderWrapper; specifically, after p := builder(cfg.APIKey, opts) add a
nil-guard that returns an error (with context including cfg.Type) when p == nil
so you never call newProviderWrapper(nil, cfg.Type); keep the existing BaseURL
setter logic but only run it after the nil check.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/providers/gemini/gemini.go`:
- Line 190: The conditional that gates Gemini model registration only allows
modelIDs with the "gemini-" prefix: update the check around supportsGenerate,
supportsEmbed and modelID so that embedding models without the "gemini-" prefix
are allowed when supportsEmbed is true. Concretely, change the condition that
uses strings.HasPrefix(modelID, "gemini-") so it permits either
strings.HasPrefix(modelID, "gemini-") OR (supportsEmbed && the modelID matches
your embedding model naming convention, e.g., "text-embedding-"), keeping the
existing behavior for supportsGenerate and for gemini-prefixed models (refer to
supportsGenerate, supportsEmbed, modelID, and strings.HasPrefix in gemini.go).
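The widened registration condition could look like the following predicate. The function name `shouldRegister` and the exact `"text-embedding-"` prefix check are illustrative assumptions; the comment leaves the embedding naming convention to the implementer:

```go
package main

import (
	"fmt"
	"strings"
)

// shouldRegister sketches the gating logic the review suggests: keep
// gemini-prefixed models as before, but also admit embedding models that
// follow the "text-embedding-" naming convention.
func shouldRegister(modelID string, supportsGenerate, supportsEmbed bool) bool {
	if !supportsGenerate && !supportsEmbed {
		return false
	}
	if strings.HasPrefix(modelID, "gemini-") {
		return true
	}
	// Allow non-gemini-prefixed embedding models such as text-embedding-004.
	return supportsEmbed && strings.HasPrefix(modelID, "text-embedding-")
}

func main() {
	fmt.Println(shouldRegister("gemini-embedding-001", false, true)) // true
	fmt.Println(shouldRegister("text-embedding-004", false, true))   // true
	fmt.Println(shouldRegister("text-embedding-004", true, false))   // false
}
```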

In `@internal/providers/ollama/ollama.go`:
- Around line 75-80: The native base URL derivation in Provider.SetBaseURL must
handle inputs with an extra trailing slash (e.g., "http://host:11434/v1/");
update the logic in SetBaseURL so that after calling p.client.SetBaseURL(url)
you normalize the input (trim any trailing slashes first, then remove a trailing
"/v1" if present) and pass that normalized root URL into
p.nativeClient.SetBaseURL; modify the code around Provider.SetBaseURL
(referencing p.client.SetBaseURL and p.nativeClient.SetBaseURL) to perform
trim+trim-suffix instead of only stripping the exact "/v1" suffix.
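The order of operations matters here: trimming trailing slashes first means both `"/v1"` and `"/v1/"` endings normalize to the same root. A minimal sketch (the helper name `nativeRoot` is hypothetical):

```go
package main

import (
	"fmt"
	"strings"
)

// nativeRoot derives the native API root from an OpenAI-compatible base URL.
// Trailing slashes are trimmed before stripping the "/v1" suffix, so
// "http://host:11434/v1/" and "http://host:11434/v1" both normalize to
// "http://host:11434".
func nativeRoot(baseURL string) string {
	root := strings.TrimRight(baseURL, "/")
	return strings.TrimSuffix(root, "/v1")
}

func main() {
	fmt.Println(nativeRoot("http://localhost:11434/v1/"))
	fmt.Println(nativeRoot("http://localhost:11434/v1"))
	fmt.Println(nativeRoot("http://localhost:11434"))
}
```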

In `@internal/server/handlers.go`:
- Around line 234-236: The embeddings usage log is created without resolving
pricing so cost fields stay empty; update the call to
usage.ExtractFromEmbeddingResponse to pass the configured pricing resolver (e.g.
usage.ExtractFromEmbeddingResponse(resp, requestID, providerType,
"/v1/embeddings", h.usageLogger.Config().PricingResolver)) so the returned
usageEntry has pricing populated before writing with h.usageLogger; locate the
call around h.usageLogger and replace the 4-arg call with the 5-arg variant
using the logger's pricing resolver.

In `@internal/server/model_validation.go`:
- Around line 19-69: Add Go doc comments for the exported symbols
ModelValidation, GetProviderType, and ModelCtx: document ModelValidation as an
echo middleware that validates model interactions (reads and preserves request
body, peeks "model" from JSON, rejects missing/unsupported models via
core.NewInvalidRequestError, sets providerType in the context under
providerTypeKey, enriches auditlog entry, and attaches X-Request-ID into the
request context); document GetProviderType as returning the provider type string
stored on the echo.Context (or empty string if missing); and document ModelCtx
as returning the request context and the provider type string for handlers. Keep
comments concise, start with the function name, and describe parameters/returns
and observable side effects.

In `@Makefile`:
- Around line 15-18: Update the install-tools Makefile rule to use a portable
check and pinned versions: replace uses of which with command -v and avoid
installing golangci-lint@latest and unversioned pre-commit; instead pin specific
versions for golangci-lint and pre-commit in the install commands so installs
are reproducible (refer to the install-tools target and the golangci-lint and
pre-commit install commands).

---

Outside diff comments:
In `@internal/providers/factory.go`:
- Around line 80-89: After calling the constructor builder(cfg.APIKey, opts) in
Create, check whether the returned provider value p is nil and return a non-nil
error instead of proceeding to newProviderWrapper; specifically, after p :=
builder(cfg.APIKey, opts) add a nil-guard that returns an error (with context
including cfg.Type) when p == nil so you never call newProviderWrapper(nil,
cfg.Type); keep the existing BaseURL setter logic but only run it after the nil
check.

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ba735a4 and 0538bd0.

📒 Files selected for processing (18)
  • DEVELOPMENT.md
  • Makefile
  • README.md
  • internal/providers/anthropic/anthropic.go
  • internal/providers/factory.go
  • internal/providers/factory_test.go
  • internal/providers/gemini/gemini.go
  • internal/providers/groq/groq.go
  • internal/providers/ollama/ollama.go
  • internal/providers/ollama/ollama_test.go
  • internal/providers/openai/openai.go
  • internal/providers/wrapper.go
  • internal/providers/xai/xai.go
  • internal/server/handlers.go
  • internal/server/handlers_test.go
  • internal/server/http.go
  • internal/server/model_validation.go
  • internal/server/model_validation_test.go

giorgi-pro and others added 5 commits February 26, 2026 22:47
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/providers/ollama/ollama.go`:
- Around line 186-194: The json.Marshal error in the loop over
ollamaResp.Embeddings is being ignored; update the loop that builds
core.EmbeddingData (iterating ollamaResp.Embeddings) to capture the error (raw,
err := json.Marshal(emb)), and on error return or propagate a descriptive error
(or log and return) instead of discarding it so callers know embedding
serialization failed; ensure the change affects the function that constructs the
data slice (the block creating data := make([]core.EmbeddingData, ...)) and
consistently handles the error path.
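The error-propagating loop might look like this. `buildData` and the trimmed-down `EmbeddingData` type are illustrative stand-ins, not the gateway's exact definitions:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// EmbeddingData mirrors the shape of the core type; only the fields needed
// for this sketch are included.
type EmbeddingData struct {
	Index     int             `json:"index"`
	Embedding json.RawMessage `json:"embedding"`
}

// buildData shows the error path the review asks for: capture the
// json.Marshal error and propagate it with context instead of discarding it.
func buildData(embeddings [][]float64) ([]EmbeddingData, error) {
	data := make([]EmbeddingData, 0, len(embeddings))
	for i, emb := range embeddings {
		raw, err := json.Marshal(emb)
		if err != nil {
			return nil, fmt.Errorf("marshal embedding %d: %w", i, err)
		}
		data = append(data, EmbeddingData{Index: i, Embedding: raw})
	}
	return data, nil
}

func main() {
	data, err := buildData([][]float64{{0.1, 0.2}})
	fmt.Println(err == nil, string(data[0].Embedding)) // true [0.1,0.2]
}
```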

In `@internal/server/handlers_test.go`:
- Around line 513-604: The embeddings tests do not cover the branch where
usageLogger.Enabled is true and a pricingResolver is used, so add a new test
(e.g., TestEmbeddings_WithUsageAndPricing) that constructs a mockProvider and a
Handler with a usageLogger.Enabled=true and a mock pricingResolver that returns
known price data, send a valid embeddings POST to handler.Embeddings and assert
the response and/or logs include the populated usage/pricing fields (or assert
the pricingResolver mock was invoked) to ensure cost population is exercised;
reference the existing TestEmbeddings/TestEmbeddings_ProviderReturnsError setup
and use the same request shape and mockProvider embeddingResponse to locate
where to plug in usageLogger and pricingResolver.

In `@internal/server/handlers.go`:
- Around line 103-104: After switching to ModelCtx(c) the provider call contexts
lost the request correlation; retrieve the request ID from
c.Request().Header.Get("X-Request-ID") (as already done into requestID) and
re-wrap the context returned by ModelCtx(c) with core.WithRequestID(requestID)
before passing it to provider calls in ChatCompletion, Responses, and Embeddings
so provider-side calls regain per-request correlation (i.e., replace usages of
ctx from ModelCtx(c) with core.WithRequestID(ctx, requestID) when invoking the
provider).

In `@Makefile`:
- Around line 15-18: The Makefile's install-tools recipe uses backticked
prefixes "`@command`" and "`@echo`" which invoke command substitution and break
the target; update the install-tools target to remove backticks so the prefixes
are literal (use `@command` -v ... || (...) and `@echo` "All tools are ready")
ensuring the existing checks/install flows for golangci-lint and pre-commit run
as intended; look for the install-tools target and the occurrences of `@command`
and `@echo` to correct.

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0538bd0 and 5f4a3c6.

📒 Files selected for processing (11)
  • Makefile
  • internal/core/types.go
  • internal/providers/factory_test.go
  • internal/providers/gemini/gemini.go
  • internal/providers/ollama/ollama.go
  • internal/providers/ollama/ollama_test.go
  • internal/providers/router.go
  • internal/providers/router_test.go
  • internal/server/handlers.go
  • internal/server/handlers_test.go
  • internal/server/model_validation.go

@giorgi-pro giorgi-pro merged commit 07900ae into main Feb 28, 2026
12 checks passed
@giorgi-pro giorgi-pro deleted the me/gom-47-embedding-endpoints-anthropic-and-openai branch February 28, 2026 10:34
@coderabbitai coderabbitai bot mentioned this pull request Mar 8, 2026