
Fix: function calling#121

Merged
SantiagoDePolonia merged 13 commits into main from fix/function-calling
Mar 8, 2026

Conversation

@SantiagoDePolonia
Contributor

@SantiagoDePolonia SantiagoDePolonia commented Mar 6, 2026

Summary by CodeRabbit

  • New Features
    • Function-calling across providers with tool selection and tool_call tracking (unique IDs) and optional parallel tool execution.
  • API Enhancements
    • Chat and Responses APIs accept tool definitions, tool_choice, parallel_tool_calls, and surface function-call metadata and arguments; streaming now emits structured tool-call events and output item lifecycle events.
  • Documentation
    • Added "Chat Completion with Function Calling" example demonstrating a tool-based workflow (gpt-4o-mini).
  • Bug Fixes
    • Normalized finish/stop reasons across providers and preserved null/empty tool-result content.

@coderabbitai
Contributor

coderabbitai bot commented Mar 6, 2026

📝 Walkthrough


Adds end-to-end tool-calling (function definitions and invocations) across the stack: core request/response types, guardrails, provider adapters (Anthropic, OpenAI, Responses), streaming event plumbing, docs/examples, tests, and test fixtures to surface and stream tool_call metadata and function-call outputs.

Changes

- **Core types & responses** — `internal/core/types.go`, `internal/core/responses.go`, `internal/core/types_test.go`: Added `Tools`, `ToolChoice`, `ParallelToolCalls` to requests; added `ToolCallID`, `ContentNull`, and JSON marshal/unmarshal for `Message`; added `CallID`/`Name`/`Arguments` on Responses output items; `WithStreaming` returns a shallow copy preserving tool fields.
- **API docs & example** — `GETTING_STARTED.md`, `cmd/gomodel/docs/docs.go`, `cmd/gomodel/docs/swagger.json`: Inserted a "Chat Completion with Function Calling" example and updated the Swagger/OpenAPI schema to expose tool-related fields (`tools`, `tool_choice`, `parallel_tool_calls`, `tool_call_id`, `temperature`, etc.).
- **Guardrails** — `internal/guardrails/guardrails.go`, `internal/guardrails/provider.go`, `internal/guardrails/provider_test.go`: Extended the guardrails `Message` DTO with tool metadata (`ToolCalls`, `ToolCallID`, `ContentNull`), added a `Guardrail` interface, and preserved/propagated tool fields via adapters (`cloneToolCalls` helper).
- **Anthropic provider** — `internal/providers/anthropic/anthropic.go`, `internal/providers/anthropic/anthropic_test.go`: Major provider changes: accept/validate tools and tool_choice, support tool-related content blocks, parse/emit tool_call streaming deltas, new conversion/validation helpers, and extensive streaming/tool-call tests.
- **OpenAI provider** — `internal/providers/openai/openai.go`, `internal/providers/openai/openai_test.go`: Propagated `Tools`, `ToolChoice`, `ParallelToolCalls` into o-series requests; added tests asserting tool configuration is preserved for reasoning and non-reasoning flows.
- **Responses API & converters** — `internal/providers/responses_adapter.go`, `internal/providers/responses_converter.go`, `internal/providers/responses_output_state.go`, `internal/providers/responses_adapter_test.go`, `internal/providers/responses_converter_test.go`: Added `ConvertResponsesInputToMessages`, `BuildResponsesOutputItems`, ID helpers, rich parsing for `function_call`/`function_call_output` items, and streaming converter state for the tool_call lifecycle; implemented Responses output event state and SSE event rendering for tool-call items.
- **Streaming & output state** — `internal/providers/responses_output_state.go`, `internal/providers/responses_converter.go`: New `ResponsesOutputEventState` and `ResponsesOutputToolCallState` types; logic to reserve/start/complete assistant output and tool_call items and emit `response.output_item.added`/`done` and `function_call_arguments.*` events.
- **Tests, fixtures & golden data** — `tests/*`, `tests/contract/testdata/golden/*`, `tests/e2e/*`: Updated contract golden `finish_reason` values (e.g., `end_turn`/`tool_use` → `stop`/`tool_calls`), added numerous unit/integration/E2E tests for tool flows and streaming, and added mock-provider streaming tool responses and thread-safe request accessors.
- **Mock & test helpers** — `tests/e2e/mock_provider.go`: `MockLLMServer` gains `Requests()`/`ResetRequests()` and forced-tool streaming and non-streaming response paths for test-driven tool-call scenarios.

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant API as Gateway/API
    participant Guardrails
    participant Chat
    participant Provider
    participant ExtAPI as External/Tool

    Client->>API: Send ChatRequest (messages, tools, tool_choice)
    API->>Guardrails: Process(messages)
    Guardrails->>API: Return guarded messages (preserve ToolCalls, ToolCallID, ContentNull)
    API->>Chat: Convert guarded request -> core.ChatRequest
    Chat->>Provider: Adapt to provider request (include tools/tool_choice, parallel flag)
    Provider->>ExtAPI: Invoke external tool or include function schema
    ExtAPI-->>Provider: Return tool_call metadata / tool result
    Provider->>Chat: Convert provider response -> core.Message (ToolCalls, ToolCallID, ContentNull)
    Chat->>API: Assemble response (ResponsesOutputItems / stream events)
    API->>Client: Return final response (including tool_call items and function arguments)
```

Estimated Code Review Effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Poem

🐰 I hopped through bytes and streaming halls,

I stitched the calls and mended walls.
Tools now chatter, functions sing—
Weather found with a single ping.
🥕✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

- **Title check** ⚠️ Warning — The title "Fix: function calling" is too vague and generic; it does not say what is being fixed or what the main change accomplishes. Resolution: use a more descriptive title, such as "Add tool/function calling support to chat and responses APIs" or "Implement tool calling with parallel execution and streaming support".
- **Docstring coverage** ⚠️ Warning — Docstring coverage is 18.75%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.

✅ Passed checks (1 passed)

- **Description check** ✅ Passed — Check skipped; CodeRabbit's high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/function-calling

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

Copilot AI left a comment


Pull request overview

This PR adds comprehensive function/tool calling support to the AI gateway, ensuring that tool-related fields (tools, tool_choice, parallel_tool_calls, tool_call_id) are properly passed through for all providers across both the Chat Completions and Responses APIs, including streaming support.

Changes:

  • Extends core types (ChatRequest, ResponsesRequest, Message, ResponsesOutputItem) with tool-calling fields and adds custom Message.UnmarshalJSON to handle null content from tool-calling responses
  • Implements Anthropic-specific tool translation (OpenAI → Anthropic format for tools, tool_choice, parallel tool calls, and message content blocks) in both non-streaming and streaming paths, and normalizes Anthropic's tool_use stop reason to OpenAI's tool_calls
  • Adds Responses API input conversion for function_call and function_call_output items, plus streaming tool call event conversion for the Responses stream converters (both OpenAI-compatible and Anthropic-native)

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 1 comment.

Show a summary per file

- `internal/core/types.go` — Adds tool fields to `ChatRequest`, `ToolCallID` to `Message`, custom `UnmarshalJSON` for null content
- `internal/core/responses.go` — Adds `ToolChoice`, `ParallelToolCalls` to `ResponsesRequest`; `CallID`, `Name`, `Arguments` to `ResponsesOutputItem`
- `internal/core/types_test.go` — Tests for null-content unmarshaling and tool-field preservation in `WithStreaming()`
- `internal/providers/openai/openai.go` — Adds tool fields to `oSeriesChatRequest` and `adaptForOSeries`
- `internal/providers/openai/openai_test.go` — Tests tool-field passthrough for both regular and reasoning models
- `internal/providers/anthropic/anthropic.go` — Full Anthropic tool support: type conversions, streaming tool calls, stop-reason normalization
- `internal/providers/anthropic/anthropic_test.go` — Comprehensive tests for Anthropic tool conversions, streaming, and responses
- `internal/providers/responses_adapter.go` — Adds Responses→Chat conversion for tool fields and `function_call`/`function_call_output` input items
- `internal/providers/responses_adapter_test.go` — Tests for tool configuration, parallel calls, function-call loop items, and response conversion
- `internal/providers/responses_converter.go` — Streaming tool-call handling for the OpenAI-compatible Responses stream converter
- `internal/providers/responses_converter_test.go` — Test for streaming tool-call events in Responses format
- `internal/guardrails/guardrails.go` — Adds `ToolCalls` and `ToolCallID` to the guardrails `Message` DTO
- `internal/guardrails/provider.go` — Preserves tool-call fields through the guardrail pipeline
- `internal/guardrails/provider_test.go` — Tests tool-field preservation through guardrails
- `tests/e2e/mock_provider.go` — Mock-server tool-call response support, request-recording helpers
- `tests/e2e/chat_test.go` — E2E tests for function calling and tool-result messages
- `tests/contract/anthropic_test.go` — Updates expected `finish_reason` from `tool_use` to `tool_calls`
- `tests/contract/testdata/golden/anthropic/messages_with_tools.golden.json` — Golden-file update for normalized `finish_reason`
- `cmd/gomodel/docs/swagger.json` — Swagger schema updates for tool fields
- `cmd/gomodel/docs/docs.go` — Swagger template updates for tool fields
- `README.md` — Documents tool-calling passthrough capability
- `GETTING_STARTED.md` — Adds function-calling curl example


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 10

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cmd/gomodel/docs/swagger.json`:
- Around line 1796-1798: The OpenAPI schema currently declares the tool_choice
property as type: object but runtime accepts both strings like "auto" and
objects; update the schema for the tool_choice property (both occurrences) to
allow string-or-object values by replacing the single type with a union (e.g.,
type: ["string","object"] or an anyOf/oneOf combining {type: "string"} and
{type: "object"}) so generated SDKs/validators accept both forms; ensure the
same change is applied to the second occurrence of tool_choice referenced in the
spec.
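The suggested union can be sketched as follows, using OpenAPI 3.x `anyOf` syntax (the description text here is illustrative). Note that `anyOf` is only valid in OpenAPI 3.x; a document that still declares `"swagger": "2.0"` cannot use it and would need a typeless schema instead:

```json
{
  "tool_choice": {
    "description": "Controls which tool the model calls: a string such as \"auto\" or a function-selection object.",
    "anyOf": [
      { "type": "string" },
      { "type": "object" }
    ]
  }
}
```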

In `@internal/guardrails/provider.go`:
- Around line 316-321: The Message construction copies the ToolCalls slice
header only, risking shared-mutability; in the adapter(s) that build messages
(the code assigning Message{Role:, Content:, ToolCalls:, ToolCallID:} and the
other similar adapter) clone m.ToolCalls before assigning it (e.g., assign a new
slice copy via append([]ToolCall(nil), m.ToolCalls...) or allocate and copy) so
each Message gets its own backing array and race/mutation issues are eliminated.

In `@internal/providers/anthropic/anthropic_test.go`:
- Around line 892-927: The test
TestConvertFromAnthropicResponse_WithToolUseStopReason only asserts FinishReason
and ToolCalls[0].ID, which can miss regressions that drop the tool name or
arguments; update the test that calls convertFromAnthropicResponse to also
assert the converted tool call's Name (e.g.,
result.Choices[0].Message.ToolCalls[0].Name or Function.Name) and its
Arguments/Inputs (e.g., result.Choices[0].Message.ToolCalls[0].Arguments or
Function.Arguments contain the original {"city":"Warsaw"}), so the non-streaming
tool-use payload is fully validated.

In `@internal/providers/anthropic/anthropic.go`:
- Around line 465-475: The parseToolCallArguments function currently swallows
JSON unmarshal errors and returns an empty object, which changes the tool
invocation; change parseToolCallArguments to return (any, error) and stop
returning a default {} on parse failure so callers can detect malformed JSON;
update callers (the request conversion flow that calls parseToolCallArguments)
to check the error and return an invalid-request error (or propagate the parse
error) instead of substituting {}; keep the function name parseToolCallArguments
and ensure all call sites are updated to handle the returned error and fail fast
on malformed tool_call.function.arguments.
- Around line 1032-1058: The streaming path currently assigns tool-call
output_index values before reserving any assistant message item, causing
streamed indexes to drift versus the final convertAnthropicResponseToResponses
output; modify the streaming logic to emit or reserve the assistant message
output item (same shape as the non-stream branch: a ResponsesOutputItem with
Type "message", Role "assistant", and Content containing an "output_text") when
content != "" (or when there are zero toolCalls) before assigning tool-call
indexes so that nextOutputIndex and later calls to
providers.ResponsesFunctionCallCallID / providers.ResponsesFunctionCallItemID
produce the same ordering as the final output; ensure the reserved item
increments nextOutputIndex consistently and is emitted/serialized in the stream
in the same position convertAnthropicResponseToResponses will place it.
- Around line 757-762: normalizeAnthropicStopReason currently only maps
"tool_use" and lets Anthropic-specific values leak into finish_reason; update
normalizeAnthropicStopReason to map all known Anthropic stop reasons to
OpenAI-compatible values (at minimum: "tool_use" -> "tool_calls", "end_turn" ->
"stop", "stop_sequence" -> "stop", "max_tokens" -> "length") and leave unknown
values unchanged, then ensure convertFromAnthropicResponse and the streaming
path both call this updated normalizeAnthropicStopReason so finish_reason is
always OpenAI-compatible.

In `@internal/providers/responses_adapter.go`:
- Around line 53-67: The ConvertResponsesInputToMessages function currently
ignores inputs of type []map[string]any because that concrete slice type doesn't
match the []interface{} case; add an explicit branch for []map[string]any before
the []interface{} case (e.g., case []map[string]any:), convert each map element
to interface{} (or directly build core.Message items) and pass them into
convertResponsesInputItems (or replicate its logic) so such inputs produce the
expected []core.Message; reference ConvertResponsesInputToMessages and
convertResponsesInputItems when making the change.
- Around line 24-33: The ChatRequest construction in ResponsesViaChat is missing
the new fields so req.Reasoning and req.StreamOptions are dropped; update the
core.ChatRequest literal in ResponsesViaChat to set Reasoning: req.Reasoning and
StreamOptions: req.StreamOptions (ensure the names/types match
core.ChatRequest's fields) so reasoning effort and streamed-response options
(e.g., include_usage) are preserved when forwarding to provider implementations.

In `@internal/providers/responses_converter.go`:
- Around line 56-63: The writeEvent method on OpenAIResponsesStreamConverter
currently swallows JSON marshal failures and returns an empty string, which
allows callers to unknowingly append empty data; change writeEvent to return
both the formatted string and an error (e.g., (string, error)) so callers can
handle failures, update callers to check the error before appending, and
propagate or log the error appropriately; keep the existing slog.Error call but
surface the error to callers by returning it from writeEvent (update function
signature and all call sites that use writeEvent).

In `@tests/e2e/mock_provider.go`:
- Around line 156-191: The streaming branch currently returns before handling
forced tool calls, so when forcedToolName(req) != "" the SSE path still emits
plain text chunks; add a new helper (e.g., handleStreamingToolResponse)
alongside handleStreamingResponse that mirrors the non-streaming tool-call
response: stream JSON deltas for Message.ToolCalls (emitting incremental
tool-call argument chunks or a single tool_call delta), ensure the final chunk
sets finish_reason to "tool_calls" and includes the same core.ToolCall structure
(ID, Type="function", Function{Name: toolName, Arguments}), and update the
streaming branch to call handleStreamingToolResponse(req, w, toolName) when
forcedToolName(req) returns a name so streaming tests receive proper tool-call
events.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2dad9a38-e319-48ef-9b00-8676420ef1fb

📥 Commits

Reviewing files that changed from the base of the PR and between 8149ab0 and f8b9675.

📒 Files selected for processing (22)
  • GETTING_STARTED.md
  • README.md
  • cmd/gomodel/docs/docs.go
  • cmd/gomodel/docs/swagger.json
  • internal/core/responses.go
  • internal/core/types.go
  • internal/core/types_test.go
  • internal/guardrails/guardrails.go
  • internal/guardrails/provider.go
  • internal/guardrails/provider_test.go
  • internal/providers/anthropic/anthropic.go
  • internal/providers/anthropic/anthropic_test.go
  • internal/providers/openai/openai.go
  • internal/providers/openai/openai_test.go
  • internal/providers/responses_adapter.go
  • internal/providers/responses_adapter_test.go
  • internal/providers/responses_converter.go
  • internal/providers/responses_converter_test.go
  • tests/contract/anthropic_test.go
  • tests/contract/testdata/golden/anthropic/messages_with_tools.golden.json
  • tests/e2e/chat_test.go
  • tests/e2e/mock_provider.go

Comment on lines +56 to +63
```go
func (sc *OpenAIResponsesStreamConverter) writeEvent(eventName string, payload map[string]interface{}) string {
	jsonData, err := json.Marshal(payload)
	if err != nil {
		slog.Error("failed to marshal responses stream event", "error", err, "event", eventName, "response_id", sc.responseID)
		return ""
	}
	return fmt.Sprintf("event: %s\ndata: %s\n\n", eventName, jsonData)
}
```
Contributor


🧹 Nitpick | 🔵 Trivial

Consider returning an error indicator for marshalling failures.

writeEvent silently returns an empty string on marshalling errors. While it logs the error, callers may inadvertently append empty strings to the buffer. This is acceptable for resilience but worth noting.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/responses_converter.go` around lines 56 - 63, The
writeEvent method on OpenAIResponsesStreamConverter currently swallows JSON
marshal failures and returns an empty string, which allows callers to
unknowingly append empty data; change writeEvent to return both the formatted
string and an error (e.g., (string, error)) so callers can handle failures,
update callers to check the error before appending, and propagate or log the
error appropriately; keep the existing slog.Error call but surface the error to
callers by returning it from writeEvent (update function signature and all call
sites that use writeEvent).

@SantiagoDePolonia
Contributor Author

@coderabbitai review

@coderabbitai
Contributor

coderabbitai bot commented Mar 6, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 28 out of 28 changed files in this pull request and generated no new comments.



@SantiagoDePolonia
Contributor Author

@coderabbitai review

@coderabbitai
Contributor

coderabbitai bot commented Mar 6, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/providers/responses_adapter.go (1)

178-196: ⚠️ Potential issue | 🟠 Major

Handle nested []map[string]any content arrays here as well.

ConvertResponsesInputToMessages now accepts top-level []map[string]any, but ExtractContentFromInput still only recognizes []interface{} and []core.ResponsesContentPart. A native Go request like []map[string]any{{"role":"user","content":[]map[string]any{{"type":"input_text","text":"Hello"}}}} will therefore extract "" here, and convertResponsesInputMap drops the whole message.

🩹 Suggested fix

```diff
 func ExtractContentFromInput(content interface{}) string {
 	switch c := content.(type) {
 	case string:
 		return c
+	case []map[string]any:
+		var texts []string
+		for _, part := range c {
+			if text, _ := part["text"].(string); text != "" {
+				texts = append(texts, text)
+			}
+		}
+		return strings.Join(texts, " ")
 	case []core.ResponsesContentPart:
 		var texts []string
 		for _, part := range c {
 			if part.Text != "" {
 				texts = append(texts, part.Text)
```
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/responses_adapter.go` around lines 178 - 196,
ExtractContentFromInput currently handles []core.ResponsesContentPart and
[]interface{} but not a top-level []map[string]any, so nested native Go payloads
like []map[string]any with content arrays return empty; add a new case in
ExtractContentFromInput for []map[string]any that mirrors the []interface{}
branch: iterate each map, check for a "text" string and append it, and if a
"content" field exists handle nested []map[string]any or []interface{}
recursively (or reuse the existing extraction logic), ensuring
convertResponsesInputMap and ExtractContentFromInput both accept and correctly
parse []map[string]any content arrays so messages with nested content produce
the expected text.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cmd/gomodel/docs/swagger.json`:
- Around line 1796-1805: The swagger.json currently uses the OpenAPI 3.x-only
"oneOf" for the tool_choice schema (symbol: "tool_choice") while the document
declares "swagger": "2.0"; replace the oneOf usage with a Swagger 2.0-compatible
approach (either change the tool_choice schema to a typeless schema (no "type")
and expand the description to note "string or object", or migrate the overall
spec to OpenAPI 3.x and remove the "swagger": "2.0" header), update the source
annotation that generated this schema so the generator emits the corrected
definition, and then regenerate the generated artifacts
(cmd/gomodel/docs/swagger.json and cmd/gomodel/docs/docs.go) to reflect the fix.

In `@internal/providers/anthropic/anthropic_test.go`:
- Around line 1614-1627: The SSE fixture in anthropic_test.go currently mixes a
preamble text_delta with a tool_use block both using index:0; fix it by making
the preamble its own content block (add a content_block_start and
content_block_stop wrapping the initial text_delta with index:0) and bump the
tool_use block and its input_json_delta events to index:1 so each content block
has a unique index; update the corresponding test assertions in the same test to
reflect the two-block structure (preamble block then tool_use block) per the
other Anthropic SSE fixtures.
- Around line 363-364: The test currently sets foundArgumentsDelta true when any
fragment of function.arguments arrives, allowing partial data to pass; update
the test in anthropic_test.go to accumulate/buffer all emitted
function.arguments fragments (e.g., collect the string parts emitted for the
"function.arguments" role) and only assert equality against the full expected
JSON ({"city":"Warsaw"}) after the stream completes (when foundFinish is true),
replacing the single-fragment flip logic (foundArgumentsDelta) with a
join-and-compare check; apply the same buffering-and-join fix to the other
occurrences flagged (the similar assertions around the blocks noted at the other
locations).
- Around line 845-874: Update the two tests (e.g.,
TestConvertToAnthropicRequest_InvalidToolArguments and its companion around
lines 1918-1940) to also assert the HTTP mapping: after confirming the error is
a *core.GatewayError and its Type is core.ErrorTypeInvalidRequest, call
gatewayErr.HTTPStatusCode() and assert it equals http.StatusBadRequest; keep the
existing checks for error type and use the same gatewayErr variable for this new
assertion so the test fails if the invalid_request_error is mapped to the wrong
HTTP status.

In `@internal/providers/anthropic/anthropic.go`:
- Around line 374-452: The helpers convertOpenAIToolsToAnthropic and
convertOpenAIToolChoiceToAnthropic must fail fast on malformed inputs instead of
silently continuing; change their signatures to return an error (e.g.,
([]anthropicTool, error) and (*anthropicToolChoice, bool, error) or equivalent)
and validate strictly: in convertOpenAIToolsToAnthropic, treat missing/empty
function name or a non-object/invalid "parameters" as an invalid_request_error
and return an error immediately rather than continuing/using a default
inputSchema; in convertOpenAIToolChoiceToAnthropic, treat unknown/unsupported
choice types or a malformed function object (missing name) as
invalid_request_error and return an error instead of mapping to nil/auto/any;
update callers to handle the new error returns and propagate a proper
invalid_request_error to the client.
- Around line 478-486: In buildAnthropicMessageContent, validate that when
msg.Role == "tool" the msg.ToolCallID is non-empty and return an error if it is
missing instead of emitting a tool_result block; update the logic around
creation of the anthropicMessageContentBlock (the tool_result block and its
ToolUseID/ToolCallID usage) to check msg.ToolCallID (tool_call_id) and fail with
a clear invalid-request error when empty so uncorrelatable tool results are
rejected locally.

In `@tests/e2e/mock_provider.go`:
- Around line 39-60: Add Go doc comments for the exported methods on
MockLLMServer: document Requests to explain it returns a thread-safe
copy/snapshot of recorded requests (including that headers and body are cloned)
and document ResetRequests to explain it clears the server's recorded requests
in a thread-safe manner; place the comments immediately above the Requests and
ResetRequests method declarations and follow Go doc style (start with the method
name).

---

Outside diff comments:
In `@internal/providers/responses_adapter.go`:
- Around line 178-196: ExtractContentFromInput currently handles
[]core.ResponsesContentPart and []interface{} but not a top-level
[]map[string]any, so nested native Go payloads like []map[string]any with
content arrays return empty; add a new case in ExtractContentFromInput for
[]map[string]any that mirrors the []interface{} branch: iterate each map, check
for a "text" string and append it, and if a "content" field exists handle nested
[]map[string]any or []interface{} recursively (or reuse the existing extraction
logic), ensuring convertResponsesInputMap and ExtractContentFromInput both
accept and correctly parse []map[string]any content arrays so messages with
nested content produce the expected text.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 3b67643e-8f56-4364-95b7-3bc75e16e8b8

📥 Commits

Reviewing files that changed from the base of the PR and between f8b9675 and 026895e.

📒 Files selected for processing (16)
  • cmd/gomodel/docs/docs.go
  • cmd/gomodel/docs/swagger.json
  • internal/guardrails/provider.go
  • internal/guardrails/provider_test.go
  • internal/providers/anthropic/anthropic.go
  • internal/providers/anthropic/anthropic_test.go
  • internal/providers/responses_adapter.go
  • internal/providers/responses_adapter_test.go
  • tests/contract/testdata/golden/anthropic/messages.golden.json
  • tests/contract/testdata/golden/anthropic/messages_extended_thinking.golden.json
  • tests/contract/testdata/golden/anthropic/messages_multi_turn.golden.json
  • tests/contract/testdata/golden/anthropic/messages_multimodal.golden.json
  • tests/contract/testdata/golden/anthropic/messages_stream.golden.json
  • tests/contract/testdata/golden/anthropic/messages_with_params.golden.json
  • tests/e2e/chat_test.go
  • tests/e2e/mock_provider.go

Comment on lines +71 to +122
```go
t.Run("function calling preserves tools and tool_choice", func(t *testing.T) {
	mockServer.ResetRequests()

	parallelToolCalls := false
	payload := core.ChatRequest{
		Model: "gpt-4",
		Messages: []core.Message{
			{Role: "user", Content: "What's the weather in Warsaw?"},
		},
		Tools: []map[string]any{
			{
				"type": "function",
				"function": map[string]any{
					"name":        "lookup_weather",
					"description": "Get the weather for a city.",
					"parameters": map[string]any{
						"type": "object",
						"properties": map[string]any{
							"city": map[string]any{"type": "string"},
						},
						"required": []string{"city"},
					},
				},
			},
		},
		ToolChoice:        map[string]any{"type": "function", "function": map[string]any{"name": "lookup_weather"}},
		ParallelToolCalls: &parallelToolCalls,
	}

	resp := sendChatRequest(t, payload)
	defer closeBody(resp)

	require.Equal(t, http.StatusOK, resp.StatusCode)

	var chatResp core.ChatResponse
	require.NoError(t, json.NewDecoder(resp.Body).Decode(&chatResp))

	require.Len(t, chatResp.Choices, 1)
	assert.Equal(t, "tool_calls", chatResp.Choices[0].FinishReason)
	require.Len(t, chatResp.Choices[0].Message.ToolCalls, 1)
	assert.Equal(t, "lookup_weather", chatResp.Choices[0].Message.ToolCalls[0].Function.Name)

	requests := mockServer.Requests()
	require.NotEmpty(t, requests)

	var upstream core.ChatRequest
	require.NoError(t, json.Unmarshal(requests[len(requests)-1].Body, &upstream))
	require.Len(t, upstream.Tools, 1)
	require.NotNil(t, upstream.ToolChoice)
	require.NotNil(t, upstream.ParallelToolCalls)
	assert.False(t, *upstream.ParallelToolCalls)
})
```
Contributor


⚠️ Potential issue | 🟡 Minor

Add coverage for tool_choice: "required" too.

These cases only exercise the explicit function-object form, but forcedToolName also added a separate "required" branch that selects the first advertised tool. That new behavior is still untested in both the non-streaming and streaming paths.

As per coding guidelines **/*_test.go: Add or update tests for any behavior change.

Also applies to: 242-295

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 8

♻️ Duplicate comments (1)
internal/providers/anthropic/anthropic_test.go (1)

910-939: ⚠️ Potential issue | 🟡 Minor

Also assert HTTP 400 in the remaining invalid-request tests.

These cases stop at gatewayErr.Type. If the conversion keeps returning invalid_request_error but starts mapping it to the wrong HTTP status, the suite will miss it. Add gatewayErr.HTTPStatusCode() == http.StatusBadRequest in all three tests.

💡 Suggested assertion
 	if gatewayErr.Type != core.ErrorTypeInvalidRequest {
 		t.Fatalf("error type = %q, want invalid_request_error", gatewayErr.Type)
 	}
+	if gatewayErr.HTTPStatusCode() != http.StatusBadRequest {
+		t.Fatalf("HTTPStatusCode() = %d, want %d", gatewayErr.HTTPStatusCode(), http.StatusBadRequest)
+	}

As per coding guidelines, Use typed gateway errors (core.GatewayError) and map upstream failures correctly and Add or update tests for any behavior change.

Also applies to: 941-958, 2083-2105

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/anthropic/anthropic_test.go` around lines 910 - 939, The
test TestConvertToAnthropicRequest_InvalidToolChoice only asserts the error type
but not the mapped HTTP status; update this and the two similar tests to also
assert that the returned *core.GatewayError (gatewayErr) maps to HTTP 400 by
calling gatewayErr.HTTPStatusCode() and comparing it to http.StatusBadRequest,
keeping the existing checks for errors.As(err, &gatewayErr) and gatewayErr.Type
== core.ErrorTypeInvalidRequest; apply the same addition to the other two tests
mentioned (lines ~941-958 and ~2083-2105) so all invalid-request tests verify
both the error type and HTTP status mapping.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/core/responses.go`:
- Around line 10-11: Replace the untyped payloads in internal/core/responses.go
by introducing shared, strongly-typed models (e.g., ToolDefinition and
ToolChoice) in the internal/core package and using them instead of
[]map[string]any and any for the Tools and ToolChoice fields; add appropriate
JSON tags and fields on those structs to cover the current dynamic keys, update
the Tools field signature to []ToolDefinition and ToolChoice to ToolChoice (or
*ToolChoice) in responses.go, and also update the ChatRequest type to reference
the same shared ToolDefinition/ToolChoice types so validation, serialization,
and schema generation are consistent across requests and responses.

In `@internal/core/types.go`:
- Around line 56-59: The Message struct's Content field must be nullable so a
decoded assistant tool-call with content: null isn't turned into an empty
string; change Message.Content from string to *string (or introduce a nullable
type) and update UnmarshalJSON/MarshalJSON so Unmarshal preserves nil when
content was null and Marshal emits JSON null for Messages that originally had
null content (especially for messages with ToolCalls/ToolCallID); also update
callers that read Message.Content (for example
internal/providers/openai/openai.go and any logic that serializes req.Messages)
to handle the pointer/nullable type accordingly.

In `@internal/providers/anthropic/anthropic.go`:
- Around line 1281-1285: The conversion failure handling in
convertToAnthropicRequest currently discards the original GatewayError and
replaces it with a generic core.NewInvalidRequestError for each batch item;
instead, when err is returned from convertToAnthropicRequest (and similarly in
the other branch around the second conversion call), preserve the original typed
error (core.GatewayError) and return it with the batch-item index prefixed to
its message: detect if err is a core.GatewayError and wrap or annotate it to add
"batch item %d: " while keeping the original error type and message (do not
replace it with a new generic InvalidRequestError).
- Around line 553-568: The code currently drops the incoming tool_choice when
tool conversion yields no tools; instead, detect when
convertOpenAIToolChoiceToAnthropic returns a non-nil toolChoice (or the original
request had ToolChoice set) but anthropicReq.Tools is nil/empty and return a 400
validation error; update both convertToAnthropicRequest and
convertResponsesRequestToAnthropic to check after convertOpenAIToolsToAnthropic
and convertOpenAIToolChoiceToAnthropic: if toolChoice != nil (or req.ToolChoice
!= nil) && len(anthropicReq.Tools) == 0 then return an HTTP 400-style error
(with a clear message about tool_choice requiring at least one tool) rather than
clearing ToolChoice or proceeding, and keep the existing logic for
applyParallelToolCalls, ParallelToolCalls, and disableTools unchanged otherwise.
- Around line 482-492: The parseToolCallArguments helper currently accepts any
JSON value; after unmarshalling in parseToolCallArguments, verify that the
parsed value is a JSON object (i.e., assert it is a map[string]any /
map[string]interface{}); if it is not (e.g., slice, string, number, null),
return a descriptive error indicating tool arguments must be a JSON object; keep
the existing behavior that an empty/whitespace arguments string returns an empty
map.

In `@internal/providers/responses_adapter.go`:
- Around line 140-158: The function_call branch can emit an empty ToolCall.ID
when neither "call_id" nor "id" exist; update the branch in responses_adapter.go
(the case "function_call" block that reads name and callID) to normalize callID
before building the core.Message by replacing an empty callID with the
ResponsesFunctionCallCallID constant (or alternatively return false if you
prefer rejecting blank ids), then use that normalized callID for ToolCall.ID so
downstream providers always receive a stable id.

In `@internal/providers/responses_converter_test.go`:
- Around line 52-55: The test currently sets foundArgumentsDelta true if either
fragment matches, which allows missing halves to pass; update the assertion in
responses_converter_test.go to verify both streamed JSON fragments are received
in order (or assert the exact concatenated delta sequence) instead of a single
fragment: replace the single boolean foundArgumentsDelta check with either a
counter that increments when event.Payload["delta"] equals "{\"city\":\"War" and
when it equals "saw\"}" and assert count==2, or capture the sequence of observed
deltas and assert it equals the expected slice/concatenated string; ensure you
update any final expectations in the test accordingly (look for the case
handling "response.function_call_arguments.delta" and the foundArgumentsDelta
variable).

In `@tests/e2e/mock_provider.go`:
- Around line 297-307: The mock is emitting delta.tool_calls as []core.ToolCall
(see firstChunk and toolCall) but the streaming protocol requires each tool call
object to include an index to correlate incremental chunks; change the
construction of delta.tool_calls in firstChunk to emit a slice of objects that
include an "index" integer and the tool-call payload (instead of marshaling
core.ToolCall directly), e.g. build []map[string]interface{} entries with
"index": <seq> and the tool call fields, ensuring the same sequencing logic used
by the stream assembler is reproduced so tests exercise index-based
accumulation.

---

Duplicate comments:
In `@internal/providers/anthropic/anthropic_test.go`:
- Around line 910-939: The test TestConvertToAnthropicRequest_InvalidToolChoice
only asserts the error type but not the mapped HTTP status; update this and the
two similar tests to also assert that the returned *core.GatewayError
(gatewayErr) maps to HTTP 400 by calling gatewayErr.HTTPStatusCode() and
comparing it to http.StatusBadRequest, keeping the existing checks for
errors.As(err, &gatewayErr) and gatewayErr.Type == core.ErrorTypeInvalidRequest;
apply the same addition to the other two tests mentioned (lines ~941-958 and
~2083-2105) so all invalid-request tests verify both the error type and HTTP
status mapping.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: d1e75279-ae9e-46cf-a198-187e76e6ccc2

📥 Commits

Reviewing files that changed from the base of the PR and between 026895e and 5dcdaca.

📒 Files selected for processing (11)
  • cmd/gomodel/docs/docs.go
  • cmd/gomodel/docs/swagger.json
  • internal/core/responses.go
  • internal/core/types.go
  • internal/providers/anthropic/anthropic.go
  • internal/providers/anthropic/anthropic_test.go
  • internal/providers/responses_adapter.go
  • internal/providers/responses_adapter_test.go
  • internal/providers/responses_converter.go
  • internal/providers/responses_converter_test.go
  • tests/e2e/mock_provider.go

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/core/types.go`:
- Around line 59-63: The guardrails pipeline drops the Message.ContentNull flag
causing null content to become an empty string; add a ContentNull bool field to
guardrails.Message (in internal/guardrails/guardrails.go), update chatToMessages
to copy the ContentNull from core.Message into the guardrails DTO, and update
applyMessagesToChat to read ContentNull from the guardrails DTO and set it on
the reconstructed core.Message (the core.Message constructed in
applyMessagesToChat must explicitly set ContentNull). Ensure
ToolCalls/ToolCallID/Role/Content handling remains unchanged while propagating
ContentNull through these adapters.

In `@internal/providers/anthropic/anthropic.go`:
- Around line 524-534: Normalize and validate tool call ID and function name
before creating anthropicMessageContentBlock: trim whitespace from toolCall.ID
and toolCall.Function.Name, and if either is empty after trimming apply the same
fallback/normalization used in responses_adapter.go (e.g., generate or
canonicalize an ID and replace empty names with a safe placeholder) so that the
block created in the loop that calls parseToolCallArguments and constructs
anthropicMessageContentBlock uses validated ID and Name values and preserves
correlation with tool_result messages.
- Around line 871-889: The initial tool-call emission is leaking Anthropic's
"{}" placeholder by writing and emitting initialArguments from
event.ContentBlock.Input; change the logic in the flow that builds the first
chunk (around extractInitialToolArguments, state.PlaceholderObject,
state.Arguments.WriteString, sc.toolCalls and sc.formatChatChunk) so that if
initialArguments == "{}" you do not append or emit that placeholder: set
state.PlaceholderObject = true but emit arguments as empty string (or omit the
arguments field) in the first chunk, and only start writing/emitting real
argument content when input_json_delta events provide real JSON fragments
(handle input_json_delta to write into state.Arguments and clear
PlaceholderObject when first non-placeholder fragment arrives); ensure the same
fix is applied to the responses path that builds tool_calls to prevent "{}"
being sent initially.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 994c6a8c-3843-4296-8059-e4f4597db00b

📥 Commits

Reviewing files that changed from the base of the PR and between 5dcdaca and a3bf895.

📒 Files selected for processing (8)
  • internal/core/types.go
  • internal/core/types_test.go
  • internal/providers/anthropic/anthropic.go
  • internal/providers/anthropic/anthropic_test.go
  • internal/providers/responses_adapter.go
  • internal/providers/responses_adapter_test.go
  • internal/providers/responses_converter_test.go
  • tests/e2e/mock_provider.go

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/guardrails/provider.go`:
- Around line 331-336: applyMessagesToChat currently copies
core.Message.ContentNull verbatim, so if a guardrail transforms an assistant
tool-call from content:null to real text the resulting core.Message still has
ContentNull=true and will marshal as null; update the construction of
core.Message (the block that sets Role, Content, ToolCalls, ToolCallID,
ContentNull) so that ContentNull is cleared (set to false) whenever Content is
non-empty/non-nil (i.e., after applying guardrail mutations), ensuring that when
m.Content has real text you override ContentNull before returning the message;
reference applyMessagesToChat and core.Message/ContentNull to locate where to
change this behavior.

In `@internal/providers/anthropic/anthropic.go`:
- Around line 876-883: The code suppresses the initial "{}" placeholder by
setting emittedArguments = "" when extractInitialToolArguments returns "{}",
which prevents emitting the real empty object if no input_json_delta arrives;
change the logic so that when initialArguments == "{}", set
state.PlaceholderObject = true but set emittedArguments = "{}" (and still avoid
writing it into state.Arguments), so clients receive the empty JSON object for
zero-arg tool calls; apply the same change to the other occurrence that mirrors
this behavior (the block around the second occurrence currently at the 939-940
equivalent).
- Around line 500-509: The tool-result branch returns
anthropicMessageContentBlock with ToolUseID set to raw msg.ToolCallID, causing
mismatches with normalized tool_use.id; apply the same trim/canonicalize logic
used earlier for assistant tool-call IDs (the normalization in the nearby block
that processes assistant tool-call IDs around lines 525-537) to msg.ToolCallID
before assigning it to the ToolUseID field so tool_result.tool_use_id matches
tool_use.id.
- Around line 489-496: Replace the json.Unmarshal call that decodes into parsed
with a json.Decoder that has UseNumber enabled so numeric values are preserved
as json.Number; specifically, create a json.NewDecoder over the trimmed string,
call decoder.UseNumber(), decode into the existing parsed variable (any) and
keep the subsequent map[string]any type check and error handling the same so
integers beyond 2^53 aren't coerced to float64.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 743b6f0d-a215-4fbc-b743-3193290b3bb3

📥 Commits

Reviewing files that changed from the base of the PR and between a3bf895 and 88ee41c.

📒 Files selected for processing (6)
  • internal/guardrails/guardrails.go
  • internal/guardrails/provider.go
  • internal/guardrails/provider_test.go
  • internal/providers/anthropic/anthropic.go
  • internal/providers/anthropic/anthropic_test.go
  • tests/contract/testdata/golden/openai/chat_with_tools.golden.json

@SantiagoDePolonia SantiagoDePolonia self-assigned this Mar 7, 2026
Copilot AI left a comment

Pull request overview

Copilot reviewed 28 out of 28 changed files in this pull request and generated 1 comment.



}
return strings.Join(texts, " ")
func normalizeAnthropicRequestError(err error) error {
if gatewayErr, ok := err.(*core.GatewayError); ok {
Copilot AI Mar 7, 2026

normalizeAnthropicRequestError uses a direct type assertion (err.(*core.GatewayError)) while prefixAnthropicBatchItemError on line 1082 uses errors.As(err, &gatewayErr). For consistency and to be safe against future wrapped errors, consider using errors.As here as well.

Suggested change
if gatewayErr, ok := err.(*core.GatewayError); ok {
var gatewayErr *core.GatewayError
if errors.As(err, &gatewayErr) {

SantiagoDePolonia and others added 2 commits March 8, 2026 10:52
- Make ContentNull defensive: only emit null when Content is empty
- Replace anonymous IIFE role mapping with simple if/else in Anthropic
- Export and deduplicate BuildResponsesOutputItems across providers
- Rename parseResponsesConverterTestEvents to parseTestSSEEvents for consistency
- Replace sort.Ints with slices.Sort

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
internal/providers/anthropic/anthropic.go (2)

1671-1690: 🧹 Nitpick | 🔵 Trivial

Minor: Inconsistent use of interface{} vs any.

This block uses map[string]interface{} while most of the file uses map[string]any. Both are equivalent in Go 1.18+, but any is preferred in newer code for consistency.

♻️ Suggested change for consistency
-responseData := map[string]interface{}{
+responseData := map[string]any{
 	"id":         sc.responseID,
 	"object":     "response",
 	"status":     "completed",
 	...
 }

Similarly for other occurrences in the streaming code (e.g., lines 1864, 1914).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/anthropic/anthropic.go` around lines 1671 - 1690, Replace
the inconsistent use of map[string]interface{} with map[string]any in the
streaming response construction to match file-wide style: update the
responseData and doneEvent declarations (and any other occurrences in the same
streaming block, e.g., where cachedUsage is used) to use map[string]any so
symbols like responseData, doneEvent, sc.cachedUsage, sc.responseID, and
sc.model remain the same but types are consistent with the rest of the file.

1145-1158: 🧹 Nitpick | 🔵 Trivial

Consider using UseNumber() for consistent numeric precision handling.

parseToolCallArguments uses json.Decoder.UseNumber() to preserve precision for large integers, but this extraction path uses plain json.Unmarshal. The round-trip through Unmarshal and Marshal can lose precision for integers beyond 2⁵³.

♻️ Suggested improvement for consistency
 arguments := "{}"
 if len(b.Input) > 0 {
-	var parsed any
-	if err := json.Unmarshal(b.Input, &parsed); err == nil {
-		if canonical, err := json.Marshal(parsed); err == nil {
-			arguments = string(canonical)
-		}
-	} else {
-		trimmed := strings.TrimSpace(string(b.Input))
-		if trimmed != "" {
-			arguments = trimmed
-		}
+	// Preserve raw JSON to avoid precision loss from unmarshal/marshal round-trip
+	trimmed := strings.TrimSpace(string(b.Input))
+	if trimmed != "" {
+		arguments = trimmed
 	}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/anthropic/anthropic.go` around lines 1145 - 1158, This
extraction path loses integer precision because it uses json.Unmarshal; switch
to a json.Decoder with UseNumber() before decoding b.Input to preserve large
integers: create a decoder (json.NewDecoder over b.Input), call dec.UseNumber(),
decode into parsed (any) and then json.Marshal(parsed) as before; update the
block that currently unmarshals b.Input (the code handling b.Input/arguments)
and ensure the same fallback to trimmed string on decode error is kept;
reference parseToolCallArguments and the variables b.Input and arguments to
locate the change.
♻️ Duplicate comments (1)
internal/core/responses.go (1)

10-12: ⚠️ Potential issue | 🟠 Major

Replace untyped tool request fields with shared core models.

[]map[string]any and any keep a public API shape weakly typed, push validation into scattered runtime assertions, and make schema generation brittle. Please introduce shared typed ToolDefinition / ToolChoice models in internal/core and reuse them here and in ChatRequest.

#!/bin/bash
# Verify the remaining public tool payload fields that are still untyped.
rg -n -C2 'Tools\s+\[\]map\[string\](any|interface\{\})|ToolChoice\s+(any|interface\{\})' internal/core

Based on learnings, "Do not use interface{} or map[string]interface{} for API request/response payload types. Prefer strongly-typed structs for API payload definitions."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/responses.go` around lines 10 - 12, Introduce strongly-typed
models ToolDefinition and ToolChoice in internal/core and replace the untyped
fields: change the Tools field type from []map[string]any to []ToolDefinition
and change ToolChoice from any to ToolChoice in the responses struct (and the
analogous ChatRequest struct), keep the existing json tags (tools,omitempty and
tool_choice,omitempty) and preserve ParallelToolCalls *bool; update any code
that constructs/consumes these fields to use the new types and add validation
constructors/marshalling helpers if needed, then run the provided ripgrep check
to verify no remaining uses of map[string]interface{} or any for
Tools/ToolChoice.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/guardrails/provider_test.go`:
- Around line 201-224: The test only checks presence/counts/IDs of tool-related
fields, so update assertions to compare full tool payloads and fields that must
be preserved: assert the concrete values of got.Tools[0] (e.g.,
Name/Description/Arguments), got.ToolChoice (the selected tool name/value),
got.Messages[1].ToolCalls[0].Function.Name and Function.Arguments, the ToolCall
result Content, and got.Messages[2].ToolCallID equals "call_123"; also tighten
got.Reasoning to check Reasoning.Effort == "high" and any nested fields. Locate
and update assertions referencing got.Tools, got.ToolChoice, got.Messages,
ToolCalls, and ToolCallID to validate exact expected structs/values rather than
only lengths or non-nil checks.

In `@internal/providers/responses_adapter.go`:
- Around line 78-129: The assistant merging logic in convertResponsesInputItems
uses pendingAssistant and intentionally only inherits msg.ContentNull when
pendingAssistant.Content is empty (the block setting
pendingAssistant.ContentNull = pendingAssistant.ContentNull || msg.ContentNull),
which is subtle; add a short inline comment next to that conditional explaining:
when accumulated content is empty we must preserve a true ContentNull if any
merged piece is null (but once any real content exists we clear/ignore
ContentNull), so future readers understand why the assignment is guarded by
pendingAssistant.Content == "".

In `@internal/providers/responses_converter_test.go`:
- Around line 36-79: The test currently only checks presence flags and delta
fragments, which allows reorderings or extra lifecycle events; update the test
in responses_converter_test.go to assert the exact ordered sequence and payloads
emitted by parseTestSSEEvents(t, string(raw)) for the function_call lifecycle
(check event.Name values "response.output_item.added",
"response.function_call_arguments.delta",
"response.function_call_arguments.done", "response.output_item.done" in the
exact order) and validate each event.Payload (e.g., the added item map contains
type="function_call", call_id="call_123", name="lookup_weather"; the two delta
payloads concatenate to `{"city":"Warsaw"}`; the arguments payload on done
equals `{"city":"Warsaw"}`; the output_item.done payload item.arguments equals
`{"city":"Warsaw"}`) and fail the test if the sequence length, names, or
payloads differ or if extra function-call lifecycle events appear; use the
existing events slice returned by parseTestSSEEvents to perform the ordered
comparison instead of the boolean flags and unordered checks.

In `@internal/providers/responses_converter.go`:
- Around line 71-80: Convert the standalone function normalizeToolCallIndex into
a method on OpenAIResponsesStreamConverter for consistency: change its signature
to a receiver method (e.g., func (c *OpenAIResponsesStreamConverter)
normalizeToolCallIndex(value interface{}) (int, bool)), move the implementation
into the type's file, and update all call sites to use
c.normalizeToolCallIndex(...) instead of the package-level function; ensure any
tests or references are updated accordingly.
- Around line 186-201: In forceStartToolCall, when strings.TrimSpace(state.Name)
== "" currently falls back to "unknown" without logging; add a warning log right
before setting state.Name = "unknown" to record that the stream ended without a
function name (include context like state.CallID and state.OutputIndex). Use the
OpenAIResponsesStreamConverter receiver's logger (e.g., sc.logger.Warnf or the
existing logging field on sc) to emit a concise warning mentioning
forceStartToolCall, state.CallID, state.OutputIndex and that the name was blank
and substituted with "unknown".

---

Outside diff comments:
In `@internal/providers/anthropic/anthropic.go`:
- Around line 1671-1690: Replace the inconsistent use of map[string]interface{}
with map[string]any in the streaming response construction to match file-wide
style: update the responseData and doneEvent declarations (and any other
occurrences in the same streaming block, e.g., where cachedUsage is used) to use
map[string]any so symbols like responseData, doneEvent, sc.cachedUsage,
sc.responseID, and sc.model remain the same but types are consistent with the
rest of the file.
- Around line 1145-1158: This extraction path loses integer precision because it
uses json.Unmarshal; switch to a json.Decoder with UseNumber() before decoding
b.Input to preserve large integers: create a decoder (json.NewDecoder over
b.Input), call dec.UseNumber(), decode into parsed (any) and then
json.Marshal(parsed) as before; update the block that currently unmarshals
b.Input (the code handling b.Input/arguments) and ensure the same fallback to
trimmed string on decode error is kept; reference parseToolCallArguments and the
variables b.Input and arguments to locate the change.

---

Duplicate comments:
In `@internal/core/responses.go`:
- Around line 10-12: Introduce strongly-typed models ToolDefinition and
ToolChoice in internal/core and replace the untyped fields: change the Tools
field type from []map[string]any to []ToolDefinition and change ToolChoice from
any to ToolChoice in the responses struct (and the analogous ChatRequest
struct), keep the existing json tags (tools,omitempty and tool_choice,omitempty)
and preserve ParallelToolCalls *bool; update any code that constructs/consumes
these fields to use the new types and add validation constructors/marshalling
helpers if needed, then run the provided ripgrep check to verify no remaining
uses of map[string]interface{} or any for Tools/ToolChoice.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 897c7330-4ef0-4d00-9e8e-e60d6176f2c3

📥 Commits

Reviewing files that changed from the base of the PR and between 88ee41c and 9ea8e1e.

📒 Files selected for processing (14)
  • internal/core/responses.go
  • internal/core/types.go
  • internal/core/types_test.go
  • internal/guardrails/provider.go
  • internal/guardrails/provider_test.go
  • internal/providers/anthropic/anthropic.go
  • internal/providers/anthropic/anthropic_test.go
  • internal/providers/responses_adapter.go
  • internal/providers/responses_adapter_test.go
  • internal/providers/responses_converter.go
  • internal/providers/responses_converter_test.go
  • tests/contract/testdata/golden/anthropic/responses_stream.golden.json
  • tests/contract/testdata/golden/gemini/responses_stream.golden.json
  • tests/contract/testdata/golden/groq/responses_stream.golden.json

Comment on lines +201 to +224
if len(got.Tools) != 1 {
t.Errorf("tools not preserved")
}
if got.ToolChoice == nil {
t.Errorf("tool_choice not preserved")
}
if got.ParallelToolCalls == nil || *got.ParallelToolCalls {
t.Errorf("parallel_tool_calls not preserved")
}
if !got.Stream {
t.Errorf("stream not preserved")
}
if got.Reasoning == nil || got.Reasoning.Effort != "high" {
t.Errorf("reasoning not preserved")
}
if len(got.Messages) != 3 {
t.Fatalf("len(messages) = %d, want 3", len(got.Messages))
}
if len(got.Messages[1].ToolCalls) != 1 || got.Messages[1].ToolCalls[0].ID != "call_123" {
t.Errorf("assistant tool_calls not preserved: %+v", got.Messages[1].ToolCalls)
}
if got.Messages[2].ToolCallID != "call_123" {
t.Errorf("tool_call_id not preserved: %+v", got.Messages[2])
}

⚠️ Potential issue | 🟡 Minor

Compare the tool payloads here, not just their presence.

This still passes if ToolChoice, ToolCalls[0].Function.Name, ToolCalls[0].Function.Arguments, or the tool result Content change while counts and IDs stay intact. Tighten the assertions to compare the concrete values that are supposed to survive the guardrail round-trip.

Based on learnings, "Add or update tests for any behavior change".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/guardrails/provider_test.go` around lines 201 - 224, The test only
checks presence/counts/IDs of tool-related fields, so update assertions to
compare full tool payloads and fields that must be preserved: assert the
concrete values of got.Tools[0] (e.g., Name/Description/Arguments),
got.ToolChoice (the selected tool name/value),
got.Messages[1].ToolCalls[0].Function.Name and Function.Arguments, the ToolCall
result Content, and got.Messages[2].ToolCallID equals "call_123"; also tighten
got.Reasoning to check Reasoning.Effort == "high" and any nested fields. Locate
and update assertions referencing got.Tools, got.ToolChoice, got.Messages,
ToolCalls, and ToolCallID to validate exact expected structs/values rather than
only lengths or non-nil checks.

Comment on lines +78 to +129
func convertResponsesInputItems(items []interface{}) []core.Message {
	messages := make([]core.Message, 0, len(items))
	var pendingAssistant *core.Message

	flushPendingAssistant := func() {
		if pendingAssistant == nil {
			return
		}
		messages = append(messages, *pendingAssistant)
		pendingAssistant = nil
	}

	for _, item := range items {
		if msg, ok := convertResponsesInputItem(item); ok {
			itemType := responsesInputItemType(item)
			if msg.Role == "assistant" {
				if itemType == "message" {
					flushPendingAssistant()
				}
				if pendingAssistant == nil {
					assistant := core.Message{
						Role:        "assistant",
						Content:     msg.Content,
						ToolCallID:  msg.ToolCallID,
						ContentNull: msg.ContentNull,
					}
					if len(msg.ToolCalls) > 0 {
						assistant.ToolCalls = append([]core.ToolCall(nil), msg.ToolCalls...)
					}
					pendingAssistant = &assistant
				} else {
					if msg.Content != "" {
						pendingAssistant.Content += msg.Content
						pendingAssistant.ContentNull = false
					}
					if len(msg.ToolCalls) > 0 {
						pendingAssistant.ToolCalls = append(pendingAssistant.ToolCalls, msg.ToolCalls...)
						if pendingAssistant.Content == "" {
							pendingAssistant.ContentNull = pendingAssistant.ContentNull || msg.ContentNull
						}
					}
				}
				continue
			}

			flushPendingAssistant()
			messages = append(messages, msg)
		}
	}
	flushPendingAssistant()
	return messages
}

🧹 Nitpick | 🔵 Trivial

Consider documenting the assistant message merging logic.

The pendingAssistant accumulation logic handles merging function_call items into preceding assistant messages. While correct, the conditional logic around ContentNull inheritance (lines 115-117) is subtle:

if pendingAssistant.Content == "" {
    pendingAssistant.ContentNull = pendingAssistant.ContentNull || msg.ContentNull
}

This only sets ContentNull when there's no accumulated content, which is correct but not immediately obvious. Consider adding a brief inline comment explaining this behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/responses_adapter.go` around lines 78 - 129, The assistant
merging logic in convertResponsesInputItems uses pendingAssistant and
intentionally only inherits msg.ContentNull when pendingAssistant.Content is
empty (the block setting pendingAssistant.ContentNull =
pendingAssistant.ContentNull || msg.ContentNull), which is subtle; add a short
inline comment next to that conditional explaining: when accumulated content is
empty we must preserve a true ContentNull if any merged piece is null (but once
any real content exists we clear/ignore ContentNull), so future readers
understand why the assignment is guarded by pendingAssistant.Content == "".
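The guarded inheritance can be seen in isolation with a toy sketch. The `message` type and `mergeFragment` helper below are illustrative stand-ins, not the real core.Message API, and the tool-call guard from the original code is omitted for brevity:

```go
package main

import "fmt"

// message is a trimmed stand-in for core.Message (illustrative only).
type message struct {
	Content     string
	ContentNull bool
}

// mergeFragment mirrors the guarded inheritance: a true ContentNull from a
// merged fragment only survives while no real content has accumulated.
func mergeFragment(acc *message, frag message) {
	if frag.Content != "" {
		acc.Content += frag.Content
		acc.ContentNull = false
	}
	if acc.Content == "" {
		acc.ContentNull = acc.ContentNull || frag.ContentNull
	}
}

func main() {
	// Tool-call-only fragments: null-ness is preserved.
	a := message{}
	mergeFragment(&a, message{ContentNull: true})
	fmt.Println(a.ContentNull) // true

	// Once text exists, a later null fragment cannot re-null the content.
	b := message{}
	mergeFragment(&b, message{Content: "hello"})
	mergeFragment(&b, message{ContentNull: true})
	fmt.Println(b.ContentNull) // false
}
```

The guard is what keeps a function_call-only message serialized as `"content": null` while any real text wins once it appears.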

Comment on lines +36 to +79
events := parseTestSSEEvents(t, string(raw))
foundAdded := false
foundArgumentsDone := false
foundItemDone := false
var argumentDeltas []string

for _, event := range events {
	if event.Done {
		continue
	}
	switch event.Name {
	case "response.output_item.added":
		item, _ := event.Payload["item"].(map[string]interface{})
		if item["type"] == "function_call" && item["call_id"] == "call_123" && item["name"] == "lookup_weather" {
			foundAdded = true
		}
	case "response.function_call_arguments.delta":
		if delta, _ := event.Payload["delta"].(string); delta != "" {
			argumentDeltas = append(argumentDeltas, delta)
		}
	case "response.function_call_arguments.done":
		if event.Payload["arguments"] == `{"city":"Warsaw"}` {
			foundArgumentsDone = true
		}
	case "response.output_item.done":
		item, _ := event.Payload["item"].(map[string]interface{})
		if item["type"] == "function_call" && item["arguments"] == `{"city":"Warsaw"}` {
			foundItemDone = true
		}
	}
}

if !foundAdded {
	t.Fatal("expected response.output_item.added for function_call")
}
if len(argumentDeltas) != 2 || argumentDeltas[0] != "{\"city\":\"War" || argumentDeltas[1] != "saw\"}" {
	t.Fatalf("response.function_call_arguments.delta sequence = %#v, want two ordered fragments", argumentDeltas)
}
if !foundArgumentsDone {
	t.Fatal("expected response.function_call_arguments.done for function_call")
}
if !foundItemDone {
	t.Fatal("expected response.output_item.done for function_call")
}

⚠️ Potential issue | 🟡 Minor

Assert the full function-call replay order.

These checks still pass if response.function_call_arguments.done and response.output_item.done are reordered, or if an extra function-call lifecycle event is emitted. Since this converter’s contract is the replayed stream sequence, please assert the ordered event names/payloads instead of only presence flags plus delta fragments.

Based on learnings, "Add or update tests for any behavior change".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/responses_converter_test.go` around lines 36 - 79, The
test currently only checks presence flags and delta fragments, which allows
reorderings or extra lifecycle events; update the test in
responses_converter_test.go to assert the exact ordered sequence and payloads
emitted by parseTestSSEEvents(t, string(raw)) for the function_call lifecycle
(check event.Name values "response.output_item.added",
"response.function_call_arguments.delta",
"response.function_call_arguments.done", "response.output_item.done" in the
exact order) and validate each event.Payload (e.g., the added item map contains
type="function_call", call_id="call_123", name="lookup_weather"; the two delta
payloads concatenate to `{"city":"Warsaw"}`; the arguments payload on done
equals `{"city":"Warsaw"}`; the output_item.done payload item.arguments equals
`{"city":"Warsaw"}`) and fail the test if the sequence length, names, or
payloads differ or if extra function-call lifecycle events appear; use the
existing events slice returned by parseTestSSEEvents to perform the ordered
comparison instead of the boolean flags and unordered checks.

Comment on lines +71 to +80
func normalizeToolCallIndex(value interface{}) (int, bool) {
	switch v := value.(type) {
	case int:
		return v, true
	case float64:
		return int(v), true
	default:
		return 0, false
	}
}

🧹 Nitpick | 🔵 Trivial

Consider making this a method receiver for consistency.

normalizeToolCallIndex is a standalone function while similar helpers are methods on OpenAIResponsesStreamConverter. For consistency, consider making it a method, though this is a minor stylistic point.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/responses_converter.go` around lines 71 - 80, Convert the
standalone function normalizeToolCallIndex into a method on
OpenAIResponsesStreamConverter for consistency: change its signature to a
receiver method (e.g., func (c *OpenAIResponsesStreamConverter)
normalizeToolCallIndex(value interface{}) (int, bool)), move the implementation
into the type's file, and update all call sites to use
c.normalizeToolCallIndex(...) instead of the package-level function; ensure any
tests or references are updated accordingly.

Comment on lines +186 to +201
func (sc *OpenAIResponsesStreamConverter) forceStartToolCall(state *openAIResponsesToolCallState) string {
	if state.Started {
		return ""
	}
	if strings.TrimSpace(state.Name) == "" {
		state.Name = "unknown"
	}
	state.CallID = ResponsesFunctionCallCallID(state.CallID)
	state.ItemID = ResponsesFunctionCallItemID(state.CallID)
	state.Started = true
	return sc.writeEvent("response.output_item.added", map[string]interface{}{
		"type":         "response.output_item.added",
		"item":         sc.renderToolCallItem(state, "in_progress"),
		"output_index": state.OutputIndex,
	})
}

🧹 Nitpick | 🔵 Trivial

forceStartToolCall fallback to "unknown" name is pragmatic.

This handles edge cases where the stream ends without providing a function name. While "unknown" is a reasonable fallback, consider logging a warning to aid debugging when this occurs.

💡 Suggested improvement
 func (sc *OpenAIResponsesStreamConverter) forceStartToolCall(state *openAIResponsesToolCallState) string {
 	if state.Started {
 		return ""
 	}
 	if strings.TrimSpace(state.Name) == "" {
+		slog.Warn("forcing tool call start with unknown name", "call_id", state.CallID, "response_id", sc.responseID)
 		state.Name = "unknown"
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func (sc *OpenAIResponsesStreamConverter) forceStartToolCall(state *openAIResponsesToolCallState) string {
	if state.Started {
		return ""
	}
	if strings.TrimSpace(state.Name) == "" {
		slog.Warn("forcing tool call start with unknown name", "call_id", state.CallID, "response_id", sc.responseID)
		state.Name = "unknown"
	}
	state.CallID = ResponsesFunctionCallCallID(state.CallID)
	state.ItemID = ResponsesFunctionCallItemID(state.CallID)
	state.Started = true
	return sc.writeEvent("response.output_item.added", map[string]interface{}{
		"type":         "response.output_item.added",
		"item":         sc.renderToolCallItem(state, "in_progress"),
		"output_index": state.OutputIndex,
	})
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/responses_converter.go` around lines 186 - 201, In
forceStartToolCall, when strings.TrimSpace(state.Name) == "" currently falls
back to "unknown" without logging; add a warning log right before setting
state.Name = "unknown" to record that the stream ended without a function name
(include context like state.CallID and state.OutputIndex). Use the
OpenAIResponsesStreamConverter receiver's logger (e.g., sc.logger.Warnf or the
existing logging field on sc) to emit a concise warning mentioning
forceStartToolCall, state.CallID, state.OutputIndex and that the name was blank
and substituted with "unknown".


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

♻️ Duplicate comments (1)
internal/providers/anthropic/anthropic.go (1)

811-821: ⚠️ Potential issue | 🟡 Minor

Don’t canonicalize streamed tool input via any.

json.Unmarshal into any decodes numbers as float64, so a non-placeholder content_block_start.input like {"id":9007199254740993} will be rewritten before it is streamed back out. json.Compact gives the same whitespace normalization without changing numeric literals.

🛠️ Proposed fix
 func extractInitialToolArguments(input json.RawMessage) string {
 	if len(input) == 0 {
 		return ""
 	}
@@
-	var parsed any
-	if err := json.Unmarshal(input, &parsed); err != nil {
-		return trimmed
-	}
-
-	canonical, err := json.Marshal(parsed)
-	if err != nil {
-		return trimmed
-	}
-
-	return string(canonical)
+	var compact bytes.Buffer
+	if err := json.Compact(&compact, []byte(trimmed)); err != nil {
+		return trimmed
+	}
+
+	return compact.String()
 }
Note: in Go's encoding/json package, unmarshalling JSON numbers into interface{} / any decodes them as float64, which loses precision for integers larger than 2^53 (for example, 9007199254740993 rounds to 9007199254740992).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/anthropic/anthropic.go` around lines 811 - 821, The code
currently unmarshals input into parsed (any) then json.Marshals it, which uses
json.Unmarshal into interface{} and converts JSON numbers to float64 (losing
integer precision); instead preserve numeric literals and only remove whitespace
by using json.Compact (or json.Indent as needed) on the original input bytes and
return that result; update the block that references parsed, input, trimmed to
call json.Compact into a bytes.Buffer (or similar) and handle errors, returning
trimmed on error and the compacted string on success.
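The float64 rounding is easy to demonstrate with a standalone sketch; the two helper names below are illustrative, not from the codebase, and only contrast the round-trip through `any` with `json.Compact`:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// canonicalizeViaAny mimics the flagged path: unmarshal into any, then
// re-marshal. JSON numbers become float64 along the way, so large integer
// literals are rewritten.
func canonicalizeViaAny(input []byte) string {
	var parsed any
	if err := json.Unmarshal(input, &parsed); err != nil {
		return string(input)
	}
	out, err := json.Marshal(parsed)
	if err != nil {
		return string(input)
	}
	return string(out)
}

// compactOnly is the suggested replacement: strip whitespace with
// json.Compact and leave every literal untouched.
func compactOnly(input []byte) string {
	var buf bytes.Buffer
	if err := json.Compact(&buf, input); err != nil {
		return string(input)
	}
	return buf.String()
}

func main() {
	raw := []byte(`{"id": 9007199254740993}`) // 2^53 + 1, not representable as float64
	fmt.Println(canonicalizeViaAny(raw))      // {"id":9007199254740992}
	fmt.Println(compactOnly(raw))             // {"id":9007199254740993}
}
```

Both approaches normalize whitespace, but only `json.Compact` guarantees the streamed arguments match what the provider sent byte-for-numeric-byte.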
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/providers/anthropic/anthropic.go`:
- Around line 388-414: Trim the tool name before storing it: where the code
reads the raw name variable (name, _ := function["name"].(string)) and later
appends an anthropicTool, call strings.TrimSpace(name) and use that trimmed
value for anthropicTool.Name (and any other places that persist or compare the
name); apply the same fix to the other branch referenced (the second creation of
anthropicTool around lines 446-463) so stored names never retain
leading/trailing whitespace and matching logic (e.g., tool_choice vs tool_use)
works reliably.
- Around line 1132-1136: In convertAnthropicResponseToResponses, when building
the core.Message (msg) for responses that contain toolCalls but no text, set
msg.ContentNull = true so we mark content as intentionally absent; specifically,
after you collect toolCalls and content and before calling
providers.BuildResponsesOutputItems(msg), if content == "" (or nil/empty) AND
len(toolCalls) > 0 then set the ContentNull field on the core.Message instance
to true, otherwise leave it false.

In `@internal/providers/responses_adapter.go`:
- Around line 218-231: The function stringifyResponsesInputValue silently
swallows json.Marshal errors and returns an empty string; update this by either
logging the marshal error (use the package's logger at debug/trace level if
available, e.g., call logger.Debugf or similar with the error when
json.Marshal(v) fails) or, if silent fallback is intentional, add a clear
comment above stringifyResponsesInputValue explaining that marshal failures are
treated as absent values and intentionally return an empty string; modify the
json.Marshal error branch (in stringifyResponsesInputValue) to perform the
logging or add the explanatory comment accordingly.
- Around line 294-301: ResponsesFunctionCallItemID creates its own UUID when
given an empty callID, causing mismatch with ResponsesFunctionCallCallID; change
ResponsesFunctionCallItemID to delegate to ResponsesFunctionCallCallID for
normalization/UUID generation (i.e., call ResponsesFunctionCallCallID(callID)
and then prefix with "fc_") so both functions produce the same id when callID is
empty and keep the "fc_" prefix behavior consistent.

---

Duplicate comments:
In `@internal/providers/anthropic/anthropic.go`:
- Around line 811-821: The code currently unmarshals input into parsed (any)
then json.Marshals it, which uses json.Unmarshal into interface{} and converts
JSON numbers to float64 (losing integer precision); instead preserve numeric
literals and only remove whitespace by using json.Compact (or json.Indent as
needed) on the original input bytes and return that result; update the block
that references parsed, input, trimmed to call json.Compact into a bytes.Buffer
(or similar) and handle errors, returning trimmed on error and the compacted
string on success.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 89629d80-859e-489e-a365-e08960ffe9d2

📥 Commits

Reviewing files that changed from the base of the PR and between 9ea8e1e and c5a1ede.

📒 Files selected for processing (6)
  • internal/providers/anthropic/anthropic.go
  • internal/providers/anthropic/anthropic_test.go
  • internal/providers/responses_adapter.go
  • internal/providers/responses_adapter_test.go
  • internal/providers/responses_converter.go
  • internal/providers/responses_output_state.go

Comment on lines +388 to +414
	name, _ := function["name"].(string)
	if strings.TrimSpace(name) == "" {
		return nil, core.NewInvalidRequestError("tool.function.name is required", nil)
	}

	description, _ := function["description"].(string)
	inputSchema, hasParameters := function["parameters"]
	if !hasParameters || inputSchema == nil {
		inputSchema = map[string]any{
			"type":       "object",
			"properties": map[string]any{},
		}
	} else {
		schema, ok := inputSchema.(map[string]any)
		if !ok {
			return nil, core.NewInvalidRequestError("tool.function.parameters must be an object", nil)
		}
		if schemaType, ok := schema["type"].(string); ok && schemaType != "" && schemaType != "object" {
			return nil, core.NewInvalidRequestError("tool.function.parameters must define an object schema", nil)
		}
		inputSchema = schema
	}

	out = append(out, anthropicTool{
		Name:        name,
		Description: description,
		InputSchema: inputSchema.(map[string]any),

⚠️ Potential issue | 🟡 Minor

Trim tool names before storing them.

These branches validate strings.TrimSpace(name) but then keep the raw string. A value like " lookup_weather " passes local validation and is forwarded with spaces, which can break Anthropic validation or make tool_choice stop matching the later trimmed tool_use names.

🛠️ Proposed fix
-		name, _ := function["name"].(string)
-		if strings.TrimSpace(name) == "" {
+		name, _ := function["name"].(string)
+		name = strings.TrimSpace(name)
+		if name == "" {
 			return nil, core.NewInvalidRequestError("tool.function.name is required", nil)
 		}
@@
-				name, _ := function["name"].(string)
-				if strings.TrimSpace(name) != "" {
+				name, _ := function["name"].(string)
+				name = strings.TrimSpace(name)
+				if name != "" {
 					return &anthropicToolChoice{Type: "tool", Name: name}, false, nil
 				}
 			}
@@
-			name, _ := c["name"].(string)
+			name, _ := c["name"].(string)
+			name = strings.TrimSpace(name)
 			if name == "" {
 				if function, ok := c["function"].(map[string]any); ok {
 					name, _ = function["name"].(string)
+					name = strings.TrimSpace(name)
 				}
 			}
-			if strings.TrimSpace(name) == "" {
+			if name == "" {
 				return nil, false, core.NewInvalidRequestError("tool_choice.name is required", nil)
 			}

Also applies to: 446-463

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/anthropic/anthropic.go` around lines 388 - 414, Trim the
tool name before storing it: where the code reads the raw name variable (name, _
:= function["name"].(string)) and later appends an anthropicTool, call
strings.TrimSpace(name) and use that trimmed value for anthropicTool.Name (and
any other places that persist or compare the name); apply the same fix to the
other branch referenced (the second creation of anthropicTool around lines
446-463) so stored names never retain leading/trailing whitespace and matching
logic (e.g., tool_choice vs tool_use) works reliably.

Comment on lines +1132 to +1136
	msg := core.Message{
		Content:   content,
		ToolCalls: toolCalls,
	}
	output := providers.BuildResponsesOutputItems(msg)

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Verify whether BuildResponsesOutputItems distinguishes empty string content from null content.
rg -n "func BuildResponsesOutputItems|ContentNull" internal/providers internal/core --type go -C 4

Repository: ENTERPILOT/GOModel

Length of output: 10596


🏁 Script executed:

sed -n '1125,1145p' internal/providers/anthropic/anthropic.go



🏁 Script executed:

# Find the function that contains lines 1132-1136
sed -n '1100,1145p' internal/providers/anthropic/anthropic.go | grep -n "func\|msg :="



🏁 Script executed:

# Get broader context to identify which function this belongs to
sed -n '1050,1150p' internal/providers/anthropic/anthropic.go | head -100



Set ContentNull when response contains only tool calls without text.

In convertAnthropicResponseToResponses, when the Anthropic response contains only tool calls and no text content, ContentNull should be marked to distinguish intentionally-absent content from present-but-empty content. This distinction is important for semantic correctness and downstream message handling, particularly in tool call history and message adaptation flows.

Fix
 	msg := core.Message{
 		Content:   content,
 		ToolCalls: toolCalls,
 	}
+	if content == "" && len(toolCalls) > 0 {
+		msg.ContentNull = true
+	}
 	output := providers.BuildResponsesOutputItems(msg)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/anthropic/anthropic.go` around lines 1132 - 1136, In
convertAnthropicResponseToResponses, when building the core.Message (msg) for
responses that contain toolCalls but no text, set msg.ContentNull = true so we
mark content as intentionally absent; specifically, after you collect toolCalls
and content and before calling providers.BuildResponsesOutputItems(msg), if
content == "" (or nil/empty) AND len(toolCalls) > 0 then set the ContentNull
field on the core.Message instance to true, otherwise leave it false.

Comment on lines +218 to 231
func stringifyResponsesInputValue(value interface{}) string {
	switch v := value.(type) {
	case nil:
		return ""
	case string:
		return v
	default:
		encoded, err := json.Marshal(v)
		if err != nil {
			return ""
		}
		return string(encoded)
	}
}

🧹 Nitpick | 🔵 Trivial

Silent error swallowing in JSON marshaling.

When json.Marshal fails at line 226, the function returns an empty string without any indication of the error. While this may be intentional for resilience, it could hide data corruption issues during debugging.

Consider logging the error at debug level or adding a comment explaining the silent fallback is intentional:

// silently return empty on marshal error – invalid arguments are
// treated as absent rather than causing upstream failures
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/responses_adapter.go` around lines 218 - 231, The function
stringifyResponsesInputValue silently swallows json.Marshal errors and returns
an empty string; update this by either logging the marshal error (use the
package's logger at debug/trace level if available, e.g., call logger.Debugf or
similar with the error when json.Marshal(v) fails) or, if silent fallback is
intentional, add a clear comment above stringifyResponsesInputValue explaining
that marshal failures are treated as absent values and intentionally return an
empty string; modify the json.Marshal error branch (in
stringifyResponsesInputValue) to perform the logging or add the explanatory
comment accordingly.

Comment on lines +294 to +301
// ResponsesFunctionCallItemID returns a stable function-call item id.
func ResponsesFunctionCallItemID(callID string) string {
	normalizedCallID := strings.TrimSpace(callID)
	if normalizedCallID == "" {
		normalizedCallID = "call_" + uuid.New().String()
	}
	return "fc_" + normalizedCallID
}

🧹 Nitpick | 🔵 Trivial

Inconsistent ID generation when both functions receive empty input.

If ResponsesFunctionCallCallID("") and ResponsesFunctionCallItemID("") are called independently with empty strings, they each generate different UUIDs, breaking the expected fc_<callID> relationship.

Currently, BuildResponsesOutputItems chains them correctly (line 324→326), but if ResponsesFunctionCallItemID is ever called standalone with an empty callID, it won't correlate with any existing call.

Consider having ResponsesFunctionCallItemID delegate to ResponsesFunctionCallCallID internally:

Suggested fix
 func ResponsesFunctionCallItemID(callID string) string {
-	normalizedCallID := strings.TrimSpace(callID)
-	if normalizedCallID == "" {
-		normalizedCallID = "call_" + uuid.New().String()
-	}
+	normalizedCallID := ResponsesFunctionCallCallID(callID)
 	return "fc_" + normalizedCallID
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/responses_adapter.go` around lines 294 - 301,
ResponsesFunctionCallItemID creates its own UUID when given an empty callID,
causing mismatch with ResponsesFunctionCallCallID; change
ResponsesFunctionCallItemID to delegate to ResponsesFunctionCallCallID for
normalization/UUID generation (i.e., call ResponsesFunctionCallCallID(callID)
and then prefix with "fc_") so both functions produce the same id when callID is
empty and keep the "fc_" prefix behavior consistent.

@SantiagoDePolonia SantiagoDePolonia merged commit 3acf4cc into main Mar 8, 2026
13 checks passed
@SantiagoDePolonia SantiagoDePolonia deleted the fix/function-calling branch March 22, 2026 14:25