Fix: multimodal chat content handling#119
Conversation
|
Caution Review failedPull request was closed or merged during review 📝 WalkthroughWalkthroughAdds multimodal content support across core types, JSON (un)marshalling, provider adapters, guardrails, tests, and docs: new ContentPart/ImageURL/InputAudio/ResponseMessage types; Message/Responses payloads accept structured arrays; normalization/parsing utilities and conversion paths updated to preserve and validate multimodal content. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Server as Server/Handler
participant Guardrails as Guardrails
participant Core as Core/Content
participant Provider as Provider (Anthropic/OpenAI)
Client->>Server: POST ChatRequest (multimodal content)
Server->>Guardrails: processChat(req)
Guardrails->>Core: UnmarshalMessageContent(content)
Core-->>Guardrails: []ContentPart / string / error
Guardrails->>Guardrails: detect non-text content
alt has non-text
Guardrails->>Guardrails: applySystemMessagesToMultimodalChat()
Guardrails->>Core: NormalizeMessageContent()
else text-only
Guardrails->>Core: NormalizeMessageContent()
end
Core-->>Guardrails: normalized content
Guardrails-->>Server: messages[]
Server->>Provider: convert & send upstream (convertMessageContentToAnthropic / OpenAI)
Provider->>Provider: validate/convert image/audio sources
Provider-->>Server: provider response (uses ResponseMessage)
Server-->>Client: ChatResponse (ResponseMessage)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/providers/anthropic/anthropic.go (1)
723-739:⚠️ Potential issue | 🟠 MajorReject malformed
input[]items instead of dropping them.If
roleis empty orConvertResponsesContentToChatContent()returnsok=false, this loop justcontinues. That rewrites bad Responses payloads into a shorter conversation — or even an emptymessagesarray — instead of returning a deterministic 400.Proposed fix
- case []interface{}: - for _, item := range input { - if msgMap, ok := item.(map[string]interface{}); ok { - role, _ := msgMap["role"].(string) - content, ok := providers.ConvertResponsesContentToChatContent(msgMap["content"]) - if role == "" || !ok { - continue - } - anthropicContent, err := convertMessageContentToAnthropic(content) - if err != nil { - return nil, err - } - if role != "" { - anthropicReq.Messages = append(anthropicReq.Messages, anthropicMessage{ - Role: role, - Content: anthropicContent, - }) - } - } - } + case []interface{}: + for idx, item := range input { + msgMap, ok := item.(map[string]interface{}) + if !ok { + return nil, core.NewInvalidRequestError(fmt.Sprintf("responses input item %d must be an object", idx), nil) + } + role, _ := msgMap["role"].(string) + if role == "" { + return nil, core.NewInvalidRequestError(fmt.Sprintf("responses input item %d is missing role", idx), nil) + } + content, ok := providers.ConvertResponsesContentToChatContent(msgMap["content"]) + if !ok { + return nil, core.NewInvalidRequestError(fmt.Sprintf("responses input item %d has unsupported content", idx), nil) + } + anthropicContent, err := convertMessageContentToAnthropic(content) + if err != nil { + return nil, err + } + anthropicReq.Messages = append(anthropicReq.Messages, anthropicMessage{ + Role: role, + Content: anthropicContent, + }) + } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/providers/anthropic/anthropic.go` around lines 723 - 739, The loop handling case []interface{} currently skips malformed items (when role=="" or ConvertResponsesContentToChatContent returns ok==false), silently shortening the messages; change this to validate each item and return a deterministic 400-style error instead of continuing. Specifically, inside the for loop (the block that inspects msgMap and calls providers.ConvertResponsesContentToChatContent and convertMessageContentToAnthropic) replace the continue path with an immediate error return describing the invalid input, ensuring you still return errors from convertMessageContentToAnthropic and do not append any anthropicMessage to anthropicReq.Messages for invalid items; this preserves deterministic behavior when malformed input[] entries are provided.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cmd/gomodel/docs/swagger.json`:
- Around line 2034-2035: The swagger spec is missing the multimodal request
schema for core.Message.content — add explicit definitions for the array shape
and the referenced objects (ContentPart, ImageURLContent, InputAudioContent) in
the OpenAPI annotations so the generated docs/SDKs surface the new multimodal
types, update any endpoint request/response refs to use core.Message.content’s
new schema, and then run make swagger to regenerate the
cmd/gomodel/docs/swagger.json; search for core.Message.content, ContentPart,
ImageURLContent, and InputAudioContent in the codebase to locate where to add
the annotations.
In `@internal/core/message_json.go`:
- Around line 27-42: The MarshalJSON currently always calls
NormalizeMessageContent which turns nil into an empty string and breaks
OpenAI-compatible tool-call messages; change Message.MarshalJSON so that when
m.ToolCalls is non-empty you preserve a null content: if len(m.ToolCalls) > 0
and m.Content == nil then set content = nil (skip or bypass
NormalizeMessageContent) so the serialized JSON emits "content": null; otherwise
continue normal normalization via NormalizeMessageContent. Reference
Message.MarshalJSON, NormalizeMessageContent, m.Content and m.ToolCalls when
making the change.
In `@internal/core/types.go`:
- Around line 50-53: Message.Content was changed to any which removed its schema
from the generated Swagger; restore an explicit union schema by introducing a
concrete ContentPart model and annotating Message.Content as a oneOf (string OR
[]ContentPart). Add a new exported ContentPart struct (the shape used by the
multimodal parts) and add swagger annotations/comments on Message
(Message.Content) to declare the union (oneOf: string, array of ContentPart) so
the generator emits type/items for core.Message.content; then regenerate the API
docs (the gomodel docs generator) to update docs.go and swagger.json and commit
those regenerated files.
In `@internal/guardrails/provider.go`:
- Around line 292-296: The early return on chatHasNonTextContent(req) bypasses
g.pipeline.Process and thus skips all guardrails and the copy-on-write handling;
instead, convert the request's text portion to normalized messages using
chatToMessages(req), call g.pipeline.Process on that []Message to run
validation/blocking guardrails, and then proceed without attempting full
multimodal rewrites—preserve the original request structure and the
guarded-request copy-on-write semantics (as used by GuardedProvider/Router) by
applying only the safe outcomes of the pipeline (allow/deny/metadata) back to
the flow rather than returning req directly or skipping g.pipeline.Process.
In `@internal/providers/anthropic/anthropic.go`:
- Around line 191-195: The anthropicContentSource struct handling currently
rejects any non-`data:` images and doesn't validate media types; update the
logic that processes anthropicContentSource (fields MediaType and Data) to
accept remote image URLs (http/https) in Data in addition to data: URIs and to
enforce an allowlist for MediaType: only "image/jpeg", "image/png", "image/gif",
and "image/webp" are permitted; for data: URIs parse or infer the media type as
before, for URL sources require MediaType to be provided and belong to the
allowlist, and return a clear validation error otherwise. Ensure all places that
inspect anthropicContentSource.Data and anthropicContentSource.MediaType (the
parsing/validation path) are updated to implement these checks.
In `@internal/providers/responses_adapter.go`:
- Around line 81-125: The conversion loop in responses_adapter.go currently uses
continue when encountering malformed or unknown parts (inside the switch
handling "text"/"input_text", "image_url"/"input_image", "input_audio"), which
silently drops content; instead, abort the conversion on the first unsupported
or malformed part so the caller can surface a 400. Replace the continue branches
(including where extractResponsesImageURL or extractResponsesInputAudio return
ok==false, and any default/unknown partType handling) with an immediate failure
return from this conversion function (e.g., return nil, false) so the message
isn’t truncated; ensure behavior is consistent with the textOnly/parts logic so
valid conversions still return the assembled texts or parts.
In `@tests/e2e/chat_test.go`:
- Around line 71-98: The test "multimodal content array" only asserts echoed
text; update it to also verify the image part survives the end-to-end path by
inspecting the recorded upstream request body (the payload sent by
sendChatRequest) and asserting that messages[0].content still contains an
element with Type "image_url" and an ImageURLContent.URL equal to
"https://example.com/image.png"; locate the test block where payload is
constructed (core.ChatRequest, core.ContentPart) and after calling
sendChatRequest/closeBody decode or access the recorded upstream request (the
mock server recorder used by your e2e harness) and add an assertion that the
upstream messages[0].content includes the image part to prevent silent stripping
during binding/normalization.
---
Outside diff comments:
In `@internal/providers/anthropic/anthropic.go`:
- Around line 723-739: The loop handling case []interface{} currently skips
malformed items (when role=="" or ConvertResponsesContentToChatContent returns
ok==false), silently shortening the messages; change this to validate each item
and return a deterministic 400-style error instead of continuing. Specifically,
inside the for loop (the block that inspects msgMap and calls
providers.ConvertResponsesContentToChatContent and
convertMessageContentToAnthropic) replace the continue path with an immediate
error return describing the invalid input, ensuring you still return errors from
convertMessageContentToAnthropic and do not append any anthropicMessage to
anthropicReq.Messages for invalid items; this preserves deterministic behavior
when malformed input[] entries are provided.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 466e5fce-a50b-4c53-a01a-7815672aa45b
📒 Files selected for processing (17)
README.mdcmd/gomodel/docs/docs.gocmd/gomodel/docs/swagger.jsoninternal/core/chat_content.gointernal/core/chat_content_test.gointernal/core/message_json.gointernal/core/types.gointernal/guardrails/provider.gointernal/guardrails/provider_test.gointernal/providers/anthropic/anthropic.gointernal/providers/anthropic/anthropic_test.gointernal/providers/openai/openai_test.gointernal/providers/responses_adapter.gointernal/providers/responses_adapter_test.gointernal/server/handlers_test.gotests/e2e/chat_test.gotests/e2e/mock_provider.go
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/providers/anthropic/anthropic.go (1)
723-758:⚠️ Potential issue | 🟡 MinorReject unsupported
ResponsesRequest.Inputshapes before posting upstream.This switch only handles
stringand[]interface{}.nil, typed slices, or any other value fall through with zero messages and still get sent to Anthropic, which turns a local validation error into a provider-side 400 instead of a gatewayinvalid_request_error.Possible fix
switch input := req.Input.(type) { case string: anthropicReq.Messages = append(anthropicReq.Messages, anthropicMessage{ Role: "user", Content: input, }) case []interface{}: for i, item := range input { msgMap, ok := item.(map[string]interface{}) if !ok { return nil, core.NewInvalidRequestError(fmt.Sprintf("invalid responses input item at index %d: expected object", i), nil) } @@ anthropicReq.Messages = append(anthropicReq.Messages, anthropicMessage{ Role: role, Content: anthropicContent, }) } + default: + return nil, core.NewInvalidRequestError("responses input must be a string or an array of input items", nil) } + + if len(anthropicReq.Messages) == 0 { + return nil, core.NewInvalidRequestError("responses input must contain at least one message", nil) + } return anthropicReq, nil }As per coding guidelines: "Use typed gateway errors (
core.GatewayError) and map upstream failures correctly."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/providers/anthropic/anthropic.go` around lines 723 - 758, The switch that converts req.Input into anthropicReq.Messages currently only accepts string and []interface{} and lets nil/other types fall through, producing zero messages; add a default branch (and explicit nil check) in that conversion logic to reject unsupported shapes by returning a typed gateway error (use core.NewInvalidRequestError or core.GatewayError per convention) with a clear message like "invalid responses input: unsupported type" so we never send empty messages upstream; update the block around anthropicReq.Messages conversion (the switch handling req.Input, which uses providers.ConvertResponsesContentToChatContent and convertMessageContentToAnthropic) to perform this validation and return immediately on invalid input.
♻️ Duplicate comments (4)
internal/core/chat_content.go (1)
33-36:⚠️ Potential issue | 🟠 MajorDon’t rewrite
content:nullto an empty string.This branch makes
null, a missingcontent, and an explicit""indistinguishable. After unmarshal,Message.MarshalJSONcan no longer emitnull, so tool-call/history messages change shape on round-trip instead of being preserved or rejected explicitly. Preservenilhere and letMessage.UnmarshalJSONdecide when null or missing content is valid.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/chat_content.go` around lines 33 - 36, The UnmarshalMessageContent function currently maps JSON null/empty to an empty string, losing distinction between null, missing, and explicit "" and preventing Message.MarshalJSON from re-emitting null; change UnmarshalMessageContent to return nil (not "") when trimmed input is empty or equals "null" so that Message.UnmarshalJSON/Message.MarshalJSON can validate or preserve null vs empty content correctly (update any callers expecting "" accordingly, and ensure Message.UnmarshalJSON handles nil content cases).internal/providers/responses_adapter.go (1)
54-55:⚠️ Potential issue | 🟠 MajorReject malformed Responses items instead of dropping them.
When
ConvertResponsesContentToChatContentreturnsok=false, this code just omits the message. A single bad or unknown multimodal part now truncates the conversation and shifts message ordering instead of surfacing an invalid-request error. Please bubble upcore.NewInvalidRequestError(...)from conversion rather than silently rewriting user input.Also applies to: 71-137
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/providers/responses_adapter.go` around lines 54 - 55, When ConvertResponsesContentToChatContent(msgMap["content"]) returns ok==false you must not silently drop the message; instead return an error created via core.NewInvalidRequestError(...) so the request fails with an explicit invalid-request. Update the branch in responses_adapter.go around the use of ConvertResponsesContentToChatContent (and the other similar usage blocks that call this function) to check the ok flag and, on false, bubble up and return core.NewInvalidRequestError with a clear message identifying the malformed Responses item (including any available detail from the conversion function) rather than omitting the message from the conversation.internal/core/types.go (1)
50-57: 🧹 Nitpick | 🔵 TrivialConsider a typed union for
Message.Contentif multimodal support becomes complex.
Message.Contentusesanyon a public request payload type. While the existing custom JSON marshaling inmessage_json.gohandles both string and[]ContentPartcorrectly, a small typed wrapper (e.g.,MessageContentwith an interface or discriminated union) could improve type clarity. However, this is an optional refactoring—the current approach is pragmatic for OpenAI compatibility and already has proper serialization handling in place.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/types.go` around lines 50 - 57, The Message struct currently exposes Content as any which weakly types public payloads; introduce a typed wrapper (e.g., MessageContent) that represents either a string or []ContentPart and replace Message.Content with MessageContent, then implement/ reuse the existing custom JSON marshal/unmarshal logic (move or adapt code in message_json.go) to handle the two variants; ensure MessageContent references ContentPart and preserves the Swagger extensions/annotations presently on ContentSchema so serialization, validation and OpenAI-compatible payloads remain unchanged.internal/providers/anthropic/anthropic.go (1)
198-202:⚠️ Potential issue | 🟠 MajorSerialize remote image sources with a
urlfield, notdata.The new
http/httpsbranch now accepts remote images locally, but the payload it builds is still{"type":"url","data":"..."}becauseanthropicContentSourcehas nourlfield. If Anthropic expects URL sources on a dedicatedurlproperty, remote-image support is still broken on the wire.Possible fix
type anthropicContentSource struct { Type string `json:"type"` MediaType string `json:"media_type,omitempty"` Data string `json:"data,omitempty"` + URL string `json:"url,omitempty"` } @@ return &anthropicContentSource{ Type: "url", MediaType: mediaType, - Data: raw, + URL: raw, }, nil }Anthropic Messages API image source schema for image blocks: when source.type is "url", does the payload use a "url" field or a "data" field, and is media_type required for URL sources?Also applies to: 823-875
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/providers/anthropic/anthropic.go` around lines 198 - 202, The anthropicContentSource struct currently only has Data for image payloads causing remote URLs to serialize as {"type":"url","data":"..."}; add a Url string `json:"url,omitempty"` to the anthropicContentSource struct (preserving Data for inline/base64 sources) and update the code that builds/serializes message/image payloads (where anthropicContentSource is instantiated) to set Url for type == "url" and Data for inline sources, and ensure MediaType is included when required by the Anthropic Messages API; also update the other occurrences of anthropicContentSource use (the other referenced blocks) to follow the same url vs data logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cmd/gomodel/docs/swagger.json`:
- Around line 1855-1871: The OpenAPI spec still emits
core.ResponsesRequest.input as a string-only field so the /v1/responses request
shape omits multimodal arrays; update the Swagger annotations so
core.ResponsesRequest.input is an array of core.ContentPart (which references
core.ImageURLContent and core.InputAudioContent) consistent with
core.Message.content and the code paths in
internal/providers/responses_adapter_test.go and
internal/providers/anthropic/anthropic.go, then regenerate the spec by running
make swagger (also ensure the alternative occurrences around the other block
noted are updated as well).
In `@internal/core/types.go`:
- Around line 87-96: ResponseMessage.Content is currently a string so JSON nulls
become empty strings; change ResponseMessage to use Content any and add custom
(un)marshal logic mirroring the existing Message implementation in
message_json.go so that null is preserved when ToolCalls is present; update the
struct field signature (ResponseMessage.Content), implement
UnmarshalJSON/MarshalJSON for ResponseMessage using the same null-preservation
pattern as Message, and ensure encoding/decoding treats null vs "" distinctly
for tool-call responses (ToolCall, ResponseMessage, and the existing
message_json.go logic are the reference points to copy/adapt).
In `@internal/guardrails/provider.go`:
- Around line 297-299: The current early-return using chatHasNonTextContent(req)
that calls applySystemMessagesToMultimodalChat discards guardrail rewrites for
text-only messages by restoring every original non-system message; update the
logic so that applySystemMessagesToMultimodalChat (or the
multimodal-preservation branch) only replaces messages that actually contain
non-text parts while preserving modified.Content for text-only messages (i.e.,
iterate the normalized []Message list and, for each message, if message has
non-text content restore original content, otherwise keep modified.Content);
apply the same fix pattern where similar multimodal-restoration occurs (the
other multimodal branches referenced) and ensure this change is used by
GuardedProvider’s upstream flow so guardrail rewrites remain for text-only
turns.
In `@internal/providers/anthropic/anthropic_test.go`:
- Around line 1960-2000: The test
TestConvertToAnthropicRequest_MultimodalRemoteImageContent and the
convertToAnthropicRequest serialization must be updated to match Anthropic's
Messages API: change expectations to assert blocks[0].Source.URL (not Data) for
type "url" and remove the MediaType assertion for URL sources, and change the
implementation that builds anthropicContentBlock.Source for remote images to
emit field name "url" (Source.URL) and omit MediaType when Source.Type == "url";
adjust the same fixes for the other similar tests/serializers referenced (lines
2002–2027, 2029–2055) so all assertions and serialization use Source.URL for URL
sources and do not include MediaType for them.
---
Outside diff comments:
In `@internal/providers/anthropic/anthropic.go`:
- Around line 723-758: The switch that converts req.Input into
anthropicReq.Messages currently only accepts string and []interface{} and lets
nil/other types fall through, producing zero messages; add a default branch (and
explicit nil check) in that conversion logic to reject unsupported shapes by
returning a typed gateway error (use core.NewInvalidRequestError or
core.GatewayError per convention) with a clear message like "invalid responses
input: unsupported type" so we never send empty messages upstream; update the
block around anthropicReq.Messages conversion (the switch handling req.Input,
which uses providers.ConvertResponsesContentToChatContent and
convertMessageContentToAnthropic) to perform this validation and return
immediately on invalid input.
---
Duplicate comments:
In `@internal/core/chat_content.go`:
- Around line 33-36: The UnmarshalMessageContent function currently maps JSON
null/empty to an empty string, losing distinction between null, missing, and
explicit "" and preventing Message.MarshalJSON from re-emitting null; change
UnmarshalMessageContent to return nil (not "") when trimmed input is empty or
equals "null" so that Message.UnmarshalJSON/Message.MarshalJSON can validate or
preserve null vs empty content correctly (update any callers expecting ""
accordingly, and ensure Message.UnmarshalJSON handles nil content cases).
In `@internal/core/types.go`:
- Around line 50-57: The Message struct currently exposes Content as any which
weakly types public payloads; introduce a typed wrapper (e.g., MessageContent)
that represents either a string or []ContentPart and replace Message.Content
with MessageContent, then implement/ reuse the existing custom JSON
marshal/unmarshal logic (move or adapt code in message_json.go) to handle the
two variants; ensure MessageContent references ContentPart and preserves the
Swagger extensions/annotations presently on ContentSchema so serialization,
validation and OpenAI-compatible payloads remain unchanged.
In `@internal/providers/anthropic/anthropic.go`:
- Around line 198-202: The anthropicContentSource struct currently only has Data
for image payloads causing remote URLs to serialize as
{"type":"url","data":"..."}; add a Url string `json:"url,omitempty"` to the
anthropicContentSource struct (preserving Data for inline/base64 sources) and
update the code that builds/serializes message/image payloads (where
anthropicContentSource is instantiated) to set Url for type == "url" and Data
for inline sources, and ensure MediaType is included when required by the
Anthropic Messages API; also update the other occurrences of
anthropicContentSource use (the other referenced blocks) to follow the same url
vs data logic.
In `@internal/providers/responses_adapter.go`:
- Around line 54-55: When
ConvertResponsesContentToChatContent(msgMap["content"]) returns ok==false you
must not silently drop the message; instead return an error created via
core.NewInvalidRequestError(...) so the request fails with an explicit
invalid-request. Update the branch in responses_adapter.go around the use of
ConvertResponsesContentToChatContent (and the other similar usage blocks that
call this function) to check the ok flag and, on false, bubble up and return
core.NewInvalidRequestError with a clear message identifying the malformed
Responses item (including any available detail from the conversion function)
rather than omitting the message from the conversation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 67f69b3d-5855-4705-82fa-e2b6a925df9a
📒 Files selected for processing (16)
cmd/gomodel/docs/docs.gocmd/gomodel/docs/swagger.jsoninternal/core/chat_content.gointernal/core/chat_content_test.gointernal/core/message_json.gointernal/core/types.gointernal/guardrails/provider.gointernal/guardrails/provider_test.gointernal/providers/anthropic/anthropic.gointernal/providers/anthropic/anthropic_test.gointernal/providers/openai/openai_test.gointernal/providers/responses_adapter.gointernal/providers/responses_adapter_test.gointernal/server/handlers_test.gotests/e2e/chat_test.gotests/e2e/mock_provider.go
There was a problem hiding this comment.
Actionable comments posted: 11
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cmd/gomodel/docs/docs.go`:
- Around line 2378-2384: ResponsesRequest.input is documented as only "string"
or "[]ContentPart" but the runtime accepts arrays of input-item/message objects
(tested by TestResponsesWithArrayInput); update the source model/annotations
that define ResponsesRequest.input so the OpenAPI schema allows the additional
array item shape (e.g., include an alternative "items" schema for
input-item/message objects or a oneOf that references both core.ContentPart and
the input-item/message definition), then regenerate the Swagger artifacts and
commit the regenerated docs (ensure cmd/gomodel/docs/docs.go and
cmd/gomodel/docs/swagger.json are updated) so SDKs accept the supported request
shape.
In `@cmd/gomodel/docs/swagger.json`:
- Around line 2303-2317: core.ResponseMessage.content is currently declared as
an untyped empty object, causing SDK generators to lose the intended union
shape; update the swagger definition for core.ResponseMessage.content to match
core.Message.content by adding the same x-oneof annotation (allowing either a
string or an array of core.ContentPart) so generated clients know content can be
a string or an array of ContentPart (refer to core.Message.content and
core.ContentPart for the exact schema shape to copy).
In `@internal/core/chat_content.go`:
- Around line 224-254: normalizeTypedContentPart allows text/input_text parts
with empty Text, which is inconsistent with unmarshalContentPart's requirement
that raw.Text be present; update normalizeTypedContentPart (used by
NormalizeMessageContent) to validate that part.Text is non-empty (e.g., if
part.Text == "" return an error like "text part missing text") so
programmatically-constructed ContentPart{Type:"text", Text:""} is rejected
consistently with JSON unmarshalling; keep the error message descriptive and
return the same error behavior/shape as other branches.
In `@internal/core/message_json.go`:
- Around line 85-100: The special-case for tool-call assistant messages only
checks raw == nil, but legacy messages may have Content: "" (empty string), so
update marshalMessageContent to treat an empty-string MessageContent as null
when len(toolCalls) > 0: change the condition to detect either raw == nil or
(type-assert raw to string and raw == "" or strings.TrimSpace(rawStr) == ""),
and only then set content = nil; otherwise call NormalizeMessageContent(raw) as
before. Reference: function marshalMessageContent and NormalizeMessageContent,
and handle MessageContent values that are empty strings.
In `@internal/core/responses_json_test.go`:
- Around line 34-60: The test TestResponsesRequestMarshalJSON_PreservesInput
currently only checks that decoded["input"] exists; update it to fully assert
the round-tripped nested structure from ResponsesRequest after
json.Marshal/json.Unmarshal by validating the decoded["input"] is an array, that
its first element has "role" == "user", that its "content" is an array whose
first element has "type" == "input_text" and "text" == "hello", so MarshalJSON
is protected from flattening/corruption; locate this logic around
TestResponsesRequestMarshalJSON_PreservesInput and use the decoded map/typed
assertions to fail the test if any of those fields/values are missing or
altered.
In `@internal/core/responses.go`:
- Around line 8-10: The Swagger/OpenAPI schema is wrong because InputSchema is
declared as []ContentPart but the actual payload uses the outer role/content
wrapper (ResponsesInputItem); change the InputSchema field’s type from
[]ContentPart to []ResponsesInputItem (keeping the duplicate json tag and the
extensions metadata) and update the extensions $ref to point to
core.ResponsesInputItem (so the x-oneOf/array items ref matches the real request
shape), ensuring the struct field name InputSchema and the existing
Input/ResponsesInputItem symbols are used to locate and update the declaration.
In `@internal/core/types.go`:
- Around line 96-99: ResponseMessage currently lacks the Swagger/schema metadata
that Message.content carries, so update ResponseMessage.content (type
MessageContent) to mirror the same JSON/tag annotation used on Message.content
(the string-or-[]ContentPart Swagger helper) so the response-side contract is
preserved; then regenerate the API docs and commit the generated files
(cmd/gomodel/docs/docs.go and cmd/gomodel/docs/swagger.json). Target the
ResponseMessage struct and the Message.content annotation to copy the exact
schema helper used previously before running the generator.
In `@internal/guardrails/provider.go`:
- Around line 357-360: The current branch in GuardedProvider's message assembly
silently discards text rewrites for multimodal turns: when
core.HasNonTextContent(original.Content) is true the code appends original
(preserving non-text parts) and ignores modified.Content, making rewrite
guardrails no-op; fix by merging the rewritten text back into the multimodal
original instead of replacing it wholesale or else fail-fast if merging isn't
supported. Locate the logic that builds coreMessages (the block referencing
original, modified, and core.HasNonTextContent) and change it so that when
HasNonTextContent(original.Content) is true you construct a message that
preserves original's non-text parts but sets its text part to modified.Content
(or throw a clear error if you can't safely merge), ensuring rewrite-style
guardrails apply to multimodal messages while still allowing blocking guardrails
to operate.
In `@internal/providers/anthropic/anthropic.go`:
- Around line 828-872: The data URL parsing in anthropicImageSource is brittle
because it expects meta to be exactly "mediatype;base64"; update
anthropicImageSource to robustly parse the meta by splitting meta on ';' into
tokens, treat the first token as the mediaType candidate (use mediaTypeHint if
the first token is empty), and detect the presence of a "base64" token
case-insensitively among the remaining tokens (return the existing
invalid-request error if missing). After extracting mediaType, trim spaces,
validate with isAllowedAnthropicImageMediaType, and proceed to build the
anthropicContentSource as before; keep the rest of the function (URL parsing
branch and returned struct fields) unchanged.
In `@internal/providers/responses_adapter_test.go`:
- Around line 162-192: Add a test case to TestConvertResponsesRequestToChat that
verifies ConvertResponsesRequestToChat returns an error when
core.ResponsesRequest has a nil Input: add a case named like "nil input fails"
with input set to &core.ResponsesRequest{Model: "test-model", Input: nil} and
expectErr: true so the code path that checks for nil Input is covered.
In `@internal/providers/responses_adapter.go`:
- Around line 52-71: Validation trims the role for emptiness but still assigns
the original untrimmed value to core.Message; update the loop so you trim the
role once and reuse the trimmed value: after extracting role :=
msgMap["role"].(string) replace/override it with role = strings.TrimSpace(role)
(or use a new trimmedRole variable) before the empty check and before appending
to chatReq.Messages; ensure the appended core.Message.Role uses the trimmed
value and keep the existing ConvertResponsesContentToChatContent and
chatReq.Messages append logic unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 20806be4-04f2-48c4-a1d7-393d74f15cd6
📒 Files selected for processing (18)
cmd/gomodel/docs/docs.gocmd/gomodel/docs/swagger.jsoninternal/core/chat_content.gointernal/core/chat_content_test.gointernal/core/message_json.gointernal/core/responses.gointernal/core/responses_json.gointernal/core/responses_json_test.gointernal/core/types.gointernal/guardrails/provider.gointernal/guardrails/provider_test.gointernal/providers/anthropic/anthropic.gointernal/providers/anthropic/anthropic_test.gointernal/providers/responses_adapter.gointernal/providers/responses_adapter_test.gotests/contract/testdata/golden/gemini/chat_with_tools.golden.jsontests/contract/testdata/golden/groq/chat_with_tools.golden.jsontests/contract/testdata/golden/openai/chat_with_tools.golden.json
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/providers/responses_adapter.go (1)
45-76:⚠️ Potential issue | 🟠 MajorSupport the typed
ResponsesInputItemform, not just JSON-decoded maps.
ResponsesRequest.Inputis documented asstringor[]core.ResponsesInputItem, but this conversion only accepts the exactencoding/jsonoutput shape:[]interface{}items backed bymap[string]interface{}content parts. Any in-process caller that builds the advertised DTOs directly will hitinvalid responses input: unsupported type. Add typed branches for[]core.ResponsesInputItemand the referenced content-part types so callers do not need to round-trip through untyped maps first.Based on learnings: Guideline: Do not use interface{} or map[string]interface{} for API request/response payload types. Prefer strongly-typed structs for API payload definitions to improve type safety, serialization, and documentation.
Also applies to: 85-150
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/providers/responses_adapter.go` around lines 45 - 76, The switch handling req.Input in ConvertResponsesAdapter only accepts raw JSON-decoded shapes (string or []interface{} with map[string]interface{}) and rejects the documented typed DTOs; update the type switch to also handle []core.ResponsesInputItem (and the concrete content-part types used by those items) by adding cases for []core.ResponsesInputItem and for individual core.ResponsesInputItem when encountered, extracting role and calling ConvertResponsesContentToChatContent (or a new helper that accepts typed content parts) to produce core.Message entries; mirror the same typed-branch additions in the later identical block (lines ~85-150) so in-process callers that pass typed ResponsesRequest/Input structs are accepted without round-tripping through untyped maps.
♻️ Duplicate comments (5)
internal/core/chat_content.go (1)
201-206:⚠️ Potential issue | 🟠 MajorReject empty
textin the JSON/map paths too.
normalizeTypedContentPart()now rejectsContentPart{Type: "text", Text: ""}, but these two branches still accept{"type":"text","text":""}. That leavesUnmarshalMessageContent()andNormalizeMessageContent([]interface{}{...})with a different contract than the typed[]ContentPartpath.🛠️ Proposed fix
switch raw.Type { case "text", "input_text": - if raw.Text == nil { + if raw.Text == nil || *raw.Text == "" { return ContentPart{}, fmt.Errorf("text part is missing text") } return ContentPart{Type: "text", Text: *raw.Text}, nil @@ switch partType { case "text", "input_text": text, ok := partMap["text"].(string) - if !ok { + if !ok || text == "" { return ContentPart{}, fmt.Errorf("text part is missing text") } return ContentPart{Type: "text", Text: text}, nilAlso applies to: 273-278
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/chat_content.go` around lines 201 - 206, The JSON/map branches that handle raw.Type == "text" / "input_text" (seen in the switch inside UnmarshalMessageContent()/NormalizeMessageContent code paths) must enforce the same non-empty Text invariant as normalizeTypedContentPart(); update the branches that currently allow raw.Text == "" to return an error when raw.Text is nil or when *raw.Text == "" (reject empty string), mirroring the ContentPart validation logic used for the typed []ContentPart path so all paths share the same contract.internal/core/responses.go (1)
8-10:⚠️ Potential issue | 🟠 Major
InputSchemastill doesn't describe the nested multimodal payload.This new oneOf only fixes the top-level
inputfield. The referencedcore.ResponsesInputItemstill exposesContent interface{}with no schema metadata, andResponsesContentPartbelow still omitsinput_audio. Generated OpenAPI/SDKs will therefore publish a weaker contract than the runtime accepts for array-form multimodal requests.Based on learnings: Guideline: Do not use interface{} or map[string]interface{} for API request/response payload types. Prefer strongly-typed structs for API payload definitions to improve type safety, serialization, and documentation.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/responses.go` around lines 8 - 10, The Responses API still uses interface{} for nested multimodal payloads so OpenAPI/SDKs omit the real schema; update the types referenced by InputSchema: replace ResponsesInputItem.Content (and any map/interface uses) with explicit, strongly-typed structs/enums that enumerate multimodal parts (e.g., ResponsesContentPart variants) and include the missing input_audio field metadata, and add the appropriate json and extensions/x-oneOf schema tags on ResponsesContentPart and the concrete content structs so the generated OpenAPI shows the array-form multimodal contract correctly; locate and modify the types named ResponsesInputItem, ResponsesContentPart, and the Content field to add these explicit definitions and schema tags.internal/guardrails/provider.go (2)
369-374:⚠️ Potential issue | 🟠 MajorHonor guardrail deletions instead of restoring dropped turns.
If a guardrail returns fewer non-system messages, this block appends the originals back onto the request. That makes delete/trim-style guardrails ineffective on multimodal conversations while the text-only path respects their output. The multimodal branch should honor the shorter
msgsslice instead of resurrecting removed turns.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/guardrails/provider.go` around lines 369 - 374, The current loop appends remaining entries from nonSystemOriginal back into coreMessages, undoing guardrail deletions; instead honor the guardrail's returned msgs slice by not restoring original non-system turns when msgs is shorter. Locate the multimodal merge logic (references: coreMessages, nonSystemOriginal, nextNonSystem, msgs) and remove or gate the restore loop so it only appends originals if the guardrail did not drop any non-system messages (i.e., only when len(msgs) >= len(nonSystemOriginal) or equivalent); ensure nextNonSystem is advanced to match msgs consumption and that coreMessages reflects exactly the guardrail-provided msgs in the multimodal branch.
356-364:⚠️ Potential issue | 🟠 MajorDon't drop tool-call metadata when rebuilding mixed multimodal chats.
Once any message contains non-text content, this branch rebuilds every non-system message from only
RoleandContent. Earlier assistant/tool turns therefore loseToolCalls,ToolCallID,Name, etc., so a valid tool-use conversation breaks as soon as a later turn includes an image/audio part. Preserve the original message envelope and overwrite only the fields guardrails are allowed to change.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/guardrails/provider.go` around lines 356 - 364, The current rebuild loop replaces whole message objects for non-system turns (when core.HasNonTextContent(original.Content) is true) which drops metadata like ToolCalls/ToolCallID/Name; instead, when merging multimodal content, construct the new message by starting from the original nonSystemOriginal[nextNonSystem] envelope and only overwrite the allowed fields (e.g., Content and Role if necessary) with mergedContent/modified.Role, preserving fields such as ToolCalls, ToolCallID, Name, etc.; update the branch that currently appends core.Message{Role: modified.Role, Content: mergedContent} to copy original into a variable, set its Content = mergedContent (and Role if required), then append that preserved-envelope message to coreMessages so tool-call metadata remains intact (refer to nonSystemOriginal, mergeMultimodalContentWithTextRewrite, coreMessages, and modified).internal/core/types.go (1)
48-49:⚠️ Potential issue | 🟠 MajorUse strongly-typed content union instead of
anyfor API request/response payloads.
MessageContentsits on bothMessage.Content(inChatRequest) andResponseMessage.Content(inChatResponse), which are part of the API contract. Usinganyweakens static typing on these request/response payloads and forces callers to perform runtime type assertions inUnmarshalMessageContent,NormalizeMessageContent, and provider conversion functions. A dedicated union-style type with custom JSON marshaling methods would preserve the wire format while maintaining type safety at the API boundary.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/types.go` around lines 48 - 49, Replace the loose alias type MessageContent any with a strongly-typed union struct (e.g., a sum-type like MessageContent{Text *string; Parts *[]MessagePart; Raw json.RawMessage} or an explicit tagged union) and implement custom JSON MarshalJSON/UnmarshalJSON so the wire format stays identical; update the API types that reference it (Message.Content on ChatRequest and ResponseMessage.Content on ChatResponse) to use the new MessageContent type, and refactor UnmarshalMessageContent and NormalizeMessageContent (and any provider conversion helpers) to operate on the new union fields instead of runtime type assertions, preserving existing behavior but restoring compile-time type safety.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cmd/gomodel/docs/swagger.json`:
- Around line 2353-2363: The swagger for core.ResponsesInputItem is missing a
proper schema for the content union; update the Go source annotation for
ResponsesInputItem.content to include a Swagger extension like the existing
ResponsesRequest.InputSchema that defines the union (string OR
[]ResponsesContentPart with multimodal part types text, image_url, input_audio),
then regenerate the OpenAPI JSON by running make swagger so the emitted
swagger.json includes the typed union for core.ResponsesInputItem.content.
In `@internal/core/types.go`:
- Around line 54-60: The Swagger schema for message content must include null to
match runtime tool-call messages: update the ContentSchema field's
extensions:"x-oneOf=..." on the Message and ResponseMessage structs (the Content
and ContentSchema fields in internal/core/types.go) to use
[{"type":"null"},{"type":"string"},{"type":"array","items":{"$ref":"#/definitions/core.ContentPart"}}]
instead of the current two-member union, then regenerate the API docs and SDK
artifacts by re-running the doc/model generator (regenerate
cmd/gomodel/docs/docs.go and cmd/gomodel/docs/swagger.json).
In `@internal/providers/anthropic/anthropic.go`:
- Around line 877-885: The URL validation currently only checks parsed.Scheme
after url.Parse(raw) and accepts scheme-only or relative values; update the
validation in the image URL branch that returns anthropicContentSource to
require an absolute HTTP(S) URI with a non-empty authority/host: after parsed,
err := url.Parse(raw) ensure err == nil, parsed.IsAbs() is true and parsed.Host
(or parsed.Hostname()) is non-empty (reject values like "https:" or "https://");
if any check fails return the same core.NewInvalidRequestError("anthropic chat
image_url must be a data: URL or http/https URL", nil). Use the existing
anthropicContentSource Type/URL return only when those conditions pass.
- Around line 740-758: In convertResponsesRequestToAnthropic, filter out
"system" roles like convertToAnthropic does: after extracting role and
converting content with convertMessageContentToAnthropic, if role == "system"
assign that content to anthropicReq.System (convert the content to the expected
string form) and continue without appending to anthropicReq.Messages; for any
non-"user"/"assistant" role return an invalid-request error; otherwise append to
anthropicReq.Messages as before. Ensure you reference anthropicReq.System,
anthropicReq.Messages, convertMessageContentToAnthropic, and
convertResponsesRequestToAnthropic when making the change.
---
Outside diff comments:
In `@internal/providers/responses_adapter.go`:
- Around line 45-76: The switch handling req.Input in ConvertResponsesAdapter
only accepts raw JSON-decoded shapes (string or []interface{} with
map[string]interface{}) and rejects the documented typed DTOs; update the type
switch to also handle []core.ResponsesInputItem (and the concrete content-part
types used by those items) by adding cases for []core.ResponsesInputItem and for
individual core.ResponsesInputItem when encountered, extracting role and calling
ConvertResponsesContentToChatContent (or a new helper that accepts typed content
parts) to produce core.Message entries; mirror the same typed-branch additions
in the later identical block (lines ~85-150) so in-process callers that pass
typed ResponsesRequest/Input structs are accepted without round-tripping through
untyped maps.
---
Duplicate comments:
In `@internal/core/chat_content.go`:
- Around line 201-206: The JSON/map branches that handle raw.Type == "text" /
"input_text" (seen in the switch inside
UnmarshalMessageContent()/NormalizeMessageContent code paths) must enforce the
same non-empty Text invariant as normalizeTypedContentPart(); update the
branches that currently allow raw.Text == "" to return an error when raw.Text is
nil or when *raw.Text == "" (reject empty string), mirroring the ContentPart
validation logic used for the typed []ContentPart path so all paths share the
same contract.
In `@internal/core/responses.go`:
- Around line 8-10: The Responses API still uses interface{} for nested
multimodal payloads so OpenAPI/SDKs omit the real schema; update the types
referenced by InputSchema: replace ResponsesInputItem.Content (and any
map/interface uses) with explicit, strongly-typed structs/enums that enumerate
multimodal parts (e.g., ResponsesContentPart variants) and include the missing
input_audio field metadata, and add the appropriate json and extensions/x-oneOf
schema tags on ResponsesContentPart and the concrete content structs so the
generated OpenAPI shows the array-form multimodal contract correctly; locate and
modify the types named ResponsesInputItem, ResponsesContentPart, and the Content
field to add these explicit definitions and schema tags.
In `@internal/core/types.go`:
- Around line 48-49: Replace the loose alias type MessageContent any with a
strongly-typed union struct (e.g., a sum-type like MessageContent{Text *string;
Parts *[]MessagePart; Raw json.RawMessage} or an explicit tagged union) and
implement custom JSON MarshalJSON/UnmarshalJSON so the wire format stays
identical; update the API types that reference it (Message.Content on
ChatRequest and ResponseMessage.Content on ChatResponse) to use the new
MessageContent type, and refactor UnmarshalMessageContent and
NormalizeMessageContent (and any provider conversion helpers) to operate on the
new union fields instead of runtime type assertions, preserving existing
behavior but restoring compile-time type safety.
In `@internal/guardrails/provider.go`:
- Around line 369-374: The current loop appends remaining entries from
nonSystemOriginal back into coreMessages, undoing guardrail deletions; instead
honor the guardrail's returned msgs slice by not restoring original non-system
turns when msgs is shorter. Locate the multimodal merge logic (references:
coreMessages, nonSystemOriginal, nextNonSystem, msgs) and remove or gate the
restore loop so it only appends originals if the guardrail did not drop any
non-system messages (i.e., only when len(msgs) >= len(nonSystemOriginal) or
equivalent); ensure nextNonSystem is advanced to match msgs consumption and that
coreMessages reflects exactly the guardrail-provided msgs in the multimodal
branch.
- Around line 356-364: The current rebuild loop replaces whole message objects
for non-system turns (when core.HasNonTextContent(original.Content) is true)
which drops metadata like ToolCalls/ToolCallID/Name; instead, when merging
multimodal content, construct the new message by starting from the original
nonSystemOriginal[nextNonSystem] envelope and only overwrite the allowed fields
(e.g., Content and Role if necessary) with mergedContent/modified.Role,
preserving fields such as ToolCalls, ToolCallID, Name, etc.; update the branch
that currently appends core.Message{Role: modified.Role, Content: mergedContent}
to copy original into a variable, set its Content = mergedContent (and Role if
required), then append that preserved-envelope message to coreMessages so
tool-call metadata remains intact (refer to nonSystemOriginal,
mergeMultimodalContentWithTextRewrite, coreMessages, and modified).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: ac5957b3-6281-48b6-858a-5e1ff5b0d92f
📒 Files selected for processing (14)
cmd/gomodel/docs/docs.gocmd/gomodel/docs/swagger.jsoninternal/core/chat_content.gointernal/core/chat_content_test.gointernal/core/message_json.gointernal/core/responses.gointernal/core/responses_json_test.gointernal/core/types.gointernal/guardrails/provider.gointernal/guardrails/provider_test.gointernal/providers/anthropic/anthropic.gointernal/providers/anthropic/anthropic_test.gointernal/providers/responses_adapter.gointernal/providers/responses_adapter_test.go
There was a problem hiding this comment.
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
internal/providers/responses_adapter.go (2)
23-34:⚠️ Potential issue | 🟠 MajorForward
reasoningandstream_optionsintochatReq.
ConvertResponsesRequestToChatnow validates the request, but it still dropsreq.Reasoningandreq.StreamOptions. That makes/v1/responsessilently ignore reasoning effort andstream_options.include_usageon the chat-backed path.Proposed fix
func ConvertResponsesRequestToChat(req *core.ResponsesRequest) (*core.ChatRequest, error) { chatReq := &core.ChatRequest{ - Model: req.Model, - Provider: req.Provider, - Messages: make([]core.Message, 0), - Temperature: req.Temperature, - Stream: req.Stream, + Model: req.Model, + Provider: req.Provider, + Messages: make([]core.Message, 0), + Reasoning: req.Reasoning, + Temperature: req.Temperature, + Stream: req.Stream, + StreamOptions: req.StreamOptions, }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/providers/responses_adapter.go` around lines 23 - 34, ConvertResponsesRequestToChat currently omits forwarding req.Reasoning and req.StreamOptions into the constructed *core.ChatRequest, causing the /v1/responses chat-backed path to ignore reasoning and stream_options.include_usage; update ConvertResponsesRequestToChat to set chatReq.Reasoning = req.Reasoning and chatReq.StreamOptions = req.StreamOptions (or map include_usage from req.StreamOptions to chatReq.StreamOptions.IncludeUsage if shapes differ) so the reasoning effort and stream options are preserved when converting ResponsesRequest to ChatRequest.
289-321:⚠️ Potential issue | 🟠 MajorPreserve detailed usage when wrapping chat responses.
This adapter only copies token totals.
prompt_tokens_details,completion_tokens_details, andraw_usageare dropped, so/v1/responsesloses image/audio token accounting for multimodal traffic routed through chat.Proposed fix
Usage: &core.ResponsesUsage{ - InputTokens: resp.Usage.PromptTokens, - OutputTokens: resp.Usage.CompletionTokens, - TotalTokens: resp.Usage.TotalTokens, + InputTokens: resp.Usage.PromptTokens, + OutputTokens: resp.Usage.CompletionTokens, + TotalTokens: resp.Usage.TotalTokens, + PromptTokensDetails: resp.Usage.PromptTokensDetails, + CompletionTokensDetails: resp.Usage.CompletionTokensDetails, + RawUsage: resp.Usage.RawUsage, },🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/providers/responses_adapter.go` around lines 289 - 321, ConvertChatResponseToResponses currently only copies total token counts into core.ResponsesResponse. Update the mapping in ConvertChatResponseToResponses to preserve detailed usage from resp.Usage by populating the corresponding fields on core.ResponsesUsage (e.g., prompt_tokens_details, completion_tokens_details, raw_usage or their Go struct field equivalents) so image/audio/multimodal token breakdowns are forwarded; make sure to reference resp.Usage.<fields> when building the ResponsesUsage object returned by ConvertChatResponseToResponses so no detailed usage data is dropped.internal/providers/anthropic/anthropic.go (1)
368-376:⚠️ Potential issue | 🔴 CriticalGuard nil requests before dereferencing
req.Both converters read
req.Modelbefore checking whether the pointer is nil.ChatCompletion,StreamChatCompletion,Responses, andStreamResponsescall these helpers directly, so a nil request will panic instead of returning a typed invalid-request error.🐛 Proposed fix
func convertToAnthropicRequest(req *core.ChatRequest) (*anthropicRequest, error) { + if req == nil { + return nil, core.NewInvalidRequestError("chat request is required", nil) + } anthropicReq := &anthropicRequest{ Model: req.Model, Messages: make([]anthropicMessage, 0, len(req.Messages)), MaxTokens: 4096, // Default max tokens Temperature: req.Temperature, @@ func convertResponsesRequestToAnthropic(req *core.ResponsesRequest) (*anthropicRequest, error) { + if req == nil { + return nil, core.NewInvalidRequestError("responses request is required", nil) + } anthropicReq := &anthropicRequest{ Model: req.Model, Messages: make([]anthropicMessage, 0), MaxTokens: 4096, // Default max tokens Temperature: req.Temperature,Also applies to: 701-709
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/providers/anthropic/anthropic.go` around lines 368 - 376, The converter convertToAnthropicRequest dereferences req (e.g., req.Model) before checking for nil which will panic when callers like ChatCompletion, StreamChatCompletion, Responses, or StreamResponses pass a nil pointer; add a nil guard at the top of convertToAnthropicRequest (and the corresponding converter at lines ~701-709) that returns a typed invalid-request error (rather than panicking) when req == nil, and ensure callers receive that error instead of causing a runtime panic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/core/responses.go`:
- Around line 6-18: ResponsesRequest.Input and ResponsesInputItem.Content
currently use interface{} which leaks dynamic payloads across the codebase;
replace these fields with concrete union wrapper types (e.g.,
ResponsesInputUnion and ResponsesContentUnion) that implement custom
MarshalJSON/UnmarshalJSON to accept either a string or an array/object form,
update struct field types (Input, InputSchema items' Content) to these wrappers,
keep the same JSON tags/Swagger extensions, and update any code paths that
read/write these fields to use the new wrapper's typed accessors so callers no
longer handle map[string]interface{} or []interface{} directly.
In `@internal/core/types.go`:
- Around line 48-60: Replace the untyped MessageContent alias with a concrete
union DTO that represents the three allowed shapes (null, string, []ContentPart)
and implement custom MarshalJSON/UnmarshalJSON on that DTO to
serialize/deserialize accordingly; update the Message struct (and
ResponseMessage equivalents) to use this new type instead of MessageContent,
preserve the existing ContentSchema field and its swagger extensions, and ensure
the new type exposes accessor methods (e.g., IsNull, AsString, AsParts) to avoid
consumers doing runtime type switches while keeping ContentPart referenced for
the array case.
In `@internal/guardrails/provider.go`:
- Around line 297-300: The current branch only uses
applySystemMessagesToMultimodalChat when chatHasNonTextContent(req) is true,
causing messages with assistant ToolCalls but no non-text content to be passed
to applyMessagesToChat and lose ToolCalls; update the condition in this block to
detect either non-text content OR presence of tool calls in the normalized
messages (e.g., check for ToolCalls on message DTOs or use a helper that
inspects messages for tool_call entries) and call
applySystemMessagesToMultimodalChat(req, modified) in that case so tool-call
turns are preserved (refer to chatHasNonTextContent,
applySystemMessagesToMultimodalChat, applyMessagesToChat, and the
guarded/normalized Message DTOs used by GuardedProvider).
- Around line 316-320: chatToMessages currently calls core.ExtractTextContent
directly and can silently convert unsupported m.Content into empty strings;
instead validate and normalize m.Content before flattening: inside
chatToMessages (and any helper you add), switch/inspect m.Content against the
supported content shapes, produce a normalized Message with Role and extracted
text only for valid shapes, and return a typed invalid-request error (e.g.,
wrap/return a sentinel InvalidRequestError) when m.Content is not one of the
supported types so the guardrail layer can reject the request rather than
forwarding empty messages.
In `@internal/providers/anthropic/anthropic.go`:
- Around line 386-393: The loop over req.Messages overwrites anthropicReq.System
for each "system" role so earlier system instructions are lost; modify the
handling in the loop (and the other similar block around the other occurrence)
to append/merge system texts instead of replacing them: call
textOnlyAnthropicContent(msg.Content) as currently done, then concatenate the
returned systemText to anthropicReq.System (with a separator like "\n\n") or
otherwise accumulate them into a slice and join before assigning, ensuring all
system turns are preserved rather than only keeping the last one.
- Around line 187-203: The anthropicMessage struct uses Content any which loses
compile-time validation; replace it with a typed union wrapper (e.g., a new type
AnthropicsContent or AnthropcContentUnion) that can represent either a string or
[]anthropicContentBlock and implement custom MarshalJSON (and UnmarshalJSON if
needed) to serialize as either a JSON string or an array of
anthropicContentBlock, referencing the existing anthropicContentBlock and
anthropicContentSource types; update anthropicMessage.Content to use this new
union type and apply the same change to the other occurrences noted (the
analogous structs/fields around the 766–809 region) so JSON payloads are
validated and serialized correctly.
In `@internal/providers/responses_adapter.go`:
- Around line 22-23: Update the exported Go doc for
ConvertResponsesRequestToChat to reflect the new behavior: state that it
converts a core.ResponsesRequest to a core.ChatRequest and also validates the
input shape, returning an error if validation fails; mention the error-returning
contract so callers know to handle the error from ConvertResponsesRequestToChat.
---
Outside diff comments:
In `@internal/providers/anthropic/anthropic.go`:
- Around line 368-376: The converter convertToAnthropicRequest dereferences req
(e.g., req.Model) before checking for nil which will panic when callers like
ChatCompletion, StreamChatCompletion, Responses, or StreamResponses pass a nil
pointer; add a nil guard at the top of convertToAnthropicRequest (and the
corresponding converter at lines ~701-709) that returns a typed invalid-request
error (rather than panicking) when req == nil, and ensure callers receive that
error instead of causing a runtime panic.
In `@internal/providers/responses_adapter.go`:
- Around line 23-34: ConvertResponsesRequestToChat currently omits forwarding
req.Reasoning and req.StreamOptions into the constructed *core.ChatRequest,
causing the /v1/responses chat-backed path to ignore reasoning and
stream_options.include_usage; update ConvertResponsesRequestToChat to set
chatReq.Reasoning = req.Reasoning and chatReq.StreamOptions = req.StreamOptions
(or map include_usage from req.StreamOptions to
chatReq.StreamOptions.IncludeUsage if shapes differ) so the reasoning effort and
stream options are preserved when converting ResponsesRequest to ChatRequest.
- Around line 289-321: ConvertChatResponseToResponses currently only copies
total token counts into core.ResponsesResponse. Update the mapping in
ConvertChatResponseToResponses to preserve detailed usage from resp.Usage by
populating the corresponding fields on core.ResponsesUsage (e.g.,
prompt_tokens_details, completion_tokens_details, raw_usage or their Go struct
field equivalents) so image/audio/multimodal token breakdowns are forwarded;
make sure to reference resp.Usage.<fields> when building the ResponsesUsage
object returned by ConvertChatResponseToResponses so no detailed usage data is
dropped.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 2d529ade-8f40-4a56-99dd-d8b3f069faae
📒 Files selected for processing (12)
cmd/gomodel/docs/docs.gocmd/gomodel/docs/swagger.jsoninternal/core/chat_content.gointernal/core/chat_content_test.gointernal/core/responses.gointernal/core/types.gointernal/guardrails/provider.gointernal/guardrails/provider_test.gointernal/providers/anthropic/anthropic.gointernal/providers/anthropic/anthropic_test.gointernal/providers/responses_adapter.gointernal/providers/responses_adapter_test.go
internal/core/responses.go
Outdated
| Model string `json:"model"` | ||
| Provider string `json:"provider,omitempty"` | ||
| Input interface{} `json:"input"` // string or []ResponsesInputItem — see docs for array form | ||
| //nolint:govet // Intentional duplicate json tag for Swagger docs: input is string OR []ResponsesInputItem. | ||
| InputSchema []ResponsesInputItem `json:"input,omitempty" extensions:"x-oneOf=[{\"type\":\"string\"},{\"type\":\"array\",\"items\":{\"$ref\":\"#/definitions/core.ResponsesInputItem\"}}]"` | ||
| Instructions string `json:"instructions,omitempty"` | ||
| Tools []map[string]any `json:"tools,omitempty"` | ||
| Temperature *float64 `json:"temperature,omitempty"` | ||
| MaxOutputTokens *int `json:"max_output_tokens,omitempty"` | ||
| Stream bool `json:"stream,omitempty"` | ||
| StreamOptions *StreamOptions `json:"stream_options,omitempty"` | ||
| Metadata map[string]string `json:"metadata,omitempty"` | ||
| Reasoning *Reasoning `json:"reasoning,omitempty"` |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Avoid interface{} on the Responses DTOs.
ResponsesRequest.Input and ResponsesInputItem.Content expose raw dynamic payloads on the API boundary, so the rest of the codebase now has to keep carrying []interface{} / map[string]interface{} normalization logic around them. Please wrap these unions in concrete types with custom JSON methods instead of publishing interface{} fields.
Based on learnings: "Do not use interface{} or map[string]interface{} for API request/response payload types. Prefer strongly-typed structs for API payload definitions to improve type safety, serialization, and documentation."
Also applies to: 39-45
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/core/responses.go` around lines 6 - 18, ResponsesRequest.Input and
ResponsesInputItem.Content currently use interface{} which leaks dynamic
payloads across the codebase; replace these fields with concrete union wrapper
types (e.g., ResponsesInputUnion and ResponsesContentUnion) that implement
custom MarshalJSON/UnmarshalJSON to accept either a string or an array/object
form, update struct field types (Input, InputSchema items' Content) to these
wrappers, keep the same JSON tags/Swagger extensions, and update any code paths
that read/write these fields to use the new wrapper's typed accessors so callers
no longer handle map[string]interface{} or []interface{} directly.
| // MessageContent stores message content as either text or structured parts. | ||
| type MessageContent any | ||
|
|
||
| // Message represents a single message in the chat. | ||
| type Message struct { | ||
| Role string `json:"role"` | ||
| Content string `json:"content"` | ||
| ToolCalls []ToolCall `json:"tool_calls,omitempty"` | ||
| Role string `json:"role"` | ||
| // Content accepts either a plain string or an array of ContentPart values. | ||
| // This preserves OpenAI-compatible multimodal chat payloads. | ||
| Content MessageContent `json:"content"` | ||
| //nolint:govet // Intentional duplicate json tag for Swagger docs: content is null OR string OR []ContentPart. | ||
| // ContentSchema documents that `content` accepts either a plain string | ||
| // or an array of ContentPart values. | ||
| ContentSchema []ContentPart `json:"content,omitempty" extensions:"x-oneOf=[{\"type\":\"null\"},{\"type\":\"string\"},{\"type\":\"array\",\"items\":{\"$ref\":\"#/definitions/core.ContentPart\"}}]"` |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Replace MessageContent with a concrete union DTO.
MessageContent being any leaks an untyped payload through the public chat request/response surface and forces runtime type switching in every consumer. Please model the string / []ContentPart / null union as a real type with custom JSON methods instead of exposing any on Message and ResponseMessage.
Based on learnings: "Do not use interface{} or map[string]interface{} for API request/response payload types. Prefer strongly-typed structs for API payload definitions to improve type safety, serialization, and documentation."
Also applies to: 95-100
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/core/types.go` around lines 48 - 60, Replace the untyped
MessageContent alias with a concrete union DTO that represents the three allowed
shapes (null, string, []ContentPart) and implement custom
MarshalJSON/UnmarshalJSON on that DTO to serialize/deserialize accordingly;
update the Message struct (and ResponseMessage equivalents) to use this new type
instead of MessageContent, preserve the existing ContentSchema field and its
swagger extensions, and ensure the new type exposes accessor methods (e.g.,
IsNull, AsString, AsParts) to avoid consumers doing runtime type switches while
keeping ContentPart referenced for the array case.
| type anthropicMessage struct { | ||
| Role string `json:"role"` | ||
| Content string `json:"content"` | ||
| Content any `json:"content"` | ||
| } | ||
|
|
||
| type anthropicContentBlock struct { | ||
| Type string `json:"type"` | ||
| Text string `json:"text,omitempty"` | ||
| Source *anthropicContentSource `json:"source,omitempty"` | ||
| } | ||
|
|
||
| type anthropicContentSource struct { | ||
| Type string `json:"type"` | ||
| MediaType string `json:"media_type,omitempty"` | ||
| Data string `json:"data,omitempty"` | ||
| URL string `json:"url,omitempty"` | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
head -n 220 internal/providers/anthropic/anthropic.go | tail -n 40Repository: ENTERPILOT/GOModel
Length of output: 1282
🏁 Script executed:
sed -n '766,809p' internal/providers/anthropic/anthropic.goRepository: ENTERPILOT/GOModel
Length of output: 1355
🏁 Script executed:
rg "convertMessageContentToAnthropic" internal/providers/anthropic/ -A 10 -B 2Repository: ENTERPILOT/GOModel
Length of output: 2818
🏁 Script executed:
rg "anthropicMessage" internal/providers/anthropic/ -B 3 -A 3Repository: ENTERPILOT/GOModel
Length of output: 3410
Replace any on anthropicMessage.Content with a typed union.
anthropicMessage is part of the Anthropic API request payload and currently uses any for the Content field, which can be either a string or []anthropicContentBlock. This removes compile-time validation for a critical JSON shape. Use a typed wrapper with custom MarshalJSON to make the union explicit while preserving JSON serialization.
This also applies to lines 766–809.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/providers/anthropic/anthropic.go` around lines 187 - 203, The
anthropicMessage struct uses Content any which loses compile-time validation;
replace it with a typed union wrapper (e.g., a new type AnthropicsContent or
AnthropcContentUnion) that can represent either a string or
[]anthropicContentBlock and implement custom MarshalJSON (and UnmarshalJSON if
needed) to serialize as either a JSON string or an array of
anthropicContentBlock, referencing the existing anthropicContentBlock and
anthropicContentSource types; update anthropicMessage.Content to use this new
union type and apply the same change to the other occurrences noted (the
analogous structs/fields around the 766–809 region) so JSON payloads are
validated and serialized correctly.
Eliminate duplicate ResponsesContentPart struct (identical to ContentPart) and use ContentPart everywhere. Add 11 tests covering input_audio unmarshaling, normalization, validation, and mixed content scenarios. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Deduplicate isNullEquivalentContent (was two identical functions) - Remove unreachable trailing loop in applySystemMessagesToMultimodalChat - Remove unused anthropicMessageContentBlock type alias - Simplify single-variable var block - Add test for input_text → text normalization in Anthropic conversion - Add test for maxContentParts overflow guard Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add dupl linter (threshold 150, excluded for test files) to catch duplicate code blocks. Extract shared conversation thread-walking logic into buildConversationThread helper used by both MongoDB and PostgreSQL readers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/providers/anthropic/anthropic_test.go (1)
2404-2430:⚠️ Potential issue | 🟡 MinorAdd a Responses-path multimodal image case here.
convertToAnthropicRequestnow has image coverage, butconvertResponsesRequestToAnthropicis still only exercised with text-only content. A regression in the Responses-specific image block mapping would go unnoticed.As per coding guidelines,
**/*_test.go: Add or update tests for any behavior change.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/providers/anthropic/anthropic_test.go` around lines 2404 - 2430, Add a Responses-path multimodal image test case in internal/providers/anthropic/anthropic_test.go that exercises convertResponsesRequestToAnthropic (similar to the existing "array input with content parts" case): create a core.ResponsesRequest with Input containing a content array that includes an image part (e.g., map with "type":"image" and an image URL or data) and call convertResponsesRequestToAnthropic; then assert on the returned *anthropicRequest that the image was mapped on the Responses path (inspect req.Messages for the expected image-related field or content and length as done for text parts). This ensures convertResponsesRequestToAnthropic and anthropicRequest handle multimodal image content.
♻️ Duplicate comments (1)
internal/providers/anthropic/anthropic.go (1)
202-205: 🧹 Nitpick | 🔵 TrivialConsider using a typed union for
anthropicMessage.Content.The
Content anyfield loses compile-time validation. A typed wrapper with customMarshalJSONwould make the union explicit (string vs[]anthropicContentBlock). This is a maintainability improvement rather than a correctness issue since the current code works correctly at runtime.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/providers/anthropic/anthropic.go` around lines 202 - 205, The anthropicMessage struct currently uses Content any which loses compile-time validation; replace it with a typed union (e.g., define a new type anthropicContent that can hold either a string or []anthropicContentBlock) and implement custom MarshalJSON and UnmarshalJSON on anthropicContent to serialize as a JSON string when it contains plain text and as an array when it contains blocks; update anthropicMessage to use Content anthropicContent so callers get type safety and the JSON shape remains identical for the Anthropic API.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@CLAUDE.md`:
- Around line 167-168: Update the checklist so the README is explicitly
mentioned: restore "README.md" as a third documentation layer alongside the
existing "In-code documentation" (Go doc comments on exported
types/functions/interfaces) and "Mintlify / technical docs" (`docs/` directory);
change the header text that currently says "all three documentation layers" to
match the three items and add a bullet (or modify the existing second bullet)
that instructs contributors to update README.md when public APIs, features, or
configuration docs change, ensuring the README is treated equally with Go doc
comments and docs/*.mdx.
In `@internal/core/chat_content_test.go`:
- Around line 135-166: Add a happy-path multimodal round-trip test for
ResponseMessage by creating a new test (e.g.,
TestResponseMessageMultimodalRoundTrip) that constructs a ResponseMessage with
non-nil Content representing multimodal data (not just content:null), includes
ToolCalls if relevant, marshals it with json.Marshal, unmarshals back with
json.Unmarshal into a ResponseMessage, and asserts that the unmarshaled struct
(fields like Content, Role, ToolCalls, and any multimodal-specific fields)
equals the original; place assertions to fail on marshal/unmarshal errors and to
compare Content deeply so regressions in the new content marshal/unmarshal path
are caught.
In `@internal/core/responses_json_test.go`:
- Around line 154-188: Update
TestResponsesRequestMarshalJSON_PreservesTypedInputItemContent to also include a
typed multimodal content item in ResponsesRequest.Input so we cover
slice-of-typed-content behavior: construct a ResponsesInputItem whose Content is
a typed multimodal value (e.g., a slice/struct representing a multimodal content
item with nested fields), marshal and unmarshal the request the same way, and
add assertions that the nested fields inside that typed content are preserved
(exist and have expected values) after json round-trip; target the
ResponsesRequest and ResponsesInputItem types when locating where to modify the
test.
In `@internal/core/responses.go`:
- Around line 6-20: The public DTO narrows array inputs to []ResponsesInputItem
which loses supported variants (e.g., function_call and function_call_output)
used by the adapter; change the InputSchema field to use a discriminated union
type (e.g., ResponsesInputElement or similar) that explicitly models all allowed
array item shapes (chat-style {role,content}, function_call,
function_call_output with their specific fields), update any custom JSON
marshal/unmarshal logic on ResponsesInputItem to instead live on the new union
type (or implement a custom UnmarshalJSON that switches on "type" to populate
the correct concrete struct), and adjust code that references
InputSchema/ResponsesInputItem so the adapter’s handling (the logic around tool
continuation) can round-trip without dropping fields. Ensure the new union type
is strongly typed (no map[string]interface{}), preserves existing field
tags/oneOf metadata for docs, and is used in the struct field currently named
InputSchema.
In `@internal/guardrails/provider.go`:
- Around line 385-396: The code assigns preserved := original which shares
slices/pointers (e.g., ToolCalls) with the caller; instead, make a clone of the
envelope before modifying Role/Content to preserve copy-on-write semantics:
create a new envelope value (copy primitive fields) and deep-copy any slice or
pointer fields such as ToolCalls and nested pointers, then set preserved.Role
and preserved.Content (using mergeMultimodalContentWithTextRewrite when
core.HasNonTextContent(original.Content) is true) and append that cloned
preserved to coreMessages so downstream mutations won’t affect req.Messages.
- Around line 438-458: The current merge logic collapses multiple text segments
into one rewritten text slot (variables: parts, merged, rewrittenText,
textPartCount, originalTexts, insertedRewrittenText), which changes ordering for
multimodal turns like text->image->text; update the code to fail closed instead
of collapsing: if textPartCount > 1 and rewrittenText is non-empty and
rewrittenText != strings.Join(originalTexts, " "), return an invalid-request
error (instead of returning a copied parts slice) unless you implement logic to
preserve each original text slot (i.e., apply rewrittenText to each
corresponding text part rather than collapsing them); update the return path to
produce that error where the current copied return is used.
In `@internal/providers/openai/openai_test.go`:
- Around line 181-203: Validate the nested image_url payload in the test by
asserting that second["image_url"] exists and is a map[string]interface{} (e.g.,
imageURL, ok := second["image_url"].(map[string]interface{})) and then assert
that imageURL["url"] is a non-empty string (e.g., url, ok :=
imageURL["url"].(string) and url != ""), so the test fails if the provider emits
type:"image_url" but omits the actual URL; update the checks around
messages/message/content/second in openai_test.go accordingly.
In `@internal/providers/responses_adapter.go`:
- Around line 159-163: The call_id returned by firstNonEmptyString is not
trimmed so values like " call_1 " differ from "call_1"; update the code where
callID is set (the local variable callID used before calling
ResponsesFunctionCallCallID) to run strings.TrimSpace(callID) and then pass the
trimmed value into ResponsesFunctionCallCallID; make the identical change in the
other block that builds a callID for function_call_output (the second callID
assignment later in the file) so all stored call IDs are normalized.
---
Outside diff comments:
In `@internal/providers/anthropic/anthropic_test.go`:
- Around line 2404-2430: Add a Responses-path multimodal image test case in
internal/providers/anthropic/anthropic_test.go that exercises
convertResponsesRequestToAnthropic (similar to the existing "array input with
content parts" case): create a core.ResponsesRequest with Input containing a
content array that includes an image part (e.g., map with "type":"image" and an
image URL or data) and call convertResponsesRequestToAnthropic; then assert on
the returned *anthropicRequest that the image was mapped on the Responses path
(inspect req.Messages for the expected image-related field or content and length
as done for text parts). This ensures convertResponsesRequestToAnthropic and
anthropicRequest handle multimodal image content.
---
Duplicate comments:
In `@internal/providers/anthropic/anthropic.go`:
- Around line 202-205: The anthropicMessage struct currently uses Content any
which loses compile-time validation; replace it with a typed union (e.g., define
a new type anthropicContent that can hold either a string or
[]anthropicContentBlock) and implement custom MarshalJSON and UnmarshalJSON on
anthropicContent to serialize as a JSON string when it contains plain text and
as an array when it contains blocks; update anthropicMessage to use Content
anthropicContent so callers get type safety and the JSON shape remains identical
for the Anthropic API.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: ca710c4d-c0c9-46b9-af96-03fa3229b946
📒 Files selected for processing (19)
CLAUDE.mdcmd/gomodel/docs/docs.gocmd/gomodel/docs/swagger.jsoninternal/core/chat_content_test.gointernal/core/message_json.gointernal/core/responses.gointernal/core/responses_json.gointernal/core/responses_json_test.gointernal/core/types.gointernal/core/types_test.gointernal/guardrails/provider.gointernal/guardrails/provider_test.gointernal/providers/anthropic/anthropic.gointernal/providers/anthropic/anthropic_test.gointernal/providers/openai/openai_test.gointernal/providers/responses_adapter.gointernal/providers/responses_adapter_test.gotests/e2e/chat_test.gotests/e2e/mock_provider.go
| 1. **In-code documentation** (Go doc comments on exported types, functions, interfaces) — Update when changing public APIs, adding new exported symbols, or modifying function signatures/behavior. | ||
| 2. **Mintlify / technical docs** (`docs/` directory) — Update `docs/advanced/*.mdx` pages when changing configuration options or guardrails behavior. Update `docs/adr/` when making significant architectural decisions. Update `docs/plans/` if implementation diverges from existing plans. Check `docs.json` if new pages need to be added to the navigation. |
There was a problem hiding this comment.
Add README.md back to this checklist.
This section still says “all three documentation layers,” but the list now only includes two. That will steer contributors away from updating the README when public API or feature docs change.
Based on learnings, Check all relevant documentation layers: README.md, exported Go doc comments, and docs/ technical documentation.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@CLAUDE.md` around lines 167 - 168, Update the checklist so the README is
explicitly mentioned: restore "README.md" as a third documentation layer
alongside the existing "In-code documentation" (Go doc comments on exported
types/functions/interfaces) and "Mintlify / technical docs" (`docs/` directory);
change the header text that currently says "all three documentation layers" to
match the three items and add a bullet (or modify the existing second bullet)
that instructs contributors to update README.md when public APIs, features, or
configuration docs change, ensuring the README is treated equally with Go doc
comments and docs/*.mdx.
| func TestResponseMessageMarshalJSON_PreservesNullContentForToolCalls(t *testing.T) { | ||
| body, err := json.Marshal(ResponseMessage{ | ||
| Role: "assistant", | ||
| Content: nil, | ||
| ToolCalls: []ToolCall{ | ||
| { | ||
| ID: "call_123", | ||
| Type: "function", | ||
| Function: FunctionCall{ | ||
| Name: "lookup", | ||
| Arguments: "{}", | ||
| }, | ||
| }, | ||
| }, | ||
| }) | ||
| if err != nil { | ||
| t.Fatalf("json.Marshal() error = %v", err) | ||
| } | ||
| if !strings.Contains(string(body), `"content":null`) { | ||
| t.Fatalf("expected content:null, got %s", string(body)) | ||
| } | ||
| } | ||
|
|
||
| func TestResponseMessageUnmarshalJSON_PreservesNullContentForToolCalls(t *testing.T) { | ||
| var msg ResponseMessage | ||
| if err := json.Unmarshal([]byte(`{"role":"assistant","content":null,"tool_calls":[{"id":"call_1","type":"function","function":{"name":"lookup","arguments":"{}"}}]}`), &msg); err != nil { | ||
| t.Fatalf("json.Unmarshal() error = %v", err) | ||
| } | ||
| if msg.Content != nil { | ||
| t.Fatalf("Content = %#v, want nil", msg.Content) | ||
| } | ||
| } |
There was a problem hiding this comment.
Add a happy-path multimodal ResponseMessage round-trip test.
These new cases only exercise ResponseMessage with content:null. A regression in the new content-part marshal/unmarshal path would still pass here.
As per coding guidelines, **/*_test.go: Add or update tests for any behavior change.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/core/chat_content_test.go` around lines 135 - 166, Add a happy-path
multimodal round-trip test for ResponseMessage by creating a new test (e.g.,
TestResponseMessageMultimodalRoundTrip) that constructs a ResponseMessage with
non-nil Content representing multimodal data (not just content:null), includes
ToolCalls if relevant, marshals it with json.Marshal, unmarshals back with
json.Unmarshal into a ResponseMessage, and asserts that the unmarshaled struct
(fields like Content, Role, ToolCalls, and any multimodal-specific fields)
equals the original; place assertions to fail on marshal/unmarshal errors and to
compare Content deeply so regressions in the new content marshal/unmarshal path
are caught.
| preserved := original | ||
| preserved.Role = modified.Role | ||
| if core.HasNonTextContent(original.Content) { | ||
| mergedContent, err := mergeMultimodalContentWithTextRewrite(original.Content, modified.Content) | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| preserved.Content = mergedContent | ||
| } else { | ||
| preserved.Content = modified.Content | ||
| } | ||
| coreMessages = append(coreMessages, preserved) |
There was a problem hiding this comment.
Clone preserved envelopes before reusing them.
preserved := original keeps the original ToolCalls slice (and any reused nested pointers) attached to the guarded request. That means this path is not fully copy-on-write: if a downstream provider mutates the preserved envelope, the caller’s req.Messages can be mutated too. Clone the preserved fields before appending preserved.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/guardrails/provider.go` around lines 385 - 396, The code assigns
preserved := original which shares slices/pointers (e.g., ToolCalls) with the
caller; instead, make a clone of the envelope before modifying Role/Content to
preserve copy-on-write semantics: create a new envelope value (copy primitive
fields) and deep-copy any slice or pointer fields such as ToolCalls and nested
pointers, then set preserved.Role and preserved.Content (using
mergeMultimodalContentWithTextRewrite when
core.HasNonTextContent(original.Content) is true) and append that cloned
preserved to coreMessages so downstream mutations won’t affect req.Messages.
| for _, part := range parts { | ||
| if part.Type == "text" { | ||
| textPartCount++ | ||
| hadTextPart = true | ||
| originalTexts = append(originalTexts, part.Text) | ||
| if !insertedRewrittenText { | ||
| if rewrittenText != "" { | ||
| merged = append(merged, core.ContentPart{Type: "text", Text: rewrittenText}) | ||
| } | ||
| insertedRewrittenText = true | ||
| } | ||
| continue | ||
| } | ||
| merged = append(merged, part) | ||
| } | ||
|
|
||
| if textPartCount > 1 && rewrittenText == strings.Join(originalTexts, " ") { | ||
| copied := make([]core.ContentPart, len(parts)) | ||
| copy(copied, parts) | ||
| return copied, nil | ||
| } |
There was a problem hiding this comment.
Fail closed on rewritten multimodal turns with multiple text segments.
For content shaped like text -> image/audio -> text, this merge collapses every text segment into the first slot unless the rewrite is a no-op. The result becomes rewritten(text1+text2) -> image/audio, which changes part ordering and therefore the prompt semantics. Return an invalid-request error here unless you can preserve each original text slot.
Possible guardrail-safe fallback
- if textPartCount > 1 && rewrittenText == strings.Join(originalTexts, " ") {
- copied := make([]core.ContentPart, len(parts))
- copy(copied, parts)
- return copied, nil
- }
+ if textPartCount > 1 {
+ if rewrittenText == strings.Join(originalTexts, " ") {
+ copied := make([]core.ContentPart, len(parts))
+ copy(copied, parts)
+ return copied, nil
+ }
+ return nil, core.NewInvalidRequestError(
+ "guardrails cannot safely rewrite multimodal messages with multiple text parts",
+ nil,
+ )
+ }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| for _, part := range parts { | |
| if part.Type == "text" { | |
| textPartCount++ | |
| hadTextPart = true | |
| originalTexts = append(originalTexts, part.Text) | |
| if !insertedRewrittenText { | |
| if rewrittenText != "" { | |
| merged = append(merged, core.ContentPart{Type: "text", Text: rewrittenText}) | |
| } | |
| insertedRewrittenText = true | |
| } | |
| continue | |
| } | |
| merged = append(merged, part) | |
| } | |
| if textPartCount > 1 && rewrittenText == strings.Join(originalTexts, " ") { | |
| copied := make([]core.ContentPart, len(parts)) | |
| copy(copied, parts) | |
| return copied, nil | |
| } | |
| for _, part := range parts { | |
| if part.Type == "text" { | |
| textPartCount++ | |
| hadTextPart = true | |
| originalTexts = append(originalTexts, part.Text) | |
| if !insertedRewrittenText { | |
| if rewrittenText != "" { | |
| merged = append(merged, core.ContentPart{Type: "text", Text: rewrittenText}) | |
| } | |
| insertedRewrittenText = true | |
| } | |
| continue | |
| } | |
| merged = append(merged, part) | |
| } | |
| if textPartCount > 1 { | |
| if rewrittenText == strings.Join(originalTexts, " ") { | |
| copied := make([]core.ContentPart, len(parts)) | |
| copy(copied, parts) | |
| return copied, nil | |
| } | |
| return nil, core.NewInvalidRequestError( | |
| "guardrails cannot safely rewrite multimodal messages with multiple text parts", | |
| nil, | |
| ) | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/guardrails/provider.go` around lines 438 - 458, The current merge
logic collapses multiple text segments into one rewritten text slot (variables:
parts, merged, rewrittenText, textPartCount, originalTexts,
insertedRewrittenText), which changes ordering for multimodal turns like
text->image->text; update the code to fail closed instead of collapsing: if
textPartCount > 1 and rewrittenText is non-empty and rewrittenText !=
strings.Join(originalTexts, " "), return an invalid-request error (instead of
returning a copied parts slice) unless you implement logic to preserve each
original text slot (i.e., apply rewrittenText to each corresponding text part
rather than collapsing them); update the return path to produce that error where
the current copied return is used.
| messages, ok := req["messages"].([]interface{}) | ||
| if !ok || len(messages) != 1 { | ||
| t.Fatalf("messages = %#v, want single message", req["messages"]) | ||
| } | ||
| message, ok := messages[0].(map[string]interface{}) | ||
| if !ok { | ||
| t.Fatalf("message type = %T", messages[0]) | ||
| } | ||
| content, ok := message["content"].([]interface{}) | ||
| if !ok { | ||
| t.Fatalf("content type = %T, want []interface{}", message["content"]) | ||
| } | ||
| if len(content) != 2 { | ||
| t.Fatalf("len(content) = %d, want 2", len(content)) | ||
| } | ||
| second, ok := content[1].(map[string]interface{}) | ||
| if !ok { | ||
| t.Fatalf("second part type = %T", content[1]) | ||
| } | ||
| if second["type"] != "image_url" { | ||
| t.Fatalf("second part type = %v, want image_url", second["type"]) | ||
| } | ||
|
|
There was a problem hiding this comment.
Assert the nested image_url.url payload too.
This still passes if the provider emits a second part with type:"image_url" but drops the actual image_url object. Please validate the nested URL so the OpenAI request shape is really covered.
As per coding guidelines, **/*_test.go: Add or update tests for any behavior change.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/providers/openai/openai_test.go` around lines 181 - 203, Validate
the nested image_url payload in the test by asserting that second["image_url"]
exists and is a map[string]interface{} (e.g., imageURL, ok :=
second["image_url"].(map[string]interface{})) and then assert that
imageURL["url"] is a non-empty string (e.g., url, ok := imageURL["url"].(string)
and url != ""), so the test fails if the provider emits type:"image_url" but
omits the actual URL; update the checks around messages/message/content/second
in openai_test.go accordingly.
| callID := firstNonEmptyString(item, "call_id", "id") | ||
| if name == "" { | ||
| return core.Message{}, false | ||
| if strings.TrimSpace(name) == "" { | ||
| return core.Message{}, "", core.NewInvalidRequestError(fmt.Sprintf("invalid responses input item at index %d: function_call name is required", index), nil) | ||
| } | ||
| callID = ResponsesFunctionCallCallID(callID) |
There was a problem hiding this comment.
Trim call_id before storing it.
firstNonEmptyString() only trims for emptiness; it returns the raw value. " call_1 " therefore becomes a different ID than "call_1", so the assistant tool_calls created from the function_call item can stop matching the later function_call_output.
Small normalization fix
- callID := firstNonEmptyString(item, "call_id", "id")
+ callID := strings.TrimSpace(firstNonEmptyString(item, "call_id", "id"))
@@
- callID := firstNonEmptyString(item, "call_id")
+ callID := strings.TrimSpace(firstNonEmptyString(item, "call_id"))Also applies to: 180-188
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/providers/responses_adapter.go` around lines 159 - 163, The call_id
returned by firstNonEmptyString is not trimmed so values like " call_1 " differ
from "call_1"; update the code where callID is set (the local variable callID
used before calling ResponsesFunctionCallCallID) to run
strings.TrimSpace(callID) and then pass the trimmed value into
ResponsesFunctionCallCallID; make the identical change in the other block that
builds a callID for function_call_output (the second callID assignment later in
the file) so all stored call IDs are normalized.
…ment
The old ResponsesInputItem only modeled chat messages (role + content),
forcing function_call and function_call_output items through untyped
map[string]interface{} handling. The new ResponsesInputElement struct
explicitly represents all three input variants via a Type discriminator,
with typed fields for each shape. JSON unmarshal now produces
[]ResponsesInputElement instead of []interface{}, eliminating
map[string]interface{} from the hot path.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary by CodeRabbit
New Features
Bug Fixes
Tests