Skip to content

修复 Responses 非透传模式下工具原语保真#1703

Merged
looplj merged 1 commit into
looplj:unstablefrom
deqiying:codex/openai-responses-non-passthrough-fix
May 25, 2026
Merged

修复 Responses 非透传模式下工具原语保真#1703
looplj merged 1 commit into
looplj:unstablefrom
deqiying:codex/openai-responses-non-passthrough-fix

Conversation

@deqiying

@deqiying deqiying commented May 23, 2026

Copy link
Copy Markdown
Contributor

修复 Responses 非透传模式下工具原语保真

背景

OpenAI Responses 非透传转换链路会将请求归一化为通用 llm.Request,部分 Provider 原生字段无法完整表达,例如 tool_searchtool_choice.tools 以及部分原始 input item,导致出站请求丢失工具相关信息。

修改内容

  • llm.Request 增加 ProviderExtensions sidecar,用于保存 Provider 私有原始片段,并通过 json:"-" 避免污染通用序列化。
  • 在 Responses inbound 阶段捕获无法映射到通用模型的 tools、tool_choice 和 input item。
  • 在 Responses outbound 阶段以结构化 payload 为主,兼容时回填原始 Provider 片段。
  • 增加 tool signature 保护,避免结构化 tools 已变化时错误回放旧 raw tool。
  • 补充单元测试覆盖捕获、回放、工具变更不回放,以及 ProviderExtensions 不参与 JSON 序列化。

变更范围

本次修复仅覆盖 OpenAI Responses 请求侧的非透传转换链路,不涉及响应/SSE 透传、UI、GraphQL 或系统配置。

验证

已运行:

cd llm && go test ./transformer/openai/responses

关联:#1525

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a mechanism to capture and replay provider-specific sidecar data for the OpenAI Responses API, ensuring high fidelity when converting between internal and external request formats. It adds a ProviderExtensions field to the llm.Request and implements logic to handle raw tools, tool choices, and input items that are not structurally represented in the core model. The review feedback highlights that RawInputItems are currently captured but not replayed in the outbound request. Additionally, improvements were suggested for the RawToolChoice replay logic to prevent replaying stale data when a tool choice has been intentionally cleared or modified.

Comment on lines +176 to +205
func marshalRequestPayload(payload Request, llmReq *llm.Request) ([]byte, error) {
body, err := json.Marshal(payload)
if err != nil {
return nil, err
}

requestExt := openAIResponsesRequestExtensions(llmReq)
if requestExt == nil {
return body, nil
}

var obj map[string]json.RawMessage
if err := json.Unmarshal(body, &obj); err != nil {
return nil, err
}

if tools, ok := mergeRawOnlyTools(obj["tools"], requestExt); ok {
toolsRaw, err := json.Marshal(tools)
if err != nil {
return nil, err
}
obj["tools"] = toolsRaw
}

if len(requestExt.RawToolChoice) > 0 && rawToolChoiceMatchesCurrentTools(requestExt.RawToolChoice, payload.ToolChoice) {
obj["tool_choice"] = cloneRaw(requestExt.RawToolChoice)
}

return json.Marshal(obj)
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The RawInputItems captured during the inbound phase (line 19) are not being replayed here in marshalRequestPayload. This means that any provider-specific input items that AxonHub doesn't structurally represent (and thus skips during inbound conversion) will be missing from the outbound request. To ensure full fidelity as stated in the PR description, consider implementing a merge logic for input items similar to how tools are handled.

Comment on lines +271 to +273
if current == nil {
return true
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Returning true when current is nil causes the raw tool choice to be replayed even if it was intentionally cleared from the request (e.g., by a middleware). Since RawToolChoice is only populated if a tool choice existed in the original inbound request, a nil value in the current request should be respected as an instruction to remove it.

Suggested change
if current == nil {
return true
}
if current == nil {
return false
}

Comment on lines +281 to +283
if currentSignature == "" {
return true
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Returning true when currentSignature is empty is too permissive. If the structured tool choice was modified to a state that doesn't produce a recognized signature (but is not nil), replaying the raw tool choice might override that change with stale data. It's safer to only replay when signatures explicitly match.

Suggested change
if currentSignature == "" {
return true
}
if currentSignature == "" {
return false
}

@greptile-apps

greptile-apps Bot commented May 23, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR adds a ProviderExtensions sidecar to llm.Request and uses it to preserve OpenAI Responses API-specific raw fragments (unsupported tool types, tool_choice.tools arrays, and unrecognised input items) that cannot be expressed in the normalised common model, then re-injects the relevant raw fragments on the outbound path.

  • llm/provider_extensions.go introduces the new sidecar types with json:"-" on every field so nothing leaks through the common JSON model; CloneProviderExtensions ensures deep copies travel safely through the pipeline.
  • request_extensions.go implements capture (inbound) and replay (outbound) for raw tools and tool_choice; RawInputItems is captured and included in the early-return guard, but there is no corresponding injection step in marshalRequestPayload, so unsupported input items are silently dropped on the outbound path.
  • marshalRequestPayload applies a tool-signature guard before merging raw tools back, preventing stale data from being replayed if the tool list was modified mid-pipeline.

Confidence Score: 3/5

Safe to merge for the tools and tool_choice replay paths, but the input-item preservation path is incomplete — items are captured but never re-injected on the outbound side.

The raw-tool and raw-tool_choice replay logic is well-structured and protected by a signature guard that prevents stale data from being applied when the tool list changes. However, RawInputItems is built, stored, and factored into the early-return guard of attachOpenAIResponsesRequestExtensions, yet marshalRequestPayload contains no step to merge those items back into the outbound request. Any request carrying unsupported input item types will have those items captured at inbound time and silently dropped at outbound time, leaving the feature half-implemented and without test coverage for the replay side.

llm/transformer/openai/responses/request_extensions.go — the marshalRequestPayload function needs a replay step for RawInputItems to match the inbound capture logic.

Important Files Changed

Filename Overview
llm/model.go Adds ProviderExtensions *ProviderExtensions field to Request with json:"-" to keep it out of serialized output; minimal, safe change.
llm/provider_extensions.go New file defining the ProviderExtensions sidecar hierarchy and its deep-clone helpers. All struct fields carry json:"-" tags so the data is never serialized. Contains a duplicate of the cloneRaw helper that also exists in request_extensions.go.
llm/transformer/openai/responses/request_extensions.go New file with inbound capture and outbound replay logic for raw tools, tool_choice, and input items. RawInputItems is captured (including in the early-return guard) but never injected back in marshalRequestPayload, so unsupported input item types are silently dropped on the outbound path.
llm/transformer/openai/responses/inbound.go Passes httpReq.Body to convertToLLMRequest to enable extension capture; function signature changed to variadic rawBody ...[]byte. The variadic form is slightly loose but all existing callers are consistent.
llm/transformer/openai/responses/outbound.go Delegates the final JSON marshal step to marshalRequestPayload so raw extensions can be injected; the change is a one-line swap with no other behavioral differences.
llm/transformer/openai/responses/inbound_test.go Adds a test case verifying that tool_search tools and a tools-array tool_choice are captured in ProviderExtensions. No test covers RawInputItems capture or replay.
llm/transformer/openai/responses/outbound_test.go Adds three new test cases: raw tools/tool_choice replay, no-replay when tools change, and ProviderExtensions JSON exclusion. Coverage is good for the implemented paths; no test validates that RawInputItems would be replayed.

Sequence Diagram

sequenceDiagram
    participant Client
    participant InboundTransformer
    participant attachExtensions
    participant Pipeline
    participant OutboundTransformer
    participant marshalPayload

    Client->>InboundTransformer: HTTP Request (raw JSON body)
    InboundTransformer->>InboundTransformer: json.Unmarshal → Request struct
    InboundTransformer->>attachExtensions: convertToLLMRequest(req, rawBody)
    attachExtensions->>attachExtensions: parseRawRequestFragments(rawBody)
    attachExtensions->>attachExtensions: buildRawOnlyToolFragments → RawTools
    attachExtensions->>attachExtensions: buildRepresentedToolSignatures → ToolSignatures
    attachExtensions->>attachExtensions: rawUnsupportedToolChoice → RawToolChoice
    attachExtensions->>attachExtensions: buildRawOnlyInputFragments → RawInputItems (captured, not replayed)
    attachExtensions-->>InboundTransformer: llm.Request with ProviderExtensions sidecar
    InboundTransformer-->>Pipeline: llm.Request (normalised + sidecar)
    Pipeline->>Pipeline: optional tool/message modification
    Pipeline->>OutboundTransformer: llm.Request
    OutboundTransformer->>OutboundTransformer: build Request payload (structured tools only)
    OutboundTransformer->>marshalPayload: marshalRequestPayload(payload, llmReq)
    marshalPayload->>marshalPayload: structuredToolSignaturesMatch?
    alt signatures match
        marshalPayload->>marshalPayload: mergeRawOnlyTools → insert raw tools at original indices
    end
    marshalPayload->>marshalPayload: rawToolChoiceMatchesCurrentTools?
    alt choice unchanged
        marshalPayload->>marshalPayload: replace tool_choice with raw RawToolChoice
    end
    marshalPayload-->>OutboundTransformer: merged JSON body
    OutboundTransformer-->>Client: HTTP Request to upstream LLM
Loading

Reviews (1): Last reviewed commit: "修复 Responses 非透传工具原语保真" | Re-trigger Greptile

Comment on lines +22 to +24
if len(requestExt.RawTools) == 0 && len(requestExt.RawToolChoice) == 0 && len(requestExt.RawInputItems) == 0 {
return
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 RawInputItems captured but never replayed

buildRawOnlyInputFragments populates requestExt.RawInputItems for unsupported input item types, and the guard on line 22 keeps the extension alive when only RawInputItems is non-empty. However, marshalRequestPayload merges back RawTools and RawToolChoice but has no corresponding step for RawInputItems. Any request that contains unsupported input item types (e.g. future OpenAI-specific items not in isStructurallyRepresentedInputItem) will have those items captured in the inbound, silently dropped during message conversion, and never re-injected on the outbound — defeating the stated goal of input-item fidelity for the non-passthrough path.


// convertToLLMRequest converts OpenAI Responses API Request to llm.Request.
func convertToLLMRequest(req *Request) (*llm.Request, error) {
func convertToLLMRequest(req *Request, rawBody ...[]byte) (*llm.Request, error) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 The variadic rawBody ...[]byte signature technically accepts any number of []byte arguments but only ever uses the first. A caller that mistakenly passes two bodies (e.g. during a future refactor) would get silent partial behavior. Since convertToLLMRequest is a package-internal function, a plain rawBody []byte with a nil guard is clearer and safer.

Suggested change
func convertToLLMRequest(req *Request, rawBody ...[]byte) (*llm.Request, error) {
func convertToLLMRequest(req *Request, rawBody []byte) (*llm.Request, error) {

Comment on lines +298 to +300
if len(rawBody) > 0 {
attachOpenAIResponsesRequestExtensions(chatReq, req, rawBody[0])
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 If convertToLLMRequest is changed to take a plain []byte, this call site needs a corresponding update to pass rawBody directly instead of wrapping it in the variadic form.

Suggested change
if len(rawBody) > 0 {
attachOpenAIResponsesRequestExtensions(chatReq, req, rawBody[0])
}
if len(rawBody) > 0 {
attachOpenAIResponsesRequestExtensions(chatReq, req, rawBody)
}

Comment on lines +81 to +87
func cloneRawMessage(src json.RawMessage) json.RawMessage {
if len(src) == 0 {
return nil
}

return append(json.RawMessage(nil), src...)
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Duplicate cloneRaw helper

cloneRawMessage here and cloneRaw in request_extensions.go do exactly the same thing. The duplication is harmless but could diverge over time. Consider exporting cloneRawMessage from the llm package (or keeping it unexported and referencing it from request_extensions.go via the llm package) to have a single source of truth.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

@looplj

looplj commented May 24, 2026

Copy link
Copy Markdown
Owner

感谢 PR,感觉
greptile 的 几个 review 都挺有道理的,可以看下。

@looplj looplj merged commit abb9d84 into looplj:unstable May 25, 2026
4 checks passed
junjiangao pushed a commit to junjiangao/axonhub that referenced this pull request May 27, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants