fix: fixed forwarding failure with multi-image input #1714

Merged
looplj merged 1 commit into looplj:unstable from sxueck:unstable on May 25, 2026

@sxueck sxueck commented May 25, 2026

Contributor

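This PR fixes the following two problems that we found during real-world testing:

  1. When the edit endpoint is used with the gpt-image-2 model, the gateway automatically injects a response_format field, which causes the upstream to reject the image request.
  2. Multi-image requests use the image[] field, but the gateway code hard-coded name="image", so the upstream treated the parts as duplicate fields and rejected the image request.

Before and after the fix:

[Two screenshots: before and after the fix]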
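
To make the second problem concrete, here is a minimal sketch of how the multipart field name could switch between image and image[] depending on the number of files. The helpers writeImageParts and partNames are hypothetical names used only for illustration; they are not the gateway's actual code, which lives in buildImageEditRequest.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"mime/multipart"
)

// writeImageParts is a hypothetical helper (not the gateway's code). It writes
// one form file per image and uses "image[]" as the field name when more than
// one image is present, "image" otherwise.
func writeImageParts(images [][]byte) ([]byte, string, error) {
	var buf bytes.Buffer
	w := multipart.NewWriter(&buf)

	name := "image"
	if len(images) > 1 {
		name = "image[]"
	}

	for i, img := range images {
		part, err := w.CreateFormFile(name, fmt.Sprintf("image-%d.png", i))
		if err != nil {
			return nil, "", err
		}
		if _, err := part.Write(img); err != nil {
			return nil, "", err
		}
	}
	if err := w.Close(); err != nil {
		return nil, "", err
	}
	return buf.Bytes(), w.Boundary(), nil
}

// partNames reads the body back and returns the form field name of each part.
func partNames(body []byte, boundary string) ([]string, error) {
	r := multipart.NewReader(bytes.NewReader(body), boundary)
	var names []string
	for {
		p, err := r.NextPart()
		if err == io.EOF {
			return names, nil
		}
		if err != nil {
			return nil, err
		}
		names = append(names, p.FormName())
	}
}

func main() {
	body, boundary, err := writeImageParts([][]byte{[]byte("a"), []byte("b")})
	if err != nil {
		panic(err)
	}
	names, err := partNames(body, boundary)
	if err != nil {
		panic(err)
	}
	fmt.Println(names) // [image[] image[]]
}
```

With a single image the same helper keeps the plain image field name, so single-image requests are unchanged.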

greptile-apps Bot commented May 25, 2026

Contributor

Greptile Summary

This PR fixes two bugs in the OpenAI image gateway's outbound transformer: response_format was being injected into gpt-image-2 edit requests (rejected by the upstream API), and multi-image edit requests used a hardcoded name="image" for every part, causing the upstream to treat them as duplicate fields.

  • Replaces the single prefix-based isModelSupportResponseFormat guard with three explicit allowlist functions (supportsImageGenerationResponseFormat, supportsImageEditResponseFormat, supportsImageVariationResponseFormat), each scoped to the models that actually support the field.
  • When more than one image is present in an edit request, the multipart field name is now image[] instead of image, matching the array-field convention expected by the upstream API.
  • Three new unit tests cover the gpt-image-2 omission behaviour for generation/edit and the multi-image field-name switch.
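
As a rough illustration of the allowlist approach summarized above, the sketch below mirrors the model conditions listed in the review flowchart. The three supports* function names come from the PR; imageFieldName is a hypothetical helper added here for illustration, and the real function bodies in image_outbound.go may differ in detail.

```go
package main

import "fmt"

// Generation: per the flowchart, only dall-e-2 and dall-e-3 get response_format.
func supportsImageGenerationResponseFormat(model string) bool {
	return model == "dall-e-2" || model == "dall-e-3"
}

// Edit: per the flowchart, only dall-e-2 gets response_format.
func supportsImageEditResponseFormat(model string) bool {
	return model == "dall-e-2"
}

// Variation: an empty model is treated like dall-e-2 (pre-existing behaviour).
func supportsImageVariationResponseFormat(model string) bool {
	return model == "" || model == "dall-e-2"
}

// imageFieldName is a hypothetical helper: it picks the multipart field name
// based on how many image files the edit request carries.
func imageFieldName(fileCount int) string {
	if fileCount > 1 {
		return "image[]"
	}
	return "image"
}

func main() {
	fmt.Println(supportsImageEditResponseFormat("gpt-image-2"))       // false
	fmt.Println(supportsImageGenerationResponseFormat("gpt-image-2")) // false
	fmt.Println(supportsImageGenerationResponseFormat("dall-e-3"))    // true
	fmt.Println(imageFieldName(1), imageFieldName(3))                 // image image[]
}
```

An exact-match allowlist means a newly released model gets no response_format until someone adds it explicitly, which is the safer default when the upstream rejects unknown fields.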

Confidence Score: 4/5

The fix is targeted and correct; the only concern is a fragile boundary-extraction pattern in one test that could produce a false-positive pass if the Content-Type format changes.

Both production changes (allowlist functions and the image[] field name) directly address the described upstream rejections and are well-covered by new tests. The test boundary extraction uses strings.TrimPrefix rather than proper MIME parsing, which is brittle but does not affect production behaviour.

llm/transformer/openai/image_outbound_test.go — specifically the multipart boundary extraction in TestBuildImageEditRequest_MultipleImagesUsesArrayField.

Important Files Changed

| Filename | Overview |
| --- | --- |
| llm/transformer/openai/image_outbound.go | Replaces a single prefix-based model guard with three explicit allowlist functions; fixes response_format injection for gpt-image-2 on the edit endpoint, and correctly switches multi-image field names from "image" to "image[]". |
| llm/transformer/openai/image_outbound_test.go | Adds three new tests covering the gpt-image-2 response_format omission for generation and edit, and the multi-image image[] field name; boundary extraction uses a fragile TrimPrefix instead of mime.ParseMediaType. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[buildImageGenerationAPIRequest] --> B{APIFormat?}
    B -->|ImageVariation| C[buildImageVariationRequest]
    B -->|ImageEdit| D[buildImageEditRequest]
    B -->|default| E[buildImageGenerateRequest]

    C --> F{"supportsImageVariationResponseFormat\nmodel is empty or dall-e-2"}
    D --> G{len formFiles > 1?}
    G -->|Yes| H["imageFieldName = image-array"]
    G -->|No| I["imageFieldName = image"]
    H & I --> J{"supportsImageEditResponseFormat\nmodel == dall-e-2"}
    E --> K{"supportsImageGenerationResponseFormat\nmodel == dall-e-2 or dall-e-3"}

    F -->|true| L[inject response_format]
    F -->|false| M[omit response_format]
    J -->|true| N[inject response_format]
    J -->|false| O[omit response_format]
    K -->|true| P[inject response_format]
    K -->|false| Q[omit response_format]

Reviews (1): Last reviewed commit: "fix: fixed forwarding failure with multi-image input"

httpReq, err := ot.buildImageEditRequest(req, "test-key")
require.NoError(t, err)

reader := multipart.NewReader(bytes.NewReader(httpReq.Body), strings.TrimPrefix(httpReq.Headers.Get("Content-Type"), "multipart/form-data; boundary="))

P2 The boundary is extracted by a literal string prefix strip. Go's multipart.NewWriter currently always formats the Content-Type as multipart/form-data; boundary=<value>, so this works today, but it bypasses proper MIME parsing. If the format ever gains quoting, extra parameters, or whitespace differences, TrimPrefix will silently return the original string, causing multipart.NewReader to fail to match any parts and the assert.Len assertion to pass with 0 items instead of erroring loudly. mime.ParseMediaType is the robust alternative.
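
The difference between the two extraction approaches can be seen in a small standalone sketch. The header values and the helper names boundaryByTrimPrefix and boundaryByParse are made up for illustration and are not part of the PR.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

const prefix = "multipart/form-data; boundary="

// boundaryByTrimPrefix mirrors the approach used in the test: strip a literal
// prefix. If the header does not start with that exact prefix, the input is
// returned unchanged and no error is reported.
func boundaryByTrimPrefix(contentType string) string {
	return strings.TrimPrefix(contentType, prefix)
}

// boundaryByParse uses real MIME parsing and fails loudly when the header is
// malformed or has no boundary parameter.
func boundaryByParse(contentType string) (string, error) {
	_, params, err := mime.ParseMediaType(contentType)
	if err != nil {
		return "", err
	}
	boundary, ok := params["boundary"]
	if !ok {
		return "", fmt.Errorf("no boundary parameter in %q", contentType)
	}
	return boundary, nil
}

func main() {
	// Hypothetical header with an extra parameter and a quoted boundary.
	quoted := `multipart/form-data; charset=utf-8; boundary="abc123"`

	// TrimPrefix leaves the whole header in place because the prefix does not match.
	fmt.Println(boundaryByTrimPrefix(quoted) == quoted) // true

	b, err := boundaryByParse(quoted)
	fmt.Println(b, err) // abc123 <nil>
}
```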

Suggested change
- reader := multipart.NewReader(bytes.NewReader(httpReq.Body), strings.TrimPrefix(httpReq.Headers.Get("Content-Type"), "multipart/form-data; boundary="))
+ _, mimeParams, err := mime.ParseMediaType(httpReq.Headers.Get("Content-Type"))
+ require.NoError(t, err)
+ reader := multipart.NewReader(bytes.NewReader(httpReq.Body), mimeParams["boundary"])

Comment on lines +77 to 79
func supportsImageVariationResponseFormat(model string) bool {
	return model == "" || model == "dall-e-2"
}

P2 An unspecified model in a variation request (model == "") is treated as a supported model, so a default response_format: "b64_json" will be injected into the multipart form. This preserves the pre-existing behaviour, but the intent is now encoded as a silent special case rather than something the caller can reason about. Documenting this in a comment avoids future surprises when someone maintains the allowlist.

Suggested change
- func supportsImageVariationResponseFormat(model string) bool {
- 	return model == "" || model == "dall-e-2"
- }
+ // supportsImageVariationResponseFormat reports whether the given model supports
+ // the response_format field for image variations. An empty model string is
+ // treated as dall-e-2 (the API default) and therefore also supported.
+ func supportsImageVariationResponseFormat(model string) bool {
+ 	return model == "" || model == "dall-e-2"
+ }

@looplj looplj merged commit 8025e86 into looplj:unstable May 25, 2026
4 checks passed
Edint386 pushed a commit to Edint386/axonhub that referenced this pull request May 27, 2026
junjiangao pushed a commit to junjiangao/axonhub that referenced this pull request May 27, 2026