Vertex Gemini: parallel function-call responses are split across separate Contents, causing deterministic "function response parts" INVALID_ARGUMENT #2958

@t-mizumoto1203

Description

When a Gemini model (via Vertex AI) returns parallel function calls (multiple functionCall parts in one assistant turn), docker-agent returns each tool result as its own Content (one functionResponse part per Content). Vertex AI requires the response turn to contain the same number of functionResponse parts as the preceding turn's functionCall parts, so the request is rejected. The failure is deterministic for a given turn, so retries reproduce it and the orchestrator eventually aborts.

Expected Behavior

Parallel function-call results should be returned together as multiple functionResponse parts within a single Content (role user), with the part count matching the number of function calls — as described in the Vertex AI function-calling docs:

collect API responses in parts and send them back to the model ... combined into one Content(role="user", parts=function_response_parts)

Actual Behavior

The request fails with HTTP 400 INVALID_ARGUMENT. Because the conversation history is fixed, every retry re-sends the same (mis-grouped) responses and fails identically; the orchestrator hits its consecutive-identical-call guard and aborts.

Steps to Reproduce

  1. Configure an orchestrator agent that delegates to a sub-agent via transfer_task, using a Gemini model on Vertex AI (provider: google).
  2. Have the sub-agent perform two tool calls in a single turn (e.g. a filesystem agent told to write two files), so the assistant turn emits two functionCall parts.
  3. The two tool results are sent back as two separate Content objects → the next model call returns 400.

Note: Vertex Gemini exposes no parameter to disable parallel function calling (go-genai FunctionCallingConfig has fields such as Mode and AllowedFunctionNames, but no switch for parallel calls), so there is no opt-out; the responses must be grouped correctly.

Environment

  • Docker Agent Version: v1.70.2
  • OS & Terminal: Linux
  • Model Used: gemini-2.5-flash-lite (Vertex AI, location global)

Error Output

model failed: error receiving from stream: Error 400, Message: Please ensure that the number of function response parts is equal to the number of function call parts of the function call turn., Status: INVALID_ARGUMENT, Details: []
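
The rule stated in this error message can be checked locally before a request is sent. Below is a minimal sketch of such a check. It assumes simplified stand-in types (Part, Content) rather than the real genai structs, and checkFunctionTurns is a hypothetical helper name, not something that exists in docker-agent. A trailing function-call turn with no reply yet is deliberately not flagged.

```go
package main

import "fmt"

// Part is a simplified stand-in for a Gemini content part (assumption:
// the real genai.Part carries structured call/response values, not strings).
type Part struct {
	Text             string
	FunctionCall     string // non-empty when the part is a functionCall
	FunctionResponse string // non-empty when the part is a functionResponse
}

// Content is a simplified stand-in for one turn in the request history.
type Content struct {
	Role  string
	Parts []Part
}

// countParts returns how many functionCall and functionResponse parts c holds.
func countParts(c Content) (calls, responses int) {
	for _, p := range c.Parts {
		if p.FunctionCall != "" {
			calls++
		}
		if p.FunctionResponse != "" {
			responses++
		}
	}
	return calls, responses
}

// checkFunctionTurns reports the first turn whose functionCall count differs
// from the functionResponse count of the Content that immediately follows it.
func checkFunctionTurns(history []Content) error {
	for i, c := range history {
		calls, _ := countParts(c)
		if calls == 0 || i+1 >= len(history) {
			continue // no calls, or calls that have not been answered yet
		}
		_, responses := countParts(history[i+1])
		if responses != calls {
			return fmt.Errorf(
				"content %d has %d functionCall parts but content %d has %d functionResponse parts",
				i, calls, i+1, responses)
		}
	}
	return nil
}
```

With the history from the reproduction steps (one model turn with two functionCall parts followed by two Contents holding one functionResponse each), this check returns an error; with both responses in a single Content it returns nil.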

Screenshots

N/A

Additional Context

Root cause: pkg/model/provider/gemini/client.go, convertMessagesToGemini. An assistant message with N ToolCalls becomes one genai.Content with N functionCall parts, but each tool-response message becomes its own genai.Content with a single functionResponse part (the role == Tool branch emits one Content per message). The response turn therefore has 1 part per Content instead of N parts in one Content.

Proposed fix — coalesce consecutive tool-response messages into a single genai.Content with N functionResponse parts (flush on the next non-tool message and at end of loop). A fix plus a regression test are prepared (the test fails on main and passes with the fix; go build ./..., go vet, package tests, and golangci-lint are green). PR to follow.
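
For illustration, the coalescing approach could look roughly like the sketch below. This is not the prepared patch: Message, Part, Content, and convertMessages are simplified, hypothetical stand-ins for the docker-agent message type, the genai structs, and convertMessagesToGemini, and the real code handles more roles and part kinds.

```go
package main

// Message is a simplified stand-in for a chat message (assumption: the real
// type has more fields and structured tool calls).
type Message struct {
	Role       string   // "user", "assistant", or "tool"
	Text       string   // plain text content, if any
	ToolCalls  []string // assistant only: names of the functions to call
	ToolName   string   // tool only: the function this result answers
	ToolResult string   // tool only: the result payload
}

// Part and Content loosely mirror the shape of genai.Part and genai.Content.
type Part struct {
	Text             string
	FunctionCall     string
	FunctionResponse string
	Result           string
}

type Content struct {
	Role  string
	Parts []Part
}

// convertMessages groups each run of consecutive tool messages into a single
// user Content, so N tool results yield N functionResponse parts in one turn.
func convertMessages(msgs []Message) []Content {
	var out []Content
	var pending []Part // functionResponse parts not yet emitted

	flush := func() {
		if len(pending) == 0 {
			return
		}
		out = append(out, Content{Role: "user", Parts: pending})
		pending = nil
	}

	for _, m := range msgs {
		if m.Role == "tool" {
			// Accumulate instead of emitting one Content per tool message.
			pending = append(pending, Part{FunctionResponse: m.ToolName, Result: m.ToolResult})
			continue
		}
		flush() // a non-tool message ends the current run of tool results
		if m.Role == "assistant" {
			c := Content{Role: "model"}
			if m.Text != "" {
				c.Parts = append(c.Parts, Part{Text: m.Text})
			}
			for _, name := range m.ToolCalls {
				c.Parts = append(c.Parts, Part{FunctionCall: name})
			}
			out = append(out, c)
			continue
		}
		out = append(out, Content{Role: "user", Parts: []Part{{Text: m.Text}}})
	}
	flush() // tool results at the very end of the history
	return out
}
```

The two flush points correspond to "the next non-tool message" and "end of loop" above. Keeping the pending parts in order preserves the order in which the tool results were produced.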

  • Not a regression — present since the per-message tool-response conversion in the gemini provider.

Metadata

Labels

  • area/providers: For features/issues/fixes related to LLM providers (Bedrock, LiteLLM, Qwen, custom, etc.)
  • area/providers/gemini: Google Gemini provider support
