[Bug]: Gemini 3.x silent hang on subagent flows — `thoughtSignature` dropped on cross-provider replay (2026.5.3-1)

## Summary

`google/gemini-3.1-pro-preview` (`thinkingLevel: high`) reliably hangs after a few tool-call rounds in any flow where conversation history was authored by a different provider/model — e.g. subagents spawned from a Claude or GPT orchestrator. Failure mode: gateway logs `LLM idle timeout (600s): no response from model` and the run dies. Trivial single-turn calls to the same model with the same key work in 3 seconds.

This is the same bug reported in #74244 and #72127, and earlier in #58235 and #63397. All were autoclosed by the `clawsweeper` bot citing a "no longer reproducible" or "already fixed" rationale. Original reporter @YouFoundJK pushed back in #74244 with a screenshot reproduction on `gemini-3.1-pro-preview` and posted a verified-working blueprint fix on 2026-04-29. No human maintainer responded and no PR was opened. The bug persists in `2026.5.3-1`, the current latest stable release.

## Status in shipped 2026.5.3-1

The exact code path @YouFoundJK identified as the problem is still present in the shipped dist:

```js
// /app/dist/transport-stream-8H4N10uL.js:211, 231
...isSameProviderAndModel && block.textSignature ? { thoughtSignature: block.textSignature } : {}
...isSameProviderAndModel && block.thoughtSignature ? { thoughtSignature: block.thoughtSignature } : {}
```

`isSameProviderAndModel` gates signature forwarding. For any conversation where assistant turns were authored by a different provider or model than the one currently being called (subagents, model switches, fallback chains), `thoughtSignature` is silently dropped from the outbound `functionCall` parts. Gemini 3.x then hangs. Per Google docs, a missing signature should produce a `400 INVALID_ARGUMENT`, but in practice it manifests as a silent stall.

The recommended fallback string `"skip_thought_signature_validator"` (per Google's [thought-signatures docs](https://docs.cloud.google.com/vertex-ai/generative-ai/docs/thought-signatures)) is **not** present anywhere in `/app/dist/` of 2026.5.3-1, so neither the original-signature path nor the fallback path covers cross-provider replay.
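A minimal sketch of the behavior this section argues for, under a simplified block shape. `toFunctionCallPart` and the `name` / `arguments` fields on `block` are hypothetical names for illustration, not OpenClaw's actual transport code; only `thoughtSignature` and the fallback string come from the material above. It forwards whatever signature was captured on the block, with no same-provider gate, and falls back to the documented skip string otherwise.

```js
// Hypothetical sketch, not the shipped transport code.
// Fallback string taken from Google's thought-signatures docs (linked above).
const SKIP_VALIDATOR = "skip_thought_signature_validator";

// Build an outbound Gemini `functionCall` part from a persisted tool-call block.
// Assumed block shape: { name, arguments, thoughtSignature? }.
function toFunctionCallPart(block) {
  const captured = block.thoughtSignature;
  const hasCaptured = typeof captured === "string" && captured.length > 0;
  return {
    functionCall: { name: block.name, args: block.arguments ?? {} },
    // Never omit the field: use the captured signature, else the fallback.
    thoughtSignature: hasCaptured ? captured : SKIP_VALIDATOR,
  };
}
```

Whether a signature captured from a different provider or model should be forwarded verbatim or replaced by the fallback is a design decision for the maintainers; the sketch only shows that the field is never dropped.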

## Reproduction fingerprint (5 captured runs)

Five subagent runs on `google/gemini-3.1-pro-preview` with `thinkingLevel: high` and a multi-step `exec`-tool task. All five hung identically. Captured directly from gateway-persisted JSONL transcripts:

| run | assistant turns | tool calls | thinking blocks | blocks with `thoughtSignature` | `<think>` text tags | terminal error |
|---|---|---|---|---|---|---|
| 1 | 3 | 3 | 2 | **0** | 0 | `LLM idle timeout (600s)` |
| 2 | 4 | 4 | 1 | **0** | 0 | `LLM idle timeout (600s)` |
| 3 | 3 | 2 | 2 | **0** | 0 | `request timed out` |
| 4 | 15 | 15 | 4 | **0** | 0 | `LLM idle timeout (600s)` |
| 5 | 5 | 5 | 2 | **0** | 0 | `LLM idle timeout (600s)` |

Persisted gateway error record from one run:

```json
{
  "type": "custom",
  "customType": "openclaw:prompt-error",
  "data": {
    "provider": "google",
    "model": "gemini-3.1-pro-preview",
    "api": "google-generative-ai",
    "error": "LLM idle timeout (600s): no response from model"
  }
}
```

Every persisted thinking content block in the failing transcripts has `type: "thinking"` (native, not text-tagged), 300–600 chars of text, and **no `signature` / `thoughtSignature` / `thinkingSignature` field on the persisted block**.
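For anyone who wants to check their own transcripts for the same fingerprint, here is a rough sketch of a counter. It assumes only that each JSONL line is a JSON object and that thinking blocks carry `type: "thinking"` somewhere in the record; it walks the whole record rather than assuming a particular layout. `summarizeTranscript` is a hypothetical helper name, not part of OpenClaw.

```js
// Signature field names checked in the persisted blocks (from the paragraph above).
const SIGNATURE_FIELDS = ["signature", "thoughtSignature", "thinkingSignature"];

// Count thinking blocks, and how many carry any signature field, in JSONL text.
function summarizeTranscript(jsonlText) {
  const summary = { thinkingBlocks: 0, signedBlocks: 0 };

  // Recursively visit every object/array in a parsed record.
  const visit = (node) => {
    if (Array.isArray(node)) {
      node.forEach(visit);
      return;
    }
    if (node === null || typeof node !== "object") return;
    if (node.type === "thinking") {
      summary.thinkingBlocks += 1;
      const signed = SIGNATURE_FIELDS.some(
        (field) => typeof node[field] === "string" && node[field].length > 0
      );
      if (signed) summary.signedBlocks += 1;
    }
    Object.values(node).forEach(visit);
  };

  for (const line of jsonlText.split("\n")) {
    if (line.trim() === "") continue; // skip blank lines
    visit(JSON.parse(line));
  }
  return summary;
}
```

Feeding it the contents of a transcript file (for example via `fs.readFileSync(path, "utf8")`) and getting `signedBlocks: 0` alongside a non-zero `thinkingBlocks` would match the failing runs in the table above.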

## What was checked / ruled out

- **No HTTP proxy.** `HTTP_PROXY` / `HTTPS_PROXY` are unset on both host and gateway, so #70453 / commit `ca620fb` (env-proxy provider routing, shipped in 2026.5.3-1) does not apply.
- **Not the `<think>` text-tag bug from #69220.** No `<think>` / `</think>` / `<final>` tags appear anywhere in the failing transcripts; the model is emitting native thinking parts, not text-tagged ones. (Note: `BUILTIN_REASONING_OUTPUT_MODES["google-generative-ai"] = "tagged"` is still in `/app/dist/provider-utils-*.js`, so #69220 may still bite other users on lower thinking levels.)
- **Not auth / quota / billing.** A trivial single-turn probe to `gemini-3.1-pro-preview` (one-word reply, no tools) returns in 3s.
- **Not the surrounding pipeline.** The same orchestrator → subagent code path with `claude-opus-4-7` instead of Gemini completes cleanly with similar tool-call counts. Two such transcripts captured (11 assistant turns, 18 tool calls, 6–8 thinking blocks each, no errors).

This narrows the failure to the Gemini transport's outbound serialization of cross-provider assistant history — exactly what @YouFoundJK identified.

## Asks

1. **Re-open #74244** (or supersede with this one). The clawsweeper autoclose was incorrect — code review missed the `isSameProviderAndModel` gate that strips signatures on cross-provider replay, and ignored the original reporter's request for human verification. Consider requiring a human re-check before autoclosing Gemini 3.x bug reports going forward.
2. **Land @YouFoundJK's [blueprint fix](https://github.com/openclaw/openclaw/issues/74244#issuecomment-4347696165)** (or an equivalent maintainer-cleaned version) covering all three gaps:
   - `transport-message-transform.ts` — stop stripping `thoughtSignature` from cross-provider history; let downstream transports decide.
   - `transport-stream.ts` — always include `thoughtSignature` on `functionCall` parts; use `"skip_thought_signature_validator"` fallback (per Google docs) when no captured signature is available.
   - `openai-transport-stream.ts` — collect signatures from all history (not just same-model).
3. **Make the failure mode loud.** A silent 600 s idle timeout with no diagnostic about a known protocol gap is very hard to root-cause downstream. Either surface Google's `400 INVALID_ARGUMENT: missing thought_signature` to the gateway log/error path, or fast-fail when an outbound Gemini call would emit a `functionCall` part with no signature.
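
The fast-fail option in ask 3 could look roughly like the preflight check below. This is a sketch under assumptions: `assertFunctionCallsSigned` is a hypothetical name, and the input is assumed to be the `contents` array of an outbound Gemini request (`{ role, parts }` entries). It checks every `functionCall` part, which may be stricter than Google's documented rule for which parts need a signature, so treat the predicate as a placeholder.

```js
// Hypothetical preflight: throw before sending if any outbound functionCall
// part has no thoughtSignature, instead of waiting for a 600 s idle timeout.
function assertFunctionCallsSigned(contents) {
  contents.forEach((content, i) => {
    (content.parts ?? []).forEach((part, j) => {
      if (part.functionCall && !part.thoughtSignature) {
        throw new Error(
          `contents[${i}].parts[${j}]: functionCall "${part.functionCall.name}" ` +
            "has no thoughtSignature; Gemini 3.x may stall on this request"
        );
      }
    });
  });
}
```

Called just before the request is sent, this turns the silent stall into an immediate, attributable error in the gateway log.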

## Environment

- OpenClaw `2026.5.3-1` (release tag, latest stable as of 2026-05-04)
- `google/gemini-3.1-pro-preview` via Google AI Studio API key (no Vertex, no proxy)
- Linux container, Node `24.14.0`
- `thinkingDefault: medium` globally; the affected reviewer agents run with `thinkingLevel: high`

