fix(provider/google): correctly mark reasoning files as such and fix related multi-turn errors #13262
Merged
felixarntz merged 4 commits into main on Mar 9, 2026
Collaborator
we can now. Okay to merge as is and backport to v6, but then we can implement a spec change in |
gr2m approved these changes Mar 9, 2026
vercel-ai-sdk bot pushed a commit that referenced this pull request Mar 9, 2026
…related multi-turn errors (#13262)

## Background

When using Gemini thinking models with image output, the Google API returns `thought: true` on non-text parts (e.g. `inlineData` images). The SDK correctly maps `thought: true` on text parts to `type: 'reasoning'`, but silently strips the flag from file parts. This causes problems in multi-turn exchanges: the thought image is sent back as a regular image, and the API returns an error because the part is interpreted as a regular image but lacks a thought signature (which, per the Google API, is only present on non-reasoning images).

## Summary

Since we can't introduce a new part type without a spec change, this preserves the `thought` flag on file parts via `providerMetadata`/`providerOptions`.

- Add `thought: z.boolean().nullish()` to the `inlineData` member in the provider-specific response schema
- In both `doGenerate` and `doStream`, propagate `thought: true` into `providerMetadata` on file parts (alongside `thoughtSignature`)
- In `convertToGoogleGenerativeAIMessages`, read `thought` from `providerMetadata` and send it back to the API on file parts

Function examples for `generate-text` and `stream-text` demonstrating multi-step image generation with thinking are added, as well as an E2E example that allows interactive multi-turn conversations with reasoning output that includes the images.

## Manual Verification

Test with the new examples to verify. In the E2E example, making two turns results in an error without the `packages` fixes from this PR.
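The round trip described in the Summary can be sketched roughly as follows. This is a simplified illustration, not the code from this PR: the types `GoogleInlineDataPart` and `FilePart` and the helpers `toFilePart` and `toGooglePart` are hypothetical, the `google` metadata key is an assumption, and the sketch uses a single `providerMetadata` field where the SDK distinguishes `providerMetadata` (results) from `providerOptions` (prompts).

```typescript
// Hypothetical, simplified shape of an inlineData part in the Google response.
type GoogleInlineDataPart = {
  inlineData: { mimeType: string; data: string };
  thought?: boolean | null;
  thoughtSignature?: string | null;
};

// Hypothetical provider-specific metadata carried on a file part.
type GoogleFileMetadata = { thought?: true; thoughtSignature?: string };

// Hypothetical, simplified SDK-side file part.
type FilePart = {
  type: 'file';
  mediaType: string;
  data: string;
  providerMetadata?: { google: GoogleFileMetadata };
};

// Response direction: keep the thought flag instead of silently dropping it.
function toFilePart(part: GoogleInlineDataPart): FilePart {
  const google: GoogleFileMetadata = {};
  if (part.thought === true) google.thought = true;
  if (part.thoughtSignature != null) {
    google.thoughtSignature = part.thoughtSignature;
  }
  return {
    type: 'file',
    mediaType: part.inlineData.mimeType,
    data: part.inlineData.data,
    // Only attach metadata when there is something to carry.
    ...(Object.keys(google).length > 0 ? { providerMetadata: { google } } : {}),
  };
}

// Request direction: read the flag back and send it to the API again.
function toGooglePart(part: FilePart): GoogleInlineDataPart {
  const meta = part.providerMetadata?.google;
  return {
    inlineData: { mimeType: part.mediaType, data: part.data },
    ...(meta?.thought === true ? { thought: true } : {}),
    ...(meta?.thoughtSignature != null
      ? { thoughtSignature: meta.thoughtSignature }
      : {}),
  };
}
```

With this shape, a thought image keeps `thought: true` across a turn boundary, while a regular image keeps only its `thoughtSignature`.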
## Checklist

- [x] Tests have been added / updated (for bug fixes / features)
- [ ] Documentation has been added / updated (for bug fixes / features)
- [x] A _patch_ changeset for relevant packages has been added (for bug fixes / features - run `pnpm changeset` in the project root)
- [x] I have reviewed this pull request (self-review)

## Future Work

A provider-agnostic mechanism to distinguish thought content from output content regardless of part type (as discussed in #12516) would be the ideal long-term solution, removing the need for provider-specific `providerMetadata` checks. However, this will require a spec change and is therefore only an option for v7. We should explore that further.

## Related Issues

Fixes #11461
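As an illustration of the provider-specific `providerMetadata` check mentioned under Future Work, a consumer could separate thought images from output images along these lines. This is a minimal sketch under assumptions: `GeneratedFileLike`, `isThoughtFile`, and `splitByThought` are hypothetical names, and the `google` key is assumed to be where the provider stores the `thought` flag.

```typescript
// Hypothetical minimal shape of a generated file with provider metadata.
type GeneratedFileLike = {
  mediaType: string;
  providerMetadata?: Record<string, Record<string, unknown> | undefined>;
};

// Returns true only when the provider explicitly flagged the file as thought
// content. The 'google' default key is an assumption for this sketch.
function isThoughtFile(
  file: GeneratedFileLike,
  providerKey: string = 'google',
): boolean {
  return file.providerMetadata?.[providerKey]?.thought === true;
}

// Splits files into thought images and regular output images.
function splitByThought<T extends GeneratedFileLike>(
  files: T[],
): { thoughts: T[]; outputs: T[] } {
  const thoughts: T[] = [];
  const outputs: T[] = [];
  for (const file of files) {
    (isThoughtFile(file) ? thoughts : outputs).push(file);
  }
  return { thoughts, outputs };
}
```

A provider-agnostic mechanism would make a helper like this unnecessary, since callers would no longer need to know the provider key.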
Contributor
✅ Backport PR created: #13281
vercel-ai-sdk bot added a commit that referenced this pull request Mar 9, 2026
…h and fix related multi-turn errors (#13281) This is an automated backport of #13262 to the release-v6.0 branch. FYI @felixarntz Co-authored-by: Felix Arntz <felix.arntz@vercel.com>
Contributor
🚀 Published in: