Commit 5a307f5

feat(provider/google-vertex): allow using Gemini image models with generateImage (#12502)
## Background

Follow-up to #12252 (which added Gemini image generation to the `google` provider). This wasn't yet wired up for the `google-vertex` provider.

## Summary

- Adds Gemini image model support to `google-vertex`'s `generateImage()`. When a `gemini-*` model ID is detected, the image model internally delegates to `GoogleGenerativeAILanguageModel` with `responseModalities: ['IMAGE']`, matching the approach used in the `google` provider. Existing Imagen model behavior is unchanged.
- Includes docs for this new `generateImage` capability for `google-vertex`
- Also improves provider options typing in both `google` and `google-vertex` image models to use `satisfies GoogleLanguageModelOptions`

## Manual Verification

Ran `google-vertex-gemini-image.ts` and `google-vertex-gemini-editing.ts` examples against Vertex AI with real credentials. Both image generation and image editing produce correct results.

## Checklist

- [x] Tests have been added / updated (for bug fixes / features)
- [x] Documentation has been added / updated (for bug fixes / features)
- [x] A _patch_ changeset for relevant packages has been added (for bug fixes / features - run `pnpm changeset` in the project root)
- [x] I have reviewed this pull request (self-review)

## Future Work

N/A

## Related Issues

Fixes #12452
1 parent e73beb8 commit 5a307f5
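
The delegation described in the commit message can be pictured with a minimal sketch. This is not the provider's actual code: `planImageRequest`, `ImageRequestPlan`, and the `startsWith('gemini-')` prefix check are hypothetical names and assumptions made for illustration. Only the `gemini-*` detection and `responseModalities: ['IMAGE']` come from the commit message.

```typescript
// Hypothetical sketch of the model-ID routing; names are illustrative only
// and do not mirror the internals of @ai-sdk/google-vertex.
type ImageBackend = 'gemini-language-model' | 'imagen';

interface ImageRequestPlan {
  backend: ImageBackend;
  // Set only for Gemini models: asks the language model API for image output.
  responseModalities?: Array<'IMAGE'>;
}

function planImageRequest(modelId: string): ImageRequestPlan {
  // Assumption: a plain prefix check stands in for "a gemini-* model ID is detected".
  if (modelId.startsWith('gemini-')) {
    return { backend: 'gemini-language-model', responseModalities: ['IMAGE'] };
  }
  // Everything else keeps the existing Imagen behavior.
  return { backend: 'imagen' };
}

console.log(planImageRequest('gemini-2.5-flash-image'));
console.log(planImageRequest('imagen-4.0-generate-001'));
```

Keeping the branch on the model ID means callers use the same `vertex.image(...)` factory for both model families, which is the behavior the commit describes.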

9 files changed: +637 -14 lines changed


.changeset/gentle-cats-stare.md

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+---
+'@ai-sdk/google-vertex': patch
+'@ai-sdk/google': patch
+---
+
+feat(provider/google-vertex): allow using Gemini image models with `generateImage`

content/providers/01-ai-sdk-providers/16-google-vertex.mdx

Lines changed: 84 additions & 7 deletions
@@ -838,8 +838,11 @@ The following optional provider options are available for Google Vertex AI embed
 
 ### Image Models
 
-You can create [Imagen](https://cloud.google.com/vertex-ai/generative-ai/docs/image/overview) models that call the [Imagen on Vertex AI API](https://cloud.google.com/vertex-ai/generative-ai/docs/image/generate-images)
-using the `.image()` factory method. For more on image generation with the AI SDK see [generateImage()](/docs/reference/ai-sdk-core/generate-image).
+You can create image models using the `.image()` factory method. The Google Vertex provider supports both [Imagen](https://cloud.google.com/vertex-ai/generative-ai/docs/image/overview) and [Gemini image models](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini/2-5-flash-image). For more on image generation with the AI SDK see [generateImage()](/docs/reference/ai-sdk-core/generate-image).
+
+#### Imagen Models
+
+[Imagen models](https://cloud.google.com/vertex-ai/generative-ai/docs/image/generate-images) generate images using the Imagen on Vertex AI API.
 
 ```ts
 import { vertex } from '@ai-sdk/google-vertex';
@@ -910,7 +913,7 @@ console.log(
 );
 ```
 
-#### Image Editing
+##### Image Editing
 
 Google Vertex Imagen models support image editing through inpainting, outpainting, and other edit modes. Pass input images via `prompt.images` and optionally a mask via `prompt.mask`.
 
@@ -919,7 +922,7 @@ Google Vertex Imagen models support image editing through inpaintin
   `imagen-4.0-generate-001` model does not currently support editing operations.
 </Note>
 
-##### Inpainting (Insert Objects)
+###### Inpainting (Insert Objects)
 
 Insert or replace objects in specific areas using a mask:
 
@@ -951,7 +954,7 @@ const { images } = await generateImage({
 });
 ```
 
-##### Outpainting (Extend Image)
+###### Outpainting (Extend Image)
 
 Extend an image beyond its original boundaries:
 
@@ -982,7 +985,7 @@ const { images } = await generateImage({
 });
 ```
 
-##### Edit Provider Options
+###### Edit Provider Options
 
 The following options are available under `providerOptions.vertex.edit`:
 
@@ -1013,7 +1016,7 @@ The following options are available under `providerOptions.vertex.edit`:
   image editing.
 </Note>
 
-#### Model Capabilities
+##### Imagen Model Capabilities
 
 | Model                           | Aspect Ratios             |
 | ------------------------------- | ------------------------- |
@@ -1024,6 +1027,80 @@ The following options are available under `providerOptions.vertex.edit`:
 | `imagen-4.0-fast-generate-001`  | 1:1, 3:4, 4:3, 9:16, 16:9 |
 | `imagen-4.0-ultra-generate-001` | 1:1, 3:4, 4:3, 9:16, 16:9 |
 
+#### Gemini Image Models
+
+[Gemini image models](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini/2-5-flash-image) (e.g. `gemini-2.5-flash-image`) are multimodal output language models that can be used with `generateImage()` for a simpler image generation experience. Internally, the provider calls the language model API with `responseModalities: ['IMAGE']`.
+
+```ts
+import { vertex } from '@ai-sdk/google-vertex';
+import { generateImage } from 'ai';
+
+const { image } = await generateImage({
+  model: vertex.image('gemini-2.5-flash-image'),
+  prompt: 'A photorealistic image of a cat wearing a wizard hat',
+  aspectRatio: '1:1',
+});
+```
+
+Gemini image models also support image editing by providing input images:
+
+```ts
+import { vertex } from '@ai-sdk/google-vertex';
+import { generateImage } from 'ai';
+import fs from 'node:fs';
+
+const sourceImage = fs.readFileSync('./cat.png');
+
+const { image } = await generateImage({
+  model: vertex.image('gemini-2.5-flash-image'),
+  prompt: {
+    text: 'Add a small wizard hat to this cat',
+    images: [sourceImage],
+  },
+});
+```
+
+You can also use URLs (including `gs://` Cloud Storage URIs) for input images:
+
+```ts
+import { vertex } from '@ai-sdk/google-vertex';
+import { generateImage } from 'ai';
+
+const { image } = await generateImage({
+  model: vertex.image('gemini-2.5-flash-image'),
+  prompt: {
+    text: 'Add a small wizard hat to this cat',
+    images: ['https://example.com/cat.png'],
+  },
+});
+```
+
+<Note>
+  Gemini image models do not support the `size` or `n` parameters. Use
+  `aspectRatio` instead of `size`. Mask-based inpainting is also not supported.
+</Note>
+
+<Note>
+  Gemini image models are multimodal output models that can generate both text
+  and images. For more advanced use cases where you need both text and image
+  outputs, or want more control over the generation process, you can use them
+  directly with `generateText()`.
+</Note>
+
+##### Gemini Image Model Capabilities
+
+| Model                        | Image Generation    | Image Editing       | Aspect Ratios                                       |
+| ---------------------------- | ------------------- | ------------------- | --------------------------------------------------- |
+| `gemini-3-pro-image-preview` | <Check size={18} /> | <Check size={18} /> | 1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9 |
+| `gemini-2.5-flash-image`     | <Check size={18} /> | <Check size={18} /> | 1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9 |
+
+<Note>
+  `gemini-3-pro-image-preview` supports additional features including up to 14
+  reference images for editing (6 objects, 5 humans), resolution options (1K,
+  2K, 4K via `providerOptions.vertex.imageConfig.imageSize`), and Google Search
+  grounding.
+</Note>
+
 ### Video Models
 
 You can create [Veo](https://cloud.google.com/vertex-ai/generative-ai/docs/video/overview) video models that call the Vertex AI API
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
+import { vertex } from '@ai-sdk/google-vertex';
+import { generateImage } from 'ai';
+import fs from 'node:fs';
+import { presentImages } from '../lib/present-image';
+import { run } from '../lib/run';
+
+run(async () => {
+  console.log('Generating base cat image...');
+  const baseResult = await generateImage({
+    model: vertex.image('gemini-2.5-flash-image'),
+    prompt:
+      'A photorealistic picture of a fluffy ginger cat sitting on a wooden table',
+  });
+
+  const timestamp = Date.now();
+
+  fs.mkdirSync('output', { recursive: true });
+
+  const baseImage = baseResult.image;
+  await fs.promises.writeFile(
+    `output/cat-base-${timestamp}.png`,
+    baseImage.uint8Array,
+  );
+  console.log(`Saved base image: output/cat-base-${timestamp}.png`);
+
+  console.log('Adding wizard hat...');
+  const editResult = await generateImage({
+    model: vertex.image('gemini-2.5-flash-image'),
+    prompt: {
+      text: 'Add a small wizard hat to this cat. Keep everything else the same.',
+      images: [baseImage.uint8Array],
+    },
+  });
+
+  presentImages(editResult.images);
+});
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+import { vertex } from '@ai-sdk/google-vertex';
+import { generateImage } from 'ai';
+import { presentImages } from '../lib/present-image';
+import { run } from '../lib/run';
+
+run(async () => {
+  const result = await generateImage({
+    model: vertex.image('gemini-2.5-flash-image'),
+    prompt:
+      'Create a picture of a nano banana dish in a fancy restaurant with a Gemini theme',
+    aspectRatio: '1:1',
+  });
+
+  presentImages(result.images);
+
+  console.log(
+    'Provider metadata:',
+    JSON.stringify(result.providerMetadata, null, 2),
+  );
+  console.log('Token usage:', result.usage);
+});
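
The documentation added in this commit notes that Gemini image models do not support `size`, `n`, or mask-based inpainting, and lists the accepted aspect ratios. A caller-side pre-flight check could look roughly like the sketch below. It is purely illustrative: `findUnsupportedGeminiOptions`, `GeminiImageCallOptions`, and the `hasMask` flag are hypothetical and not part of `@ai-sdk/google-vertex`, and treating `n: 1` as harmless is an assumption (the docs only say `n` is unsupported).

```typescript
// Hypothetical pre-flight check; not part of @ai-sdk/google-vertex.
// Aspect ratios are copied from the "Gemini Image Model Capabilities" table.
const GEMINI_ASPECT_RATIOS: readonly string[] = [
  '1:1', '2:3', '3:2', '3:4', '4:3', '4:5', '5:4', '9:16', '16:9', '21:9',
];

interface GeminiImageCallOptions {
  size?: string;
  n?: number;
  aspectRatio?: string;
  hasMask?: boolean; // illustrative flag: true when a mask would be passed
}

// Returns a human-readable list of options the docs describe as unsupported.
function findUnsupportedGeminiOptions(
  options: GeminiImageCallOptions,
): string[] {
  const problems: string[] = [];
  if (options.size !== undefined) {
    problems.push('size is not supported; use aspectRatio instead');
  }
  // Assumption: n = 1 is treated as harmless here.
  if (options.n !== undefined && options.n !== 1) {
    problems.push('n is not supported');
  }
  if (options.hasMask) {
    problems.push('mask-based inpainting is not supported');
  }
  if (
    options.aspectRatio !== undefined &&
    !GEMINI_ASPECT_RATIOS.includes(options.aspectRatio)
  ) {
    problems.push(`aspectRatio ${options.aspectRatio} is not in the documented list`);
  }
  return problems;
}

console.log(findUnsupportedGeminiOptions({ aspectRatio: '16:9' }));
console.log(
  findUnsupportedGeminiOptions({ size: '1024x1024', n: 2, hasMask: true }),
);
```

Returning a list of problems rather than throwing keeps the check usable both for logging and for hard validation; which of the two fits is left to the caller.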
