Add support for text-to-image #6648
All TextToImageOptions properties are now nullable so that the client can use its own defaults where appropriate. The quality default was removed since it isn't consistent across models, and ResponseFormat is no longer set since gpt-image-1 does not support it.
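As a rough illustration of the nullable-options idea in this comment, here is a minimal sketch. `SketchImageOptions` and `SketchRequestMapper` are hypothetical stand-ins written for this example, not the types in this PR, and the request field names (`n`, `model`, `quality`) are only illustrative. The point is that a property left as `null` is never written to the request, so the service or model default applies.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the options type discussed above (not the PR's actual class).
public sealed class SketchImageOptions
{
    public int? Count { get; set; }       // null => let the service decide how many images
    public string? ModelId { get; set; }  // null => use the client's default model
    public string? Quality { get; set; }  // null => no client-side default, since it varies by model
}

public static class SketchRequestMapper
{
    // Copies only the values the caller actually set; unset values are omitted entirely.
    public static Dictionary<string, object> ToRequestFields(SketchImageOptions? options)
    {
        var fields = new Dictionary<string, object>();
        if (options is null)
        {
            return fields;
        }

        if (options.Count is int count) fields["n"] = count;
        if (options.ModelId is string model) fields["model"] = model;
        if (options.Quality is string quality) fields["quality"] = quality;
        return fields;
    }
}
```

With this shape, a caller who passes `new SketchImageOptions { Count = 2 }` sends only the count, and everything else is left to the backend.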
Pull Request Overview
This pull request adds text-to-image support to the Microsoft.Extensions.AI libraries. It introduces a new ITextToImageClient interface with an OpenAI implementation, along with dependency injection support, middleware pipeline capabilities, and test coverage.
- Defines core text-to-image abstractions including ITextToImageClient, TextToImageOptions, and TextToImageResponse
- Implements OpenAI text-to-image client integration with support for image generation and editing
- Adds middleware pipeline support with logging and configuration options
Reviewed Changes
Copilot reviewed 33 out of 33 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| src/Libraries/Microsoft.Extensions.AI.Abstractions/TextToImage/ | Core abstractions and interfaces for text-to-image functionality |
| src/Libraries/Microsoft.Extensions.AI/TextToImage/ | Client builder, middleware, and dependency injection extensions |
| src/Libraries/Microsoft.Extensions.AI.OpenAI/ | OpenAI-specific implementation of ITextToImageClient |
| test/Libraries/Microsoft.Extensions.AI.*Tests/TextToImage/ | Comprehensive test coverage for all text-to-image components |
OpenAI's image API supports multiple images and this does seem to be common functionality and a better generalization. The client library doesn't expose this yet, but we should account for it. Image models may be capable of things like "Combine the subjects of these images into a single image" or "Create a single image that uses the subject from the first image and background for the second" etc.
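To make the plural shape concrete, here is a small sketch of an edit method that takes several input images, plus a single-image convenience that forwards to it. Everything here is hypothetical and written only for illustration: `SketchImage`, `ISketchImageEditor`, and `CountingEditor` are not the PR's types, and the methods are synchronous to keep the example short, whereas the method discussed here is asynchronous.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical image payload; a stand-in, not the library's DataContent type.
public sealed record SketchImage(byte[] Data, string MediaType, string? Name = null);

// Hypothetical plural edit contract: one prompt applied to one or more source images.
public interface ISketchImageEditor
{
    IReadOnlyList<SketchImage> EditImages(IEnumerable<SketchImage> originals, string prompt);
}

public static class SketchImageEditorExtensions
{
    // Single-image convenience that wraps the image in a one-element array
    // and forwards to the plural method.
    public static IReadOnlyList<SketchImage> EditImage(
        this ISketchImageEditor editor, SketchImage original, string prompt)
    {
        if (editor is null) throw new ArgumentNullException(nameof(editor));
        if (original is null) throw new ArgumentNullException(nameof(original));
        return editor.EditImages(new[] { original }, prompt);
    }
}

// Fake editor used only to show the call shape: it records how many inputs it
// received and echoes them back unchanged.
public sealed class CountingEditor : ISketchImageEditor
{
    public int LastInputCount { get; private set; }

    public IReadOnlyList<SketchImage> EditImages(IEnumerable<SketchImage> originals, string prompt)
    {
        var list = originals.ToList();
        LastInputCount = list.Count;
        return list;
    }
}
```

Keeping the plural method as the only interface member means a backend that accepts just one image can still implement it (for example by rejecting extra inputs), while prompts such as "use the subject from the first image and the background from the second" have somewhere to pass both images.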
We don't yet have any good public support for streaming to vet this API. We can guess at how it might behave for OpenAI, but that doesn't really give enough confidence to build the API around it.
/backport to release/9.8
Started backporting to release/9.8: https://github.com/dotnet/extensions/actions/runs/16914549812
@ericstj backporting to "release/9.8" failed, the patch most likely resulted in conflicts:
$ git am --3way --empty=keep --ignore-whitespace --keep-non-patch changes.patch
Applying: Add ITextToImageClient
Applying: Remove URI based edit since it's not available
Applying: Add filename for edit
Applying: Add OpenAI implmentation of ITextToImageClient
.git/rebase-apply/patch:158: trailing whitespace.
/// <summary>Initializes a new instance of the <see cref="OpenAITextToImageClient"/> class for the specified <see cref="OpenAIClient"/> and model.
.git/rebase-apply/patch:161: trailing whitespace.
/// <param name="model">The default model to use for image generation.</param>
.git/rebase-apply/patch:371: trailing whitespace.
.git/rebase-apply/patch:421: trailing whitespace.
.git/rebase-apply/patch:430: trailing whitespace.
warning: squelched 1 whitespace error
warning: 6 lines add whitespace errors.
Using index info to reconstruct a base tree...
M src/Libraries/Microsoft.Extensions.AI.OpenAI/OpenAIClientExtensions.cs
Falling back to patching base and 3-way merge...
Auto-merging src/Libraries/Microsoft.Extensions.AI.OpenAI/OpenAIClientExtensions.cs
CONFLICT (content): Merge conflict in src/Libraries/Microsoft.Extensions.AI.OpenAI/OpenAIClientExtensions.cs
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config set advice.mergeConflict false"
Patch failed at 0004 Add OpenAI implmentation of ITextToImageClient
Error: The process '/usr/bin/git' failed with exit code 128
Please backport manually!
* Add ITextToImageClient
* Remove URI based edit since it's not available
* Add filename for edit
* Add OpenAI implementation of ITextToImageClient
* Fix tests
* Add tests for TextToImage
* Add DelegatingTextToImageClient and tests
* Add integration test and fix some bugs
* Add remaining support to MEAI for TextToImage
* Make all TextToImageOptions optional
  These are all nullable now so that the client can use defaults where appropriate. Remove quality default since it's not consistent across models. Also remove setting ResponseFormat since this is not supported by gpt-image-1.
* Address feedback
* Document some exceptions
* Address feedback
* Make EditImageAsync plural
  OpenAI's image API supports multiple images and this does seem to be common functionality and a better generalization. The client library doesn't expose this yet, but we should account for it. Image models may be capable of things like "Combine the subjects of these images into a single image" or "Create a single image that uses the subject from the first image and background for the second" etc.
* Address feedback and add/fix tests.
* Fix bad merge
* Address feedback
* Fix test
* Use DataContent.Name for filename.
* Add extensions for EditImageAsync
  Extension that accepts a single DataContent and one that accepts a byte[]. I've left out streams and file paths, since these require more opinions about how to load them. I filed dotnet#6683 to address streams.
* Fix test
* Remove use of `_model` field.
* Rename ImageToText to Image
* Rename TextToImage directories to Image
* Rename files TextToImage -> Image
* Add new request and response type
* Make GenerateImagesAsync accept ImageRequest
* Remove EditImageAsync
* Adding GenerateStreamingImagesAsync
* Update docs
* Rename ImageClient ImageGenerator
* Fix up some text-to-image references
* Rename Image(Options|Request|Response)
* Remove `Images` from `GenerateImagesAsync`
* Remove streaming method
  We don't yet have any good public support for streaming to vet this API. We can guess at how it might behave for OpenAI, but that doesn't really give enough confidence to build the API around it.
* Address feedback
* Provide OpenAI an appropriate filename
* Remove Style from ImageGenerationOptions