Skip to content

.Net: Added support for BinaryData for ImageContent.#4919

Closed
dersia wants to merge 0 commit intomicrosoft:mainfrom
dersia:main
Closed

.Net: Added support for BinaryData for ImageContent.#4919
dersia wants to merge 0 commit intomicrosoft:mainfrom
dersia:main

Conversation

@dersia
Copy link
Contributor

@dersia dersia commented Feb 7, 2024

Added Tests

Motivation and Context

As Described in #4781 right now there is no possibility in SK to add Images as DataUris to ChatCompletion APIs, although the Azure OpenAI API and the Open AI API both support this.

Fixes #4781

Description

As per Discussion added overload to the ImageContent ctor that takes BinaryData.
For backward Compat we kept the ctor that takes an URI.

Also the new ctor throws, if the BinaryData is null, empty or if there is not MediaType provided.

I thought about allowing plain, non base64 encoded DataUris with BinaryData.
The Idea was to not encode to base64, if the MediaType is set to "text/plain", but then I decided, that this is not needed, since Uri in general allows for DataUris like new Uri("data:text/plain;http://exmpaledomain.com") just not for DataUris that are longer than 65520 bytes. I feel like that is ok, for plain DataUris.

We can still add this if needed.

Also as per discussion in the issue, I did not add additional overloads for direct Streams support.

Contribution Checklist

@dersia dersia requested a review from a team as a code owner February 7, 2024 21:36
@shawncal shawncal added .NET Issue or Pull requests regarding .NET code kernel Issues or pull requests impacting the core kernel kernel.core labels Feb 7, 2024
@github-actions github-actions bot changed the title Added support for BinaryData for ImageContent. .Net: Added support for BinaryData for ImageContent. Feb 7, 2024
@dersia
Copy link
Contributor Author

dersia commented Feb 7, 2024

@microsoft-github-policy-service agree [company="Sia Consulting GmbH"]

@dersia
Copy link
Contributor Author

dersia commented Feb 7, 2024

@microsoft-github-policy-service agree company="Sia Consulting GmbH"

Copy link
Contributor

@Krzysztof318 Krzysztof318 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I don't think this is a good solution. If we were to introduce base64 support it would probably be better to create a new class like ImageBinaryContent. I'm not sure the SK team will agree to this change. By the way, if you make any changes create a new branch instead working on a copy of the main.
And to implement this, a fix in the implementation of the openai connector would be required....

/// </summary>
public BinaryData? Data { get; }

/// <summary>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But which has higher priority uri or base64 and why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the ToString override suggests, Data has priority. This is also assured by Tests

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ToString() rather will not used in connectors. We should clearly explain to others that they should prioritize a BinaryData over an Uri. For example by adding additional information to the doc comment in ImageContent class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was also looking at an issue opened in runtime asking to remove the length limitation on Uri. I think this is in general a good idea, but this won't be GA before net9 and this is way too late. SK and AI development is moving so quickly right now, that most wouldn't want to wait for that change.

But even with that change, I think BinaryData gives developers a lot of comfortable possibilities and they don't have to do all the base64 encoding and serialization themselfs. So maybe once the limit on Uri is removed, we could in the implementation just switch back to Uri as an implementation detail. I really hope the issue on Azure.AI.OpenAI is moving forward sooner rather than later, so we can finally merge this change.

Assert.Throws<ArgumentNullException>(() => new ImageContent((BinaryData)null!));
Assert.Throws<ArgumentNullException>(() => new ImageContent(data));
Assert.Throws<ArgumentNullException>(() => new ImageContent(data2));
Assert.Throws<ArgumentNullException>(() => new ImageContent(data3));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should use [Theory] and inline data in such cases.

// Assert throws if mediatype is null
Assert.Throws<ArgumentNullException>(() => new ImageContent(BinaryData.FromBytes(bytes, null)));
Assert.Throws<ArgumentNullException>(() => new ImageContent(BinaryData.FromBytes(bytes, string.Empty)));
Assert.Throws<ArgumentNullException>(() => new ImageContent(BinaryData.FromBytes(bytes, " ")));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same [Theory] and [InlineData] and throws assertions should be in another test.

{
// Arrange
var ms = new System.IO.MemoryStream(System.Text.Encoding.UTF8.GetBytes("this is a test"));
try
Copy link
Contributor

@Krzysztof318 Krzysztof318 Feb 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tests projects support c#11 so you can do it simpler.
instead System.Text.Encoding.UTF8.GetBytes("this is a test")
you can write "this is a test"u8

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately the MemoryStream ctor does not take a ROS<byte> and u8 gives back a ROS<byte>.

I could do "this is a test"u8.ToArray() not sure if is any better, though

@dersia
Copy link
Contributor Author

dersia commented Feb 8, 2024

In general I don't think this is a good solution. If we were to introduce base64 support it would probably be better to create a new class like ImageBinaryContent. I doubt the SK team would agree to this change. By the way, if you make any changes create a new branch instead working on a copy of the main. And to implement this, a fix in the implementation of the openai connector would be required....

I usually create branches for features, I am not sure, why this happend this time. My Fault, Sorry.
I will push the changes, once they're confirmed, I will close this PR and create a new one from the other branch.

Regarding the issue with the OpenAI Connector, you are totally correct, I thought that it would use the ToString() method and use that for the SDK.

I have commented on an open issue regarding this and also referenced this issue. Once this is fixed, I will adjust the implementation to the Connectors.

@Krzysztof318
Copy link
Contributor

@dersia You can leave this PR with main branch as is. It doesn't matter for PR, only for your convenience.

public Uri? Uri { get; set; }

/// <summary>
/// The Data used as DataUri for the image.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dersia Thanks for the changes.
I would remove the DataUri from the comment on this field and move it to the ToString.

Suggested change
/// The Data used as DataUri for the image.
/// The image binary data.

this.Data = data;
}

/// <inheritdoc/>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// <inheritdoc/>
/// <summary>
/// Returns the string representation of the image.
/// BinaryData images will be represented as DataUri
/// Remote Uri images will be represented as is
/// </summary>

public void ToStringForDataUriFromStreamReturnsDataUriString()
{
// Arrange
var ms = new System.IO.MemoryStream(System.Text.Encoding.UTF8.GetBytes("this is a test"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any special reason why are you manually handling the disposal of MemoryStream in the test?

Can be simplified with the using var

Suggested change
var ms = new System.IO.MemoryStream(System.Text.Encoding.UTF8.GetBytes("this is a test"));
using var ms = new System.IO.MemoryStream(System.Text.Encoding.UTF8.GetBytes("this is a test"));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember having the using pattern implemented and than having early disposal in some test, but I don't have those test anymore. I will see if I can replicate it and if not, revert back to using the using pattern here

@Krzysztof318
Copy link
Contributor

Krzysztof318 commented Feb 13, 2024

@dersia btw. new class was introduced to kernel BinaryContent
This class is currently used only by OpenAIFileService.

I know this news may not be relevant to this PR, but it is worth taking a view.

@dersia
Copy link
Contributor Author

dersia commented Feb 14, 2024

I didn't mean to close this PR. I will open a new one and reference this PR

github-merge-queue bot pushed a commit that referenced this pull request Feb 20, 2024
Reopend PR. See #4919
for previous PR.
Reopening this PR from another feature branch makes rebasing easier.

### Motivation and Context

As Described in #4781 right now there is no possibility in SK to add
Images as DataUris to ChatCompletion APIs, although the Azure OpenAI API
and the Open AI API both support this.

Fixes #4781

### Description

As per Discussion added overload to the ImageContent ctor that takes
BinaryData.
For backward Compat we kept the ctor that takes an URI.

Also the new ctor throws, if the BinaryData is null, empty or if there
is not MediaType provided.

I thought about allowing plain, non base64 encoded DataUris with
BinaryData.
The Idea was to not encode to base64, if the MediaType is set to
"text/plain", but then I decided, that this is not needed, since `Uri`
in general allows for DataUris like `new
Uri("data:text/plain;http://exmpaledomain.com")` just not for DataUris
that are longer than 65520 bytes. I feel like that is ok, for plain
DataUris.

We can still add this if needed.

Also as per discussion in the issue, I did not add additional overloads
for direct Streams support.

### Contribution Checklist

<!-- Before submitting this PR, please make sure: -->

- [x] The code builds clean without any errors or warnings
- [x] The PR follows the [SK Contribution
Guidelines](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md)
and the [pre-submission formatting
script](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md#development-scripts)
raises no violations
- [x] All unit tests pass, and I have added new tests where possible
- [x] I didn't break anyone 😄

---------

Co-authored-by: Roger Barreto <19890735+RogerBarreto@users.noreply.github.com>
LudoCorporateShark pushed a commit to LudoCorporateShark/semantic-kernel that referenced this pull request Aug 25, 2024
Reopend PR. See microsoft#4919
for previous PR.
Reopening this PR from another feature branch makes rebasing easier.

### Motivation and Context

As Described in microsoft#4781 right now there is no possibility in SK to add
Images as DataUris to ChatCompletion APIs, although the Azure OpenAI API
and the Open AI API both support this.

Fixes microsoft#4781

### Description

As per Discussion added overload to the ImageContent ctor that takes
BinaryData.
For backward Compat we kept the ctor that takes an URI.

Also the new ctor throws, if the BinaryData is null, empty or if there
is not MediaType provided.

I thought about allowing plain, non base64 encoded DataUris with
BinaryData.
The Idea was to not encode to base64, if the MediaType is set to
"text/plain", but then I decided, that this is not needed, since `Uri`
in general allows for DataUris like `new
Uri("data:text/plain;http://exmpaledomain.com")` just not for DataUris
that are longer than 65520 bytes. I feel like that is ok, for plain
DataUris.

We can still add this if needed.

Also as per discussion in the issue, I did not add additional overloads
for direct Streams support.

### Contribution Checklist

<!-- Before submitting this PR, please make sure: -->

- [x] The code builds clean without any errors or warnings
- [x] The PR follows the [SK Contribution
Guidelines](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md)
and the [pre-submission formatting
script](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md#development-scripts)
raises no violations
- [x] All unit tests pass, and I have added new tests where possible
- [x] I didn't break anyone 😄

---------

Co-authored-by: Roger Barreto <19890735+RogerBarreto@users.noreply.github.com>
Bryan-Roe pushed a commit to Bryan-Roe-ai/semantic-kernel that referenced this pull request Oct 6, 2024
Reopend PR. See microsoft#4919
for previous PR.
Reopening this PR from another feature branch makes rebasing easier.

### Motivation and Context

As Described in microsoft#4781 right now there is no possibility in SK to add
Images as DataUris to ChatCompletion APIs, although the Azure OpenAI API
and the Open AI API both support this.

Fixes microsoft#4781

### Description

As per Discussion added overload to the ImageContent ctor that takes
BinaryData.
For backward Compat we kept the ctor that takes an URI.

Also the new ctor throws, if the BinaryData is null, empty or if there
is not MediaType provided.

I thought about allowing plain, non base64 encoded DataUris with
BinaryData.
The Idea was to not encode to base64, if the MediaType is set to
"text/plain", but then I decided, that this is not needed, since `Uri`
in general allows for DataUris like `new
Uri("data:text/plain;http://exmpaledomain.com")` just not for DataUris
that are longer than 65520 bytes. I feel like that is ok, for plain
DataUris.

We can still add this if needed.

Also as per discussion in the issue, I did not add additional overloads
for direct Streams support.

### Contribution Checklist

<!-- Before submitting this PR, please make sure: -->

- [x] The code builds clean without any errors or warnings
- [x] The PR follows the [SK Contribution
Guidelines](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md)
and the [pre-submission formatting
script](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md#development-scripts)
raises no violations
- [x] All unit tests pass, and I have added new tests where possible
- [x] I didn't break anyone 😄

---------

Co-authored-by: Roger Barreto <19890735+RogerBarreto@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kernel.core kernel Issues or pull requests impacting the core kernel .NET Issue or Pull requests regarding .NET code

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

.Net: Add overload ctor to ImageContent that takes an string "url"

4 participants