Skip to content

Conversation

@N-E-W-T-O-N
Copy link
Contributor

@N-E-W-T-O-N N-E-W-T-O-N commented Jun 11, 2024

Motivation and Context

Fix the Bug #6635 HuggingFace Embedding: Unable to Deserialization for certain models

Description

As mentioned in the issue, the HuggingFace Embedding API interface returns responses typically in the form of List<ReadOnlyMemory<float>> and occasionally as List<List<List<ReadOnlyMemory<float>>>>. Currently, only the latter format is handled correctly, leading to deserialization issues.

To address this, I propose the following solution:

try {
    // Attempt to parse data as List<ReadOnlyMemory<float>> and return the parsed data
}
catch (KernelException ex1) {
    try {
        // If the first attempt fails, attempt to parse data as List<List<List<ReadOnlyMemory<float>>>>` and return the parsed data
    }
    catch (KernelException ex2) {
        // If both attempts fail, handle the exception (e.g., the model doesn't exist ,the model has still been loading, or an HTTP exception occurred) and rethrow the error
    }
}

Contribution Checklist

Added class `TextEmbeddingResponseType1` & `TextEmbeddingResponseType2` to parse two distinct possible JsonRequest
…Async`

- Added try-catch block to better handle parsing in `GenerateEmbeddingsAsync` method.
@N-E-W-T-O-N N-E-W-T-O-N requested a review from a team as a code owner June 11, 2024 18:41
@markwallace-microsoft markwallace-microsoft added .NET Issue or Pull requests regarding .NET code kernel Issues or pull requests impacting the core kernel labels Jun 11, 2024
@github-actions github-actions bot changed the title Fix hugging face embedding .Net: Fix hugging face embedding Jun 11, 2024
@N-E-W-T-O-N
Copy link
Contributor Author

Hello @dmytrostruk
Can you please suggest to me where I can add the above custom HttpClient Example in main code .

@dmytrostruk
Copy link
Member

Hello @dmytrostruk Can you please suggest to me where I can add the above custom HttpClient Example in main code .

@N-E-W-T-O-N That's awesome, thanks for validating this idea and creating an example for this! I think you can put it here:
https://github.com/microsoft/semantic-kernel/blob/main/dotnet/samples/Concepts/Memory/HuggingFace_EmbeddingGeneration.cs

@N-E-W-T-O-N
Copy link
Contributor Author

I want to add two more parameters in the api's payload body namely 'use_cache' & 'wait_for_model'

I want to send this values as addition parameters in the HuggingFaceClient

https://huggingface.co/docs/api-inference/en/detailed_parameters

@N-E-W-T-O-N
Copy link
Contributor Author

@dmytrostruk
Added my example in 4f9eb97 .
Please merge my PR..
Let me know for anymore Improvement.

Copy link
Member

@dmytrostruk dmytrostruk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@N-E-W-T-O-N Thanks for your contribution! There are some warnings about possible null references and formatting issues. In order to proceed with merging, we need to make sure that all warnings are resolved. Thanks again!

@N-E-W-T-O-N
Copy link
Contributor Author

@rogerbarreto It might be possible that changes in TextEmbeddingResponse class will lead to test case failure

@N-E-W-T-O-N
Copy link
Contributor Author

N-E-W-T-O-N commented Jun 22, 2024

@dmytrostruk

Remove the whitespace error.
Please re run the workflow

@markwallace-microsoft
Copy link
Member

@N-E-W-T-O-N there are still formatting errors to be addressed. Suggest you run dotnet format on the solution.

@N-E-W-T-O-N
Copy link
Contributor Author

@dmytrostruk anything to add from my side

@dmytrostruk
Copy link
Member

@dmytrostruk anything to add from my side

@N-E-W-T-O-N I think we are ready to merge this to main. I added a couple of small fixes to the example. Thanks for your contribution!

@N-E-W-T-O-N
Copy link
Contributor Author

@dmytrostruk updated the code based on the suggestion of @westey-m

@N-E-W-T-O-N
Copy link
Contributor Author

@dmytrostruk
Please run the workflow test

@N-E-W-T-O-N
Copy link
Contributor Author

@dmytrostruk
Please rerun the workflow.
Resolved the issue

@dmytrostruk dmytrostruk added this pull request to the merge queue Jul 8, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jul 8, 2024
@dmytrostruk dmytrostruk enabled auto-merge July 8, 2024 17:06
@dmytrostruk dmytrostruk added this pull request to the merge queue Jul 8, 2024
Merged via the queue into microsoft:main with commit 32d3f5d Jul 8, 2024
@N-E-W-T-O-N
Copy link
Contributor Author

@dmytrostruk
Thanks for accepting my PR

@dmytrostruk
Copy link
Member

@N-E-W-T-O-N Thank you for contribution!

LudoCorporateShark pushed a commit to LudoCorporateShark/semantic-kernel that referenced this pull request Aug 25, 2024
### Motivation and Context
Fix the Bug microsoft#6635 HuggingFace Embedding: Unable to Deserialization for
certain models

<!-- Thank you for your contribution to the semantic-kernel repo!
Please help reviewers and future users providing the following
information:
  1. Why is this change required?
  2. What problem does it solve?
  3. What scenario does it contribute to?
  4. If it fixes an open issue, please link to the issue here.
-->

### Description
As mentioned in the issue, the HuggingFace Embedding API interface
returns responses typically in the form of `List<ReadOnlyMemory<float>>`
and occasionally as `List<List<List<ReadOnlyMemory<float>>>>`.
Currently, only the latter format is handled correctly, leading to
deserialization issues.

To address this, I propose the following solution:
```
try {
    // Attempt to parse data as List<ReadOnlyMemory<float>> and return the parsed data
}
catch (KernelException ex1) {
    try {
        // If the first attempt fails, attempt to parse data as List<List<List<ReadOnlyMemory<float>>>>` and return the parsed data
    }
    catch (KernelException ex2) {
        // If both attempts fail, handle the exception (e.g., the model doesn't exist ,the model has still been loading, or an HTTP exception occurred) and rethrow the error
    }
}

```
### Contribution Checklist 

<!-- Before submitting this PR, please make sure: -->

- [x] The code builds clean without any errors or warnings
- [x] The PR follows the [SK Contribution
Guidelines](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md)
and the [pre-submission formatting
script](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md#development-scripts)
raises no violations
- [ ] All unit tests pass, and I have added new tests where possible
- [ ] I didn't break anyone 😄

---------

Co-authored-by: Dmytro Struk <13853051+dmytrostruk@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kernel Issues or pull requests impacting the core kernel .NET Issue or Pull requests regarding .NET code

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

5 participants