Description
Describe the bug
When using the IEmbeddingGenerator implementation for Google AI (AddGoogleAIEmbeddingGenerator), providing a task_type within the EmbeddingGenerationOptions.AdditionalProperties does not add the taskType field to the outgoing HTTP request body sent to the Google API.
This prevents the use of task-specific embeddings (e.g., RETRIEVAL_DOCUMENT, RETRIEVAL_QUERY), which is a critical feature for optimizing search and RAG applications. The generator silently discards the option, leading to default embeddings being generated for all tasks.
To Reproduce
The issue can be reproduced by intercepting the outgoing HttpClient request.
- Set up a logging handler:
    public class LoggingDelegatingHandler : DelegatingHandler
    {
        protected override async Task<HttpResponseMessage> SendAsync(
            HttpRequestMessage request,
            CancellationToken cancellationToken)
        {
            if (request.Content != null)
            {
                var requestBody = await request.Content.ReadAsStringAsync(cancellationToken);
                Console.WriteLine($"--> REQUEST BODY:\n{requestBody}");
            }

            return await base.SendAsync(request, cancellationToken);
        }
    }
- Register the services:
    services.AddTransient<LoggingDelegatingHandler>();
    services.AddHttpClient("GoogleAIClient")
        .AddHttpMessageHandler<LoggingDelegatingHandler>();

    // Register the generator, ensuring it uses the instrumented HttpClient
    services.AddGoogleAIEmbeddingGenerator(
        modelId: "embedding-001",
        apiKey: "YOUR_API_KEY",
        httpClient: services.BuildServiceProvider()
            .GetRequiredService<IHttpClientFactory>()
            .CreateClient("GoogleAIClient")
    );
- Execute the call with task_type:
    var embeddingGenerator = serviceProvider.GetRequiredService<IEmbeddingGenerator<string, Embedding<float>>>();
    var text = "This is a document for retrieval.";
    var options = new EmbeddingGenerationOptions()
    {
        AdditionalProperties = new()
        {
            { "task_type", "RETRIEVAL_DOCUMENT" }
        }
    };

    await embeddingGenerator.GenerateVectorAsync(text, options);
- Observe the logged request body:
  The output shows a request body without the taskType field:
    --> REQUEST BODY:
    {"requests":[{"model":"models/gemini-embedding-001","content":{"parts":[{"text":"This is a document for retrieval."}]},"outputDimensionality":1998}]}
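To check the logged body programmatically rather than by eye, something like the following could be called from the logging handler or a test. This is only a sketch: `RequestBodyChecks` and `AllRequestsHaveTaskType` are hypothetical names introduced here, and the only assumption about the payload is the `requests` array shape visible in the log above.

```csharp
using System;
using System.Linq;
using System.Text.Json;

// Hypothetical helper for the repro; not part of Semantic Kernel.
public static class RequestBodyChecks
{
    // Returns true only if the body is a JSON object with a non-empty
    // "requests" array in which every entry has a "taskType" property.
    public static bool AllRequestsHaveTaskType(string requestBody)
    {
        using JsonDocument doc = JsonDocument.Parse(requestBody);
        JsonElement root = doc.RootElement;

        if (root.ValueKind != JsonValueKind.Object
            || !root.TryGetProperty("requests", out JsonElement requests)
            || requests.ValueKind != JsonValueKind.Array
            || requests.GetArrayLength() == 0)
        {
            return false;
        }

        return requests.EnumerateArray().All(r =>
            r.ValueKind == JsonValueKind.Object
            && r.TryGetProperty("taskType", out _));
    }
}
```

With the body logged in the last step, this check returns false, because the single request entry has no taskType property.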
Expected behavior
The logged HTTP request body should include the taskType field, as specified in the Google API documentation.
Expected request body:
--> REQUEST BODY:
{"requests":[{"model":"models/gemini-embedding-001","content":{"parts":[{"text":"This is a document for retrieval."}]},"outputDimensionality":1998,"taskType":"RETRIEVAL_DOCUMENT"}]}
Platform
- Language: C#
- Source: Microsoft.SemanticKernel.Connectors.Google, version 1.66.0-alpha (and likely earlier versions)
- AI model: Google gemini-embedding-001
- IDE: Visual Studio 2022
- OS: Windows
Additional context
The root cause appears to be that the EmbeddingGenerationOptions are not being passed down through the internal call stack.
- GoogleAIEmbeddingGenerator.GenerateAsync calls an internal generator.
- This internal generator is an adapter for the obsolete GoogleAITextEmbeddingGenerationService.
- The GenerateEmbeddingsAsync method on GoogleAITextEmbeddingGenerationService does not have a parameter for EmbeddingGenerationOptions, so the options are discarded at this point.
- Consequently, the deeper GoogleAIEmbeddingRequest.FromData method, which builds the request, is never supplied with the taskType value.
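To illustrate the behavior being asked for, here is a rough, self-contained sketch of a request builder that reads task_type from the caller's additional properties and emits it as taskType. It is not the connector's code: `EmbeddingRequestSketch` and `BuildBody` are hypothetical names, a plain dictionary stands in for `EmbeddingGenerationOptions.AdditionalProperties`, and the JSON shape is copied from the logged request above.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical stand-in for the connector's request builder; this is NOT the
// actual GoogleAIEmbeddingRequest.FromData implementation.
public static class EmbeddingRequestSketch
{
    public static string BuildBody(
        string modelId,
        IEnumerable<string> texts,
        IReadOnlyDictionary<string, object?>? additionalProperties = null)
    {
        // Read the caller-supplied task type, if one was provided.
        string? taskType = null;
        if (additionalProperties is not null
            && additionalProperties.TryGetValue("task_type", out object? value))
        {
            taskType = value?.ToString();
        }

        var requests = new List<Dictionary<string, object?>>();
        foreach (string text in texts)
        {
            var request = new Dictionary<string, object?>
            {
                ["model"] = $"models/{modelId}",
                ["content"] = new Dictionary<string, object?>
                {
                    ["parts"] = new[]
                    {
                        new Dictionary<string, object?> { ["text"] = text }
                    }
                }
            };

            // Emit taskType only when the caller asked for one, so requests
            // without the option keep their current shape.
            if (!string.IsNullOrEmpty(taskType))
            {
                request["taskType"] = taskType;
            }

            requests.Add(request);
        }

        return JsonSerializer.Serialize(
            new Dictionary<string, object?> { ["requests"] = requests });
    }
}
```

The point of the sketch is that the options (or at least the task type) have to reach whatever method builds the request; an actual fix would need to thread them through the adapter and the obsolete service described in the list above.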
This functionality is crucial for building effective search systems, as the performance difference between default embeddings and specialized RETRIEVAL_DOCUMENT/RETRIEVAL_QUERY embeddings is significant.