
Python: Bug: How do I disable parallel function calling in Semantic Kernel (Python)? #9478

@jordanbean-msft

Description


Describe the bug
We are hitting the issue described here: https://community.openai.com/t/the-multi-tool-use-parallel-bug-and-how-to-fix-it/880771. We tried several of the solutions proposed across those threads, with no success.

We found several open threads about this on the OpenAI forum, and one quick mitigation we are trying is to disable parallel calling completely.
Also, when reviewing the OpenAI documentation we found that model outputs "may not match strict schemas supplied in tools" when parallel calls are enabled (https://platform.openai.com/docs/guides/function-calling/parallel-function-calling-and-structured-outputs), an issue that also hits us from time to time.

The problem is that we are unable to find a setting in Semantic Kernel (Python) to disable parallel function calls at the model level. There is a configuration on the Kernel class, but it appears to be internal to Semantic Kernel (meaning it only takes effect after the model has already responded with the call instructions).

I can see how I would pass this value in if I directly called the underlying openai SDK, but how do I set this same value in Semantic Kernel?
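For reference, this is roughly how the flag would be passed when calling the openai SDK directly. This is a sketch, not our production code: the model name and the `get_weather` tool are placeholders, and the actual network call is left commented out.

```python
# Sketch: disabling parallel tool calls when using the openai SDK directly.
# The model name and the get_weather tool below are illustrative placeholders.
request_kwargs = {
    "model": "gpt-4o",
    "messages": [
        {"role": "user", "content": "What is the weather in Paris and in Rome?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    # This is the flag we would like Semantic Kernel to expose: it forces
    # the model to emit at most one tool call per assistant turn.
    "parallel_tool_calls": False,
}

# With a configured client, the call would be:
# from openai import OpenAI
# client = OpenAI()
# response = client.chat.completions.create(**request_kwargs)

print(request_kwargs["parallel_tool_calls"])
```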

To Reproduce
Steps to reproduce the behavior:
Digging into the code flow, I found the flag is present in the underlying OpenAI dependency, but not in the Pydantic representation in Semantic Kernel. We were able to patch this locally for testing, but I wasn't able to figure out where in the code Semantic Kernel handles the parallel calls when they are enabled.
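To illustrate the gap: the execution-settings class flattens its non-None fields into the keyword arguments for the SDK call, so a field that does not exist on the class simply never reaches the request. A minimal stand-in (plain dataclasses instead of the actual Pydantic models; class and function names are illustrative, not Semantic Kernel's real ones):

```python
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class ExecutionSettingsToday:
    """Stand-in for the current execution settings (no parallel flag)."""
    temperature: Optional[float] = None
    tool_choice: Optional[str] = None

@dataclass
class ExecutionSettingsPatched(ExecutionSettingsToday):
    """The same settings with the missing flag added for local testing."""
    parallel_tool_calls: Optional[bool] = None

def prepare_settings_dict(settings) -> dict:
    # Mimics how non-None settings become kwargs for the SDK request.
    return {k: v for k, v in asdict(settings).items() if v is not None}

before = prepare_settings_dict(
    ExecutionSettingsToday(temperature=0.0, tool_choice="auto")
)
after = prepare_settings_dict(
    ExecutionSettingsPatched(temperature=0.0, tool_choice="auto",
                             parallel_tool_calls=False)
)
print(before)  # no way to express the flag
print(after)   # the flag flows through to the request
```

The point is that the fix looks small: once the field exists on the settings model, the existing flattening logic should forward it to the SDK unchanged.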

Starting from /semantic_kernel/connectors/ai/open_ai/prompt_execution_settings/azure_chat_prompt_execution_settings.py


I can get to the parent class which is in /semantic_kernel/connectors/ai/open_ai/prompt_execution_settings/open_ai_prompt_execution_settings.py


Then when doing a request I can verify the parameter is sent in /semantic_kernel/connectors/ai/open_ai/services/open_ai_handler.py


And in turn that gets into OpenAI package: /openai/resources/chat/completions.py


This seems to work as expected, but I cannot fully confirm it without checking how Semantic Kernel itself handles the tool calls on the response.
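One way to confirm the behavior end-to-end, once the flag is honored, would be to inspect how many tool calls come back on the response message. A mocked object stands in for a real chat completion here, so this runs without a network call:

```python
from types import SimpleNamespace

# Mocked stand-in for a chat completion whose assistant turn requested a tool call.
response = SimpleNamespace(
    choices=[SimpleNamespace(
        message=SimpleNamespace(
            tool_calls=[
                SimpleNamespace(function=SimpleNamespace(name="get_weather"))
            ]
        )
    )]
)

tool_calls = response.choices[0].message.tool_calls or []
# With parallel_tool_calls=False the model should emit at most one
# tool call per assistant turn.
assert len(tool_calls) <= 1
print(len(tool_calls))
```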

Expected behavior
A way to disable parallel function calling until the underlying bug from OpenAI is resolved.


Platform

  • OS: Any
  • IDE: Any
  • Language: Python
  • Source: Any recent version of Semantic Kernel


Labels

ai connector (Anything related to AI connectors), bug (Something isn't working), python (Pull requests for the Python Semantic Kernel)