Feature Description
When using LLM serving frameworks such as vLLM or MLC-LLM, or services that host open-source models like DeepInfra, Fireworks, or OpenRouter, you sometimes run into an issue where the serving framework doesn't have a dedicated tool parser for the model, even though the model itself supports tool use. This usually means the `tools` parameter in their OpenAI-compatible API either doesn't work or causes an error, and you have to parse the chat completion for tool calls manually after the request.
While creating a custom provider can address this, it requires ongoing maintenance and means missing out on new provider features unless they are implemented manually.
To address this, I suggest adding a setting for a custom tool parser that can be passed to the OpenAI provider when in `compatible` mode. This feature would allow you to define a function that processes either the response message when using `generateText`, or a stream when using `streamText`, to determine whether the response includes a tool call. This way, you keep all the benefits of the SDK's tool features while serving your own models or using an open-source model hosting service.
Example usage of a basic parser:

```ts
import { createOpenAI } from '@ai-sdk/openai';
import { isParsableJson } from '@ai-sdk/provider-utils';
import type { LanguageModelV1StreamPart } from '@ai-sdk/provider';

const llama = createOpenAI({
  // other settings
  compatibility: 'compatible',
  textToolParser: (response: string) => {
    if (!response.startsWith('<|python_tag|>')) return [];
    response = response.replace('<|python_tag|>', '');
    if (!isParsableJson(response)) {
      return [];
    }
    const parsed: Array<{ name: string; arguments: Record<string, unknown> }> =
      JSON.parse(response);
    return parsed;
  },
  streamToolParser: (chunk: LanguageModelV1StreamPart) => {
    if (chunk.type !== 'text-delta') return;
    if (chunk.textDelta.startsWith('<|python_tag|>')) {
      // rest of the implementation
    }
  },
});
```
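For clarity, the text-parsing logic above can also be sketched as a standalone function, independent of the SDK wiring. This is an illustrative sketch assuming Llama 3.1's `<|python_tag|>` format with a JSON payload; the `ParsedToolCall` shape and the `parseToolCalls` name are hypothetical, not part of any SDK API:

```typescript
// Hypothetical standalone version of the textToolParser logic above.
// Extracts tool calls from raw model text in Llama 3.1's
// <|python_tag|> + JSON format; returns [] for plain-text responses.
type ParsedToolCall = { name: string; arguments: Record<string, unknown> };

function parseToolCalls(response: string): ParsedToolCall[] {
  const TAG = '<|python_tag|>';
  if (!response.startsWith(TAG)) return [];
  const body = response.slice(TAG.length);
  let parsed: unknown;
  try {
    parsed = JSON.parse(body);
  } catch {
    return []; // malformed JSON: treat the response as plain text
  }
  // Accept either a single call object or an array of calls.
  const calls = Array.isArray(parsed) ? parsed : [parsed];
  return calls.filter(
    (c): c is ParsedToolCall =>
      typeof c === 'object' && c !== null && typeof (c as any).name === 'string'
  );
}
```

The point is that the parser is a pure function from model text to structured tool calls, which makes it easy to unit-test per model family before wiring it into the provider.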
Use Case
- If you're hosting your own models and want to incorporate tools into your project, but the serving framework doesn't support tool use for that model or tools aren't included in the chat template.
- When using a model hosting service that either doesn’t support tools for specific models or lacks tool functionality altogether.
- In projects where you use different models with different tool response formats, making it difficult to parse and handle tool calls in a consistent way.
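To illustrate the last point, parsers could be keyed by model family, since each family emits tool calls in its own format. This is a hypothetical sketch, not an SDK API: the `[TOOL_CALLS]` and `<|python_tag|>` markers are examples of Mistral-style and Llama-style output, and all names below are illustrative.

```typescript
// Hypothetical per-model parser registry. Each entry strips that
// family's tool-call marker and parses the JSON payload that follows.
type ToolCall = { name: string; arguments: Record<string, unknown> };

function parseTagged(tag: string, text: string): ToolCall[] {
  if (!text.startsWith(tag)) return [];
  try {
    const parsed = JSON.parse(text.slice(tag.length));
    return Array.isArray(parsed) ? parsed : [parsed];
  } catch {
    return []; // no valid JSON payload after the marker
  }
}

const toolParsers: Record<string, (text: string) => ToolCall[]> = {
  // Llama 3.1: <|python_tag|> followed by a JSON call object
  llama: (text) => parseTagged('<|python_tag|>', text),
  // Mistral: [TOOL_CALLS] followed by a JSON array of calls
  mistral: (text) => parseTagged('[TOOL_CALLS]', text),
};

function parseFor(model: string, text: string): ToolCall[] {
  const parser = toolParsers[model];
  return parser ? parser(text) : [];
}
```

With a setting like the one proposed, each model's provider instance would simply receive the matching entry from such a registry, so the rest of the application handles tool calls uniformly.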
Additional context
For my team’s project, we host several open-source models and switch between them based on the situation or context—Llama 3.1 for general conversations, Mistral for RAG use cases, Qwen for coding, etc. This has led to a lot of iteration on custom providers to support tool use across models, so having this level of customization natively in the SDK would be great. It would let us use the AI SDK for our internal LLM tooling as well (benchmarks, RAG arenas).
I'm not married to the example above; we can discuss a different implementation. I'd be more than happy to work on this and submit a PR if that's helpful.