Toolify——为任意LLM API添加完善的Function Calling支持

摘要：Toolify，一个通过提示词让任何语言模型API都支持函数调用的中间件。它能解决很多LLM接口原生不支持函数调用的痛点，自动处理提示词构造、响应解析与多轮调用，让你轻松为任何模型赋予强大的工具使用能力。

前言

不知道大家有没有碰到过这样的问题：有一个LLM的接口，可它却不支持函数调用(Function calling)，导致很多需要函数调用才可以使用的程序用不上这些接口或LLM

对于很多非官方的API接口或者本地部署的小模型，这些问题是很普遍的。

Toolify就是为了解决本问题而生的！通过简单的提示词，可以让任何语言模型的API都支持函数调用！

介绍

在以前，让模型调用函数最好的方案是使用原生的函数调用接口，例如，OpenAI、Gemini、Anthropic都提供了类似的原生接口。因为通过提示词要求模型输出一个方便程序解析的json字段有时候会出现问题，例如，他们可能会自顾自地加上注释/参数数量写不对，等等。

但现在，模型的脑子越来越好使了，哪怕不使用原生的函数调用接口，通过提示词让模型调用函数在大部分情况下也是可以的！当然，某些比较蠢的模型例外，~~比如grok~~

事实上，Roo Code、Open WebUI等项目中都有类似的实现，但大多没有通用化。Toolify就是为了打破这种情况而诞生的。你可以在Toolify中添加多个兼容OpenAI格式的API上游渠道，然后只需要将包含函数调用的OpenAI格式请求发送至Toolify，Toolify会自动地帮你处理函数调用的提示词、解析上游LLM返回的函数调用请求、返回调用请求给客户端。当客户端回传函数结果时，Toolify也会自动地把调用结果添加至提示词中，以便LLM继续响应你的消息。工作流程大致如下：

sequenceDiagram
    participant Client   as 客户端
    participant Toolify  as Toolify
    participant LLM      as 上游 LLM

    Client->>Toolify: 发送请求（含函数调用需求）
    Toolify->>LLM: 构造提示词并转发
    LLM-->>Toolify: 返回 function_call 请求
    Toolify->>Client: 将 function_call 请求转发给客户端
    Client->>Client: 内部执行函数
    Client-->>Toolify: 返回函数执行结果
    Toolify->>LLM: 在提示中加入执行结果，继续对话
    LLM-->>Toolify: 返回最终响应
    Toolify->>Client: 将最终响应返回给客户端

开源地址

本项目地址为：https://github.com/funnycups/Toolify

觉得有用请务必给个star，谢谢喵

功能支持

允许模型在输出的任何地方立刻中断并调用函数
兼容模型输出的<think>思维链，思维链中的内容不会误触发函数调用
支持一次性多函数调用
支持多上游渠道配置
支持模型映射和重定向
对于不支持developer role的上游，将developer role转换为system role
支持token计数
函数调用解析失败时，可以自动重试
支持Docker Compose快速部署

更新日志

2026/1/11，支持自动重试函数调用，现在可以直接使用ghcr的镜像而不必手动构建了

2025/8/27，修复函数调用的相当一部分问题，现在可以配合NewAPI中转用上Claude Code了，不过由于提示词实现的函数调用，效果并不是特别好

2025/7/31，添加模型映射

配置

项目给出的默认配置文件已经涵盖了绝大部分主要功能的示例，只需要将之从config.example.yaml重命名为config.yaml并修改渠道即可使用：

# Server configuration
server:
  port: 8000                    # Server listening port
  host: "0.0.0.0"              # Server listening address
  timeout: 180                  # Request timeout (seconds)

# Upstream OpenAI compatible service configuration
upstream_services:
  - name: "openai"
    base_url: "https://api.openai.com/v1"
    api_key: "your-openai-api-key-here"
    description: "OpenAI Official Service"
    is_default: true
    models:
      - "gpt-3.5-turbo"
      - "gpt-3.5-turbo-16k"
      - "gpt-4"
      - "gpt-4-turbo"
      - "gpt-4o"
      - "gpt-4o-mini"

  - name: "google"
    base_url: "https://generativelanguage.googleapis.com/v1beta/openai/"
    api_key: "your-google-api-key-here"
    description: "Google Gemini Service"
    is_default: false
    models:
      # Use alias "gemini-2.5" to randomly select one of the following models
      - "gemini-2.5:gemini-2.5-pro"
      - "gemini-2.5:gemini-2.5-flash"
      # You can also define models that can be used directly
      - "gemini-2.5-pro"
      - "gemini-2.5-flash"

# Client authentication configuration
client_authentication:
  allowed_keys:
    - "sk-my-secret-key-1"
    - "sk-my-secret-key-2"

# Feature configuration
features:
  enable_function_calling: true  # Enable function calling feature
  enable_logging: true           # Enable logging
  convert_developer_to_system: true  # Whether to convert the developer role to the system role
  key_passthrough: false          # If true, directly forward client-provided API key to upstream instead of using configured upstream key
  model_passthrough: false         # If true, forward all requests directly to the 'openai' upstream service, ignoring model-based routing
  # Custom prompt template (optional). If not provided, the default prompt will be used.
  # prompt_template: |
  #   Your custom prompt template here...
  #   Must include {tools_list} and {trigger_signal} placeholders
  enable_fc_error_retry: false       # Enable automatic retry for function call parsing errors (default: false)
  fc_error_retry_max_attempts: 3     # Maximum retry attempts (1-10, default: 3)
  # Custom error retry prompt template (optional). If not provided, the default prompt will be used.
  # Must contain {error_details} and {original_response} placeholders.
  # fc_error_retry_prompt_template: |
  #   Your custom prompt template here...

写在最后

如果你使用的LLM接口原生支持Function calling(大部分官方接口都应当支持)，请使用原生方案，效果比通过单纯的提示词好。
函数调用的稳定性受模型自身性能影响，例如，Claude使用Toolify调用函数非常稳定，而有朋友反馈Grok调用一直出错。哪怕提示词中已经严格明确调用方法，有些能力较差的模型仍然会调用错误，这个是很难解决的。如果有更好的函数调用提示词，欢迎PR！

本文作者：小欢

本文链接：Toolify——为任意LLM API添加完善的Function Calling支持 - https://www.cups.moe/archives/toolify.html

版权声明：如无特别声明，本文即为原创文章，仅代表个人观点，版权归小欢博客所有，遵循知识共享署名-相同方式共享 4.0 国际许可协议。转载请注明出处！

手机上阅读

最后一次更新于2026-01-11

小欢博客

Fly your dreams