Prerequisites
Feature Description
Add native support for tool calling (function calling) for the Kimi-K2 series, including Kimi-K2-Thinking and possibly Kimi-K2-Instruct.
Motivation
- Tool calling: Kimi-K2-Thinking's model card says it is capable of "maintaining stable tool-use across 200–300 sequential calls", but we currently have no native support for it and fall back to the generic JSON method.
- Reasoning: Currently we must pass --special to make thinking work, as described in the Unsloth documentation.
Possible Implementation
vLLM: https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/openai/tool_parsers/kimi_k2_tool_parser.py
ik_llama.cpp implemented it earlier, but has since switched to mainline function calling: ikawrakow/ik_llama.cpp#628
I'm trying to implement it at https://github.com/KiruyaMomochi/llama.cpp/tree/kimi-k2-thinking by naively copying DeepSeek-V3.1's implementation. However, Kimi-K2 seems to use a different function name syntax from DeepSeek. I also get an extra <|tool_calls_section_end|> token in the output, possibly because of the --special flag.
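To make the parsing problem concrete, here is a rough Python sketch of what a native parser would need to do. It is not taken from the vLLM parser or from my branch. Only <|tool_calls_section_end|> is confirmed by what I see in the output; the other marker names and the functions.NAME:INDEX id layout are my assumptions about the Kimi-K2 format, and parse_kimi_tool_calls is a hypothetical helper name.

```python
import json
import re

# Only SECTION_END is confirmed in this issue; the other markers are assumed.
SECTION_BEGIN = "<|tool_calls_section_begin|>"
SECTION_END = "<|tool_calls_section_end|>"

# Assumed layout of a single call:
# <|tool_call_begin|>functions.NAME:INDEX<|tool_call_argument_begin|>{json}<|tool_call_end|>
CALL_RE = re.compile(
    r"<\|tool_call_begin\|>\s*(?P<id>[\w.\-]+:\d+)\s*"
    r"<\|tool_call_argument_begin\|>\s*(?P<args>.*?)\s*<\|tool_call_end\|>",
    re.DOTALL,
)


def parse_kimi_tool_calls(text):
    """Split raw model output into (content, tool_calls)."""
    start = text.find(SECTION_BEGIN)
    if start == -1:
        # No tool section: drop a stray end marker if one leaked through.
        return text.replace(SECTION_END, "").strip(), []

    end = text.find(SECTION_END, start)
    if end == -1:
        end = len(text)
    section = text[start + len(SECTION_BEGIN):end]

    calls = []
    for match in CALL_RE.finditer(section):
        call_id = match.group("id")
        # "functions.get_weather:0" -> "get_weather"
        name = call_id.rsplit(":", 1)[0].split(".", 1)[-1]
        calls.append({
            "id": call_id,
            "name": name,
            "arguments": json.loads(match.group("args")),
        })
    return text[:start].strip(), calls
```

If the assumed layout is right, the function name is embedded in the call id rather than emitted as a separate field, which would explain why a parser copied from DeepSeek-V3.1 does not match it. The first branch also shows one way to tolerate the stray <|tool_calls_section_end|> token.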