Feature Request: Kimi-K2-Thinking reasoning and tool calling support #17155

@KiruyaMomochi

Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

Natively support tool calling (function calling) for the Kimi-K2 series, including Kimi-K2-Thinking and possibly Kimi-K2-Instruct.

Motivation

  • Tool calling: Kimi-K2-Thinking's model card says the model is capable of "maintaining stable tool-use across 200–300 sequential calls", but llama.cpp currently has no native support for its tool-call format and falls back to the generic JSON method.
  • Reasoning: Currently, thinking only works when the --special flag is passed, as described in the Unsloth documentation.

Possible Implementation

vLLM: https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/openai/tool_parsers/kimi_k2_tool_parser.py

ik_llama.cpp implemented this earlier, but has since switched to mainline function calling: ikawrakow/ik_llama.cpp#628

I'm trying to implement it at https://github.com/KiruyaMomochi/llama.cpp/tree/kimi-k2-thinking by naively copying DeepSeek-V3.1's implementation. However, Kimi-K2 seems to use a different function-name syntax than DeepSeek. I also get an extra <|tool_calls_section_end|> token in the output, possibly because of the --special flag.
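To make the syntax difference concrete, here is a rough Python sketch of how a Kimi-K2 tool-call section might be parsed. It is not taken from llama.cpp or copied from the vLLM parser. The marker tokens other than <|tool_calls_section_end|> and the `functions.NAME:INDEX` ID layout are assumptions based on my reading of the model card and should be checked against the tokenizer and chat template. `parse_kimi_tool_calls` is a hypothetical helper name.

```python
import json
import re

# Marker tokens are ASSUMED from the Kimi-K2 model card; verify against the tokenizer.
SECTION_BEGIN = "<|tool_calls_section_begin|>"
SECTION_END = "<|tool_calls_section_end|>"

# Assumed layout of one call:
# <|tool_call_begin|>functions.NAME:INDEX<|tool_call_argument_begin|>{json}<|tool_call_end|>
CALL_RE = re.compile(
    r"<\|tool_call_begin\|>\s*(?P<id>[^\s<]+?)\s*"
    r"<\|tool_call_argument_begin\|>\s*(?P<args>.*?)\s*"
    r"<\|tool_call_end\|>",
    re.DOTALL,
)


def parse_kimi_tool_calls(text):
    """Split model output into (content, calls). Hypothetical helper."""
    start = text.find(SECTION_BEGIN)
    if start == -1:
        # No tool-call section: everything is plain content.
        return text, []
    content = text[:start]
    end = text.find(SECTION_END, start)
    stop = end if end != -1 else len(text)
    section = text[start + len(SECTION_BEGIN):stop]

    calls = []
    for m in CALL_RE.finditer(section):
        call_id = m.group("id")
        # "functions.get_weather:0" -> "get_weather" (assumed ID layout)
        name = call_id.rsplit(":", 1)[0].split(".", 1)[-1]
        calls.append({
            "id": call_id,
            "name": name,
            "arguments": json.loads(m.group("args")),
        })
    return content, calls
```

Because the sketch cuts the section at the closing marker and discards it, a trailing <|tool_calls_section_end|> would not leak into the content. Whether that corresponds to the extra token I am seeing with --special is still unverified.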


Labels: enhancement
