Fix kimi k2 function calling format #9606

Merged
zhyncs merged 3 commits into sgl-project:main from XiaotongJiang:kimik2-tool-calling-format
Aug 26, 2025

Conversation

@XiaotongJiang
Collaborator

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Summary of Changes

Hello @XiaotongJiang, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses an issue with the function calling format for the Kimi-K2 model. It modifies the serving_chat.py file to align the tool_call.id generation with Kimi-K2's expected functions.{name}:{index} format, ensuring proper communication and parsing of tool calls. The changes cover both non-streaming and streaming tool call processes. Additionally, new unit tests have been introduced to verify the correctness of this updated formatting, enhancing the reliability of Kimi-K2 integration.

Highlights

  • Standardized Kimi-K2 Tool Call ID Format: The tool_call.id generation logic has been updated to specifically accommodate the 'kimi_k2' parser, ensuring that tool call IDs adhere to the functions.{name}:{index} format required by Kimi-K2 models. For other parsers, the existing UUID-based ID generation is maintained.
  • Enhanced ToolCall Object with Index: The ToolCall object now includes an index attribute, which is populated from call_info.tool_index when available, providing more detailed information for tool calls.
  • Comprehensive Unit Test Coverage: New unit tests have been added to rigorously validate the correct tool_call.id formatting for Kimi-K2, covering both non-streaming and streaming scenarios to ensure robust integration.
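
As a rough illustration of the highlights above, the sketch below models a tool call that carries an index and checks whether an ID has the functions.{name}:{index} shape. ToolCallSketch and parse_kimi_k2_id are hypothetical names made up for this example, not sglang's actual classes or helpers, and the regular expression is only one assumed reading of the format.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumed reading of the Kimi-K2 ID shape: functions.{name}:{index}
KIMI_K2_ID_PATTERN = re.compile(r"functions\.(?P<name>.+):(?P<index>\d+)")


@dataclass
class ToolCallSketch:
    """Hypothetical stand-in for a tool call object that carries an index."""

    id: str
    name: str
    arguments: str
    index: Optional[int] = None  # filled from the parser's tool index when known


def parse_kimi_k2_id(tool_call_id: str) -> Optional[Tuple[str, int]]:
    """Return (name, index) if the ID has the Kimi-K2 shape, else None."""
    match = KIMI_K2_ID_PATTERN.fullmatch(tool_call_id)
    if match is None:
        return None
    return match.group("name"), int(match.group("index"))


# Build a call the way the summary describes: the ID embeds name and index.
call = ToolCallSketch(
    id="functions.get_weather:0",
    name="get_weather",
    arguments='{"city": "Paris"}',
    index=0,
)
```

A check in this style could back a unit test: parse the generated ID and compare the recovered name and index with the fields on the tool call itself.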

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request correctly implements the function calling format for kimi_k2 for both streaming and non-streaming responses. The changes are well-tested with new unit tests covering both scenarios. I have one suggestion to refactor duplicated code to improve maintainability.

Comment on lines +936 to +940
if self.tokenizer_manager.server_args.tool_call_parser == "kimi_k2":
    # Align with Kimi-K2 format: functions.{name}:{index}
    tool_call_id = f"functions.{call_item.name}:{call_item.tool_index}"
else:
    tool_call_id = f"call_{uuid.uuid4().hex[:24]}"

medium

The logic for generating the tool_call_id is duplicated here and in the _process_tool_calls method (lines 812-815). To improve maintainability and reduce redundancy, consider extracting this logic into a private helper method. This would centralize the ID generation and make future changes easier to implement and track.
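
One possible shape for the suggested refactor is sketched below. The function name make_tool_call_id and its parameters are assumptions made for this example; in the PR the logic would more likely live in a private method on the serving class and read the parser name from self.tokenizer_manager.server_args.tool_call_parser. It is written as a plain function here so it runs on its own.

```python
import uuid


def make_tool_call_id(tool_call_parser: str, name: str, tool_index: int) -> str:
    """Build a tool_call.id for the configured parser (hypothetical helper)."""
    if tool_call_parser == "kimi_k2":
        # Align with Kimi-K2 format: functions.{name}:{index}
        return f"functions.{name}:{tool_index}"
    # Every other parser keeps the existing UUID-based ID.
    return f"call_{uuid.uuid4().hex[:24]}"


# Both the non-streaming and the streaming path could then share one call site:
kimi_id = make_tool_call_id("kimi_k2", "get_weather", 0)
other_id = make_tool_call_id("some_other_parser", "get_weather", 0)
```

With a helper like this, a later change to the ID format would touch one place instead of two.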

@zhaochenyang20
Collaborator

I just refactored these docs:

https://docs.sglang.ai/advanced_features/function_calling.html

Could you check whether kimi k2 can go through all the commands in the docs?

Collaborator

@JustinTong0323 JustinTong0323 left a comment

LGTM, thanks Xiaotong!

@zhyncs zhyncs merged commit 0936c76 into sgl-project:main Aug 26, 2025
219 of 239 checks passed
MahmoudAshraf97 pushed a commit to MahmoudAshraf97/sglang that referenced this pull request Sep 8, 2025