Skip to content

chat : add Granite 4.0 chat template with correct tool_call role mapping#20804

Merged
pwilkin merged 4 commits into
ggml-org:masterfrom
jesus-talavera-ibm:fix/granite-template-upstream
Apr 2, 2026
Merged

chat : add Granite 4.0 chat template with correct tool_call role mapping#20804
pwilkin merged 4 commits into
ggml-org:masterfrom
jesus-talavera-ibm:fix/granite-template-upstream

Conversation

@jesus-talavera-ibm

@jesus-talavera-ibm jesus-talavera-ibm commented Mar 20, 2026

Copy link
Copy Markdown
Contributor

Summary

Add LLM_CHAT_TEMPLATE_GRANITE_4_0 as a parallel path to the existing Granite 3.x template, fixing assistant_tool_call role mapping for Granite 4.0 models when --jinja is not used.

Problem

The Granite 4.0 Jinja template maps assistant_tool_call to <|start_of_role|>assistant<|end_of_role|><|tool_call|>, but the C++ template handler emits the literal role <|start_of_role|>assistant_tool_call<|end_of_role|> which the model does not recognize. This silently breaks tool calling on the C++ template path (without --jinja).

Changes

  • Rename LLM_CHAT_TEMPLATE_GRANITELLM_CHAT_TEMPLATE_GRANITE_3_X (preserves existing 3.x behavior unchanged)
  • Add LLM_CHAT_TEMPLATE_GRANITE_4_0 enum, map entry ("granite-4.0"), and handler
  • Detection: <|start_of_role|> + (<tool_call> or <tools>) → 4.0; otherwise → 3.x
  • Add production Granite 4.0 Jinja template (shared by granite-4.0-tiny, h-small, micro)
  • Add tests for both 3.x and 4.0 template paths (C++ and Jinja) in test-chat-template.cpp
  • Add Granite 4.0 peg parser test in test-chat.cpp

Files changed

File Change
src/llama-chat.h Add GRANITE_3_X / GRANITE_4_0 enum values
src/llama-chat.cpp Map entries, detection logic, and parallel handlers
models/templates/ibm-granite-granite-4.0.jinja Production Granite 4.0 Jinja template
tests/test-chat-template.cpp 3.x backwards compat + 4.0 fix tests (C++ and Jinja)
tests/test-chat.cpp Granite 4.0 peg parser test

Test plan

  • test-chat-template — Granite 3.x backwards compat (C++ + Jinja): PASS
  • test-chat-template — Granite 4.0 fix (C++ + Jinja): PASS
  • test-chat — Granite 3.3 peg parser (reasoning + basic chat): PASS
  • test-chat — Granite 4.0 peg parser (basic chat + tool calling): PASS
  • Regression tested with actual model weights: Granite 3.0, 3.1, 3.3, 4.0 (basic chat, tool calling with --jinja, C++ path without --jinja)

🤖 Generated with Claude Code

EDIT: I used Claude Code as start point and then iterated from it manually. You can check the whole process here: jesus-talavera-ibm#1

Introduce `LLM_CHAT_TEMPLATE_GRANITE_4_0` alongside the existing Granite
3.x template (renamed `LLM_CHAT_TEMPLATE_GRANITE_3_X`).

The Granite 4.0 Jinja template uses `<tool_call>` XML tags and maps the
`assistant_tool_call` role to `<|start_of_role|>assistant<|end_of_role|><|tool_call|>`.
Without a matching C++ handler, the fallback path emits the literal role
`assistant_tool_call` which the model does not recognize, breaking tool
calling when `--jinja` is not used.

Changes:
- Rename `LLM_CHAT_TEMPLATE_GRANITE` to `LLM_CHAT_TEMPLATE_GRANITE_3_X`
  (preserves existing 3.x behavior unchanged)
- Add `LLM_CHAT_TEMPLATE_GRANITE_4_0` enum, map entry, and handler
- Detection: `<|start_of_role|>` + (`<tool_call>` or `<tools>`) → 4.0,
  otherwise → 3.x
- Add production Granite 4.0 Jinja template
- Add tests for both 3.x and 4.0 template paths (C++ and Jinja)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@gabe-l-hart

Copy link
Copy Markdown
Collaborator

🤖 Generated with Claude Code

In general, this project doesn't accept fully generated PRs. For context, @jesus-talavera-ibm and I iterated on this on jesus-talavera-ibm#1, so there has been significant manual review of the generated code.

@ggml-gh-bot

This comment was marked as resolved.

@github-actions github-actions Bot added the testing Everything test related label Mar 20, 2026

@gabe-l-hart gabe-l-hart left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should refactor the test-chat-template.cpp changes to match the other tests, but otherwise this looks good!

Comment thread src/llama-chat.cpp
{ "exaone-moe", LLM_CHAT_TEMPLATE_EXAONE_MOE },
{ "rwkv-world", LLM_CHAT_TEMPLATE_RWKV_WORLD },
{ "granite", LLM_CHAT_TEMPLATE_GRANITE },
{ "granite", LLM_CHAT_TEMPLATE_GRANITE_3_X },

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal of this change is to carefully delineate between template for the 3.x series and the 4.0 series (there may be further 4.x template changes in the future, thus being explicit about 4_0).

As written, this is nicely backwards-compatible since the enum name is only used internally and the string name is mapped consistently.

Comment thread tests/test-chat-template.cpp Outdated
}
}

// Test Granite 3.x template (LLM_CHAT_TEMPLATE_GRANITE_3_X) — backwards compatibility

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have caught this in pre-review, but I think we should try to rewrite these tests to match the ones above that all follow a standard format and use common test logic.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, Gabe! I've adapted the tests here: 3f2a48a

@gabe-l-hart gabe-l-hart left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this change looks good.

@pwilkin @ngxson @ggerganov We've validated this against both 3.x and 4.0 models and the code/tests look tight, so I think it's ready for CI and maintainer review.

@ggerganov ggerganov added the merge ready A maintainer can use this label to indicate that they consider the changes final and ready to merge. label Apr 2, 2026
@pwilkin pwilkin merged commit 6137c32 into ggml-org:master Apr 2, 2026
45 of 48 checks passed
slartibardfast pushed a commit to slartibardfast/llama.cpp that referenced this pull request Apr 12, 2026
…ing (ggml-org#20804)

* chat : add Granite 4.0 chat template with correct tool_call role mapping

Introduce `LLM_CHAT_TEMPLATE_GRANITE_4_0` alongside the existing Granite
3.x template (renamed `LLM_CHAT_TEMPLATE_GRANITE_3_X`).

The Granite 4.0 Jinja template uses `<tool_call>` XML tags and maps the
`assistant_tool_call` role to `<|start_of_role|>assistant<|end_of_role|><|tool_call|>`.
Without a matching C++ handler, the fallback path emits the literal role
`assistant_tool_call` which the model does not recognize, breaking tool
calling when `--jinja` is not used.

Changes:
- Rename `LLM_CHAT_TEMPLATE_GRANITE` to `LLM_CHAT_TEMPLATE_GRANITE_3_X`
  (preserves existing 3.x behavior unchanged)
- Add `LLM_CHAT_TEMPLATE_GRANITE_4_0` enum, map entry, and handler
- Detection: `<|start_of_role|>` + (`<tool_call>` or `<tools>`) → 4.0,
  otherwise → 3.x
- Add production Granite 4.0 Jinja template
- Add tests for both 3.x and 4.0 template paths (C++ and Jinja)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Code review: follow standard format and use common logic in test-chat-template.cpp

* Rename custom_conversation variable for extra_conversation to give it a more meaningful name

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
…ing (ggml-org#20804)

* chat : add Granite 4.0 chat template with correct tool_call role mapping

Introduce `LLM_CHAT_TEMPLATE_GRANITE_4_0` alongside the existing Granite
3.x template (renamed `LLM_CHAT_TEMPLATE_GRANITE_3_X`).

The Granite 4.0 Jinja template uses `<tool_call>` XML tags and maps the
`assistant_tool_call` role to `<|start_of_role|>assistant<|end_of_role|><|tool_call|>`.
Without a matching C++ handler, the fallback path emits the literal role
`assistant_tool_call` which the model does not recognize, breaking tool
calling when `--jinja` is not used.

Changes:
- Rename `LLM_CHAT_TEMPLATE_GRANITE` to `LLM_CHAT_TEMPLATE_GRANITE_3_X`
  (preserves existing 3.x behavior unchanged)
- Add `LLM_CHAT_TEMPLATE_GRANITE_4_0` enum, map entry, and handler
- Detection: `<|start_of_role|>` + (`<tool_call>` or `<tools>`) → 4.0,
  otherwise → 3.x
- Add production Granite 4.0 Jinja template
- Add tests for both 3.x and 4.0 template paths (C++ and Jinja)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Code review: follow standard format and use common logic in test-chat-template.cpp

* Rename custom_conversation variable for extra_conversation to give it a more meaningful name

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 1, 2026
…ing (ggml-org#20804)

* chat : add Granite 4.0 chat template with correct tool_call role mapping

Introduce `LLM_CHAT_TEMPLATE_GRANITE_4_0` alongside the existing Granite
3.x template (renamed `LLM_CHAT_TEMPLATE_GRANITE_3_X`).

The Granite 4.0 Jinja template uses `<tool_call>` XML tags and maps the
`assistant_tool_call` role to `<|start_of_role|>assistant<|end_of_role|><|tool_call|>`.
Without a matching C++ handler, the fallback path emits the literal role
`assistant_tool_call` which the model does not recognize, breaking tool
calling when `--jinja` is not used.

Changes:
- Rename `LLM_CHAT_TEMPLATE_GRANITE` to `LLM_CHAT_TEMPLATE_GRANITE_3_X`
  (preserves existing 3.x behavior unchanged)
- Add `LLM_CHAT_TEMPLATE_GRANITE_4_0` enum, map entry, and handler
- Detection: `<|start_of_role|>` + (`<tool_call>` or `<tools>`) → 4.0,
  otherwise → 3.x
- Add production Granite 4.0 Jinja template
- Add tests for both 3.x and 4.0 template paths (C++ and Jinja)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Code review: follow standard format and use common logic in test-chat-template.cpp

* Rename custom_conversation variable for extra_conversation to give it a more meaningful name

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
ljubomirj pushed a commit to ljubomirj/llama.cpp that referenced this pull request May 6, 2026
…ing (ggml-org#20804)

* chat : add Granite 4.0 chat template with correct tool_call role mapping

Introduce `LLM_CHAT_TEMPLATE_GRANITE_4_0` alongside the existing Granite
3.x template (renamed `LLM_CHAT_TEMPLATE_GRANITE_3_X`).

The Granite 4.0 Jinja template uses `<tool_call>` XML tags and maps the
`assistant_tool_call` role to `<|start_of_role|>assistant<|end_of_role|><|tool_call|>`.
Without a matching C++ handler, the fallback path emits the literal role
`assistant_tool_call` which the model does not recognize, breaking tool
calling when `--jinja` is not used.

Changes:
- Rename `LLM_CHAT_TEMPLATE_GRANITE` to `LLM_CHAT_TEMPLATE_GRANITE_3_X`
  (preserves existing 3.x behavior unchanged)
- Add `LLM_CHAT_TEMPLATE_GRANITE_4_0` enum, map entry, and handler
- Detection: `<|start_of_role|>` + (`<tool_call>` or `<tools>`) → 4.0,
  otherwise → 3.x
- Add production Granite 4.0 Jinja template
- Add tests for both 3.x and 4.0 template paths (C++ and Jinja)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Code review: follow standard format and use common logic in test-chat-template.cpp

* Rename custom_conversation variable for extra_conversation to give it a more meaningful name

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
my-other-github-account pushed a commit to my-other-github-account/llama.cpp that referenced this pull request May 15, 2026
…ing (ggml-org#20804)

* chat : add Granite 4.0 chat template with correct tool_call role mapping

Introduce `LLM_CHAT_TEMPLATE_GRANITE_4_0` alongside the existing Granite
3.x template (renamed `LLM_CHAT_TEMPLATE_GRANITE_3_X`).

The Granite 4.0 Jinja template uses `<tool_call>` XML tags and maps the
`assistant_tool_call` role to `<|start_of_role|>assistant<|end_of_role|><|tool_call|>`.
Without a matching C++ handler, the fallback path emits the literal role
`assistant_tool_call` which the model does not recognize, breaking tool
calling when `--jinja` is not used.

Changes:
- Rename `LLM_CHAT_TEMPLATE_GRANITE` to `LLM_CHAT_TEMPLATE_GRANITE_3_X`
  (preserves existing 3.x behavior unchanged)
- Add `LLM_CHAT_TEMPLATE_GRANITE_4_0` enum, map entry, and handler
- Detection: `<|start_of_role|>` + (`<tool_call>` or `<tools>`) → 4.0,
  otherwise → 3.x
- Add production Granite 4.0 Jinja template
- Add tests for both 3.x and 4.0 template paths (C++ and Jinja)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Code review: follow standard format and use common logic in test-chat-template.cpp

* Rename custom_conversation variable for extra_conversation to give it a more meaningful name

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
my-other-github-account pushed a commit to my-other-github-account/llama.cpp that referenced this pull request May 15, 2026
…ing (ggml-org#20804)

* chat : add Granite 4.0 chat template with correct tool_call role mapping

Introduce `LLM_CHAT_TEMPLATE_GRANITE_4_0` alongside the existing Granite
3.x template (renamed `LLM_CHAT_TEMPLATE_GRANITE_3_X`).

The Granite 4.0 Jinja template uses `<tool_call>` XML tags and maps the
`assistant_tool_call` role to `<|start_of_role|>assistant<|end_of_role|><|tool_call|>`.
Without a matching C++ handler, the fallback path emits the literal role
`assistant_tool_call` which the model does not recognize, breaking tool
calling when `--jinja` is not used.

Changes:
- Rename `LLM_CHAT_TEMPLATE_GRANITE` to `LLM_CHAT_TEMPLATE_GRANITE_3_X`
  (preserves existing 3.x behavior unchanged)
- Add `LLM_CHAT_TEMPLATE_GRANITE_4_0` enum, map entry, and handler
- Detection: `<|start_of_role|>` + (`<tool_call>` or `<tools>`) → 4.0,
  otherwise → 3.x
- Add production Granite 4.0 Jinja template
- Add tests for both 3.x and 4.0 template paths (C++ and Jinja)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Code review: follow standard format and use common logic in test-chat-template.cpp

* Rename custom_conversation variable for extra_conversation to give it a more meaningful name

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
fewtarius pushed a commit to fewtarius/llama.cpp that referenced this pull request May 30, 2026
…ing (ggml-org#20804)

* chat : add Granite 4.0 chat template with correct tool_call role mapping

Introduce `LLM_CHAT_TEMPLATE_GRANITE_4_0` alongside the existing Granite
3.x template (renamed `LLM_CHAT_TEMPLATE_GRANITE_3_X`).

The Granite 4.0 Jinja template uses `<tool_call>` XML tags and maps the
`assistant_tool_call` role to `<|start_of_role|>assistant<|end_of_role|><|tool_call|>`.
Without a matching C++ handler, the fallback path emits the literal role
`assistant_tool_call` which the model does not recognize, breaking tool
calling when `--jinja` is not used.

Changes:
- Rename `LLM_CHAT_TEMPLATE_GRANITE` to `LLM_CHAT_TEMPLATE_GRANITE_3_X`
  (preserves existing 3.x behavior unchanged)
- Add `LLM_CHAT_TEMPLATE_GRANITE_4_0` enum, map entry, and handler
- Detection: `<|start_of_role|>` + (`<tool_call>` or `<tools>`) → 4.0,
  otherwise → 3.x
- Add production Granite 4.0 Jinja template
- Add tests for both 3.x and 4.0 template paths (C++ and Jinja)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Code review: follow standard format and use common logic in test-chat-template.cpp

* Rename custom_conversation variable for extra_conversation to give it a more meaningful name

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merge ready A maintainer can use this label to indicate that they consider the changes final and ready to merge. testing Everything test related

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants