llama : Add IBM granite template #10013
ngxson merged 11 commits into ggml-org:master from arch-btw:master
Conversation
@ngxson thank you so much for your help, I appreciate your time and effort! I have applied your review 👍 I have a quick question: when I run it, I'm not sure what it should look like in the context of this script, considering the other prompts have things like additional spacing as well. Edit: sorry, I was looking at the deepseek template. I guess mine failed:
You need to add an example of an unformatted chat template in
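For context, the chat template test in llama.cpp pairs each raw (Jinja) template string from a model card with the exact output the C++ re-implementation is expected to produce for a shared test conversation. The sketch below illustrates that pairing for the Granite format; both strings are my assumptions for illustration, not the merged test data.

```cpp
#include <string>

// Hypothetical pairing in the style of tests/test-chat-template.cpp:
// the raw Jinja template as shipped by the model...
const std::string granite_jinja_tmpl =
    "{%- for message in messages %}"
    "{{- '<|start_of_role|>' + message['role'] + '<|end_of_role|>\\n'"
    " + message['content'] + '<|end_of_text|>\\n' }}"
    "{%- endfor %}";

// ...and the output the C++ formatter must produce for a sample chat.
const std::string granite_expected_output =
    "<|start_of_role|>user<|end_of_role|>\nHello<|end_of_text|>\n"
    "<|start_of_role|>assistant<|end_of_role|>\nHi there<|end_of_text|>\n";
```

The test then asserts that applying the template to the shared message list reproduces the expected string byte-for-byte, which is why the reviewer is asking for both halves of the pair.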
Small change to \n
Hi @ngxson, thank you again for your guidance! Here is the output of … And here is the output of … And a prompt example:
@ngxson Thank you again and sorry about that!
ngxson left a comment:
Thanks, merging once the CI passes
Branch: GraniteThreeSupport
This is a port of the work done in llama.cpp with a slight tweak for the tool call response: ggml-org/llama.cpp#10013
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
* Add granite template to llama.cpp
* Add granite template to test-chat-template.cpp
* Update src/llama.cpp
  Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
* Update tests/test-chat-template.cpp
  Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
* Added proper template and expected output
* Small change to \n
* Add code space &
  Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
* Fix spacing
* Apply suggestions from code review
* Update src/llama.cpp
---
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>




Hi @ggerganov and @ngxson,
I'd like to contribute the IBM granite template to llama.cpp.
I've made my best effort, but I'm new to this, so could you please be so kind as to review it and see if it's alright?
I've followed the wiki instructions but feel free to make changes or provide feedback on how to improve.
The model is here: https://huggingface.co/ibm-granite/granite-3.0-8b-instruct
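A minimal sketch of the formatting behavior this PR adds: each turn is wrapped in Granite's role markers and terminated with `<|end_of_text|>`, with an open assistant header appended when a generation prompt is wanted. The function name, struct, and exact newline placement here are my assumptions; the merged code in src/llama.cpp is authoritative.

```cpp
#include <string>
#include <vector>

// Hypothetical message type standing in for llama.cpp's chat message struct.
struct chat_msg { std::string role; std::string content; };

// Sketch of Granite-style chat formatting: wrap each turn in
// <|start_of_role|>…<|end_of_role|>, close it with <|end_of_text|>,
// and optionally append an open assistant turn to prompt generation.
std::string granite_format(const std::vector<chat_msg> & chat, bool add_generation_prompt) {
    std::string out;
    for (const auto & m : chat) {
        out += "<|start_of_role|>" + m.role + "<|end_of_role|>\n"
             + m.content + "<|end_of_text|>\n";
    }
    if (add_generation_prompt) {
        out += "<|start_of_role|>assistant<|end_of_role|>\n";
    }
    return out;
}
```

With a single user turn and `add_generation_prompt = true`, the sketch yields the user message followed by an unterminated assistant header, which is the shape the model continues from.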
Here is the current output of llama-cli:
--