Add script to convert GGMLv3 LLaMA models to GGUF #2682

Merged

ggerganov merged 11 commits into ggml-org:gguf on Aug 21, 2023
Conversation
Currently in a pretty reasonable state. Testing/feedback would be appreciated.
Converted file tested to parse these prompts to the same tokens as pre-GGUF llama.cpp:

- 你喜欢小狗吗? ("Do you like puppies?")
- Once upon a time, in a dark forest, there lived a little fox

I also tested these models with the second prompt:

- openorca-platypus2-13b.ggmlv3.q5_K_M.bin
- gplatty-30b-superhot-8k.ggmlv3.q4_K_M.bin
- platypus2-70b-instruct.ggmlv3.q4_K_M.bin

When a seed was specified, generation was identical to loading the actual GGML file with pre-GGUF llama.cpp.
Note: When testing, be sure to specify `--eps` and `--gqa` as appropriate. You'll probably also want to specify `--context-length` (it defaults to 2048).

edit: It's now possible to use HF or "original" format metadata such as the vocab when converting. Some information about this and the current state of the pull request: #2682 (comment)
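Since the script's job is to read a GGMLv3 (GGJT v3) file and emit GGUF, the two containers can be told apart by their leading magic bytes. A minimal sketch of that check (magic constants taken from the llama.cpp/ggml file headers; the `detect_format` helper is hypothetical and not part of this script):

```python
import struct

# Assumption: standard magics from the llama.cpp/ggml headers.
GGJT_MAGIC = 0x67676A74  # b"tjgg" little-endian; GGMLv3 files are GGJT version 3
GGUF_MAGIC = b"GGUF"     # GGUF files start with these four bytes

def detect_format(header: bytes) -> str:
    """Classify a model file by its first 8 bytes (magic + version)."""
    if header[:4] == GGUF_MAGIC:
        return "gguf"
    magic, version = struct.unpack("<II", header[:8])
    if magic == GGJT_MAGIC and version == 3:
        return "ggmlv3"
    return "unknown"
```

For example, the first 8 bytes of a GGMLv3 file unpack to `(0x67676A74, 3)`, while a GGUF file begins with the literal ASCII bytes `GGUF` followed by its own version field.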
Some perplexity results here: #2682 (comment)