
Add script to convert GGMLv3 LLaMA models to GGUF#2682

Merged
ggerganov merged 11 commits into ggml-org:gguf from KerfuffleV2:feat-convert-ggml-to-gguf on Aug 21, 2023

Conversation

KerfuffleV2 (Contributor) commented Aug 20, 2023

Currently in a pretty reasonable state. Testing/feedback would be appreciated.

The converted file was verified to tokenize these prompts to the same tokens as pre-GGUF llama.cpp:

  1. 你喜欢小狗吗? ("Do you like puppies?")
  2. Once upon a time, in a dark forest, there lived a little fox

I also tested these models with the second prompt:

  1. Random LLaMA1 7B
  2. openorca-platypus2-13b.ggmlv3.q5_K_M.bin
  3. gplatty-30b-superhot-8k.ggmlv3.q4_K_M.bin
  4. platypus2-70b-instruct.ggmlv3.q4_K_M.bin

With a fixed seed, generation was identical to loading the original GGML file with pre-GGUF llama.cpp.

Note: When testing, be sure to specify --eps and --gqa as appropriate for the model. You'll probably also want to specify --context-length (it defaults to 2048).
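These flags are needed because a GGMLv3 file does not store the RMS-norm epsilon or the grouped-query-attention factor, so the converter has to take them on the command line. A minimal sketch of such a flag parser (the flag names come from the note above; the --eps default and the n_head_kv derivation are assumptions for illustration, not the script's actual defaults):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Sketch of the converter's CLI, not the real script's argument list."""
    p = argparse.ArgumentParser(
        description="Convert a GGMLv3 LLaMA model to GGUF (sketch)")
    p.add_argument("--eps", type=float, default=1e-6,  # assumed default
                   help="RMS-norm epsilon; not stored in GGMLv3, "
                        "so pass the value the model was trained with")
    p.add_argument("--gqa", type=int, default=1,
                   help="grouped-query-attention factor "
                        "(e.g. 8 for a LLaMA2 70B model)")
    p.add_argument("--context-length", type=int, default=2048,
                   help="training context length to record in GGUF metadata")
    return p

# Example: converting a 70B model, which uses GQA with 8 KV head groups.
args = build_parser().parse_args(["--gqa", "8"])
n_head = 64                       # attention heads in LLaMA2 70B
n_head_kv = n_head // args.gqa    # KV heads the GGUF metadata would record
```

Under this sketch, --gqa 8 on a 64-head model yields 8 KV heads, which is why the flag must match the model being converted.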

Edit: the converter can now take metadata, such as the vocabulary, from HF or "original" format models when converting. Some information about this and the current state of the pull: #2682 (comment)

Some perplexity results here: #2682 (comment)
