Conversation
A few things to note (I'll push a commit tomorrow when I get back to work):
Still unsure about the KV metadata part, but I pushed updates for Xenova/gpt-4o. Something like this?
Thanks, I have tested a few different GGUF models created with your branch and they seem to be working OK. Posting them to Hugging Face: https://huggingface.co/Mungert/Phi-4-mini-instruct.gguf . Many thanks for getting Phi-4-mini-instruct working.
@Mungert69 please don't post GGUF files on HF before the PR is merged, as there may be more work to do and your GGUF files may break once this is finished.
ngxson left a comment
This PR is also missing tokenizer .inp/.out files.
Since I cannot push to this PR (because you created it from your master branch), I will make another PR to replace it. I will keep your commits there so you're still listed as a co-author.
layer.rope_long  = create_tensor(tn(LLM_TENSOR_ROPE_FACTORS_LONG,  "weight", i), { n_embd_head/2 }, TENSOR_NOT_REQUIRED | (i != 0 ? TENSOR_DUPLICATED : 0));
layer.rope_short = create_tensor(tn(LLM_TENSOR_ROPE_FACTORS_SHORT, "weight", i), { n_embd_head/2 }, TENSOR_NOT_REQUIRED | (i != 0 ? TENSOR_DUPLICATED : 0));
if (hparams.rope_scaling_type_train == LLAMA_ROPE_SCALING_TYPE_LONGROPE) {
I think this check works but is not actually correct, since scaling_type is used to calculate attn_factor.
Also, the else branch is redundant: we know from the conversion script that if rot_pct is not set, then n_rot = n_embd / n_head.
Other archs like LLM_ARCH_LLAMA do the same thing.
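To make that invariant concrete, here is a minimal sketch in the spirit of the conversion script (illustrative only, not the actual convert_hf_to_gguf.py code; the hparams key names are assumptions):

def n_rot_from_hparams(hparams: dict) -> int:
    # full head dimension: n_embd / n_head
    head_dim = hparams["hidden_size"] // hparams["num_attention_heads"]
    rot_pct = hparams.get("partial_rotary_factor")
    if rot_pct is None:
        # rot_pct not set -> the whole head dimension is rotated, so
        # n_rot == n_embd / n_head and the C++ else branch changes nothing
        return head_dim
    return int(rot_pct * head_dim)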
| {"name": "megrez", "tokt": TOKENIZER_TYPE.BPE, "repo": "https://huggingface.co/Infinigence/Megrez-3B-Instruct"}, | ||
| {"name": "deepseek-v3", "tokt": TOKENIZER_TYPE.BPE, "repo": "https://huggingface.co/deepseek-ai/DeepSeek-V3"}, | ||
| {"name": "deepseek-r1-qwen", "tokt": TOKENIZER_TYPE.BPE, "repo": "https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"}, | ||
| {"name": "gpt-4o", "tokt": TOKENIZER_TYPE.BPE, "repo": "https://huggingface.co/Xenova/gpt-4o", }, |
This does not actually work since Xenova/gpt-4o is missing config.json; I have to make an exception for it.
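For illustration, such an exception could look roughly like this (a hedged sketch, not the actual change to convert_hf_to_gguf_update.py; the helper name and warning text are hypothetical):

import json
import os

def load_model_config(model_dir: str, name: str) -> dict:
    # Hypothetical helper: tolerate tokenizer-only repos such as
    # Xenova/gpt-4o, which do not ship a config.json at all.
    config_path = os.path.join(model_dir, "config.json")
    if not os.path.isfile(config_path):
        print(f"warning: {name} has no config.json, continuing with an empty config")
        return {}
    with open(config_path, encoding="utf-8") as f:
        return json.load(f)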
def generate_extra_tensors(self) -> Iterable[tuple[str, Tensor]]:
    if self.hparams.get("partial_rotary_factor") is not None:
This can be shortened to: self.hparams.get("partial_rotary_factor", 1.0)
Small correction: this is not needed. Other arch like …
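For completeness, the two spellings side by side (purely illustrative; per the correction above, this code path may not be needed in the final version at all):

def rot_pct_long(hparams: dict) -> float:
    # longer form, mirroring the diff above
    if hparams.get("partial_rotary_factor") is not None:
        return hparams["partial_rotary_factor"]
    return 1.0

def rot_pct_short(hparams: dict) -> float:
    # shorter form, as suggested in the review comment
    return hparams.get("partial_rotary_factor", 1.0)

assert rot_pct_long({}) == rot_pct_short({}) == 1.0
assert rot_pct_long({"partial_rotary_factor": 0.4}) == rot_pct_short({"partial_rotary_factor": 0.4})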
Closed and superseded by #12108.
* Added Phi-4-mini-instruct support
* Update regex per ngxson
* Change the vocab base to Xenova/gpt-4o
* fix conversion update script
* no need to check longrope
* minor style fix
* fix python style

Co-authored-by: Nicholas Sparks <nisparks@microsoft.com>
Added new vocab type: gpt-4o
Added Phi3 support for partial_rotary_factor
Added Phi3 support for tie_word_embeddings
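As a rough sketch of what the tie_word_embeddings support means on the conversion side (illustrative only, not the merged llama.cpp code): when the flag is set, the checkpoint has no separate output matrix and the token-embedding tensor is reused.

import numpy as np

def output_weights(token_embd: np.ndarray,
                   lm_head: np.ndarray | None,
                   tie_word_embeddings: bool) -> np.ndarray:
    # Hypothetical helper: with tied embeddings the lm_head tensor is
    # absent (or ignored) and the embedding table doubles as the output matrix.
    if tie_word_embeddings or lm_head is None:
        return token_embd
    return lm_head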