convert : update Falcon script for new HF config #3448

cebtenzzre merged 2 commits into ggml-org:master
Conversation
|
is this a continuation of #3049 ? :) |
|
Yeah, but your changes are a subset of the actual changes done for 180B:

2c2
< # HF falcon--> gguf conversion
---
> # HF falcon180B--> gguf conversion
16a17
> from safetensors import safe_open
48c49
< if filename.startswith("pytorch_model-"):
---
> if filename.startswith("model-00"):
90c91
< if hparams["architectures"][0] != "RWForCausalLM":
---
> if hparams["architectures"][0] != "FalconForCausalLM":
103c104
< block_count = hparams["n_layer"]
---
> block_count = hparams["num_hidden_layers"]
111,113c112,114
< gguf_writer.add_head_count(hparams["n_head"])
< if "n_head_kv" in hparams:
< gguf_writer.add_head_count_kv(hparams["n_head_kv"])
---
> gguf_writer.add_head_count(hparams["num_attention_heads"])
> if "num_kv_heads" in hparams:
> gguf_writer.add_head_count_kv(hparams["num_kv_heads"])
181,182c182,183
< n_head = hparams["n_head"]
< n_head_kv = hparams["n_head_kv"] if "n_head_kv" in hparams else 1
---
> n_head = hparams["num_attention_heads"]
> n_head_kv = hparams["num_kv_heads"] if "num_kv_heads" in hparams else 1
193c194
< f"pytorch_model-{n:05}-of-{num_parts:05}.bin" for n in range(1, num_parts + 1)
---
> f"model-{n:05}-of-{num_parts:05}.safetensors" for n in range(1, num_parts + 1)
200c201
< model_part = torch.load(dir_model / part_name, map_location="cpu")
---
> with safe_open(dir_model / part_name, framework="pt", device="cpu") as model_part:
203c204
< data = model_part[name]
---
> data = model_part.get_tensor(name)

(I compared an older version of convert-falcon, because there have been some small changes on master since.) |
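The renamed config keys in the diff above (n_layer → num_hidden_layers, n_head → num_attention_heads, n_head_kv → num_kv_heads) could also be read with a fallback so both old and new config.json files work. A minimal sketch, not the PR's actual code; load_falcon_hparams and its output keys are illustrative names:

```python
# Sketch: read Falcon hyperparameters from config.json, preferring the
# new HF key names and falling back to the pre-update ones.
import json

def load_falcon_hparams(config_text: str) -> dict:
    hparams = json.loads(config_text)

    def get(new_key: str, old_key: str, default=None):
        if new_key in hparams:
            return hparams[new_key]
        if old_key in hparams:
            return hparams[old_key]
        return default

    return {
        "block_count": get("num_hidden_layers", "n_layer"),
        "head_count": get("num_attention_heads", "n_head"),
        # multi-query models without an explicit KV head count default to 1
        "head_count_kv": get("num_kv_heads", "n_head_kv", 1),
    }
```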
Okay, so what do you suggest I do in order to get Falcon <180B working on master again? |
The script in this PR does not know how to read safetensors files, so I think we still need that PR. |
|
By all means: why not merge #3049 first, then abandon it, and finally merge this? No one will lose their contribution records, and Falcon inference will keep working. |
31c88e3 to 94dd85c
|
Could someone test this script with Falcon-180B? |
|
Testing atm - will take some time. |
It works with this repo: https://huggingface.co/tiiuae/falcon-180B/tree/main Which is strange, since this looks like safetensors data, which we don't expect to be able to read with this script. |
I recently added safetensors support to this PR so it can support 180B, which is why I asked. |
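The safetensors loading added to the PR, per the diff above, uses safe_open with the new shard naming. A hedged sketch of that pattern; part_names and load_tensors are illustrative helper names, not functions from the script:

```python
# Sketch: iterate over new-style safetensors shards
# (model-00001-of-00002.safetensors, ...) and yield their tensors.
from pathlib import Path

def part_names(num_parts: int) -> list:
    # new HF checkpoints use zero-padded shard indices
    return [
        f"model-{n:05}-of-{num_parts:05}.safetensors"
        for n in range(1, num_parts + 1)
    ]

def load_tensors(dir_model: Path, num_parts: int):
    # requires: pip install safetensors torch
    from safetensors import safe_open
    for part_name in part_names(num_parts):
        with safe_open(dir_model / part_name, framework="pt", device="cpu") as part:
            for name in part.keys():
                yield name, part.get_tensor(name)
```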
Green-Sky left a comment:
I don't have the means to test the 180B, but it looks like you merged it nicely.
…example

* 'master' of github.com:ggerganov/llama.cpp:
  kv cache slot search improvements (ggml-org#3493)
  prompts : fix editorconfig checks after ggml-org#3416
  parallel : add option to load external prompt file (ggml-org#3416)
  server : reuse llama_sample_token common util (ggml-org#3494)
  llama : correct hparams comparison (ggml-org#3446)
  ci : fix xcodebuild destinations (ggml-org#3491)
  convert : update Falcon script for new HF config (ggml-org#3448)
  build : use std::make_tuple() for compatibility with older GCC versions (ggml-org#3488)
  common : process escape sequences in reverse prompts (ggml-org#3461)
  CLBlast: Fix handling of on-device tensor data
  server : fix incorrect num_tokens_predicted (ggml-org#3480)
  swift : disable ACCELERATE_NEW_LAPACK (ggml-org#3481)
  ci : add swift build via xcodebuild (ggml-org#3482)
Also adds Falcon-180B support. Closes ggml-org#3049 Co-authored-by: jb <jonathan.t.barnard@gmail.com>
|
This change breaks the old architecture (RWForCausalLM). Kind of annoying because there was another change that requires reconverting the original HuggingFace model: |
The official falcon-7b, falcon-40b, and falcon-180B models on HF are compatible with the new version of this script. What is your use case? |
|
I have this model downloaded from Hugging Face: Is this the official version of Falcon? |
The latest commit is 898df13 "Move to in-library checkpoint (for real this time)". That's the one you want. I suppose I could add back support for the older format until the finetunes are updated. |
|
Oh, I see; they updated the config.json file but the actual multi-GB model files did not change. I think this might not be clear to some people, who might end up re-downloading the whole model again, which is not a good user experience. Also, some fine-tunes might not ever be updated. In my opinion, it would be preferable to support both formats, but I will let you decide how to prioritize this. Thank you for looking into this issue. |
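Supporting both formats, as suggested above, could start by detecting which kind of shards a model directory contains. A hypothetical sketch; count_model_parts is an illustrative name, not part of the PR:

```python
# Sketch: detect whether a model directory holds old-style PyTorch shards
# (pytorch_model-*.bin) or new-style safetensors shards (model-00*.safetensors),
# so both checkpoint formats stay convertible.
from pathlib import Path

def count_model_parts(dir_model: Path):
    num_pt = num_st = 0
    for path in dir_model.iterdir():
        if path.name.startswith("pytorch_model-"):
            num_pt += 1
        elif path.name.startswith("model-00") and path.suffix == ".safetensors":
            num_st += 1
    # prefer safetensors when both are present
    if num_st > 0:
        return num_st, "safetensors"
    return num_pt, "pytorch"
```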
This aligns the Falcon convert script with the recent changes to the HuggingFace models.