Qwen 2.5 by danielhanchen · Pull Request #1280 · unslothai/unsloth

danielhanchen · 2024-11-12T11:22:31Z

No description provided.

orginal -> original

* Fix DPO, ORPO (#1177) * Fix TRL * Update mistral.py * Patch processing_class * Update tokenizer_utils.py * Update tokenizer_utils.py * Update tokenizer_utils.py * Update tokenizer_utils.py * Update tokenizer_utils.py * Update tokenizer_utils.py * Installation guide (#1165) * chore: update chat_templates.py (#1166) orginal -> original * Disable Flex Attention * Update tokenizer_utils.py * Update _utils.py * n_items * Update cross_entropy_loss.py * Fix DPO, ORPO * Update _utils.py --------- Co-authored-by: timothelaborie <97834767+timothelaborie@users.noreply.github.com> Co-authored-by: Ikko Eltociear Ashimine <eltociear@gmail.com> * Add warning for missing Unpack and KwargsForCausalLM in older Transformers versions --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> Co-authored-by: timothelaborie <97834767+timothelaborie@users.noreply.github.com> Co-authored-by: Ikko Eltociear Ashimine <eltociear@gmail.com>

* Enhance rotary embedding handling in LlamaAttention and LongRopeRotaryEmbedding * Typo * Improve rotary embedding handling in LlamaAttention to prevent errors with short KV cache * Update llama.py * Update llama.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com>

#1254) * Fix: cast logits to float32 in cross_entropy_forward to prevent errors * Update cross_entropy_loss.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com>

) * Throw error when inferencing longer than max_popsition_embeddings without rope scaling * Update llama.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com>

Co-authored-by: root <root@ieeres.chu.cam.ac.uk>

* Fix TRL * Update mistral.py * Patch processing_class * Update tokenizer_utils.py * Update tokenizer_utils.py * Update tokenizer_utils.py * Update tokenizer_utils.py * Update tokenizer_utils.py * Update tokenizer_utils.py * Installation guide (unslothai#1165) * chore: update chat_templates.py (unslothai#1166) orginal -> original * Disable Flex Attention * Update tokenizer_utils.py * Update _utils.py * n_items * Update cross_entropy_loss.py * Fix DPO, ORPO * Update _utils.py * Update _utils.py * fix/transformers-unpack (unslothai#1180) * Fix DPO, ORPO (unslothai#1177) * Fix TRL * Update mistral.py * Patch processing_class * Update tokenizer_utils.py * Update tokenizer_utils.py * Update tokenizer_utils.py * Update tokenizer_utils.py * Update tokenizer_utils.py * Update tokenizer_utils.py * Installation guide (unslothai#1165) * chore: update chat_templates.py (unslothai#1166) orginal -> original * Disable Flex Attention * Update tokenizer_utils.py * Update _utils.py * n_items * Update cross_entropy_loss.py * Fix DPO, ORPO * Update _utils.py --------- Co-authored-by: timothelaborie <97834767+timothelaborie@users.noreply.github.com> Co-authored-by: Ikko Eltociear Ashimine <eltociear@gmail.com> * Add warning for missing Unpack and KwargsForCausalLM in older Transformers versions --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> Co-authored-by: timothelaborie <97834767+timothelaborie@users.noreply.github.com> Co-authored-by: Ikko Eltociear Ashimine <eltociear@gmail.com> * Update cross_entropy_loss.py * Update _utils.py * Update _utils.py * donot upcast lm_head and embeddings to float32 (unslothai#1186) * Cleanup upcast logs (unslothai#1188) * Fix/phi-longrope (unslothai#1193) * Enhance rotary embedding handling in LlamaAttention and LongRopeRotaryEmbedding * Typo * Improve rotary embedding handling in LlamaAttention to prevent errors with short KV cache * Update llama.py * Update llama.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update transformers * Unk token issues * Update _utils.py * Fix pad token * Update llama.py * Typo * ignored labels * Revert "ignored labels" This reverts commit 9d07be0. * More patching * Update _utils.py * Update _utils.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Feat/all tmp (unslothai#1219) * Update save.py Check whether path is in /tmp dir for Kaggle environment * Update save.py Move temporary_location to /tmp in Kaggle * Enhance Kaggle environment support in save and tokenizer utilities --------- Co-authored-by: dendarrion <37800703+dendarrion@users.noreply.github.com> Co-authored-by: Erland366 <erland.pg366@gmail.com> * Bug fixes * Update pyproject.toml * Update _utils.py * Update __init__.py * Update __init__.py * Update _utils.py * Update _utils.py * Update _utils.py * Update _utils.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Tied weights * Revert "Tied weights" This reverts commit 8090b7c. * Tied weights * Utils * CE Loss patching * Update __init__.py * Update __init__.py * Patching * Update cross_entropy_loss.py * CE Loss * Update _utils.py * Update _utils.py * CE Loss * Update _utils.py * Update _utils.py * Layernorm * Update _utils.py * Update _utils.py * Post patch * Update _utils.py * Update llama.py * Update _utils.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * typing * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * int64 * Update _utils.py * Update cross_entropy_loss.py * constexpr * constexpr * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update _utils.py * Update _utils.py * Update _utils.py * CE * Update cross_entropy_loss.py * Update _utils.py * Update llama.py * Update _utils.py * Update rms_layernorm.py * Update rms_layernorm.py * Update rms_layernorm.py * Update rms_layernorm.py * Update rms_layernorm.py * Update rms_layernorm.py * Update utils.py * Update rms_layernorm.py * Update rms_layernorm.py * Update rms_layernorm.py * Update rms_layernorm.py * Update rms_layernorm.py * Update rms_layernorm.py * Update rms_layernorm.py * Update rms_layernorm.py * Update rms_layernorm.py * Update rms_layernorm.py * Update rms_layernorm.py * Update rms_layernorm.py * typing * Update rope_embedding.py * types * Disable compiling * Update _utils.py * Update _utils.py * Forward hook * Update _utils.py * Update llama.py * Update _utils.py * Update llama.py * Update llama.py * Update _utils.py * Update pyproject.toml * Update _utils.py * Update llama.py * CE Loss * Update cross_entropy_loss.py * Update _utils.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update llama.py * Update _utils.py * Update _utils.py * Update _utils.py * Update _utils.py * Update _utils.py * Fix: cast logits to float32 in cross_entropy_forward to prevent errors (unslothai#1254) * Fix: cast logits to float32 in cross_entropy_forward to prevent errors * Update cross_entropy_loss.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Throw error when inferencing longer than max_popsition_embeddings (unslothai#1236) * Throw error when inferencing longer than max_popsition_embeddings without rope scaling * Update llama.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * CLI now handles user input strings for dtype correctly (unslothai#1235) Co-authored-by: root <root@ieeres.chu.cam.ac.uk> * Update flex_attention.py * Update _utils.py * Update _utils.py * Update flex_attention.py * Update flex_attention.py * Update loader.py * Update loader.py * Update flex_attention.py * Update flex_attention.py * Update flex_attention.py * Update flex_attention.py * Update _utils.py * Update cross_entropy_loss.py * Update _utils.py * Update tokenizer_utils.py * Update tokenizer_utils.py * Update tokenizer_utils.py * Update tokenizer_utils.py * Update tokenizer_utils.py * triton_cast * Update utils.py * Qwen 2.5 Coder --------- Co-authored-by: timothelaborie <97834767+timothelaborie@users.noreply.github.com> Co-authored-by: Ikko Eltociear Ashimine <eltociear@gmail.com> Co-authored-by: Edd <68678137+Erland366@users.noreply.github.com> Co-authored-by: Datta Nimmaturi <datta.nimmaturi@nutanix.com> Co-authored-by: dendarrion <37800703+dendarrion@users.noreply.github.com> Co-authored-by: Erland366 <erland.pg366@gmail.com> Co-authored-by: Edwin Fennell <edwinfennell1@gmail.com> Co-authored-by: root <root@ieeres.chu.cam.ac.uk>

* Fix TRL * Update mistral.py * Patch processing_class * Update tokenizer_utils.py * Update tokenizer_utils.py * Update tokenizer_utils.py * Update tokenizer_utils.py * Update tokenizer_utils.py * Update tokenizer_utils.py * Installation guide (unslothai#1165) * chore: update chat_templates.py (unslothai#1166) orginal -> original * Disable Flex Attention * Update tokenizer_utils.py * Update _utils.py * n_items * Update cross_entropy_loss.py * Fix DPO, ORPO * Update _utils.py * Update _utils.py * fix/transformers-unpack (unslothai#1180) * Fix DPO, ORPO (unslothai#1177) * Fix TRL * Update mistral.py * Patch processing_class * Update tokenizer_utils.py * Update tokenizer_utils.py * Update tokenizer_utils.py * Update tokenizer_utils.py * Update tokenizer_utils.py * Update tokenizer_utils.py * Installation guide (unslothai#1165) * chore: update chat_templates.py (unslothai#1166) orginal -> original * Disable Flex Attention * Update tokenizer_utils.py * Update _utils.py * n_items * Update cross_entropy_loss.py * Fix DPO, ORPO * Update _utils.py --------- Co-authored-by: timothelaborie <97834767+timothelaborie@users.noreply.github.com> Co-authored-by: Ikko Eltociear Ashimine <eltociear@gmail.com> * Add warning for missing Unpack and KwargsForCausalLM in older Transformers versions --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> Co-authored-by: timothelaborie <97834767+timothelaborie@users.noreply.github.com> Co-authored-by: Ikko Eltociear Ashimine <eltociear@gmail.com> * Update cross_entropy_loss.py * Update _utils.py * Update _utils.py * donot upcast lm_head and embeddings to float32 (unslothai#1186) * Cleanup upcast logs (unslothai#1188) * Fix/phi-longrope (unslothai#1193) * Enhance rotary embedding handling in LlamaAttention and LongRopeRotaryEmbedding * Typo * Improve rotary embedding handling in LlamaAttention to prevent errors with short KV cache * Update llama.py * Update llama.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update transformers * Unk token issues * Update _utils.py * Fix pad token * Update llama.py * Typo * ignored labels * Revert "ignored labels" This reverts commit 110ee41. * More patching * Update _utils.py * Update _utils.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Feat/all tmp (unslothai#1219) * Update save.py Check whether path is in /tmp dir for Kaggle environment * Update save.py Move temporary_location to /tmp in Kaggle * Enhance Kaggle environment support in save and tokenizer utilities --------- Co-authored-by: dendarrion <37800703+dendarrion@users.noreply.github.com> Co-authored-by: Erland366 <erland.pg366@gmail.com> * Bug fixes * Update pyproject.toml * Update _utils.py * Update __init__.py * Update __init__.py * Update _utils.py * Update _utils.py * Update _utils.py * Update _utils.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Tied weights * Revert "Tied weights" This reverts commit 42bb212. * Tied weights * Utils * CE Loss patching * Update __init__.py * Update __init__.py * Patching * Update cross_entropy_loss.py * CE Loss * Update _utils.py * Update _utils.py * CE Loss * Update _utils.py * Update _utils.py * Layernorm * Update _utils.py * Update _utils.py * Post patch * Update _utils.py * Update llama.py * Update _utils.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * typing * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * int64 * Update _utils.py * Update cross_entropy_loss.py * constexpr * constexpr * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update _utils.py * Update _utils.py * Update _utils.py * CE * Update cross_entropy_loss.py * Update _utils.py * Update llama.py * Update _utils.py * Update rms_layernorm.py * Update rms_layernorm.py * Update rms_layernorm.py * Update rms_layernorm.py * Update rms_layernorm.py * Update rms_layernorm.py * Update utils.py * Update rms_layernorm.py * Update rms_layernorm.py * Update rms_layernorm.py * Update rms_layernorm.py * Update rms_layernorm.py * Update rms_layernorm.py * Update rms_layernorm.py * Update rms_layernorm.py * Update rms_layernorm.py * Update rms_layernorm.py * Update rms_layernorm.py * Update rms_layernorm.py * typing * Update rope_embedding.py * types * Disable compiling * Update _utils.py * Update _utils.py * Forward hook * Update _utils.py * Update llama.py * Update _utils.py * Update llama.py * Update llama.py * Update _utils.py * Update pyproject.toml * Update _utils.py * Update llama.py * CE Loss * Update cross_entropy_loss.py * Update _utils.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update cross_entropy_loss.py * Update llama.py * Update _utils.py * Update _utils.py * Update _utils.py * Update _utils.py * Update _utils.py * Fix: cast logits to float32 in cross_entropy_forward to prevent errors (unslothai#1254) * Fix: cast logits to float32 in cross_entropy_forward to prevent errors * Update cross_entropy_loss.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Throw error when inferencing longer than max_popsition_embeddings (unslothai#1236) * Throw error when inferencing longer than max_popsition_embeddings without rope scaling * Update llama.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * CLI now handles user input strings for dtype correctly (unslothai#1235) Co-authored-by: root <root@ieeres.chu.cam.ac.uk> * Update flex_attention.py * Update _utils.py * Update _utils.py * Update flex_attention.py * Update flex_attention.py * Update loader.py * Update loader.py * Update flex_attention.py * Update flex_attention.py * Update flex_attention.py * Update flex_attention.py * Update _utils.py * Update cross_entropy_loss.py * Update _utils.py * Update tokenizer_utils.py * Update tokenizer_utils.py * Update tokenizer_utils.py * Update tokenizer_utils.py * Update tokenizer_utils.py * triton_cast * Update utils.py * Qwen 2.5 Coder --------- Co-authored-by: timothelaborie <97834767+timothelaborie@users.noreply.github.com> Co-authored-by: Ikko Eltociear Ashimine <eltociear@gmail.com> Co-authored-by: Edd <68678137+Erland366@users.noreply.github.com> Co-authored-by: Datta Nimmaturi <datta.nimmaturi@nutanix.com> Co-authored-by: dendarrion <37800703+dendarrion@users.noreply.github.com> Co-authored-by: Erland366 <erland.pg366@gmail.com> Co-authored-by: Edwin Fennell <edwinfennell1@gmail.com> Co-authored-by: root <root@ieeres.chu.cam.ac.uk>

danielhanchen and others added 30 commits October 21, 2024 01:02

Fix TRL

f0aca90

Update mistral.py

f4ae585

Patch processing_class

106f213

Update tokenizer_utils.py

ef84212

Update tokenizer_utils.py

4f7c527

Update tokenizer_utils.py

aa2b207

Update tokenizer_utils.py

101389d

Update tokenizer_utils.py

c0f0fc9

Update tokenizer_utils.py

b3e0033

Installation guide (#1165)

aabb5ff

chore: update chat_templates.py (#1166)

30bf339

orginal -> original

Disable Flex Attention

2895839

Update tokenizer_utils.py

06f5d75

Update _utils.py

28e6eea

n_items

b821f20

Update cross_entropy_loss.py

e561366

Fix DPO, ORPO

4ff247a

Merge branch 'main' into nightly

2b858a5

Update _utils.py

1c063b4

Update _utils.py

f195ee1

Update cross_entropy_loss.py

5961c34

Update _utils.py

7308bb8

Update _utils.py

0096e5b

Merge branch 'main' into nightly

44b480f

donot upcast lm_head and embeddings to float32 (#1186)

6776055

Cleanup upcast logs (#1188)

625209e

Update transformers

6f28d16

Merge branch 'main' into nightly

f94f7c1

Erland366 and others added 27 commits November 6, 2024 12:16

Fix: cast logits to float32 in cross_entropy_forward to prevent errors (

f24aef5

#1254) * Fix: cast logits to float32 in cross_entropy_forward to prevent errors * Update cross_entropy_loss.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com>

Throw error when inferencing longer than max_popsition_embeddings (#1236

3d906e6

) * Throw error when inferencing longer than max_popsition_embeddings without rope scaling * Update llama.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com>

CLI now handles user input strings for dtype correctly (#1235)

de1049b

Co-authored-by: root <root@ieeres.chu.cam.ac.uk>

Update flex_attention.py

be72975

Update _utils.py

05170cd

Update _utils.py

7e0877d

Update flex_attention.py

6b5c599

Update flex_attention.py

1ba9f2e

Update loader.py

da61c4d

Update loader.py

3316ee2

Update flex_attention.py

501ca84

Update flex_attention.py

ce621b7

Update flex_attention.py

4b01ff1

Update flex_attention.py

ef5052a

Update _utils.py

52bca32

Merge branch 'main' into nightly

68b8d62

Merge branch 'main' into nightly

15da065

Update cross_entropy_loss.py

8b3e9c2

Update _utils.py

3a1e7ef

Update tokenizer_utils.py

f1ec165

Update tokenizer_utils.py

a4e9705

Update tokenizer_utils.py

92c6a27

Update tokenizer_utils.py

673f541

Update tokenizer_utils.py

8fe9109

triton_cast

ad41479

Update utils.py

fcf2009

Qwen 2.5 Coder

af9ba07

danielhanchen merged commit 899caf0 into main Nov 12, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Qwen 2.5#1280

Qwen 2.5#1280
danielhanchen merged 209 commits into
mainfrom
nightly

danielhanchen commented Nov 12, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

Uh oh!

Conversation

danielhanchen commented Nov 12, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants