add mlp bias for llama models #30031
Merged
younesbelkada merged 4 commits into huggingface:main on May 3, 2024
Conversation
Contributor (Author)

This adds bias support for the MLP in Llama. Can we add this? It would help a lot for some models we are developing. @ArthurZucker and @younesbelkada

This lets us reuse the Llama model class for our models without adding another class for a new model. :)

Contributor (Author)

@younesbelkada light ping again
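For context, a minimal sketch of the requested change from the user side, assuming a transformers version that includes this PR (the tiny config values below are arbitrary):

```python
from transformers import LlamaConfig, LlamaForCausalLM

# `mlp_bias` is the new flag this PR adds; it defaults to False,
# so existing Llama checkpoints are unaffected.
config = LlamaConfig(
    hidden_size=256,
    intermediate_size=688,
    num_hidden_layers=2,
    num_attention_heads=4,
    mlp_bias=True,
)
model = LlamaForCausalLM(config)

# The gate/up/down projections now carry bias parameters.
print(model.model.layers[0].mlp.gate_proj.bias is not None)  # True
```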
ArthurZucker (Collaborator) reviewed on Apr 5, 2024:
Hey! Though this is not necessarily against our transformers philosophy, we would need justification: meaning a cool new model actually released, rather than a promise that it will be released!
This should be easy to have on the Hub with `trust_remote_code=True`, no?
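For reference, the suggested workaround would look roughly like this (the repo id is a placeholder for a Hub repo that ships custom modeling code alongside its weights):

```python
from transformers import AutoModelForCausalLM

# trust_remote_code=True opts in to executing the modeling code
# hosted in the checkpoint repo; "your-org/llama-with-mlp-bias"
# is a hypothetical repo id.
model = AutoModelForCausalLM.from_pretrained(
    "your-org/llama-with-mlp-bias",
    trust_remote_code=True,
)
```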
ArthurZucker (Collaborator) approved these changes on May 3, 2024:
Alright! Now that you have a new model coming, this makes sense. Could you make sure the CIs pass?
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Contributor (Author)

Thanks everyone for the quick turnaround 🤗
This was referenced May 6, 2024
njhill pushed a commit to IBM/text-generation-inference that referenced this pull request on May 6, 2024:
#### Motivation
The `Calico` models currently set the MLP and attention bias to true, which was hard-coded to false in the flash and paged Llama implementations. This uses the config params added in huggingface/transformers#30031 to set those values properly.

#### Modifications
- Added `attention_bias` and `mlp_bias` to the config for the Flash and Paged Llama implementations (default is False)
- Set the bias in attention and MLP to the config value

#### Result
Models containing attention and MLP bias should now load properly.

Signed-off-by: Joshua Rosenkranz <jmrosenk@us.ibm.com>
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
Co-authored-by: Joe Runde <Joseph.Runde@ibm.com>
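The loading pattern the commit describes is roughly the following. This is an illustrative sketch mirroring transformers' `LlamaMLP` layout, not the actual TGI code; reading the flag with a False default keeps configs that predate this PR working:

```python
from torch import nn

class LlamaMLP(nn.Module):
    """Simplified sketch of a Llama MLP that honors `config.mlp_bias`."""

    def __init__(self, config):
        super().__init__()
        # Fall back to False for configs written before
        # huggingface/transformers#30031 introduced the flag.
        mlp_bias = getattr(config, "mlp_bias", False)
        self.gate_proj = nn.Linear(config.hidden_size, config.intermediate_size, bias=mlp_bias)
        self.up_proj = nn.Linear(config.hidden_size, config.intermediate_size, bias=mlp_bias)
        self.down_proj = nn.Linear(config.intermediate_size, config.hidden_size, bias=mlp_bias)
        self.act_fn = nn.SiLU()  # Llama uses SiLU in its gated MLP

    def forward(self, x):
        # SwiGLU: down(act(gate(x)) * up(x)); the biases, when enabled,
        # flow through the nn.Linear layers automatically.
        return self.down_proj(self.act_fn(self.gate_proj(x)) * self.up_proj(x))
```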
Xaenalt pushed a commit to Xaenalt/text-generation-inference that referenced this pull request on Aug 7, 2024, and again on Aug 12, 2024 (same commit message as above).