added mlp and attn bias option to flash and paged llama models (#85) by heyselbi · Pull Request #32 · red-hat-data-services/text-generation-inference

heyselbi · 2024-05-07T11:46:40Z

Motivation

The `Calico` models currently set the MLP and attention bias to true, but the flash and paged Llama implementations hard-coded both to false. This change uses the config params set in huggingface/transformers#30031 to set those values properly.

Modifications

- Added `attention_bias` and `mlp_bias` to the config for the Flash and Paged Llama implementations (default is False).
- Set the bias in the attention and MLP layers to the config value.
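
As a rough illustration of the modifications listed above, the following sketch shows one way the two flags could be read from a model config with a default of False. It is not the code from this PR: `BiasFlags` and `resolve_bias_flags` are hypothetical names, and `SimpleNamespace` stands in for the real config object. Only the attribute names `attention_bias` and `mlp_bias` and their False default come from the description.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass(frozen=True)
class BiasFlags:
    """Hypothetical container for the two bias switches."""
    attention: bool
    mlp: bool


def resolve_bias_flags(config) -> BiasFlags:
    """Read attention_bias / mlp_bias from a config object (hypothetical helper).

    Attributes missing from the config fall back to False, which matches
    the previously hard-coded behaviour described in the PR.
    """
    return BiasFlags(
        attention=bool(getattr(config, "attention_bias", False)),
        mlp=bool(getattr(config, "mlp_bias", False)),
    )


# A config that predates both fields keeps bias disabled.
legacy = resolve_bias_flags(SimpleNamespace(hidden_size=4096))

# A config that enables both fields, as the description says the Calico models do.
biased = resolve_bias_flags(SimpleNamespace(attention_bias=True, mlp_bias=True))
```

In a PyTorch model, flags like these would typically be passed as the `bias` argument when the attention and MLP projection layers are constructed, for example `torch.nn.Linear(in_features, out_features, bias=flags.attention)`.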

Result

Models that contain attention and MLP bias should now load properly.

Motivation

Closes: https://issues.redhat.com/browse/RHOAIENG-6839


Signed-off-by: Joshua Rosenkranz <jmrosenk@us.ibm.com>
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
Co-authored-by: Joe Runde <Joseph.Runde@ibm.com>

z103cb

/lgtm
/approved

openshift-ci · 2024-05-07T11:59:05Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: heyselbi, z103cb

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [heyselbi,z103cb]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci bot requested review from dtrifiro and z103cb May 7, 2024 11:46

openshift-ci bot added the approved label May 7, 2024

z103cb approved these changes May 7, 2024

openshift-ci bot assigned z103cb May 7, 2024

openshift-ci bot added the lgtm label May 7, 2024

openshift-merge-bot bot merged commit 3853574 into red-hat-data-services:rhoai-2.8 May 7, 2024

heyselbi deleted the rhoai-2-8-3 branch May 17, 2024 16:24
