
split head_dim from hidden_size for llama like gemma or mistral #32846

@bzantium

Description


Feature request

Split head_dim from hidden_size in the Llama modeling code, as is already done for Gemma and Mistral.

Motivation

Allow head_dim to be configured independently, instead of forcing it to equal hidden_size // num_attention_heads.

Your contribution

I can slightly revise the modeling code and submit a PR.
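A minimal sketch of the proposed change, using a hypothetical config class (names are illustrative, not the actual transformers API): head_dim keeps its current default of hidden_size // num_attention_heads for backward compatibility, but can be overridden in the config, so the attention projections are sized by num_attention_heads * head_dim rather than hidden_size.

```python
class LlamaConfigSketch:
    """Hypothetical config sketch. If head_dim is not given, fall back to the
    current Llama behaviour (hidden_size // num_attention_heads); otherwise
    use the explicit value, as Gemma and Mistral already allow."""

    def __init__(self, hidden_size, num_attention_heads, head_dim=None):
        self.hidden_size = hidden_size
        self.num_attention_heads = num_attention_heads
        if head_dim is None:
            head_dim = hidden_size // num_attention_heads
        self.head_dim = head_dim

    def q_proj_out_features(self):
        # Projection width follows num_heads * head_dim, not hidden_size,
        # so the two can differ when head_dim is overridden.
        return self.num_attention_heads * self.head_dim


# Default: head_dim is derived, so nothing changes for existing checkpoints.
default_cfg = LlamaConfigSketch(hidden_size=4096, num_attention_heads=32)

# Override: head_dim decoupled from hidden_size.
custom_cfg = LlamaConfigSketch(hidden_size=2048, num_attention_heads=8,
                               head_dim=256)
```

With the default config, `q_proj_out_features()` still equals hidden_size (4096); with the override, it is 8 * 256 = 2048 independently of whatever hidden_size happens to be.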
