Non-Causal attention mask support? #696

@brthor

I am trying to use a non-causal attention mask with a Llama model in Unsloth. I have been reading the code to find the best way to keep all the other speedups while replacing the causal mask with a non-causal one.

I would appreciate advice on where to look for the necessary code changes before I step through with a debugger.

I see that the attention forward still appears to accept an attention mask:
https://github.com/unslothai/unsloth/blob/main/unsloth/models/llama.py#L375

But the mask may be ignored in the forward function:
https://github.com/unslothai/unsloth/blob/main/unsloth/models/llama.py#L589
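
To make the request concrete, here is a minimal sketch in plain PyTorch of the behavior I am after. It is not Unsloth's code: `build_mask` and `attend` are hypothetical helper names, and I am assuming tensors shaped `(batch, heads, seq, head_dim)` and a boolean padding mask shaped `(batch, seq)` where `True` marks a real token.

```python
import torch
import torch.nn.functional as F


def build_mask(padding_mask: torch.Tensor, causal: bool) -> torch.Tensor:
    """Build a boolean attention mask of shape (batch, 1, seq, seq).

    True means the query (row) may attend to the key (column).
    `padding_mask` is (batch, seq) with True for real tokens (assumed layout).
    """
    batch, seq = padding_mask.shape
    # Every query row may see every non-padded key: the non-causal case.
    mask = padding_mask[:, None, None, :].expand(batch, 1, seq, seq)
    if causal:
        # Additionally forbid attending to later positions.
        tril = torch.tril(
            torch.ones(seq, seq, dtype=torch.bool, device=padding_mask.device)
        )
        mask = mask & tril
    return mask


def attend(q, k, v, padding_mask, causal):
    """Scaled dot-product attention with an explicit (non-)causal mask."""
    mask = build_mask(padding_mask, causal)
    # The mask is passed explicitly, so is_causal stays at its default (False).
    return F.scaled_dot_product_attention(q, k, v, attn_mask=mask)
```

With `causal=False`, position 0 can attend to later tokens, which is what I need. The open question is whether Unsloth's fast path can accept an explicit mask like this or always assumes the causal case.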
