@Matvezy Matvezy commented Jan 6, 2026

Description

Qwen inference was slow because it did not use FlashAttention on hardware that supports it. In addition, newer versions of transformers introduced a bug in loading LoRAs for base weights. This PR fixes both issues and also standardizes the handling of Qwen 2.5 and Qwen 3.
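For reviewers unfamiliar with the attention backend switch, here is a minimal sketch of the kind of selection logic involved. It is not the code from this PR: the helper name `pick_attn_implementation`, its parameters, and the compute capability threshold of 8.0 (Ampere or newer) are assumptions for illustration. The strings `"flash_attention_2"` and `"sdpa"` are values accepted by the `attn_implementation` argument of `from_pretrained` in transformers.

```python
import importlib.util
from typing import Optional, Tuple


def pick_attn_implementation(
    has_cuda: bool,
    compute_capability: Optional[Tuple[int, int]] = None,
    flash_installed: Optional[bool] = None,
) -> str:
    """Return a value suitable for `attn_implementation` (hypothetical helper)."""
    if flash_installed is None:
        # Detect the optional flash_attn package without importing it.
        flash_installed = importlib.util.find_spec("flash_attn") is not None

    # Assumption: FlashAttention-2 needs a CUDA GPU with compute capability >= 8.0.
    supported_gpu = (
        has_cuda
        and compute_capability is not None
        and compute_capability >= (8, 0)
    )

    if flash_installed and supported_gpu:
        return "flash_attention_2"
    # Fall back to PyTorch scaled dot product attention.
    return "sdpa"
```

In practice the inputs could come from `torch.cuda.is_available()` and `torch.cuda.get_device_capability()`, and the returned string would be passed as `attn_implementation=...` when loading the model.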

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

How has this change been tested? Please provide a test case or example of how you tested the change.

Tested locally.

Any specific deployment considerations

No

Docs

  • Docs updated? What were the changes: No

@grzegorz-roboflow grzegorz-roboflow merged commit 46001a6 into main Jan 6, 2026
52 checks passed
@grzegorz-roboflow grzegorz-roboflow deleted the add_qwen3vl branch January 6, 2026 11:43