Add fusions for OpenAI CLIP #20721
Merged: hanbitmyths merged 5 commits into microsoft:main from kunal-vaishnavi:kvaishnavi/clip-fusion on May 18, 2024.
Conversation
| ["Expand", "Unsqueeze", "Unsqueeze", "Where", "Less"], | ||
| [causal_mask_input_index, 0, 0, 0, 0], | ||
| ["Concat", "Expand", "Unsqueeze", "Unsqueeze", "Where", "Less"], | ||
| [causal_mask_input_index, 0, 0, 0, 0, 0], |
Check failure (Code scanning / CodeQL): Potentially uninitialized local variable. Local variable 'causal_mask_input_index' may be used before it is initialized.
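For context, the paired lists above are the (parent op types, parent input indices) arguments taken by onnxruntime's `OnnxModel.match_parent_path` helper. Below is a minimal sketch of initializing the index on every path before the pattern lists use it, which is the usual way to resolve this kind of CodeQL finding; the surrounding control flow and the names `where_node` and `causal_mask_name` are illustrative assumptions, not the PR's exact code:

```python
# Sketch only: `where_node` and `causal_mask_name` are assumed names.
# Initializing the index up front guarantees it is defined on every path.
causal_mask_input_index = None
for i, input_name in enumerate(where_node.input):
    if input_name == causal_mask_name:
        causal_mask_input_index = i
        break
if causal_mask_input_index is None:
    return  # pattern not recognized; skip this fusion attempt

# Try the short variant first, then the longer Concat-prefixed variant.
path = self.model.match_parent_path(
    where_node,
    ["Expand", "Unsqueeze", "Unsqueeze", "Where", "Less"],
    [causal_mask_input_index, 0, 0, 0, 0],
)
if path is None:
    path = self.model.match_parent_path(
        where_node,
        ["Concat", "Expand", "Unsqueeze", "Unsqueeze", "Where", "Less"],
        [causal_mask_input_index, 0, 0, 0, 0, 0],
    )
```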
hanbitmyths previously approved these changes on May 18, 2024.
hanbitmyths approved these changes on May 18, 2024.
kunal-vaishnavi added a commit that referenced this pull request on Jan 15, 2025:

### Description
This PR adds unit tests for [fusing the vision components](#20721) of Phi-3 vision and Phi-3.5 vision.

### Motivation and Context
Many multi-modal models use a CLIP encoder or a variant of CLIP as part of their encoders. These fusion unit tests will ensure that the vision components of Phi-3 vision and Phi-3.5 vision can still be fused when existing fusions are modified to support more models.
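A fusion unit test along these lines typically optimizes an exported vision encoder and asserts on the fused operator counts. A minimal sketch, assuming onnxruntime's Python `optimize_model` entry point; the model path, dimensions, and asserted operator names are placeholders, not values from the PR:

```python
# Minimal sketch of a fusion unit test. The model path, num_heads, and
# hidden_size are placeholders; the assertion targets are illustrative.
from onnxruntime.transformers.optimizer import optimize_model

def test_phi3_vision_fusion():
    optimized = optimize_model(
        "phi3_vision_encoder.onnx",  # placeholder path to an exported model
        model_type="clip",           # reuse the CLIP fusion rules
        num_heads=16,
        hidden_size=1024,
    )
    stats = optimized.get_fused_operator_statistics()
    # If fusion succeeded, the attention and layer-norm subgraphs collapse
    # into single fused operators.
    assert stats.get("Attention", 0) > 0
    assert stats.get("LayerNormalization", 0) > 0
```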
guschmue pushed a commit that referenced this pull request on Mar 6, 2025.
ashrit-ms pushed a commit that referenced this pull request on Mar 17, 2025.
Description
This PR adds fusions for OpenAI's CLIP model. Below is an example of how to run the ORT transformer optimizer on a CLIP model.
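A minimal sketch of that invocation via the Python API, assuming the standard `onnxruntime.transformers` optimizer; the input path and the head/hidden-size values are placeholders, not values taken from the PR:

```python
# Sketch only: the model path, num_heads, and hidden_size are placeholders.
from onnxruntime.transformers.optimizer import optimize_model

optimized = optimize_model(
    "clip.onnx",         # exported OpenAI CLIP model (placeholder path)
    model_type="clip",   # route the graph through the CLIP fusion rules
    num_heads=12,        # attention heads of the encoder being optimized
    hidden_size=768,     # hidden dimension of the encoder
)
optimized.save_model_to_file("clip_opt.onnx")
```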
Motivation and Context
This PR helps optimize multi-modal models that use CLIP for the vision encoder.