[Performance 2/6] Replace einops.rearrange with torch native ops#15804

Merged
AUTOMATIC1111 merged 1 commit into AUTOMATIC1111:dev from huchenlei:rearrange_fix
Jun 8, 2024

@huchenlei huchenlei commented May 15, 2024

Description

According to lllyasviel/stable-diffusion-webui-forge#716 (comment), the einops.rearrange calls in crossattn are causing extra overhead. Replacing them with torch native ops can save ~55ms/it.

Screenshots/videos:

image

TODO

There are other places where einops.rearrange can be replaced by torch native ops, but the one in CrossAttn is the most critical. Instrumenting the usage of einops.rearrange elsewhere might also yield some improvements.
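
One way to find the remaining hot spots is to wrap a function so that each call records how often it runs and how long it takes. The sketch below is only an illustration: `instrument`, `CALL_STATS`, and `slow_add` are hypothetical names that are not part of this PR, and it uses only the Python standard library.

```python
import functools
import time
from collections import defaultdict

# Hypothetical registry: name -> [call_count, total_seconds]
CALL_STATS = defaultdict(lambda: [0, 0.0])


def instrument(fn, name=None):
    """Wrap fn so every call adds to its call count and elapsed wall-clock time."""
    key = name or getattr(fn, "__qualname__", repr(fn))

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            stats = CALL_STATS[key]
            stats[0] += 1
            stats[1] += time.perf_counter() - start

    return wrapper


# Stand-in for the function under investigation.
def slow_add(a, b):
    return a + b


timed_add = instrument(slow_add, name="slow_add")
results = [timed_add(i, 1) for i in range(3)]

for key, (count, seconds) in CALL_STATS.items():
    print(f"{key}: {count} calls, {seconds * 1000:.3f} ms total")
```

To apply this to einops.rearrange, the wrapper would have to replace the name in every module that did `from einops import rearrange`, since each such module holds its own reference. The timer measures host-side wall-clock time only, so it captures Python overhead rather than GPU kernel time.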

Checklist:

"""
b, n, _ = t.shape # Get the batch size (b) and sequence length (n)
d = t.shape[2] // h # Determine the depth per head
return t.reshape(b, n, h, d)

t.reshape(b, n, h, -1) should achieve the same result without having to calculate d explicitly.
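
A quick check of this suggestion, as a sketch: the two helper names and the tensor sizes below are made up for illustration, and `h` is passed as an argument here rather than taken from the enclosing scope as in the PR.

```python
import torch


def reshape_explicit(t: torch.Tensor, h: int) -> torch.Tensor:
    # Mirrors the code under review: compute the per-head depth d by hand.
    b, n, _ = t.shape
    d = t.shape[2] // h
    return t.reshape(b, n, h, d)


def reshape_inferred(t: torch.Tensor, h: int) -> torch.Tensor:
    # Reviewer's variant: -1 lets torch infer the last dimension.
    b, n, _ = t.shape
    return t.reshape(b, n, h, -1)


# Illustrative sizes: batch 2, sequence length 3, 8 channels, 4 heads.
t = torch.arange(2 * 3 * 8, dtype=torch.float32).reshape(2, 3, 8)
explicit = reshape_explicit(t, h=4)
inferred = reshape_inferred(t, h=4)

print(explicit.shape, inferred.shape)
print(torch.equal(explicit, inferred))
```

Both variants produce a tensor of shape (b, n, h, d) with identical contents whenever the channel dimension is divisible by h.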


q = _reshape(q_in)
k = _reshape(k_in)
v = _reshape(v_in)

This can be done in 1 line with q, k, v = (_reshape(t) for t in (q_in, k_in, v_in))
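
The one-liner works because Python can unpack a generator expression directly into three names. A minimal sketch, assuming a head count of 4 and made-up input shapes; `_reshape` is redefined here only to keep the example self-contained:

```python
import torch

h = 4  # hypothetical number of attention heads


def _reshape(t: torch.Tensor) -> torch.Tensor:
    # Split the channel dimension into (heads, depth per head).
    b, n, _ = t.shape
    return t.reshape(b, n, h, -1)


# Illustrative inputs: the query and the key/value sequences may differ in length.
q_in = torch.randn(2, 5, 16)
k_in = torch.randn(2, 7, 16)
v_in = torch.randn(2, 7, 16)

# The generator yields exactly three tensors, which unpack into q, k, v.
q, k, v = (_reshape(t) for t in (q_in, k_in, v_in))

print(q.shape, k.shape, v.shape)
```

The result is identical to the three separate assignments. The choice is mostly stylistic.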

@AUTOMATIC1111 AUTOMATIC1111 merged commit de7f5cd into AUTOMATIC1111:dev Jun 8, 2024
lllyasviel added a commit to lllyasviel/stable-diffusion-webui-forge that referenced this pull request Aug 8, 2024
wkpark added a commit to wkpark/stable-diffusion-webui that referenced this pull request Sep 8, 2024
 * replace rearrange to view AUTOMATIC1111#15804
   * see also lllyasviel/stable-diffusion-webui-forge@79adfa8
 * conditional use torch.rms_norm for torch 2.4
 * fix RMSNorm() for clear: use torch.ones()
ruchej pushed a commit to ruchej/stable-diffusion-webui that referenced this pull request Sep 30, 2024