[Performance] LDM optimization patches #15824

Merged
AUTOMATIC1111 merged 3 commits into AUTOMATIC1111:dev from drhead:patch-4
Jun 8, 2024

@drhead drhead commented May 17, 2024

Description

Change 1: Timestep Embedding Patch

  • Fixes a blocking op in the timestep embedding. It was creating a tensor on CPU and then moving it to GPU, which would force a sync every step.
  • Combined with the other performance PRs (mine and HCL's), Torch's dispatch queue should be completely unblocked (until extensions with similar problems mess it up). This will allow near constant 100% GPU usage.
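To illustrate Change 1, here is a minimal sketch of the two ways of building the sinusoidal frequency table. The function names are hypothetical, and the body is modeled on the usual LDM-style timestep embedding rather than copied from the patch. The only difference between the two versions is where `torch.arange` allocates its result: the first builds it on the CPU and moves it with `.to(...)`, the second passes `device=` so the tensor is created where `timesteps` already lives.

```python
import math

import torch


def timestep_embedding_cpu_then_move(timesteps, dim, max_period=10000):
    # Hypothetical "before" version: frequencies are built on the CPU,
    # then copied to the device that holds `timesteps`.
    half = dim // 2
    freqs = torch.exp(
        -math.log(max_period) * torch.arange(0, half, dtype=torch.float32) / half
    ).to(device=timesteps.device)
    args = timesteps[:, None].float() * freqs[None]
    embedding = torch.cat([torch.cos(args), torch.sin(args)], dim=-1)
    if dim % 2:
        # Pad with a zero column when dim is odd.
        embedding = torch.cat([embedding, torch.zeros_like(embedding[:, :1])], dim=-1)
    return embedding


def timestep_embedding_on_device(timesteps, dim, max_period=10000):
    # Hypothetical "after" version: the frequency tensor is created directly
    # on the device of `timesteps`, so no host-to-device copy is needed.
    half = dim // 2
    freqs = torch.exp(
        -math.log(max_period)
        * torch.arange(0, half, dtype=torch.float32, device=timesteps.device)
        / half
    )
    args = timesteps[:, None].float() * freqs[None]
    embedding = torch.cat([torch.cos(args), torch.sin(args)], dim=-1)
    if dim % 2:
        embedding = torch.cat([embedding, torch.zeros_like(embedding[:, :1])], dim=-1)
    return embedding


# Example inputs; on a CUDA machine, create `timesteps` with device="cuda".
timesteps = torch.tensor([0, 250, 999])
old = timestep_embedding_cpu_then_move(timesteps, 320)
new = timestep_embedding_on_device(timesteps, 320)
```

Both functions should return the same values; only the allocation path differs, which is what matters for keeping the dispatch queue from stalling.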

Change 2: SpatialTransformer.forward einops removal

  • Changes the function to use native torch reshape/view/permute ops and removes the .contiguous() call.
  • Prevents 32 calls to aten::copy_ and void at::native::elementwise_kernel<128, 4, at::nati... per forward pass (SD 1.5). Speedup appears to be around 6-8 ms per forward pass, though my profiler timings are a little inconsistent (512x512, batch 4, overclocked 3090).

@drhead drhead requested a review from AUTOMATIC1111 as a code owner May 17, 2024 16:16

drhead commented May 17, 2024

I think #18620 might need to be merged before tests will pass on this.

w-e-w commented May 17, 2024

so we need to wait 2796 new posts to merge this 🙃

drhead commented May 17, 2024

Upon further review I think it would be sufficient for #15820 to be merged first lol

@drhead drhead changed the title from "Patch timestep embedding to create tensor on-device" to "LDM optimization patches" May 17, 2024

drhead commented May 17, 2024

Added another patch, and it passes tests now.

@drhead drhead changed the title from "LDM optimization patches" to "[Performance] LDM optimization patches" May 21, 2024
@AUTOMATIC1111 AUTOMATIC1111 merged commit 93b53dc into AUTOMATIC1111:dev Jun 8, 2024

3 participants