[Performance 4/6] Precompute is_sdxl_inpaint flag #15806

Merged
AUTOMATIC1111 merged 3 commits into AUTOMATIC1111:dev from huchenlei:inpaint_fix
Jun 8, 2024


@huchenlei
Contributor

Description

According to lllyasviel/stable-diffusion-webui-forge#716 (comment), the check for whether the model is an SDXL inpaint model calls state_dict on every sampling step. state_dict is a very expensive function, costing ~40ms per call. This overhead applies to all inference regardless of model type, which is wasteful.

This PR precomputes the is_sdxl_inpaint flag so that state_dict is no longer called on every sampling step.
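As a rough illustration of the idea (not the actual diff), the sketch below contrasts a per-step check with a flag computed once at load time. `ToyModel`, `load_model`, the two sampling loops, and the call counter are all hypothetical stand-ins. The weight key and the 9-channel test are assumptions about how an inpaint UNet could be recognized, and plain shape tuples are used in place of real tensors so the example runs without torch.

```python
# Assumed, illustrative key for the first UNet input convolution.
INPUT_KEY = "diffusion_model.input_blocks.0.0.weight"
# Assumed channel count for an inpaint model: 4 latent + 4 masked-image latent + 1 mask.
INPAINT_CHANNELS = 9


class ToyModel:
    """Hypothetical stand-in for the model wrapper; stores shapes, not tensors."""

    def __init__(self, in_channels):
        self._shapes = {INPUT_KEY: (320, in_channels, 3, 3)}
        self.state_dict_calls = 0      # counts how often the expensive call runs
        self.is_sdxl_inpaint = False   # filled in once by load_model()

    def state_dict(self):
        # In a real model this walks every parameter, which is what makes it slow.
        self.state_dict_calls += 1
        return dict(self._shapes)


def detect_sdxl_inpaint(model):
    """Expensive check: inspects the state dict for a 9-channel input layer."""
    shape = model.state_dict().get(INPUT_KEY)
    return shape is not None and shape[1] == INPAINT_CHANNELS


def load_model(in_channels):
    """Builds the model and precomputes the flag exactly once."""
    model = ToyModel(in_channels)
    model.is_sdxl_inpaint = detect_sdxl_inpaint(model)
    return model


def sample_per_step_check(model, steps):
    """Old pattern: re-runs the expensive check on every sampling step."""
    concat_count = 0
    for _ in range(steps):
        if detect_sdxl_inpaint(model):
            concat_count += 1  # stands in for concatenating the inpaint conditioning
    return concat_count


def sample_precomputed(model, steps):
    """New pattern: reads the cached boolean; state_dict is never called here."""
    concat_count = 0
    for _ in range(steps):
        if model.is_sdxl_inpaint:
            concat_count += 1
    return concat_count
```

With this sketch, a 20-step run through `sample_precomputed` leaves `state_dict_calls` at 1 (the single call made in `load_model`), while `sample_per_step_check` adds one call per step.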

The original PR that introduced this check: #14390

Screenshots/videos:

image


@huchenlei huchenlei requested a review from AUTOMATIC1111 as a code owner May 15, 2024 20:36
@huchenlei huchenlei changed the title from "Precompute is_sdxl_inpaint flag" to "[Performance 4/6] Precompute is_sdxl_inpaint flag" May 15, 2024
@huchenlei huchenlei changed the base branch from master to dev May 15, 2024 20:50
@Panchovix

Panchovix commented May 16, 2024

Just wanted to comment that all these performance PRs are amazing! I get pretty similar speeds vs Forge on an RTX 4090. (It seems that A1111 with these PRs actually generates a tad faster than Forge, but A1111 takes a bit more time to start generating.)

@huchenlei
Contributor Author

> Just wanted to comment that all these performance PRs are amazing! I get pretty similar speeds vs Forge on an RTX 4090. (It seems that A1111 with these PRs actually generates a tad faster than Forge, but A1111 takes a bit more time to start generating.)

There are 2 more PRs to come, but they are not as straightforward, so they might take longer to prepare. I am also merging all of the performance fixes into https://github.com/huchenlei/stable-diffusion-webui/tree/all_perf so you don't need to apply these PRs one by one.

@AUTOMATIC1111 AUTOMATIC1111 merged commit 6450d24 into AUTOMATIC1111:dev Jun 8, 2024