[Performance 4/6] Precompute is_sdxl_inpaint flag #15806
AUTOMATIC1111 merged 3 commits into AUTOMATIC1111:dev
Conversation
Just wanted to comment that all these performance PRs are amazing! I get pretty similar speeds vs Forge on an RTX 4090. (It seems that A1111 with these PRs actually generates a tad faster than Forge, but A1111 takes a bit more time to start generating.)
There are 2 more PRs to come, but they are not as straightforward, so they might take longer to prepare. I am also merging all the performance fixes into https://github.com/huchenlei/stable-diffusion-webui/tree/all_perf so you don't need to patch these PRs one by one.
Description
According to lllyasviel/stable-diffusion-webui-forge#716 (comment), the check for whether the model is an SDXL inpaint model calls `state_dict` on every sampling step. `state_dict` is a very expensive function that costs ~40ms. This overhead applies to all inference regardless of model type, which is dumb.

This PR precomputes the `is_sdxl_inpaint` flag so that we do not call `state_dict` on every sampling step.

Original PR that introduced this change: #14390
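
To illustrate the idea behind the change, here is a minimal, self-contained sketch. It is not the webui's actual code: `ToyModel`, the weight key `input_blocks.0.0.weight`, the "9 input channels means inpaint" rule, and both sampling functions are hypothetical stand-ins chosen for illustration. The toy `state_dict` just counts how often it is called, so the difference between checking on every step and reading a precomputed flag is visible.

```python
class ToyModel:
    """Hypothetical stand-in for a loaded model; not the webui's class."""

    def __init__(self, in_channels):
        # Assumed layout: shape of the first conv weight as a plain tuple.
        self._weights = {"input_blocks.0.0.weight": (320, in_channels, 3, 3)}
        self.state_dict_calls = 0
        # Precompute the flag once, at load time.
        self.is_sdxl_inpaint = self._detect_inpaint()

    def state_dict(self):
        # Stands in for the expensive call; here it only counts invocations.
        self.state_dict_calls += 1
        return dict(self._weights)

    def _detect_inpaint(self):
        # Assumption for this sketch: an inpaint model has 9 input channels.
        return self.state_dict()["input_blocks.0.0.weight"][1] == 9


def sample_checking_every_step(model, steps):
    """Old pattern: call state_dict() inside the sampling loop."""
    inpaint_steps = 0
    for _ in range(steps):
        if model.state_dict()["input_blocks.0.0.weight"][1] == 9:
            inpaint_steps += 1
    return inpaint_steps


def sample_with_precomputed_flag(model, steps):
    """New pattern: read the flag that was computed once at load time."""
    inpaint_steps = 0
    for _ in range(steps):
        if model.is_sdxl_inpaint:
            inpaint_steps += 1
    return inpaint_steps
```

With 20 sampling steps, the first function triggers 20 extra `state_dict` calls while the second triggers none; both return the same result. The precomputed flag is only valid as long as the weights it was derived from do not change, so in a real setup it would need to be recomputed whenever a different model is loaded.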
Screenshots/videos:
Checklist: