vulkan: set all memory allocations to high priority#17624
0cc4m merged 2 commits into ggml-org:master
Even if it does, I don't think it's always preferable to keep models in VRAM, over other data. So at the very least we'd have to make it configurable. @netrunnereve do you know if the extension makes a difference for RADV memory eviction behaviour? Maybe it's a way to keep models loaded without the RADV-side flag.
I'm pretty sure RADV doesn't care about this extension and just handles memory like it usually does. By that I mean that it tries to put stuff in VRAM if possible (note that doesn't always happen), and if memory gets swapped out of VRAM it doesn't know how to bring it back. The only way to deal with this is to use my nogttspill flag, and #17605 is literally a textbook example of why we need it. I suppose we could ask Mesa whether they're interested in supporting VK_EXT_memory_priority, though, and basically have it disable GTT allocations if the priority is set high enough.
I'm not too surprised this didn't help, and am OK with abandoning it. This does get hooked up to WDDM priorities on Windows, and it might help more there.
Personally I don't mind this as long as it's optional and disabled by default. Like you said, there are other systems which can probably make use of this.
Since you already implemented it, let's just keep it disabled by default and enable it with an environment variable.
Added the env var. |
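For readers following along, the gating agreed on above could look roughly like the sketch below. It is not the code from this PR: the environment variable name `EXAMPLE_VK_HIGH_MEMORY_PRIORITY` and the helper `requested_memory_priority` are hypothetical names made up for illustration. The sketch deliberately avoids the Vulkan headers and only decides which priority value to request. In real code that value would presumably be written to `VkMemoryPriorityAllocateInfoEXT::priority` and chained into `VkMemoryAllocateInfo::pNext`, and only when the device reports support for VK_EXT_memory_priority.

```cpp
#include <cassert>
#include <cstdlib>
#include <optional>

// Hypothetical variable name, used only in this sketch.
constexpr const char * kPriorityEnvVar = "EXAMPLE_VK_HIGH_MEMORY_PRIORITY";

// Returns the priority to request for every allocation, or std::nullopt
// when the feature is off. Off is the default, as discussed above.
inline std::optional<float> requested_memory_priority() {
    // Assumption: any value, even an empty string, enables the feature.
    if (std::getenv(kPriorityEnvVar) == nullptr) {
        return std::nullopt;
    }

    // VK_EXT_memory_priority takes priorities in [0.0, 1.0]; allocations
    // that do not specify one are treated as 0.5.
    const float priority = 1.0f;
    assert(priority >= 0.0f && priority <= 1.0f);
    return priority;
}
```

A caller would check the returned optional once per allocation and skip the priority struct entirely when it is empty, so the default behaviour stays unchanged.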
* vulkan: set all memory allocations to high priority
* gate by env var
For #17605, though I'm not sure whether it'll help.