vulkan: record actual memory properties during buffer creation#24326
vulkan: record actual memory properties during buffer creation#24326winstonma wants to merge 1 commit into
Conversation
Seems more like this change leads to data corruption on Intel. @rillomas another buffer issue, can you look into what is going on with the Intel Linux CI? Lots of failed tests. Intel Windows passed. |
So far I can only reproduce any failures using 37b1516
Following environment did not reproduce:
I originally thought it was a driver issue but possibly something specific to LunarLake? Kind of unlikely though.
Sample failure log on LunarLake |
|
Applying the following change (taken from #23770) fixes the mismatch on LNL diff --git a/ggml/src/ggml-vulkan/ggml-vulkan.cpp b/ggml/src/ggml-vulkan/ggml-vulkan.cpp
index 23ab2246f..ccbd131c0 100644
--- a/ggml/src/ggml-vulkan/ggml-vulkan.cpp
+++ b/ggml/src/ggml-vulkan/ggml-vulkan.cpp
@@ -7619,6 +7619,23 @@ static void ggml_vk_buffer_read_2d(vk_buffer& src, size_t offset, void * dst, si
if(src->memory_property_flags & vk::MemoryPropertyFlagBits::eHostVisible && src->device->uma) {
GGML_ASSERT(src->memory_property_flags & vk::MemoryPropertyFlagBits::eHostCoherent);
+ std::lock_guard<std::recursive_mutex> guard(src->device->mutex);
+ vk_context subctx = ggml_vk_create_temporary_context(src->device->compute_queue.cmd_pool);
+ ggml_vk_ctx_begin(src->device, subctx);
+ subctx->s->buffer->buf.pipelineBarrier(
+ vk::PipelineStageFlagBits::eComputeShader | vk::PipelineStageFlagBits::eTransfer,
+ vk::PipelineStageFlagBits::eHost,
+ {},
+ { { vk::AccessFlagBits::eShaderWrite | vk::AccessFlagBits::eTransferWrite,
+ vk::AccessFlagBits::eHostRead } },
+ {}, {});
+ ggml_vk_ctx_end(subctx);
+ ggml_vk_submit(subctx, src->device->fence);
+ VK_CHECK(src->device->device.waitForFences({ src->device->fence }, true, UINT64_MAX),
+ "vk_buffer_read_2d uma waitForFences");
+ src->device->device.resetFences({ src->device->fence });
+ ggml_vk_queue_command_pools_cleanup(src->device);
+
for (size_t i = 0; i < height; i++) {
memcpy((uint8_t *) dst + i * dpitch, (const uint8_t *) src->ptr + offset + i * spitch, width);
} |
|
Oh, we totally forgot about #23770, right. |
|
Please rebase now that #23770 is merged, to see if this fixes the Intel CI. |
37b1516 to
b26ea2b
Compare
|
Done |
Overview
Before the patch, memory_property_flags was being set to the requested flags rather than what was actually allocated. This PR sets memory_property_flags from what the driver actually gave.
It ensures that subsequent operations (like
ggml_vk_buffer_write_2dorggml_vk_buffer_read_2d) can correctly determine if the memory isHostVisibleorHostCoherent. Without this change, if the driver provide cached memory when visible memory is being requested, the program wouldn't know it was cached and would skip the necessary cache flushing, leading to data corruption.Additional information
Requirements