wgpu-core's submission tracking goes to great pains to avoid refcount traffic for resources used by commands in flight, but those efforts are ineffective, and the traffic occurs anyway. Since the complexity fails to deliver, it should be removed.
In more detail:
Our backend APIs require that resources (buffers, textures, etc.) used by commands in flight not be freed until the commands have finished execution. (Quite reasonable.)
One of the striking things about wgpu_core::device::life::ActiveSubmission is that it does not hold strong references to the resources used by the commands it represents. Rather, the ActiveSubmission::last_resources set is initially empty (ignore Device::pending_writes for now), and resources are only added to it if they are destroyed while their most recent queue submission is still in flight, as a way of deferring their actual destruction until after the command has finished.
As far as I understand, it's done this way because it's rare in practice for applications to free resources while they are still in use by the GPU, and thus any time spent fiddling with ownership on each queue submission is almost always wasted. With the approach above, in the common case, most ActiveSubmission::last_resources sets remain empty throughout their lifetime, and no resources need to be touched on completion.
However, this strategy is ineffective.
Compare this approach to a simpler one where ActiveSubmission simply holds strong references to the resources its commands use. Queue submission would move strong references from CommandBuffers and PendingWrites into the new ActiveSubmission. When execution is complete, those strong references would be dropped, and any unreferenced resources would be freed.
First, note that, in the best case where nothing is actually freed, the only difference between the two approaches is the amount of refcount traffic at command completion. Dropping strong references to resources that are still alive is a pointer dereference and an atomic decrement.
Second, note that wgpu_core::command::CommandBuffer holds strong references to the resources its commands use. On queue submission, these strong references are packed into a BakedCommands value, which is then dropped. This incurs exactly the same pointer dereference and refcount traffic, on every resource directly used by the command buffer, that would occur at execution completion if ActiveSubmission held strong references to all resources. The only effect of the current approach is to move the pointer dereferences and refcount fiddling from queue draining time to submission time.
Moving the traffic from later to earlier doesn't seem worth the complexity required to support it.
Instead, queue submission should move the owning references from the command buffers and pending writes into the ActiveSubmission, along with the encoders and command buffers. This will move the refcount traffic from queue submission time to queue draining time. We can then remove the delicate and complex code that populates ActiveSubmission::last_resources. LifetimeTracker::triage_suspected and everything that supports it could go away, replaced by idiomatic use of Arc.
cc @kvark
wgpu-core's submission tracking goes to great pains to avoid refcount traffic for resources used by commands in flight, but those efforts are ineffective, and the traffic occurs anyway. Since the complexity fails to deliver, it should be removed.In more detail:
Our backend APIs require that resources (buffers, textures, etc.) used by commands in flight not be freed until the commands have finished execution. (Quite reasonable.)
One of the striking things about
wgpu_core::device::life::ActiveSubmissionis that it does not hold strong references to the resources used by the commands it represents. Rather, theActiveSubmission::last_resourcesset is initially empty (ignoreDevice::pending_writesfor now), and resources are only added to it if they are destroyed while their most recent queue submission is still in flight, as a way of deferring their actual destruction until after the command has finished.As far as I understand, it's done this way because it's rare in practice for applications to free resources while they are still in use by the GPU, and thus any time spent fiddling with ownership on each queue submission is almost always wasted. With the approach above, in the common case, most
ActiveSubmission::last_resourcessets remain empty throughout their lifetime, and no resources need to be touched on completion.However, this strategy is ineffective.
Compare this approach to a simpler one where
ActiveSubmissionsimply holds strong references to the resources its commands use. Queue submission would move strong references fromCommandBuffers andPendingWritesinto the newActiveSubmission. When execution is complete, those strong references would be dropped, and any unreferenced resources would be freed.First, note that, in the best case where nothing is actually freed, the only difference between the two approaches is the amount of refcount traffic at command completion. Dropping strong references to resources that are still alive is a pointer dereference and an atomic decrement.
Second, note that
wgpu_core::command::CommandBufferholds strong references to the resources its commands use. On queue submission, these strong references are packed into aBakedCommandsvalue, which is then dropped. This incurs exactly the same pointer dereference and refcount traffic, on every resource directly used by the command buffer, that would occur at execution completion ifActiveSubmissionheld strong references to all resources. The only effect of the current approach is to move the pointer dereferences and refcount fiddling from queue draining time to submission time.Moving the traffic from later to earlier doesn't seem worth the complexity required to support it.
Instead, queue submission should move the owning references from the command buffers and pending writes into the
ActiveSubmission, along with the encoders and command buffers. This will move the refcount traffic from queue submission time to queue draining time. We can then remove the delicate and complex code that populatesActiveSubmission::last_resources.LifetimeTracker::triage_suspectedand everything that supports it could go away, replaced by idiomatic use ofArc.cc @kvark