Skip to content

Split out the IndirectParametersMetadata into CPU-populated and GPU-populated buffers.#17863

Merged
superdump merged 3 commits intobevyengine:mainfrom
pcwalton:split-indirect-parameters-metadata
Feb 18, 2025
Merged

Split out the IndirectParametersMetadata into CPU-populated and GPU-populated buffers.#17863
superdump merged 3 commits intobevyengine:mainfrom
pcwalton:split-indirect-parameters-metadata

Conversation

@pcwalton
Copy link
Copy Markdown
Contributor

The GPU can fill out many of the fields in IndirectParametersMetadata using information it already has:

  • early_instance_count and late_instance_count are always initialized to zero.

  • mesh_index is already present in the work item buffer as the input_index of the first work item in each batch.

This patch moves these fields to a separate buffer, the GPU indirect parameters metadata buffer. That way, it avoids having to write them on CPU during batch_and_prepare_binned_render_phase. This effectively reduces the number of bits that that function must write per mesh from 160 to 64 (in addition to the 64 bits per mesh instance).

Additionally, this PR refactors UntypedPhaseIndirectParametersBuffers to add another layer, MeshClassIndirectParametersBuffers, which allows abstracting over the buffers corresponding indexed and non-indexed meshes. This patch doesn't make much use of this abstraction, but forthcoming patches will, and it's overall a cleaner approach.

This didn't seem to have much of an effect by itself on batch_and_prepare_binned_render_phase time, but subsequent PRs dependent on this PR yield roughly a 2× speedup.

@pcwalton pcwalton added A-Rendering Drawing game state to the screen C-Performance A change motivated by improving speed, memory usage or compile times C-Code-Quality A section of code that is hard to understand or change S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Feb 14, 2025
GPU-populated buffers.

The GPU can fill out many of the fields in `IndirectParametersMetadata`
using information it already has:

* `early_instance_count` and `late_instance_count` are always
  initialized to zero.

* `mesh_index` is already present in the work item buffer as the
  `input_index` of the first work item in each batch.

This patch moves these fields to a separate buffer, the *GPU indirect
parameters metadata* buffer. That way, it avoids having to write them on
CPU during `batch_and_prepare_binned_render_phase`. This effectively
reduces the number of bits that that function must write per mesh from
160 to 64 (in addition to the 64 bits per mesh *instance*).

Additionally, this PR refactors `UntypedPhaseIndirectParametersBuffers`
to add another layer, `MeshClassIndirectParametersBuffers`, which allows
abstracting over the buffers corresponding indexed and non-indexed
meshes. This patch doesn't make much use of this abstraction, but
forthcoming patches will, and it's overall a cleaner approach.

This didn't seem to have much of an effect by itself on
`batch_and_prepare_binned_render_phase` time, but subsequent PRs
dependent on this PR yield roughly a 2× speedup.
@pcwalton pcwalton force-pushed the split-indirect-parameters-metadata branch from 780ad40 to bf6d8f8 Compare February 14, 2025 18:48
Copy link
Copy Markdown
Member

@tychedelia tychedelia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm! Tested on macOS.

@pcwalton pcwalton added this to the 0.16 milestone Feb 17, 2025
@pcwalton pcwalton requested a review from superdump February 17, 2025 21:27
/// The buffers containing all the information that indirect draw commands use
/// to draw the scene, for a single mesh class (indexed or non-indexed), for a
/// single phase.
pub struct MeshClassIndirectParametersBuffers<IP>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I'm not super keen on the word 'class', I don't have a better suggestion. I like how much gets ripped out of this.

@superdump superdump added this pull request to the merge queue Feb 18, 2025
Merged via the queue into bevyengine:main with commit 8f36106 Feb 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-Rendering Drawing game state to the screen C-Code-Quality A section of code that is hard to understand or change C-Performance A change motivated by improving speed, memory usage or compile times S-Needs-Review Needs reviewer attention (from anyone!) to move forward

Projects

No open projects
Status: Done

Development

Successfully merging this pull request may close these issues.

3 participants