Fix motion vector computation after #17688. by pcwalton · Pull Request #17717 · bevyengine/bevy

pcwalton · 2025-02-07T03:03:35Z

PR #17688 broke motion vector computation, and therefore motion blur, because it enabled retention of `MeshInputUniform`s, and `MeshInputUniform`s contain the indices of the previous frame's transform and the previous frame's skinned mesh joint matrices. On frame N, if a `MeshInputUniform` is retained on GPU from the previous frame, the `previous_input_index` and `previous_skin_index` would refer to the indices for frame N - 2, not the indices for frame N - 1.

This patch fixes both problems in two different ways, one for transforms and one for skins:

1. To fix transforms, this patch supplies the *frame index* to the shader as part of the view uniforms, and specifies which frame index each mesh's previous transform refers to. So, in the situation described above, the frame index would be N, the previous frame index would be N - 1, and the `previous_input_frame_number` would be N - 2. The shader can now detect this situation and infer that the mesh has been retained, and can therefore conclude that the mesh's transform hasn't changed.
2. To fix skins, this patch replaces the explicit `previous_skin_index` with an invariant that the index of the joints for the current frame and the index of the joints for the previous frame are the same. This means that the `MeshInputUniform` never has to be updated even if the skin is animated. The downside is that we have to copy joint matrices from the previous frame's buffer to the current frame's buffer in `extract_skins`.
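The transform check in (1) can be sketched on the CPU side as follows. This is only an illustration of the logic as described, not the shader code from the patch: `MeshInput`, `current_transform_index`, and `previous_transform_index` are hypothetical names, and only `previous_input_index` and `previous_input_frame_number` come from the description.

```rust
/// Hypothetical, simplified stand-in for the per-mesh data discussed above.
#[derive(Clone, Copy, Debug, PartialEq)]
struct MeshInput {
    /// Index of the transform for the current frame (assumed field).
    current_transform_index: u32,
    /// Index of the transform recorded as "previous" when the mesh was last extracted.
    previous_input_index: u32,
    /// Frame to which `previous_input_index` refers.
    previous_input_frame_number: u32,
}

/// Returns the index of the transform to treat as the previous frame's transform.
fn previous_transform_index(mesh: &MeshInput, frame_index: u32) -> u32 {
    // Frame counters are assumed to wrap, so use wrapping arithmetic.
    let previous_frame_index = frame_index.wrapping_sub(1);
    if mesh.previous_input_frame_number == previous_frame_index {
        // The mesh was re-extracted this frame, so the stored index is fresh.
        mesh.previous_input_index
    } else {
        // The mesh was retained, so its transform has not changed:
        // the previous transform is the same as the current one.
        mesh.current_transform_index
    }
}
```

With frame index N = 10, a mesh whose `previous_input_frame_number` is 9 uses its stored previous index, while a retained mesh stamped with 8 falls back to its current transform, which yields a zero motion vector for that mesh.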

The rationale behind (2) is that we currently have no mechanism to detect when joints that affect a skin have been updated, short of comparing all the transforms and setting a flag for `extract_meshes_for_gpu_building` to consume, which would regress performance as we want `extract_skins` and `extract_meshes_for_gpu_building` to be able to run in parallel.
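The same-index invariant in (2) can be illustrated with a small CPU-side model. This is a sketch under assumptions, not the code in `extract_skins`: the real buffers live on the GPU, and `SkinBuffers`, `begin_frame`, `write_skin`, and `joint_pair` are hypothetical names.

```rust
type Mat4 = [[f32; 4]; 4];
const ZERO: Mat4 = [[0.0; 4]; 4];

/// Hypothetical CPU-side model of the current and previous joint buffers.
struct SkinBuffers {
    current: Vec<Mat4>,
    previous: Vec<Mat4>,
}

impl SkinBuffers {
    fn new(joint_count: usize) -> Self {
        Self { current: vec![ZERO; joint_count], previous: vec![ZERO; joint_count] }
    }

    /// Starts a frame: last frame's joints become `previous` and are also copied
    /// into `current`, so every joint keeps the same index in both buffers.
    fn begin_frame(&mut self) {
        std::mem::swap(&mut self.current, &mut self.previous);
        self.current.clone_from(&self.previous);
    }

    /// Overwrites the joints of one animated skin in the current buffer.
    fn write_skin(&mut self, first_joint: usize, joints: &[Mat4]) {
        self.current[first_joint..first_joint + joints.len()].copy_from_slice(joints);
    }

    /// Current and previous matrices for a joint, looked up with one shared index.
    fn joint_pair(&self, index: usize) -> (Mat4, Mat4) {
        (self.current[index], self.previous[index])
    }
}

/// Helper for examples: a matrix with every element set to `v`.
fn uniform(v: f32) -> Mat4 {
    [[v; 4]; 4]
}
```

Because a single index addresses both buffers, nothing stored per mesh has to change when a skin animates; the cost is the copy performed in `begin_frame`, which corresponds to the downside noted in (2).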

To test this change, use `cargo run --example motion_blur`.

tychedelia

Makes sense. Tested on macOS.

aevyrie · 2025-02-08T21:43:51Z

Motion blur seems to be broken, not sure if introduced here though

The car is not moving relative to the camera, but it is blurred. For comparison, this is what it should look like (from 0.15):

After poking for a bit, it looks like motion vectors are just the camera motion?

Getting rid of `previous_input_is_valid` makes the output more correct: the car is not blurred, but meshes like trees are super blurred and flicker.

Okay, even more interesting. If I get rid of `previous_input_is_valid`, the first few moments look correct, with everything blurred as expected, until more trees come into view, at which point they become massively over-blurred. This almost seems like an indexing error?

Ex: everything is correct except the trees in the distance on the left:

Then, once the car rounds the corner and a bunch of meshes come into view, all the newly visible/disoccluded meshes have broken motion vectors:

Huh, it also appears that `previous_input_frame_count > view.frame_count` always returns true, which seems very wrong?

aevyrie

Issues appear to be resolved in latest commit.

superdump · 2025-02-11T09:01:48Z

crates/bevy_pbr/src/render/mesh.rs

    /// Low 16 bits: index of the material inside the bind group data.
    /// High 16 bits: index of the lightmap in the binding array.
    pub material_and_lightmap_bind_group_slot: u32,
+    pub timestamp: u32,


Why call it `timestamp` if it's a frame count?

Maybe `frame_count_when_updated` or something like that? That seems to be the semantics of it.
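As background for the diff context above, the doc comment on `material_and_lightmap_bind_group_slot` describes two 16-bit values packed into one `u32`. A minimal sketch of that layout follows; `pack_slots` and `unpack_slots` are hypothetical helper names, not functions from the patch.

```rust
/// Packs the material slot into the low 16 bits and the lightmap slot
/// into the high 16 bits, matching the layout in the doc comment.
fn pack_slots(material_slot: u16, lightmap_slot: u16) -> u32 {
    u32::from(material_slot) | (u32::from(lightmap_slot) << 16)
}

/// Splits a packed value back into `(material_slot, lightmap_slot)`.
fn unpack_slots(packed: u32) -> (u16, u16) {
    ((packed & 0xFFFF) as u16, (packed >> 16) as u16)
}
```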

superdump · 2025-02-11T09:03:08Z

crates/bevy_pbr/src/render/skin.rs

    }
 }
+
+pub fn mark_meshes_as_changed_if_their_skins_changed(


For consistency:

Suggested change

-pub fn mark_meshes_as_changed_if_their_skins_changed(
+pub fn mark_3d_meshes_as_changed_if_their_skins_changed(

It would be nice to port all this to 2D as well... :)

superdump

Just a minor variable naming thing for clarity. Otherwise LGTM.

Currently, Bevy rebuilds the buffer containing all the transforms for joints every frame, during the extraction phase. This is inefficient in cases in which many skins are present in the scene and their joints don't move, such as the Caldera test scene. To address this problem, this commit switches skin extraction to use a set of retained GPU buffers with allocations managed by the offset allocator.

I use fine-grained change detection in order to determine which skins need updating. Note that the granularity is on the level of an entire skin, not individual joints. Using change detection at the level of individual joints would yield poor performance in common cases in which an entire skin is animated at once. Also, this patch yields additional performance from the fact that changing joint transforms no longer requires the skinned mesh to be re-extracted.

Note that this optimization can be a double-edged sword. In `many_foxes`, fine-grained change detection regressed the performance of `extract_skins` by 3.4x. This is because every joint is updated every frame in that example, so change detection is pointless and is pure overhead. Because the `many_foxes` workload is actually representative of animated scenes, this patch includes a heuristic that disables fine-grained change detection if the number of transformed entities in the frame exceeds a certain fraction of the total number of joints. Currently, this threshold is set to 25%. Note that this is a crude heuristic, because it doesn't distinguish between the number of transformed *joints* and the number of transformed *entities*; however, it should be good enough to yield the optimum code path most of the time.

Finally, this patch fixes a bug whereby skinned meshes are actually being incorrectly retained if the buffer offsets of the joints of those skinned meshes change from frame to frame. To fix this without retaining skins, we would have to re-extract every skinned mesh every frame. Doing this was a significant regression on Caldera. With this PR, by contrast, mesh joints stay at the same buffer offset, so we don't have to update the `MeshInputUniform` containing the buffer offset every frame. This also makes PR #17717 easier to implement, because that PR uses the buffer offset from the previous frame, and the logic for calculating that is simplified if the previous frame's buffer offset is guaranteed to be identical to that of the current frame.

On Caldera, this patch reduces the time spent in `extract_skins` from 1.79 ms to near zero. On `many_foxes`, this patch regresses the performance of `extract_skins` by approximately 10%-25%, depending on the number of foxes. This has only a small impact on frame rate.
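The 25% heuristic described above could look roughly like the sketch below. It assumes the comparison is a plain ratio of the two counts; `CHANGE_DETECTION_THRESHOLD` and `use_fine_grained_change_detection` are hypothetical names, and the actual code may compute or name things differently.

```rust
/// Fraction of joints above which change detection is assumed not to pay off
/// (the 25% figure comes from the description; the constant name is hypothetical).
const CHANGE_DETECTION_THRESHOLD: f64 = 0.25;

/// Returns `true` if per-skin change detection should be used this frame, and
/// `false` if the number of transformed entities exceeds the threshold fraction
/// of the total joint count, in which case every skin is simply re-extracted.
fn use_fine_grained_change_detection(transformed_entities: usize, total_joints: usize) -> bool {
    (transformed_entities as f64) <= (total_joints as f64) * CHANGE_DETECTION_THRESHOLD
}
```

In a mostly static scene such as Caldera, few entities move relative to the joint count, so the fine-grained path is chosen; in a `many_foxes`-style scene where nearly everything moves, the check fails and the change-detection overhead is skipped.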

pcwalton · 2025-02-12T06:02:50Z

I'd like #17818 to land first as that will simplify things.

pcwalton · 2025-02-18T08:59:00Z

This should be ready to go assuming CI is green.

pcwalton requested review from IceSentry, JMS55, aevyrie and tychedelia February 7, 2025 03:03

pcwalton added the C-Bug, P-Regression, A-Rendering, and S-Needs-Review labels Feb 7, 2025

tychedelia approved these changes Feb 7, 2025

pcwalton added 2 commits February 7, 2025 15:17

Merge remote-tracking branch 'origin/main' into count-frames-for-motion-blur

bae2d3b

Fix meshlets

636e58b

IceSentry reviewed Feb 8, 2025

crates/bevy_render/src/view/mod.rs (outdated, resolved)

Remove FrameNumber in favor of FrameCount

fa60b0a

pcwalton mentioned this pull request Feb 8, 2025

Retain bins from frame to frame. #17698

Merged

alice-i-cecile added this to the 0.16 milestone Feb 10, 2025

pcwalton added 2 commits February 10, 2025 23:14

Merge remote-tracking branch 'origin/main' into count-frames-for-motion-blur

f202a12

Try to fix motion blur properly.

ea2029d

aevyrie approved these changes Feb 11, 2025

superdump reviewed Feb 11, 2025

superdump approved these changes Feb 11, 2025

pcwalton added 3 commits February 11, 2025 11:26

Merge remote-tracking branch 'origin/main' into count-frames-for-motion-blur

9506251

wip

b21cf8c

pcwalton mentioned this pull request Feb 12, 2025

Retain skins from frame to frame. #17818

Merged

pcwalton added the S-Blocked This cannot move forward until something else changes label Feb 12, 2025

Fix meshlets

3b9c765

pcwalton added 5 commits February 11, 2025 23:46

Fix ambiguity

d58aae4

Make SkinByteOffset not derive Component

f043ffb

Leave enough space at the end of the buffer if we use a uniform buffer.

c048c79

Merge remote-tracking branch 'origin/main' into retain-skins

c55ce3f

Merge branch 'retain-skins' into count-frames-for-motion-blur

35a53c2

pcwalton added 3 commits February 17, 2025 21:50

Merge remote-tracking branch 'origin/main' into count-frames-for-motion-blur

354538d

Warning police

124ac32

Merge remote-tracking branch 'origin/main' into count-frames-for-motion-blur

3308b93

superdump approved these changes Feb 18, 2025

superdump added this pull request to the merge queue Feb 18, 2025

Merged via the queue into bevyengine:main with commit 0517b96 Feb 18, 2025

superdump merged 18 commits into bevyengine:main from pcwalton:count-frames-for-motion-blur
