Batch skinned meshes on platforms where storage buffers are available. by pcwalton · Pull Request #16599 · bevyengine/bevy

pcwalton · 2024-12-02T04:56:39Z

This commit makes skinned meshes batchable on platforms other than WebGL 2. On supported platforms, it replaces the two uniform buffers used for joint matrices with a pair of storage buffers containing all matrices for all skinned meshes packed together. The indices into the buffer are stored in the mesh uniform and mesh input uniform. The GPU mesh preprocessing step copies the indices in if that step is enabled.

On the many_foxes demo, I observed a frame time decrease from 15.470ms to 11.935ms. This is the result of reducing the submit_graph_commands time from an average of 5.45ms to 0.489ms, an 11x speedup in that portion of rendering.

This is what the profile looks like for many_foxes after these changes.

This commit makes skinned meshes batchable on platforms other than WebGL 2. On such platforms, it replaces the two uniform buffers used for joint matrices with a pair of storage buffers containing all matrices for all skinned meshes concatenated together. The indices into the buffer are stored in the mesh uniform and mesh input uniform. The GPU mesh preprocessing step copies the indices in if that step is enabled. On the `many_foxes` demo, I observed a frame time decrease from 15.470ms to 11.935ms. This is the result of reducing the `submit_graph_commands` time from an average of 5.45ms to 0.489ms, an 11x speedup in that portion of rendering.

crates/bevy_pbr/src/render/skinning.wgsl

JMS55 · 2024-12-05T18:06:43Z

I'm not sure if this is working? I ran many_foxes through renderdoc, and I still see hundreds of vkCmdDraws, each with two vkCmdBindDescriptorSets before it. One for StandardMaterial, one for skinned_mesh_bind_group. I would've expected only one vkCmdBindDescriptorSets for skinned_mesh_bind_group, and then since the meshes are the same we don't have to rebind StandardMaterial either, and then you would be able to collapse all the draws down to one draw with instance_count=N. Am I missing why this doesn't seem to reduce commands?

pcwalton · 2024-12-05T21:41:40Z

Hmm, it works for me on my G14 Windows 11, with many_foxes --count=100 (it's a low number because I'm in a debug build and don't want to kill my system). What platform are you on?

JMS55 · 2024-12-05T21:45:28Z

Windows, Intel 155h.

pcwalton · 2024-12-05T22:02:01Z

It works on my M2 mac mini too. Hmm.

JMS55

Works fine on my desktop.

kristoff3r

I'm not super familiar with the skinning system, but the code looks consistent and I tested many_foxes on both Linux+webgl2 and both works.

Elabajaba · 2024-12-10T00:06:41Z

Running into a panic with this on the animated_fox example.

thread 'Compute Task Pool (9)' panicked at crates/bevy_pbr/src/render/mesh.rs:927:57:
index out of bounds: the len is 20 but the index is 20
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Encountered a panic in system `bevy_pbr::render::mesh::collect_meshes_for_gpu_building`!

backtrace

thread 'Compute Task Pool (9)' panicked at crates/bevy_pbr/src/render/mesh.rs:927:57:
index out of bounds: the len is 16 but the index is 21
stack backtrace:
   0: rust_begin_unwind
             at /rustc/90b35a6239c3d8bdabc530a6a0816f7ff89a0aaf/library/std/src/panicking.rs:665:5
   1: core::panicking::panic_fmt
             at /rustc/90b35a6239c3d8bdabc530a6a0816f7ff89a0aaf/library/core/src/panicking.rs:74:14
   2: core::panicking::panic_bounds_check
             at /rustc/90b35a6239c3d8bdabc530a6a0816f7ff89a0aaf/library/core/src/panicking.rs:276:5
   3: <usize as core::slice::index::SliceIndex<[T]>>::index
             at /home/me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/slice/index.rs:301:10
   4: core::slice::index::<impl core::ops::index::Index<I> for [T]>::index
             at /home/me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/slice/index.rs:16:9
   5: <alloc::vec::Vec<T,A> as core::ops::index::Index<I>>::index
             at /home/me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/vec/mod.rs:3323:9
   6: bevy_pbr::render::mesh::RenderMeshInstanceGpuBuilder::update
             at ./crates/bevy_pbr/src/render/mesh.rs:927:57
   7: bevy_pbr::render::mesh::collect_meshes_for_gpu_building
             at ./crates/bevy_pbr/src/render/mesh.rs:1424:21
   8: core::ops::function::FnMut::call_mut
             at /home/me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:166:5
   9: core::ops::function::impls::<impl core::ops::function::FnMut<A> for &mut F>::call_mut
             at /home/me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:294:13
  10: <Func as bevy_ecs::system::function_system::SystemParamFunction<fn(F0,F1,F2,F3,F4,F5) .> Out>>::run::call_inner
             at ./crates/bevy_ecs/src/system/function_system.rs:998:21
  11: <Func as bevy_ecs::system::function_system::SystemParamFunction<fn(F0,F1,F2,F3,F4,F5) .> Out>>::run
             at ./crates/bevy_ecs/src/system/function_system.rs:1001:17
  12: <bevy_ecs::system::function_system::FunctionSystem<Marker,F> as bevy_ecs::system::system::System>::run_unsafe
             at ./crates/bevy_ecs/src/system/function_system.rs:793:19
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
Encountered a panic in system `bevy_pbr::render::mesh::collect_meshes_for_gpu_building`!

bushrat011899 · 2024-12-10T00:08:09Z

Following up from Discord, I tested this branch on an Intel i5-1240P with the following Renderdoc settings:

Capturing once yielded:

With the following statistics:

I believe this is sufficient evidence that the feature works on Intel iGPUs, but I'm still learning this tooling so take with a grain of salt.

Elabajaba · 2024-12-10T00:16:12Z

Looks like motion vectors are also broken with this (tested by adding the MotionVectorPrepass to the camera in many_foxes).

2024-12-10T00:12:26.608557Z ERROR wgpu::backend::wgpu_core: Handling wgpu errors as fatal by default
thread 'Compute Task Pool (8)' panicked at /home/evan/.cargo/registry/src/index.crates.io-6f17d22bba15001f/wgpu-23.0.1/src/backend/wgpu_core.rs:2976:18:
wgpu error: Validation Error

Caused by:
  In RenderPass::end
    In a set_bind_group command
      BindGroup with 'skinned_motion_mesh_bind_group' label 1 expects 0 dynamic offset. However 1 dynamic offset were provided.

backtrace

stack backtrace:
   0: rust_begin_unwind
             at /rustc/90b35a6239c3d8bdabc530a6a0816f7ff89a0aaf/library/std/src/panicking.rs:665:5
   1: core::panicking::panic_fmt
             at /rustc/90b35a6239c3d8bdabc530a6a0816f7ff89a0aaf/library/core/src/panicking.rs:74:14
   2: wgpu::backend::wgpu_core::default_error_handler
             at /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/wgpu-23.0.1/src/backend/wgpu_core.rs:3050:5
   3: wgpu::backend::wgpu_core::ErrorSinkRaw::handle_error
             at /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/wgpu-23.0.1/src/backend/wgpu_core.rs:3034:21
   4: wgpu::backend::wgpu_core::ContextWgpuCore::handle_error_inner
             at /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/wgpu-23.0.1/src/backend/wgpu_core.rs:299:9
   5: wgpu::backend::wgpu_core::ContextWgpuCore::handle_error
             at /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/wgpu-23.0.1/src/backend/wgpu_core.rs:311:9
   6: <wgpu::backend::wgpu_core::ContextWgpuCore as wgpu::context::Context>::render_pass_end
             at /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/wgpu-23.0.1/src/backend/wgpu_core.rs:2976:13
   7: <T as wgpu::context::DynContext>::render_pass_end
             at /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/wgpu-23.0.1/src/context.rs:2682:9
   8: <wgpu::api::render_pass::RenderPassInner as core::ops::drop::Drop>::drop
             at /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/wgpu-23.0.1/src/api/render_pass.rs:15:13
   9: core::ptr::drop_in_place<wgpu::api::render_pass::RenderPassInner>
             at /home/me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:574:1
  10: core::ptr::drop_in_place<wgpu::api::render_pass::RenderPass>
             at /home/me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:574:1
  11: core::ptr::drop_in_place<bevy_render::render_phase::draw_state::TrackedRenderPass>
             at /home/me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:574:1
  12: core::mem::drop
             at /home/me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/mem/mod.rs:943:24
  13: <bevy_core_pipeline::prepass::node::PrepassNode as bevy_render::render_graph::node::ViewNode>::run::{{closure}}
             at ./crates/bevy_core_pipeline/src/prepass/node.rs:172:13
  14: core::ops::function::FnOnce::call_once{{vtable.shim}}
             at /home/me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:250:5
  15: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
             at /home/me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/boxed.rs:2454:9
  16: bevy_render::renderer::RenderContext::finish::{{closure}}::{{closure}}
             at ./crates/bevy_render/src/renderer/mod.rs:511:41
  17: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::future::future::Future>::poll
             at /home/me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/panic/unwind_safe.rs:297:9
  18: <futures_lite::future::CatchUnwind<F> as core::future::future::Future>::poll::{{closure}}
             at /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-lite-2.3.0/src/future.rs:588:42
  19: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at /home/me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/panic/unwind_safe.rs:272:9
  20: std::panicking::try::do_call
             at /home/me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:557:40
  21: __rust_try
  22: std::panicking::try
             at /home/me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:520:19
  23: std::panic::catch_unwind
             at /home/me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:358:14
  24: <futures_lite::future::CatchUnwind<F> as core::future::future::Future>::poll
             at /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-lite-2.3.0/src/future.rs:588:9
  25: async_executor::Executor::spawn_inner::{{closure}}
             at /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-executor-1.13.0/src/lib.rs:250:20
  26: async_task::raw::RawTask<F,T,S,M>::run::{{closure}}
             at /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-task-4.7.1/src/raw.rs:550:21
  27: core::ops::function::FnOnce::call_once
             at /home/me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:250:5
  28: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at /home/me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/panic/unwind_safe.rs:272:9
  29: std::panicking::try::do_call
             at /home/me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:557:40
  30: __rust_try
  31: std::panicking::try
             at /home/me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:520:19
  32: std::panic::catch_unwind
             at /home/me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:358:14
  33: async_task::raw::RawTask<F,T,S,M>::run
             at /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-task-4.7.1/src/raw.rs:549:23
  34: async_task::runnable::Runnable<M>::run
             at /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-task-4.7.1/src/runnable.rs:781:18
  35: async_executor::State::run::{{closure}}::{{closure}}
             at /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-executor-1.13.0/src/lib.rs:741:21
  36: <futures_lite::future::Or<F1,F2> as core::future::future::Future>::poll
             at /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-lite-2.3.0/src/future.rs:449:33
  37: async_executor::State::run::{{closure}}
             at /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-executor-1.13.0/src/lib.rs:748:32
  38: async_executor::Executor::run::{{closure}}
             at /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-executor-1.13.0/src/lib.rs:344:34
  39: futures_lite::future::block_on::{{closure}}
             at /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-lite-2.3.0/src/future.rs:99:19
  40: std::thread::local::LocalKey<T>::try_with
             at /home/me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/local.rs:283:12
  41: std::thread::local::LocalKey<T>::with
             at /home/me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/local.rs:260:9
  42: futures_lite::future::block_on
             at /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-lite-2.3.0/src/future.rs:78:5
  43: bevy_tasks::task_pool::TaskPool::new_internal::{{closure}}::{{closure}}::{{closure}}::{{closure}}
             at ./crates/bevy_tasks/src/task_pool.rs:171:37
  44: std::panicking::try::do_call
             at /home/me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:557:40
  45: __rust_try
  46: std::panicking::try
             at /home/me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:520:19
  47: std::panic::catch_unwind
             at /home/me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:358:14
  48: bevy_tasks::task_pool::TaskPool::new_internal::{{closure}}::{{closure}}::{{closure}}
             at ./crates/bevy_tasks/src/task_pool.rs:165:43
  49: std::thread::local::LocalKey<T>::try_with
             at /home/me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/local.rs:283:12
  50: std::thread::local::LocalKey<T>::with
             at /home/me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/local.rs:260:9
  51: bevy_tasks::task_pool::TaskPool::new_internal::{{closure}}::{{closure}}
             at ./crates/bevy_tasks/src/task_pool.rs:158:25
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
Encountered a panic in system `bevy_render::renderer::render_system`!

pcwalton · 2024-12-10T01:10:12Z

I fixed the motion vectors issue, but was unable to reproduce the animated_fox issue on either debug or release on my Windows laptop or Mac mini.

Elabajaba · 2024-12-10T01:53:28Z

I fixed the motion vectors issue, but was unable to reproduce the animated_fox issue on either debug or release on my Windows laptop or Mac mini.

I can't reproduce the animated_fox crash after the latest main was merged in 🤷

Motion vectors no longer crash but they also don't appear to be working for skinned meshes now.

Motion vector prepass texture once everything is rendered (left main, right this PR):

alice-i-cecile · 2024-12-10T03:00:02Z

This seems fine to merge, but I'm doing a manual example run here to check for regressions. Ping me once that's done / tomorrow if I forget.

pcwalton · 2024-12-10T05:06:23Z

Ok, the skinned mesh motion vector thing should be fixed now.

Elabajaba

Can confirm that motion vectors for skinned meshes appear to be fixed!

pcwalton · 2024-12-10T05:21:23Z

I believe the CI failure is #15981.

pcwalton · 2024-12-10T08:24:52Z

Example run is green.

bevyengine#16599) This commit makes skinned meshes batchable on platforms other than WebGL 2. On supported platforms, it replaces the two uniform buffers used for joint matrices with a pair of storage buffers containing all matrices for all skinned meshes packed together. The indices into the buffer are stored in the mesh uniform and mesh input uniform. The GPU mesh preprocessing step copies the indices in if that step is enabled. On the `many_foxes` demo, I observed a frame time decrease from 15.470ms to 11.935ms. This is the result of reducing the `submit_graph_commands` time from an average of 5.45ms to 0.489ms, an 11x speedup in that portion of rendering. ![Screenshot 2024-12-01 192838](https://github.com/user-attachments/assets/7d2db997-8939-466e-8b9e-050d4a6a78ee) This is what the profile looks like for `many_foxes` after these changes. ![Screenshot 2024-12-01 193026](https://github.com/user-attachments/assets/68983fc3-01b8-41fd-835e-3d93cb65d0fa) --------- Co-authored-by: François Mockers <mockersf@gmail.com>

pcwalton requested a review from IceSentry December 2, 2024 04:57

pcwalton added C-Performance A change motivated by improving speed, memory usage or compile times A-Animation Make things move and change over time A-Rendering Drawing game state to the screen S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Dec 2, 2024

pcwalton added 2 commits December 1, 2024 21:53

Unbreak meshlets

7b76e54

rustfmt police

281724e

JMS55 reviewed Dec 3, 2024

View reviewed changes

crates/bevy_pbr/src/render/skinning.wgsl Outdated Show resolved Hide resolved

pcwalton added 2 commits December 4, 2024 09:22

Address review comment

25584f8

Merge remote-tracking branch 'origin/main' into batch-skinned-meshes

032ac1c

pcwalton requested a review from kristoff3r December 4, 2024 17:28

pcwalton added 3 commits December 4, 2024 11:03

Rustfmt police

8f9a6b1

Merge remote-tracking branch 'origin/main' into batch-skinned-meshes

257cbdb

Try harder to fix meshlets

a3c324e

pcwalton requested a review from JMS55 December 5, 2024 17:00

Merge remote-tracking branch 'origin/main' into batch-skinned-meshes

81ddb04

Merge remote-tracking branch 'origin/main' into batch-skinned-meshes

ae6c377

JMS55 approved these changes Dec 6, 2024

View reviewed changes

kristoff3r approved these changes Dec 6, 2024

View reviewed changes

pcwalton added S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it and removed S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Dec 6, 2024

Merge remote-tracking branch 'origin/main' into batch-skinned-meshes

f25101e

Fix motion vectors

8792d33

pcwalton added 2 commits December 9, 2024 21:05

Fix motion vector computation for skinned meshes harder

c7ae7b3

Merge remote-tracking branch 'origin/main' into batch-skinned-meshes

b7029da

Elabajaba approved these changes Dec 10, 2024

View reviewed changes

Merge branch 'main' into batch-skinned-meshes

537ae19

github-actions bot mentioned this pull request Dec 10, 2024

16599 bevyengine/bevy-example-runner#62

Closed

alice-i-cecile added this pull request to the merge queue Dec 10, 2024

Merged via the queue into bevyengine:main with commit 3188e5a Dec 10, 2024

thomasza92 mentioned this pull request Dec 18, 2025

Batch skinned meshes #11633

Open

Uh oh!

Conversation

pcwalton commented Dec 2, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

JMS55 commented Dec 5, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pcwalton commented Dec 5, 2024

Uh oh!

JMS55 commented Dec 5, 2024

Uh oh!

pcwalton commented Dec 5, 2024

Uh oh!

JMS55 left a comment

Choose a reason for hiding this comment

Uh oh!

kristoff3r left a comment

Choose a reason for hiding this comment

Uh oh!

Elabajaba commented Dec 10, 2024

Uh oh!

bushrat011899 commented Dec 10, 2024

Uh oh!

Elabajaba commented Dec 10, 2024

Uh oh!

pcwalton commented Dec 10, 2024

Uh oh!

Elabajaba commented Dec 10, 2024

Uh oh!

alice-i-cecile commented Dec 10, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pcwalton commented Dec 10, 2024

Uh oh!

Elabajaba left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pcwalton commented Dec 10, 2024

Uh oh!

pcwalton commented Dec 10, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

pcwalton commented Dec 2, 2024 •

edited

Loading

JMS55 commented Dec 5, 2024 •

edited

Loading

alice-i-cecile commented Dec 10, 2024 •

edited

Loading

Elabajaba left a comment •

edited

Loading