Skip to content

Remove render_resource_wrapper#15441

Merged
alice-i-cecile merged 1 commit intobevyengine:mainfrom
bes:remove-erased-render-device
Sep 30, 2024
Merged

Remove render_resource_wrapper#15441
alice-i-cecile merged 1 commit intobevyengine:mainfrom
bes:remove-erased-render-device

Conversation

@bes
Copy link
Copy Markdown
Contributor

@bes bes commented Sep 26, 2024

Objective

  • Remove all uses of render_resource_wrapper.
  • Make it easier to share a wgpu::Device between Bevy and application code.

Solution

Removed the render_resource_wrapper macro.

To improve the RenderCreation:: Manual API, ErasedRenderDevice was replaced by Arc. Unfortunately I had to introduce one more usage of WgpuWrapper which seems like an unwanted constraint on the caller.

Testing

  • Did you test these changes? If so, how?

    • Ran cargo test.
    • Ran a few examples.
    • Used RenderCreation::Manual in my own project
    • Exercised RenderCreation::Automatic through examples
  • Are there any parts that need more testing?

    • No
  • How can other people (reviewers) test your changes? Is there anything specific they need to know?

    • Run examples
    • Use RenderCreation::Manual in their own project

@github-actions
Copy link
Copy Markdown
Contributor

Welcome, new contributor!

Please make sure you've read our contributing guide and we look forward to reviewing your pull request shortly ✨

@bes bes force-pushed the remove-erased-render-device branch from ce23c9b to cea2a10 Compare September 26, 2024 11:15
@bes bes changed the title Remove ErasedRenderDevice Remove render_resource_wrapper Sep 26, 2024
@bes
Copy link
Copy Markdown
Contributor Author

bes commented Sep 26, 2024

Looks like CI / build-wasm-atomics doesn't like this change.

Here's one of many similar errors:

error[E0277]: `(dyn std::any::Any + 'static)` cannot be shared between threads safely
   --> crates/bevy_render/src/view/window/mod.rs:51:40
    |
51  |             render_app.init_resource::<ScreenshotToScreenPipeline>();
    |                        -------------   ^^^^^^^^^^^^^^^^^^^^^^^^^^ `(dyn std::any::Any + 'static)` cannot be shared between threads safely
    |                        |
    |                        required by a bound introduced by this call
    |
    = help: the trait `Sync` is not implemented for `(dyn std::any::Any + 'static)`, which is required by `ScreenshotToScreenPipeline: bevy_ecs::system::Resource`
    = help: the trait `bevy_ecs::system::Resource` is implemented for `ScreenshotToScreenPipeline`
    = note: required for `Unique<(dyn std::any::Any + 'static)>` to implement `Sync`
note: required because it appears within the type `Box<(dyn std::any::Any + 'static)>`
   --> /home/runner/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/boxed.rs:233:12
    |
233 | pub struct Box<
    |            ^^^
note: required because it appears within the type `wgpu::BindGroupLayout`
   --> /home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/wgpu-22.1.0/src/lib.rs:796:12
    |
796 | pub struct BindGroupLayout {
    |            ^^^^^^^^^^^^^^^
    = note: required for `Arc<wgpu::BindGroupLayout>` to implement `std::marker::Send`
note: required because it appears within the type `bind_group_layout::BindGroupLayout`
   --> crates/bevy_render/src/render_resource/bind_group_layout.rs:8:12

Funnily enough, I can't reproduce it on my mac with latest nightly 🤔

cargo +nightly check --target wasm32-unknown-unknown -Z build-std=std,panic_abort

@IQuick143 IQuick143 added the A-Rendering Drawing game state to the screen label Sep 26, 2024
@alice-i-cecile alice-i-cecile added C-Usability A targeted quality-of-life change that makes Bevy easier to use X-Contentious There are nontrivial implications that should be thought through labels Sep 26, 2024
@alice-i-cecile alice-i-cecile added the S-Needs-Review Needs reviewer attention (from anyone!) to move forward label Sep 26, 2024
@tychedelia
Copy link
Copy Markdown
Member

Looks like CI / build-wasm-atomics doesn't like this change.

I think (?) we need to add WgpuWrapper around a bunch of these, which is gross but what I expected from removing the macro.

@bes bes force-pushed the remove-erased-render-device branch from cea2a10 to 1340167 Compare September 26, 2024 17:25
@bes
Copy link
Copy Markdown
Contributor Author

bes commented Sep 26, 2024

It was a bit of an ick, but i wrapped everything in WgpuWrapper now...

@bes bes force-pushed the remove-erased-render-device branch 2 times, most recently from ff6a82a to d25e7ac Compare September 26, 2024 18:15
@bes bes force-pushed the remove-erased-render-device branch from d25e7ac to 78d5671 Compare September 26, 2024 18:52
@robtfm
Copy link
Copy Markdown
Contributor

robtfm commented Sep 26, 2024

since it seems like extra functionality (beyond the compile time optimizations) have been added to the render_resource_wrapper/ErasedXXX types, i'm wondering if we should just remove the debug version and continue to use the release version. or maybe we could wrap up the functionality into a single struct which encapsulates the current macro's release-mode behaviour.

i don't think removing the compile-time optimizations should be resulting in any call-site changes though...

@bes
Copy link
Copy Markdown
Contributor Author

bes commented Sep 26, 2024

since it seems like extra functionality (beyond the compile time optimizations) have been added to the render_resource_wrapper/ErasedXXX types, i'm wondering if we should just remove the debug version and continue to use the release version. or maybe we could wrap up the functionality into a single struct which encapsulates the current macro's release-mode behaviour.

i don't think removing the compile-time optimizations should be resulting in any call-site changes though...

This will work for everything except wgpu::Device which I need to be shared between the application and Bevy. Basically what you're asking for is what was in the PR to begin with, I think?

But I'm wondering what extra functionality you mean? Deref, or the atomics stuff, or both?

@mockersf
Copy link
Copy Markdown
Member

Could you check the compilation duration with this PR? The erased types were introduce for compilation speed, see #5950

@robtfm
Copy link
Copy Markdown
Contributor

robtfm commented Sep 26, 2024

Basically what you're asking for is what was in the PR to begin with, I think?

maybe? sorry i'm not sure what was in the pr to begin with (i looked when it only addressed the RenderDevice but not since then).

what extra functionality you mean? Deref, or the atomics stuff, or both?

both - i think the call sites should all be unchanged. The macro can (i think) be replaced with a struct that encapsulates the extra behaviour that's been added on top of the compile time optimizations (which were only ever in the debug version of the macro).

@bes
Copy link
Copy Markdown
Contributor Author

bes commented Sep 26, 2024

both - i think the call sites should all be unchanged. The macro can (i think) be replaced with a struct that encapsulates the extra behaviour that's been added on top of the compile time optimizations (which were only ever in the debug version of the macro).

Any external user that depends on Bevy and wants to use a shared wgpu::Device would need to use this new Bevy-defined struct instead of WgpuWrapper, which doesn't change much, I think? Except that we can implement Deref for it and it doesn't change the call sites.

Do you mean something like this (just a sketch):

struct BevyArc<T> where T: Send + Sync {
    inner: Arc<send_wrapper::SendWrapper<T>>,
}

impl<T: Send + Sync> Deref for BevyArc<T> { //...

And then remove WgpuWrapper?

@bes
Copy link
Copy Markdown
Contributor Author

bes commented Sep 26, 2024

It'a bit depressing that we can't just use an Arc, but I'm also not familiar with the problem send_wrapper solves for Bevy, so I am keeping an open mind here.

@tychedelia
Copy link
Copy Markdown
Member

It'a bit depressing that we can't just use an Arc, but I'm also not familiar with the problem send_wrapper solves for Bevy, so I am keeping an open mind here.

I'm not familiar with the history here, I also agree it's a bit unfortunate we can't just use the arc'd types as well as wgpu's ids. But there's backstory here I'm sure.

@bes
Copy link
Copy Markdown
Contributor Author

bes commented Sep 27, 2024

I would love some guidance on what to do here. I just wanted a slightly better API for RenderCreation:: Manual 😅

Copy link
Copy Markdown
Contributor

@robtfm robtfm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok sorry, after a closer look i think this will be totally fine.

moving the SurfaceTexture into_inner up a level should avoid the api/callsite changes completely.

then we just need to run compile timings - are you happy to do that or shall i do it?

Copy link
Copy Markdown
Contributor Author

@bes bes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applied the given suggestions. Thanks!

@bes bes force-pushed the remove-erased-render-device branch from 78d5671 to 533fa23 Compare September 27, 2024 18:34
@bes
Copy link
Copy Markdown
Contributor Author

bes commented Sep 27, 2024

then we just need to run compile timings - are you happy to do that or shall i do it?

I can do it if you teach me how to, but I also think that you doing it yourself might lead to better results. If you don't have an M-series mac on hand, I can do it for mac.

Copy link
Copy Markdown
Member

@tychedelia tychedelia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. Approving pending no major regression in compile time.

@robtfm
Copy link
Copy Markdown
Contributor

robtfm commented Sep 27, 2024

It’s just cargo build --timings, check before and after. If the issue persists (or rather is back, since I did check it was gone a few months ago) you’ll see a ~30% slowdown in debug builds. Release would be unaffected since the macro was only type-erasing in debug.

#5950 had a particularly bad userspace example in the top comment that would get like 95% worse as well.

I’ll run windows tomorrow if nobody else does it in the meantime

@bes
Copy link
Copy Markdown
Contributor Author

bes commented Sep 27, 2024

MacBook Pro M2 Max - 32GB RAM

Before my changes - commit 5fcbdc1

$ cargo clean
     Removed 148120 files, 123.9GiB total
$ cargo build --timings
...
Finished `dev` profile [unoptimized + debuginfo] target(s) in 47.31s
image image

@bes
Copy link
Copy Markdown
Contributor Author

bes commented Sep 27, 2024

MacBook Pro M2 Max - 32GB RAM

After my changes - commit 533fa23

$ cargo clean
     Removed 19794 files, 6.0GiB total
$ cargo build --timings
...
Finished `dev` profile [unoptimized + debuginfo] target(s) in 43.66s
image image

Meaning - On my machine, compilation speed decreased by 4 seconds for the debug build (7.7% shorter)

@bes
Copy link
Copy Markdown
Contributor Author

bes commented Sep 27, 2024

For release (cargo build --release --timings) the numbers are:
Before - Finished release profile [optimized] target(s) in 1m 33s
After - Finished release profile [optimized] target(s) in 1m 32s

@bes bes force-pushed the remove-erased-render-device branch from 533fa23 to 807acb0 Compare September 27, 2024 21:19
@robtfm
Copy link
Copy Markdown
Contributor

robtfm commented Sep 27, 2024

Looks like the wasm issue is back. I think that’s because WgpuWrapper derives Deref even though its contents differ by platform. It should probably have a manual impl with target=T in the wasm case, though I don’t know what the knock on effects would be.

@bes
Copy link
Copy Markdown
Contributor Author

bes commented Sep 28, 2024

Looks like the wasm issue is back. I think that’s because WgpuWrapper derives Deref even though its contents differ by platform. It should probably have a manual impl with target=T in the wasm case, though I don’t know what the knock on effects would be.

I think the fix is easy, go back to the coersion i used before: layout: layout.as_ref().map(|layout| -> &PipelineLayout { layout }),. If you think it's unclear, I can add comments above to explain.

As I wrote in the review comment:

Your suggestion doesn't work, since WgpuWrapper has a different type signature between wasm32 and everything else, so your specific number of derefs &** only works for the non-wasm32-version.

My version works reliably for both, because the compiler will insert the correct number of derefs at compile time.

What do you want me to do?

@robtfm
Copy link
Copy Markdown
Contributor

robtfm commented Sep 28, 2024

Yes go ahead, your original way makes sense. Sorry!

This commit removes the render_resource_wrapper macro, which was
added because using Arc was deemed too slow. After discussing
on Discord, it was suggested that we could remove the macro and
just wrap the affected types in Arc instead.
@bes bes force-pushed the remove-erased-render-device branch from 807acb0 to b368a0d Compare September 28, 2024 06:44
@bes
Copy link
Copy Markdown
Contributor Author

bes commented Sep 28, 2024

Yes go ahead, your original way makes sense. Sorry!

Great! Updated! Let me know how to proceed.

Copy link
Copy Markdown
Contributor

@robtfm robtfm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

windows debug build 1m29s main vs 1m26s pr.

lgtm

@tychedelia tychedelia added S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it and removed S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Sep 28, 2024
@alice-i-cecile alice-i-cecile added this pull request to the merge queue Sep 30, 2024
Merged via the queue into bevyengine:main with commit 72aaa41 Sep 30, 2024
robtfm pushed a commit to robtfm/bevy that referenced this pull request Oct 4, 2024
# Objective

* Remove all uses of render_resource_wrapper.
* Make it easier to share a `wgpu::Device` between Bevy and application
code.

## Solution

Removed the `render_resource_wrapper` macro.

To improve the `RenderCreation:: Manual ` API, `ErasedRenderDevice` was
replaced by `Arc`. Unfortunately I had to introduce one more usage of
`WgpuWrapper` which seems like an unwanted constraint on the caller.

## Testing

- Did you test these changes? If so, how?
    - Ran `cargo test`.
    - Ran a few examples.
    - Used `RenderCreation::Manual` in my own project
    - Exercised `RenderCreation::Automatic` through examples

- Are there any parts that need more testing?
    - No

- How can other people (reviewers) test your changes? Is there anything
specific they need to know?
    - Run examples
    - Use `RenderCreation::Manual` in their own project
@cart
Copy link
Copy Markdown
Member

cart commented Mar 12, 2025

Hmm this changeset costs 10 seconds on current main for me on linux (1 min 2 seconds vs 52 seconds). Perhaps this optimization only comes into play on linux?

@robtfm
Copy link
Copy Markdown
Contributor

robtfm commented Mar 12, 2025

rerunning on windows with rustc 1.85
pr is 1m31s, df23b93 (the prior commit) is 1m34s

so win still looks good. pretty surprising, i thought this was a compiler trait-bound rabbit hole which shouldn't be platform dependent at all...

@mockersf
Copy link
Copy Markdown
Member

I also see a slowdown with this PR on macos on a m4, a bit on a smaller scale that what Cart is reporting, but with more variation (running on a laptop, and efficiency/performance cores)

To run a good benchmark for this kind of things, you should use something like hyperfine.

git checkout 72aaa41603d28d5d6ed6906f57f6c596ba3868c3 # this PR
hyperfine --prepare 'sleep 1; cargo clean' 'RUSTC_WRAPPER= cargo build --release'
git checkout HEAD~1
hyperfine --prepare 'sleep 1; cargo clean' 'RUSTC_WRAPPER= cargo build --release'

This PR:

  Time (mean ± σ):     81.554 s ±  4.136 s    [User: 1059.526 s, System: 38.421 s]
  Range (min … max):   73.708 s … 85.835 s    10 runs

main just before:

  Time (mean ± σ):     77.674 s ±  7.937 s    [User: 1019.559 s, System: 37.408 s]
  Range (min … max):   62.272 s … 83.375 s    10 runs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-Rendering Drawing game state to the screen C-Usability A targeted quality-of-life change that makes Bevy easier to use S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it X-Contentious There are nontrivial implications that should be thought through

Projects

No open projects
Status: Done

Development

Successfully merging this pull request may close these issues.

7 participants