DX12 increase max_storage_buffers_per_shader_stage and max_storage_textures_per_shader_stage#3798
Merged
teoxoy merged 5 commits intogfx-rs:trunkfrom May 30, 2023
Merged
Conversation
Contributor
Author
|
Edit: Fixed |
teoxoy
requested changes
May 25, 2023
JMS55
suggested changes
May 26, 2023
github-merge-queue bot
pushed a commit
to bevyengine/bevy
that referenced
this pull request
Jun 18, 2023
 # Objective - Add Screen space ambient occlusion (SSAO). SSAO approximates small-scale, local occlusion of _indirect_ diffuse light between objects. SSAO does not apply to direct lighting, such as point or directional lights. - This darkens creases, e.g. on staircases, and gives nice contact shadows where objects meet, giving entities a more "grounded" feel. - Closes #3632. ## Solution - Implement the GTAO algorithm. - https://www.activision.com/cdn/research/Practical_Real_Time_Strategies_for_Accurate_Indirect_Occlusion_NEW%20VERSION_COLOR.pdf - https://blog.selfshadow.com/publications/s2016-shading-course/activision/s2016_pbs_activision_occlusion.pdf - Source code heavily based on [Intel's XeGTAO](https://github.com/GameTechDev/XeGTAO/blob/0d177ce06bfa642f64d8af4de1197ad1bcb862d4/Source/Rendering/Shaders/XeGTAO.hlsli). - Add an SSAO bevy example. ## Algorithm Overview * Run a depth and normal prepass * Create downscaled mips of the depth texture (preprocess_depths pass) * GTAO pass - for each pixel, take several random samples from the depth+normal buffers, reconstruct world position, raytrace in screen space to estimate occlusion. Rather then doing completely random samples on a hemisphere, you choose random _slices_ of the hemisphere, and then can analytically compute the full occlusion of that slice. Also compute edges based on depth differences here. * Spatial denoise pass - bilateral blur, using edge detection to not blur over edges. This is the final SSAO result. * Main pass - if SSAO exists, sample the SSAO texture, and set occlusion to be the minimum of ssao/material occlusion. This then feeds into the rest of the PBR shader as normal. --- ## Future Improvements - Maybe remove the low quality preset for now (too noisy) - WebGPU fallback (see below) - Faster depth->world position (see reverted code) - Bent normals - Try interleaved gradient noise or spatiotemporal blue noise - Replace the spatial denoiser with a combined spatial+temporal denoiser - Render at half resolution and use a bilateral upsample - Better multibounce approximation (https://drive.google.com/file/d/1SyagcEVplIm2KkRD3WQYSO9O0Iyi1hfy/view) ## Far-Future Performance Improvements - F16 math (missing naga-wgsl support https://github.com/gfx-rs/naga/issues/1884) - Faster coordinate space conversion for normals - Faster depth mipchain creation (https://github.com/GPUOpen-Effects/FidelityFX-SPD) (wgpu/naga does not currently support subgroup ops) - Deinterleaved SSAO for better cache efficiency (https://developer.nvidia.com/sites/default/files/akamai/gameworks/samples/DeinterleavedTexturing.pdf) ## Other Interesting Papers - Visibility bitmask (https://link.springer.com/article/10.1007/s00371-022-02703-y, https://cdrinmatane.github.io/posts/cgspotlight-slides/) - Screen space diffuse lighting (https://github.com/Patapom/GodComplex/blob/master/Tests/TestHBIL/2018%20Mayaux%20-%20Horizon-Based%20Indirect%20Lighting%20(HBIL).pdf) ## Platform Support * SSAO currently does not work on DirectX12 due to issues with wgpu and naga: * gfx-rs/wgpu#3798 * gfx-rs/naga#2353 * SSAO currently does not work on WebGPU because r16float is not a valid storage texture format https://gpuweb.github.io/gpuweb/wgsl/#storage-texel-formats. We can fix this with a fallback to r32float. --- ## Changelog - Added ScreenSpaceAmbientOcclusionSettings, ScreenSpaceAmbientOcclusionQualityLevel, and ScreenSpaceAmbientOcclusionBundle --------- Co-authored-by: IceSentry <c.giguere42@gmail.com> Co-authored-by: IceSentry <IceSentry@users.noreply.github.com> Co-authored-by: Daniel Chia <danstryder@gmail.com> Co-authored-by: Elabajaba <Elabajaba@users.noreply.github.com> Co-authored-by: Robert Swain <robert.swain@gmail.com> Co-authored-by: robtfm <50659922+robtfm@users.noreply.github.com> Co-authored-by: Brandon Dyer <brandondyer64@gmail.com> Co-authored-by: Edgar Geier <geieredgar@gmail.com> Co-authored-by: Nicola Papale <nicopap@users.noreply.github.com> Co-authored-by: Carter Anderson <mcanders1@gmail.com>
13 tasks
NoahShomette
pushed a commit
to NoahShomette/bevy
that referenced
this pull request
Jun 19, 2023
 # Objective - Add Screen space ambient occlusion (SSAO). SSAO approximates small-scale, local occlusion of _indirect_ diffuse light between objects. SSAO does not apply to direct lighting, such as point or directional lights. - This darkens creases, e.g. on staircases, and gives nice contact shadows where objects meet, giving entities a more "grounded" feel. - Closes bevyengine#3632. ## Solution - Implement the GTAO algorithm. - https://www.activision.com/cdn/research/Practical_Real_Time_Strategies_for_Accurate_Indirect_Occlusion_NEW%20VERSION_COLOR.pdf - https://blog.selfshadow.com/publications/s2016-shading-course/activision/s2016_pbs_activision_occlusion.pdf - Source code heavily based on [Intel's XeGTAO](https://github.com/GameTechDev/XeGTAO/blob/0d177ce06bfa642f64d8af4de1197ad1bcb862d4/Source/Rendering/Shaders/XeGTAO.hlsli). - Add an SSAO bevy example. ## Algorithm Overview * Run a depth and normal prepass * Create downscaled mips of the depth texture (preprocess_depths pass) * GTAO pass - for each pixel, take several random samples from the depth+normal buffers, reconstruct world position, raytrace in screen space to estimate occlusion. Rather then doing completely random samples on a hemisphere, you choose random _slices_ of the hemisphere, and then can analytically compute the full occlusion of that slice. Also compute edges based on depth differences here. * Spatial denoise pass - bilateral blur, using edge detection to not blur over edges. This is the final SSAO result. * Main pass - if SSAO exists, sample the SSAO texture, and set occlusion to be the minimum of ssao/material occlusion. This then feeds into the rest of the PBR shader as normal. --- ## Future Improvements - Maybe remove the low quality preset for now (too noisy) - WebGPU fallback (see below) - Faster depth->world position (see reverted code) - Bent normals - Try interleaved gradient noise or spatiotemporal blue noise - Replace the spatial denoiser with a combined spatial+temporal denoiser - Render at half resolution and use a bilateral upsample - Better multibounce approximation (https://drive.google.com/file/d/1SyagcEVplIm2KkRD3WQYSO9O0Iyi1hfy/view) ## Far-Future Performance Improvements - F16 math (missing naga-wgsl support https://github.com/gfx-rs/naga/issues/1884) - Faster coordinate space conversion for normals - Faster depth mipchain creation (https://github.com/GPUOpen-Effects/FidelityFX-SPD) (wgpu/naga does not currently support subgroup ops) - Deinterleaved SSAO for better cache efficiency (https://developer.nvidia.com/sites/default/files/akamai/gameworks/samples/DeinterleavedTexturing.pdf) ## Other Interesting Papers - Visibility bitmask (https://link.springer.com/article/10.1007/s00371-022-02703-y, https://cdrinmatane.github.io/posts/cgspotlight-slides/) - Screen space diffuse lighting (https://github.com/Patapom/GodComplex/blob/master/Tests/TestHBIL/2018%20Mayaux%20-%20Horizon-Based%20Indirect%20Lighting%20(HBIL).pdf) ## Platform Support * SSAO currently does not work on DirectX12 due to issues with wgpu and naga: * gfx-rs/wgpu#3798 * gfx-rs/naga#2353 * SSAO currently does not work on WebGPU because r16float is not a valid storage texture format https://gpuweb.github.io/gpuweb/wgsl/#storage-texel-formats. We can fix this with a fallback to r32float. --- ## Changelog - Added ScreenSpaceAmbientOcclusionSettings, ScreenSpaceAmbientOcclusionQualityLevel, and ScreenSpaceAmbientOcclusionBundle --------- Co-authored-by: IceSentry <c.giguere42@gmail.com> Co-authored-by: IceSentry <IceSentry@users.noreply.github.com> Co-authored-by: Daniel Chia <danstryder@gmail.com> Co-authored-by: Elabajaba <Elabajaba@users.noreply.github.com> Co-authored-by: Robert Swain <robert.swain@gmail.com> Co-authored-by: robtfm <50659922+robtfm@users.noreply.github.com> Co-authored-by: Brandon Dyer <brandondyer64@gmail.com> Co-authored-by: Edgar Geier <geieredgar@gmail.com> Co-authored-by: Nicola Papale <nicopap@users.noreply.github.com> Co-authored-by: Carter Anderson <mcanders1@gmail.com>
james7132
pushed a commit
to james7132/bevy
that referenced
this pull request
Jun 19, 2023
 # Objective - Add Screen space ambient occlusion (SSAO). SSAO approximates small-scale, local occlusion of _indirect_ diffuse light between objects. SSAO does not apply to direct lighting, such as point or directional lights. - This darkens creases, e.g. on staircases, and gives nice contact shadows where objects meet, giving entities a more "grounded" feel. - Closes bevyengine#3632. ## Solution - Implement the GTAO algorithm. - https://www.activision.com/cdn/research/Practical_Real_Time_Strategies_for_Accurate_Indirect_Occlusion_NEW%20VERSION_COLOR.pdf - https://blog.selfshadow.com/publications/s2016-shading-course/activision/s2016_pbs_activision_occlusion.pdf - Source code heavily based on [Intel's XeGTAO](https://github.com/GameTechDev/XeGTAO/blob/0d177ce06bfa642f64d8af4de1197ad1bcb862d4/Source/Rendering/Shaders/XeGTAO.hlsli). - Add an SSAO bevy example. ## Algorithm Overview * Run a depth and normal prepass * Create downscaled mips of the depth texture (preprocess_depths pass) * GTAO pass - for each pixel, take several random samples from the depth+normal buffers, reconstruct world position, raytrace in screen space to estimate occlusion. Rather then doing completely random samples on a hemisphere, you choose random _slices_ of the hemisphere, and then can analytically compute the full occlusion of that slice. Also compute edges based on depth differences here. * Spatial denoise pass - bilateral blur, using edge detection to not blur over edges. This is the final SSAO result. * Main pass - if SSAO exists, sample the SSAO texture, and set occlusion to be the minimum of ssao/material occlusion. This then feeds into the rest of the PBR shader as normal. --- ## Future Improvements - Maybe remove the low quality preset for now (too noisy) - WebGPU fallback (see below) - Faster depth->world position (see reverted code) - Bent normals - Try interleaved gradient noise or spatiotemporal blue noise - Replace the spatial denoiser with a combined spatial+temporal denoiser - Render at half resolution and use a bilateral upsample - Better multibounce approximation (https://drive.google.com/file/d/1SyagcEVplIm2KkRD3WQYSO9O0Iyi1hfy/view) ## Far-Future Performance Improvements - F16 math (missing naga-wgsl support https://github.com/gfx-rs/naga/issues/1884) - Faster coordinate space conversion for normals - Faster depth mipchain creation (https://github.com/GPUOpen-Effects/FidelityFX-SPD) (wgpu/naga does not currently support subgroup ops) - Deinterleaved SSAO for better cache efficiency (https://developer.nvidia.com/sites/default/files/akamai/gameworks/samples/DeinterleavedTexturing.pdf) ## Other Interesting Papers - Visibility bitmask (https://link.springer.com/article/10.1007/s00371-022-02703-y, https://cdrinmatane.github.io/posts/cgspotlight-slides/) - Screen space diffuse lighting (https://github.com/Patapom/GodComplex/blob/master/Tests/TestHBIL/2018%20Mayaux%20-%20Horizon-Based%20Indirect%20Lighting%20(HBIL).pdf) ## Platform Support * SSAO currently does not work on DirectX12 due to issues with wgpu and naga: * gfx-rs/wgpu#3798 * gfx-rs/naga#2353 * SSAO currently does not work on WebGPU because r16float is not a valid storage texture format https://gpuweb.github.io/gpuweb/wgsl/#storage-texel-formats. We can fix this with a fallback to r32float. --- ## Changelog - Added ScreenSpaceAmbientOcclusionSettings, ScreenSpaceAmbientOcclusionQualityLevel, and ScreenSpaceAmbientOcclusionBundle --------- Co-authored-by: IceSentry <c.giguere42@gmail.com> Co-authored-by: IceSentry <IceSentry@users.noreply.github.com> Co-authored-by: Daniel Chia <danstryder@gmail.com> Co-authored-by: Elabajaba <Elabajaba@users.noreply.github.com> Co-authored-by: Robert Swain <robert.swain@gmail.com> Co-authored-by: robtfm <50659922+robtfm@users.noreply.github.com> Co-authored-by: Brandon Dyer <brandondyer64@gmail.com> Co-authored-by: Edgar Geier <geieredgar@gmail.com> Co-authored-by: Nicola Papale <nicopap@users.noreply.github.com> Co-authored-by: Carter Anderson <mcanders1@gmail.com>
Elabajaba
added a commit
to Elabajaba/wgpu
that referenced
this pull request
Jun 27, 2023
…_textures_per_shader_stage` (gfx-rs#3798)
3 tasks
cwfitzgerald
pushed a commit
that referenced
this pull request
Jun 29, 2023
…er_shader_stage`) (#3891)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Checklist
cargo clippy.RUSTFLAGS=--cfg=web_sys_unstable_apis cargo clippy --target wasm32-unknown-unknownif applicable.Connections
#3333
Description
DX12's
max_storage_buffers_per_shader_stageandmax_storage_textures_per_shader_stagewere set too low. This changes it to use the limits laid out in https://learn.microsoft.com/en-us/windows/win32/direct3d12/hardware-supportWebGPU minimum is FeatureLevel(FL) 11.0 + resource binding tier 2, or FL11.1 which is 64 UAVs (gpuweb/gpuweb#1069), so I'm not sure if this should error if we're FL11.0 + tier 1.
Testing
It didn't crash when I ran the tests or the water example with WGPU_BACKEND="dx12".