Implement MSAA per-sample shading option to improve rendering quality #76073

Draft
Calinou wants to merge 2 commits into godotengine:master from Calinou:msaa-add-per-sample-shading-2

Conversation

Calinou commented Apr 14, 2023

Member

Compared to setting Scaling 3D > Scale above 1.0, this provides better quality (thanks to MSAA's superior sampling pattern) with better performance and lower VRAM utilization.
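As a rough illustration of why the two approaches are comparable, the sketch below counts fragment shader invocations per frame for each. It assumes an integer render scale and that per-sample shading runs the fragment shader for every MSAA sample of every pixel (full coverage); the function names are hypothetical and not taken from the PR.

```cpp
#include <cassert>
#include <cstdint>

// Fragment shader invocations per frame when rendering at a scaled resolution
// (Scaling 3D > Scale). The scale applies to both axes.
std::uint64_t shaded_samples_render_scale(std::uint64_t width, std::uint64_t height, std::uint64_t scale) {
    assert(scale >= 1);
    return (width * scale) * (height * scale);
}

// Fragment shader invocations per frame with per-sample shading, assuming
// every MSAA sample of every pixel is shaded (an assumption of this sketch).
std::uint64_t shaded_samples_per_sample_msaa(std::uint64_t width, std::uint64_t height, std::uint64_t msaa_samples) {
    assert(msaa_samples >= 1);
    return width * height * msaa_samples;
}
```

Under these assumptions, Render Scale 2.0 and 4× MSAA with per-sample shading shade the same number of samples at a given output resolution, so the differences come from the sample pattern and from how the samples are stored and resolved rather than from the sample count.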

This also works on 2D rendering, which can be useful to maximize sharpness of downscaled 2D textures without graininess when not using mipmaps. In 2D, this also provides antialiasing for hard Light2D shadows.

Per-sample shading is mainly targeted at Movie Maker mode usage, although modern high-end GPUs can handle per-sample shading with 2× or 4× MSAA during gameplay while staying above 60 FPS.

This is supported in Forward+ and Forward Mobile. However, most mobile GPUs (including Apple Silicon) don't support per-sample shading due to hardware limitations.

TODO

  • Check for hardware per-sample shading support before trying to enable it to avoid crashes on unsupported hardware (as done for the MSAA level).
  • Implement in Direct3D 12 if an equivalent exists there. (To my knowledge, a Metal equivalent to this feature does not exist.)
  • Make setting changes effective without requiring a restart. Right now, it will sometimes work in the editor (likely because a shader recompilation is triggered somewhere), but not in the running project and not in 2D.
  • Use separate project settings for 2D and 3D, so you can control per-sample shading with more granularity (and with a lower performance impact if you only need it on 2D or 3D).

Performance

Tested at the spawn location of https://github.com/Calinou/godot-reflection.

OS: Fedora 37
CPU: Intel Core i9-13900K
GPU: NVIDIA GeForce RTX 4090

| Antialiasing type | 1152×648 | 3840×2160 |
|---|---|---|
| No AA | 868 FPS (1.15 mspf) [1] | 324 FPS (3.09 mspf) |
| Render Scale 2.0 (old supersampling approach) | 644 FPS (1.55 mspf) [1] | 93 FPS (10.75 mspf) |
| 4× MSAA + 4× SSAA (this PR) | 876 FPS (1.14 mspf) [1] | 209 FPS (4.78 mspf) |
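The mspf figures in the table are the frame time in milliseconds, that is, 1000 divided by the FPS value. A minimal sketch for reproducing them and for comparing each mode against the No AA baseline (the helper names are hypothetical):

```cpp
#include <cassert>
#include <cmath>

// Frame time in milliseconds per frame (mspf) for a given frame rate.
double mspf_from_fps(double fps) {
    assert(fps > 0.0);
    return 1000.0 / fps;
}

// How many times longer a frame takes compared to a baseline frame time.
double relative_cost(double mspf, double baseline_mspf) {
    assert(baseline_mspf > 0.0);
    return mspf / baseline_mspf;
}
```

Applied to the 3840×2160 column, Render Scale 2.0 takes roughly 3.5 times the No AA frame time (10.75 / 3.09), while 4× MSAA with per-sample shading takes roughly 1.5 times (4.78 / 3.09).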

Preview

1152×648

[Comparison screenshots: No AA, Render Scale 2.0 (old supersampling approach), 4× MSAA + 4× SSAA (this PR), Ground truth [2]]

3840×2160

[Comparison screenshots: No AA, Render Scale 2.0 (old supersampling approach), 4× MSAA + 4× SSAA (this PR)]

Footnotes

  1. Performance wildly fluctuates at the default resolution, likely due to a CPU bottleneck.

  2. Render in a window twice as large, then downsample in an image editor.

jcostello commented Apr 15, 2023

Contributor

Really nice improvement in quality and performance. I notice some loss in texture quality. The wall bricks look blurrier, like with FXAA. Is that expected?

Edit: Only in 1152×648

Calinou commented Apr 15, 2023

Member Author

> Really nice improvement in quality and performance. I notice some loss in texture quality. The wall bricks look blurrier, like with FXAA. Is that expected?

I'm not sure why this is happening. This likely has to do with SDFGI reflections looking different depending on the actual viewport resolution but not on the per-sample shading sample count (which is strange). I've tried using a mipmap LOD bias of -1.0 when using per-sample shading, but that didn't improve the situation.

I've added a "ground truth" example to OP to show what would be the best possible appearance.

@Calinou Calinou force-pushed the msaa-add-per-sample-shading-2 branch from dbe149c to 90bad0f Compare April 15, 2023 00:54
@Calinou Calinou force-pushed the msaa-add-per-sample-shading-2 branch from 90bad0f to b148c45 Compare October 27, 2023 23:35
@Calinou Calinou force-pushed the msaa-add-per-sample-shading-2 branch from b148c45 to b1e99a5 Compare July 5, 2024 18:00
@Calinou Calinou force-pushed the msaa-add-per-sample-shading-2 branch from b1e99a5 to b781d30 Compare July 7, 2024 17:23
@clayjohn

Member

I'm not sure this is worth supporting. It is limited to Vulkan, and hardware support is not very good. It is a cool feature, but the limitations make it pretty hard to justify supporting in a cross-platform engine.

izarii-dev commented Oct 26, 2024

About support in DirectX 12: I found that NVIDIA can do per-sample shading at the driver level for DirectX 11 games. I'm not sure how they did this, but it could be helpful. They call it "Variable Rate Supersampling".
The minimum GPU for this feature is the GTX 1650 Super.

https://developer.nvidia.com/vrworks/graphics/variablerateshading#:~:text=Variable%20Rate%20Supersampling%20(VRSS),-FIGURE%203%3A%20VRSS&text=Variable%20Rate%20Supersampling%20(VRSS)%20leverages%20NVIDIA%20Variable%20Rate%20Shading%20(,increase%20image%20quality%20and%20performance.

Calinou commented Oct 28, 2024

Member Author

@izarii-dev That feature is only meant to be used in VR.

There also exist driver-specific MSAA settings that act as supersampling (SGSSAA or "16x MSAA"), but these are not available at an application level without using proprietary APIs. [1] 16x MSAA (which was typically 4x SSAA + 4x MSAA) was available in OpenGL, but not in Vulkan or Direct3D 12 🙁

Footnotes

  1. Only a handful of games have an in-game SGSSAA option for this reason, the most notable examples being some PC ports developed by PH3 such as Ys X: Nordics.

@alyssarosenzweig

unless I'm very confused, per-sample shading is a standard feature supported even with gles3.2 and metal... although you may need to trigger it implicitly with a gl_SampleID read (or equivalent) rather than being able to set minSampleShading?

@alyssarosenzweig

glMinSampleShading is available in gles3.2 at least

@clayjohn

Member

> unless I'm very confused, per-sample shading is a standard feature supported even with gles3.2 and metal... although you may need to trigger it implicitly with a gl_SampleID read (or equivalent) rather than being able to set minSampleShading?

It seems like I was the one confused; I thought we were talking about VK_KHR_fragment_shading_rate, not sampleRateShading.

We already expose the sampleShadingEnable and minSampleShading members of VkPipelineMultisampleStateCreateInfo in the Vulkan backend through our RDPipelineMultisampleState class. We do still need to verify that sampleRateShading is true though.
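The verification mentioned above could look roughly like the following sketch. To stay self-contained it uses hypothetical stand-in structs rather than the real Vulkan or Godot types: `DeviceFeatures::sample_rate_shading` mirrors `VkPhysicalDeviceFeatures::sampleRateShading`, and the two `MultisampleState` fields mirror `sampleShadingEnable` and `minSampleShading`. The function name and the choice of 1.0 as the minimum fraction are assumptions made for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for the device feature query result.
struct DeviceFeatures {
    bool sample_rate_shading; // mirrors VkPhysicalDeviceFeatures::sampleRateShading
};

// Hypothetical stand-in for the relevant multisample pipeline state fields.
struct MultisampleState {
    bool enable_sample_shading; // mirrors sampleShadingEnable
    float min_sample_shading;   // mirrors minSampleShading
};

// Only enable per-sample shading when the user asked for it AND the device
// reports support; otherwise fall back to regular per-pixel shading.
MultisampleState make_multisample_state(const DeviceFeatures &features, bool per_sample_requested) {
    MultisampleState state{false, 0.0f};
    if (per_sample_requested && features.sample_rate_shading) {
        state.enable_sample_shading = true;
        state.min_sample_shading = 1.0f; // shade every sample (assumed setting)
    }
    assert(!state.enable_sample_shading || features.sample_rate_shading);
    return state;
}
```

Falling back silently keeps the project running on unsupported hardware; an actual implementation might also want to print a warning when the request is ignored.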

v1993 commented Jun 19, 2025

The cross-API (Vulkan, DirectX, Metal, apparently even newer OpenGL ES) way to do this is to use the sample interpolation qualifier on your fragment shader's inputs. Unfortunately, this indeed means that changing this option requires recompiling fragment shaders, so it might make sense to avoid this under Vulkan by keeping the current approach for it.
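To illustrate why this approach forces a shader recompilation, here is a minimal sketch of a helper that emits a GLSL fragment shader input declaration with or without the `sample` qualifier. The helper is hypothetical and not part of Godot's shader compiler; the point is only that the qualifier lives in the shader source, so toggling the option changes the text that gets compiled.

```cpp
#include <cassert>
#include <string>

// Builds a GLSL fragment shader input declaration. With per_sample set, the
// input is declared as "sample in", which requests interpolation at each
// sample location instead of once per pixel.
std::string fragment_input_declaration(const std::string &type, const std::string &name, bool per_sample) {
    assert(!type.empty() && !name.empty());
    return std::string(per_sample ? "sample in " : "in ") + type + " " + name + ";";
}
```

For example, `fragment_input_declaration("vec2", "uv", true)` yields `sample in vec2 uv;`. Because the two variants are different shader sources, each must be compiled separately, whereas the pipeline-state approach discussed earlier in the thread leaves the shader source unchanged.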

Development

Successfully merging this pull request may close these issues.

Vulkan: 2D msaa does not AA clipped children
