Add Direct3D 12 rendering driver (Mesa NIR approach)#70315
|
I own a Samsung Galaxy Book Go, which is a Windows on ARM (WoA) laptop that only supports DirectX 12 natively (neither Vulkan nor OpenGL). The first two errors below might be more related to other parts of Godot.
1.- Error opening the Godot editor in WoA: OpenGL / Vulkan are preferred by default even if they are not available.
I ran the editor build/artifact (win x86_64) available in the checks of this PR and it did not start, because Godot tried to use OpenGL to render the Project Manager window by default and showed an error saying that OpenGL is not available. Following the advice in the error dialog, I ran Godot again with the suggested parameter. After that, I could open or create a project, but only "Forward+" or "Mobile" projects would let me continue ("Compatibility" depends on OpenGL/GLES3).
2.- Error executing a Godot project: in "Forward+" or "Mobile" projects Vulkan is preferred by default even if it is not available.
In the editor, pressing "Play" also triggers the error dialog saying that Vulkan is not available.
3.- Errors logged when selecting a CSGBox3D in the editor
When running the editor with the d3d12 renderer on that laptop, selecting a CSGBox3D causes errors to be logged in the debug log on every frame until I select another node.
Feel free to get in touch in case you need to test anything on this laptop. |
The Forward Mobile backend uses Vulkan/Direct3D 12, not OpenGL. According to this PR's description, it should be able to work with Direct3D 12. |
Yes, you are right. I just checked and it works too but with the same issues. |
|
Starting to look at what it will take to build this driver using MinGW-GCC on Linux. I had to do some fixes:
diff --git a/drivers/d3d12/SCsub b/drivers/d3d12/SCsub
index 95d7937807..c4c4e45a4a 100644
--- a/drivers/d3d12/SCsub
+++ b/drivers/d3d12/SCsub
@@ -12,26 +12,26 @@ thirdparty_obj = []
# DirectX Headers (must take precedence over Windows SDK's).
-env.Prepend(CPPPATH=["#thirdparty/directx_headers"])
env_d3d12_rd.Prepend(CPPPATH=["#thirdparty/directx_headers"])
# Direct3D 12 Memory Allocator.
-env.Append(CPPPATH=["#thirdparty/d3d12ma"])
env_d3d12_rd.Append(CPPPATH=["#thirdparty/d3d12ma"])
thirdparty_sources_d3d12ma = ["#thirdparty/d3d12ma/D3D12MemAlloc.cpp"]
-env_thirdparty_d3d12ma = env.Clone()
+env_thirdparty_d3d12ma = env_d3d12_rd.Clone()
+env_thirdparty_d3d12ma.disable_warnings()
env_thirdparty_d3d12ma.add_source_files(thirdparty_obj, thirdparty_sources_d3d12ma)
-env_thirdparty_d3d12ma.Append(CCFLAGS=["/std:c++14", "/permissive-"])
-env_thirdparty_d3d12ma.Append(CCFLAGS=["/wd4189", "/wd4324", "/wd4505"])
+if env.msvc:
+ env_thirdparty_d3d12ma.Append(CCFLAGS=["/std:c++14", "/permissive-"])
# Mesa (SPIR-V to DXIL functionality).
env_thirdparty_mesa = env.Clone()
+env_thirdparty_mesa.disable_warnings()
mesa_dir = "#thirdparty/mesa"
mesa_gen_dir = "#thirdparty/mesa/generated"
@@ -147,7 +147,7 @@ env_thirdparty_mesa.Append(
("PACKAGE_BUGREPORT", '\\"https://gitlab.freedesktop.org/mesa/mesa/-/issues\\"'),
"PIPE_SUBSYSTEM_WINDOWS_USER",
"_USE_MATH_DEFINES",
-
+ "HAVE_STRUCT_TIMESPEC",
]
)
if env.msvc:
@@ -162,7 +162,6 @@ if env.msvc:
"_ALLOW_KEYWORD_MACROS",
("_HAS_EXCEPTIONS", 0),
"NOMINMAX",
- "HAVE_STRUCT_TIMESPEC",
]
)
env_thirdparty_mesa.Append(CFLAGS=["/std:c11"])
@@ -174,13 +173,7 @@ else:
]
)
env_thirdparty_mesa.Append(CFLAGS=["-std=c11"])
- env_thirdparty_mesa.Append(CXXFLAGS=["-std=cpp++17"])
-
-# No point in fighting warnings in Mesa.
-if env.msvc:
- env_thirdparty_mesa.Append(CCFLAGS=["/W0"])
-else:
- env_thirdparty_mesa.Append(CCFLAGS=["-w"])
+ env_thirdparty_mesa.Append(CXXFLAGS=["-std=c++17"])
# Add all.
With the above changes, some stuff compiles fine, but I still have some roadblocks. MinGW headers are typically not fully in sync with the Windows SDK, and notably for DirectX they've historically been missing some bits. Together with Pedro we tried to see what happens when not using our vendored
And when using our vendored headers, the above problem with |
|
@panreyes, I've been able to reproduce your issue. I also found other ways to trigger it. My investigation led to the hypothesis that it's due to a bug in upstream NIR-to-DXIL functionality. I've reported it: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7950 |
|
Pushed, addressing all the feedback and rebasing on top of the latest upstream changes. I've fixed the V-sync issue with different logic than @TempoLabGames's, but their insight is at its core. Thank you! |
The waitable swap chain approach already outperforms the Vulkan renderer in terms of latency. I'm not familiar with the Vulkan API, but as far as I can tell, it lacks the information you would need to do this. It looks like it's been discussed in KhronosGroup/Vulkan-Docs#370 and KhronosGroup/Vulkan-Docs#1364 but it's not clear to me whether what's been proposed would actually allow you to match this. The core problem is that the application needs to know when to generate a frame. In a typical game loop you'll process input and run the frame update logic, and only when you go to call |
I believe that using |
Mailbox V-Sync isn't supported on all platforms and drivers, so we need to ensure FIFO provides an experience that's as good as possible. |
@versalinyaa Mailbox, when working as intended, is effectively the same as disabling VSync, except only complete frames get drawn to the screen. This has two problems:
By synchronising on the VBlank, you end up consistently one frame behind. If you wanted, you could set a time budget less than the length of a frame and sleep for the rest (before sampling input) to reduce this further. Having said all that, VSYNC_MAILBOX doesn't seem to work properly for me on Windows. I expect VSYNC_MAILBOX to have the same latency as the frame that shows above the top tear line with VSYNC_OFF, but in ExclusiveFullscreen it's a full frame behind. It appears to be preventing DirectFlip from bypassing DWM. |
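As a rough illustration of the time-budget idea, here is a minimal Python sketch. Everything in it is hypothetical: `wait_for_vblank`, `sample_input` and `update_and_render` are placeholder callables supplied by the caller, the 60 Hz refresh rate is an assumption, and no real swap chain is involved. The only point is the ordering: block on the VBlank, sleep away the slack, and only then sample input.

```python
import time

REFRESH_HZ = 60.0              # assumed display refresh rate
FRAME_TIME = 1.0 / REFRESH_HZ  # length of one refresh interval, in seconds


def sleep_before_input(frame_time: float, budget: float) -> float:
    """Return how long to sleep after the VBlank so that input sampling,
    update and render get `budget` seconds before the next VBlank."""
    if budget <= 0.0:
        raise ValueError("budget must be positive")
    # If the budget is longer than the frame, there is no slack to sleep away.
    return max(0.0, frame_time - budget)


def run_frames(frames, budget, wait_for_vblank, sample_input,
               update_and_render, sleep=time.sleep):
    """Hypothetical game loop; every callable is a stand-in, not a real API."""
    delay = sleep_before_input(FRAME_TIME, budget)
    for _ in range(frames):
        wait_for_vblank()         # e.g. block on the swap chain's waitable object
        sleep(delay)              # burn the slack *before* sampling input
        state = sample_input()    # input is now as fresh as the budget allows
        update_and_render(state)  # expected to finish within `budget` seconds
```

With a 4 ms budget at 60 Hz, the loop would sleep for roughly 12.7 ms after each VBlank. The trade-off is that a frame that overruns its budget misses the next VBlank entirely.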
|
For the record, confirming that this is being delayed a bit more and should be merged early in the 4.3 release cycle. It's too late for 4.2 as we're in feature freeze, mostly due to me not finding the time early enough to get this formatted the way I think we can merge it. @bruvzg did some initial work to evaluate moving the Mesa code to a separate repo so we could build it as a static library, reducing the impact on the Godot repo in terms of thirdparty code and build time: https://github.com/bruvzg/godot-nir-static/ But the new prerequisite for this work to be merged will first be rebasing it on top of the RenderingDeviceDriver, which @RandomShaper has been working on in #83452 and which will also be merged early on for 4.3. In the meantime, this PR is still functional and recently rebased, so anyone interested in D3D 12 should be able to merge it in their fork on top of the 4.2 release. |
For reference, there are some additional modifications (related to the static Mesa and MinGW build) in https://github.com/bruvzg/godot/tree/d3d12_mesa. The MinGW build still fails to link, probably due to outdated DX import libs in MinGW distros. |
I'd like to suggest merging the current one first, when it's time, even if the RDD version comes immediately after. I believe that the diff between both "styles," the same we'll have for Vulkan, would be a valuable piece of the repo history. It will be an additional sample of the architectural shift and also it will be there at hand for looking for small pieces of logic that may get lost in translation and we may want to recover some day. Not a strong opinion, but I'd rather keep the non-RDD version "snapshotted" in the main history instead of opening yet another PR for the RDD version so we can keep the non-RDD one in the archives, let alone in a branch in my fork. |
|
Sounds good to me :) |
clayjohn
left a comment
Giving my stamp of approval. I have worked with this code for awhile now and reviewed it in depth during a few of its iterations. Today I did a light review to look out for any major issues and found none. I trust at this stage in the lifecycle of this work, if issues are discovered, we can fix them promptly in master.
The new build instructions work perfectly when compiling with MSVC. I'll make a PR shortly updating the documentation with the content from this PR's description.
akien-mga
left a comment
Did a buildsystem code review, looks pretty good to me!
Great work @RandomShaper (and thanks @bruvzg for the help with godot-nir-static!).
I'll do a bit more testing on Windows and approve/merge.
There was a problem hiding this comment.
I think a lot of these are overkill, duplicate defines we already set or are not useful (e.g. the bugreport URL for Mesa). I also had issues with the _Static_assert define that breaks mingw (but there are other issues with mingw).
I'm puzzled to still see this here though, isn't all this to compile mesa, and thus stuff that should be in godot-nir-static's SConstruct?
There was a problem hiding this comment.
I added all that I found in the Mesa build system. I'm not sure if all of them are required, but probably most at least. Sadly we need them here as well as in godot-nir-static because the Mesa headers need to see the same definitions here and there. Considering here we only care about certain headers, the right answer may be a subset of them, though.
There was a problem hiding this comment.
Tested successfully on Windows 10 with latest dxc and our mesa 23.1.0-devel build.
I ran it on those two projects, which worked fine:
GDQuest TPS demo: https://github.com/gdquest-demos/godot-4-3d-third-person-controller
RPicster's Desert Light: https://github.com/RPicster/godot4-demo-desert-light
Edit: Actually I tested wrongly, and was still using Vulkan. With my d3d12, both projects crash on edit. I suspect using latest dxc was a mistake, I'll test again with the version from March 2023.
Edit 2: Well, DXC_PATH only seems to be used to copy a DLL in dev_builds, which is not what I compiled, so it must be something else.
This is outstanding work @RandomShaper, really impressive!
I apologize again for the long delay reviewing and merging this.
Removed |
|
Tested the latest version again, now I can confirm it seems functional even in non-dev builds. I could open and run Desert Light fine 🎉 On the other hand, the GDQuest TPS demo crashes when importing some scene.
Crash logs (no debug syms)
I'm fine merging anyway and leaving this to debug in a follow-up issue/PR. |
|
Thanks! Truly amazing work Pedro! 🥇🎉👏 |
|
Big thanks to everyone that has been involved, giving feedback, testing, etc., the production team and @bruvzg. |
Direct3D 12 Rendering Driver (via Mesa 3D's NIR)
This is a replacement of #64304. The difference lies in how they approach shader compilation; i.e., how they manage to take SPIR-V shaders into Direct3D. The old one used SPIRV-Cross plus mix and match of SPIR-V and DXIL reflection data. Also, for emulating specialization constants in Direct3D it needed part of the source code of the DirectX Shader Compiler, to be able to patch the LLVM IR bitcode. The new one does the shader translation via Mesa's intermediate representation of shaders (NIR), plus the Microsoft-contributed code to that project that translates NIR to DXIL. That also allows this PR to adjust the shader bindings in a way that avoids the need for the DXIL reflection and mix-and-match steps. Furthermore, the trick for specialization constants is also made much simpler, thanks to a local patch to the Mesa source code, so there's no need to bundle part of DXC either.
This is a feature-complete Direct3D 12 RenderingDevice implementation for Godot Engine. It works as a drop-in replacement for the Vulkan one. It is selectable in the project settings as an alternative to use on Windows.
By supporting Direct3D 12, Godot gains support for multiple new platforms, such as:
This PR includes some preparatory changes to uncouple the RenderingDevice from Vulkan, that is, abstracting the modern Godot rendering architecture from whatever rendering API is used. Moreover, instead of a monolithic commit, the code of the driver itself is split into three, much more manageable commits.
Highlights
Performance
Depending on the complexity of the scene, effects used, etc., this first version of the renderer generally performs worse than the Vulkan one. In some tests, D3D12 has not been able to deliver more than 75% of the Vulkan frames per second. In some others, D3D12 has been able to outperform Vulkan by a small margin. Performance will be improved over time.
Homogeneity
The D3D12 rendering driver has been written taking the Vulkan one as a basis and keeping as much as possible from the original. This effort gives two-fold benefits: on the one hand, the overall structure of the code files, including auxiliary structures and other elements, is very similar, which makes maintenance easier; on the other hand, both renderers are more similar at the functional level. An example of this is that the D3D12 renderer will be as picky as the Vulkan one when it comes to validation and error checking, even in areas where the Microsoft API wouldn't impose such strict constraints.
Specialization Constants
In Vulkan it is possible to create multiple variations of a pipeline with different values for certain parameters that end up as compile-time constants in the shader generated under the hood. Those parameters are called specialization constants.
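As a rough illustration of the concept, here is a minimal, API-agnostic Python sketch of how pipeline variants might be keyed by their specialization constant values. It is not Godot's, Vulkan's or Direct3D's actual API: `PipelineCache`, `compile_variant` and the constant names used with it are all hypothetical, and "compilation" here just bakes the values into a string.

```python
from typing import Dict, Mapping, Tuple

# A variant is identified by its constant values, in a stable (sorted) order.
VariantKey = Tuple[Tuple[str, int], ...]


def compile_variant(shader_source: str, constants: Mapping[str, int]) -> str:
    """Hypothetical stand-in for pipeline creation: bake constants in as literals."""
    header = "".join(
        f"const int {name} = {value};\n" for name, value in sorted(constants.items())
    )
    return header + shader_source


class PipelineCache:
    """Creates each pipeline variant once and reuses it afterwards (illustrative only)."""

    def __init__(self, shader_source: str, compiler=compile_variant):
        self._source = shader_source
        self._compiler = compiler
        self._variants: Dict[VariantKey, str] = {}

    def get(self, **constants: int) -> str:
        # Keyword order must not matter, so sort before building the key.
        key = tuple(sorted(constants.items()))
        if key not in self._variants:
            self._variants[key] = self._compiler(self._source, constants)
        return self._variants[key]
```

Requesting the same set of values twice returns the cached variant, while changing any value produces a new one. That is the property the renderer relies on when it treats those parameters as compile-time constants.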
In Direct3D there's no counterpart of that mechanism. However, Godot rendering relies on it for some of its shaders. A way to have specialization constants in the Direct3D/DXIL world had to be researched. It was finally found and is used in this code.
The technique is explained in this Twitter thread: https://twitter.com/RandomPedroJ/status/1532725156623286272.
Update for this new PR: The new approach shares some details, but it's both more powerful and simpler. An article about it will be published soon.
Code Comments
To avoid making this PR description unnecessarily long, the reader is advised to find additional insight in the comments.
Assertions
Given that some data crosses many stages from its inception to where it's finally used, the code is full of dev-only checks that ensure the sanity of many different data structures at different points in time. The expectation is that this will make it easier to catch bugs, even subtle ones, in areas of high complexity.
Known Issues
Compilation & Distribution
Grab the (main, not PDB) .zip file corresponding to the v1.7.2308 version (updated 2023-12-12; previously 1.7.2207) of the DirectX Shader Compiler from https://github.com/Microsoft/DirectXShaderCompiler/releases.
Unzip the file to some path.
Optional (only for developers wanting to debug graphics with the PIX tool, only for debug builds):
(Update for this new PR) Optional:
[...] 1.710.0-preview1.610.5) is the latest tested), at https://devblogs.microsoft.com/directx/directx12agility/. You'll finally be taken to a NuGet package page where you can click Download package to get it. UPDATE: If you use a preview version of the Agility SDK, remember to enable developer mode in Windows; otherwise it won't be used.
2023-12-07: Download the latest godot-nir-static distribution from https://github.com/godotengine/godot-nir-static/releases/ (23.1.0-devel is the only one tested at the time of this writing), or make one yourself with these steps:
[...] mako (https://www.makotemplates.org/), needed to generate some files.
[...] mesa_libs path you have to provide later is this directory.)
git submodule update --init
./update_mesa.sh
scons
Huge thanks to @bruvzg for making this workflow possible!
Build Godot with the following additional parameters to SCons: d3d12=yes dxc_path=<...>, plus (if using the Agility SDK) agility_sdk_path=<...>, plus (if using PIX) pix_path=<...>, plus (2023-12-07) mesa_libs=<...>.
NOTE: The build process will copy dxil.dll from the bin/x64/ directory in the DXC zipfile to the Godot binary directory. (Update for this new PR: the shader compiler DLL, dxcompiler.dll, is no longer needed at all; only the validator-signer, dxil.dll, is required.) D3D12-enabled Godot packages for distribution to end users must include that file, both for the editor and games.
2023-12-07: Both dxil.dll and the DLLs from the Agility SDK come in multiple versions, for different architectures. To allow you to have builds of Godot for multiple archs in the same build tree, the D3D12 driver and build system now have the following enhancements:
DXIL.dll is copied both to bin/ and the appropriate bin/<arch>/, so you can end up with multiple arch-specific versions of the DLL plus the latest one (or single one) built in bin/. That lets you use a single or multi-arch workflow without build-time changes. At runtime, the renderer will try to load the DLL from the arch-specific directory, falling back to the same directory as the Godot executable.
[...] bin/). If you pass agility_sdk_multi_arch=yes to SCons, you'll opt in to multi-arch. DLLs will be copied to the appropriate bin/<arch>/ subdirs and at runtime the right one will be loaded.
Future Work
Besides fixing the known issues described in another section, there are many options for potential improvement, the most important of which are described below. The code also has a number of TODO items that refer to these and other, generally smaller, potential enhancements or nice-to-haves.
Render Pass API
The D3D12 renderer uses what in the Vulkan world is called dynamic rendering. In other words, it doesn't use render pass —and subpass— APIs. This was done to make things simpler, but came with a couple of downsides.
Actionable item: Re-work render pass management with the proper APIs, which may be needed to squeeze performance from certain kinds of devices.
Enhanced Barriers
Direct3D 12 was released with resource barriers as its way to synchronize GPU work. In short, they are not nearly as fine-grained as Vulkan's memory and pipeline barriers, the biggest consequence of this being comparatively worse performance. Microsoft has since added the so-called enhanced barriers to Direct3D, which are the same as those Vulkan has. Recent GPU drivers and Windows versions already support them.
Actionable item: Re-work synchronization based on enhanced barriers, which will give more performance and make the code more similar to the one in the Vulkan renderer.
More Reasonable Dependencies
Currently, this is using SPIRV-Cross for shader translation to HLSL and an important chunk of DXC for the specialization constants hack. When the Microsoft-provided support for DXIL in Mesa is mature (when checked for the purpose of this work, it wasn't yet), we may be able to use it, via NIR, instead of those two other dependencies for those purposes. Microsoft is donating engineer time to Mesa for this effort, so we hope it will be in a usable state for us soon.
Actionable item: Watch the status of DXIL in Mesa and replace SPIRV-Cross and the DXC source code as soon as feasible.
Update for this new PR: This actionable item is precisely what this new PR is about!
Deprecate Texture Aliasing
In Vulkan it is possible to tell upfront which formats a texture will be interpreted as, and it'll just work. In Direct3D 12 there was traditionally no way to do the same. Therefore, there are limitations on which reinterpretations one can do.
Godot needs to do two of them that are illegal in D3D12: write as R32 and read as R9G9B9E5, and write as R16 and read as R4B4G4A4. The Direct3D 12 renderer code works around that limitation by abusing texture aliases, which, according to some tests across different GPUs, seems to work fine in practice.
The legal approach would be to make copies of the textures when the time to read comes. However, that still won't work for the R4B4G4A4 case. Therefore, the aliasing workaround is used for every case for now.
Luckily, Direct3D has recently added a new API, CreateCommittedResource3(), that provides the same nicety as Vulkan, but it's still not widely available and, at the time of this writing, the D3D12 Memory Allocator library still doesn't support it (there's a PR, though: GPUOpen-LibrariesAndSDKs/D3D12MemoryAllocator#44).
Thanks go to Matías N. Goldberg, who was of great help in this investigation.
Actionable item: Add a support check and prefer CreateCommittedResource3() to the aliasing hack where possible.
UPDATE: Done.
Further Homogeneity
Actionable item: Fuse as much as possible the elements that Vulkan and D3D12 have in common (staging buffer, static arrays of data format names, etc.). This should reduce the codebase size and make it easier to maintain (and eventually add more platforms).
Missing NIR-to-DXIL Features (new for this PR)
These two feature requests in the Mesa 3D repo must be honored so multi-view and shader subgroup operations can be enabled back:
https://gitlab.freedesktop.org/mesa/mesa/-/issues/7904 Done.
https://gitlab.freedesktop.org/mesa/mesa/-/issues/7905 Done.
More
Just to make it complete, there are a few more potential improvements that may or may not be already in a TODO in the comments:
Try to assign HLSL bindings manually and inform SPIRV-Cross in a deterministic way. That would make reflection, management of root signature and population of handle heaps simpler and more efficient. (Credit: @reduz.) Update for this new PR: As mentioned above, matching bindings between the SPIR-V and DXIL realms is now much more convenient.
More sensible use of the shared heap (i.e., track which resources/samplers are already bound and reuse somehow). Update: Done.
[...] p_post_barrier parameters as a hint somehow?)
[...] fsr_upscale.h directly, given the appropriate defines.
[...] material_samplers) to static samplers and/or descriptors to root descriptors when possible.
[...] D3D12_FEATURE_DATA_ARCHITECTURE being UMA, use WriteToSubresource() instead of memcpy().
[...] D3D12_BUFFER_SRV_FLAG_RAW for CBV, or another usage.
🍀 This work has been financed and kindly donated to the Godot Engine project by W4 Games. 🍀
Production edit: closes godotengine/godot-roadmap#30