-
Notifications
You must be signed in to change notification settings - Fork 360
Add a bunch of limits. #1863
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a bunch of limits. #1863
Conversation
- `minUniformBufferOffsetAlignment` at 256. Required from all APIs.
- `minStorageBufferOffsetAlignment` at 256. Required for most Vulkan
Android devices.
- `maxVertexOutputComponents` at 64. Required for early Apple GPUs
on Metal.
- `maxFragmentInputComponents` at 60. Required for early Apple GPUs
on Metal.
- `maxColorAttachments` at 4. Required for early Apple GPUs on Metal
and many Android Vulkan devices.
- `maxComputeWorkgroupStorageSize` at 16352. Required for early
Apple GPUs on Metal and many Android Vulkan devices.
- `maxComputeWorkgroupInvocations` at 256. Required for many Android
Vulkan devices. Chose to not have per-dimension limits as they would
be more constraining and can be worked around when translating from
WGSL to SPIR-V.
- `maxComputePerDimensionDispatchSize` at 65535. Required for many
Android Vulkan devices.
See gpuweb#1343
|
(in talking with @kainino0x @Kangz and @kvark, I think this is where we ended up feeling:) Ideally we can require We might want |
|
I think we should unite the in/out location limits. Quickly glancing over them on Android Vulkan shows 64 being a safe bet. And yes, max color attachments should just be a hard limit in the spec. The |
|
Maybe we should put 🌶️ or 🌶️🌶️ next to the spicier limits with respect to portability, like on a restaurant menu. :) |
For the record, we also discussed whether we can believe Metal's 60 limitations for fragment shaders or not given use of the position builtin would probably cost 0.
That should be the case for all limits. |
|
See #1865 about having limits which exist but aren't configurable. |
kainino0x
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kainino0x
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still LGTM % #1865 and the open comment
| {{GPUShaderModule}} entry-point. | ||
|
|
||
| <tr><td><dfn>maxComputeWorkgroupInvocations</dfn> | ||
| <td>{{GPUSize32}} <td>Higher <td>256 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For Metal, the workgroup size limit is per pipeline, not per device. In theory, any particular compute shader may have a lower maximum workgroup size than the device's maximum due to heavy resource usage.
I don't have any direct experience with this for Metal, so I'm just curious as to whether we are confident that we'll always be able to support a workgroup size of 256 for all possible compute shaders?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed this a while ago in #275 but never resolved. In general pipeline compilation can fail for many different reasons (couldn't allocate registers, or too much spilling is another one). 256 is probably safe in most cases, but to be 100% safe developers should use createRenderPipelineAsync to know if the compilation succeeded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I suppose there is no better option than to expect 256 to be supported on Metal pipelines :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my understanding, Metal is actually strictly less restrictive than other platforms - all of the APIs can fail pipeline creation arbitrarily (e.g. because they couldn't manage to allocate all of the memory needed by the shader at a particular workgroup size). Instead, Metal does not require workgroup_size at compile time, so they let you create a pipeline, and then tell you what the maximum workgroup_size is afterward*. In other words, Metal fails softly (reduces the maximum workgroup size) while other APIs fail hard (fail to create the pipeline at all).
(* In practice, I suspect this only helps for compute shaders which do not use workgroup memory, as workgroup memory size is almost always dependent on workgroup size.)
kainino0x
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree no need to block on #1865, but have more comments
| {{GPUShaderModule}} entry-point. | ||
|
|
||
| <tr><td><dfn>maxComputeWorkgroupInvocations</dfn> | ||
| <td>{{GPUSize32}} <td>Higher <td>256 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my understanding, Metal is actually strictly less restrictive than other platforms - all of the APIs can fail pipeline creation arbitrarily (e.g. because they couldn't manage to allocate all of the memory needed by the shader at a particular workgroup size). Instead, Metal does not require workgroup_size at compile time, so they let you create a pipeline, and then tell you what the maximum workgroup_size is afterward*. In other words, Metal fails softly (reduces the maximum workgroup size) while other APIs fail hard (fail to create the pipeline at all).
(* In practice, I suspect this only helps for compute shaders which do not use workgroup memory, as workgroup memory size is almost always dependent on workgroup size.)
toji
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
Merging as discussed with Kai in the editor's meeting today. |
minUniformBufferOffsetAlignmentat 256. Required from all APIs.minStorageBufferOffsetAlignmentat 256. Required for most VulkanAndroid devices.
maxVertexOutputComponentsat 64. Required for early Apple GPUson Metal.
maxFragmentInputComponentsat 60. Required for early Apple GPUson Metal.
maxColorAttachmentsat 4. Required for early Apple GPUs on Metaland many Android Vulkan devices.
maxComputeWorkgroupStorageSizeat 16352. Required for earlyApple GPUs on Metal and many Android Vulkan devices.
maxComputeWorkgroupInvocationsat 256. Required for many AndroidVulkan devices. Chose to not have per-dimension limits as they would
be more constraining and can be worked around when translating from
WGSL to SPIR-V.
maxComputePerDimensionDispatchSizeat 65535. Required for manyAndroid Vulkan devices.
See #1343
Preview | Diff