@kvark
Contributor

@kvark kvark commented Oct 18, 2018

No description provided.

Contributor

@litherum litherum left a comment

Thanks for writing this up, it's fantastic to have this description written down!


Users issue rendering and compute commands (such as resource bindings, draw calls, etc.) via command buffers. The concept of `WebGPUCommandBuffer` matches the native graphics APIs. A command buffer goes through the following stages in its life cycle. It starts when a new `WebGPUCommandBuffer` is created from a `WebGPUCommandQueue` instance. From this point, the command buffer is considered to be in the "recording" state.

Commands can be encoded independently of the `WebGPUDevice`, or anything currently happening on the GPU. Recording is a CPU-only operation, and multiple command buffers can be recorded independently on web workers. (TODO: disallow recording multiple command buffers on the same thread/web worker?) Recording usually consists of a number of passes, render or compute, with occasional copy operations inserted between them.
Contributor

Not sure what "independently" means. Do you mean independently in time (aka it can be done while the GPU is busy) or device-independently (aka commands are represented in a device-independent byte code and then further compiled to the specific device later)?


Since a programmable pass defines the resource binding scope, synchronization rules, fixes the resuorce usage, and exposes a number of specific operations, we encapsulate the encoder of a pass into a separate object, such as `WebGPURenderPassEncoder` and `WebGPUComputePassEncoder`. The pass encoder object can be obtained from a command buffer by calling `beginRenderPass` or `beginComputePass` correspondingly. The command buffer is expected to be in "recording" state, or otherwise a synchronous error is triggered. No operations are expected to be done on the `WebGPUCommandBuffer` if there is an open pass being encoded to it. Calling any methods on the command buffer with an open pass, or submitting it to the command queue, triggers a synchronous error. A pass encoding consists of state setting code and draw/dispatch calls, which are all methods on the corresponding encoder object. In order to close a pass, the user calls `WebGPUProgrammablePassEncoder::endPass`, which returns the owner `WebGPUCommandBuffer` object.
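
The pass rules described above can be sketched as a small state model. This is a hypothetical illustration, not the WebGPU API: `CommandBufferModel`, `PassEncoderModel`, `record`, `copy` and `closePass` are invented names, and the "recording" state check is left out to keep the focus on open passes.

```typescript
// Hypothetical model of the pass-encoding rules; not the real WebGPU API.
type PassKind = "render" | "compute";

class PassEncoderModel {
  readonly commands: string[] = [];

  constructor(
    readonly kind: PassKind,
    private readonly owner: CommandBufferModel,
  ) {}

  // State setting and draw/dispatch calls are methods on the encoder.
  record(command: string): void {
    this.commands.push(command);
  }

  // Closes the pass and hands back the owning command buffer.
  endPass(): CommandBufferModel {
    this.owner.closePass(this);
    return this.owner;
  }
}

class CommandBufferModel {
  readonly passes: PassEncoderModel[] = [];
  private openPass: PassEncoderModel | null = null;

  beginRenderPass(): PassEncoderModel {
    return this.beginPass("render");
  }

  beginComputePass(): PassEncoderModel {
    return this.beginPass("compute");
  }

  // Stand-in for any operation recorded directly on the command buffer.
  copy(): void {
    this.assertNoOpenPass();
  }

  // Called by the encoder; fails if that pass is not the one currently open.
  closePass(pass: PassEncoderModel): void {
    if (this.openPass !== pass) {
      throw new Error("pass is not open on this command buffer");
    }
    this.openPass = null;
  }

  private beginPass(kind: PassKind): PassEncoderModel {
    this.assertNoOpenPass();
    const pass = new PassEncoderModel(kind, this);
    this.openPass = pass;
    this.passes.push(pass);
    return pass;
  }

  // Synchronous error while a pass is still being encoded.
  private assertNoOpenPass(): void {
    if (this.openPass !== null) {
      throw new Error("a pass is still open on this command buffer");
    }
  }
}

const cb = new CommandBufferModel();
const pass = cb.beginRenderPass();
pass.record("setPipeline");
pass.record("draw");
const owner = pass.endPass();
console.log(owner === cb); // true
```

In this sketch a command buffer can hold several passes one after another, but never two open at once, which is one possible reading of the rule that nothing may be done on the command buffer while a pass is open.
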
Contributor

Passes cannot straddle command buffers, and a command buffer may contain multiple passes (I think?)

Contributor

s/are expected to/may/

Typo: resuorce


TODO: section about finishing the recording

Once a command buffer has been recorded, it is in the "ready" state. It is valid to transfer this object between web workers. The only operations available on a "ready" command buffer are dropping it and submitting it via `WebGPUCommandQueue::submit`. This method takes a sequence of command buffers and submits them (in the given order) to the GPU driver for execution. There are a few stages here, hidden from the user, before the command buffer actually reaches the GPU.
Contributor

What does "dropping" mean?

Contributor

I assume it means letting it be garbage collected before it is submitted.

Once submitted, the command buffer switches to the "executing" state. If the WebGPU implementation fails to submit the command buffer due to a problem with the recorded content (e.g. exceeding the limit for the instance count in a draw call), it is turned into an internally null object, and an asynchronous error is reported. The feature to re-use command buffers for multiple submissions is still being discussed, and until this is settled, we consider the `WebGPUCommandBuffer` to be moved into the submission. Any operation on a command buffer in the "executing" state, other than dropping it (which is what the user is expected to do), triggers a synchronous error.

Contributor

Maybe mention that "submission" means that the GPU will complete working on it in finite time


If the submission is successful, then at some point in time the GPU will be done processing it. The WebGPU implementation takes the responsibility to detect this moment and gracefully recycle/destroy this command buffer, when it's safe to do so.
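
The life cycle described so far can be sketched as a tiny state machine. This is a hypothetical model rather than the WebGPU API: `LifecycleBuffer`, `LifecycleQueue`, `expect` and `completeAll` are invented names, and `finish()` is an assumed way to leave the "recording" state, since the text above still marks that step as a TODO.

```typescript
// Hypothetical life cycle model; names and methods are illustrative only.
type BufferState = "recording" | "ready" | "executing" | "done";

class LifecycleBuffer {
  state: BufferState = "recording";

  constructor(readonly label: string) {}

  // Assumed way to end recording; the source leaves this step as a TODO.
  finish(): void {
    this.expect("recording");
    this.state = "ready";
  }

  // Synchronous error when the buffer is not in the required state.
  expect(state: BufferState): void {
    if (this.state !== state) {
      throw new Error(`${this.label}: expected "${state}", found "${this.state}"`);
    }
  }
}

class LifecycleQueue {
  readonly submitted: string[] = [];
  private readonly inFlight: LifecycleBuffer[] = [];

  createCommandBuffer(label: string): LifecycleBuffer {
    return new LifecycleBuffer(label);
  }

  // Submits buffers in the given order; each one moves to "executing".
  submit(buffers: LifecycleBuffer[]): void {
    for (const buffer of buffers) {
      buffer.expect("ready"); // validate every buffer before changing any
    }
    for (const buffer of buffers) {
      buffer.state = "executing";
      this.submitted.push(buffer.label);
      this.inFlight.push(buffer);
    }
  }

  // Stand-in for the implementation noticing that the GPU has finished.
  completeAll(): void {
    for (const buffer of this.inFlight.splice(0)) {
      buffer.state = "done";
    }
  }
}

const queue = new LifecycleQueue();
const first = queue.createCommandBuffer("a");
const second = queue.createCommandBuffer("b");
second.finish();
first.finish();
queue.submit([first, second]);
console.log(queue.submitted.join(" -> ")); // a -> b
```

The order recorded by `submit` follows the list passed in, not the order in which the buffers were created or finished.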

## Short version
Contributor

Usually "short versions" go at the top of the document

@kvark
Contributor Author

kvark commented Oct 19, 2018

Thanks for the quick review, @litherum, and great feedback!


"recording" -> "ready" -> "executing" -> done

Command buffers are created from and submitted to a command queue. The queue is also used to signal fences, allowing the user to know when the command buffers are done.
Contributor

Are command buffers supposed to be submitted in the same order they were created on the queue or not? (I assume they don't, but better to make sure.)

@Kangz
Contributor

Kangz commented Oct 21, 2018

Thanks for putting this together, I have one question but it looks very clear otherwise. The model LGTM, though I wouldn't be against having a different object for finished command buffers.

@kvark
Contributor Author

kvark commented Oct 22, 2018

Thanks for the feedback! I believe the concerns are addressed now.

@Kangz
Contributor

Kangz commented Oct 23, 2018

Yep, still LGTM


Command buffers carry sequences of user commands on the CPU side. They can be recorded independently of the work done on the GPU, and of each other. They go through the following stages:

creation -> "recording" -> "ready" -> "executing" -> done
Contributor

> Recording usually consists of a number of passes

Seems like there should be a loop here between ready and recording


Once a command buffer has been recorded, it is in the "ready" state. It is valid to transfer this object between web workers. When "ready", a command buffer can only be submitted for execution via `WebGPUCommandQueue::submit`. This method takes a sequence of command buffers and submits them (in the given order) to the GPU driver. There are a few stages here, hidden from the user, before the command buffer actually reaches the GPU.

Once submitted, the command buffer switches to the "executing" state, which means the driver and then the GPU start working on it, with the expectation that it will be done in a reasonable amount of time. If the WebGPU implementation fails to submit the command buffer due to a problem with the recorded content (e.g. exceeding the limit for the instance count in a draw call), it is turned into an internally null object, and an asynchronous error is reported. The feature to re-use command buffers for multiple submissions is still being discussed, and until this is settled, we consider the `WebGPUCommandBuffer` to be moved into the submission. Any operation on a command buffer in the "executing" state, other than dropping it (which is what the user is expected to do), triggers a synchronous error.
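
The failure path in this paragraph can be illustrated with a minimal sketch. Everything here is hypothetical: `SubmittableBuffer`, `ValidatingQueue`, `takeErrors` and the `MAX_INSTANCE_COUNT` value are invented for the example, and the asynchronous error channel is modelled as a list that the caller polls later rather than a real event or promise.

```typescript
// Hypothetical model of submission-time validation; not the real WebGPU API.
const MAX_INSTANCE_COUNT = 1024; // assumed limit, for illustration only

interface DrawCommand {
  instanceCount: number;
}

class SubmittableBuffer {
  // `null` models the "internally null object" left after a failed submission.
  commands: DrawCommand[] | null;
  submitted = false;

  constructor(commands: DrawCommand[]) {
    this.commands = commands;
  }
}

class ValidatingQueue {
  private pendingErrors: string[] = [];

  submit(buffer: SubmittableBuffer): void {
    // The buffer is moved into the submission, so a second submit fails synchronously.
    if (buffer.submitted) {
      throw new Error("command buffer was already submitted");
    }
    buffer.submitted = true;

    const commands = buffer.commands ?? [];
    if (commands.some((c) => c.instanceCount > MAX_INSTANCE_COUNT)) {
      buffer.commands = null;
      // Queued rather than thrown: the caller only sees it later.
      this.pendingErrors.push("instance count exceeds the limit");
    }
  }

  // Stand-in for the asynchronous error channel: returns and clears queued errors.
  takeErrors(): string[] {
    const errors = this.pendingErrors;
    this.pendingErrors = [];
    return errors;
  }
}

const validatingQueue = new ValidatingQueue();
const oversized = new SubmittableBuffer([{ instanceCount: MAX_INSTANCE_COUNT + 1 }]);
validatingQueue.submit(oversized); // does not throw
console.log(oversized.commands === null, validatingQueue.takeErrors().length); // true 1
```

The design point the sketch tries to capture is the split between the two error kinds: misuse of the object's state fails immediately, while a problem with recorded content surfaces later and leaves a null object behind.
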
Contributor

"which means the command buffer will execute (both on the CPU and GPU) in finite time"

@kvark
Contributor Author

kvark commented Oct 29, 2018

The finish() sentence is added, and I believe all the remaining concerns are resolved now. Please take a look.

@Kangz
Contributor

Kangz commented Oct 30, 2018

Still LGTM, would still like a different object for finished command buffers but we can discuss that later.

@kvark
Contributor Author

kvark commented Oct 30, 2018

@Kangz I don't feel comfortable pushing design changes into the document. As we discussed on the last call, the WebIDL is the source of truth and the upstream of all changes. The documentation just explains how the IDL is supposed to be used. Thus, we should consider adding the notion of the finished command buffer object here only after (or at the same time as) it's changed in the IDL file.

@Kangz
Contributor

Kangz commented Oct 30, 2018

Understood, thanks!


Commands can be encoded independently of anything done on the `WebGPUDevice` or the underlying GPU.
Recording is a CPU-only operation, and multiple command buffers can be recorded independently on web workers.
(TODO: disallow recording multiple command buffers on the same thread/web worker?)
Contributor

Unless I am missing something I do not see why we would disallow recording multiple command buffers on the same thread/web worker.

Contributor Author

The case in question is when the user has multiple command buffers in the "recording" state on the same thread at the same time. This case is rarely used. IIRC, D3D12 command pool disallows that explicitly (but multiple pools can be used).

@kvark
Contributor Author

kvark commented Oct 31, 2018

I believe we got thumbs up from the major parties. Let's proceed and then patch in follow-ups.

@kvark kvark merged commit b555093 into gpuweb:master Oct 31, 2018
@kvark kvark deleted the doc-submission branch October 31, 2018 19:01