Merged
nical approved these changes on Jun 25, 2024
Member
Author
Despite some mitigations, Linux is failing this benchmark.
Connections
Description
Adds a benchmark for compute pass recording, very similar to what we have for render passes.
The prime motivation was to find out whether the extensive changes I made to compute pass recording made performance worse or better; there are good reasons to expect either. The short answer: pass time improved by 4-10% compared to before I started!! 🥳
Even better, when submit time is included the improvement is 10-30%, but that is very likely not related to the compute pass recording refactors :)
Unfortunately, those changes landed over quite a long period of time, so unless someone bisects this carefully we won't know exactly what caused the improvement. It could be that the "fully consume the pass" change is responsible (we now make use of the fact that a pass can't be submitted twice). Then again, that is probably a wash: before the compute pass lifetimes refactor work started, a compute pass was a very simple data structure, whereas now it carries extensive resource ownership. So it's just as likely that something else caused this.
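To make the distinction between pass time and pass-plus-submit time concrete, here is a minimal sketch of how the two phases could be timed separately and how a relative improvement is computed. It is not the benchmark code from this PR: `time_phases`, `improvement_percent`, and the `record`/`submit` closures are hypothetical stand-ins, not wgpu APIs, and the real benchmark may measure things differently.

```rust
use std::time::{Duration, Instant};

/// Wall-clock time spent in each phase of one benchmark iteration.
#[derive(Debug, Clone, Copy, PartialEq)]
struct PhaseTimes {
    record: Duration,
    submit: Duration,
}

/// Times two phases separately. `record` and `submit` are hypothetical
/// stand-ins for "record a compute pass" and "submit the recorded work";
/// they are assumptions for illustration, not wgpu functions.
fn time_phases<T>(record: impl FnOnce() -> T, submit: impl FnOnce(T)) -> PhaseTimes {
    let start = Instant::now();
    let recorded = record();
    let record_time = start.elapsed();

    let start = Instant::now();
    submit(recorded);
    let submit_time = start.elapsed();

    PhaseTimes {
        record: record_time,
        submit: submit_time,
    }
}

/// Relative improvement in percent; a positive value means `after` is faster.
fn improvement_percent(before: Duration, after: Duration) -> f64 {
    let before = before.as_secs_f64();
    let after = after.as_secs_f64();
    (before - after) / before * 100.0
}
```

With this definition, going from 100 ms to 90 ms counts as an improvement of about 10%. Reporting `record` on its own and `record + submit` together matches the two figures quoted above.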
For this comparison, I backported the benchmarks to c1291bd. To check it out yourself, use the before-computepass-work-with-benches branch on my fork.
Raw results comparing c1291bd1312a77be73954856d0e7728877232033 against this branch:
Testing
it is a test!
Checklist
- Run cargo fmt.
- Run cargo clippy. If applicable, add:
  - --target wasm32-unknown-unknown
  - --target wasm32-unknown-emscripten
- Run cargo xtask test to run tests.
- Add change to CHANGELOG.md. See simple instructions inside file.