
Add more robust benchmarks#165

Merged
alice-i-cecile merged 10 commits into DioxusLabs:main from TimJentzsch:128-more-robust-benchmarks on Oct 11, 2022

Conversation

@TimJentzsch (Collaborator) commented Jun 16, 2022

Objective

Closes #128.

This PR aims to add more robust benchmarks that deal with much larger trees to get a better comparison point for performance improvements.

Solution

This PR generates large trees of 100,000 nodes via pseudo-random generation.

One benchmark uses a very flat hierarchy; the other has a branching factor of 7, so the tree is deeper (about 6 levels).
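A minimal sketch of the generation shape (an assumption for illustration, not the PR's actual code; styles and the seeded RNG are omitted): build 100_000 nodes breadth-first with a fixed branching factor of 7 and check the resulting depth.

```rust
use std::collections::VecDeque;

/// Build a tree of `target` nodes breadth-first with a fixed branching
/// factor; returns each node's depth (index 0 is the root).
fn build_depths(target: usize, branching: usize) -> Vec<usize> {
    let mut depths = vec![0]; // the root
    let mut queue = VecDeque::new();
    queue.push_back(0);
    while depths.len() < target {
        let parent = queue.pop_front().expect("queue refills while building");
        for _ in 0..branching {
            if depths.len() >= target {
                break;
            }
            depths.push(depths[parent] + 1);
            queue.push_back(depths.len() - 1);
        }
    }
    depths
}

fn main() {
    let depths = build_depths(100_000, 7);
    let max_depth = depths.iter().max().unwrap();
    // Matches the expectation stated above: log_7(100_000) ≈ 5.9
    println!("nodes: {}, max depth: {}", depths.len(), max_depth);
}
```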

I also increased the sample size of the other benchmarks from 100 to 200 to reduce noise.

Context

The current benchmarks work with rather small trees and are very noisy. We can also think about increasing the sample size further to improve the stability of these benchmarks.

Feedback wanted

  • The "deep" benchmark is a lot slower than the "flat" benchmark, even though the depth should only be about log_7(100_000) ≈ 6. The flat benchmark takes < 5 ms and the deep benchmark almost 3 s. While I expected the deep benchmark to be slower, this difference seems insane. Did I implement the benchmarks wrong?

@alice-i-cecile alice-i-cecile added the performance Layout go brr label Jun 16, 2022
@alice-i-cecile (Collaborator) left a comment

Is a pseudo-random generation approach like this useful for bench-marking?

Yes, I think this is quite nice. It allows us to explore how robust our benchmarks are down the line, and the seed makes generation repeatable.

Is the approach to generate the leaf styles good to generate versatile trees?

I like this, but I don't see why we're only randomizing the leaf styles. Why not randomize the style of every node?

The match statement in get_random_dimension uses float ranges, however this will be disallowed in the future. What is a better approach to select the dimension here?

Use one of the more appropriate tools from the rand crate; there should be one available to select randomly from an iterator (and possibly from an enum). Let me know if you need help finding this.
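A sketch of the suggested variant-selection approach (assumptions: the `Dimension` enum here is a stand-in for taffy's type, and a tiny LCG replaces `ChaCha8Rng` so the example is dependency-free; with the rand crate you would instead call `SliceRandom::choose` on the slice of variants).

```rust
// Stand-in for taffy's Dimension type (assumption for illustration).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Dimension {
    Auto,
    Points(f32),
    Percent(f32),
}

// Minimal deterministic RNG so the sketch runs without the rand crate.
struct Lcg(u64);

impl Lcg {
    fn next(&mut self) -> u64 {
        // Constants from Knuth's MMIX LCG.
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        self.0
    }
}

/// Pick a variant by index instead of matching on float ranges.
fn get_random_dimension(rng: &mut Lcg) -> Dimension {
    const CHOICES: [Dimension; 3] = [
        Dimension::Auto,
        Dimension::Points(100.0),
        Dimension::Percent(0.5),
    ];
    CHOICES[(rng.next() % CHOICES.len() as u64) as usize]
}

fn main() {
    let mut rng = Lcg(42);
    println!("{:?}", get_random_dimension(&mut rng));
}
```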

```rust
}

/// Get a randomly generated style for a node
fn get_random_style(rng: &mut ChaCha8Rng) -> FlexboxLayout {
```
Collaborator

IMO this should be a method on FlexboxLayout. I would actually put this into the main crate and hide it behind a off-by-default feature flag. This functionality would be excellent for downstream crates like bevy and dioxus as well.

I think the best architecture here is actually a Random trait that we can implement for each of the fields on the FlexboxLayout, as well as on the FlexboxLayout itself.
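The suggested architecture could be sketched like this (assumptions: all names are illustrative, the real version would take a `rand::Rng` rather than a closure, and only two of `FlexboxLayout`'s fields are shown).

```rust
/// Proposed trait: implemented for each field type and for the style itself.
trait Random {
    fn random(next: &mut dyn FnMut() -> u64) -> Self;
}

#[derive(Debug)]
enum FlexDirection {
    Row,
    Column,
}

impl Random for FlexDirection {
    fn random(next: &mut dyn FnMut() -> u64) -> Self {
        if next() % 2 == 0 {
            FlexDirection::Row
        } else {
            FlexDirection::Column
        }
    }
}

// Simplified stand-in for taffy's FlexboxLayout (assumption for illustration).
#[derive(Debug)]
struct FlexboxLayout {
    flex_direction: FlexDirection,
    flex_grow: f32,
}

impl Random for FlexboxLayout {
    // The style's impl just composes the field impls.
    fn random(next: &mut dyn FnMut() -> u64) -> Self {
        FlexboxLayout {
            flex_direction: FlexDirection::random(next),
            flex_grow: (next() % 100) as f32 / 100.0,
        }
    }
}

fn main() {
    // Deterministic stand-in for a seeded RNG.
    let mut state = 42u64;
    let mut next = move || {
        state = state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        state
    };
    let style = FlexboxLayout::random(&mut next);
    println!("{style:?}");
}
```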

Collaborator Author

How would a random layout style be useful for bevy and dioxus? Or do you mean for their benchmarks?

This would require us to add the rng dependencies to the main crate, but I guess if it's behind a feature flag that's okay.

Collaborator Author

One problem with this approach is that you might want to fine-tune the probabilities of each style.

Another problem is that for deep layouts it's easy to create nonsensical styles, e.g. a parent node will often be smaller than its child nodes. I'm not sure what the best way to fix this would be.

Collaborator

How would a random layout style be useful for bevy and dioxus? Or do you mean for their benchmarks?

For their benchmarks, yes :)

Collaborator Author

@alice-i-cecile One problem with this approach is that I cannot find a way to enable specific features by default only in a given profile.

We don't want to have the rand feature enabled by default, but we do want it in the bench profile.

I guess one way to resolve this would be to explicitly enable the feature with a CLI flag and to only define the benchmarks if the feature is enabled. However, you would then always have to remember to enable the feature.

Do you know of a better way to do this?

Collaborator

Hmm. I think if taffy defines these methods behind a feature flag, downstream crates will be able to enable that feature flag in their dev-dependencies?

I think that having to remember to add a flag to the dev dependencies a single time per library is fine.

Collaborator Author

And how can we handle that for our own benchmarks? I haven't found a way to enable a feature for a specific profile only.

Collaborator

We should be able to import taffy as a dev-dependency within this crate, and can enable additional features there.
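In Cargo.toml this self-import trick might look like the following sketch (assumptions: the feature name `random` and the dependency versions are illustrative, not the PR's actual manifest):

```toml
[features]
default = []
# Off by default; benches and downstream crates opt in.
random = ["dep:rand", "dep:rand_chacha"]

[dependencies]
rand = { version = "0.8", optional = true }
rand_chacha = { version = "0.3", optional = true }

[dev-dependencies]
# The crate depends on itself for tests/benches, with the feature enabled,
# so `cargo bench` gets the random generators without any extra CLI flags.
taffy = { path = ".", features = ["random"] }
```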

@TimJentzsch
Collaborator Author

I like this, but I don't see why we're only randomizing the leaf styles. Why not randomize the style of every node?

Sorry, this is a mistake on my part. We'll probably randomize all node styles, if we can avoid the issues mentioned in my other comment.

@alice-i-cecile
Collaborator

Another problem might be that for deep layouts, it's easy to create nonsensical styles, e.g. a parent node will often be smaller than the child nodes. I'm not sure what the best way would be to fix this.

I'm alright with these nonsense layouts as a starting point. I don't expect them to be terribly unrepresentative, and if we find that they are, we can tune the randomization in a follow-up.

@alice-i-cecile alice-i-cecile added benchmarks Measuring performance and removed performance Layout go brr labels Jun 17, 2022
@TimJentzsch TimJentzsch force-pushed the 128-more-robust-benchmarks branch from 38f5abc to 4362ff9 Compare October 10, 2022 15:34
@TimJentzsch TimJentzsch marked this pull request as ready for review October 11, 2022 15:44
@TimJentzsch
Collaborator Author

This should now be ready for review.

The two benchmarks (both use 100_000 nodes):

  • Flat hierarchy: every child has 1–4 sub-children; otherwise the hierarchy is flat.
  • Deep hierarchy: the tree has a branching factor of 7 until the 100_000 nodes are reached. This should give us a depth of about log_7(100_000) ≈ 6.

The deep hierarchy benchmark needs almost 3 seconds, while the flat benchmark finishes in < 5 ms on my system.

This difference seems insane to me; is there an error in the tree generation? Otherwise we might have some real performance problems, since 100k is not that many nodes.

@alice-i-cecile (Collaborator) left a comment

This looks correct to me, and the code is in a solid place to start.

Unfortunately, yes, the performance differences in nested hierarchies are really stark. As a follow-up, we should probably add a single-branch tree benchmark and graph it by depth, but according to @mockersf there was at least one remaining exponential performance issue :(

@nicoburns
Collaborator

The deep hierarchy benchmark needs almost 3 seconds, while the flat benchmark finishes in < 5 ms on my system.
This difference seems insane to me; is there an error in the tree generation? Otherwise we might have some real performance problems, since 100k is not that many nodes.

I think Taffy currently has at least O(n^2) complexity with respect to depth, possibly even O(n!). I think some of this is unavoidable, but I also think there's something not quite right with the caching layer. I was logging calls to compute recently, and it seemed to be called far more often than I would expect for nested hierarchies.
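A hypothetical model of why uncached recursion blows up with depth (an illustration, NOT taffy's actual algorithm): if every node runs a measure pass and then a layout pass over its subtree, and neither pass caches results, the number of compute calls grows exponentially in depth even for a single branch.

```rust
// Hypothetical model (assumption: this is not taffy's code): each node is
// visited for a measure pass and again for a final layout pass, and each
// pass recurses into the subtree without caching.
fn compute(depth: u32, calls: &mut u64) {
    *calls += 1;
    if depth == 0 {
        return;
    }
    compute(depth - 1, calls); // measure pass on the subtree
    compute(depth - 1, calls); // layout pass repeats the same work
}

fn main() {
    for depth in [1, 3, 6] {
        let mut calls = 0;
        compute(depth, &mut calls);
        // calls = 2^(depth + 1) - 1: exponential in depth, not in node count
        println!("depth {depth}: {calls} calls");
    }
}
```

Under this model a depth-6 chain already costs 127 calls for 7 nodes, which is the kind of blow-up a working cache layer should prevent.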


Labels

benchmarks Measuring performance

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Create more robust benchmarks

3 participants