Add support for new --show-percentiles arg to get more granular percentiles instead of just median to all commands#43
Conversation
| By default, only the median values are returned. You can optionally request all the individual run values as well. | ||
|
|
||
| #### Required arguments | ||
| #### Arguments |
There was a problem hiding this comment.
Adjusting this simply because not all arguments are required.
| * `--number` (`-n`): Total number of requests to send. | ||
| * `--file` (`-f`): File with URLs (one URL per line) to run benchmark tests for. | ||
| * `--output` (`-o`): The output format. | ||
| * `--output` (`-o`): The output format: Either "table" or "csv". |
There was a problem hiding this comment.
@felixarntz under all other "Arguments" headings I've noticed the usage of --format (-f) for the output format, but here it's defined as --output (-o).
Is this correct? It may be correct in code (and function as --output) but should we be interchanging flags?
There was a problem hiding this comment.
@10upsimon I'm aware of this, this happened due to different people working on the commands and names diverging as such :)
I agree we should address this, but thinking better in a separate PR. Please feel free to open one!
| * `--number` (`-n`): Total number of requests to send. | ||
| * `--file` (`-f`): File with URLs (one URL per line) to run benchmark tests for. | ||
| * `--output` (`-o`): The output format. | ||
| * `--output` (`-o`): The output format: Either "table" or "csv". |
There was a problem hiding this comment.
@felixarntz same comment here WRT --output as opposed to --format
|
@felixarntz code is good for me, and I've functionally tested and all of the commands and percentile/non percentile applications work as expected. So really just looking for feedback on whether we need to alter This applies to the Aside from that, good to go from my end thank you! |
|
@mukeshpanchal27 Could you give this a review some time this week? |
Follow up to #40.
This PR chooses a hard-coded set of percentiles 10, 25, 50 (median), 75, 90, previously discussed with @oandregal. This is also what is commonly used for HTTP Archive queries by that team, and we have also used that set of percentiles in our own HTTP Archive queries (e.g. this one).
In the future we can maybe make the concrete percentiles controllable by the argument, but that doesn't seem necessary at this point. The above percentiles are a good baseline to get a more representative impression of the overall accuracy of the values.
Testing this PR
wpt-*commands, make sure the median results are the same.benchmark-*commands, make sure the median results are roughly similar (they won't be the same because it's different runs).--show-percentilesflag in each one.wpt-*commands, make sure thep50results are the same as the previous median results.benchmark-*commands, make sure thep50results are roughly similar to the previous median results (they won't be the same because it's different runs).p25value should be the value of the 2nd lowest run result, or if you use 10 runs, thep90value should be the 9th lowest / 2nd highest run result. This sanity check can be done for any of the commands, since the underlying percentile calculation logic is the same in all of them.