Support comparing two sets of pystats #98816
This adds support for comparing pystats collected from two different builds.

- The `--json-output` option can be used to load in a set of raw stats and output a JSON file.
- Two of these JSON files can be provided on the next run, and then comparative results between the two are output.

The refactor required is basically to:

- Separate out the building of table contents from emitting the table
- Call these new functions from functions designed for either single results or comparative results

Part of the work for: faster-cpython/tools#115
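The refactor described above can be pictured roughly as follows. This is a minimal sketch, not the code in this PR: `build_rows`, `emit_table`, `output_single` and `output_comparative` are hypothetical names, and the stats are assumed to be a flat dict of counters.

```python
import io


def build_rows(stats):
    """Build table contents only: one (name, count) row per counter."""
    return [(name, count) for name, count in sorted(stats.items())]


def emit_table(header, rows, out):
    """Emit an already-built table in a Markdown-like layout."""
    print("|", " | ".join(header), "|", file=out)
    print("|", " | ".join("---" for _ in header), "|", file=out)
    for row in rows:
        print("|", " | ".join(str(cell) for cell in row), "|", file=out)


def output_single(stats, out):
    """Top-level entry point for one set of stats."""
    emit_table(("Name", "Count"), build_rows(stats), out)


def output_comparative(base, head, out):
    """Top-level entry point for two sets of stats, shown side by side."""
    base_rows = dict(build_rows(base))
    head_rows = dict(build_rows(head))
    # Keep the base ordering, then append names that only appear in head.
    names = list(base_rows) + [n for n in head_rows if n not in base_rows]
    rows = [(n, base_rows.get(n, ""), head_rows.get(n, "")) for n in names]
    emit_table(("Name", "Base", "Head"), rows, out)


out = io.StringIO()
output_comparative({"hits": 10, "misses": 2}, {"hits": 12, "deferred": 1}, out)
report = out.getvalue()
```

Because the row-building step knows nothing about output, both entry points can reuse it, and only the comparative one has to deal with merging.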
markshannon left a comment
I assume that you've checked that this produces the same output.
return []

if len(a_rows):
    a_ncols = list(set(len(x) for x in a_rows))
Why the list(set(...)), wouldn't set(...) be sufficient?
Further down, I want to get the single value out of the set. (a_ncols[0])
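To illustrate the exchange above: a set is enough to check that every row has the same width, but a set cannot be indexed, which is why the code converts it to a list before reading `a_ncols[0]`. The sketch below uses made-up rows and also shows one alternative, unpacking the single element directly; that alternative is not what the PR does.

```python
# Hypothetical sample rows; each row is (key, value, value).
a_rows = [("LOAD_FAST", 10, "5%"), ("STORE_FAST", 4, "2%")]

widths = set(len(row) for row in a_rows)
if len(widths) != 1:
    raise ValueError("Table is ragged")

# Approach discussed in the thread: convert to a list, then index.
ncols_via_list = list(widths)[0]

# Alternative: unpack the only element. This raises ValueError if the
# set holds zero or several elements.
(ncols_via_unpack,) = widths
```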
Tools/scripts/summarize_stats.py
Outdated
ncols = b_ncols[0]

default = [""] * (ncols - 1)
a_data = dict((x[0], x[1:]) for x in a_rows)
a_data = {x[0]: x[1:] for x in a_rows}
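The suggested dict comprehension builds the same mapping as the `dict(...)` call over a generator of pairs; it is purely a readability change. A quick check with made-up rows:

```python
# Hypothetical sample rows; the first column is the key.
a_rows = [("LOAD_FAST", 10, "5%"), ("STORE_FAST", 4, "2%")]

# Form used in the diff: dict() over a generator of (key, value) pairs.
via_dict_call = dict((x[0], x[1:]) for x in a_rows)

# Form suggested in review: a dict comprehension.
via_comprehension = {x[0]: x[1:] for x in a_rows}
```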
Tools/scripts/summarize_stats.py
Outdated
default = [""] * (ncols - 1)
a_data = dict((x[0], x[1:]) for x in a_rows)
b_data = dict((x[0], x[1:]) for x in b_rows)
Is it worth adding a check for duplicate keys? len(a_data) == len(a_rows)
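The length comparison works because a dict silently keeps only the last value for a repeated key, so any duplicate makes the dict shorter than the list of rows. A minimal sketch of such a check, where `rows_to_dict` is a hypothetical helper and not a function from the PR:

```python
def rows_to_dict(rows):
    """Map each row's first column to the rest of the row, rejecting duplicates."""
    data = {row[0]: row[1:] for row in rows}
    # A repeated key overwrites the earlier entry, so the lengths differ.
    if len(data) != len(rows):
        raise ValueError("Duplicate keys")
    return data


unique = rows_to_dict([("LOAD_FAST", 10), ("STORE_FAST", 4)])

try:
    rows_to_dict([("LOAD_FAST", 10), ("LOAD_FAST", 11)])
    duplicate_detected = False
except ValueError:
    duplicate_detected = True
```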
def main():
    stats = gather_stats()
def output_single_stats(stats):
Are you using the "emit_" and "output_" prefixes interchangeably, or is there a difference?
It's keeping the naming from the original code (which is @markshannon's), which has output_stats as a top-level function (which I split into three). I guess the difference is that output_ is these top-level functions, whereas each of the emit_ functions emits a single section. But we certainly could use emit_ everywhere.
Yep. You can see both the comparative output and the single output on my prototype PR.
See mdboom#3 for a prototype of where this is possibly headed in a GitHub Action.