Skip to content

Include pystats during warmups#170

Merged
vstinner merged 1 commit intopsf:mainfrom
mdboom:stats-on-warmups
Oct 30, 2023
Merged

Include pystats during warmups#170
vstinner merged 1 commit intopsf:mainfrom
mdboom:stats-on-warmups

Conversation

@mdboom
Copy link
Copy Markdown
Collaborator

@mdboom mdboom commented Oct 25, 2023

On recent investigations, it's become clear that the support for py_stats in pyperf (that I myself originally submitted) isn't ideal, since stats aren't collected during warmups. Since warmups run in the same process as the "main" runs, if any optimizations or specializations are performed during the warmup, they aren't recorded in the stats.

This change simply includes warmups in stats collection. Calibration runs are left as is, since they are run in a separate process.

@markshannon
Copy link
Copy Markdown

It seems odd to have the stats different from the times.
I approve of removing "warmup", but I'd like the stats and times to be for the same thing.

Copy link
Copy Markdown
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@vstinner vstinner merged commit a81aead into psf:main Oct 30, 2023
@vstinner
Copy link
Copy Markdown
Member

pyperf stores timing of each "run", but if I understand correctly, stats are only cumulated for all runs, maybe even for all worker processes, no?

@mdboom
Copy link
Copy Markdown
Collaborator Author

mdboom commented Oct 31, 2023

I approve of removing "warmup", but I'd like the stats and times to be for the same thing.

That was the goal when pystats support was originally added to pyperf, but that hides a lot of the specialization/optimization work. There's a discussion of what happens when warmups are included in the timings here. I'm not sure it's worth it: faster-cpython/bench_runner#61

@mdboom
Copy link
Copy Markdown
Collaborator Author

mdboom commented Oct 31, 2023

pyperf stores timing of each "run", but if I understand correctly, stats are only cumulated for all runs, maybe even for all worker processes, no?

If you mean we don't collect stats for each run separately, but instead collect them for the whole run of the process (excluding the pyperf / pyperformance "harness"), you are correct. It means we can't see what each run did, and if earlier runs behave differently than later ones etc. It probably could be done, but would add a lot of complexity / generate a lot more data etc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants