Run only one pytest parametrization when generating optest #108936
ezyang wants to merge 5 commits into gh/ezyang/2335/base
Conversation
Signed-off-by: Edward Z. Yang <ezyang@meta.com> [ghstack-poisoned]
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/108936
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (1 Unrelated Failure) As of commit 045f8c9 with merge base 54dd65f. UNSTABLE: the following job failed but was likely due to flakiness present on trunk and has been marked as unstable.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Richard, I'm curious to see what you think of this. I'm trying to use optest on the torchvision test suite, and after hacking up pytest support in #108929 I noticed that this was 5x'ing the test time... for no good reason.

* torchvision nms tests before optests: 60 passed, 4 skipped, 1206 deselected in 11.47s
* after optests: 300 passed, 20 skipped, 1206 deselected in 49.85s

It's no good reason because torchvision has parametrized the tests to get a spread of various random generation, but for checking schema or fake tensor, we don't actually need to test different values.

This PR hacks up the codegen to replace pytest parametrize markers so that, instead of sampling many values, by default we only sample the first value. There are more bells and whistles we could add; for example, I could add an extra custom pytest marker to let you say "no no, please run each parametrization in the optest". I could also not do this.

With this PR:

* reduced optests: 88 passed, 4 skipped, 1206 deselected in 13.89s

Signed-off-by: Edward Z. Yang <ezyang@meta.com> [ghstack-poisoned]
```python
if pytestmark := new_method.__dict__.get("pytestmark"):
    import pytest

    new_pytestmark = []
    for mark in pytestmark:
        if isinstance(mark, pytest.Mark) and mark.name == "parametrize":
            argnames, argvalues = mark.args
            assert not mark.kwargs, "NYI"
            new_pytestmark.append(
                pytest.mark.parametrize(argnames, (next(iter(argvalues)),))
            )
        else:
            new_pytestmark.append(mark)
    new_method.__dict__["pytestmark"] = new_pytestmark
```
I've thought about this before, especially because the internal tests also use some parametrization that slows the tests down. It kind of depends on how the parametrization is used:
- if it is just to generate different random values, then yes, one parametrization is sufficient.
- if it is to generate different options in your operator (e.g. BatchNorm with training=True/False), then it is important that we test all of them (BatchNorm mutates in one path but not the other)
Since generate_opcheck_tests is meant to be a "slow but runs everything check, because the user didn't want to write OpInfos (and we don't have an OpInfo V2 API)", I think I'd prefer the default to be "we run all parameterizations", but if the user understands what their test is doing and how it exercises their operator, they should be able to control the behavior to run just a single parametrization, perhaps by adding a decorator on the original test. Though, this does run the risk that the user copy-pastes tests with this decorator when writing new tests. WDYT?
This is fine by me. I'll need to figure out a mechanism to add extra annotations to parametrize though.
Richard, I'm curious to see what you think of this. I'm trying to use optest on the torchvision test suite, and after hacking up pytest support in #108929 I noticed that this was 5x'ing the test time... for no good reason.

* torchvision nms tests before optests: 60 passed, 4 skipped, 1206 deselected in 11.47s
* after optests: 300 passed, 20 skipped, 1206 deselected in 49.85s

It's no good reason because torchvision has parametrized the tests to get a spread of various random generation, but for checking schema or fake tensor, we don't actually need to test different values.

This PR hacks up the codegen to replace pytest parametrize markers so that, instead of sampling many values, we sample only one value if you mark the test with `opcheck_only_one`. There's a carveout for device parametrization, where we always run all those variants.

With this PR:

* reduced optests: 88 passed, 4 skipped, 1206 deselected in 13.89s

Companion torchvision PR which uses this: pytorch/vision#7961

Signed-off-by: Edward Z. Yang <ezyang@meta.com> [ghstack-poisoned]
```python
if pytestmark := new_method.__dict__.get("pytestmark"):
    import pytest
```
Can you include a comment somewhere on how to actually mark a test with opcheck_only_one, for those who are unfamiliar with pytest?
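For readers less familiar with pytest marks, usage along these lines should opt a test in (a hedged sketch based on the PR description; the exact registration of the `opcheck_only_one` marker and the helper `has_opcheck_only_one` below are illustrative, not the PR's API):

```python
import pytest

# Marking a test with the custom marker signals that the generated
# optests should run only the first value of each parametrization.
@pytest.mark.opcheck_only_one
@pytest.mark.parametrize("seed", range(10))
def test_nms(seed):
    pass

def has_opcheck_only_one(func):
    # Illustrative check: scan the function's pytest marks for the opt-in
    # marker by name.
    return any(
        getattr(m, "name", None) == "opcheck_only_one"
        for m in func.__dict__.get("pytestmark", [])
    )
```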
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Stack from ghstack (oldest at bottom):
Richard, I'm curious to see what you think of this. I'm trying to use optest on the torchvision test suite, and after hacking up pytest support in #108929 I noticed that this was 5x'ing the test time... for no good reason.
It's no good reason because torchvision has parametrized the tests to get a spread of various random generation, but for checking schema or fake tensor, we don't actually need to test for different values.
This PR hacks up the codegen to replace pytest parametrize markers so that, instead of sampling many values, we sample only one value if you mark the test with `opcheck_only_one`. There's a carveout for device parametrization, where we always run all those variants.

With this PR:

* reduced optests: 88 passed, 4 skipped, 1206 deselected in 13.89s
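The device carveout could be decided by inspecting the argnames of each parametrize mark. A minimal sketch, assuming the check is purely name-based (the helper name is hypothetical and the PR's actual logic may differ):

```python
def should_reduce_parametrization(argnames):
    # Decide whether a parametrize mark should be reduced to one value.
    # Device parametrizations keep all their variants; everything else
    # is sampled once. `argnames` is the first argument to
    # pytest.mark.parametrize: a comma-separated string or a sequence.
    if isinstance(argnames, str):
        names = [n.strip() for n in argnames.split(",")]
    else:
        names = list(argnames)
    return "device" not in names
```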
Companion torchvision PR which uses this at pytorch/vision#7961
Signed-off-by: Edward Z. Yang ezyang@meta.com