Skip to content

make test folder a package#4436

Closed
pmeier wants to merge 4 commits intopytorch:mainfrom
pmeier:tests-package
Closed

make test folder a package#4436
pmeier wants to merge 4 commits intopytorch:mainfrom
pmeier:tests-package

Conversation

@pmeier
Copy link
Contributor

@pmeier pmeier commented Sep 17, 2021

This solves two issues:

  1. Without this we cannot import other modules within subfolders, e.g. using anything from test/common_utils.py in test/prototype/*. With this change, we can do from test.common_utils import foo from anywhere in our test suite.
  2. pytest requires unique module names in non-package folders. That means, we cannot have test/protototype/datasets/test_utils.py if test/test_utils.py already exists. See this CI run from Add prototype namespace and datasets #4433 for an example.

cc @pmeier

Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @pmeier , this LGTM but I'll have to import this into fbcode before merging to make sure this doesn't toast the internal infra.... I'll get back to you

Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just marking so we don't accidentally merge before checking fbcode. Also as Philip noted on slack the onnx import failures seem legit

@pmeier
Copy link
Contributor Author

pmeier commented Sep 17, 2021

The problem lies in the fact that the test is invoked with python test/test_onnx.py. While this invokes pytest internally

vision/test/test_onnx.py

Lines 500 to 501 in a2b4c65

if __name__ == '__main__':
pytest.main([__file__])

the test package is never imported and thus importing from submodules of it (regardless of absolute or relative) fails.

So the question is, if we want to retain the ability to run tests by invoking the module with python directly? After our migration to pytest I see no point. In the example above it functions as drop-in replacement: pytest test/test_onnx.py

@fmassa
Copy link
Member

fmassa commented Sep 17, 2021

Let's just assume that tests are run with pytest only then

@mthrok
Copy link
Contributor

mthrok commented Sep 17, 2021

I recommend to introduce another directory in test as a top-level unit test module for the following reasons.

  1. Having torchvision and test (the top level test directory) forces to have torchvision from the repository root precedence when import torchvision.
    This is fine for the regular development flow, but sometimes, there are occasions one wants to run unit tests on the installations from binary distribution, such as nightly and release candidates. In this case, having test module top directory inside of test directory comes handy because cd test and the tests will run on installed versions, instead of source directory. Otherwise one has to delete torchvision directory first to run tests.
  2. There is a standard module called test, though it is not really used. I do not know if the existing test frameworks use this module, but having test/__init__.py and invoking it from the top level might cause a problem.

@NicolasHug
Copy link
Member

These are good points @mthrok , thanks. Pytest actually has a few guidelines on this: https://docs.pytest.org/en/6.2.x/goodpractices.html#choosing-a-test-layout-import-rules

Their recommendation would be to move torchvision into src/torchvision instead.

This makes me wonder whether this is worth it or not. Would it be possible to have a test folder within the prototype package instead? I guess you wouldn't be able to access the same common test utils though

@pmeier
Copy link
Contributor Author

pmeier commented Sep 17, 2021

This makes me wonder whether this is worth it or not. Would it be possible to have a test folder within the prototype package instead? I guess you wouldn't be able to access the same common test utils though

Yes, I think it is possible. Currently, I don't need access to anything, but that might be a problem in the future.

@mthrok
Copy link
Contributor

mthrok commented Sep 17, 2021

Their recommendation would be to move torchvision into src/torchvision instead.

I considered doing so for audio since it would not be difficult but I was afraid that it might give degraded developer experience to folks work on multiple repos.

I also see that data is following the pattern.
It would be best, if all PyTorch projects were structured that way from the beginning.

@mthrok
Copy link
Contributor

mthrok commented Sep 17, 2021

This makes me wonder whether this is worth it or not. Would it be possible to have a test folder within the prototype package instead? I guess you wouldn't be able to access the same common test utils though

Yes, I think it is possible. Currently, I don't need access to anything, but that might be a problem in the future.

With that approach, I assume that CI test script needs to be updated so as to run the tests in CI. Prototype features can come and go. I think it's best to reduce the number of places one need to update.

@pmeier
Copy link
Contributor Author

pmeier commented Sep 17, 2021

With that approach, I assume that CI test script needs to be updated so as to run the tests in CI. Prototype features can come and go. I think it's best to reduce the number of places one need to update.

We could have a separate job to run pytest torchvision/prototype similar to our onnx and hub tests. Maybe we can even only trigger it if something in there changes. In any case if the prototype test fails on an unrelated PR, we know that the error can be ignored.

@NicolasHug
Copy link
Member

NicolasHug commented Sep 17, 2021

I'm not sure I understand the concern @mthrok , is this about test discovery?

I think we would just need to change

-pytest --cov=torchvision --junitxml=test-results/junit.xml -v --durations 20 test --ignore=test/test_datasets_download.py
+pytest --cov=torchvision --junitxml=test-results/junit.xml -v --durations 20 test torchvision --ignore=test/test_datasets_download.py

in run_test.sh

(there might be options in pytest.ini as well, I haven't checked)

@pmeier
Copy link
Contributor Author

pmeier commented Sep 17, 2021

I think we would just need to change

Nope, we don't need to change anything to run the "normal" test suite. pytest is set up to only look for tests in test/ so we don't need to ignore torchvision/:

vision/pytest.ini

Lines 9 to 10 in 1140ecf

testpaths =
test

@NicolasHug
Copy link
Member

I'm assuming that we want all CIs to run both the tests in test and in torchvision/whatever/test?

That's why I'm saying we just need to add torchvision explicitly so that the CI doesn't omit any of the torchvision/whatever/test folders.

Or better then, edit testpaths to also include torchvision and we don't need to change anything else.

@pmeier
Copy link
Contributor Author

pmeier commented Sep 20, 2021

There are three ways forward here:

  1. Make test/ a package, which would solve both issues that prompted this PR. As @mthrok noted, this will prohibit us from using the test package, but since the documentation reads:

    The test package is meant for internal use by Python only. It is documented for the benefit of the core developers of Python. Any use of this package outside of Python’s standard library is discouraged as code mentioned here can change or be removed without notice between releases of Python.

    we shouldn't be using that anyway.

  2. pytest also supports a layout in which the tests exist alongside the source, e.g. torchvision/prototype/datasets/test/. That would solve the issue of duplicated names, but would prohibit us from using anything from common test utilities unless we also move them inside torchvision.

  3. We could also add a new folder test_prototype/ that would have the same effect and downsides as 2.

For all three variants we can have a separate prototype workflow with minimal effort.

@mthrok
Copy link
Contributor

mthrok commented Sep 20, 2021

For all three variants we can have a separate prototype workflow with minimal effort.

I think 1 is the minimal effort for the long term. It does not prevent the prototype test from using the common test utilities, it does not require any change to the CI, and it has an added benefit of running the tests for package installations.

I feel that 2. is going against the already established code organization, without justification why it has to be that way. And one has to remember to enforce it every time a new prototype is added, why the tests for prototype have to be placed alongside of code, unlike the rest of the code. Once you have a good justification of why it has to be that way, you probably need to document it somewhere.

@NicolasHug
Copy link
Member

Something that I missed earlier and that @pmeier clarified in our chat is that the main goal here is to separate our regular tests from the prototype tests, because the prototype tests are much more subject to fail than the rest, and we want to separate these signals in the CI.

With that in mind, I'm not sure I see how option 1. really solves this?

@pmeier
Copy link
Contributor Author

pmeier commented Sep 20, 2021

With that in mind, I'm not sure I see how option 1. really solves this?

If we go for a test/prototype/ folder, the only thing we would need to do, is to add another --ignore here and in all other run_test.sh scripts that point directly to test/. The new workflow would simply invoke pytest test/prototype.

@NicolasHug
Copy link
Member

Thanks @pmeier

I feel that 2. is going against the already established code organization, without justification why it has to be that way. And one has to remember to enforce it every time a new prototype is added, why the tests for prototype have to be placed alongside of code, unlike the rest of the code. Once you have a good justification of why it has to be that way, you probably need to document it somewhere.

I understand the constraint, but this seems to apply to all 3 proposed solutions? None of them completely follow the current code organization I would say.

My personal (decreasing) order of preference would be 3 -> 2 -> 1. Option 1. would require to create an additional folder, whether it's test/something/ or src/torchvision which seems more disruptive that the rest (in terms of git blame implications, changes to different workflows, fbcode, etc.)

@pmeier
Copy link
Contributor Author

pmeier commented Sep 20, 2021

Option 1. would require to create an additional folder

Every option will add an folder:

  1. test/prototype/
  2. torchvision/prototype/test/ (or nested deeper)
  3. test_prototype/

or src/torchvision

Note that this will not solve the issue we are trying to solve here, since it only changes how torchvision is imported, but not the test collection.

@NicolasHug
Copy link
Member

Every option will add an folder:

yes but option 1., i.e. making test a package, will require to either move torchvision to src/torchvision (#4436 (comment)), or add another directory in test (#4436 (comment)).

This is the extra directory that I'm taking about

@pmeier
Copy link
Contributor Author

pmeier commented Sep 20, 2021

I've had an offline discussion with @NicolasHug about his concerns. There was some confusion about how the pytest documentation was phrased. The takeaway is, that as long as we install torchvision in --editable / develop mode, option 1. can be adopted with the same effort as the other ones and especially without disrupting anyone's workflow.

We do that in CI and also recommend it in our contribution guide, so we should be fine. This PR already implements it and minus all the unrelated CI failures it works out.

Edit: After some more discussion we realized that I was in fact wrong and Nicolas' concerns are valid. To import the test/ package, pytest needs add the project root to sys.path, which includes our local version of torchvision. Since the project root is prepended to the path, the local version will always take precedence over any installed one, rendering testing against such a version futile.

Thus, I now agree with @NicolasHug's order of preference, i.e. 3. -> 2. -> 1. Sorry for the confusion.

@pmeier
Copy link
Contributor Author

pmeier commented Sep 21, 2021

After some more investigation, there might be another option. This whole issue revolves around the fact that pytest will add the project root to sys.path. With pytest>=6 we have access to --import-mode=importlib, which circumvents all of this, since sys.path is not touched. The documentation is not clear if we are able to have common test utilities, but my initial tests failed. I asked for clarification in opened pytest-dev/pytest#9109. Let's wait for an answer before we decide here.

@pmeier
Copy link
Contributor Author

pmeier commented Sep 21, 2021

Unfortunately, --import-mode=importlib implies that no local dependencies are allowed. There are some ideas to change that in pytest-dev/pytest#8964, but no consensus or PR yet. I think we could fairly easy write a small plugin that simply imports test/ before we start collection. Otherwise, I stand by my order of preference above.

Edit: I've hacked together https://github.com/pmeier/pytest-import as a PoC pytest plugin. With it installed, we can use --import-mode=importlib and have local dependencies. So this is probably the "cleanest" solution for the problem, but until pytest supports something like this natively, it means more work for us. I'm willing to do it, but it is probably not worth it.

@fmassa
Copy link
Member

fmassa commented Sep 23, 2021

I propose a different option to move forward here: keep the hierarchy flat, add test files like test_prototype_datasets.py, and use decorators to skip those tests if torchvision.prototype is not available.

One possible option is probably to use https://docs.pytest.org/en/stable/reference.html#pytest-importorskip-ref

Copy link
Member

@fmassa fmassa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my previous comment

@datumbox
Copy link
Contributor

Thanks everyone for the input on this. It's great to see that lots of thought was put on the solution. I wonder if we can make the necessary changes and unblock this PR?

I'm not going to pitch in on which direction we will follow as I think you got enough options. I just want to make sure that the prototype area has an agreed initial way to support tests because we plan to introduce more soon. If we all agree, we could adopt the quick an easy approach mentioned by Francisco at #4436 (comment) and then provide a better approach as we improve our test suite?

@pmeier
Copy link
Contributor Author

pmeier commented Oct 20, 2021

We opted to ignore prototype tests by default and only run them if specifically requested. That way we can keep them in the test/ folder without issues. See #4664 for details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants