My conclusion from #8146 is that astropy.test() (which tests the installed code) does not run exactly the same set of tests, with the same settings, as python setup.py test (which tests the full source tree). Do we care? If so, what can we do about it? Any fix would probably involve refactoring the test runner.
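
To make the difference concrete, one option is to diff the test IDs that each invocation collects. The following is only a rough sketch: it assumes the collected IDs from both runs have already been captured as text in pytest's `--collect-only -q` style (one `path::test_name` per line), and the helper names `parse_test_ids` and `compare_test_sets` are hypothetical, not part of astropy or pytest. How that text is captured from each runner is left open here.

```python
def parse_test_ids(collect_output, package="astropy"):
    """Return the set of test node IDs found in collect-only style output.

    Paths are trimmed so they start at the last ``<package>/`` component,
    which lets IDs from an installed copy (under site-packages) be compared
    with IDs from a source checkout. This normalisation is an assumption of
    the sketch, not something pytest does.
    """
    ids = set()
    marker = package + "/"
    for line in collect_output.splitlines():
        line = line.strip()
        # Node IDs look like "path/to/test_file.py::test_name";
        # anything without "::" (e.g. the summary line) is skipped.
        if "::" not in line:
            continue
        path = line.split("::", 1)[0]
        idx = path.rfind(marker)
        if idx >= 0:
            line = line[idx:]
        ids.add(line)
    return ids


def compare_test_sets(installed_output, source_output, package="astropy"):
    """Report which test IDs appear in only one of the two runs."""
    installed = parse_test_ids(installed_output, package)
    source = parse_test_ids(source_output, package)
    return {
        "only_installed": sorted(installed - source),
        "only_source": sorted(source - installed),
    }
```

This only compares which tests are collected. It says nothing about differences in settings (warnings filters, plugins, configuration files picked up), which would need a separate check.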