Skip to content
This repository was archived by the owner on Aug 11, 2023. It is now read-only.

Possible fix for #31#32

Closed
astrofrog wants to merge 9 commits intoastropy:masterfrom
astrofrog:fix-egg-info
Closed

Possible fix for #31#32
astrofrog wants to merge 9 commits intoastropy:masterfrom
astrofrog:fix-egg-info

Conversation

@astrofrog
Copy link
Member

This is a possible fix for #31 (it also includes #30 for now). The basic problem is that we should be able to do:

python setup.py egg_info

in an affiliated package, and have it work if Astropy is not installed. With the current attached solution, things are still not quite right because in the egg-info directory, the SOURCES.txt file doesn't contain the full list of files, and the PKG-INFO file doesn't have quite the right version (the git string is not present). Either we decide to not care about this, or it seems we are going to have to include a copy of setup_helpers and version_helpers in affiliated packages (unless anyone can think of a better solution).

There's a third option, which is to not do quite so much for affiliated packages from the core package. For several simple packages I've been writing, I don't really need the sophisticated functions in Astropy to give me the list of sub-packages, data, etc. so we could trim down on all the 'magic' stuff and only really use Astropy to import the custom commands (test, build_sphinx, the custom build and install, etc.). But for package discovery, we leave that up to the developer, and they shouldn't use anything from Astropy core.

@astrofrog
Copy link
Member Author

This is now a little better, because filter_packages is simple enough to be defined inside setup.py. But the list of packages and data is still not complete because if astropy is not installed, get_package_info and/or update_package_files do. not get called.

I'm leaning towards a solution that involves simply not using the core package to do package discovery. Most affiliated packages are going to be a lot simpler than the core package, and it should be easy enough to define package and package_data by hand, right?

@embray
Copy link
Member

embray commented Oct 7, 2013

I'm glad you brought this up-- @wkerzendorf brought this to my attention last week, but I didn't have any time to say anything about it. I don't think we should bother implementing this pull request except maybe as a short term fix but really this isn't the right way to go about it--as you wrote in the PR it won't do the right things.

I actually didn't realize that the setup.py for affiliated packages tries to import astropy (or at least I had forgotten, or didn't really think about it). I think that originally the package template did include its own copies of setup_helpers.py and version_helpers.py but we changed it because it was getting to be a pain to synchronize between the two.

But I don't think affiliated packages should have to import astropy in order for setup.py to work in the first place. That only works for the astropy package itself since it can import from the source checkout. That doesn't make sense for packages that depend on astropy in any way.

I have two solutions in mind, each with plusses and minuses (and there might be a middle ground between them too).

In either case, I think the first step needs to be to move setup_helpers.py and version_helpers.py into their own sub-package, along with a utils module containing any utilities that they require, separate from the main astropy.utils sub-package. From this point there are a few options.

The first, and my personal preference especially from the view of affiliated packages (though also potentially beneficial for the main Astropy package) is to create a separate subproject, say, "astropy-distutils" that contains all the utilities used in setting up Astropy or affiliated packages using our framework. This could potentially include all the test stuff too. This would be a much smaller and simpler package than Astropy itself, and so can easily be used with the setup_requires feature of setuptools (where a package listed in setup_requires is downloaded and added to sys.path if it's not already installed).

I have some experience with this: STScI has a package I wrote called "stsci-distutils" that serves the same purpose: a common collection of build and installation tools shared by all our packages. This works rather well. A particular advantage of this is that in many cases, if a tweak is needed to the build infrastructure (say, for example, Apple changes their default C compiler again) support for that can be added simply by releasing a new "astropy-distutils", rather than having to release a whole new astropy to fix builds on the platform in question. Of course, this won't always be possible, but it would work in many cases.

Another possibility would be to do something involving sub-modules, but I haven't fully explored that option yet.

The middle ground might be to create a separate "astropy-distutils" distribution as already suggested. But we could also include a copy of it in the main Astropy repo as a submodule, and affiliated packages may do the same, if desired. That way when doing a git clone you would also be able to get the build tools easily without much additional effort (but "astropy-distutils" should still be included in setup_requires so that setup.py can at least grab the latest released version when it's not already available).

Anyways, regardless of where we go with this the main point has to be that there should not be an import astropy in an affiliated package's setup.py (unless wrapped in a try/except ImportError: raise ImportError("Astropy required to install this package yadda yadda") but that's still not very useful).

@astrofrog
Copy link
Member Author

@embray - thanks for your reply! I had independently reached a similar conclusion today about having a separate repository for the setup stuff. However, I'm not convinced by using setup_requires because it would be nice to still have astropy-distutils provide the package discovery tools, but remember that we need egg_info to work with vanilla python so that is not possible.

My preferred solution is to have it in a separate repo, and use submodules as you suggest. This actually makes it even easier for any affiliated packages to update than currently if there are any changes in the astropy-distutils stuff as they then just need to update the sub-module! (and the affiliated package template becomes a lot simpler). Also, stable releases will see essentially no difference than currently, which I like.

So just to be clear, +1 to a astropy-distutils (or maybe astropy-setuptools) in a separate repository, and including it via submodules in astropy core and affiliated packages.

Maybe @mdboom and @eteq can also chime in on this?

@embray
Copy link
Member

embray commented Oct 11, 2013

but remember that we need egg_info to work with vanilla python so that is not possible

I don't really know what you mean by that. If there were a separate astropy-distutils package, running any setup.py command ensures that dependencies listed in setup_requires are available.

In other words, I don't know what you mean by "vanilla python" or why that's relevant.

@embray
Copy link
Member

embray commented Oct 11, 2013

By the way, perhaps I should clarify--while the fact that setup_requires dependencies are passed into setup() (which is typically the last thing called in a setup.py) is one of those many setuptools misfeatures, it is at least very easy to work around that.

Near the top of setup.py just write:

setuptools.Distribution({'setup_requires': 'astropy-setuptools'}) (or whatever we call it) and then immediately after it's fine to import astropy_setuptools or whatever. (My personal preference is still to make astropy into a namespace package though).

@astrofrog
Copy link
Member Author

@embray - when you run pip install using a pip requirements file, it will run python setup.py egg_info for every package first before installing any package, even ones in setup_requires. At least, that's what I've found from experimentation, but I would love to be proved wrong there. After all, Python needs to be able to parse the setup_requires option, which means that all the code before that needs to run without importing any third-party packages!

Just saw your latest comment - I think I see what you mean now. I kind of link the idea of making an astropy-setuptools package, but it's good to know there is a workaround in the meantime (though I still need to confirm this works in the pip case mentioned above).

@wkerzendorf
Copy link
Member

hey guys, i'm against an astropy_setuptools (only the name ;-) ), mainly because I think this great work that has been done on making the setup stuff easy does not belong in the astronomy world alone. I think under the guise of software carpentry it would be great to have a science_setuptools or something like this. You're involved in software carpentry @embray - your thoughts?

@astrofrog
Copy link
Member Author

@embray - I don't understand your comment about making astropy a namespace package?

@astrofrog
Copy link
Member Author

(sorry for the slow comment edits - writing this from a bus using tethering on my phone ;)

@wkerzendorf
Copy link
Member

@astrofrog I put a pip requirements file into my radiative transfer code (github.com/wkerzendorf/tardis) and it didn't seem to help install astropy beforehands. As far as I did understand pip, you need to specify a requirements file on disk before installing. pip install tardis-sn --pre certainly didn't seem to pick up anything in that file. but maybe I did it wrong. I wonder if `setuptools.Distribution({'setup_requires': 'astropy-setuptools'})' does the trick.

@astrofrog
Copy link
Member Author

@wkerzendorf - the original issue I'm worries about here is different though - I'm saying that if you have:

numpy
Cython
astropy
aplpy

in a pip-requirements file and try installing with pip install -r pip-requirements, it will fail because egg_info won't run on aplpy because the setup.py requires astropy.

@wkerzendorf
Copy link
Member

@astrofrog I see, thx.

@eteq
Copy link
Member

eteq commented Oct 11, 2013

I think @embray is saying we make astropy into a namespace package so that we can provide an astropy.setuphelpers as the separate package - that is, it will still live in the astropy namespace even though it isn't installed as part of the core package. Seems fine to me, although I'm not entirely sure how that interacts with the extra stuff we have in astropy/__init__.py - can we have that if it's a namespace package?

On the general issue, I'm also +1 on the separate package solution, but with some caveats/concerns:

  • Because it's a separate package, we'll have to manage its releases, imposing an extra burden on the team and users (they have to understand that there are two different packages to deal with, potentially with different releases). One way to lower this burden is for the stup package to be released/versioned parallel to astropy. This complicates the release process for astropy, but is probably a lot easier than managing both separately?
  • Will we manage the issues separately from astropy itself? I forsee a lot of people not realizing it's a separate package and posting issues to astropy if we do that. So we might just tell people to use the astropy issues instead (perhaps even turn off the issue tracker)
  • Does namespace packaging have any issue on compatibility with py 2.6 that would require us to remove it as a supported version? I know there was a PEP about this, but I don't recall which version it applies to.

OTOH, as @wkerzendorf points out, this may be useful for people other than astropy and affiliated packages. But if so, all of my concerns above become more serious because we can't tie any of them directly to astropy. So in that case, someone will have to take charge of the package and ensure it continues forward. I'm not sure that's a good use of our resources right now.

@astrofrog
Copy link
Member Author

@eteq @embray - I would much prefer that this would simply be included in astropy and affiliated packages as a submodule - it basically answers all of @eteq's concerns about release cycle about namespace. There is no release cycle, we simply keep the astropy package up to date with the latest setup tools version, and affiliated packages can update the submodule when needed. Or am I missing an obvious issue with this solution?

@embray
Copy link
Member

embray commented Oct 11, 2013

First of all I'd like to note that I think that bringing up pip specifically in this discussion is a distraction. The more fundamental problem can be stated as such:

The astropy package contains a number of useful utilities and hacks on distutils for building and installing, and
distributing a package that depends on Numpy and Cython and that developed on git (as it includes utils for
including the git changeset in the version info). The problem is that, as all those utilities reside in the astropy
package, it has to be installed before they can be used in another package's setup.py. The setup_requires
feature allows a package to be made available when running a project's setup.py script. Unfortunately, the
astropy package itself being fairly sizeable and slow to build makes running any setup.py requiring astropy
very slow. Furthermore, the way the feature is implemented requires running the setup() function, which supply's
all the project's metadata, even though some of that metadata might require some of astropy's setup utilities to
be generated (such as the full version string). This catch-22 does at least have a workaround in the form of the
setuptools.Distribution({'setup_requires': [...]}) hack.

So the solution I'm proposing is to separate the setup features out into a smallish separate package that can easily be slurped in by adding it to setup_requires. This will work even for running setup.py egg_info, by the way.

I like what @wkerzendorf is saying about a more general setup-helpers package for more generic scientific projects, but at the moment what we currently have still contains a number of Astropy-specific bits, especially where our configuration system is concerned, and other bits related to affiliated packages. But I agree that it's possible some of this could be refactored and generalized, but I might do that as a separate project later. Much of that point also applies to the package-template in general. So I'm on board with the whole thing in principle, but I don't think it should be the "first step".

@eteq is correct about my motivation behind making it a namespace package, but is also correct that that probably can't work with all the astropy.__init__ stuff so we may have to drop that idea.

I'm a little concerned about the submodule idea, but my concerns may be due to lack of experience with submodules. Mainly, can we set up submodules so that when one does a git clone path/to/astropy.git it also automatically includes the submodule in the tree? Otherwise things like pip install git+http://path/to/astropy.git will fail spectacularly since the tools needed to run the setup.py will be absent from the default repository clone.

As for @eteq's concern about having to manage a separate release: I think it's a valid concern, but I don't think it's that big a deal. This would be a very simple package to make releases for, compared to Astropy itself. And to reiterate a point I made earlier, it has the advantage that we can address platform-specific build/installation issues simply by releasing a new astropy-setuptools rather than having to cut a whole new Astropy release. This concept is already proven in practice with stsci-distutils which we use as a setup requirement for all of our packages. It takes about 5 minutes for me to do a new stsci.distutils release, and it can be done whenever a new setup issue is discovered.

@astrofrog
Copy link
Member Author

@embray - I'm on board with what you are saying, and at the end of the day, if a separate package without submodules works, then I'm fine with that solution. When you say it would work with egg_info, we would need to add setup_requires to the top of the file in this case, right?

Just a couple of notes about submodules - when you git clone you then need to run:

git submodule init
git submodule update

However, other projects do sometimes use them (e.g. IPython) and those packages can be pip installed, so there are solutions.

However, using a separate package makes it even easier for affiliated packages because they don't need to worry about submodules. So at the end of the day I withdraw my suggestion and +1 a separate package.

How do we proceed? Shall we let the list know we plan to try this in case there are any objections? Or shall we go ahead and do this and open a pull request with the setup code taken out, and a new repo with the setup code, then ask for comments?

@astrofrog
Copy link
Member Author

I have moved the discussion to an issue in the core package:

astropy/astropy#1563

I will close this issue since the pull request won't be useful.

@astrofrog astrofrog closed this Oct 12, 2013
@embray
Copy link
Member

embray commented Oct 14, 2013

Just some follow-on comments in response to your last comment:

Just because IPython "works" with pip doesn't mean you can do a pip install directly from Git, which is the use case I'm talking about. Or maybe you can, but it may be that the submodules used by IPython are optional for a successful install. Whereas we're talking about a dependency that's required for the setup.py script to work at all.

I also wouldn't be against, if not slightly in favor of a hybrid approach, whereby we have a separate package for install utils, but we can also include the the master branch of that as a subpackage in Astropy, since I would always want to test against the latest dev version anyways.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants