ENH: add support for setting astropy's cache and config directories via environment variables (blocked by #17934)#17640
Conversation
|
Thank you for your contribution to Astropy! 🌌 This checklist is meant to remind the package maintainers who will review this pull request of some common things to look for.
|
|
👋 Thank you for your draft pull request! Do you know that you can use |
2c5461e to
d0a3b7b
Compare
pllim
left a comment
There was a problem hiding this comment.
I want to focus on the big picture first before diving into implementation and tests.
- Should we allow mix and match, e.g.,
ASTROPY_CONFIG_DIRandXDG_CACHE_HOME? - What if a long time user now starts to use
ASTROPYprefix but has a lot of stuff inXDGalready; should we have helper function to help them move the config and cache over? We might already have export/import for cache, but not sure about config. - Is it simpler going forward if we have a plan to eventually exclusively use the
ASTROPYprefix? - Are there special use cases (e.g., HPC, cloud, people sending their config to collaborators manually) where this change will come back and bite us?
astropy/config/tests/test_configs.py
Outdated
| pytest.param("XDG_CONFIG_HOME", paths.get_config_dir_path, id="config"), | ||
| pytest.param( | ||
| "XDG_CACHE_HOME", | ||
| paths.set_temp_cache, | ||
| paths.get_cache_dir_path, | ||
| id="xdg-cache", | ||
| ), |
There was a problem hiding this comment.
Would it be better to parametrize twice: once over "ASTROPY" and "XDG" and once over "cache" and "config"?
There was a problem hiding this comment.
Not sure. I just pushed a second commit to that effect. Let me know if you like it better.
Obviously yes.
If a user wants to customize the Independently from the above, temporarily setting the Astropy environment variable allows running
Handling the Astropy and XDG environment variables has a lot of common code and they can share tests too, if the tests are parametrized correctly. The code functionality over code complexity ratio very favorable for supporting both sets of variables.
In its current state the changes could cause problems to anyone who has set the Astropy environment variables and then runs our tests, like I already described in the other pull request, but addressing that is not complicated (it is only somewhat tedious because of code repetition). Once that has been taken care of I don't foresee any real problems. |
|
I agree with @eerovaher's answers.
sorry, I missed that. I'll address this shortly ! |
|
@eerovaher I should have addressed all your points so far |
603eff0 to
7878773
Compare
eerovaher
left a comment
There was a problem hiding this comment.
docs/conftest.py still needs to be updated.
e6145c0 to
4147533
Compare
|
It looks like we are getting somewhere with this, and the testing approach was hugely improved thanks to @eerovaher's suggestions already, so let's undraft this now ! |
eerovaher
left a comment
There was a problem hiding this comment.
docs/conftest.py still needs to be updated and config documentation should be updated too.
astropy/config/tests/test_configs.py
Outdated
| @pytest.fixture | ||
| def isolated_setup(monkeypatch): | ||
| # ignore global state of the test session | ||
| monkeypatch.delenv("ASTROPY_CACHE_DIR", raising=False) | ||
| monkeypatch.delenv("ASTROPY_CONFIG_DIR", raising=False) | ||
| monkeypatch.delenv("XDG_CACHE_HOME", raising=False) | ||
| monkeypatch.delenv("XDG_CONFIG_HOME", raising=False) | ||
|
|
||
| monkeypatch.setattr(paths.set_temp_cache, "_temp_path", None) | ||
| monkeypatch.setattr(paths.set_temp_config, "_temp_path", None) |
There was a problem hiding this comment.
Is this thread safe? Don't you need to use locks explicitly?
There was a problem hiding this comment.
It's not. The current implementation is just meant for code sharing. I'll add locking now but thread safety will only be guaranteed in between users of this fixture.
There was a problem hiding this comment.
...thread safety will only be guaranteed in between users of this fixture.
That's why it would be better to make the code inherently type safe, but if that's too complicated then the fixture will have to do.
| _directory_type: Literal["cache", "config"] | ||
| _directory_env_var: Literal["XDG_CACHE_HOME", "XDG_CONFIG_HOME"] | ||
| _xdg_env_dir: Literal["XDG_CACHE_HOME", "XDG_CONFIG_HOME"] | ||
| _astropy_env_dir: Literal["ASTROPY_CACHE_DIR", "ASTROPY_CONFIG_DIR"] |
There was a problem hiding this comment.
Would it be better to name the variables ASTROPY_CACHE_HOME and ASTROPY_CONFIG_HOME so that they would be more similar to the corresponding XDG variable names? Or are the *_CACHE_DIR and *_CONFIG_DIR used commonly enough?
There was a problem hiding this comment.
I can't think of any widely used example of any. I know I'm biased to view *_DIR as more "natural" because I've seen it used in HPC culture where I spent my years as a researcher. No strong opinions on this point, I guess.
There was a problem hiding this comment.
The ASTROPY_*_DIR names are not bad, but we should think about the names very carefully because whatever we choose will be very difficult to change.
There was a problem hiding this comment.
I agree it's worth taking the time to make sure we're not ignoring an existing convention.
I'll do some more research now.
There was a problem hiding this comment.
Here's a list of resources in arbitrary places I happened to think of
uvhasUV_CACHE_DIRpoetryhasPOETRY_CACHE_DIRandPOETRY_CONFIG_DIRpdmseems to only supportXDG_*_HOMEvariablesmatplotlibhasMPLCONFIGDIRCPythonuses the same naming convention as matplotlib: all upper case, no underscore- I found nothing relevant from
pip,rye,IPython,numpyorscipy
so I found at least two popular packaging tools (uv and poetry) happen to use the naming convention I spontaneously used (i.e. <TOOLNAME>_CONFIG_DIR). The only directly comparable case I found (MPLCONFIGDIR) doesn't fit our overarching convention for naming env variables ASTROPY_* (using underscores between words), so I think we can safely disregard it.
There was a problem hiding this comment.
+1 for using ASTROPY_{CONFIG/CACHE}_DIR, with the caveat that it should directly point to the directory to be used. The system shouldn't look for another astropy directory in there.
| "environment_variable,func", | ||
| "env_var_template", | ||
| [ | ||
| # Regression test for #17514 - XDG_CACHE_HOME had no effect |
There was a problem hiding this comment.
Looking this again it's clear that the comment isn't very helpful anymore because the parametrization of this test has changed too much. So the information in this comment is still valuable, but the exact phrasing is not.
There was a problem hiding this comment.
@eerovaher , so you want a new comment added with different phrasing, still?
astropy/config/tests/test_configs.py
Outdated
| xdg_target_dir.mkdir(parents=True) | ||
| monkeypatch.setenv(xdg_env_var, str(tmp_path / "xdg")) | ||
|
|
||
| monkeypatch.setattr(cls, "_temp_path", None) |
There was a problem hiding this comment.
Why is this needed if you are using the isolated_setup fixture?
There was a problem hiding this comment.
That's just accidental left overs. Thanks !
astropy/config/tests/test_configs.py
Outdated
| env_var_template, | ||
| dir_type, | ||
| ): | ||
| match dir_type: |
There was a problem hiding this comment.
This match-statement shouldn't be needed.
There was a problem hiding this comment.
This was my solution to your previous suggestion #17640 (comment)
If none of my approaches are satisfactory, could you suggest another one ?
There was a problem hiding this comment.
Include cls and func in the parametrization like you do in the other tests.
4147533 to
33b0854
Compare
eerovaher
left a comment
There was a problem hiding this comment.
It would be best if more people would have a look at the proposed environment variable names.
astropy/config/tests/test_configs.py
Outdated
|
|
||
|
|
||
| @pytest.fixture | ||
| def isolated_setup(monkeypatch): |
There was a problem hiding this comment.
The fixture itself looks fine, but this name seems too generic.
| "environment_variable,func", | ||
| "env_var_template", | ||
| [ | ||
| # Regression test for #17514 - XDG_CACHE_HOME had no effect |
There was a problem hiding this comment.
Looking this again it's clear that the comment isn't very helpful anymore because the parametrization of this test has changed too much. So the information in this comment is still valuable, but the exact phrasing is not.
astropy/config/tests/test_configs.py
Outdated
| env_var_template, | ||
| dir_type, | ||
| ): | ||
| match dir_type: |
There was a problem hiding this comment.
Include cls and func in the parametrization like you do in the other tests.
docs/config/index.rst
Outdated
| ``$HOME/.astropy/config``. It can be customized with the environment variable | ||
| ``XDG_CONFIG_HOME`` in which case the ``$XDG_CONFIG_HOME/astropy`` directory | ||
| must exist. Note that ``XDG_CONFIG_HOME`` comes from a Linux-centric | ||
| ``$HOME/.astropy/config``. It can be customized with the environment variables |
There was a problem hiding this comment.
The default should be $HOME/.config/astropy, but that's far beyond the scope of this pull request.
docs/config/index.rst
Outdated
| .. note:: | ||
| ``ASTROPY_CONFIG_DIR`` and ``ASTROPY_CACHE_DIR`` require ``astropy`` 7.1 | ||
| or newer. They are ignored in older versions. |
There was a problem hiding this comment.
Our documentation is versioned together with the code, so this should not be needed at all.
| .. note:: | |
| ``ASTROPY_CONFIG_DIR`` and ``ASTROPY_CACHE_DIR`` require ``astropy`` 7.1 | |
| or newer. They are ignored in older versions. |
There was a problem hiding this comment.
We have other places in configuration docs that mention when some behaviors started being supported. I think it's helpful for anyone trying to use this, not realizing that they are using a version that's too old.
There was a problem hiding this comment.
Usually this information should be in the API documentation, not in the narrative documentation. But we don't have a good way of documenting environment variables in our API documentation, so maybe we should keep this here.
docs/config/index.rst
Outdated
| .. note:: | ||
| ``astropy`` also respects ``XDG_CACHE_HOME`` and ``ASTROPY_CACHE_DIR`` for | ||
| defining cache location. If both are defined, the latter takes precedence. |
There was a problem hiding this comment.
Cache has its own documentation starting at:
Line 25 in 7ef2800
It appears to be incomplete, but now would be a good time to improve it.
There was a problem hiding this comment.
thanks, I missed that !
There was a problem hiding this comment.
Actually, looking more closely, it may not be an oversight, but an intentional decision to avoid repetitions; the very next line to the one you quoted links to the page I'm updating here "for more details", so I think it's reasonable to keep my change as is.
There was a problem hiding this comment.
Why would it be reasonable to document the environment variables that control the cache directory in config documentation instead of the caching documentation?
There was a problem hiding this comment.
To me, the cost of having docs that repeats the underlying logic feels greater than the mild inconvenience for users to have to navigate to another page, in particular because it creates a risk of these sections going out of sync.
In any case this came from #9182, which is a pretty heavy PR, and where this point didn't seem to be discussed at all. Maybe we should ask for a third opinion here ? @pllim, what do you think ?
There was a problem hiding this comment.
On a high level, there is a philosophy that "every page could be first page" because of search engine. So having multiple sections link back to the text (instead of repeating word by word) could help.
There was a problem hiding this comment.
Including important information about how our cache works in our config documentation still doesn't sound good to me. Documenting config in cache documentation doesn't sound any better. One option would be to have a separate page for documenting environment variables and both config and cache documentation could link to that.
There was a problem hiding this comment.
Sorry for falling behind. I'm coming back to this PR and I just pushed a patch where I centralized the information on environment variables to a dedicated page. I'll undraft the PR back when I'm 100% sure that it renders at intented.
33b0854 to
13ae62e
Compare
|
@ejeschke , would this satisfy your original request to be able to custom cache/config directories for Astropy for Ginga's use? Thanks, all! |
Thanks, @neutrinoceros, @pllim ! Yes, this looks very useful for several applications where we utilize |
|
@ejeschke , are you interested to test out this branch before merge? |
I will do so, but not necessary to hold up the merge as long as the developers have tested it thoroughly. It's a rather straightforward change. |
6e1224d to
3f49c01
Compare
|
Actually, I failed to remember that I suppose this warrants a changelog fragment on its own, but I'd like to get feedback from @pllim on this approach first. |
e4d3ff5 to
3d1412f
Compare
|
Thanks! I will get back to you soon but probably not for a few days. |
|
It looks to me like this pull request could be split into three separate pull requests:
|
|
I opened
|
| XDG_CONFIG_HOME environment variable is set and the | ||
| ``$XDG_CONFIG_HOME/astropy`` directory exists, it will be that directory. | ||
| ASTROPY_CONFIG_DIR (or XDG_CONFIG_HOME) environment variable is set and the | ||
| ``$ASTROPY_CONFIG_DIR/astropy`` directory exists, it will be that directory. |
There was a problem hiding this comment.
Please let ASTROPY_CONFIG_DIR directly point to the directory instead of expecting another level of astropy in there.
That makes sense for the XDG_... scheme, it doesn't for a specific environment variable for astropy. Let it point to the directory directly.
There was a problem hiding this comment.
That sounds like a good idea and the naming difference between *_DIR and *_HOME should make it reasonably clear that one set of variables points to the directories themselves and the other to the parent directories. However, it would be best to try behave similarly to other packages that define analogous environment variables.
There was a problem hiding this comment.
That's the case for all systems I know. Just checking the specific examples posted above:
- https://python-poetry.org/docs/configuration#cache-directory ✔️
- https://docs.astral.sh/uv/configuration/environment/#uv_cache_dir same ✔️
- https://matplotlib.org/stable/install/environment_variables_faq.html#envvar-MPLCONFIGDIR check
The logic should be:
astropy looks at the following locations for it's configuration (highest precedence first):
$ASTROPY_CONFIG_DIR$XDG_CONFIG_HOME/astropy$HOME/.astropy/config
There was a problem hiding this comment.
I agree that astropy should look for the configuration in $ASTROPY_CONFIG_DIR instead of $ASTROPY_CONFIG_DIR/astropy.
I disagree that if neither $ASTROPY_CONFIG_DIR nor $XDG_CONFIG_HOME/astropy are set then astropy should use $HOME/.astropy/config. It should instead use $HOME/.config/astropy (see #6803). But I do acknowledge that this is beyond the scope of this pull request.
There was a problem hiding this comment.
It should instead use $HOME/.config/astropy
Agreed! Maybe add that in-between to keep backwards compatibility?
There was a problem hiding this comment.
$ASTROPY_CONFIG_DIR
$XDG_CONFIG_HOME/astropy
$HOME/.config/astropy
$HOME/.astropy/config
|
Is there something we can do to help advance this PR? |
|
I think @neutrinoceros needs to resolve conflicts, for a start. If he no longer has time or interest to pursue, I could possibly take over with reduced scope. |
|
I'll pick it up as soon as possible. Unlikely to be able to this week, but I'll do my best to get back to it next week. Thanks for the ping and for your patience. |
|
I also don't think #17934 necessarily block this. We can still keep the old behavior for the XDG stuff in this PR. |
|
Hi humans 👋 - this pull request hasn't had any new commits for approximately 4 months. I plan to close this in 30 days if the pull request doesn't have any new commits by then. In lieu of a stalled pull request, please consider closing this and open an issue instead if a reminder is needed to revisit in the future. Maintainers may also choose to add keep-open label to keep this PR open but it is discouraged unless absolutely necessary. If this PR still needs to be reviewed, as an author, you can rebase it to reset the clock. If you believe I commented on this pull request incorrectly, please report this here. |
|
In lights of recent comments on #17934, I would still like it to go forward before I rebase and undraft this one, otherwise I'll be facing avoidable merge conflicts. Will keep this open in the mean time. |
3d1412f to
f9bb03a
Compare
Description
The approach is different enough from #17507 that it warranted a separate PR. Hopefully this one satisfies the needs highlighted in @eerovaher's and @pllim's early review there.
I'll add documentation once everyone is happy with the approach !
Blocked by