-
Notifications
You must be signed in to change notification settings - Fork 1.5k
DOC: Document RECORD modification #1423
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
pfmoore
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that this is a subtle "gotcha" that probably warrants a warning, but we need to be careful not to make too much of this particular issue. After all, it's basically just saying "if you don't maintain the metadata the way you should, then things will break". Which is fairly self-evident when you put it like that.
I don't expect the people who are writing files into site-packages will even think they need to read this spec - so this probably isn't going to signficantly reduce how often this issue comes up in practice. It's more about raising general awareness, and should be written from that perspective.
|
It's definitely correct that if you dynamically change files in site-packages, then you must update RECORD (BTW, this applies to modifying existing files as well, since RECORD contains hashes). However, I think that this text could lead to the impression that dynamically changing installed packages is officially supported (as long as you do update RECORD). Is there consensus on that? It seems really fragile and janky, to me anyway. What if site-packages is read-only? What if a Python process unrelated to the package that is updating itself is running at the same time (maybe pip, maybe something that uses importlib APIs) and this creates a race condition? Also, what about environment reproducibility? |
|
Sorry, I started writing before @pfmoore posted and didn't see his comments. |
I think the opposite is true, the presumption is that only installers should be writing to package directories. It's not a rule (how would you even define it without having to pin down what counts as an installer?) but it's definitely baked into how people think about things. I'd be a strong -1 on anything that suggested modifying installed packages was supported. Do it if you have to, but when everything breaks, you have nobody to blame but yourself :-) And in particular, don't expect us to tell you what you need to do in order to do it safely. |
|
Here's my suggestion: diff --git a/source/specifications/recording-installed-packages.rst b/source/specifications/recording-installed-packages.rst
index 2610717d..005b6e5b 100644
--- a/source/specifications/recording-installed-packages.rst
+++ b/source/specifications/recording-installed-packages.rst
@@ -110,6 +110,16 @@ The RECORD file
The ``RECORD`` file holds the list of installed files.
It is a CSV file containing one record (line) per installed file.
+.. note::
+
+ It is *strongly discouraged* for an installed package to modify itself
+ (e.g., store cache files under its namespace in ``site-packages``).
+ Changes inside ``site-packages`` should be left to specialized installer
+ tools such as pip. If a package is nevertheless modified in this way,
+ then the ``RECORD`` must be updated, otherwise uninstalling the package
+ will leave unlisted files in place (possibly resulting in a zombie
+ namespace package).
+
The CSV dialect must be readable with the default ``reader`` of Python's
``csv`` module:
|
|
👍 works for me! @jeanas do you want to open a new PR, comment inline as a |
|
Just copy paste, it doesn't matter to me. |
Distilled based on the conversation in pypa/pip#11835 (comment) and related comments (cc @pfmoore). Happy to modify language or whatever, or close if this recommendation is incorrect!
📚 Documentation preview 📚: https://python-packaging-user-guide--1423.org.readthedocs.build/en/1423/