[red-knot] Fix bug where module resolution would not be invalidated if an entire package was deleted#12378
Conversation
|
| FilePathRef::System(path) => resolver.db.system().path_exists(&path.join("__init__.pyi")), | ||
| FilePathRef::Vendored(path) => resolver.db.vendored().exists(path.join("__init__.pyi")), | ||
| FilePathRef::System(path) => system_path_to_file(resolver.db.upcast(),path.join("__init__.pyi")).is_some(), | ||
| FilePathRef::Vendored(path) => vendored_path_to_file(resolver.db.upcast(), path.join("__init__.pyi")).is_some(), |
There was a problem hiding this comment.
Nit: We can use vendored().exists here because we know that the system is read only.
There was a problem hiding this comment.
I wondered about this. In what situation should we use vendored_path_to_file()?
There was a problem hiding this comment.
In cases where we need a File, for example because we have to pass it to source_text
There was a problem hiding this comment.
Right -- but if we know it's immutable, maybe we should just be calling vendored.read-to_string() instead of source_text() to get the source text -- the same argument applies, no? (Not trying to be argumentative, just curious!)
There was a problem hiding this comment.
It depends. Not if it is a Python file, because calling source_text then has the benefit that we read the vendored python file exactly once instead of multiple times (file.read_to_string is not cached).
That's also not the point I made above. We need vendored_path_to_file when calling parsed_module because the function takes a File argument (because it doesn't care if it is a vendored or non vendored file). That's when you need a file.
There was a problem hiding this comment.
file.read_to_stringis not cached
Nor is vendored.exists() -- each time we call that, we'll be querying the zip archive. Is the logic that we think that this will be inexpensive enough that it's not worth caching?
There was a problem hiding this comment.
I'm not sure I understand. This seems to be unrelated to the original question.
it's not worth caching
Probably not. Caching comes at a high cost (memory, locking, lookup). We cache the resolved module, which should prevent from calling into the vendored system in the first place. But we have to wait for some real world usage to tell.
There was a problem hiding this comment.
This seems to be unrelated to the original question.
It seems related to me, but perhaps this illustates confusion on my part :-)
I'm trying to understand when and why (in future PRs) I should go via the Files APIs for reading information about files stored in the vendored zip archive, and when it makes sense to just get this information from the VendoredFileSystem. If I understand correctly, the tradeoffs are as follows:
- Going via the
FilesAPI has the advantage that the query is automatically invalidated when the file changes -- but the file will never change for vendored files, so, unlike with filesystem files, that's not a good reason to go via theFilesAPI. - Going via the
FilesAPI has the advantage that the call is automatically cached. This means that if we go via theFilesAPI, we'll end up querying the zip archive less frequently; but that could also cost more than it saves us, due to higher memory allocation/consumption, and because locking and lookups in the cache are also costly. - Going via the
FilesAPI gives you aFileobject that you can pass directly to queries such asruff_db::parsed::parsed_module. You wouldn't be able to call these queries if you just read the contents of the file as a string using theVendoredFileSystemdirectly.
Is that an accurate summary? I feel like my question has been answered now, anyway!
There was a problem hiding this comment.
I think that's a good summary. I now see how it is related.
Regarding vendored_path_to_file(...).exists. This is actually not cached for files that don't exist. We could, but we currently don't to avoid tracking Files for vendored paths that don't exist. So vendored().exists and vendored_path_to_file(path).exists have the same cost for files that don't exist.
9430f04 to
031effb
Compare
Summary
This fixes a bug where module resolution would not be invalidated if an entire package was deleted
Test Plan
I added a test that fails on
main, and passes with this PR.