Skip to content

Conversation

@notatallshaw
Copy link
Member

@notatallshaw notatallshaw commented Nov 25, 2025

Inspired by #987 I investigated if calling _cmpkey to calculate _key on every Version initialization made sense.

At least for pip, _key is only accessed on less than 30% of Version objects ever created, with longer resolutions going down to below 20%, so precalculating _key with _cmpkey is doing work that's often not needed.

_version could also be made lazy, but at least for pip _version is accessed on 100% of all Version objects.

@henryiii
Copy link
Contributor

henryiii commented Nov 25, 2025

There's also functools.cachedproperty, I think this implementation might be faster though, at least before 3.12 removed a mutex. Might be worth profiling if you want to. Here it is if you want to try it:

diff --git a/src/packaging/version.py b/src/packaging/version.py
index 5261cd5..319ce66 100644
--- a/src/packaging/version.py
+++ b/src/packaging/version.py
@@ -9,6 +9,7 @@

 from __future__ import annotations

+import functools
 import re
 from typing import Any, Callable, NamedTuple, SupportsInt, Tuple, Union

@@ -185,7 +186,6 @@ class Version(_BaseVersion):

     _regex = re.compile(r"^\s*" + VERSION_PATTERN + r"\s*$", re.VERBOSE | re.IGNORECASE)
     _version: _Version
-    __key: CmpKey | None

     def __init__(self, version: str) -> None:
         """Initialize a Version object.
@@ -216,18 +216,16 @@ class Version(_BaseVersion):
         # Key which will be used for sorting
         self.__key = None

-    @property
+    @functools.cached_property
     def _key(self) -> CmpKey:
-        if self.__key is None:
-            self.__key = _cmpkey(
-                self._version.epoch,
-                self._version.release,
-                self._version.pre,
-                self._version.post,
-                self._version.dev,
-                self._version.local,
-            )
-        return self.__key
+        return _cmpkey(
+            self._version.epoch,
+            self._version.release,
+            self._version.pre,
+            self._version.post,
+            self._version.dev,
+            self._version.local,
+        )

     def __repr__(self) -> str:
         """A representation of the Version that shows all internal state.

@henryiii henryiii changed the title Lazily calculate _key in Version perf: lazily calculate _key in Version Nov 25, 2025
@notatallshaw
Copy link
Member Author

I took a look at cached property and there are too many nuances, dependencies, and behavior changes for me to be comfortable using it: https://docs.python.org/3/library/functools.html#functools.cached_property

@henryiii
Copy link
Contributor

Saves about 15% (total time) if _key is never accessed.

@henryiii henryiii merged commit 39b0432 into pypa:main Nov 26, 2025
40 checks passed
@henryiii
Copy link
Contributor

After this and all the recent optimizations (regex, etc), loading every version from PyPI takes 82.4 seconds. On 25.0, it takes 116.1 seconds. (Python 3.14)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants