@henryiii
Contributor

@henryiii henryiii commented Nov 27, 2025

Another slow line in the profile. As far as I can see, this is the only place in the codebase that uses a plain generator expression like this on .split("."); all the others use map. Using map saves about 8% when creating Version objects. Using tuple([...]) also saves around 8%, but I think map is nicer than a list comprehension inside a function call (and it's consistent with the other cases, like in tags.py).

Edit: also found one more place, inside the __str__ method, which is used in SpecifierSet, for example.
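
For anyone who wants to reproduce the comparison, here is a minimal sketch of the three spellings discussed above. The sample string, function names, and iteration count are assumptions for illustration, not code from the PR, and the relative timings will vary by interpreter and version.

```python
import timeit

# Assumed sample input, standing in for match.group("release").
RELEASE = "1.2.3"


def with_generator(s):
    # Generator expression fed directly to tuple().
    return tuple(int(i) for i in s.split("."))


def with_map(s):
    # map() applies int to each piece without creating a Python-level generator frame.
    return tuple(map(int, s.split(".")))


def with_list_comp(s):
    # List comprehension materialized first, then converted to a tuple.
    return tuple([int(i) for i in s.split(".")])


def compare(number=200_000):
    # Print the wall-clock time for each spelling; absolute numbers are machine-specific.
    for fn in (with_generator, with_map, with_list_comp):
        elapsed = timeit.timeit(lambda: fn(RELEASE), number=number)
        print(f"{fn.__name__}: {elapsed:.3f}s")


if __name__ == "__main__":
    compare()
```

All three return the same tuple, so the choice is purely about speed and readability.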

@henryiii
Contributor Author

I also tried this variation:

diff --git a/src/packaging/version.py b/src/packaging/version.py
index ecaf6c6..db4e802 100644
--- a/src/packaging/version.py
+++ b/src/packaging/version.py
@@ -255,25 +255,11 @@ class Version(_BaseVersion):
         >>> str(Version("1.0a5"))
         '1.0a5'
         """
-        parts = [self.base_version]
-
-        # Pre-release
-        if self.pre is not None:
-            parts.append("".join(map(str, self.pre)))
-
-        # Post-release
-        if self.post is not None:
-            parts.append(f".post{self.post}")
-
-        # Development release
-        if self.dev is not None:
-            parts.append(f".dev{self.dev}")
-
-        # Local version segment
-        if self.local is not None:
-            parts.append(f"+{self.local}")
-
-        return "".join(parts)
+        pre = "" if self.pre is None else "".join(map(str, self.pre))
+        post = "" if self.post is None else f".post{self.post}"
+        dev = "" if self.dev is None else f".dev{self.dev}"
+        local = "" if self.local is None else f"+{self.local}"
+        return f"{self.base_version}{pre}{post}{dev}{local}"

     @property
     def epoch(self) -> int:

I think this reads a little better, but it's not measurably faster (timed with str(Version(version))); in fact, it might be around 1% slower.
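
A rough way to compare the two __str__ bodies in isolation is sketched below. It uses a hypothetical stand-in object built with SimpleNamespace rather than the real Version class, so it measures only the string-building logic and not attribute access on the actual class; the function names and iteration count are assumptions.

```python
import timeit
from types import SimpleNamespace

# Hypothetical stand-in for a parsed version; NOT the packaging Version class.
v = SimpleNamespace(base_version="1.0", pre=("a", 5), post=None, dev=None, local=None)


def str_with_parts(v):
    # Original approach: collect the pieces in a list, then join them.
    parts = [v.base_version]
    if v.pre is not None:
        parts.append("".join(map(str, v.pre)))
    if v.post is not None:
        parts.append(f".post{v.post}")
    if v.dev is not None:
        parts.append(f".dev{v.dev}")
    if v.local is not None:
        parts.append(f"+{v.local}")
    return "".join(parts)


def str_with_fstring(v):
    # Variation from the diff: conditional expressions and a single f-string.
    pre = "" if v.pre is None else "".join(map(str, v.pre))
    post = "" if v.post is None else f".post{v.post}"
    dev = "" if v.dev is None else f".dev{v.dev}"
    local = "" if v.local is None else f"+{v.local}"
    return f"{v.base_version}{pre}{post}{dev}{local}"


if __name__ == "__main__":
    for fn in (str_with_parts, str_with_fstring):
        elapsed = timeit.timeit(lambda: fn(v), number=200_000)
        print(f"{fn.__name__}: {elapsed:.3f}s")
```

Both functions produce identical strings for the same input, so any difference is in speed and readability only.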

@notatallshaw
Member

notatallshaw commented Nov 27, 2025

I've also found places where using map instead of a generator provides a small speedup, but I was hesitant to propose a PR because it wasn't clear to me whether this would hold in general or just happened to be true for the particular implementation of Python I was testing on.

Do you know, for example, if pypy is faster for maps or generators?

I'm not against this, especially when the map calls are so simple, I just want to make sure we're not chasing down implementation specific details that could change in the future.

@henryiii
Contributor Author

I've (mostly) been testing on the latest version of Python, which should have as many optimizations as possible for generators. List comprehensions are also fast, but map feels better than tuple([int(i) for i in match.group("release").split(".")]), which very much looks like an implementation optimization.

Using map takes 8% less time on PyPy as well (PyPy is about 4x faster overall). Interestingly, the difference is smaller on older versions of Python; on 3.8, using map saves only 4%.

Signed-off-by: Henry Schreiner <henryfs@princeton.edu>
@henryiii henryiii merged commit f3440e9 into pypa:main Nov 27, 2025
40 checks passed
@henryiii henryiii deleted the henryiii/perf/map branch November 27, 2025 16:02