Speed improvements for typing#439
Conversation
|
This now only touches Python 3 version. If Jukka's results will be promising, then I will also backport this to Python 2. |
|
I'm also seeing a few % speedup running mypy on my codebase. |
gvanrossum
left a comment
There was a problem hiding this comment.
This makes sense as a speedup as long as you can convince me the semantics are unchanged.
| """ | ||
| assert isinstance(a, GenericMeta) and isinstance(b, GenericMeta) | ||
| # Reduce each to its origin. | ||
| return _gorg(a) is _gorg(b) |
There was a problem hiding this comment.
So is _gorb(b) entirely redundant here? I see that you've inlined this as a._gorg is b everywhere.
There was a problem hiding this comment.
Probably this is just a coincidence, maybe it was different in older versions, but now b (and its ._gorg) is known "statically" in all places where _geqv is used.
src/typing.py
Outdated
| new_bases = [] | ||
| for base in bases: | ||
| if isinstance(base, GenericMeta): | ||
| bextra = getattr(base, '__extra__', None) |
There was a problem hiding this comment.
As long as we're micro-optimizing, this call is wasted when extra is falsey or origin is truthy (if I understand the condition below correctly).
There was a problem hiding this comment.
OK, I reshuffled code here, so that we quickly escape if origin is truthy and then if extra is falsey.
src/typing.py
Outdated
| if isinstance(base, GenericMeta): | ||
| bextra = getattr(base, '__extra__', None) | ||
| if (extra and bextra and not origin and bextra not in bases and | ||
| type(bextra) is abc.ABCMeta): |
There was a problem hiding this comment.
I'm not sure how this is equivalent to the original code (I would have expected something simpler, matching _gorg(b) if isinstance(b, GenericMeta) else b.
There was a problem hiding this comment.
The point is this is not totally equivalent. This makes more "aggressive" erasure for classes with __extra__, this shortens MROs for special typing types. Compare:
Before:
>>> pprint(List[int].__mro__)
(typing.List[int],
typing.List,
<class 'list'>,
typing.MutableSequence,
<class 'collections.abc.MutableSequence'>,
typing.Sequence,
<class 'collections.abc.Sequence'>,
typing.Reversible,
<class 'collections.abc.Reversible'>,
typing.Collection,
<class 'collections.abc.Collection'>,
<class 'collections.abc.Sized'>,
typing.Iterable,
<class 'collections.abc.Iterable'>,
typing.Container,
<class 'collections.abc.Container'>,
typing.Generic,
<class 'object'>)and after:
(typing.List[int],
typing.List,
<class 'list'>,
<class 'collections.abc.MutableSequence'>,
<class 'collections.abc.Sequence'>,
<class 'collections.abc.Reversible'>,
<class 'collections.abc.Collection'>,
<class 'collections.abc.Sized'>,
<class 'collections.abc.Iterable'>,
<class 'collections.abc.Container'>,
<class 'object'>)There was a problem hiding this comment.
I am actually a bit worried maybe this is too aggressive, since sometimes we even lose Generic, for example:
>>> pprint(MutableMapping.__mro__)
(typing.MutableMapping,
<class 'collections.abc.MutableMapping'>,
<class 'collections.abc.Mapping'>,
<class 'collections.abc.Collection'>,
<class 'collections.abc.Sized'>,
<class 'collections.abc.Iterable'>,
<class 'collections.abc.Container'>,
<class 'object'>)Although all tests pass, and I tried to play with this and it works as expected (it is still an instance of GenericMeta). Probably, I will add Generic at the end of bases in such cases (just for consistency).
|
PS. I'm not sure how we would test this speedup with real code. Did you ask for us to have this patched into the Python version we use to run mypy and then run mypy over our large codebase? That makes sense except the time there is pretty variable (since it runs so long that all sorts of other activities on the same machine affect the timing). |
|
I am marking this as WIP until I figure out what is going on with Python 2.
Yes, even if it is not super-stable it would be great to have few more data points. |
|
I decided to keep only obvious optimizations, and leave more aggressive things (that might change semantics) out for now. I think you should be happy with this now. I will probably make another PR soon dealing with (excessively) long MROs. |
python2/typing.py
Outdated
| namespace.update({'__origin__': origin, '__extra__': extra}) | ||
| self = super(GenericMeta, cls).__new__(cls, name, bases, namespace) | ||
| super(GenericMeta, self).__setattr__('_gorg', self if not origin | ||
| else origin._gorg) |
There was a problem hiding this comment.
Whitespace nit: I don't love how this line is broken up; either split before the comma or indent the else to align with the if.
| """ | ||
| assert isinstance(a, GenericMeta) and isinstance(b, GenericMeta) | ||
| # Reduce each to its origin. | ||
| return _gorg(a) is _gorg(b) |
| arg_attrs = arg.__dict__.copy() | ||
| for attr, val in arg.__dict__.items(): | ||
| if val in arg.__bases__: | ||
| if val in arg.__bases__ + (arg,): |
There was a problem hiding this comment.
Why is this added (arg,) needed? And why not in the PY2 version? (If there's a good reason please add a comment.)
There was a problem hiding this comment.
Yes, it is better to have them in both versions, @no_type_chek can be also used in Python 2, fixed this.
There was a problem hiding this comment.
Oh wait, it's needed because arg._gorg == arg. Fine.
There was a problem hiding this comment.
Oh wait, it's needed because
arg._gorg == arg
Exactly. Sorry, I have answered only half of your question.
| arg_attrs = arg.__dict__.copy() | ||
| for attr, val in arg.__dict__.items(): | ||
| if val in arg.__bases__: | ||
| if val in arg.__bases__ + (arg,): |
There was a problem hiding this comment.
Oh wait, it's needed because arg._gorg == arg. Fine.
|
I'm torn about whether to merge this. Maybe we should wait until 3.6.2 has actually been released? The RC is supposed to be cut today, I expect the final release in a few weeks (see PEP 494). |
This makes sense. We can keep the master "clean" (i.e. identical with current CPython state) for any emergency fixes (just in case). Then I can merge this right after the release of 3.6.2.final. |
|
OK, Python 3.6.2 was released earlier today, so that I am merging this now. |
Incorporates change made by python#439, which was recently merged.
This PR contains two performance improvements based on profiling and inspired by #432:
_gorgand_geqvThis gives roughly 5% speed-up for
mypy mypy. @JukkaL I will be grateful if you could do more benchmarking with some real-life large codebases.