bpo-28638: speed up namedtuple creation by avoiding exec#2736
bpo-28638: speed up namedtuple creation by avoiding exec#2736JelleZijlstra wants to merge 10 commits intopython:masterfrom
Conversation
Creating a namedtuple is relatively slow because it uses exec(). This commit reduces the exec()'ed code, but still uses exec() for creating the __new__ method. I don't know of a way to avoid using exec() for __new__ beyond manipulating bytecode directly. However, avoiding exec() for creating the class itself still yields a significant speedup. In an unscientific benchmark I ran, creating 1000 namedtuple classes now takes about 0.14 s instead of 0.44 s. There is one backward compatibility break: namedtuples no longer have a _source attribute, because we no longer exec() their source. I kept the verbose=True argument around for compatibility, but it now does nothing.
|
@JelleZijlstra, thanks for your PR! By analyzing the history of the files in this pull request, we identified @rhettinger, @serhiy-storchaka and @vsajip to be potential reviewers. |
|
But why can't we keep a |
Lib/collections/__init__.py
Outdated
| raise TypeError('Expected %d arguments, got %d' % (num_fields, len(result))) | ||
| return result | ||
|
|
||
| _make.__func__.__doc__ = 'Make a new {typename} object from a sequence or iterable'.format(typename=typename) |
There was a problem hiding this comment.
I think this line should be wrapped (as you did below for _replace).
Lib/collections/__init__.py
Outdated
| @classmethod | ||
| def _make(cls, iterable, new=tuple.__new__, len=len): | ||
| result = new(cls, iterable) | ||
| if len(result) != num_fields: |
There was a problem hiding this comment.
This will keep the closure alive, maybe make num_fileds a (private) attribute of the class and define this function outside of namedtuple? (TBH however, I have no idea how efficient this will be, but intuitively this should be more efficient than with nested function at least in terms of memory).
There was a problem hiding this comment.
Probably @serhiy-storchaka could say whether this makes sense.
There was a problem hiding this comment.
Going to add cls._num_fields to avoid adding another function call here.
Lib/collections/__init__.py
Outdated
| _make.__func__.__doc__ = 'Make a new {typename} object from a sequence or iterable'.format(typename=typename) | ||
|
|
||
| def _replace(_self, **kwds): | ||
| result = _self._make(map(kwds.pop, field_names, _self)) |
There was a problem hiding this comment.
Same situation with closure here and in __repr__, maybe use _self._fields instead of field_names.
|
This looks pretty good so far. I would like to layer in lazy generation of "_source" and to keep the "verbose" option functional. I think is possible to do this all in a way that it transparent to the user, that keeps the current API fully intact and only modestly increases the complexity of the code |
|
Thanks! I'm going to first address Ivan's comments related to closures, then work on bringing back |
ilevkivskyi
left a comment
There was a problem hiding this comment.
This generally looks good now. Maybe Raymond will have more comments. Also it would be good to see detailed benchmarks (space and time) for this particular implementation vs the original one.
| exec(class_definition, namespace) | ||
| result = namespace[typename] | ||
| result._source = class_definition | ||
| arg_list = repr(tuple(field_names)).replace("'", "")[1:-1] |
There was a problem hiding this comment.
I feel ', '.join(field_names) is pythonic code.
Is this line different from it?
There was a problem hiding this comment.
The only difference is in the case of 1-element tuple.
There was a problem hiding this comment.
Why is the trailing comma desirable in the 1-element tuple case?
There was a problem hiding this comment.
Otherwise it's no longer a tuple.
|
Quick and dirty bench on pypy3.5-5.8 original: patched: |
|
Same bench on Python 3.6.2 on macOS original: patched: |
|
And memory usage on CPython 3.6.2 original: patched: |
Doc/library/collections.rst
Outdated
| .. versionadded:: 3.3 | ||
|
|
||
| .. versionchanged:: 3.7 | ||
| ``_source`` is no longer used to created the named tuple class. |
There was a problem hiding this comment.
I find this sentence a bit confusing to understand. Something like "_source is no longer the actual named tuple class implementation, but an equivalent implementation" would be clearer to me.
There was a problem hiding this comment.
Changing this to ``_source`` is no longer used to create the named tuple class implementation, but contains an equivalent implementation.
| exec(class_definition, namespace) | ||
| result = namespace[typename] | ||
| result._source = class_definition | ||
| arg_list = repr(tuple(field_names)).replace("'", "")[1:-1] |
There was a problem hiding this comment.
The only difference is in the case of 1-element tuple.
| @classmethod | ||
| def _make(cls, iterable, new=tuple.__new__, len=len): | ||
| result = new(cls, iterable) | ||
| if len(result) != cls._num_fields: |
There was a problem hiding this comment.
Why not use just len(cls._fields)?
There was a problem hiding this comment.
I didn't want to introduce another function call. That might be premature optimization though.
There was a problem hiding this comment.
You may be able to add num_fields=num_fields argument like len=len.
Lib/collections/__init__.py
Outdated
| _repr_template = '{name}=%r' | ||
| _new_template = ''' | ||
| def __new__(_cls, {arg_list}): | ||
| 'Create new instance of {typename}({arg_list})' |
There was a problem hiding this comment.
It may be faster to set the __doc__ attribute explicitly rather than compile it.
| namespace = {'_tuple': tuple} | ||
| new_source = _new_template.format(typename=typename, arg_list=arg_list) | ||
| exec(new_source, namespace) | ||
| __new__ = namespace['__new__'] |
There was a problem hiding this comment.
Set __qualname__ and __module__ attributes of the __new__ method and other methods.
|
General question: do we want to use f-strings here rather than explicit format() calls? |
f-strings should be faster, so that probably we should use them where possible. |
|
A downside of f-strings would be that they would make it harder to port this implementation to e.g. pypy, which doesn't support 3.6 yet. I can check how much of a difference it makes in benchmarks. |
|
PyPy3.5 supports f-string. |
|
If we want performance about formatting string, why not use '%'-format? It seems faster than two others. |
Because original code uses
Why do you think so? It seems f-string is fast in most cases. |
Notes: - fasternt.py is a copy of collections/__init__.py as of python/cpython#2736 - you need to install perf==0.9.6 to run the benchmark, the latest version doesn't work - run the benchmark with ./bench in the prof directory. In summary, fasternt is ~4x faster than the stdlib at creating namedtuple classes and has no effect on instantiation or attribute access. cnamedtuple is 40x faster at class creation and ~30% faster at instantiation and attribute access.
Lib/collections/__init__.py
Outdated
| _new_template = ''' | ||
| def __new__(_cls, {arg_list}): | ||
| 'Create new instance of {typename}({arg_list})' | ||
| return _tuple.__new__(_cls, ({arg_list})) |
There was a problem hiding this comment.
Instead of _tuple.__new__, how about put namespace['_tuple_new'] = tuple.__new__ and call _tuple_new?
It makes eval and instantiation bit (about 3%) faster.
this pull request:
$ ./pyenv/versions/3.6.2/bin/python -m perf timeit -s 'from collections import namedtuple' -- 'namedtuple("Foo", "foo bar baz")'
.....................
Mean +- std dev: 90.1 us +- 3.3 us
$ ./pyenv/versions/3.6.2/bin/python -m perf timeit -s 'from collections import namedtuple; Foo=namedtuple("Foo", "foo bar baz")' -- 'Foo(1,2,3)'
.....................
Mean +- std dev: 503 ns +- 15 ns
_tuple_new version:
$ ./pyenv/versions/3.6.2/bin/python -m perf timeit -s 'from collections import namedtuple' -- 'namedtuple("Foo", "foo bar baz")'
.....................
Mean +- std dev: 87.1 us +- 2.7 us
$ ./pyenv/versions/3.6.2/bin/python -m perf timeit -s 'from collections import namedtuple; Foo=namedtuple("Foo", "foo bar baz")' -- 'Foo(1,2,3)'
.....................
Mean +- std dev: 485 ns +- 21 ns
There was a problem hiding this comment.
Thanks, making that change.
|
@JelleZijlstra would you update this pull request? |
|
Sorry, I've been busy. I'll try to find some time for it today. |
|
I think I've addressed all comments since my last push. Let me know if you'd like to see any other changes. Another open question: Should we use f-strings? They'll likely be a little faster, but may make it harder to port this code to other Python versions and implementations. |
This patch will not be backported even Python 3.6 which supports f-string. So please don't worry about it. |
|
Yes, I also think f-strings should be used here. |
|
Per past policy, optimizations don't get backported. I'll look at this patch more during the sprints next week. |
| __new__ = namespace['__new__'] | ||
| __new__.__doc__ = f'Create new instance of {typename}({arg_list})' | ||
|
|
||
| @classmethod |
There was a problem hiding this comment.
Might be cleaner to move this down to the class_namespace, to avoid having to use .__func__
|
|
||
| def __getnewargs__(self): | ||
| 'Return self as a plain tuple. Used by copy and pickle.' | ||
| return tuple(self) |
There was a problem hiding this comment.
These three functions without formatted docstrings don't need to be redeclared for each namedtuple, do they?
There was a problem hiding this comment.
I think that doesn't work now because I'm setting __qualname__ and __module__ on them.
| _repr_template = '{name}=%r' | ||
| _new_template = ''' | ||
| def __new__(_cls, {arg_list}): | ||
| return _tuple_new(_cls, ({arg_list})) |
There was a problem hiding this comment.
Wouldn't that be better handled by inserting the trailing comma here?
| new_source = _new_template.format(arg_list=arg_list) | ||
| exec(new_source, namespace) | ||
| __new__ = namespace['__new__'] | ||
| __new__.__doc__ = f'Create new instance of {typename}({arg_list})' |
There was a problem hiding this comment.
Otherwise you get a strange trailing comma in this docstring
There was a problem hiding this comment.
This pull request is about performance. I don't think changing much is better.
Current pull request follows current code.
>>> foo = collections.namedtuple("foo", "a")
>>> print(foo._source)
...
def __new__(_cls, a,):
'Create new instance of foo(a,)'
return _tuple.__new__(_cls, (a,))
...
There was a problem hiding this comment.
Although the current pull request doesn't produce exactly the same code any more anyway, since _tuple.__new__ is now _tuple_new, so there's no particular argument against changing it other than "it doesn't need to be part of this pr" (which I agree is true)
| @@ -0,0 +1,2 @@ | |||
| Speed up namedtuple class creation by 4x by avoiding usage of exec(). Patch | |||
There was a problem hiding this comment.
This could be read as exec was eliminated, so perhaps '...by reducing usage of exec().'?
The main differences are:
|
Creating a namedtuple is relatively slow because it uses exec().
This commit reduces the exec()'ed code, but still uses exec() for
creating the
__new__method. I don't know of a way to avoid usingexec() for
__new__beyond manipulating bytecode directly.However, avoiding exec() for creating the class itself still yields
a significant speedup. In an unscientific benchmark I ran, creating
1000 namedtuple classes now takes about 0.14 s instead of 0.44 s.
There is one backward compatibility break: namedtuples no longer have
a _source attribute, because we no longer exec() their source. I kept
the verbose=True argument around for compatibility, but it now does
nothing.
cc @sixolet @methane
https://bugs.python.org/issue28638