Conversation
|
I don't think this implementation works in the following case:

The problem is that

Unfortunately, it seems this case is not covered by the tests. I feel bad about that; this is definitely a case that I should've included in #565. I realize now that there are no tests for cases in which

Maybe a way to improve the performance of
|
I'll take a PR with more tests, please! |
|
@kalekundert I see your optimization in #740. I was thinking about a different approach, and came up with this idea (but no drafts or PoC yet).

more-itertools/more_itertools/more.py, line 4242 in bed0811

which uses

more-itertools/more_itertools/recipes.py, lines 357 to 360 in bed0811

So the issue is that

more-itertools/more_itertools/recipes.py, lines 335 to 340 in bed0811

What if we write another version of `_zip_equal_generator`:

```python
def _zip_equal_generator(n, *iterables):
    for combo in zip_longest(*iterables, fillvalue=_marker):
        match combo.count(_marker):
            case 0:
                yield combo
            case c if c == n:  # a bare `case n:` would be a capture pattern, not a comparison
                break
            case _:
                raise UnequalIterablesError()
```

Where `n` is the number of iterables. What do you think? (Sorry for my bad English)
|
That seems like a good idea to me. I can't think of any reason why it wouldn't work, and it should be just as fast as your original implementation. Also, just to be clear, I didn't mean for #740 to preclude more substantial optimizations like this. I just noticed a small inefficiency, and it was easy to fix. |
Your optimization is absolutely good; I didn't mean any offense. I hope the other ideas work out too!
|
For both branches, could you please make sure that you've merged in |
Yep, done! |
|
@kalekundert I have implemented the optimization above; it seems to pass both the old tests and the new ones you added in #739. I would appreciate your help reviewing it.
|
I just copypasted |
|
I only have a few very minor comments. Most of them pertain more to
Other than that, everything looks good to me. I assume this version still runs ≈10x faster than the current implementation?

Speed tests for the minor optimizations mentioned above:

```python
import timeit
from itertools import zip_longest

from more_itertools import consume
from more_itertools.recipes import UnequalIterablesError

_marker = object()

args = [
    range(100_000),
    range(100_000),
    range(100_000),
]

def z1(iterables):
    # global marker, explicit inner loop
    for combo in zip_longest(*iterables, fillvalue=_marker):
        for val in combo:
            if val is _marker:
                raise UnequalIterablesError()
        yield combo

def z2(iterables):
    # local marker, explicit inner loop
    _marker = object()
    for combo in zip_longest(*iterables, fillvalue=_marker):
        for val in combo:
            if val is _marker:
                raise UnequalIterablesError()
        yield combo

def z3(iterables):
    # global marker, membership test
    for combo in zip_longest(*iterables, fillvalue=_marker):
        if _marker in combo:
            raise UnequalIterablesError()
        yield combo

def z4(iterables):
    # local marker, membership test
    _marker = object()
    for combo in zip_longest(*iterables, fillvalue=_marker):
        if _marker in combo:
            raise UnequalIterablesError()
        yield combo

for f in [z1, z2, z3, z4]:
    print(
        timeit.timeit(
            stmt='consume(f(args))',
            globals=dict(f=f) | globals(),
            number=100,
        )
    )
```

Output:
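For what it's worth, the four variants only differ in the marker's scope (module-global vs. local) and the membership test (`in` vs. an explicit loop); they are behaviorally identical, which a quick check confirms (`UnequalIterablesError` is redefined here so the snippet stands alone):

```python
from itertools import zip_longest


class UnequalIterablesError(ValueError):
    pass


_marker = object()


def z1(iterables):
    # explicit Python-level loop over each combo
    for combo in zip_longest(*iterables, fillvalue=_marker):
        for val in combo:
            if val is _marker:
                raise UnequalIterablesError()
        yield combo


def z3(iterables):
    # tuple membership test: runs in C and short-circuits
    for combo in zip_longest(*iterables, fillvalue=_marker):
        if _marker in combo:
            raise UnequalIterablesError()
        yield combo


args = [range(5), range(5)]
assert list(z1(args)) == list(z3(args)) == [(i, i) for i in range(5)]
```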
I can do something like this:

```diff
 def zip_broadcast(*objects, scalar_types=(str, bytes), strict=False):
     ...
     if strict:
+        yield from _zip_equal(
+            *iterables, gen=partial(
+                _zip_equal_generator_scalars, iterables_count
+            )
+        )
     else:
         yield from zip(*iterables)

+def _zip_equal(*iterables, gen=_zip_equal_generator):
     ...
     except TypeError:
+        return gen(iterables)

 def _zip_equal_generator_scalars(n, iterables):
     ...
```

It reverts
It is also raised when a regular iterable of infinite/undefined length occurs, so how can this check be eliminated?
Don't take my suggestions too seriously; I'm not sure they're all good ideas. |
Good spots! Maybe I shouldn't optimize
Feel free to try; I invited you to collaborate on my fork, so you can push your commits to this PR directly. I'll go along with whatever you decide about adding or not adding any change, such as the one above.
|
@bbayles I have resolved the conflicts; is this planned to be merged?
That's not safe: it miscounts if there's a non-marker object that compares equal to the marker. Demo:

```python
from unittest.mock import ANY

_marker = object()

for combo in zip('foo', [1, ANY, 3]):
    print(combo.count(_marker))
```

Output (Attempt This Online!):
|
@pochmann if it treats |
Oh, is it because the current one uses
|
Yes, counting with

Btw, I think I also optimized this, but I can't find it right now... Maybe I dismissed it because the current implementation is simpler (especially after prefilling the scalars, which I had also done). But mine might be simpler than the new suggestion. I might try again...
What Python version did you use? I think global lookups got faster in the last few versions. And are those results stable? (I.e., did you run it multiple times and always get very similar results?)
|
@Masynchin About your current proposal: I'd rather do it like this, without the extra functions:

```python
def zip_broadcast(*objects, scalar_types=(str, bytes), strict=False):
    ...
    iterables = [repeat(obj) if is_scalar(obj) else obj for obj in objects]
    if not strict:
        yield from zip(*iterables)
    # if the lengths of the non-scalars are known and all the same:
    #     yield from zip(*iterables)
    #     (or raise UnequalIterablesError if the lengths differ)
    for combo in zip_longest(*iterables, fillvalue=_marker):
        ...
```

Advantages:
|
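A runnable toy of the structure sketched above, assuming a simple type-based `is_scalar` check and omitting both the known-lengths fast path and the all-scalars special case; the names here are illustrative, not the library's final code:

```python
from itertools import repeat, zip_longest

_marker = object()


class UnequalIterablesError(ValueError):
    pass


def zip_broadcast_sketch(*objects, scalar_types=(str, bytes), strict=False):
    def is_scalar(obj):
        return isinstance(obj, scalar_types) or not hasattr(obj, '__iter__')

    # scalars become infinite streams; real iterables pass through
    iterables = [repeat(obj) if is_scalar(obj) else obj for obj in objects]
    n = sum(1 for obj in objects if not is_scalar(obj))  # finite iterables

    if not strict:
        # plain zip stops at the shortest finite iterable
        yield from zip(*iterables)
        return

    for combo in zip_longest(*iterables, fillvalue=_marker):
        markers = sum(1 for x in combo if x is _marker)
        if markers == 0:
            yield combo
        elif markers == n:
            break  # all finite iterables exhausted together
        else:
            raise UnequalIterablesError()


assert list(zip_broadcast_sketch(1, 'ab', [10, 20])) == [
    (1, 'ab', 10), (1, 'ab', 20)
]
```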
Fixed and added as test |
|
Also, what should I do about flake8 in CI? Same as #742 (comment) |
|
Merge in the |
|
I found that in this PR |
I used version 3.10.0, and I ran it a bunch of times. IIRC, it's not a stable effect when there are only ≈10 items per iterable, but it's very stable by the time you get to ≈1000. That said, this is a really micro optimization, and I just pointed it out because I noticed it. |
Fixed
Done |
Can you tell me what that was and give an example? I don't see it, and the fix commit changes a lot of code.
more_itertools/more.py (outdated)

```diff
-iterables_count = sum(1 for obj in objects if not is_scalar(obj))
+iterables = list(filterfalse(is_scalar, objects))
+iterables_count = ilen(iterables)
```
It's a list, just use len. And I don't think you need the count in the variable. You only use it once or twice, the first time you can just check the list instead, and the second time you can just call len there.
> It's a list, just use `len`.

My bad, I had `filterfalse` in mind while adding `ilen`.

> ...and the second time you can just call len there.

`elif markers == iterables_count:`

Should I use `len` here? It would be more performant to check the length once rather than on every step of the for-loop.
> Should I use `len` here?

Yes.

> It would be more performant if I check the length once and not every step of for-loop

No. You get there at most once.
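The underlying distinction, for the record: `more_itertools.ilen` walks the whole iterable in O(n) (and consumes iterators), while `len` on a list is O(1). A stand-in `ilen` is defined here so the snippet runs without the library:

```python
def ilen(iterable):
    # what more_itertools.ilen does: count by exhausting the iterable
    return sum(1 for _ in iterable)


xs = [1, 2, 3]
it = iter(xs)
assert ilen(it) == 3
assert list(it) == []   # the iterator is now spent
assert len(xs) == 3     # a list already knows its length
```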
Your `print` is in the wrong place. You only get to the `len` call if the `if` condition is false. Prepend the print to the `elif` instead: `elif print(...) or condition:`.
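This works because `print(...)` returns `None`, which is falsy, so `print(...) or condition` runs the print for its side effect and then evaluates to `condition`. A toy illustration (the function and its names are hypothetical, not the PR's code):

```python
def classify(markers, iterables_count):
    if markers == 0:
        return 'yield'
    # the debug print fires exactly when the elif is reached;
    # print(...) is None, so the `or` falls through to the real test
    elif print('reached elif, markers =', markers) or markers == iterables_count:
        return 'break'
    else:
        return 'raise'


assert classify(0, 2) == 'yield'
assert classify(2, 2) == 'break'
assert classify(1, 2) == 'raise'
```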
You are right: if the `if` condition fails, the for-loop ends in both the `elif` and `else` branches. Thanks for noticing!
Consider:

```
UnequalIterablesError: Iterables have different lengths: index 0 has length 1; index 1 has length 2
```

Before the fix it would be
Ah, ok. Not sure which is better, indexes referring to all arguments or just to iterables. Wasn't it |
I just reran
|
I ran this benchmark on Python 3.11.3:

```python
import timeit

from more_itertools import consume, zip_broadcast

N = 1_000
G = globals()

t1 = timeit.timeit("consume(zip_broadcast(1, 2, [1] * 100_000))", number=N, globals=G)
t2 = timeit.timeit("consume(zip_broadcast(1, 2, [1] * 200_000, [2, 3] * 100_000))", number=N, globals=G)
t3 = timeit.timeit("consume(zip_broadcast(1, 2, [1] * 100_000, strict=True))", number=N, globals=G)

print(t1, t2, t3, sep="\n")
```

On master (266ebdc) and this PR (2dd6fe2), here are the results:
Almost a 10x speed-up. There are no unresolved questions; can we merge it?
|
I think none of those tests covers your slow case; can you test that as well?
|
Is there still a test to add here? |
After #739 was merged and I tweaked this PR to pass its tests, no other problems were found. The only thing still requested is regression benchmarks, but I am currently too busy with other stuff. My thought is that it can only regress if the caller provides only one iterable and N scalars. I would be happy if anyone could verify this.
|
I think nobody is eager to finish this off, so closing. Thanks for the contribution nonetheless! |

Issue reference

No Issues 😎

Changes

Change logic inside `zip_broadcast`. Now it converts scalars to iterables, so that it can just zip them all together.

Simple benchmark:
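The conversion idea in miniature (a sketch of the approach, not the PR's code): each scalar becomes an infinite `repeat`, and plain `zip` then truncates at the shortest real iterable:

```python
from itertools import repeat

# broadcast the scalar 'x' against a real iterable;
# zip stops when the finite list is exhausted
rows = list(zip(repeat('x'), [1, 2, 3]))
assert rows == [('x', 1), ('x', 2), ('x', 3)]
```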