Simplify zip iterator and improve performance in certain cases by haampie · Pull Request #27415 · JuliaLang/julia

haampie · 2018-06-04T09:27:16Z

This is a separate pr built on top of #27386. It removes the Zip2 struct, since what used to be Zip1 is already returning a 1-tuple and hence is a base case (turns out @StefanKarpinski was right in the comments here: eaf5a95, but maybe only after Zip1 was updated.)

Further it fixes a bug in isdone where the tail of the zip iterator was never checked (https://github.com/JuliaLang/julia/blob/master/base/iterators.jl#L373). Should still add a test for this.

I've built this on top of #27386 to show that we now finally get fully optimized code in some cases -- note the lack of @inbounds:

$ ./julia
> using BenchmarkTools
> function f(v)
     s = 0
     for (i, w) = zip(LinearIndices(v), v)
       s += i * w
     end
     s
   end
> @benchmark f(v) setup = (v = rand(Int, 1000))

On current master:

  median time:      546.735 ns (0.00% GC)

On this branch:

  median time:      245.483 ns (0.00% GC)

This is now on par with writing for (i, w) = enumerate(v) and with the handwritten loop

@inbounds for i = 1 : length(v)
  s += i * v[i]
end

StefanKarpinski · 2018-06-04T12:42:44Z

Very cool! Test failure in test/iterators looks legitimate though. It would be nice to be able to define enumerate(v) simply as zip(LinearIndices(v), v) and have that be maximally efficient.

haampie · 2018-06-04T13:29:43Z

Yeah, test failures are annoying, but potentially there's still a workaround without the machinery proposed in #27412.

For vectorization it seems crucial to call iterate first on both iterators, and only afterwards check the state. This corresponds to the case isdone(a) === isdone(b) === missing. Otherwise we have at least one stateful iterator, in which case early returns are required.

haampie · 2018-06-04T15:00:26Z

It seems this passes the iterators tests ~~while retaining the performance in the example above :)~~, hm, not really, apparently some last changes spoiled it.

Can we tell the compiler somehow the first branch in zip_iterate is almost always true?

haampie · 2018-06-04T15:31:19Z

Alright, things get vectorized again. This stuff is extremely fragile, so maybe I should add comments with warnings about spoiling vectorization.

It could be way more robust and simpler if we did not have to worry about stateful iterators.

edit: I stumbled upon an inference issue in the case of 3+ iterators -- let's see if it can be fixed.

…ar} iterator

Furthermore, reorder the instructions for the fast path to enable vectorization in some cases.

bramtayl · 2018-06-05T14:51:20Z

n.b. #26765

haampie · 2018-06-05T15:44:40Z

Thanks for the pointer, I wasn't aware of that issue.

haampie · 2018-06-05T22:21:46Z

Some observations: https://gist.github.com/haampie/5dd6803b217709c4629d3eeb75c16258

Basically, we can do really well if we specialize for stateless iterators. Type inference seems perfect even with recursion up to 14 calls of iterate deep (cc @bramtayl); some daunting looking loops that have closed-form expressions get optimized away; and vectorization kicks in.

I cannot get all these great results to work with the current way stateful iterators are handled. If I find time I'll take a look at how we could use simple traits to work around this.

haampie · 2018-06-06T14:32:26Z

I think I might give up on this pr -- at least on the recursive structure. It's really more of an art than a science to get performance right.

The results in the comment above look promising, but in turns out it does not generate optimal code for

function vectorize_multiple_products(xs::Vector{T}, ys::Vector{T}, zs::Vector{T}) where {T}
    s = zero(T)
    for (x,y,z) = zip(xs, ys, zs)
        s += x * y * z
    end
    s
end

@benchmark vectorize_multiple_products(xs, ys, zs) = setup(xs = rand(1000); ys = rand(1000); zs = rand(1000);)

which is about 60% slower than the current implementation of zip; unless one interchanges the tail = iterate(z.tail, state[2]) and head === nothing && return nothing lines, but then it does not recognize it can simplify optimize_away_loop_with_closed_form. In the recursive setting it seems virtually impossible to fix all problems.

haampie force-pushed the simplify-zip branch from 5701e79 to 7d2c0e6 Compare June 4, 2018 09:41

haampie force-pushed the simplify-zip branch from 7d2c0e6 to 95f30a0 Compare June 4, 2018 14:58

haampie force-pushed the simplify-zip branch from 95f30a0 to f68afde Compare June 4, 2018 15:26

haampie mentioned this pull request Jun 4, 2018

Guarantee inbounds iteration over Array{T} #27386

Merged

haampie force-pushed the simplify-zip branch from f68afde to 23669f7 Compare June 5, 2018 07:24

haampie added 4 commits June 5, 2018 11:58

Guarantee inbounds iteration over Array{T}

4ce9066

Replace indexing with an iterator with actual indexing

45174ab

Make the CharStr iterator unaware of the initial state of a Vector{Ch…

8923c1b

…ar} iterator

Remove the Zip2 struct, since Zip1 already forms a valid base case.

20723d8

Furthermore, reorder the instructions for the fast path to enable vectorization in some cases.

haampie force-pushed the simplify-zip branch from 23669f7 to 20723d8 Compare June 5, 2018 08:58

StefanKarpinski requested a review from JeffBezanson June 5, 2018 14:55

haampie closed this Jun 6, 2018

haampie deleted the simplify-zip branch June 6, 2018 14:33

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Simplify zip iterator and improve performance in certain cases#27415

Simplify zip iterator and improve performance in certain cases#27415
haampie wants to merge 4 commits intoJuliaLang:masterfrom
haampie:simplify-zip

haampie commented Jun 4, 2018 •

edited

Loading

Uh oh!

StefanKarpinski commented Jun 4, 2018

Uh oh!

haampie commented Jun 4, 2018

Uh oh!

haampie commented Jun 4, 2018 •

edited

Loading

Uh oh!

haampie commented Jun 4, 2018 •

edited

Loading

Uh oh!

bramtayl commented Jun 5, 2018

Uh oh!

haampie commented Jun 5, 2018

Uh oh!

haampie commented Jun 5, 2018

Uh oh!

haampie commented Jun 6, 2018 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

haampie commented Jun 4, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

StefanKarpinski commented Jun 4, 2018

Uh oh!

haampie commented Jun 4, 2018

Uh oh!

haampie commented Jun 4, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

haampie commented Jun 4, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

bramtayl commented Jun 5, 2018

Uh oh!

haampie commented Jun 5, 2018

Uh oh!

haampie commented Jun 5, 2018

Uh oh!

haampie commented Jun 6, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

haampie commented Jun 4, 2018 •

edited

Loading

haampie commented Jun 4, 2018 •

edited

Loading

haampie commented Jun 4, 2018 •

edited

Loading

haampie commented Jun 6, 2018 •

edited

Loading