Faster path drawing for the cairo backend (cairocffi only)#8787
Faster path drawing for the cairo backend (cairocffi only)#8787timhoffm merged 9 commits intomatplotlib:masterfrom
Conversation
fe66d3b to
37775fb
Compare
abce702 to
241a306
Compare
| cur = ctx.get_current_point() | ||
| ctx.curve_to( | ||
| *np.concatenate([cur / 3 + points[:2] * 2 / 3, | ||
| points[:2] * 2 / 3 + points[-2:] / 3])) |
There was a problem hiding this comment.
Not sure why this is different than before?
There was a problem hiding this comment.
previous version was actually incorrect (see basically https://en.wikipedia.org/wiki/B%C3%A9zier_curve#Degree_elevation)..
|
|
||
|
|
||
| def _convert_paths(ctx, paths, transforms, clip=None): | ||
| return (_convert_paths_fast if HAS_CAIRO_CFFI else _convert_paths_slow)( |
There was a problem hiding this comment.
Maybe if at top-level would be quicker/simpler?
There was a problem hiding this comment.
By top-level, I meant if HAS_CAIRO_CFFI: _convert_paths = _convert_paths_fast ...
|
|
||
| def _convert_paths_fast(ctx, paths, transforms, clip=None): | ||
| # We directly convert to the internal representation used by cairo, for | ||
| # which ABI compatibility is guaranteed. The layout is for each item is |
| transform = (transform | ||
| + Affine2D().scale(1.0, -1.0).translate(0, self.height)) | ||
|
|
||
| + Affine2D().scale(1, -1).translate(0, self.height)) |
There was a problem hiding this comment.
transforms are mutable but don't implement iadd (so += actually doesn't modify the transform in place), but I was too lazy to check and want to make clear that we're indeed not mutating the transform.
|
|
||
| transform = (transform | ||
| + Affine2D().scale(1.0, -1.0).translate(0, self.height)) | ||
| + Affine2D().scale(1, -1).translate(0, self.height)) |
| self.convert_path( | ||
| ctx, marker_path, marker_trans + Affine2D().scale(1.0, -1.0)) | ||
| _convert_path( | ||
| ctx, marker_path, marker_trans + Affine2D().scale(1, -1)) |
| vars(gc).update(gc_vars) | ||
| for k, v in gc_vars.items(): | ||
| try: | ||
| getattr(gc, "set" + k)(v) |
There was a problem hiding this comment.
Actually no, I'm just copying the __dict__ of the original gc, and (kind of an implementation detail...) GraphicsContextBase stores "property" foo (accessed by get/set_foo) in ._foo). Not very elegant I admit.
| antialiaseds, urls, offset_position): | ||
| path, transform = path_id | ||
| transform = (Affine2D(transform.get_matrix()).translate(xo, yo) | ||
| + Affine2D().scale(1, -1).translate(0, self.height)) |
There was a problem hiding this comment.
Why do you need the second Affine2D?
| if new_key == reuse_key: | ||
| grouped_draw.append((path, transform)) | ||
| else: | ||
| _draw_paths() |
There was a problem hiding this comment.
Do we need to worry about draw order and such here?
There was a problem hiding this comment.
The draw order is maintained, it's just that I batch the draws that use the same gc together.
| as_int[::4][cairo_type_positions] = codes | ||
| as_int[1::4][cairo_type_positions] = cairo_type_sizes | ||
| mask[::2][cairo_type_positions] = mask[1::2][cairo_type_positions] = False | ||
| as_float[mask] = vertices.ravel() |
There was a problem hiding this comment.
Could be re-arranged slightly to keep the int and float bits together.
There was a problem hiding this comment.
Not sure what you meant?
There was a problem hiding this comment.
Well, you start with a buffer, make an int version, then a float version, then a mask (which is for float). But then go back to filling the int version, modifying the mask (which is float stuff again), then fill the float version. So I mean more like:
buf = np.empty(cairo_num_data * 16, np.uint8)
as_int = np.frombuffer(buf.data, np.int32)
as_int[::4][cairo_type_positions] = codes
as_int[1::4][cairo_type_positions] = cairo_type_sizes
as_float = np.frombuffer(buf.data, np.float64)
mask = np.ones_like(as_float, bool)
mask[::2][cairo_type_positions] = mask[1::2][cairo_type_positions] = False
as_float[mask] = vertices.ravel() 241a306 to
f1824ee
Compare
|
Comments addressed. |
a103612 to
245f5de
Compare
|
comments addressed |
245f5de to
4a9f752
Compare
|
Generally, I don't see a problem with this change, but using your microbenchmark, I can't seem to reproduce the speedup you see: vs. |
|
Are you certain your gtk3cairo is using cairocffi rather than pycairo (which is not accelerated by this PR)? |
|
OK, I'm not really sure what happened, but I just tried again and I get 12.705427 vs. 17.470702 now. |
timhoffm
left a comment
There was a problem hiding this comment.
Not a cario expert, but look good to me.
Not really clear with the naming of convert_paths. To me convert feels like it should return the converted object. Maybe apply_paths? If that's not good, please add a short docstring to the convert functions.
flatten() always makes a copy, whereas ravel() does not.
The removed pair of ctx.save and ctx.restore was clearly unnecessary (the outer one is still there, and the font is reset at each loop iteration).
Improves the performance of mplot3d/wire3d_animation on the gtk3cairo backend from ~8.3fps to ~10.5fps (as a comparison, gtk3agg is at ~16.2fps).
Further increase the performance of mplot3d/wire3d_animation on the gtk3cairo backend from ~10.5fps to ~11.6fps (as a comparison, gtk3agg is at ~16.2fps).
For the slow code path, implement the degree elevation formula. For the fast code path, the path cleaner was already handling this for us, converting everything to lines.
|
Renamed to _append_path, which matches the name of the function in the cairo API: https://cairographics.org/manual/cairo-Paths.html#cairo-append-path |
|
Would it help you if pycairo had something like |
|
I moved my efforts to just doing everything in C(++) (https://github.com/anntzer/mplcairo) but I guess such a method can't hurt in general... |
ok
yeah, but it's unmaintained atm |
When using cairocffi only:
Accelerate Path drawing by directly constructing a cairo_path_t using numpy operations rather than repeatedly calling the cairo API to add one point at a time.
Accelerate PathCollection drawing by concatenating Paths that have the same drawing parameters (color, etc., i.e. GraphicsContext attributes).
Minor additional performance improvements (remove an unneeded ndarray copy and an unneeded context save/restore pair).
Overall, this brings the performance of examples/mplot3d/wire3d_animation.py (used as a microbenchmark) from ~8.3fps to ~11.6fps on the same computer (just optimizing Paths instead of PathCollections gives ~10.5fps), using the gtk3cairo backend. As a comparison point, the gtk3agg backend renders ~16.2fps for that example.
From profiling, I believe that further improvements (on that same microbenchmark) could be achieved by improving the text drawing (currently, we use the "toy" API, which is unfortunately the only one exposed by cairocffi so far; additionally, I am not certain whether ft2font exposes pointers to the underlying freetype structures).
Or (for other performance tests) draw_markers could probably also benefit from a similar approach as this PR.