Roughly 20% of CPU time is spent in tessellation. In a local engine build, replacing the RRects with regular rects makes tessellation drop out of the top 20 functions.
We should create a specialized tessellation routine for symmetrical rounded rects.
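
A rough sketch of what such a routine could look like, assuming "symmetrical" means all four corners share one circular radius. `Point` and `TessellateSymmetricRRect` are hypothetical names for illustration, not existing engine APIs. The idea is to evaluate the trig for a single quadrant once and mirror those samples into the other three corners:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical point type for this sketch (not the engine's own type).
struct Point {
  float x;
  float y;
};

// Hypothetical helper: returns the convex outline of a rounded rect whose
// four corners share one circular radius. Assumes y grows downward.
// Each corner arc gets `divisions` segments (divisions + 1 samples).
std::vector<Point> TessellateSymmetricRRect(float left, float top,
                                            float right, float bottom,
                                            float radius,
                                            std::size_t divisions) {
  assert(divisions >= 1);
  assert(radius >= 0.0f);
  assert(2.0f * radius <= right - left && 2.0f * radius <= bottom - top);

  // Evaluate sin/cos for one quadrant only; the other three are mirrors.
  const float kHalfPi = 1.57079632679f;
  std::vector<Point> quadrant(divisions + 1);
  for (std::size_t i = 0; i <= divisions; i++) {
    float angle = kHalfPi * static_cast<float>(i) /
                  static_cast<float>(divisions);
    quadrant[i] = {radius * std::cos(angle), radius * std::sin(angle)};
  }

  // Centers of the corner arcs.
  const float cx0 = left + radius, cx1 = right - radius;
  const float cy0 = top + radius, cy1 = bottom - radius;

  std::vector<Point> outline;
  outline.reserve(4 * (divisions + 1));

  // Bottom-right corner: from (right, cy1) to (cx1, bottom).
  for (std::size_t i = 0; i <= divisions; i++) {
    outline.push_back({cx1 + quadrant[i].x, cy1 + quadrant[i].y});
  }
  // Bottom-left corner: mirrored in x, walked in reverse to keep winding.
  for (std::size_t i = divisions + 1; i-- > 0;) {
    outline.push_back({cx0 - quadrant[i].x, cy1 + quadrant[i].y});
  }
  // Top-left corner: mirrored in x and y.
  for (std::size_t i = 0; i <= divisions; i++) {
    outline.push_back({cx0 - quadrant[i].x, cy0 - quadrant[i].y});
  }
  // Top-right corner: mirrored in y, walked in reverse.
  for (std::size_t i = divisions + 1; i-- > 0;) {
    outline.push_back({cx1 + quadrant[i].x, cy0 - quadrant[i].y});
  }
  return outline;
}
```

Because the outline is convex and consistently wound, it could be drawn as a triangle fan or reordered into a strip without a general-purpose tessellator. The trig cost is `divisions + 1` evaluations per rounded rect rather than per corner. Whether this actually recovers the ~20% would need to be confirmed by profiling.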