stamps Performance problems with detailed art (too many small shapes?)

This is probably a hard problem to crack, but here's a description anyway...

This snowflake pattern in the Context Free gallery renders quickly in the CF app: https://contextfreeart.org/gallery2/index.html#design/3771

Here's the CF code:

shape A {
  CIRCLE[]
  loop 6 [b.1r-60] A [s.4.5y.5]
}
startshape A

And here's my translation in Stamps:

(define rr random-real)

(define-shape A
  (circle)
  ((loop ([i 6])
         (A [s (rr .4 .5)]
            [y .5]
            [b (* i .1)]
            [r (* i -60)]))))
         

(maximum-render-cycles 200000)
(start-shape A)

I guess that maximum-render-cycles needs to be increased to get the detail, but this results in long render times and out of memory errors.

Nov 07 '17 10:11 ericclack

yes, unfortunately, C++ will beat racket no matter how much we optimize.

Things to try:

Use Racket Profile to identify the bottleneck
Using more of typed racket improves performance
For memory optimization: the path calculation algorithm could compute union of paths to reduce the number of objects
I haven't explored any concurrency/parallelism - which is definitely possible in this application
Maybe smarted pruning techniques (stop expanding if shape gets too small, but that can be tricky)

Nov 07 '17 23:11 rodrigosetti

I think exploring the pruning might be beneficial. Here's an example, which renders some circles - about 25 are visible, but 5001 get rendered:

#lang s-exp stamps/lang

(define-shape row
  (circle [saturation 1  ]
        [hue        0  ]
        [brightness 0.5])
  (row [translate   0.5   0.5]
       [scale    0.8  ]
       [hue      2    ])
)

(maximum-render-cycles 10000)
(start-shape row)

I'm guessing the pruning code would go in the function record-paths - is that correct?

Let me know what you think and I'll do some exploring...

Nov 09 '17 13:11 ericclack

Yes, that would be the obvious place.

I guess the sub-problems we need to solve to optimize the performance of your benchmark case are:

an efficient test to decide whether a polygon completely occludes another (polygon = 2xN matrix)
an efficient test to decide whether a new polygon added (in record-path method) occludes any of the existing polygons (in paths-queue instance variable) - and then efficiently remove those occluded polygons. Most likely the existing polygons set must be represented in something other than a queue (maybe a quad-tree?).

Nov 10 '17 02:11 rodrigosetti

Hi Rodrigo, My first strategy is to understand fully the recording paths code (especially recursive ones), so I'm writing some test code in a branch here: https://github.com/ericclack/racket-stamps/blob/performance/private/tests/renderer-recursive-tests.rkt

Once I've done that I'll start trying out some pruning strategies, perhaps with some alternative data structures as you suggest.

I'll let you know how I get on... -Eric.

Nov 14 '17 12:11 ericclack

My bad, I misunderstood you! I thought you're referring to the path-record% class (path-record.rkt).

The class intent of the class is simply to be a place where: 1) you can add new "paths", and 2) you can "render" those added paths into a device context (dc). My previous comments refer to ideas to implement in that class.

As for the record-paths function (render.rkt), it's not really recursive, just using a trick of "let" expressions to make a loop work. The purpose is to "keep applying" the renderer to shape rules and getting more shape rules until we reach the maximum render cycles or there are no more rule expansions - all while adding those paths to an instance of path-record%.

I'm not 100% clear how to optimize that function

Nov 15 '17 01:11 rodrigosetti

Ah I see how it works now, with the let-loop.

I have a theory that at some point we can discard the functions returned by (renderer pr) in function record-paths in render.rkt... when their shapes would be too small to see. I'm going to explore this and let you know how I get on.

Nov 16 '17 12:11 ericclack

More progress here: https://github.com/ericclack/racket-stamps/tree/performance

More to do...

Nov 20 '17 17:11 ericclack

Testing seems to show no performance improvement! However it's been interesting for me to see how the code works... we'll see if further work leads anywhere...

Nov 20 '17 18:11 ericclack

Looking further I can see that the pruning is not having the desired effect -- certainly we are drawing less, but the work to record the paths seems to still reach the maximum-render-cycles limit, and this takes most of the time.

See here the low number of shapes and the high render time, which actually is mostly in "recording paths" (from https://github.com/ericclack/racket-stamps/blob/performance/examples/cf-pink-blossom2.rkt)

rendering...recording paths... drawing paths...  2983 shapes, 28595 ms
saving to 1024x768 image output.png (png)...done

So I need to look at my logic here:

https://github.com/ericclack/racket-stamps/blob/2d9ff58e3972d66506e15a2c5fc9a13a2c5cdc0c/private/render.rkt#L72

Nov 21 '17 10:11 ericclack

See option #18

Dec 16 '17 11:12 ericclack