[Impeller] Defer drawable aquisition until root pass encoding. #48141

jonahwilliams · 2023-11-16T20:40:24Z

WIP. replaces #47976

impeller/renderer/backend/metal/surface_mtl.mm

jonahwilliams · 2023-11-17T02:48:36Z

impeller/renderer/backend/metal/command_buffer_mtl.mm

+                  .GetMTLTexture();
+        }
+
+        // Render command encoder creation has been observed to exceed the stack


TODO: need to make this a regular NSThread so I can increase the stack size.

knopp · 2023-11-17T10:28:23Z

I played with this branch a bit on iPhone 13 PRO, 120hz. Haven't gotten to partial repaint yet.

it is probably a good thing to not block raster thread when waiting for drawable. That said, I measured how much it takes to obtain the drawable in GetDrawableDeferred and it still oscilates between ~0.02ms and 6-8ms. Testing it with Flutter gallery, I did notice a pattern:

For Flutter driven animations, such as activity indicator, [layer nextDrawable] consistently takes 6-8ms. For user interaction repaints, like scrolling (while dragging)[layer nextDrawable] consistently takes 0.02ms.

It gets messier with shorter animations started from user interaction, like ballistic or material spill. Less consistency there. Can't shake off the feeling that we have a timing issue somewhere.

knopp · 2023-11-17T11:12:09Z

After disabling _allowPauseAfterVsync inside VsyncWaiterIOS, [layer nextDrawable] during progress indicator animation now either takes 7ms or 0.02ms (more often), this number is consistent until the animation finishes.

knopp · 2023-11-17T16:14:14Z

Well, if there is a timing issue on our side I can't seem to find what it is. Also apple official example seems to have same problem - when rendering on background thread, [layer nextDrawable] takes 1-7ms on 120hz device. On background thread with presentsWithTransaction this goes down to sub 1ms, but of course that brings its own can of worms.

So I do have a bit of concern about this PR - it removes the lag from raster thread, which is good, but given that [layer nextDrawable] can still block for unreasonable amount of time, is it not possible that we'll be dropping frames without being aware of it (not showing on timeline).

jonahwilliams · 2023-11-17T16:18:32Z

presentsWithTransaction from the background thread isn't valid, we went down that path before: flutter/flutter#133147

jonahwilliams · 2023-11-17T16:20:20Z

Also I don't see how we would drop frames. If you try and run an application with very slow rendering (say dozens of blurs), then we still get backpressure.

knopp · 2023-11-17T16:23:00Z

Also I don't see how we would drop frames. If you try and run an application with very slow rendering (say dozens of blurs), then we still get backpressure.

Even with this PR sometimes [layer nextDrawable] on the background thread takes 7-9 ms, would this not drop frame?

jonahwilliams · 2023-11-17T16:26:07Z

If we're blocking on aquiring the drawable we're already commited to that frame, I don't see how you would drop it?

knopp · 2023-11-17T17:43:25Z

If we're blocking on aquiring the drawable we're already commited to that frame, I don't see how you would drop it?

Not me, the compositor - If presentDrawable is called too late and we miss the deadline and then replace it with another frame, compositor will drop it. (But maybe that's not so straightforward with promotion 4ms resolution)

When playing with this branch I noticed that when this happens I get display link callbacks while [layer nextDrawable] is in progress, in which case I definitely don't think the drawable is presented in time. But also when this happens, the 5-7ms nextDrawable delay becomes consistent, so everything appears smooth, with just some additional latency. But still I'd love to know what exactly is triggering this.

jonahwilliams · 2023-11-17T17:50:17Z

Not me, the compositor - If presentDrawable is called too late and we miss the deadline and then replace it with another frame, compositor will drop it. (But maybe that's not so straightforward with promotion 4ms resolution)

but we're not dropping that presentation. its going to happen at the next vsync interval. There is no indication I've seen that a drawable has a specific presentation time it must be submited by or it gets dropped, especially if there is no CA transaction associated with the presentation.

jonahwilliams · 2023-11-17T18:07:21Z

And again, why would this be worse than blocking for 7-8ms before we do any raster work?

knopp · 2023-11-17T18:18:46Z

but we're not dropping that presentation. its going to happen at the next vsync interval. There is no indication I've seen that a drawable has a specific presentation time it must be submited by or it gets dropped, especially if there is no CA transaction associated with the presentation.

Right. But we don't know that we missed the deadline. So there's already another frame being rasterized and if that happens fast enough it will be presented before the end of next vsync. Thus the previous frame will dropped by the compositor.

                      [vsync]                         [vsync]
      Frame N            |            Frame N+1          |
                         |                               |
|raster thread|          |  |raster thread|              |
     |====next_drawable====|      |next_drawable|        | 
                         | |present|            |present||

When this happens it will not be shown as jank on the timeline, but it will seem janky on device.

And again, why would this be worse than blocking for 7-8ms before we do any raster work?

It's definitely not worse.

knopp · 2023-11-17T18:30:13Z

I don't want to sound too negative. Unless we can figure out what causes the delay (not holding my breath) this is a clear improvement.

jonahwilliams · 2023-11-17T18:33:42Z

The scenario you are describing is already technically possible today. We can get up to a pipeline depth of two, which means that from the engine's perspective of a vsync interval, double submission is entirely possible.

This was my working theory for a while, but I can't find any evidence that this is actually happening on the system side. drawables have a presentation time attribute that we could intrument to determine if any are getting skipped, I can try that today.

I don't want to sound too negative.

I don't think you're being too negative. its important to be critical on changes like this, especially those that rely on APple's under documented functionality.

jonahwilliams · 2023-11-17T19:06:09Z

Here is how I am testing this. Using an app that just has a linear progress indicator indeterminatly spinning.

The documentation for addPresentedHandler is unclear as to whether or not it fires if a drawable was skipped. It does note that drawable.presentedTime will be 0 if the drawable is skipped, which makes me thing it will. To ensure this is the case, I add a static atomic int, increment it for acuqired drawables and decrement it when the present handler fires. If we were skipping drawables and the handler fires, I'll see a print. If skipping drawables doesn't cause the handler to fire, then the number will increment over time.

I don't have any expectations for what the value is itself since the docs dont say when the presented handler is fired, but I'd suspect that 3 or less would be reasonable.

diff --git a/impeller/renderer/backend/metal/surface_mtl.mm b/impeller/renderer/backend/metal/surface_mtl.mm
index 4a49692c5f..5c9d502725 100644
--- a/impeller/renderer/backend/metal/surface_mtl.mm
+++ b/impeller/renderer/backend/metal/surface_mtl.mm
@@ -3,6 +3,7 @@
 // found in the LICENSE file.
 
 #include "impeller/renderer/backend/metal/surface_mtl.h"
+#include <atomic>
 #include <optional>
 
 #include "flutter/fml/trace_event.h"
@@ -18,6 +19,8 @@
 
 namespace impeller {
 
+static std::atomic_int64_t present_count_ = 0;
+
 #pragma GCC diagnostic push
 #pragma GCC diagnostic ignored "-Wunguarded-availability-new"
 
@@ -307,8 +310,17 @@ bool SurfaceMTL::Present() const {
       [drawable present];
     } else {
       auto deferred_drawable = deferred_drawable_;
+      present_count_++;
       [command_buffer addScheduledHandler:^(id<MTLCommandBuffer> _Nonnull) {
         auto drawable = deferred_drawable->get();
+        [drawable addPresentedHandler:^(id<MTLDrawable> draw){
+          present_count_--;
+          // Returns 0 if a frame has not been presented or has been skipped.
+          if (draw.presentedTime == 0) {
+            FML_LOG(ERROR) << "SKIPPED";
+          }
+          FML_LOG(ERROR) << present_count_;
+        }];
         [drawable present];
       }];
       [command_buffer commit];

If I run this then I observe the number very slowly creeping up from 2 to 3. If I let it run for multiple minutes that doesn't seem to change. If we were regularly dropping presentation I would expect to see this number climb quickly. If it happened only rarely then it would climb slowly. It seems consistent.

If I background the app, then I get a bunch of. "SKIPPED", I suspect because iOS pauses rendering. THe callback seems to fire for every cmd buffer.

If I foreground it again, it goes right back to 2-3.

jonahwilliams · 2023-11-17T19:12:42Z

Trying a different app like https://github.com/gskinnerTeam/flutter-wonderous-app/ , I notice a similar behavior expect the count gets higher. Oddly enough I can consistently generate a skipped frame on the 2nd or 3rd frame after I start a slow gesture to swipe between the tabs.

I'm going to try blocking on the cmd buffer to see if that reproduces in all cases or is specific to using the worker thread.

jonahwilliams · 2023-11-17T19:16:50Z

Even if I block on drawable aquisition on the raster thread instead of in the worker like I'm doing here, I still see iOS skipping drawables when I initiaiate a gesture on the wonderous screen.

[ERROR:flutter/impeller/renderer/backend/metal/surface_mtl.mm(321)] SKIPPED
[ERROR:flutter/impeller/renderer/backend/metal/surface_mtl.mm(323)] 0
[ERROR:flutter/impeller/renderer/backend/metal/surface_mtl.mm(323)] 2
[ERROR:flutter/impeller/renderer/backend/metal/surface_mtl.mm(323)] 3
[ERROR:flutter/impeller/renderer/backend/metal/surface_mtl.mm(323)] 3

This is with waitForNextDrawable on raster thread blocking, but at end of raster workload:

jonahwilliams · 2023-11-17T19:18:52Z

If this was because we use present and not presentDrawable:atTime: or presentDrawable:afterMinimumDuration: I wouldn't expect to see this on the first few frames of a gesture.

jonahwilliams · 2023-11-17T19:19:38Z

Its actually very easy to reproduce by putting your thumb down on the screen and lifting up without moving.

knopp · 2023-11-18T08:41:12Z

I think I finally have a clue about what is going on. I have a simple metal application with one displaylink on background thread (slightly modified example). Even with this fairly simple example, iPhone 13 Pro will sometimes not manage to put frame on glass in expected time. The interesting part is the behavior when this happens:

With presentsWithTransaction=YES such frame will be dropped by compositor. presentedTime inside presentedHandler will be 0. The frame will never see the light of the day. The interval between displaylink target and presentedTime is consistently 16ms.
With presentsWithTransaction=NO such frames are never dropped. When frame is not presented in time, the previous frame will be displayed for longer and then the new frame is displayed, increasing inverval between displaylink target and presentedTime to 25ms. After this, nextDrawable will take ~8ms to complete, because the frame display latency has increased. Eventually this will self correct by nextDrawable blocking the thread enough so that present can keep up and the interval between displaylink target and presentedTime goes back to 16ms.

I tried a simple experiment, every time presentedTime-targetTime exceeded 20ms, I would drop the next frame (simply not call present), to mimic the behavior of presentsWithTransaction=YES. And it worked. [nextDrawable] never blocks and the displayLink target - presentedTime keeps consistently at 16ms.

TLDR: presentsWithTransaction=YES drops frames. presentsWithTransaction=NO will never drop frame, but [nextDrawable] acts as a backpressure for renderer to eventually keep up with presentation.

If we move [nextDrawable] to separate thread we will lose the backpressure, possibly increasing the latency. Perhaps we could figure out when to drop the frame instead, so our behavior would be closer to presentsWithTransaction=YES.

knopp · 2023-11-18T11:39:12Z

To confirm my suspicion that without backpressure on raster thread and not dropping frames this PR will keep increasing presentation latency ([command_buffer commit] -> show on glass) indefinitely I tried logging it here. When running FlutterGallery Circular progress indicator, the latency keep steadily increasing, after few minutes it gets to hundreds of milliseconds:

...
Presentation latency 0.056334 (frame 5434)
...
Presentation latency 0.064422 (frame 7625)
...
Presentation latency 0.106325 (frame 12760)
...

Note that this is on A15. Maybe on faster devices more frames will be presented on time and the latency won't increase that quickly.

Here's what 200ms presentation latency after few minutes of animation looks like :)

latency.mov

knopp · 2023-11-18T15:45:39Z

Considering all of the above, here is a very quick prototype for frame dropping. It makes some assumptions about frame duration, but it seems to be working quite well.

Circular progress indicator example with frame dropping:

Same thing on main:

In this example, only one frame gets dropped at the beginning of transition from gallery menu to example. That is enough to remove the delay from [layer nextDrawable].

jonahwilliams · 2023-11-18T16:51:49Z

Really interesting results - I think we can keep the backpressure if it provides some mechanism to avoid getting out of sync. But the idea of dropping frames ourselves is absolutely terrifying to me. i.e. we drop the last frame of a particular animation we could leave the UI in an inconsistent state, right? I suspect that the presentation engine has more herustics than we do.

So if its the case that presentsWIthTransaction solves this problem by allowing the presentation engine to drop frames, then it sounds like we actually need to merge the raster thread onto the platform thread so we can safely present with transaction.

knopp · 2023-11-18T17:03:25Z

Really interesting results - I think we can keep the backpressure if it provides some mechanism to avoid getting out of sync. But the idea of dropping frames ourselves is absolutely terrifying to me. i.e. we drop the last frame of a particular animation we could leave the UI in an inconsistent state, right? I suspect that the presentation engine has more herustics than we do.

I don't think this is about heuristics. As far as I can tell, presentsWithTransaction is basically what we do on macOS - replace layer contents with IOSurface that backs the texture. If you replace the contents twice in one frame, only the later will be picked by the compositor.

As for dropping the last frame, if you look at my prototype branch, that should be a fixable problem.

So if its the case that presentsWIthTransaction solves this problem by allowing the presentation engine to drop frames, then it sounds like we actually need to merge the raster thread onto the platform thread so we can safely present with transaction.

I think that would be a last resort?

jonahwilliams · 2023-11-18T17:03:36Z

Alternatively we can investigate a frame dropping approach - it worries me greatly tbh, I don't know how we make that safe.

jonahwilliams · 2023-11-18T17:03:51Z

I've added back the backpressure mechanism to this PR.

knopp · 2023-11-18T17:10:34Z

Alternatively we can investigate a frame dropping approach - it worries me greatly tbh, I don't know how we make that safe.

It should worry you :) But that doesn't mean it's not worth giving it a shot.

jonahwilliams · 2023-11-18T17:24:23Z

lol fair. I am worried a lot in general these days so I'm probably being overly cautious. I think we should be able to tell if there is another frame in flight (since flutter will render with a pipeline depth of 2). That would be a signal that is safe to skip. I would need to think about the edge cases there.

jonahwilliams · 2023-12-18T17:48:03Z

Closing for now, I think @knopp 's changes will be the preferred path (once we do some more testing).

jonahwilliams added 2 commits November 16, 2023 12:38

[Impeller] defer drawable aquisition.

60d639a

++

07f8699

jonahwilliams mentioned this pull request Nov 16, 2023

[Impeller] remove drawable latency (and partial repaint). #47976

Closed

github-actions bot added the e: impeller label Nov 16, 2023

jonahwilliams commented Nov 16, 2023

View reviewed changes

impeller/renderer/backend/metal/surface_mtl.mm Outdated Show resolved Hide resolved

jonahwilliams mentioned this pull request Nov 16, 2023

[Impeller][iOS] Remove nextDrawable latency by deferring drawable acquisition. flutter/flutter#138490

Closed

jonahwilliams added 4 commits November 16, 2023 12:58

golf

1bc0729

++

f445d62

+

c270bbd

++

555de4b

jonahwilliams commented Nov 17, 2023

View reviewed changes

knopp mentioned this pull request Nov 18, 2023

[Impeller] Unstable frame rates on high frame rate iOS devices post 16.6. flutter/flutter#134959

Closed

add back backpressure.

ddee569

chinmaygarde assigned jonahwilliams Dec 2, 2023

jonahwilliams closed this Dec 18, 2023

[Impeller] Defer drawable aquisition until root pass encoding. #48141

[Impeller] Defer drawable aquisition until root pass encoding. #48141

Uh oh!

Conversation

jonahwilliams commented Nov 16, 2023

Uh oh!

Uh oh!

jonahwilliams Nov 17, 2023

Choose a reason for hiding this comment

Uh oh!

knopp commented Nov 17, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

knopp commented Nov 17, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

knopp commented Nov 17, 2023

Uh oh!

jonahwilliams commented Nov 17, 2023

Uh oh!

jonahwilliams commented Nov 17, 2023

Uh oh!

knopp commented Nov 17, 2023

Uh oh!

jonahwilliams commented Nov 17, 2023

Uh oh!

knopp commented Nov 17, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jonahwilliams commented Nov 17, 2023

Uh oh!

jonahwilliams commented Nov 17, 2023

Uh oh!

knopp commented Nov 17, 2023

Uh oh!

knopp commented Nov 17, 2023

Uh oh!

jonahwilliams commented Nov 17, 2023

Uh oh!

jonahwilliams commented Nov 17, 2023

Uh oh!

jonahwilliams commented Nov 17, 2023

Uh oh!

jonahwilliams commented Nov 17, 2023

Uh oh!

jonahwilliams commented Nov 17, 2023

Uh oh!

jonahwilliams commented Nov 17, 2023

Uh oh!

knopp commented Nov 18, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

knopp commented Nov 18, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

knopp commented Nov 18, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jonahwilliams commented Nov 18, 2023

Uh oh!

knopp commented Nov 18, 2023

Uh oh!

jonahwilliams commented Nov 18, 2023

Uh oh!

jonahwilliams commented Nov 18, 2023

Uh oh!

knopp commented Nov 18, 2023

Uh oh!

jonahwilliams commented Nov 18, 2023

Uh oh!

jonahwilliams commented Dec 18, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

knopp commented Nov 17, 2023 •

edited

Loading

knopp commented Nov 17, 2023 •

edited

Loading

knopp commented Nov 17, 2023 •

edited

Loading

knopp commented Nov 18, 2023 •

edited

Loading

knopp commented Nov 18, 2023 •

edited

Loading

knopp commented Nov 18, 2023 •

edited

Loading