Skip to content
This repository was archived by the owner on Feb 25, 2025. It is now read-only.

Conversation

@jonahwilliams
Copy link
Contributor

WIP. replaces #47976

jonahwilliams added 4 commits November 16, 2023 12:58
.GetMTLTexture();
}

// Render command encoder creation has been observed to exceed the stack
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: need to make this a regular NSThread so I can increase the stack size.

@knopp
Copy link
Member

knopp commented Nov 17, 2023

I played with this branch a bit on iPhone 13 PRO, 120hz. Haven't gotten to partial repaint yet.

it is probably a good thing to not block raster thread when waiting for drawable. That said, I measured how much it takes to obtain the drawable in GetDrawableDeferred and it still oscilates between ~0.02ms and 6-8ms. Testing it with Flutter gallery, I did notice a pattern:

For Flutter driven animations, such as activity indicator, [layer nextDrawable] consistently takes 6-8ms. For user interaction repaints, like scrolling (while dragging)[layer nextDrawable] consistently takes 0.02ms.

It gets messier with shorter animations started from user interaction, like ballistic or material spill. Less consistency there. Can't shake off the feeling that we have a timing issue somewhere.

@knopp
Copy link
Member

knopp commented Nov 17, 2023

After disabling _allowPauseAfterVsync inside VsyncWaiterIOS, [layer nextDrawable] during progress indicator animation now either takes 7ms or 0.02ms (more often), this number is consistent until the animation finishes.

@knopp
Copy link
Member

knopp commented Nov 17, 2023

Well, if there is a timing issue on our side I can't seem to find what it is. Also apple official example seems to have same problem - when rendering on background thread, [layer nextDrawable] takes 1-7ms on 120hz device. On background thread with presentsWithTransaction this goes down to sub 1ms, but of course that brings its own can of worms.

So I do have a bit of concern about this PR - it removes the lag from raster thread, which is good, but given that [layer nextDrawable] can still block for unreasonable amount of time, is it not possible that we'll be dropping frames without being aware of it (not showing on timeline).

@jonahwilliams
Copy link
Contributor Author

presentsWithTransaction from the background thread isn't valid, we went down that path before: flutter/flutter#133147

@jonahwilliams
Copy link
Contributor Author

Also I don't see how we would drop frames. If you try and run an application with very slow rendering (say dozens of blurs), then we still get backpressure.

@knopp
Copy link
Member

knopp commented Nov 17, 2023

Also I don't see how we would drop frames. If you try and run an application with very slow rendering (say dozens of blurs), then we still get backpressure.

Even with this PR sometimes [layer nextDrawable] on the background thread takes 7-9 ms, would this not drop frame?

@jonahwilliams
Copy link
Contributor Author

If we're blocking on aquiring the drawable we're already commited to that frame, I don't see how you would drop it?

@knopp
Copy link
Member

knopp commented Nov 17, 2023

If we're blocking on aquiring the drawable we're already commited to that frame, I don't see how you would drop it?

Not me, the compositor - If presentDrawable is called too late and we miss the deadline and then replace it with another frame, compositor will drop it. (But maybe that's not so straightforward with promotion 4ms resolution)

When playing with this branch I noticed that when this happens I get display link callbacks while [layer nextDrawable] is in progress, in which case I definitely don't think the drawable is presented in time. But also when this happens, the 5-7ms nextDrawable delay becomes consistent, so everything appears smooth, with just some additional latency. But still I'd love to know what exactly is triggering this.

@jonahwilliams
Copy link
Contributor Author

Not me, the compositor - If presentDrawable is called too late and we miss the deadline and then replace it with another frame, compositor will drop it. (But maybe that's not so straightforward with promotion 4ms resolution)

but we're not dropping that presentation. its going to happen at the next vsync interval. There is no indication I've seen that a drawable has a specific presentation time it must be submited by or it gets dropped, especially if there is no CA transaction associated with the presentation.

@jonahwilliams
Copy link
Contributor Author

And again, why would this be worse than blocking for 7-8ms before we do any raster work?

@knopp
Copy link
Member

knopp commented Nov 17, 2023

but we're not dropping that presentation. its going to happen at the next vsync interval. There is no indication I've seen that a drawable has a specific presentation time it must be submited by or it gets dropped, especially if there is no CA transaction associated with the presentation.

Right. But we don't know that we missed the deadline. So there's already another frame being rasterized and if that happens fast enough it will be presented before the end of next vsync. Thus the previous frame will dropped by the compositor.

                      [vsync]                         [vsync]
      Frame N            |            Frame N+1          |
                         |                               |
|raster thread|          |  |raster thread|              |
     |====next_drawable====|      |next_drawable|        | 
                         | |present|            |present||

When this happens it will not be shown as jank on the timeline, but it will seem janky on device.

And again, why would this be worse than blocking for 7-8ms before we do any raster work?

It's definitely not worse.

@knopp
Copy link
Member

knopp commented Nov 17, 2023

I don't want to sound too negative. Unless we can figure out what causes the delay (not holding my breath) this is a clear improvement.

@jonahwilliams
Copy link
Contributor Author

The scenario you are describing is already technically possible today. We can get up to a pipeline depth of two, which means that from the engine's perspective of a vsync interval, double submission is entirely possible.

This was my working theory for a while, but I can't find any evidence that this is actually happening on the system side. drawables have a presentation time attribute that we could intrument to determine if any are getting skipped, I can try that today.

I don't want to sound too negative.

I don't think you're being too negative. its important to be critical on changes like this, especially those that rely on APple's under documented functionality.

@jonahwilliams
Copy link
Contributor Author

Here is how I am testing this. Using an app that just has a linear progress indicator indeterminatly spinning.

The documentation for addPresentedHandler is unclear as to whether or not it fires if a drawable was skipped. It does note that drawable.presentedTime will be 0 if the drawable is skipped, which makes me thing it will. To ensure this is the case, I add a static atomic int, increment it for acuqired drawables and decrement it when the present handler fires. If we were skipping drawables and the handler fires, I'll see a print. If skipping drawables doesn't cause the handler to fire, then the number will increment over time.

I don't have any expectations for what the value is itself since the docs dont say when the presented handler is fired, but I'd suspect that 3 or less would be reasonable.

diff --git a/impeller/renderer/backend/metal/surface_mtl.mm b/impeller/renderer/backend/metal/surface_mtl.mm
index 4a49692c5f..5c9d502725 100644
--- a/impeller/renderer/backend/metal/surface_mtl.mm
+++ b/impeller/renderer/backend/metal/surface_mtl.mm
@@ -3,6 +3,7 @@
 // found in the LICENSE file.
 
 #include "impeller/renderer/backend/metal/surface_mtl.h"
+#include <atomic>
 #include <optional>
 
 #include "flutter/fml/trace_event.h"
@@ -18,6 +19,8 @@
 
 namespace impeller {
 
+static std::atomic_int64_t present_count_ = 0;
+
 #pragma GCC diagnostic push
 #pragma GCC diagnostic ignored "-Wunguarded-availability-new"
 
@@ -307,8 +310,17 @@ bool SurfaceMTL::Present() const {
       [drawable present];
     } else {
       auto deferred_drawable = deferred_drawable_;
+      present_count_++;
       [command_buffer addScheduledHandler:^(id<MTLCommandBuffer> _Nonnull) {
         auto drawable = deferred_drawable->get();
+        [drawable addPresentedHandler:^(id<MTLDrawable> draw){
+          present_count_--;
+          // Returns 0 if a frame has not been presented or has been skipped.
+          if (draw.presentedTime == 0) {
+            FML_LOG(ERROR) << "SKIPPED";
+          }
+          FML_LOG(ERROR) << present_count_;
+        }];
         [drawable present];
       }];
       [command_buffer commit];

If I run this then I observe the number very slowly creeping up from 2 to 3. If I let it run for multiple minutes that doesn't seem to change. If we were regularly dropping presentation I would expect to see this number climb quickly. If it happened only rarely then it would climb slowly. It seems consistent.

If I background the app, then I get a bunch of. "SKIPPED", I suspect because iOS pauses rendering. THe callback seems to fire for every cmd buffer.

If I foreground it again, it goes right back to 2-3.

@jonahwilliams
Copy link
Contributor Author

Trying a different app like https://github.com/gskinnerTeam/flutter-wonderous-app/ , I notice a similar behavior expect the count gets higher. Oddly enough I can consistently generate a skipped frame on the 2nd or 3rd frame after I start a slow gesture to swipe between the tabs.

I'm going to try blocking on the cmd buffer to see if that reproduces in all cases or is specific to using the worker thread.

@jonahwilliams
Copy link
Contributor Author

Even if I block on drawable aquisition on the raster thread instead of in the worker like I'm doing here, I still see iOS skipping drawables when I initiaiate a gesture on the wonderous screen.

[ERROR:flutter/impeller/renderer/backend/metal/surface_mtl.mm(321)] SKIPPED
[ERROR:flutter/impeller/renderer/backend/metal/surface_mtl.mm(323)] 0
[ERROR:flutter/impeller/renderer/backend/metal/surface_mtl.mm(323)] 2
[ERROR:flutter/impeller/renderer/backend/metal/surface_mtl.mm(323)] 3
[ERROR:flutter/impeller/renderer/backend/metal/surface_mtl.mm(323)] 3

This is with waitForNextDrawable on raster thread blocking, but at end of raster workload:

image

@jonahwilliams
Copy link
Contributor Author

If this was because we use present and not presentDrawable:atTime: or presentDrawable:afterMinimumDuration: I wouldn't expect to see this on the first few frames of a gesture.

@jonahwilliams
Copy link
Contributor Author

Its actually very easy to reproduce by putting your thumb down on the screen and lifting up without moving.

@knopp
Copy link
Member

knopp commented Nov 18, 2023

I think I finally have a clue about what is going on. I have a simple metal application with one displaylink on background thread (slightly modified example). Even with this fairly simple example, iPhone 13 Pro will sometimes not manage to put frame on glass in expected time. The interesting part is the behavior when this happens:

  1. With presentsWithTransaction=YES such frame will be dropped by compositor. presentedTime inside presentedHandler will be 0. The frame will never see the light of the day. The interval between displaylink target and presentedTime is consistently 16ms.

  2. With presentsWithTransaction=NO such frames are never dropped. When frame is not presented in time, the previous frame will be displayed for longer and then the new frame is displayed, increasing inverval between displaylink target and presentedTime to 25ms. After this, nextDrawable will take ~8ms to complete, because the frame display latency has increased. Eventually this will self correct by nextDrawable blocking the thread enough so that present can keep up and the interval between displaylink target and presentedTime goes back to 16ms.

I tried a simple experiment, every time presentedTime-targetTime exceeded 20ms, I would drop the next frame (simply not call present), to mimic the behavior of presentsWithTransaction=YES. And it worked. [nextDrawable] never blocks and the displayLink target - presentedTime keeps consistently at 16ms.

TLDR: presentsWithTransaction=YES drops frames. presentsWithTransaction=NO will never drop frame, but [nextDrawable] acts as a backpressure for renderer to eventually keep up with presentation.

If we move [nextDrawable] to separate thread we will lose the backpressure, possibly increasing the latency. Perhaps we could figure out when to drop the frame instead, so our behavior would be closer to presentsWithTransaction=YES.

@knopp
Copy link
Member

knopp commented Nov 18, 2023

To confirm my suspicion that without backpressure on raster thread and not dropping frames this PR will keep increasing presentation latency ([command_buffer commit] -> show on glass) indefinitely I tried logging it here. When running FlutterGallery Circular progress indicator, the latency keep steadily increasing, after few minutes it gets to hundreds of milliseconds:

...
Presentation latency 0.056334 (frame 5434)
...
Presentation latency 0.064422 (frame 7625)
...
Presentation latency 0.106325 (frame 12760)
...

Note that this is on A15. Maybe on faster devices more frames will be presented on time and the latency won't increase that quickly.

Here's what 200ms presentation latency after few minutes of animation looks like :)

latency.mov

@knopp
Copy link
Member

knopp commented Nov 18, 2023

Considering all of the above, here is a very quick prototype for frame dropping. It makes some assumptions about frame duration, but it seems to be working quite well.

Circular progress indicator example with frame dropping:

progress-indicator-frame-drop

Same thing on main:

progress-indicator-main

In this example, only one frame gets dropped at the beginning of transition from gallery menu to example. That is enough to remove the delay from [layer nextDrawable].

@jonahwilliams
Copy link
Contributor Author

Really interesting results - I think we can keep the backpressure if it provides some mechanism to avoid getting out of sync. But the idea of dropping frames ourselves is absolutely terrifying to me. i.e. we drop the last frame of a particular animation we could leave the UI in an inconsistent state, right? I suspect that the presentation engine has more herustics than we do.

So if its the case that presentsWIthTransaction solves this problem by allowing the presentation engine to drop frames, then it sounds like we actually need to merge the raster thread onto the platform thread so we can safely present with transaction.

@knopp
Copy link
Member

knopp commented Nov 18, 2023

Really interesting results - I think we can keep the backpressure if it provides some mechanism to avoid getting out of sync. But the idea of dropping frames ourselves is absolutely terrifying to me. i.e. we drop the last frame of a particular animation we could leave the UI in an inconsistent state, right? I suspect that the presentation engine has more herustics than we do.

I don't think this is about heuristics. As far as I can tell, presentsWithTransaction is basically what we do on macOS - replace layer contents with IOSurface that backs the texture. If you replace the contents twice in one frame, only the later will be picked by the compositor.

As for dropping the last frame, if you look at my prototype branch, that should be a fixable problem.

So if its the case that presentsWIthTransaction solves this problem by allowing the presentation engine to drop frames, then it sounds like we actually need to merge the raster thread onto the platform thread so we can safely present with transaction.

I think that would be a last resort?

@jonahwilliams
Copy link
Contributor Author

Alternatively we can investigate a frame dropping approach - it worries me greatly tbh, I don't know how we make that safe.

@jonahwilliams
Copy link
Contributor Author

I've added back the backpressure mechanism to this PR.

@knopp
Copy link
Member

knopp commented Nov 18, 2023

Alternatively we can investigate a frame dropping approach - it worries me greatly tbh, I don't know how we make that safe.

It should worry you :) But that doesn't mean it's not worth giving it a shot.

@jonahwilliams
Copy link
Contributor Author

lol fair. I am worried a lot in general these days so I'm probably being overly cautious. I think we should be able to tell if there is another frame in flight (since flutter will render with a pipeline depth of 2). That would be a signal that is safe to skip. I would need to think about the edge cases there.

@jonahwilliams
Copy link
Contributor Author

Closing for now, I think @knopp 's changes will be the preferred path (once we do some more testing).

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

No open projects
Archived in project

Development

Successfully merging this pull request may close these issues.

2 participants