Skip to content
This repository was archived by the owner on Feb 25, 2025. It is now read-only.

Increase thread priority before engine init#20922

Merged
cbracken merged 3 commits intoflutter:masterfrom
linxuebin1990:patch-1
Sep 9, 2020
Merged

Increase thread priority before engine init#20922
cbracken merged 3 commits intoflutter:masterfrom
linxuebin1990:patch-1

Conversation

@linxuebin1990
Copy link
Member

Description

We get huge ANRs in not pure flutter app.
We found that 0.15% of the engine initialization took more than 10 seconds, every 10 million starts.

Fellow code show, engine init in ui thread. So Increase thread priority before engine init can reduce time cost

// shell/common/shell.cc
fml::TaskRunner::RunNowOrPostTask(
      shell->GetTaskRunners().GetUITaskRunner(),
      fml::MakeCopyable([&engine_promise,                                 //
                         shell = shell.get(),                             //
                         &dispatcher_maker,                               //
                         &platform_data,                                  //
                         isolate_snapshot = std::move(isolate_snapshot),  //
                         vsync_waiter = std::move(vsync_waiter),          //
                         &weak_io_manager_future,                         //
                         &snapshot_delegate_future,                       //
                         &unref_queue_future                              //
  ]() mutable {
        TRACE_EVENT0("flutter", "ShellSetupUISubsystem");
        const auto& task_runners = shell->GetTaskRunners();

        // The animator is owned by the UI thread but it gets its vsync pulses
        // from the platform.
        auto animator = std::make_unique<Animator>(*shell, task_runners,
                                                   std::move(vsync_waiter));

        engine_promise.set_value(std::make_unique<Engine>(
            *shell,                         //
            dispatcher_maker,               //
            *shell->GetDartVM(),            //
            std::move(isolate_snapshot),    //
            task_runners,                   //
            platform_data,                  //
            shell->GetSettings(),           //
            std::move(animator),            //
            weak_io_manager_future.get(),   //
            unref_queue_future.get(),       //
            snapshot_delegate_future.get()  //
            ));
      }));

Tests

Checklist

Before you create this PR confirm that it meets all requirements listed below by checking the relevant checkboxes ([x]). This will ensure a smooth and quick review process.

  • I read the contributor guide and followed the process outlined there for submitting PRs.
  • I signed the CLA.
  • I read and followed the C++, Objective-C, Java style guides for the engine.
  • I read the tree hygiene wiki page, which explains my responsibilities.
  • I updated/added relevant documentation.
  • All existing and new tests are passing.
  • I am willing to follow-up on review comments in a timely manner.

Breaking Change

Did any tests fail when you ran them? Please read handling breaking changes.

We get huge ANRs in not pure flutter app.
We found that 0.15% of the engine initialization took more than 10 seconds, every 10 million starts.
@flutter-dashboard
Copy link

It looks like this pull request may not have tests. Please make sure to add tests before merging. If you need an exemption to this rule, contact Hixie on the #hackers channel in Chat.

Reviewers: Read the Tree Hygiene page and make sure this patch meets those guidelines before LGTMing.

@auto-assign auto-assign bot requested a review from gw280 September 1, 2020 13:25
@linxuebin1990
Copy link
Member Author

@liyuqian cc ?

change shell->shell_->GetTaskRunners() to task_runners
@liyuqian liyuqian added perf: speed Performance issues related to (mostly rendering) speed severe: performance Relates to speed or footprint issues. labels Sep 2, 2020
@liyuqian liyuqian requested review from chinmaygarde, jason-simmons, liyuqian and zanderso and removed request for gw280 September 2, 2020 18:06
@liyuqian
Copy link
Contributor

liyuqian commented Sep 2, 2020

The change looks reasonable. One challenging question is how to test and verify it. The expected improvement seems to be making 0.15% of engine initialization faster than 10 seconds, as previously observed by 10 million app startups in add-to-app scenarios on mobile devices. It is probably beyond our devicelab's capability to capture that signal reliably on mobile devices.

I wonder if we can simulate this in host machines with shell_benchmarks. Currently, it has a ShellInitializationAndShutdown benchmark which takes about 20ms for 1 iteration. Could we capture the improvement by running ShellInitializationAndShutdown for 1000 iterations and record the worst time? Could we spin num_cpu - 4 busy threads to artificially create some thread priority competition so we can observe this improvement more easily?

revert is_valid_ value
@linxuebin1990
Copy link
Member Author

The change looks reasonable. One challenging question is how to test and verify it. The expected improvement seems to be making 0.15% of engine initialization faster than 10 seconds, as previously observed by 10 million app startups in add-to-app scenarios on mobile devices. It is probably beyond our devicelab's capability to capture that signal reliably on mobile devices.

I wonder if we can simulate this in host machines with shell_benchmarks. Currently, it has a ShellInitializationAndShutdown benchmark which takes about 20ms for 1 iteration. Could we capture the improvement by running ShellInitializationAndShutdown for 1000 iterations and record the worst time? Could we spin num_cpu - 4 busy threads to artificially create some thread priority competition so we can observe this improvement more easily?

OK, I will do measure it by ShellInitializationAndShutdown .

@linxuebin1990
Copy link
Member Author

The change looks reasonable. One challenging question is how to test and verify it. The expected improvement seems to be making 0.15% of engine initialization faster than 10 seconds, as previously observed by 10 million app startups in add-to-app scenarios on mobile devices. It is probably beyond our devicelab's capability to capture that signal reliably on mobile devices.

I wonder if we can simulate this in host machines with shell_benchmarks. Currently, it has a ShellInitializationAndShutdown benchmark which takes about 20ms for 1 iteration. Could we capture the improvement by running ShellInitializationAndShutdown for 1000 iterations and record the worst time? Could we spin num_cpu - 4 busy threads to artificially create some thread priority competition so we can observe this improvement more easily?

We will verify this modification on Toutiao or Douyin in the near future.

Copy link
Contributor

@chinmaygarde chinmaygarde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this is a fine low-risk change and beyond the capabilities of our current benchmark harnesses to measure.

Copy link
Contributor

@liyuqian liyuqian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rerunning the Mac Host Engine test to see if it's just flaky. Agree that this is low-risk so landing without a test is reasonable. Of course, a test will always be appreciated as it can prevent accidental regressions, or capture similar performance issues caused by threads contention.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

cla: yes perf: speed Performance issues related to (mostly rendering) speed platform-android severe: performance Relates to speed or footprint issues.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants