[rfc][skip-ci] Screenshot Mode Service #59084
[rfc][skip-ci] Screenshot Mode Service #59084tsullivan wants to merge 17 commits intoelastic:masterfrom
Conversation
07fb556 to
06ab40e
Compare
06ab40e to
cfb9e91
Compare
💚 Build SucceededHistory
To update your PR or re-run it, just comment with: |
|
I think this likely makes sense but I think the RFC assumes some knowledge of how reporting currently works and is used. What is the lifecycle of creating a pdf report? Start kibana, load url, save pdf, and exit? It seems to me there are two benefits to this - simply hiding a UI from rendering and preventing unnecessary (and slow) work from occurring. Oh, there's a third, the popup case - where you need different behavior in the reporting context. Its my understanding that the UI has some slow bits (such as the news feed) that could be easily disabled. Past that, I'd be curious to see some performance benchmarks to see where most of the time is being spent creating a report. |
|
On Mar 3, 2020, at 8:06 AM, Matthew Kime ***@***.***> wrote:
I think this likely makes sense but I think the RFC assumes some knowledge of how reporting currently works and is used. What is the lifecycle of creating a pdf report? Start kibana, load url, save pdf, and exit?
I think these really are important points that would help all Kibana devs understand why we want to do this, what we’re solving and the value it provides.
Along with adding a quick summary of the screenshot capture flow, do you think it would help if we had a separate issue taking about the current state of performance with capturing a page? We can get data from Kibana and Puppeteer verbose logs and get some event timings, but that will take some time to put together.
|
|
@tsullivan That sounds good. I know those suggestions can mean more work so I leave it in your hands as to decide if and how you want to go about these things. |
|
@elasticmachine merge upstream |
|
|
||
| There are a few ways for the service to obtain the data for its internal state. | ||
| The most typical way would be to have a URL query string variable with the | ||
| raw data and have the service read it when an API client is constructed. The |
There was a problem hiding this comment.
By "API client", are you referring to an instance of ScreenshotModeServiceApi described below?
There was a problem hiding this comment.
Pretty much. I reworded this paragraph: c33de1b
Instead of "when the API client is constructed" I think I mean "in the setup method of the plugin" where it reads the URL and forms its interface data.
…to rfcs/screenshot_mode
brandon's grammer fix Co-Authored-By: Brandon Kobel <brandon.kobel@gmail.com>
Co-Authored-By: Brandon Kobel <brandon.kobel@gmail.com>
Co-Authored-By: Brandon Kobel <brandon.kobel@gmail.com>
Co-Authored-By: Brandon Kobel <brandon.kobel@gmail.com>
…to rfcs/screenshot_mode
| features. The solution to those downsides is to have Kibana pages themselves | ||
| become more capable at presenting themselves for screenshot capture reports. | ||
| With the Screenshot Mode Service available, Reporting could drop the | ||
| task that it currently has which hurt performance: wasted rendering that is |
There was a problem hiding this comment.
According to your investigation.
there are 2 steps being performance bottlenecks in the PDF rendering:
open_url. which is hurt by bloated artifacts size.get_number_of_items. I don't think we spend several seconds accessing DOM. Do we wait for data to be fetched on this step?
How would the current proposal improve them?
There was a problem hiding this comment.
- To improve the performance of
open_url, we can have the applications make fewer requests on page load by disabling the interactive features that aren't needed for capturing screenshots. - To improve the performance of
get_number_of_items(andwait_for_render), we'll look for ways to streamline the UX of the applications for screenshot capture mode to get them to be in a "ready" state sooner.
There was a problem hiding this comment.
I don't think we spend several seconds accessing DOM. Do we wait for data to be fetched on this step?
We're waiting for the application to inject the meta data into the DOM, so the time we're spending here is waiting for the DOM selectors to yield a result.
| screenshotRouter.get( | ||
| { path: '/my-endpoint' }, | ||
| async (context, req, res) => { | ||
| const isScreenshotMode = context.screenshots.isScreenshotMode(); |
There was a problem hiding this comment.
What is the expected result of this route handler? A fully rendered HTML page?
There was a problem hiding this comment.
Perhaps, but the example is generic to show a way that the service can be used in a route.
If the route serves rendered HTML and the HTML can be streamlined for screenshot capture, then we can integrate it in the route handler and customize the payload for screenshot capture.
If the route serves JSON, and the JSON needs to be customized for screenshot capture, we could do the same thing. I'm not sure if we should, but it may be a solution for bugs such as #17950. That issue says: when the data table is captured as an image, there needs to be the complete set of rows shown in the table because the user can't paginate or scroll the table in a rasterized image. If the data table's rows were populated by XHR, I would think about customizing the response to show all the rows of the table to make for the complete data capture.
| - Print media query CSS | ||
| Many things can be improved on capturing screenshots from the page using | ||
| `@print` CSS media query selecting. Also, CSS makes things hidden on the | ||
| page, but they are still loading in the DOM and take browser memory |
There was a problem hiding this comment.
Is it a really problem? AFAIK you can command a browser not to load styles using media attribute:
<link href="print.css" rel="stylesheet" media="print">There was a problem hiding this comment.
The reason we will use print media query CSS as a supplement is because we can't address page loading performance using just this solution. We need faster page loading performance than we have today to deliver the reports in a timely manner.
There was a problem hiding this comment.
How can I better understand the performance issues we face here with Reporting where media queries aren't sufficient? As I understand it, any performance issues we have in Reporting are also issues every-day users experience (bundle size, etc). I am concerned that the RFC suggestion doesn't address the underlying issue, but would like to understand further.
There was a problem hiding this comment.
My understanding is also that a lot of the problems happening in Reporting are issues every-day users face. I doubt that we can do anything about bundle sizes, even if this service existed. But I think we can find ways to decrease memory usage for non-interactive experiences. It would be wonderful if this helps reports run faster, but we're also still very concerned about stability and we think that streamlining where we can will help more reports complete without Chromium running out of memory.
Having this service in place would give us leverage to streamline more things. The background polling requests seem to be the primary negative factor on Reporting performance, in causing higher memory consumption. We're working with the idea that the plugins that do background polling can be adjusted for screenshot capture, where only the initial data should be needed.
We can't quantify how much of a hit that it's taking on performance because we don't have the tools to isolate that. The Screenshot Mode Service would be the tool that would let us tweak the screenshot capture experience. Being able to flip some switches would help us understand more about what is impacting performance, while we measure things happening with APM.
| plugin to support custom endpoints include the screenshot data as part of the | ||
| route handling context. Example: | ||
| ``` | ||
| const screenshotRouter = plugins.screenshots.wrapRouter(router); |
There was a problem hiding this comment.
What if we need to implement another mode? Dashboard mode, for example? Should plugin be aware of all of them?
There was a problem hiding this comment.
I can't think of a reason why not.
|
|
||
| ``` | ||
| http://localhost:5601/app/kibana | ||
| ?screenshots=(enabled:!t,layout:(height:500,width:400),timezone:PST) |
There was a problem hiding this comment.
who is going to parse this query in OSS mode?
There was a problem hiding this comment.
The Screenshot Mode Service will be a plugin in OSS Kibana.
| against the toast timeout window. | ||
| - Avoid collection and sending of telemetry from the browser when page is | ||
| loaded for screenshot capture. | ||
| - Turn off autocomplete features and auto-refresh features that weigh on |
There was a problem hiding this comment.
Can it be disabled by the Screenshot Mode Service explicitly via ui settings or plugin API?
There was a problem hiding this comment.
The plan for what the Screenshot Mode Service will have the applications and plugins turn off their own features as needed for more optimal screen capture. We're purposefully not having the service do anything automatically - I'm not sure that would even be possible as the service is a plugin that applications have to explicitly integrate with.
|
Looks like I forgot to put this RFC into a final comment period, but this week was the final comment period. I'm going to close this without merging and collaborate more with the Reporting team on a new RFC to replace this one. After some internal discussions, we found what we need the most is a way to distribute the application-specific rendering concerns away from the Reporting codebase and into the respective applications ownership. That notion is partially here, but it's obscured by performance topics. The performance problems are all over Kibana right now, and there are plans in New Platform to speed up page load time. |
Summary
RFC Process: https://github.com/elastic/kibana/tree/master/rfcs