♻️ Refactor core discovery queue by wwilsman · Pull Request #369 · percy/cli

wwilsman · 2021-06-16T00:26:22Z

What is this?

While the current queue system in @percy/core is better than the lazy promise system from @percy/agent, there are still a few performance issues. Namely, multiple browser pages can be opened at the same time for a single snapshot. So while our queue concurrency is set to a default of 5, the 2 widths per snapshot default causes 2 pages to be open per snapshot. Resulting in our true snapshot concurrency being severely limited by the number of widths.

Additionally, when using the capture() method to take DOM snapshots, another browser page is opened to take said snapshot before asset discovery opens pages for all of the widths. While page captures have their own concurrency, this can result in up to 10+ browser pages being opened and closed all at once across the process when capturing more than 5 snapshots.

Finally, while captures and snapshots were managed by their own individual queues, resulting API uploads were not queued. If the capture/snapshot queue drains faster than a few requests can resolve, this could then result in many many in-flight requests which could then clog the process and cause timeouts.

Snapshot, capture, and discovery changes

Rather than open multiple browser pages for multiple widths to discover responsive assets in parallel, we can trigger page resize events sequentially to limit the number of pages that need to be open. This actually results in faster responsive asset discovery for initial resource requests because common requests are no longer duplicated across all widths. While testing, I discovered that in some scenarios the browser will not request smaller assets when a larger one is already loaded. Because of this, widths are sorted with the initial page loading at the smallest width and then resizing to each larger width.

For capturing DOM snapshots with the capture() method, we can eliminate the extraneous browser page by decoupling the snapshot() method and discovering assets simultaneously while capturing the DOM. To achieve this, the discoverer class was broken down into functional methods that can be used by the browser page network when enabling request interception. With this change, the discovery asset cache was moved into the main Percy class and manipulating that cache is now done via request interception resource callbacks.

Now that the capture() method was isolated from the snapshot() method, the logic between them became very similar:

snapshot() {
  // open a page with request interception enabled
  // navigate to a URL and capture assets
  // serve DOM snapshot as root resource
  // schedule snapshot upload
}

capture() {
  // open a page with request interception enabled
  // navigate to a URL and capture assets
  // capture DOM snapshot as root resource
  // schedule snapshot upload
}

Since they were now so similar, these two methods could be merged into a single snapshot() method with one important piece of logic:

// -> if a DOM snapshot was provided, serve it
// -> if no DOM snapshot was provided, capture it

The capture portion of the code remained largely similar to before, but with other asset discovery code now present alongside the capture code, the method became a bit too complex. To alleviate the complexity, capturing of the DOM snapshot was moved into a dedicated page.snapshot() method. The various wait* options were also moved to this method away from the page.goto() method, since those options are only used when capturing a DOM snapshot.

With the percy.snapshot() method now accepting capture options, a snapshot config util was created to validate and merge snapshot options. Part of this validation is ensuring that capture options are not provided along with a static DOM snapshot. The snapshots capture option was also renamed to be additionalSnapshots to avoid confusion with the method name.

Queue changes

Now that captures and snapshots share a method and the discovery process has been broken down, we can combine all three separate queues into a single snapshot queue.

To manage uploads and prevent too many open requests happening at once, an upload queue was also added. The snapshot method will now asynchronously schedule uploads using this queue. This upload queue can also be optionally deferred which allows snapshots to be replaced prior to being sent to Percy. Since uploads can be deferred, it also makes sense to defer build creation along with snapshot uploads. In the future, we can expose this option to allow snapshot retries, or to avoid creating builds when the test suite fails (this would also mean Percy would start at the end of a test run rather than process snapshots as soon as possible).

To facilitate pausing, flushing, removing, and replacing queued tasks, the internal queue class was updated to manage tasks by id. Tasks can also be run according to a priority, which is utilized by the build creation task to ensure it always runs first in the queue. The build creation task also pauses the queue until the build has been created to ensure the build exists before attempting to upload snapshots. Upon build creation failure, the queue is cleared and closed and snapshots will no longer be accepted (the snapshot method will bubble a queue closed error).

Other changes

When the API responds with a 422 build failed error, the queue will also be cleared and closed after logging the error. The @percy/sdk-utils package was updated to catch closed errors and disable future snapshots.

While @percy/client kept build state and mapped one-to-one with a build, build failures are only really tied to closing core queues and preventing future snapshot processing. The client instance does not need to know about this particular state, and other build fetching methods also do not care about state and are not tied to a single build. Therefore it made sense to remove this one-to-one build requirement from client completely. However, this means internal breaking changes to client — now all client methods relating to a build will require a build ID and assertions were added so useful errors are thrown when forgetting to provide one.

With the major parts of core shifting around, other changes were also made to that package. Its src file tree was flattened, util methods were combined, and tests were reorganized to match the new code structure. With both core and client breaking changes, all internal usage amongst the CLI was also updated.

For deferred snapshots, the core stop method was updated to not immediately clear any queued snapshots. It can however, still be told to clear them when the new force argument is true. Caught errors in the CLI are passed along to the command's respective finally() method, which we can leverage to force close the percy instance upon fatal errors.

Various logs were also updated. Redundant start and stop logs, as well as build creation logs were removed; the latter due to potentially being deferred. Logs were also added to client, since we've now long removed Winston from our dependency tree.

🔊 Add debug logs

The queue has been refactored to accept ids for tasks and allow pausing & flushing. This allows us to add a deferUploads option which, when enabled, will wait until the process is stopped before creating a build and uploading all snapshots. While deferred, duplicate snapshots are replaced in the queue which allows SDKs to retry snapshots without duplicate name errors from the upstream API. The discoverer class has been converted to be functional helpers that the network class can use to enable request interception. This allows the capture method to intercept resources while taking a DOM snapshot. With this addition, the capture method no longer requires the snapshot method, however the two methods then became so similar, that it made sense move the capture functionality to the snapshot method entirely. With the snapshot method now accepting capture options, a per-snapshot config util was created to validate and merge all snapshot options. Part of this validation is ensuring that capture options are not provided along with a static dom snapshot. The snapshots capture option was also renamed to be additionalSnapshots to avoid confusion with the snapshot method. With the major parts of this library shifting around, other changes were also made in occordance. The src file tree was flattened, util methods were combined, and tests were reorganized to match the new code structure.

- Caught errors in the base command are passed along to finally, mirroring oclif behavior - Commands will force close the percy instance when there is an error - The snapshot command was updated to use the snapshot method - The upload command was updated to reflect client changes - Test logs were updated to match new logs

Seems something changed out of our control to cause this value not to be serialized. Ensuring it is serialized fixes the flakey/failing test for browsers.

Robdel12 · 2021-06-16T15:19:56Z

This upload queue can also be optionally deferred which allows snapshots to be replaced prior to being sent to Percy. Since uploads can be deferred, it also makes sense to defer build creation along with snapshot uploads. In the future, we can expose this option to allow snapshot retries, or to avoid creating builds when the test suite fails (this would also mean Percy would start at the end of a test run rather than process snapshots as soon as possible).

🔥 🔥

Robdel12 · 2021-06-16T16:08:21Z

packages/client/src/request.js

    connectMessage = connectMessage.join(CRLF);
    connectMessage += CRLF.repeat(2);

+    logger('client:proxy').debug(`Proxying request: ${options.href}`);


Robdel12

Amazing work here! 👏🏼 🔥 A few comments but nothing blocking or anything. Excited to ship this! 🏁

Robdel12 · 2021-06-16T21:38:53Z

packages/cli-command/src/command.js


    // ensure cleanup is always performed
-    let cleanup = () => this.finally();
+    let cleanup = () => this.finally(new Error('SIGTERM'));


Just curious -- what's this for?

When OCLIF catches an error, including calls to exit() (which just throws a specific error), it passes the error along to the finally() method. This way you can handle closing things up in a different way if there was an error versus if finally() was called normally, immediately after run().

To recap part of the PR: with new changes, stopping Percy will wait on snapshots to finish processing and uploading. Unless force is true, in which case it will clear and close all queues as soon as possible.

All of that to answer your question: when the process receives one of these termination events, we want to force stop Percy. So we send finally() an error to signal it to force close any processes. The SIGTERM message is pretty arbitrary as this could trigger from either of the other events here as well (but we don't really care about which event, just that we handle closing as if we got an error).

Robdel12 · 2021-06-16T21:45:03Z

packages/client/README.md

 #### Options

- `name` — Snapshot name (**required**)
+- `name` — Snapshot name


Robdel12 · 2021-06-16T21:48:36Z

packages/client/src/client.js

@@ -18,16 +43,9 @@ export default class PercyClient {
    // versioned percy api url
    apiUrl = 'https://percy.io/api/v1'


Going to get this one in another PR? (Sounds right 👍🏼 )

Robdel12 · 2021-06-16T21:54:16Z

packages/core/src/page.js

  }

+  async resize({
+    deviceScaleFactor = 1,


👏🏼 Not sure if this was done before, but all our browsers are scale 1 in the infra. When safari comes along I wonder if that changes 🤨

Robdel12 · 2021-06-16T21:57:45Z

packages/core/src/percy.js

+        let failed = error.response?.status === 422 && (
+          error.response.body.errors.find(e => (
+            e.source?.pointer === '/data/attributes/build'


I wish we had this when we were doing support 😆

Robdel12 · 2021-06-16T22:02:34Z

packages/core/test/unit/install.test.js

      .toHaveBeenCalledOnceWith(path.join('.downloads', 'v0', 'archive.zip'));
  });

+  it('logs the file size in a readable format', async () => {


This is a nice addition too 👀

Robdel12 · 2021-06-16T22:03:18Z

packages/core/types/index.d.ts

@@ -1,3 +1,7 @@
+// utility types
+type Without<T, U> = { [P in Exclude<keyof T, keyof U>]?: never };
+type XOR<T, U> = (T | U) extends object ? (Without<T, U> & U) | (Without<U, T> & T) : T | U;


This hurts to look at.

Bumps @percy/core from 1.0.0-beta.52 to 1.0.0-beta.56. --- updated-dependencies: - dependency-name: "@percy/core" dependency-type: direct:development update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

wwilsman added 8 commits May 28, 2021 15:35

✨ Discover responsive assets without multiple pages

e0105b4

🔧 Move output of pending tests from summary to spec

4469a62

♻ Refactor client to be build agnostic

af695ad

🔊 Add debug logs

✨ Close snapshot queues when the build fails early

27bef11

✨ Return false from isPercyEnabled when a build has failed

3e856d9

✅ Fix sdk-utils test helper

a578215

Seems something changed out of our control to cause this value not to be serialized. Ensuring it is serialized fixes the flakey/failing test for browsers.

wwilsman force-pushed the ww/discovery-queue-refactor branch from 086c671 to c0bff87 Compare June 16, 2021 00:26

wwilsman added 💥 breaking Breaking change ✨ enhancement New feature or request labels Jun 16, 2021

Robdel12 reviewed Jun 16, 2021

View reviewed changes

Robdel12 mentioned this pull request Jun 16, 2021

feat: retry launching puppeteer when it fails to start percy/percy-storybook#187

Closed

wwilsman added 3 commits June 16, 2021 12:46

🏷 Update core types

146f70c

🐛 Fix snapshot command dry-run output

96a7063

💡 Update code comments

af31704

wwilsman force-pushed the ww/discovery-queue-refactor branch from c0bff87 to af31704 Compare June 16, 2021 17:46

🗑 Deprecate capture method

b17372a

wwilsman marked this pull request as ready for review June 16, 2021 19:37

wwilsman requested a review from Robdel12 June 16, 2021 19:37

wwilsman enabled auto-merge (squash) June 16, 2021 19:38

wwilsman disabled auto-merge June 16, 2021 19:57

wwilsman force-pushed the ww/discovery-queue-refactor branch 2 times, most recently from d20dd1a to 09dda31 Compare June 16, 2021 20:09

📝 Update READMEs

dc1888e

wwilsman force-pushed the ww/discovery-queue-refactor branch from 09dda31 to dc1888e Compare June 16, 2021 20:11

wwilsman enabled auto-merge (squash) June 16, 2021 20:13

Robdel12 approved these changes Jun 16, 2021

View reviewed changes

wwilsman merged commit c63173d into master Jun 16, 2021

wwilsman deleted the ww/discovery-queue-refactor branch June 16, 2021 22:06

wwilsman mentioned this pull request Jun 28, 2021

Do not wait for network idle when in-flight requests will not be captured #399

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

♻️ Refactor core discovery queue#369

♻️ Refactor core discovery queue#369
wwilsman merged 13 commits intomasterfrom
ww/discovery-queue-refactor

wwilsman commented Jun 16, 2021 •

edited

Loading

Uh oh!

Robdel12 commented Jun 16, 2021

Uh oh!

Robdel12 Jun 16, 2021

Uh oh!

Robdel12 left a comment

Uh oh!

Robdel12 Jun 16, 2021

Uh oh!

wwilsman Jun 16, 2021 •

edited

Loading

Uh oh!

Robdel12 Jun 16, 2021

Uh oh!

Robdel12 Jun 16, 2021

Uh oh!

Robdel12 Jun 16, 2021

Uh oh!

Robdel12 Jun 16, 2021

Uh oh!

Robdel12 Jun 16, 2021

Uh oh!

Robdel12 Jun 16, 2021

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		@@ -18,16 +43,9 @@ export default class PercyClient {
		// versioned percy api url
		apiUrl = 'https://percy.io/api/v1'

Conversation

wwilsman commented Jun 16, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What is this?

Snapshot, capture, and discovery changes

Queue changes

Other changes

Uh oh!

Robdel12 commented Jun 16, 2021

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Robdel12 left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

wwilsman Jun 16, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

wwilsman commented Jun 16, 2021 •

edited

Loading

wwilsman Jun 16, 2021 •

edited

Loading