Support generic chunking and chunk Bind's DOM scan#6967
Support generic chunking and chunk Bind's DOM scan#6967dreamofabear merged 15 commits intoampproject:masterfrom
Conversation
| export default class PriorityQueue { | ||
| constructor() { | ||
| /** @private @const {Array<{item: T, priority: number}>} */ | ||
| this.queue_ = []; |
There was a problem hiding this comment.
We'd get theoretically better performance with a max heap for large N but I thought it was unnecessary for current usage. I'm not sure that even the binary search is worth the readability cost.
| /** @type {!Array<./bind-evaluator.EvaluateeDef>} */ | ||
| const evaluatees = []; | ||
|
|
||
| // TODO(choumx): Use TreeWalker if this is too slow. |
There was a problem hiding this comment.
I'm pretty sure from past experience that TreeWalker and getElementsByTagName will be generally comparable, but TreeWalker would be somewhat faster.
There was a problem hiding this comment.
I actually meant to use TreeWalker so the traversal can be amortized over several frames. In my profiling, this call only took a few ms so I'll work on expression parsing first (which is much slower).
There was a problem hiding this comment.
Cool, thanks for the link.
extensions/amp-bind/0.1/bind-impl.js
Outdated
|
|
||
| // Helper function for scanning a slice of elements. | ||
| const scanFromTo = (start, end) => { | ||
| for (let i = start; i < Math.min(end, elements.length); i++) { |
There was a problem hiding this comment.
Don't remember 100%, but I believe calling elements.length is here would force some significant amount of work. It might be better to iterate until we arrive at elements[++i] === undefined.
There was a problem hiding this comment.
Good tip. Looks like length is slow for live collections. Done.
| * @return {string} | ||
| */ | ||
| get name() { | ||
| return this.fn_.displayName || this.fn_.name; |
There was a problem hiding this comment.
I think we decided not to use getters like this since they are somewhat slow. /cc @jridgewell .
Also, displayName is good for local debug, but most are likely stripped by compiler for prod binaries.
There was a problem hiding this comment.
Probably slower than property access, but what about vs. normal function invocation?
This usage of displayName was ported from original code on line 175.
There was a problem hiding this comment.
Getters and setters are considerably slower than method invocation.
There was a problem hiding this comment.
Thanks for the reference. Changed to plain method.
|
Thanks, @choumx! I'd like @cramforce to review in more detail the chunking code changes. I left few comments, but some of a high-level picture questions I have:
|
extensions/amp-bind/0.1/bind-impl.js
Outdated
| }; | ||
|
|
||
| // Divide elements into buckets to be scanned, one bucket per chunk. | ||
| const bucketSize = 10; |
There was a problem hiding this comment.
I think this is not the right way to do this when you design for requestIdleCallback, but this is a perfect use case for RIC!
The current chunking code has the issue that all tasks that we scheduled could not be split up further (easily). Here you can basically do work until you shouldn't anymore. With that the timeRemaining function should be made available to the scanner and it should always try to do as much as it can given the time available.
There was a problem hiding this comment.
Good point! Added support for this by passing the RIC info object into the chunk function. This way, we can still prioritize usage of requestIdleCallback rather than "first-always-wins".
| }); | ||
|
|
||
| // Helper function for scanning a slice of elements. | ||
| const scanFromTo = (start, end) => { |
There was a problem hiding this comment.
How long does a full run take on a typical mobile device?
There was a problem hiding this comment.
For a page with 5000 elements with 1000 bindings:
- DOM scan: 100ms
- Expression parsing: 1s
Complex expressions still take ~1ms each to parse during init. Next step is to either chunk expression parsing or convert BindEvaluator into a web worker.
f97fc41 to
3d1a6ec
Compare
dreamofabear
left a comment
There was a problem hiding this comment.
Thanks for the review!
Do we really need priority queue? I understand all startup chunking should be done before amp-binder is instrumented, right?
True, though generic background task execution is probably useful to have. Using requestIdleCallback directly has the downside that it's not widely supported and callbacks are executed in order of invocation (which may not be desired).
Is chunking the right instrument here? Or should we look into what Resources manager is doing?
Would you mind elaborating on what you mean by Resources?
Is there some condition (e.g. user action) where we need to force-execute all pending crawl operations before we proceed with evaluations? Is this something that can be accomplished with chunking/idle callbacks?
Good question. It's feasible to implement timeouts with chunk(), but not blocking the UI is the correct UX trade off IMO. Nonetheless it's still important that for vast majority of pages, this completes soon after page render.
extensions/amp-bind/0.1/bind-impl.js
Outdated
|
|
||
| // Helper function for scanning a slice of elements. | ||
| const scanFromTo = (start, end) => { | ||
| for (let i = start; i < Math.min(end, elements.length); i++) { |
There was a problem hiding this comment.
Good tip. Looks like length is slow for live collections. Done.
| /** @type {!Array<./bind-evaluator.EvaluateeDef>} */ | ||
| const evaluatees = []; | ||
|
|
||
| // TODO(choumx): Use TreeWalker if this is too slow. |
There was a problem hiding this comment.
I actually meant to use TreeWalker so the traversal can be amortized over several frames. In my profiling, this call only took a few ms so I'll work on expression parsing first (which is much slower).
| * @return {string} | ||
| */ | ||
| get name() { | ||
| return this.fn_.displayName || this.fn_.name; |
There was a problem hiding this comment.
Probably slower than property access, but what about vs. normal function invocation?
This usage of displayName was ported from original code on line 175.
| }); | ||
|
|
||
| // Helper function for scanning a slice of elements. | ||
| const scanFromTo = (start, end) => { |
There was a problem hiding this comment.
For a page with 5000 elements with 1000 bindings:
- DOM scan: 100ms
- Expression parsing: 1s
Complex expressions still take ~1ms each to parse during init. Next step is to either chunk expression parsing or convert BindEvaluator into a web worker.
extensions/amp-bind/0.1/bind-impl.js
Outdated
| }; | ||
|
|
||
| // Divide elements into buckets to be scanned, one bucket per chunk. | ||
| const bucketSize = 10; |
There was a problem hiding this comment.
Good point! Added support for this by passing the RIC info object into the chunk function. This way, we can still prioritize usage of requestIdleCallback rather than "first-always-wins".
3d1a6ec to
e1227c3
Compare
I meant
I think we need to find a paradigm that will work here w/o risk conditions. I see your point and agree that we don't want to block user action. So, instead, I think we need to have a model where we update data structures in memory and mark them as "user action". When a dependent bind expression is ingested, we determine whether it's in initial state (no work) or update by user action (update as soon as practical). |
| * @param {function(?IdleDeadline)} fn | ||
| */ | ||
| export function chunk(nodeOrAmpDoc, fn) { | ||
| export function startupChunk(nodeOrAmpDoc, fn) { |
There was a problem hiding this comment.
I don't understand why startupChunk implies "deactivate-able" while normal chunk is always "active".
There was a problem hiding this comment.
Chunking used to only be used for startup tasks, so I guess the nochunking=1 test query was very useful for testing. There's less of a need for generic background task chunks.
Added it below though since it's better to be consistent here.
src/chunk.js
Outdated
| this.viewer_ = null; | ||
|
|
||
| if (viewerOrPromise instanceof Promise) { | ||
| viewerOrPromise.then(viewer => { |
There was a problem hiding this comment.
Is the sync case absolutely necessary?
There was a problem hiding this comment.
Yes. schedule_() is called synchronously after a task is queued/executed, which calls StartupTask#immediateTriggerCondition_(), whose return value depends on the existence of viewer_.
We can change schedule_() to be executed as a micro-task instead, but I'd rather avoid behavioral changes in this already large-ish PR.
There was a problem hiding this comment.
Yes, please. Everything here is very subtle.
There was a problem hiding this comment.
Done.
Used existing resolved Promise and added new executeASAP_() to be stubbed in tests instead.
| t(); | ||
| t.runTask(idleDeadline); | ||
| } catch (e) { | ||
| // We run early in init. All errors should show the doc. |
There was a problem hiding this comment.
Catch block can be removed now.
Ah I see, thanks. IMO,
Somewhat related to
So actually this is only used during Bind's initialization (DOM scan) during page load. This won't affect latency when updating Bind state from user actions. |
Let me clarify this. Imagine the following sequence:
If my reading of this is correct, we could change this in a couple of ways:
Does this make sense? |
This actually exists in this PR via flag |
extensions/amp-bind/0.1/bind-impl.js
Outdated
| // Trigger verify-only digest in development. | ||
| const development = getMode().development; | ||
| if (development || this.digestQueuedAfterScan_) { | ||
| this.digest_(/* opt_verifyOnly */ development); |
There was a problem hiding this comment.
Ok, should not this then be !this.digestQueuedAfterScan_?
extensions/amp-bind/0.1/bind-impl.js
Outdated
| if (!opt_skipDigest) { | ||
| this.digest_(); | ||
| // If scan hasn't completed yet, set `digestQueuedAfterScan_`. | ||
| if (this.boundElements_) { |
There was a problem hiding this comment.
Isn't boundElements_ always truthy?
There was a problem hiding this comment.
Oops, bad merge I think. Thanks!
I see. This is what I wanted to see. So, in our world, |
dreamofabear
left a comment
There was a problem hiding this comment.
So, in our world, AMP.setState() is always equivalent to the user action, right?
Yes, with the exception of initializing <amp-state> components.
extensions/amp-bind/0.1/bind-impl.js
Outdated
| // Trigger verify-only digest in development. | ||
| const development = getMode().development; | ||
| if (development || this.digestQueuedAfterScan_) { | ||
| this.digest_(/* opt_verifyOnly */ development); |
extensions/amp-bind/0.1/bind-impl.js
Outdated
| if (!opt_skipDigest) { | ||
| this.digest_(); | ||
| // If scan hasn't completed yet, set `digestQueuedAfterScan_`. | ||
| if (this.boundElements_) { |
There was a problem hiding this comment.
Oops, bad merge I think. Thanks!
extensions/amp-bind/0.1/bind-impl.js
Outdated
|
|
||
| // Helper function for scanning a slice of elements. | ||
| const scanFromTo = (start, end) => { | ||
| for (let i = start; i < end && elements[i]; i++) { |
There was a problem hiding this comment.
Optimization: Hoist assignment of elements[i] into loop header to avoid accessing the live-list twice per entry.
extensions/amp-bind/0.1/bind-impl.js
Outdated
| if (idleDeadline) { | ||
| if (idleDeadline.didTimeout) { | ||
| // On timeout, scan the remaining elements immediately. | ||
| scanFromTo(position, Number.POSITIVE_INFINITY); |
There was a problem hiding this comment.
This is OK for now, but not sure the right algorithm. You still want to yield every so often on timeout, I think.
| } else { | ||
| // If `requestIdleCallback` isn't available, scan elements in buckets. | ||
| // Bucket size is a magic number that fits within a single frame. | ||
| const bucketSize = 250; |
There was a problem hiding this comment.
Should this just be what happens on timeout of idle callback?
| * A default chunkable task. | ||
| */ | ||
| class Task { | ||
| /** |
There was a problem hiding this comment.
Can this class and all methods be private?
There was a problem hiding this comment.
I should state @private is a file-level concept.
src/chunk.js
Outdated
| this.viewer_ = null; | ||
|
|
||
| if (viewerOrPromise instanceof Promise) { | ||
| viewerOrPromise.then(viewer => { |
There was a problem hiding this comment.
Yes, please. Everything here is very subtle.
| * A task that's run as part of AMP's startup sequence. | ||
| */ | ||
| class StartupTask extends Task { | ||
| /** |
There was a problem hiding this comment.
Same question: Can all methods be private?
| * @param {ChunkPriority} priority | ||
| */ | ||
| export function chunk(nodeOrAmpDoc, fn, priority) { | ||
| if (deactivated) { |
There was a problem hiding this comment.
Don't need it, but it's better to be consistent if URL contains nochunking=1.
Another option is to change the regex to test for nostartupchunking=1 and remove this.
test/functional/test-chunk.js
Outdated
| env.sandbox.stub(resolved, 'then', () => { | ||
| throw new Error('No calls expected .then'); | ||
| const chunks = chunkInstanceForTesting(env.win.document); | ||
| env.sandbox.stub(chunks, 'executeASAP_', () => { |
There was a problem hiding this comment.
D'oh, I keep forgetting about this rule. Done.
7f866aa to
63a658e
Compare
0e3624b to
d7c08d3
Compare
d7c08d3 to
2e1d712
Compare
extensions/amp-bind/0.1/bind-impl.js
Outdated
|
|
||
| // Helper function for scanning a slice of elements. | ||
| const scanFromTo = (start, end) => { | ||
| for (let i = start, el = elements[i]; i < end && el; i++) { |
There was a problem hiding this comment.
Oops, good catch. Fixed with a break which is more readable IMO.
extensions/amp-bind/0.1/bind-impl.js
Outdated
| for (let i = start, el = elements[i]; i < end && el; i++) { | ||
| const attrs = el.attributes; | ||
| const bindings = []; | ||
| for (let j = 0; attrs[j]; j++) { |
There was a problem hiding this comment.
Wanted to avoid invoking length which is purportedly slow for live lists. Changed to use for..of instead.
| // If `requestIdleCallback` is available, scan elements until | ||
| // idle time runs out. | ||
| if (idleDeadline && !idleDeadline.didTimeout) { | ||
| while (idleDeadline.timeRemaining() > 1 && elements[position]) { |
There was a problem hiding this comment.
A lot of this elements[index] check weirdness could go away if you made a function that took the element to scan. Then call it from either this while loop, or the above for loop.
There was a problem hiding this comment.
Thought about this. The helper method would need a lot of input params (ugly signature).
I think with the weirdness is reduced with the break and for..of.
There was a problem hiding this comment.
Just element, evaluatees and boundElements?
There was a problem hiding this comment.
Oh, it's not as bad if extracting the inner loop only; done.
Mutating input params is generally bad for readability but I think it's ok in this case.
extensions/amp-bind/0.1/bind-impl.js
Outdated
| const bindings = []; | ||
| for (const attr of el.attributes) { | ||
| const attrs = el.attributes; | ||
| for (let j = 0;; j++) { |
There was a problem hiding this comment.
What's up with your weird for loops? 😛
You could just check j < attrs.length.
There was a problem hiding this comment.
Haha I'm not a fan either. They're in response to: #6967 (comment)
Hard to make everyone happy. :)
There was a problem hiding this comment.
Cache the length:
for (let j = 0, l = attributes.length; j < l; j++) {
}Doing an out-of-bounds access is likely slower than anything else in here.
| // If `requestIdleCallback` is available, scan elements until | ||
| // idle time runs out. | ||
| if (idleDeadline && !idleDeadline.didTimeout) { | ||
| while (idleDeadline.timeRemaining() > 1 && elements[position]) { |
There was a problem hiding this comment.
Just element, evaluatees and boundElements?
|
Gentle ping. |
jridgewell
left a comment
There was a problem hiding this comment.
3 things:
- We should follow up with a
TreeWalkerimplementation - This is the second Queue class in the codebase. Time to extract?
- This is the second binary search in the codebase. Time to extract?
cramforce
left a comment
There was a problem hiding this comment.
Please hold merge until after canary cut (if it has not happened yet).
|
Thanks all! @jridgewell Filed #7230 to track your suggestions. |
* Squash chunk-scan commits * fix lint error * rename chunk to startupChunk, pass viewer from Chunks to StartupTask * fix lint and type errors * support timeRemaining in chunk tasks & misc clean up * justin's comments * dima's comments * malte's comments * style and lint fix * fix unit tests * missed a couple @Private declarations * fix scan iteration * remove for..of * more iteration on iteration * move element scan to helper method
* Squash chunk-scan commits * fix lint error * rename chunk to startupChunk, pass viewer from Chunks to StartupTask * fix lint and type errors * support timeRemaining in chunk tasks & misc clean up * justin's comments * dima's comments * malte's comments * style and lint fix * fix unit tests * missed a couple @Private declarations * fix scan iteration * remove for..of * more iteration on iteration * move element scan to helper method
* Squash chunk-scan commits * fix lint error * rename chunk to startupChunk, pass viewer from Chunks to StartupTask * fix lint and type errors * support timeRemaining in chunk tasks & misc clean up * justin's comments * dima's comments * malte's comments * style and lint fix * fix unit tests * missed a couple @Private declarations * fix scan iteration * remove for..of * more iteration on iteration * move element scan to helper method
Partial for #6199 and #6910.
PriorityQueuedata structurechunk.jsto support non-startup tasksThis PR is on the larger side -- happy to split this into multiple PRs. PTAL!
/to @cramforce @dvoytenko