Implement client cache of GET urls with fragments#7562
Implement client cache of GET urls with fragments#7562cramforce merged 16 commits intoampproject:masterfrom
Conversation
src/service/xhr-impl.js
Outdated
| .then(response => response.json()); | ||
|
|
||
| const isCacheable = (init.method === 'GET' && propertyPath); | ||
| if (isCacheable) { |
There was a problem hiding this comment.
I bet we'll add an option to opt out of cache at some point, but this is fine :)
src/service/xhr-impl.js
Outdated
| return this.fetch(input, init) | ||
| .then(response => response.json()); | ||
|
|
||
| const isCacheable = (init.method === 'GET' && propertyPath); |
There was a problem hiding this comment.
I'm wondering: Is it a valid use case that you sometimes want the a sub path and sometimes the whole thing?
There was a problem hiding this comment.
I'd say it's valid. I'll remove the && propertyPath from the condition.
src/utils/object.js
Outdated
| * @template T | ||
| */ | ||
| export function sortProperties(obj) { | ||
| if (!obj || typeof obj !== 'object') { return obj; } |
There was a problem hiding this comment.
nit: expand conditional into 3 lines.
There was a problem hiding this comment.
Done. I removed this method.
src/utils/object.js
Outdated
| } | ||
|
|
||
| /** | ||
| * Creates a new object identical to the first with sorted properties. |
There was a problem hiding this comment.
So you are relying on JS' implicit insertion order is read order semantics?
I think I'd prefer a simpler cache key generated that directly writes a string from input. For our effective use case that should just work.
There was a problem hiding this comment.
You are suggesting to just JSON.stringify the opt_init object directly without sorting the properties, is that correct? If so I'll do that and then remove this method.
There was a problem hiding this comment.
I might misunderstand your code, but what I'm suggesting it to just generate direct string from the traversal instead of generating an object and then stringifying that.
There was a problem hiding this comment.
Done. I made a simpler serializer that should be sufficient for this use case.
test/functional/utils/test-cache.js
Outdated
| cache.put('d', {foo: 'bar'}); | ||
|
|
||
| expect(toArray(cache, 'abcd'.split(''))).to.deep.equal( | ||
| ['abc', 123, ['x', 'y'], {foo: 'bar'}]); |
test/functional/utils/test-cache.js
Outdated
| cache.put('c', ['x', 'y']); | ||
| cache.put('d', {foo: 'bar'}); | ||
|
|
||
| expect(toArray(cache, 'abcd'.split(''))).to.deep.equal( |
There was a problem hiding this comment.
jsonEqual gives much nicer error messages :)
src/service/xhr-impl.js
Outdated
| if (init.method == 'POST' && !isFormData(init.body)) { | ||
| const parsedInput = parseUrl(input); | ||
| const propertyPath = parsedInput.hash.slice(1); // remove # prefix | ||
| const getPropertyAtPath = getPath.bind(null, propertyPath); |
There was a problem hiding this comment.
Please don't bind until you know you need it.
src/utils/object.js
Outdated
| export function getPath(path, obj) { | ||
| const arrayIndexRe = /\[(\d+)\]/g; | ||
| const keys = path.replace(arrayIndexRe, '.$1').split('.'); | ||
| return keys.reduce((acc, key) => acc[key], obj); |
There was a problem hiding this comment.
CC @choumx who is doing this kind of stuff for amp-bind.
Since JSON returns full objects we need to guard this with hasOwnProperty to avoid exposing stuff like constructor and toString.
Please also think about how to make the error message good if this fails to find an element.
There was a problem hiding this comment.
Done. I split it out into a less terse loop with special error messaging.
src/service/xhr-impl.js
Outdated
| } from '../utils/object'; | ||
| import {isArray, isObject, isFormData} from '../types'; | ||
| import {utf8EncodeSync} from '../utils/bytes'; | ||
| import Cache from '../utils/cache'; |
There was a problem hiding this comment.
@dvoytenko How do you feel about this being directly integrated into XHR?
There was a problem hiding this comment.
Moved into CachedXhr class
|
Awesome! Thank you! |
src/service/xhr-impl.js
Outdated
| * @private | ||
| */ | ||
| getCacheKey_(url, opt_init) { | ||
| const urlWithoutHash = url.slice(0, url.indexOf('#')); |
There was a problem hiding this comment.
Danger! url.indexOf can return -1, which'll make this be something like:
const url = 'https://example.com';
const urlWithoutHash = url.slice(0, url.indexOf('#')); // => https://example.coWe have a removeFragment helper to do this.
src/service/xhr-impl.js
Outdated
| .then(response => response.json()); | ||
|
|
||
| const isCacheable = (init.method === 'GET' && propertyPath); | ||
| if (isCacheable) { |
There was a problem hiding this comment.
I don't think this is the right class for cacheable queries. How about a composed CachedXHR class?
There was a problem hiding this comment.
Makes sense, I'll try that out.
src/service/xhr-impl.js
Outdated
| const isCacheable = (init.method === 'GET' && propertyPath); | ||
| if (isCacheable) { | ||
| const cacheKey = this.getCacheKey_(input, opt_init); | ||
| const cachedPromise = this.fragmentCache_.get(cacheKey); |
There was a problem hiding this comment.
All the fetchX methods suffer from fragments causing repeated requests. With the composed class suggestion, we can solve all of them.
There was a problem hiding this comment.
Hmm I'm not sure I understand. The fragment is a path to a property inside a JSON object. How should fetchDocument or fetchText specify a subset of data like the issue specs?
There was a problem hiding this comment.
The fragment is a path to a property inside a JSON object.
We've defined that for fetchJSON only.
But fragments can be added to any request, and they're never sent to the server. So, fecthDoc("#frag1") and fecthDoc("#frag2") are the same request, but won't be cached properly. They should be cached, without doing anything special to the response (no deep JSON property returning, etc).
src/utils/cache.js
Outdated
| * @param value {T} | ||
| */ | ||
| put(key, value) { | ||
| if (typeof key !== 'string') { |
There was a problem hiding this comment.
This is guaranteed by Closure types.
There was a problem hiding this comment.
Ok cool. Is there a way to test this guarantee, or should I not test it?
There was a problem hiding this comment.
No need to test, gulp check-types does it for us.
src/utils/cache.js
Outdated
| return this.map_[key]; | ||
| } | ||
|
|
||
| get length() { |
There was a problem hiding this comment.
Magic getters are significantly slower than normal property accesses or methods. This is simple enough that we can update #length after every #put, or we could just make it a method.
There was a problem hiding this comment.
I'll just remove the length getter, I don't see a good use-case for it given it only increases to the size and then remains constant.
src/utils/object.js
Outdated
| export function getPath(path, obj) { | ||
| const arrayIndexRe = /\[(\d+)\]/g; | ||
| const keys = path.replace(arrayIndexRe, '.$1').split('.'); | ||
| return keys.reduce((acc, key) => acc[key], obj); |
There was a problem hiding this comment.
This will throw if acc is falsey!
src/utils/object.js
Outdated
| const keys = Object.keys(obj).sort(); | ||
| return keys.reduce((acc, key) => { | ||
| const value = obj[key]; | ||
| if (typeof value === 'object') { |
There was a problem hiding this comment.
Test that it's not an array either.
f45fda1 to
d61c81f
Compare
src/utils/object.js
Outdated
| * | ||
| * @param {T} obj a map-like value | ||
| * @param {string} path a dot-separated list of keys to reference a value | ||
| * @return {T} |
There was a problem hiding this comment.
This should be *, it's not guaranteed to be an Object (which T has to be).
There was a problem hiding this comment.
Right, even if it was, the shape wouldn't be the same
src/utils/cache.js
Outdated
| this.queue_ = []; | ||
|
|
||
| /** @private @const {Object<string, T>} */ | ||
| this.map_ = {}; |
src/service/xhr-impl.js
Outdated
| * @param {string=} opt_accept The HTTP Accept header value. | ||
| * @return {!FetchInitDef} | ||
| */ | ||
| setupInit_(opt_init, opt_accept) { |
There was a problem hiding this comment.
Is there a reason to make this an instance method?
There was a problem hiding this comment.
To use in the first line of CachedXhr#fetchJson. I thought it made sense to share it through subclassing rather than exporting, so that it was clear that it isn't a part of Xhr's public API
There was a problem hiding this comment.
With d61c81f#r103352978, it won't be necessary. You'll just call Xhr#fetchJson, and cache the result if cacheable.
| @@ -191,8 +193,8 @@ export class Xhr { | |||
| init.headers['Content-Type'] = 'application/json;charset=utf-8'; | |||
There was a problem hiding this comment.
Somehow missed this before. Would you mind deleting this line, and updating the 'application/json' above to be 'application/json;charset=utf-8'?
There was a problem hiding this comment.
There's a test checking that this property is present in the request headers, and the second argument to setupInit sets the Accepts header, not Content-Type.
I can change the test to succeed, but I just want to make sure this is what you really intended.
There was a problem hiding this comment.
Whoops, I thought the second param was for Content-Type. Thanks for looking into it.
src/service/xhr-impl.js
Outdated
| const init = setupInit(opt_init, 'application/json'); | ||
| if (init.method == 'POST' && !isFormData(init.body)) { | ||
| const init = this.setupInit_(opt_init, 'application/json'); | ||
| const getResponseJson = response => response.json(); |
There was a problem hiding this comment.
Nit: if we're going to pull this into a variable, hoist it out of the function.
src/service/cached-xhr-impl.js
Outdated
| * ampCors: (boolean|undefined) | ||
| * }} | ||
| */ | ||
| let FetchInitDef; |
There was a problem hiding this comment.
This should be exported from xhr-impl.js, then you can reference the type in this file.
src/service/cached-xhr-impl.js
Outdated
| * @return {!Promise<!JSONType>} | ||
| * @override | ||
| */ | ||
| fetchJson(input, opt_init) { |
There was a problem hiding this comment.
There's a lot of duplication with Xhr#fetchJson here. I think we can simplify down based on:
- Any call with
opt_initis un-cacheable- Removes crazy
simpleSerialize - Handles
POSTcase
- Removes crazy
Actually, that's it. 😉
src/service/cached-xhr-impl.js
Outdated
| // Since a fragment is present, cache the full response promise, then | ||
| // return a promise with just the value specified by the fragment. | ||
| this.fragmentCache_.put(cacheKey, fetchPromise); | ||
| return fetchPromise.then(getPropertyAtPath); |
There was a problem hiding this comment.
Is there any reason to put this functionality in CachedXhr? Seems like CachedXhr is generally useful, while deep JSON gets are only useful to <amp-list>?
There was a problem hiding this comment.
I belive the intended use-case for the fragment is e-commerce JSON APIs that return a big user-object with nested data that are displayed on many different parts of the page. For example, a user object with a name that is displayed at the top and bottom, a shopping cart list, etc. and each piece of the UI that references a deep property of the user-object would specify which property through the fragment. Does that answer your question?
There was a problem hiding this comment.
Nope. I understand the use case, but I don't think the functionality belongs here. You could just as easily call getPath like so:
// amp-list.js
cachedXhr.fetchJson(endpoint).then(json => getPath(fragment, json))That keeps CachedXhr generically useful, while giving <amp-list> the special deep JSON gets.
There was a problem hiding this comment.
It's true you could do that, and that would follow the single responsibility principle. But you'd be duplicating that fragment extraction and getPath logic in any component or extension where you'd want to get a sub-object of the overall response, which I think would be 3 or more places, more than just amp-list. IMO it keeps things DRYer if that logic is implicitly triggered in CachedXhr by the presence of a fragment. It also enables any AMP element to access a fragment without explicitly coding it into every fetchJson call site, you'd just swap in the CachedXhr class. If you think the separation will be more beneficial than the deduplication, I'll split it out.
There was a problem hiding this comment.
The duplication can be handled with a helper, and code size changes should be negligible. For my rationale, I need this to be useable by amp-call-tracking.js and amp-ad-custom.js, both which do not expect this behavior.
There was a problem hiding this comment.
That makes sense. Thank you for explaining your reasoning, I'll do it this way.
src/utils/object.js
Outdated
| * | ||
| * @param {T} obj a map-like value | ||
| * @param {string} path a dot-separated list of keys to reference a value | ||
| * @return {T} |
There was a problem hiding this comment.
Right, even if it was, the shape wouldn't be the same
src/service/cached-xhr-impl.js
Outdated
|
|
||
|
|
||
| /** | ||
| * A service that polyfills Fetch API for use within AMP. |
src/url.js
Outdated
| if (index == -1) { | ||
| return ''; | ||
| } | ||
| return url.substring(index + 1); |
There was a problem hiding this comment.
Nit: location.hash returns the #. Why don't we do so as well?
There was a problem hiding this comment.
I removed the # here for convenience, but keeping it does follow the POLS
src/utils/object.js
Outdated
| * @return {*} | ||
| * @template T | ||
| */ | ||
| export function getPath(path, obj) { |
There was a problem hiding this comment.
Nit: Can we flip the params?
There was a problem hiding this comment.
Sure. I had path first for to make currying with bind easy to use to map on an array of objects, but I'm not doing that anywhere.
src/service/cached-xhr-impl.js
Outdated
| * @return {!Promise<!JSONType>} | ||
| * @override | ||
| */ | ||
| fetchJson_(input, init) { |
There was a problem hiding this comment.
Why not just override the public methods?
fetchJSON(input, init) {
return this.cached(input) || this.cacheFetch_(input, super.fetchJSON(input, init));
}There was a problem hiding this comment.
The public methods accept an optional opt_init value, while the private methods have init guaranteed to be defined by setupInit in the corresponding public method. This lets the cache separately key the edge case where identical URLs serve different documents for different Accepts header values. I could override the public methods, but then I think I'd need to expose setupInit to the subclass. I'll change it if you prefer it implemented that way.
There was a problem hiding this comment.
for different Accepts header values
Don't we know that by virtue of the public method being called? Ie, we can do:
fetchJSON(input, init) {
return this.cached(input, 'json') || this.cacheFetch_(input, 'json', super.fetchJSON(input, init));
}
fetch(input, init) {
return this.cached(input, 'generic') || this.cacheFetch_(input, 'generic', super.fetchJSON(input, init));
}
fetchText(input, init) {
return this.cached(input, 'text') || this.cacheFetch_(input, 'text', super.fetchJSON(input, init));
}#getCacheKey can then just concat the URL and the method to generate.
There was a problem hiding this comment.
Oh true, thanks I didn't see that
src/utils/cache.js
Outdated
| export class Cache { | ||
| /** | ||
| * Construct a new Cache with the given size | ||
| * @param size {number=} |
There was a problem hiding this comment.
The type definitions in this file are backwards. Should be @param {number=} size
src/utils/object.js
Outdated
| * e.g. | ||
| * `getPath({a: {b: [{c: 2}]}}, 'a.b[0].c') === 2` | ||
| * | ||
| * @param {T} obj a map-like value |
There was a problem hiding this comment.
Can you help me understand why is this different than hasOwn, which returns a boolean but specifies a @param {T} obj?
There was a problem hiding this comment.
I actually don't have a good answer. @erwinmombay?
test/functional/test-cached-xhr.js
Outdated
| xhr.fetchJson('/get?k=v1'), | ||
| xhr.fetchJson('/get?k=v1'), | ||
| ]).then(results => { | ||
| expect(fetchStub.calledOnce).to.be.true; |
There was a problem hiding this comment.
expect(fetchStub).to.be.calledOnce;
test/functional/test-cached-xhr.js
Outdated
| import {installCachedXhrService} from '../../src/service/cached-xhr-impl'; | ||
|
|
||
|
|
||
| describes.realWin('CachedXhr', {}, env => { |
There was a problem hiding this comment.
Test that POSTs aren't cached.
test/functional/test-cached-xhr.js
Outdated
| import {installCachedXhrService} from '../../src/service/cached-xhr-impl'; | ||
|
|
||
|
|
||
| describes.realWin('CachedXhr', {}, env => { |
There was a problem hiding this comment.
Test that fragments do not affect caching.
b5d4f06 to
c2fca22
Compare
test/functional/test-xhr.js
Outdated
| return response.text() | ||
| .then(result => { | ||
| const callCloneText = () => clone.text().then(clonedText => { | ||
| expect(clonedText).to.equal(TEST_TEXT); |
There was a problem hiding this comment.
This expectation will not be validated, since the test will have exited before this is run.
test/functional/test-xhr.js
Outdated
| const callCloneText = () => clone.text().then(clonedText => { | ||
| expect(clonedText).to.equal(TEST_TEXT); | ||
| }); | ||
| expect(callCloneText, 'should not throw').to.not.throw(Error); |
There was a problem hiding this comment.
This wrapping isn't necessary. If the clone's #text were to throw syncly, it would cause the tests to fail (because the promise block it's inside is waited on by the test).
There was a problem hiding this comment.
Thanks, makes sense.
test/functional/test-xhr.js
Outdated
| expect(clonedText).to.equal(TEST_TEXT); | ||
| }); | ||
| expect(callCloneText, 'should not throw').to.not.throw(Error); | ||
| expect(result).to.equal(TEST_TEXT); |
There was a problem hiding this comment.
Return the clone's #text promise here, with its expectation in a then block.
src/runtime.js
Outdated
| installTimerService(global); | ||
| installVsyncService(global); | ||
| installXhrService(global); | ||
| installCachedXhrService(global); |
There was a problem hiding this comment.
Could you run gulp size and see the impact on the core payload?
There was a problem hiding this comment.
Certainly, here are the results.
| max | min | gzip | |
|---|---|---|---|
| master | 1.11 MB | 245.71 kB | amp.js |
| PR #7562 | 1.11 MB | 246.56 kB | amp.js |
There was a problem hiding this comment.
You're gzip is a little large. 😜
There was a problem hiding this comment.
Ah that makes sense, I had just run build. What do you think of adding dist as a gulp task dependency for the size task? Are there any cases where we want to run size on files produced by a task other than dist?
There was a problem hiding this comment.
Here are the correct numbers:
| max | min | gzip | file | |
|---|---|---|---|---|
| master | 1.11 MB | 206.7 kB | 61.94 kB | v0.js / amp.js |
| PR #7562 | 1.11 MB | 207.05 kB | 62.11 kB | v0.js / amp.js |
73d815c to
d0a25ea
Compare
|
Sorry, for opening pandoras box. I think we should reconsider the implementation strategy, but would like to get input from others: Delete the app layer cache from this CL. We have to document how to do the fragment stuff anyway, so we might as well document people should do something like set a 1 minute |
d312438 to
e47c3f1
Compare
|
Hi, ampproject bot here! Here are a list of the owners that can approve your files. You may leave an issue comment stating "@ampprojectbot retry!" to force me to re-evaluate this Pull Request's status /to cramforce dvoytenko jridgewell
/to dvoytenko jridgewell
/to alanorozco camelburrito chenshay choumx cvializ ericlindley-g erwinmombay gregable honeybadgerdontcare jridgewell kmh287 lannka mkhatib mrjoro muxin newmuis powdercloud zhouyx
For any issues please file a bug at https://github.com/google/github-owners-bot/issues |
|
PTAL, removed app layer cache as requested by @cramforce and implemented fetch-batching per @jridgewell's suggestion |
223e29a to
95ab799
Compare
cramforce
left a comment
There was a problem hiding this comment.
Looks great, but have one question?
| } | ||
| return xhrFor(this.win).fetchJson(src, opts); | ||
| const fetchPromise = batchedXhrFor(this.win).fetchJson(src, opts); | ||
| const fragment = getFragment(src).slice(1); |
There was a problem hiding this comment.
Should this be a feature of BatchedXhr?
There was a problem hiding this comment.
Justin and I discussed keeping the fragment-extraction and object property access behavior inside of the new Xhr class, but we decided to keep them separate. Discussion here.
cramforce
left a comment
There was a problem hiding this comment.
LGTM
Important follow up is to support the path based stuff in more places and to document it.
|
Wahoo! I created an issue to track the documentation effort. I don't have merge permission yet, so feel free to merge any time. |
| if (isBatchable) { | ||
| this.fetchPromises_[key] = fetchPromise.then(response => { | ||
| delete this.fetchPromises_[key]; | ||
| return response.clone(); |
There was a problem hiding this comment.
Bug: This isn't sufficient to prevent double draining. The problem is this is part of the promise chain.
So, first call to fetch returns the clone. I then add to the chain to read the body, and issue a second request to the same endpoint. The second request will see the batched call, and return a clone of that promise (a clone of a clone) Problem is, that clone will happen after the body read. (Because execution chains on a single promise is defined)
This should return the response without cloning. This total promise chain will be stored as the batch. For the return value, we should return the batching promise with a .then((resp) => resp.clone()) added to its chain.
There was a problem hiding this comment.
After thinking about it, I'm not sure your assessment is correct. The clone is created, but not returned to the caller, just to the batching mechanism. Your argument assumes that the first fetch returns the clone, which is not what is happening in this code. So nothing can drain the body of the response, so there cannot be a double drain.
| * @param {string} path a dot-separated list of keys to reference a value | ||
| * @return {*} | ||
| */ | ||
| export function getPath(obj, path) { |
There was a problem hiding this comment.
@cvializ Is this method significantly different from https://github.com/ampproject/amphtml/blob/master/src/json.js#L84 ?
If no, please file a bug to reconcile with P1.
There was a problem hiding this comment.
Ah! I did not see that. I'll file the bug

Implements #7463