✨Add new amp-skimlinks extension #17907
Conversation
…We always want to call beacon)
…rReplacementTuple
|
@zhouyx hi, I don't want to be pushy and understand that you are probably busy but my commercial team is pressuring me (those people, right?). Is there any way you could review @slocka's changes / comments? thank you. We have some large publishers waiting on this feature to monetize their content for the holiday season. @mrjoro we are pretty new to the amp project. could you please give us an estimate on how long you recon it would take to get this feature live / whats the process around it? thanks |
|
@lepunk, regarding how long it will take to go live, there are a couple of aspects. In general once code is merged in it will take 1-2 weeks to be fully in production. Full details are in our release-schedule.md. We also generally prefer that new features are launched behind an experiment to allow it to be more fully tested. I'll defer to @lannka and @choumx on whether that is needed in this case. |
| TODO: add optional config param to .build() so we don't need to mutate | ||
| a private property from outside. | ||
| */ | ||
| analytics.config_.transport = {beacon: true}; |
There was a problem hiding this comment.
given the other PR is merged, could you update here?
|
Just had one last comment. Otherwise LGTM |
Turn xhrpost off to avoid wildcard in Access-Control-Origin error when navigator.sendBeacon() is not available.
|
LGTMed. since @zhouyx is back today, I'm checking with her if she want to take a final look. |
|
One advantage of putting new stuff behind an experiment is that it makes site developers aware that a feature is new and may have rough edges, and developers can use the feature in a nearly-real environment to see how it behaves. The downside is that currently experiments are enabled on a per-browser basis, and there isn't a good way to enable an experimental feature in production for a limited set of sites until we re-enable origin experiments (dependent on #16183). I'll defer to @choumx and @lannka on what to do in this case. |
|
@lannka Thanks a lot! I have seen on Slack a message from @mrjoro mentioning that there will be no releases this week.
Doesn't this impact |
|
There will be no prod release this week. But we will still do canary release. |
jridgewell
left a comment
There was a problem hiding this comment.
This was far too much code to effectively review.
|
|
||
| export const AMP_SKIMLINKS_VERSION = '1.0.0'; | ||
| export const XCUST_ATTRIBUTE_NAME = 'data-skimlinks-custom-tracking-id'; | ||
| export const AFFILIATION_API = 'https://go.skimresources.com'; |
There was a problem hiding this comment.
Nit: This should needs a trailing /, so it's impossible to change the domain name.
| 'domains': domains, | ||
| }); | ||
|
|
||
| const beaconUrl = `${DOMAIN_RESOLVER_API_URL}?data=${JSON.stringify(data)}`; |
There was a problem hiding this comment.
Nit: JSON.stringify is not 100% sufficient, since it won't encode & and = inside strings. The pubcode above comes from publisher HTML element.getAttribute('publisher-code')
There was a problem hiding this comment.
We are not expecting any & or = since domains contain a list of hostnames and publisher-code is validated using this regex: ^[0-9]+X[0-9]+$
There was a problem hiding this comment.
Ah, I see. I didn't check the validator files.
| * @private | ||
| */ | ||
| getNewDomains_(anchorList) { | ||
| return anchorList.reduce(((acc, anchor) => { |
There was a problem hiding this comment.
Nit: unnecessary quotes around the arrow function.
| unknownAnchors.map(anchor => ({anchor, replacementUrl: null})) | ||
| ); | ||
| const twoStepsResponse = this.resolveUnknownLinks_(unknownAnchors); | ||
| user().assert(twoStepsResponse instanceof TwoStepsResponse, |
There was a problem hiding this comment.
This check doesn't make sense to me. You control the API, of course it will be a new TwoStepsResponse.
There was a problem hiding this comment.
I understand your confusion. It's true that for the moment only Skimlinks is using the link rewriter API but it was designed to be able to be used by other extensions, hence the check here. There will be a separate discussion about if we want to move this code outside of amp-skimlinks to make this API available to other extensions. If not I will clean up the code and will remove this test.
There was a problem hiding this comment.
👍. instanceof is notoriously brittle across document boundaries, and AMP works in several modes that work that way. A duck-type check would be better.
|
|
||
| /** @override */ | ||
| isLayoutSupported() { | ||
| return true; |
There was a problem hiding this comment.
This should be checking against Layout.NO_DISPLAY explicitly.
| return false; | ||
| } | ||
| // Save so we can restore it. | ||
| anchor.setAttribute(ORIGINAL_URL_ATTRIBUTE, anchor.href); |
There was a problem hiding this comment.
Bug: This is not re-entrant, and it's too hard to reason about the caller to rewriteAnchorUrl.
There was a problem hiding this comment.
I am not sure to understand what you mean and I don't see the bug 🤔. Could you give more explanations?
There was a problem hiding this comment.
This can be called multiple times on a link in the timespan, so we will have buildup of affiliate links pointing to affiliate links pointing to the original link. Kinda like, https://example.com/?redirect=https://example.com/?redirect=https://example.com/?redirect=original.com
* cl/220306609 Revision bump for #17907 * cl/220307253 Revision bump for #19128 * cl/220310523 Revision bump for #19129 * cl/220399983 Revision bump for #19167 * cl/221145203 n/a * cl/221159765 Revision bump for #19214 * cl/221164382 Invalidate `<amp-script>` tag as well * cl/221176616 Revision bump for #17939 * cl/221181356 Revision bump for #19171
* Basic skim-js implementation * Trying to use amp-analytics to send the request. * Force tracking request to be a GET request * add plant uml graph for design review * Updating design doc graph * design doc v2 * Start implementing AMP solution * Hook AffiliateLinks to navigation click handler * Better version of affiliate links manager. * Adding tracking module & refactor booting of the extension * Adding options and handle no links to resolve on the page scenario. (We always want to call beacon) * Fix issue with wrong 'this' context * Use events instead of optional callback to trigger na click tracking * New link rewriter service * Remove typo * session id is not a thing anymore * FIx JS Doc lint issues && move link rewrite service to 'src' * Add domain resolver tests around beacon API calls * Add domain resolver tests && replace AnchorRewriteData by createAnchorReplacementTuple * adding test around beacon callback * Add placeholder tests around waypoint query params * Add impression id to waypoint params * Implement .skip() tests * Moving domain resolver tests to separate file. * Add tracking tests * No need to mock internal functions. * Adding some NA click tracking tests * Adding xcust test for NA click tracking * Add test for na tracking => rnd query params * Adding generate page impression id test * Add tracking "setup" tests * Cleaning up * Adding tests amp-skimlinks bootstrap * Test beacon callback handler * Add test to check if options are parsed when loading the extension * Refactor link-rewrite-service priority logic. * Add more link-rewriter-service tests * Remove unused function * Add tests for LinkRewriter * Replace AnchorRewriteDataResponse by twoStepsResponse object. * Add new variable to whitelist * Fix use of .calledOnce instead of .called when testing not called. * Use fakeWin instead of realWin for quicker run of the tests * Add tests for rewriteAnchorUrl method * Adding tests for link rewriter * Refactor call beacon fallback when no links on page. * Fix issue, missing anchor param for NA click * Fix issue with impression tracking getting wrong ae status. We need to make sure the anchorLinkReplacment map is updated before calling this.trackingService.sendImpressionTracking. * amp-skimlinks should be loaded with layout='nodisplay'. * Amp analytics requests should be sent using sendBeacon * Consider 'canonical url hostname and sourceUrl hostname as 'internal' domain. (Domain will be excluded by default) * Add real links for manual testing * Renaming variables * Refactor again the beacon callback logic so we can access the guid before sending impression tracking. * Remove .only * Cleanup * Call onDomUpdate from registerLinkRewriter instead of LinkRewriter constructor. * Send new 'platform' flag instead of xcreo . * Move LinkRewriter to be global to the app. * Handle link-rewrite-service click handler from navigation instead of propagating a click event. * Move constant to constants.js * Renaming constants * Renaming file * Remove first param since the guid is passed using setTrackingInfo * Extract waypoint logic from AffiliateLinkResolver * We can not use replacement macro for waypoint URL. Macro in links only support CLIENT_ID and QUERY_PARAM. https://github.com/ampproject/amphtml/blob/master/spec/amp-var-substitutions.md#variable-substitution-in-links * Code review * Clean up * Add tests for link rewriter called from navigation.js * Add basic validator * Fix typos * Fix search & replace * Fixing check types * Adding documentation for affiliate-link-resolver class * Adding documentation for Tracking module * Use skimresources instead of go.redirectingat * Add documentation for link-rewriter * Adding 1st version of the amp-skimlinks readme * Fix broken tests * Documentation fixes * Rename link-rewrite-service to link-rewriter-manager for clarity * Code review * Do not send NA click tracking on right click. * Renaming test * Remove anchorClickActions since we don't need it after all * Fix issue when unregistered LinkRewriter are in priority list. * Fix new tests after rebase * .trim() is already done in code above. * Update pubcode for testing * Code review * Rename link-rewrite-service to link-rewriter-manager for clarity * Replace Map by LinkReplacementCache class since Map are not allowed in this project. * Fix broken tests * Simplify updateLinkList * FIx typo * Add comment * Fixing eslint issues part1 * Fix lint errors in tests * Fix issue with go.redirectingat links showing in link & page impressions + tracking as NA click (on top of waypoint) * Fix issue with excluded-domains option overwriting internal and globally excluded domains. * fix linter issue * Update doc types around link rewriter API * Improve comments * Fix check types * Remove plantUML file from repo. * Cleaning up amp-skimlinks.html * Add owners to amp-skimlinks * Fix outdated comment * Update comments * Fix issues related to page pre-rendering scenario * Improve comment * Add copyright on top of file * Don't initialise anything until buildCallback * Fix typo * Moving linkRewriter logic inside amp-skimlinks * Fix eslint issues while running unit tests on travis. * Access referrer through util. * Fix "Do NOT stub on a cross domain iframe!" error on travis * Remove dependency on navigation.js * Code review part1 - Use existing Observable class instead of EventMessenger. - Move affiliate link status to enum instead of constants. * Make closure compiler types more explicit. * @Private & @public should be the last annotation. * Code review part2 * Use JsonDict when reading/writing serialisable data * Use a class to represent TwoStepsResponse. * Add comment * Minor code review changes * Missing annotations * Shadow AMP support * Don't use parseUrlDeprecated to get the hostname if possible since the function is extremly slow. * Add Skimlinks privacy policy * Use chunk function to scan the document. Chunk will delay the execution of the document scaning to when the browser is idle. The purpose of this change is to avoid any potential performance issue that could affect the user experience when loading the page. At this point it hasn't been proven that amp-skimlinks could create such slowdown but it feels safer to wrap it inside the chunk function for now. Since this is not the ideal solution (we could potentially lose clicks and page impressions), it would be great make more measurement and optimization and remove it once we feel confident that amp-skimlinks can run without risking to interfer with the user experience. * Fix ampproject#18282. CustomEventReporter should auto-load amp-analytics. * Add choumx as amp-skimlinks owner. * amp-skimlinks should be lowest priority. * Remove getBoundFunction util * Improve amp-skimlinks validator * Code review * Get referrer from javascript instead of using macro * Remove access to private variable. Use default transport until ampproject#19014 is ready. * Rename tracking param. * Use custom transport config for tracking API. Turn xhrpost off to avoid wildcard in Access-Control-Origin error when navigator.sendBeacon() is not available.
* cl/220306609 Revision bump for ampproject#17907 * cl/220307253 Revision bump for ampproject#19128 * cl/220310523 Revision bump for ampproject#19129 * cl/220399983 Revision bump for ampproject#19167 * cl/221145203 n/a * cl/221159765 Revision bump for ampproject#19214 * cl/221164382 Invalidate `<amp-script>` tag as well * cl/221176616 Revision bump for ampproject#17939 * cl/221181356 Revision bump for ampproject#19171
Implementation for the new
amp-skimlinksextension as discussed here: I2I: amp-skimlinks · Issue #16435Initial design doc is available here.
What's in this PR:
amp-skimlinksextension allowing publishers to use Skimlinks affiliate services.Critical changes:
The service is added in the main
amp.jsbundle since it is global to all extensions.EDIT: Based on feedback, the service has been moved back to amp-skimlinks for now.
DOCUMENT_REFERRERtoSANDBOX_AVAILABLE_VARS:amp-skimlinksis using the amp-analytics internal API to send tracking requests which seems to useamp-analyticswith thesandboxflag turned totrue.amp-skimlinksneeds to send the document referrer and from looking at this PR, it looks like there may not be any real reason to blacklistDOCUMENT_REFERRER.amp-skimlinks extension
As a reminder amp-skimlinks does two main things:
In order to do so
amp-skimlinksis divided into four main modules:AmpSkimlinks: The custom component.Tracking: Sends all the tracking requests through theamp-analyticsAPI.AffiliateLinkResolver: Communicates with the new "link rewriter service" to resolve through the Skimlinks domain resolver API, which links on the page should be swapped.Waypoint: Builds the final Skimlinks affiliate URLs namely, the replacement URLs.New "Link Rewriter Service"
In order to make the code from
amp-skimlinksas generic as possible, we have created a new "link rewriter service" on top of which we have builtamp-skimlinks.The link rewriter service allows extensions to rewrite the href of anchors on the page at click time. The service also fixes the issue of conflicts between extensions trying to rewrite a URL at the same time by a system of priorities defined by the publisher.
EDIT: The service now lives inside amp-skimlinks but was build with the idea of being common to all extensions.
Initial solution:
The service is composed of two main parts interacting with each others:
LinkRewriterA
LinkRewriterinstance allows you to specify in an asynchronous or synchronous way, which anchors should be replaced and by which replacement URLs.LinkRewriterManagerAs an AMP page can have multiple
LinkRewriterinstances running at the same time, we introducedLinkRewriterManagerwhich is managing all the registeredLinkRewriter.Example:
A publisher wants to use Skimlinks to replace some specific links on the page but wants to use another affiliate service to replace all the other links.
A global
LinkRewriterManagerinstance is installed when the mainamp.jsscript startsTypical flow is:
1 - One or multiple LinkRewriter are registered through the
LinkRewriterManager, each with:LinkRewritershould handle.2 - Each
LinkRewriterscans the page to find all the anchors (based on thelinkSelectoroption if provided).3 -
LinkRewriterask through the provided 'resolveUnknownLinks' function what is the replacement for each anchor found on the page and store the results in a cache (LinkReplacementCache).(4) -
LinkRewriterManagerdetects aDOM_CHANGEDevent and ask all the registeredLinkRewriterto re-scan the page (go back to 2)5 -
LinkRewriterManagerdetects a click on an anchor, iterates through the list of registered LinkRewriters by order of priority, asking one by one if they want to replace the link. If one LinkRewriter wants to replace the link, the iteration stops and LinkRewriter with a lower priority are ignored.6 - A "click" event is propagated through an event system to all the registered LinkRewriter so all the LinkRewriters can at least track the click regardless of if they have mutated the anchor.
Potential future improvement:
Chainable linkRewriters.
In the current solution, there is no way to chain href mutation on the URL, only one LinkRewriter is allowed to modify the link at a time. If this solution fits most of the affiliate services we could make it more generic for other types of URL rewriting. We could imagine a
chainable: true/falseoption to allow LinkRewriter to modify the URL without blocking the other LinkRewriter.Config based service.
amp-skimlinksis doing more than just replacing links, it has a lot of logic related to tracking and how we resolve links. However, we could imagine for simpler use-cases a config based "Link Rewriter Service".