Conversation
|
@westonruter We already discussed having not only a PSR-2 package for the optimizer and a WPCS package for the plugin, but also a shared package with common code. The first optimizer I'm working on ( @pierlon Already wrote a PHP version of most of this, so writing it again for the Optimizer doesn't make sense. The Optimizer also shouldn't have the plugin package as a dependency. The clean solution is to extract it out into this third shared package (naming to be discussed). However, this would mean for now (while we have everything in one repo) that I'm adding a third namespace root already via this PR (which is already adding the second Should we go for "clean separation" for now and later deal with the many subfolders either through separate repos or lerna, or do you want me to just copy that shared code into the Optimizer for now (which might lead to inconsistencies that can result in bugs when moving over to a shared library). |
8835c9a to
3b31720
Compare
|
The Go optimizer (powering the Google Cache) and the NodeJS optimizer handle amp-stories differently. The NodeJS optimizer behaves the way the Go version states its future intent: https://github.com/ampproject/amppackager/blob/c9993b8ac4d17d1f05d3a1289956dadf3f9c370a/transformer/transformers/serversiderendering.go#L149 With NodeJS, the presence of an I'd suggest going with the NodeJS version in this case. @westonruter, @sebastianbenz thoughts? |
Having clean separation with different directories for the various packages makes sense to me. At a future point we can split them into separate repos (for those which aren't left in place) when we make the packages available. It will be easier now to keep everything in one repo while we iterate. |
|
@schlessera re amp-story: the Go version is correct. I've filed ampproject/amp-toolbox#567 for Optimizer. |
Alright. I'll go with |
westonruter
left a comment
There was a problem hiding this comment.
I haven't finished my review but I wanted to submit what I've seen so far.
src/Dom/Document.php
Outdated
| /** | ||
| * Restore the emoji AMP symbol (⚡) from its pure text placeholder. | ||
| * | ||
| * @param string $html HTML string to restore the AMP emoji symbol in. | ||
| * @return string Adapted HTML string. | ||
| */ | ||
| private function restore_amp_emoji_attribute( $html ) { | ||
| return preg_replace( '/(<html [^>]*?)' . preg_quote( self::EMOJI_AMP_ATTRIBUTE, '/' ) . '="([^"]*)"/i', '\1⚡\2', $html, 1 ); | ||
| } |
There was a problem hiding this comment.
Is restore_amp_emoji_attribute needed? We should let the post-processor be adding the amp attribute, as it is currently doing. Additionally, in the case for when there is excessive CSS on a site in the Standard template mode, we serve an AMP document without the amp attribute. So that should be retained.
There was a problem hiding this comment.
I'm trying to build this so that we can use the language-independent testsuite provided at https://github.com/ampproject/amp-toolbox/tree/master/packages/optimizer/spec/transformers/valid
We can of course either ignore these tests, or modify them on-the-fly to adapt them to our specific requirements. But I thought it might be worthwhile to stick as closely to the testable spec as possible within the PHP libraries...?
src/Dom/Document.php
Outdated
| /** | ||
| * Covert the emoji AMP symbol (⚡) into pure text. | ||
| * | ||
| * The emoji symbol gets stripped by DOMDocument::loadHTML(). | ||
| * | ||
| * @param string $source Source HTML string to convert the emoji AMP symbol in. | ||
| * @return string Adapted source HTML string. | ||
| */ | ||
| private function convert_amp_emoji_attribute( $source ) { | ||
| return preg_replace( '/(<html [^>]*?)⚡([^\s^>]*)/i', '\1' . self::EMOJI_AMP_ATTRIBUTE . '="\2"', $source, 1 ); | ||
| } |
There was a problem hiding this comment.
I think convert_amp_emoji_attribute should just remove the attribute altogether instead of replacing it with self::EMOJI_AMP_ATTRIBUTE. We will add the amp attribute during post-processing.
There was a problem hiding this comment.
As stated in #4019 (comment), the goal is to be able to use the language-independent test suite.
| const ATTRIBUTES_STRING = 'Cannot remove boilerplate as either heights, media or sizes attribute is set: '; | ||
| const RENDER_DELAYING_SCRIPT_STRING = 'Cannot remove boilerplate because the document contains a render-delaying extension: '; | ||
| const AMP_AUDIO_STRING = 'Cannot remove boilerplate because the document contains an extension that needs to know the dimensions of the browser: '; | ||
| const UNSUPPORTED_LAYOUT_STRING = 'Cannot remove boilerplate because of an unsupported layout: '; |
There was a problem hiding this comment.
These messages should perhaps be translatable, as we'd want to show them in the validated URL screen.
There was a problem hiding this comment.
We're not in WordPress here, so we need to discuss how to actually implement translatable strings. We can use regular gettext, but we might need a way to properly map everything.
| const INVALID_INPUT_WIDTH = 'Cannot perform serverside rendering, invalid input width: '; | ||
| const INVALID_INPUT_HEIGHT = 'Cannot perform serverside rendering, invalid input height: '; | ||
| const UNSUPPORTED_LAYOUT = 'Cannot perform serverside rendering, unsupported layout: '; |
There was a problem hiding this comment.
These should be translatable because we'd want to show them on the validated URL screen.
| /** | ||
| * Check whether the configuration has a given setting. | ||
| * | ||
| * @param string $key Configuration key to look for. | ||
| * @return bool Whether the requested configuration key was found or not. | ||
| */ | ||
| public function has($key) | ||
| { | ||
| return array_key_exists($key, $this->configuration); | ||
| } | ||
|
|
||
| /** | ||
| * Get the value for a given key from the configuration. | ||
| * | ||
| * @param string $key Configuration key to get the value for. | ||
| * @return mixed Configuration value for the requested key. | ||
| * @throws UnknownConfigurationKey If the key was not found. | ||
| */ | ||
| public function get($key) | ||
| { | ||
| if (! array_key_exists($key, $this->configuration)) { | ||
| throw UnknownConfigurationKey::fromKey($key); | ||
| } | ||
|
|
||
| return $this->configuration[$key]; | ||
| } |
There was a problem hiding this comment.
Should this class implement ArrayAccess or Traversable or the like?
There was a problem hiding this comment.
ArrayAccess would mean we make the configuration mutable, or at leat add an offsetSet() that throws an exception. I don't really see what the benefit would be.
For Traversable, I can see us implement that if we have an actual use case where we need to iterate over the configuration. Right now, this is not needed, so adding it just for the sake of it seems premature.
| private function hasAncestorWithTag(DOMElement $element, $tagName) | ||
| { | ||
| $parent = $element->parentNode; | ||
| while ($parent !== null) { | ||
| if ($parent instanceof DOMElement && $parent->tagName === $tagName) { | ||
| return true; | ||
| } | ||
| $parent = $parent->parentNode; | ||
| } | ||
| return false; | ||
| } |
There was a problem hiding this comment.
Since this method is going to be called a lot, perhaps it should be optimized better? When collecting the amp_elements can we omit the ones that are inside of template elements? Perhaps something to add to the Document class? For example:
case 'templateless_amp_elements':
$this->templateless_amp_elements = $this->xpath->query( ".//*[ starts-with( name(), 'amp-' ) and not( ancestor::template/ancestor::* = current() ) ]", $this->body );There was a problem hiding this comment.
We can benchmark and optimize this in a subsequent PR.
| // If textnode is not JSON parsable, then not used. | ||
| $experiment = json_decode($json, /*$assoc*/ true); |
There was a problem hiding this comment.
What if the value is null? Should this not actually use json_last_error()?
There was a problem hiding this comment.
If the value is null, we'll get null as a result, which is still a valid reason to skip.
| } | ||
|
|
||
| // If JSON is empty, then not used. | ||
| if (count($experiment) === 0) { |
There was a problem hiding this comment.
Should this not also check if it is a countable?
| if (count($experiment) === 0) { | |
| if (!is_array($experiment) || count($experiment) === 0) { |
There was a problem hiding this comment.
Actually, just checking for empty($experiment) should cover all cases correctly.
There was a problem hiding this comment.
Hmm, is an experiment valid if it is just a scalar like 0? I don't think so...
This commit moves code around so that all code to be externalized lands in `lib/<package>`. These are meant to be separate roots, so they have their own Composer file and testing setup (a simple one for now). These "external" packages are then pulled in via Composer using `"path"` repositories with relative URIs. To work on an individual package: ``` cd lib/<package> composer install composer test ```
…/amp-wp into add/958-amp-optimizer
| /** | ||
| * Always ensure we have a <style amp-boilerplate> tag. | ||
| */ | ||
| protected function ensure_boilerplate_is_present() { | ||
| $style = $this->dom->xpath->query( './style[ @amp-boilerplate ]', $this->dom->head )->item( 0 ); | ||
|
|
||
| if ( ! $style ) { | ||
| $style = $this->dom->createElement( Tag::STYLE ); | ||
| $style->setAttribute( Attribute::AMP_BOILERPLATE, '' ); | ||
| $style->appendChild( $this->dom->createTextNode( amp_get_boilerplate_stylesheets()[0] ) ); | ||
| } else { | ||
| $style->parentNode->removeChild( $style ); // So we can move it. | ||
| } | ||
|
|
||
| $this->dom->head->appendChild( $style ); | ||
|
|
||
| $noscript = $this->dom->xpath->query( './noscript[ style[ @amp-boilerplate ] ]', $this->dom->head )->item( 0 ); | ||
|
|
||
| if ( ! $noscript ) { | ||
| $noscript = $this->dom->createElement( Tag::NOSCRIPT ); | ||
| $style = $this->dom->createElement( Tag::STYLE ); | ||
| $style->setAttribute( Attribute::AMP_BOILERPLATE, '' ); | ||
| $style->appendChild( $this->dom->createTextNode( amp_get_boilerplate_stylesheets()[1] ) ); | ||
| $noscript->appendChild( $style ); | ||
| } else { | ||
| $noscript->parentNode->removeChild( $noscript ); // So we can move it. | ||
| } | ||
|
|
||
| $this->dom->head->appendChild( $noscript ); | ||
| } |
There was a problem hiding this comment.
This doesn't really make sense to be part of the Meta sanitizer, if “Meta” is referring to meta tags, but we can move it into a separate sanitizer later.
There was a problem hiding this comment.
Or rather, maybe this sanitizer should be renamed to be something like AMP_Ensure_Required_Markup_Sanitizer.
|
Sorry for leaving you so much work left undone, I didn't get nearly as much time yesterday to work on this as expected. |
|
You've done enough 😄 |

Summary
Adds an AMP Optimizer library to generate Optimized (aka Transformed) AMP pages from regular AMP pages.
This PR does the following:
amp/commonpackage (pulled-in via a Composerpathrepository for now)amp/optimizerpackage (pulled-in via apathrepository for now)AmpBoilerplateAmpRuntimeCssReorderHeadServerSideRenderingTransformedIdentifieramp/optimizerlibrary to theprepare_response()to optimize our Amp outputAmpSchemaOrgMetadatato test and demonstrate extensibilityamp/commonpackage where appropriateFixes #958
Fixes #3282
Depends on:
Checklist