Add sanitizer arg to prevent noscript fallbacks; preemptively remove disallowed attributes from fallback#2131
Conversation
felixarntz
left a comment
There was a problem hiding this comment.
Fix looks solid.
One thing I'd like to change though is that we now have the same code four times. I think we should introduce a trait to make it more straightforward.
| } | ||
| foreach ( $disallowed_attributes as $disallowed_attribute ) { | ||
| $old_node->removeAttribute( $disallowed_attribute ); | ||
| } |
There was a problem hiding this comment.
Why not iterate only once and immediately remove the attributes instead of populating first and then doing another round of removing? If the problem is remove attributes modifying the $old_node->attributes property, then we could assign it to another variable to use.
There was a problem hiding this comment.
The problem is that the attributes are a live DOM list and if a node is removed while iterating then a node gets skipped. This pattern of gathering up the list of nodes and then removing them on a second iteration can be seen elsewhere in the codebase. It's not ideal.
|
We could also create an |
|
I think the For example, if other elements than media could benefit from that noscript fallback in the future, we could use the trait, but a media base class wouldn't fit in that case. The latest commit includes a trait that does it. We could still easily modify it to become a base class if this is the decision. |
| * @param array $default_args Original defaults. | ||
| * @return array Extended defaults. | ||
| */ | ||
| protected function merge_default_args( array $default_args ) { |
There was a problem hiding this comment.
Is this the right name? When we have other traits then they would potentially have their own default args to merge, so then this should be named uniquely for this trait.
There was a problem hiding this comment.
Maybe this method isn't even needed? Maybe better to just have each class that uses it define the default arg like they do for other args?
There was a problem hiding this comment.
Also, some sanitizers may want different defaults.
| * | ||
| * @param string $tag Tag name to get allowed attributes for. | ||
| */ | ||
| protected function initialize_allowed_attributes( $tag ) { |
There was a problem hiding this comment.
This is very specific to the noscript fallbacks, as it assumes there is it one tag spec for the given tag, which is true for the noscript fallback tag specs. In reality AMP has multiple tag specs for each tag name. So maybe this should also be renamed to be specific to the noscript fallback case.
| return; | ||
| } | ||
|
|
||
| $this->initialize_noscript_allowed_attributes( self::$tag ); |
There was a problem hiding this comment.
This should only happen if the add_noscript_fallback arg is true, no?
| return; | ||
| } | ||
|
|
||
| $this->initialize_noscript_allowed_attributes( self::$tag ); |
| return; | ||
| } | ||
|
|
||
| $this->initialize_noscript_allowed_attributes( self::$tag ); |
| return; | ||
| } | ||
|
|
||
| $this->initialize_noscript_allowed_attributes( self::$tag ); |
When a disallowed attribute appears on an input
img,video,audio, oriframecurrently the validation error will be duplicated once for the replaced AMP component and again for the original element added inside as anoscript. This PR prevents a duplication of the validation error by removing the disallowed attributes preemptively.Also, in case the
noscriptfallback elements are not desired, aadd_noscript_fallbackarg is introduced for the image, audio, video, and iframe sanitizers to prevent it from being added (this is needed for AMP stories, and these commits were picked from #2107).Lastly, as reported in ampproject/amphtml#21686 the
amp-imgallows the HTTP protocol whereas theamp-img > noscript > imgdoes not. So this PR prevents adding thenoscript > imgfallback if it lacks the HTTPS protocol.Fixes #2126.