Skip to content

Support mandatory_oneof and mandatory_anyof constraints#4285

Merged
westonruter merged 24 commits intodevelopfrom
add/support-mandatory-oneof
Feb 15, 2020
Merged

Support mandatory_oneof and mandatory_anyof constraints#4285
westonruter merged 24 commits intodevelopfrom
add/support-mandatory-oneof

Conversation

@kienstra
Copy link
Copy Markdown
Contributor

@kienstra kienstra commented Feb 12, 2020

Summary

Adds mandatory_anyof and mandatory_oneof to amphtml-update.py, and verifies against it in the sanitizer

Testing

  1. As the description of Add support for mandatory_oneof and mandatory_anyof attribute constraints #938 mentions, adding this to a Custom HTML block:
<iframe src="about:blank">

...now results in the <amp-iframe> being stripped, and the validation errors now include:

mandatory-oneof

  1. And adding this to a Custom HTML block:
<amp-list width="400" height="400"></amp-list>

...now results in this validation error:

yan-spec

Fixes #938

Checklist

  • My pull request is addressing an open issue (please create one otherwise).
  • My code is tested and passes existing tests.
  • My code follows the Engineering Guidelines (updates are often made to the guidelines, check it out periodically).

Add this to amphtml-update.py,
and verify against this in the validator.
@googlebot googlebot added the cla: yes Signed the Google CLA label Feb 12, 2020
I'm not sure if this case actually exists in the spec,
but this should still account for the case.
This didn't apply to this function,
now the return values are accurate.
@todo: account for this in the sanitizer,
like for mandatory_oneof
@westonruter
Copy link
Copy Markdown
Member

For updating the spec, please update at the same version that it was last updated from so that other spec changes are not included. That would be specifically 1910161528000 as done in #3619.

And update unit tests
to pass with this.
"""
if attribute_spec.HasField(constraint):
mandatory_of = getattr(attribute_spec, constraint).lstrip('[').rstrip(']').split(',')
spec_dict[constraint] = [oneof.strip(' ').strip("'") for oneof in mandatory_of]
Copy link
Copy Markdown
Member

@westonruter westonruter Feb 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will need to ensure that '[src]' becomes 'data-amp-bind-src', so replacing the first [ with data-amp-bind- and stripping the last ].

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good point.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applied in 39926b6...bcec1fa

'data-amp-bind-src' => array(
'mandatory_anyof' => array(
'src',
'[src]',
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should get generated as 'data-amp-bind-src'. See https://github.com/ampproject/amp-wp/pull/4285/files#r378581454.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, it's applied in 39926b6...bcec1fa

As Weston mentioned,
this value in a mandatory_anyof should be converted.
Ensure that it starts with [ and ends with ],
and prefix it with data-amp-bind-
If it doesn't meet the requirement,
strip the node.
Also, align @param tags and remove
unnecessary variables.
Improve the DocBlock for
check_attr_spec_rule_mandatory_number_of()
This is applied in this PR,
and no longer needed.
Before, this didn't show which
spec wasn't satisfied.
Now, if there is a 'mandatory_oneof' rule
for [ 'src', 'srcdoc' ],
this will display those attributes.
@kienstra kienstra marked this pull request as ready for review February 13, 2020 05:34
@kienstra
Copy link
Copy Markdown
Contributor Author

Hi @westonruter,
Thanks for looking at this. It should be ready for another round of review when you have a chance.

Thanks, and looking forward to talking tomorrow.

'src',
'srcdoc',
),
),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this isn't really something you can answer, but something seems off about the data format in the validator spec:

  attrs: {
    name: "src"
    mandatory_oneof: "['src', 'srcdoc']"
    value_url: {
      protocol: "data"
      protocol: "https"
      allow_relative: true  # Will be set to false at a future date.
    }
    blacklisted_value_regex: "__amp_source_origin"
  }
  attrs: {
    name: "srcdoc"
    mandatory_oneof: "['src', 'srcdoc']"
  }

It's also somewhat strange that "['src', 'srcdoc']" is serialized as pseudo-JSON, as opposed to being represented as an actual array in the protoascii.

Why is mandatory_oneof duplicated in both srcdoc and src?

In other words, it would seem more logical for the mandatory_oneof property to be placed under the tag_spec as opposed to duplicated inside of each attribute. For example:

diff --git a/includes/sanitizers/class-amp-allowed-tags-generated.php b/includes/sanitizers/class-amp-allowed-tags-generated.php
index b60135580..1aae2b432 100644
--- a/includes/sanitizers/class-amp-allowed-tags-generated.php
+++ b/includes/sanitizers/class-amp-allowed-tags-generated.php
@@ -3290,6 +3290,10 @@ class AMP_Allowed_Tags_Generated {
 					),
 				),
 				'tag_spec' => array(
+					'mandatory_oneof' => array(
+						'src',
+						'srcdoc',
+					),
 					'amp_layout' => array(
 						'supported_layouts' => array(
 							6,

Would it make sense to gather up the mandatory_oneof list across all attributes and then store them once in the tag_spec?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It turns out the value of the mandatory_oneof is just an opaque string that is used for the error message. The validator doesn't actually parse it. See the logic in the JS validator: https://github.com/ampproject/amphtml/blob/5fb4e7bb72012c2e088ffda951c3b815b6e1caba/validator/engine/validator.js#L4732-L4761

That is why the string is duplicated across the attributes, since it is just a key.

So in theory, instead of:

  attrs: {
    name: "src"
    mandatory_oneof: "['src', 'srcdoc']"
    value_url: {
      protocol: "data"
      protocol: "https"
      allow_relative: true  # Will be set to false at a future date.
    }
    blacklisted_value_regex: "__amp_source_origin"
  }
  attrs: {
    name: "srcdoc"
    mandatory_oneof: "['src', 'srcdoc']"
  }

The protoascii could have factored out the mandatory_oneof like so:

  attrs: {
    name: "src"
    value_url: {
      protocol: "data"
      protocol: "https"
      allow_relative: true  # Will be set to false at a future date.
    }
    blacklisted_value_regex: "__amp_source_origin"
  }
  attrs: {
    name: "srcdoc"
  }
  mandatory_oneof: ['src', 'srcdoc']

If we represented the data in this way, this would eliminate the current duplication of checking the same constraints across all attributes that have the same mandatory_oneof value in a given tag.

Copy link
Copy Markdown
Member

@westonruter westonruter Feb 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise, the only way to prevent the duplication now would seem to be to replicate what the JS validator is doing and instead of checking check_attr_spec_rule_mandatory_number_of inside the loop that iterates over the attributes, to instead check it once after they have been iterated over.

Nevertheless, it seems better to improve the data model by factoring out the duplicated data from the attributes and into the tag_spec.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's a good idea to remove the duplication in the generated PHP file. I'll work on applying it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

969def4 updates the generated file, but of course I need to update the PHP logic.

Thanks to Weston for the suggestion,
as makes more sense and reduces
duplication.
Now that this is in the 'tag_spec',
look for it there instead of in the 'attr_spec'.
Add tests for allowed <amp-list>
and <amp-iframe>.
This doesn't check the attr_spec anymore,
as the rule is in the tag_spec.
Still, the name shouldn't reflect where
it's looking for the rule.
A lot of funtion calls don't have spaces,
so remove these.
@kienstra
Copy link
Copy Markdown
Contributor Author

kienstra commented Feb 14, 2020

Hi @westonruter,
Thanks for waiting. This should be ready for another look whenever you have a chance.

Comment on lines +793 to +798
for attr_spec in attr:
if attr_spec.HasField(constraint):
spec_dict = {}
mandatory_of = getattr(attr_spec, constraint).strip('[]').split(',')
spec_dict[constraint] = [GetMandatoryOfAttr(oneof) for oneof in mandatory_of]
return spec_dict
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a suggestion to improve the logic here which would eliminate the need for GetMandatoryOfAttr. Given that the value is not actually parsed, what can be done is merely to check if the key exists for a given attribute, and of so, add the attribute name to the list for the constraint.

So this could be replaced with:

attributes = []
for attr_spec in attr:
	if attr_spec.HasField(constraint):
		attributes.append( attr_spec.name )
if len( attributes ) == 0:
	return None
return attributes

And then GetMandatoryOfAttr could be removed.

Is that right?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that's a great idea.

3b1dbf8 applies it.

kienstra and others added 3 commits February 14, 2020 13:08
Removes the need for GetMandatoryOfAttr(),
and moves some of its logic into
GetMandatoryOf().
Still, this doesn't look like it's always
alphabetical.
@westonruter
Copy link
Copy Markdown
Member

@kienstra I found that the MANDATORY_ONEOF_ATTR_MISSING error was conflating two different error scenarios: (1) missing any attributes and (2) duplicate exclusive attributes.

In f18f6c2 I've addressed that by introducing a new error code: DUPLICATE_ONEOF_ATTRS.

Given a post with this content:

<!-- wp:html -->
<amp-iframe src="about:blank"></amp-iframe>
<!-- /wp:html -->

<!-- wp:html -->
<amp-list width="400" height="400"></amp-list>
<!-- /wp:html -->

<!-- wp:html -->
<amp-iframe src="https://example.com" srcdoc="https://foo.com" width="200" height="100"></amp-iframe><!-- Bad because both present! -->
<!-- /wp:html -->

The errors in the validated URL screen are now:

image

@westonruter
Copy link
Copy Markdown
Member

The message for the first validation error (INVALID_URL_PROTOCOL) needs to be improved in the context of #1420.

@kienstra Please review my enhancements and let me know if they check out.

Copy link
Copy Markdown
Contributor Author

@kienstra kienstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

Hi @westonruter,
Nice improvements, the new DUPLICATE_ONEOF_ATTRS error helps.

There's a minor comment below.

'mandatory_oneof' => array(
'src',
'srcdoc',
),
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, this looks to be the same as the mandatory_oneof spec below:

'mandatory_oneof' => array(
	'src',
	'srcdoc',
),

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that's bizzare. How is Python generating an array with duplicate keys?

				'tag_spec' => array(
					'mandatory_oneof' => array(
						'src',
						'srcdoc',
					),
					'amp_layout' => array(
						'supported_layouts' => array(
							6,
							2,
							3,
							7,
							9,
							1,
							4,
						),
					),
					'mandatory_oneof' => array(
						'src',
						'srcdoc',
					),
					'requires_extension' => array(
						'amp-iframe',
					),
				),

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just re-ran ./bin/amphtml-update.sh and it removed the duplicate: 4fb49da.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's a little strange.

const ATTR_REQUIRED_BUT_MISSING = 'ATTR_REQUIRED_BUT_MISSING';
const MANDATORY_ANYOF_ATTR_MISSING = 'MANDATORY_ANYOF_ATTR_MISSING';
const MANDATORY_ONEOF_ATTR_MISSING = 'MANDATORY_ONEOF_ATTR_MISSING';
const DUPLICATE_ONEOF_ATTRS = 'DUPLICATE_ONEOF_ATTRS';
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, having MISSING in the error code when there are too many is misleading.


if ( ! $this->is_matched_attribute_count_acceptable( $matched_attribute_count, $constraint_type ) ) {
return $tag_spec[ $constraint_type ];
private function get_element_attribute_intersection( DOMElement $element, $attribute_names ) {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea to have this return the intersection.

$title = __( 'Missing exclusive mandatory attribute', 'amp' );
$attributes_key = 'mandatory_oneof_attrs';
} else {
$title = __( 'Missing at least one mandatory attribute', 'amp' );
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, it's much more descriptive than before:

amp-list-here

@westonruter westonruter added this to the v1.5 milestone Feb 15, 2020
@westonruter westonruter merged commit 16c3cc9 into develop Feb 15, 2020
@westonruter westonruter deleted the add/support-mandatory-oneof branch February 15, 2020 20:45
@kienstra
Copy link
Copy Markdown
Contributor Author

🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla: yes Signed the Google CLA

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add support for mandatory_oneof and mandatory_anyof attribute constraints

3 participants