Refactor validator specification integration

Some of the validation in our sanitizers is based on the official validator specification maintained in https://github.com/ampproject/amphtml.

#### Current Process
1. Clone the https://github.com/ampproject/amphtml repository.
2. Use Python to assemble all protoascii files (main + extensions).
3. Use Python to parse the assembled protoascii file into a PHP class containing the spec in array form (see https://github.com/ampproject/amp-wp/blob/1.4.4/includes/sanitizers/class-amp-allowed-tags-generated.php).
4. Load the PHP class into the plugin and traverse the provided arrays in several ways to power the sanitizer logic.
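
For illustration, step 2 can be sketched roughly as follows. This is a minimal sketch, not the actual script used by the plugin: the function name `assemble_protoascii`, its parameters, and the assumed file layout (one main file plus `validator-*.protoascii` files in per-extension subdirectories) are assumptions made for this example.

```python
from pathlib import Path


def assemble_protoascii(main_file: Path, extensions_dir: Path) -> str:
    """Concatenate the main spec file and all extension spec files.

    Assumed layout (hypothetical, for illustration only):
    extensions_dir/<extension-name>/validator-<something>.protoascii
    """
    parts = [main_file.read_text(encoding="utf-8")]
    # Sort the paths so the assembled output is stable between runs.
    for path in sorted(extensions_dir.glob("*/validator-*.protoascii")):
        parts.append(path.read_text(encoding="utf-8"))
    return "\n".join(parts)
```

Note that even this small step needs a full checkout of the repository on disk, which is where the time and bandwidth cost of step 1 comes from.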

#### Problems with this process that need solving
- The project cloned in step 1 is very large, so cloning it takes a lot of time and bandwidth.
- Steps 2 and 3 require Python, which adds an entire additional ecosystem to the project's dependencies just to parse the spec.
- Step 4 loads the spec in an all-or-nothing fashion. To validate a single value, the complete spec (more specifically, the subset we converted to PHP) needs to be loaded into memory.
- The class loaded in step 4 is a huge list of arrays within arrays that contain repeated strings and other values that could be normalized.
- Because the data used in step 4 is stored in arrays, which are efficient for key-based retrieval but inefficient for value-based retrieval, we end up with a structure that is optimized mostly for one predetermined dimension and ill-suited to any other dimension, including ones that might only come up in future requirements (see https://github.com/ampproject/amp-wp/pull/3817 for an example).
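
The last point can be illustrated with a small sketch. It uses Python dictionaries as a stand-in for the PHP arrays, and the `SPEC` data, the attribute lists, and all function names are hypothetical simplifications, not the real generated spec.

```python
from collections import defaultdict

# Hypothetical, heavily simplified stand-in for the generated spec:
# keyed by tag name, which is the one dimension it is optimized for.
SPEC = {
    "amp-img": {"attrs": ["src", "srcset", "layout"]},
    "amp-video": {"attrs": ["src", "poster", "layout"]},
    "amp-audio": {"attrs": ["src"]},
}


def attrs_for_tag(tag):
    """Key-based retrieval: a single lookup."""
    return SPEC[tag]["attrs"]


def tags_allowing_attr_scan(attr):
    """Value-based retrieval: every entry has to be visited."""
    return sorted(tag for tag, rule in SPEC.items() if attr in rule["attrs"])


def build_attr_index(spec):
    """One possible remedy: a secondary index keyed by attribute name."""
    index = defaultdict(set)
    for tag, rule in spec.items():
        for attr in rule["attrs"]:
            index[attr].add(tag)
    return index
```

With the secondary index, a question such as "which tags allow `layout`?" becomes a key lookup again, at the cost of building and storing one extra structure per dimension that needs to be queried.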

Refactor validator specification integration #4566