Handles icon data URL for PR 31763 #32276
Merged
getdave merged 1 commit into WordPress:try/retrieve-more-data-from-url-details-api on May 27, 2021
Merged commit 1618f19 into WordPress:try/retrieve-more-data-from-url-details-api
getdave added a commit that referenced this pull request on May 29, 2021
…oint (#31763)

* Add basic regex to grab site icon
* Retrieve meta description
* Ensure cleanup
* Improve title regex to account for possible attributes on title
* Retrieve OG Image
* Fix linting
* Fix tests to assert on array subset
* Enhance fixture data with more edge cases
* Add tests to ensure new properties are captured for icon, description and image.
* Add more specific yet flexible test for title
* Handle relative resource URLs for icon and image
* Use random user agent string to avoid being blocked by certain websites.
* Account for open graph image property variations
* Add unit test for get_title
* Add tests (including some failing) for get_icon
* Fix method invocation to remove unused args
* Wrap test HTML string in a basic HTML doc.
* Parse the head section and use for comparison
* Fix broken cache test
* Refine wrap method
* Add get_image tests
* Handle relative URLs when target url has a path
* Improves title and icon parsing for PR 31763 (#32021)
* Title: removes malformed opening tag pattern and adds tests.
* Icon: Allows for different ordering of attribute. Adds happy and unhappy test data.
* Icon: allow for any order or combination of attributes.

  How? Get the icon link element first. Then grab its href.

  Benefits:
  - Not dependent upon the order of attributes
  - Allows for optional or custom attributes
* Icon: allows for single, double, or no quotes around attributes.
* Update for WPCS standard.
* Seek head but fallback to body.
* Improves metadata parsing for PR 31763 (#32067)
* Description: uses regex instead of tmp file.
* Adding test to check for like tag before and after target.
* Description: changes regex strategy.

  Why? Lookahead was not constrained within each element and thus picked up <meta from one and then, if not a match, grabbed the name and content from another upstream.

  The new strategy parses all meta elements with a content attribute. Then loops through them to find the description element.

  Why this order? The content attribute can contain HTML tags. The > or /> symbol is matched as the end of the meta element (its closing symbol). If this happens, the content is truncated. Boo. Switching the parsing order solves this problem.

  Bonus: allows for pre-parsing of all meta elements. Performance boost.
* Refactors getting meta with content elements for reuse.
* Improves getting <head>..</head> element.
  - Isolates to only the <head>..</head> element by stripping all content before the opening tag and ensuring it includes a closing </head> tag.
  - Performance improvements:
    - Bails out early if no opening tag is found.
    - Uses native string functions instead of regex.
* Image: use same parsing strategy as description.
* Refactor to reuse the process for getting the metadata from the list of meta elements.
* Convert description HTML entities into HTML.
* Improves PR 31763 for the URL Details Controller (#32162)
* Code standards and consistency.
* Removed unused data provider.
* More formatting and standards.
* Title: converts entities.
* Fixes asserts: removes deprecated array subset, uses assertSame, and makes consistent.
* Fixes method return signatures.
* Remove HTML and convert non-HTML entities.
* Removes type check from set_cache as data will be string type.
* Update lib/class-wp-rest-url-details-controller.php

  Co-authored-by: Tonya Mork <hello@hellofromtonya.com>
* Update lib/class-wp-rest-url-details-controller.php

  Co-authored-by: Tonya Mork <hello@hellofromtonya.com>
* Update lib/class-wp-rest-url-details-controller.php

  Co-authored-by: Tonya Mork <hello@hellofromtonya.com>
* Icon: if data url, skip relative-to-absolute conversion (#32276)
* Fix failing test due to extra character in expected string.
* Updates schema for new data items.
* Changes icon and image type to uri.
* Schema: icon & image: reverts type back to string and adds format of uri.

Co-authored-by: Tonya Mork <hello@hellofromtonya.com>
Description
Improvement for PR #31763
The icon can be a data URL. If it is, skip the relative-to-absolute URL conversion and return it.
@getdave identified the use case here #31763 (comment).
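The check described above can be sketched as follows. This is a minimal illustration in Python, not the PR's code: the actual change lives in the PHP controller, and the helper name `resolve_icon_url` and its parameters are hypothetical. It assumes a data URL is recognized by its `data:` scheme prefix, compared case-insensitively.

```python
from urllib.parse import urljoin


def resolve_icon_url(icon_href: str, page_url: str) -> str:
    """Return the icon URL to expose for a page.

    Hypothetical helper for illustration only; the name and signature
    are assumptions, not taken from the controller in the PR.
    """
    icon_href = icon_href.strip()

    # A data URL already embeds the icon, so there is nothing to resolve:
    # return it unchanged and skip the relative-to-absolute conversion.
    if icon_href.lower().startswith("data:"):
        return icon_href

    # Otherwise resolve the (possibly relative) href against the page URL.
    return urljoin(page_url, icon_href)


if __name__ == "__main__":
    page = "https://example.com/blog/post"
    print(resolve_icon_url("data:image/png;base64,iVBORw0KGgo=", page))
    print(resolve_icon_url("/favicon.ico", page))
    print(resolve_icon_url("favicon.ico", page))
```

With the page URL above, the data URL comes back untouched, `/favicon.ico` resolves against the site root, and `favicon.ico` resolves against the `/blog/` path.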
Examples:
Works as expected. Tested the changes on older PHP versions up to PHP 8.0.x, as shown here: https://3v4l.org/aaDc9