SimplePie: Prepare PubSubHubbub for PSR-18 clients#838
SimplePie: Prepare PubSubHubbub for PSR-18 clients#838jtojnar merged 13 commits intosimplepie:masterfrom
Conversation
|
@mblaney Are there any objections? |
|
I will try to get back to SimplePie development on weekend. |
no this looks good to me 👍 |
48bab64 to
b9fa194
Compare
b9fa194 to
4818464
Compare
4818464 to
565b759
Compare
|
We should add some tests. |
|
Doing some more research, I am still not completely sure what the point is. It was introduced in 01eb2c5 and refactored in 426dc5b. The only idea for why we could have been promoting links to heasers is to allow But searching GitHub, I only found one relevant reference: a chat log of a discussion, whose conclusion was that clients MUST NOT support So perhaps this should be excised? |
905b3ed to
95e8eab
Compare
|
Also removing support for |
|
This now also appears to be blocked by #921 |
mf2 library uses implicit nullable types, which were deprecated in PHP 8.4: https://wiki.php.net/rfc/deprecate-implicitly-nullable-types This causes the response body of the HTTP mock server in tests to be polluted with the following warning: Deprecated: Mf2\Parser::parse(): Implicitly marking parameter $context as nullable is deprecated, the explicit nullable type must be used instead in vendor/mf2/mf2/Mf2/Parser.php on line 1373 This breaks tests in <#838> and <#919>. Until the upstream fix (microformats/php-mf2#264) is merged, let’s uninstall mf2 in PHP 8.4 CI. Works around: #921 Co-authored-by: Artur Weigandt <Art4@users.noreply.github.com>
This was artifact of copy paste in ae6665d
So that we can re-use it in integration tests.
ae6665d already introduced an unit test for `Parser` but let’s also check it is working properly in `SimplePie`.
`Locator::get_rel_link()` would return `null`, giving us a type error. Also drop redundant param arguments.
This needs to use a web server because links are only extracted from HTML files in `SimplePie::fetch_data()`, which is not called when `SimplePie::set_raw_data()` is used. And using `SimplePie::set_feed_url()` with local file will not work either because `Locator::is_feed()` returns `true` for local files.
…rays If header is present, it must have at least one header line. This will be needed to make PHPStan happy with `Response::with_header()` we are about to introduce.
This is similar to `Psr\Http\Message\MessageInterface::withHeader()` but does not preserve the casing of the header. It will be useful for storing metadata (e.g. WebSub link) in the response. We cannot use `static` as return type hint since it is only supported in PHP 8.0: https://php.watch/versions/8.0/static-return-type Nor can we use `self`, as that requires return covariance introduced in PHP 7.4: https://wiki.php.net/rfc/covariant-returns-and-contravariant-parameters
It would clear previous links by setting the header to comma instead of appending.
…in header As per <https://pubsubhubbub.github.io/PubSubHubbub/pubsubhubbub-core-0.4.html#discovery>: > In the absence of HTTP [RFC2616] Link headers, subscribers MAY fall back to other methods to discover the hub(s) and the canonical URI of the topic. […] Similarly, for HTML pages, it MAY use embedded link elements as described in Appendix A of Web Linking [RFC5988].
Previously, this only supported `File`, let’s prepare for other `Response` implementations.
It was introduced in 01eb2c5 and refactored in 426dc5b. However, according to chat log of a discussion between the writers of the PubSubHubbub specification, the consensus was that, due to a potential for injection attacks on poorly written websites, clients MUST NOT support `a[rel="hub"]`. See https://chat.indieweb.org/dev/2016-11-21#t1479767303222000 and https://www.github.com/w3c/websub/issues/67 We will thus remove support for it. Similar concerns apply to `link` elements in headers but per <https://www.w3.org/TR/websub/#x8-1-discovery>: > The decision about whether a subscriber should look for <link> elements inside a page's <body> (as well as the <head>) is not straightforward, and there is currently no clear consensus. So we will continue to support `link`s in `body`.
The specification is not completely clear on what should happen if there is `Link` header containing `rel=hub` link but `rel=self` is only present in `<link>` element: <https://w3c.github.io/websub/#discovery>: > The protocol currently supports the following discovery mechanisms. Publishers *MUST* implement at least one of them: > > - Link Headers [RFC5988]: the publisher *SHOULD* include at least one Link Header [RFC5988] with `rel=hub` (a hub link header) as well as exactly one Link Header [RFC5988] with `rel=self` (the self link header) > - If the topic is an XML based feed, publishers *SHOULD* use embedded link elements as described in Appendix B of Web Linking [RFC5988]. Similarly, for HTML pages, publishers *SHOULD* use embedded link elements as described in Appendix A of Web Linking [RFC5988]. Previously, if we found 'Link' header containing `rel=hub`, we would expect `rel=self` in a header as well. Let’s treat both `rel`s independently. As before, headers take precedence over HTML elements.
According to <https://www.w3.org/TR/websub/#x8-1-discovery>, there is no clear consensus about `link` elements in `body` but there are concerns about injection attacks on badly written websites: > The decision about whether a subscriber should look for `<link>` elements inside a page's `<body>` (as well as the `<head>`) is not straightforward, and there is currently no clear consensus. One reason to ignore the `<body>` during discovery is that some web sites might (perhaps accidentally) allow users to post content containing `<link>` elements, though the working group does not know of any specific examples of such sites. If WebSub discovery uses such `<link>` elements, a user contributing to such sites could potentially maliciously cause all subscribers to use an alternate hub which later delivers malicious content. Given this potential attack, it may be prudent to do discovery only in the `<head>` of HTML documents. Let’s be prudent and only look for `link`s in HTTP headers and the first `head` element.
Art4
left a comment
There was a problem hiding this comment.
Thanks for the great work. 👍
Previously, this only supported
File, let’s prepare for otherResponseimplementations.Also fixes a bug where it would clear previous links (by setting it to comma instead of appending).
And where it would add redundant
rel=selflink.