Skip to content

SimplePie: Prepare PubSubHubbub for PSR-18 clients#838

Merged
jtojnar merged 13 commits intosimplepie:masterfrom
jtojnar:websub-response
Jul 29, 2025
Merged

SimplePie: Prepare PubSubHubbub for PSR-18 clients#838
jtojnar merged 13 commits intosimplepie:masterfrom
jtojnar:websub-response

Conversation

@jtojnar
Copy link
Member

@jtojnar jtojnar commented May 31, 2023

Previously, this only supported File, let’s prepare for other Response implementations.

Also fixes a bug where it would clear previous links (by setting it to comma instead of appending).
And where it would add redundant rel=self link.

@Art4 Art4 mentioned this pull request Sep 13, 2023
7 tasks
@Art4
Copy link
Contributor

Art4 commented Jan 10, 2024

@mblaney Are there any objections?

@jtojnar
Copy link
Member Author

jtojnar commented Jan 10, 2024

I will try to get back to SimplePie development on weekend.

@mblaney
Copy link
Member

mblaney commented Jan 13, 2024

@mblaney Are there any objections?

no this looks good to me 👍

@jtojnar jtojnar force-pushed the websub-response branch 2 times, most recently from 48bab64 to b9fa194 Compare May 18, 2024 16:29
@pull-request-size pull-request-size bot added size/L and removed size/M labels May 18, 2024
@jtojnar jtojnar marked this pull request as draft May 18, 2024 16:29
@Art4 Art4 added this to the 1.9.0 milestone Sep 27, 2024
@jtojnar jtojnar marked this pull request as ready for review September 29, 2024 16:05
@jtojnar jtojnar marked this pull request as draft September 29, 2024 16:09
@Art4
Copy link
Contributor

Art4 commented Jun 26, 2025

We should add some tests.

@jtojnar
Copy link
Member Author

jtojnar commented Jun 27, 2025

Doing some more research, I am still not completely sure what the point is.

It was introduced in 01eb2c5 and refactored in 426dc5b.

The only idea for why we could have been promoting links to heasers is to allow a[rel="hub"] to be found by SimplePie::get_link() (since link elements should be picked up automatically).

But searching GitHub, I only found one relevant reference: a chat log of a discussion, whose conclusion was that clients MUST NOT support a[rel="hub"].

So perhaps this should be excised?

@pull-request-size pull-request-size bot added size/M and removed size/L labels Jun 27, 2025
@pull-request-size pull-request-size bot added size/L and removed size/M labels Jun 29, 2025
@jtojnar jtojnar marked this pull request as ready for review June 29, 2025 13:10
@jtojnar jtojnar force-pushed the websub-response branch 2 times, most recently from 905b3ed to 95e8eab Compare June 29, 2025 13:31
@jtojnar
Copy link
Member Author

jtojnar commented Jun 29, 2025

Also removing support for a[rel].

@jtojnar jtojnar requested a review from Art4 June 29, 2025 14:55
@jtojnar
Copy link
Member Author

jtojnar commented Jul 1, 2025

This now also appears to be blocked by #921

This was referenced Jul 27, 2025
jtojnar added a commit that referenced this pull request Jul 29, 2025
mf2 library uses implicit nullable types, which were deprecated in PHP 8.4:
https://wiki.php.net/rfc/deprecate-implicitly-nullable-types

This causes the response body of the HTTP mock server in tests to be polluted with the following warning:

    Deprecated: Mf2\Parser::parse(): Implicitly marking parameter $context as nullable is deprecated, the explicit nullable type must be used instead in vendor/mf2/mf2/Mf2/Parser.php on line 1373

This breaks tests in <#838> and <#919>.

Until the upstream fix (microformats/php-mf2#264) is merged, let’s uninstall mf2 in PHP 8.4 CI.

Works around: #921

Co-authored-by: Artur Weigandt <Art4@users.noreply.github.com>
jtojnar added 13 commits July 29, 2025 09:10
This was artifact of copy paste in ae6665d
So that we can re-use it in integration tests.
ae6665d already introduced an unit test for `Parser` but let’s also check it is working properly in `SimplePie`.
`Locator::get_rel_link()` would return `null`, giving us a type error.

Also drop redundant param arguments.
This needs to use a web server because links are only extracted from HTML files in `SimplePie::fetch_data()`, which is not called when `SimplePie::set_raw_data()` is used. And using `SimplePie::set_feed_url()` with local file will not work either because `Locator::is_feed()` returns `true` for local files.
…rays

If header is present, it must have at least one header line.

This will be needed to make PHPStan happy with `Response::with_header()` we are about to introduce.
This is similar to `Psr\Http\Message\MessageInterface::withHeader()`
but does not preserve the casing of the header.

It will be useful for storing metadata (e.g. WebSub link) in the response.

We cannot use `static` as return type hint since it is only supported in PHP 8.0:
https://php.watch/versions/8.0/static-return-type
Nor can we use `self`, as that requires return covariance introduced in PHP 7.4:
https://wiki.php.net/rfc/covariant-returns-and-contravariant-parameters
It would clear previous links by setting the header to comma instead of appending.
…in header

As per <https://pubsubhubbub.github.io/PubSubHubbub/pubsubhubbub-core-0.4.html#discovery>:

> In the absence of HTTP [RFC2616] Link headers, subscribers MAY fall back to other methods to discover the hub(s) and the canonical URI of the topic. […] Similarly, for HTML pages, it MAY use embedded link elements as described in Appendix A of Web Linking [RFC5988].
Previously, this only supported `File`, let’s prepare for other `Response` implementations.
It was introduced in 01eb2c5 and refactored in 426dc5b.

However, according to chat log of a discussion between the writers of the PubSubHubbub specification, the consensus was that, due to a potential for injection attacks on poorly written websites, clients MUST NOT support `a[rel="hub"]`.

See https://chat.indieweb.org/dev/2016-11-21#t1479767303222000 and https://www.github.com/w3c/websub/issues/67

We will thus remove support for it.

Similar concerns apply to `link` elements in headers but per <https://www.w3.org/TR/websub/#x8-1-discovery>:

> The decision about whether a subscriber should look for <link> elements inside a page's <body> (as well as the <head>) is not straightforward, and there is currently no clear consensus.

So we will continue to support `link`s in `body`.
The specification is not completely clear on what should happen if there is `Link` header containing `rel=hub` link but `rel=self` is only present in `<link>` element:

<https://w3c.github.io/websub/#discovery>:

> The protocol currently supports the following discovery mechanisms. Publishers *MUST* implement at least one of them:
>
> - Link Headers [RFC5988]: the publisher *SHOULD* include at least one Link Header [RFC5988] with `rel=hub` (a hub link header) as well as exactly one Link Header [RFC5988] with `rel=self` (the self link header)
> - If the topic is an XML based feed, publishers *SHOULD* use embedded link elements as described in Appendix B of Web Linking [RFC5988]. Similarly, for HTML pages, publishers *SHOULD* use embedded link elements as described in Appendix A of Web Linking [RFC5988].

Previously, if we found 'Link' header containing `rel=hub`, we would expect `rel=self` in a header as well.
Let’s treat both `rel`s independently.
As before, headers take precedence over HTML elements.
According to  <https://www.w3.org/TR/websub/#x8-1-discovery>, there is no clear consensus about `link` elements in `body` but there are concerns about injection attacks on badly written websites:

> The decision about whether a subscriber should look for `<link>` elements inside a page's `<body>` (as well as the `<head>`) is not straightforward, and there is currently no clear consensus. One reason to ignore the `<body>` during discovery is that some web sites might (perhaps accidentally) allow users to post content containing `<link>` elements, though the working group does not know of any specific examples of such sites. If WebSub discovery uses such `<link>` elements, a user contributing to such sites could potentially maliciously cause all subscribers to use an alternate hub which later delivers malicious content. Given this potential attack, it may be prudent to do discovery only in the `<head>` of HTML documents.

Let’s be prudent and only look for `link`s in HTTP headers and the first `head` element.
@jtojnar jtojnar requested a review from Art4 July 29, 2025 07:10
Copy link
Contributor

@Art4 Art4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the great work. 👍

@jtojnar jtojnar merged commit bc03e27 into simplepie:master Jul 29, 2025
10 checks passed
@jtojnar jtojnar deleted the websub-response branch July 29, 2025 21:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants