Skip to content

Split SimplePie\File class into HTTP client and response#815

Merged
mblaney merged 16 commits intosimplepie:masterfrom
jtojnar:split-file-into-client-and-response
May 30, 2023
Merged

Split SimplePie\File class into HTTP client and response#815
mblaney merged 16 commits intosimplepie:masterfrom
jtojnar:split-file-into-client-and-response

Conversation

@jtojnar
Copy link
Member

@jtojnar jtojnar commented Mar 19, 2023

This is #774, with few clean-ups for easier reviewing commit-by-commit (commits squashed or split to ideally fit into my head, added some implementation notes into commit messages).

The main differences as outlined in #774 (review):

  • Renamed Response::get_requested_uri() to Response::get_final_requested_uri() so it is clearer what this does.
  • We do not expect Response::get_headers() to preserve header capitalization, removed mention of that from the comment.
  • Using SPDX to credit PHP Framework Interoperability Group, as the documentation comments taken from PSR-7 are non-trivial.
  • Adding support for HTTP\Parser to produce PSR-7 compatible headers, using $parsed_headers internally in File as $774 did initially, and recomputing $parsed_headers when $headers change (which should not really happen outside of tests but better safe than sorry).
  • Extracting the is_remote check to Misc::is_remote_uri().
  • Passing the reduced timeout to SimplePie::get_http_client() factory instead of side-channel signalling through $this->timeout.
  • Fixing early loop terminations in Locator.

Full diff (will be cleaner once dependencies are merged).

Depends on:

Should be mergable without squashing (e.g. rebase).

@jtojnar jtojnar force-pushed the split-file-into-client-and-response branch from fddcf74 to 5c4652c Compare March 19, 2023 03:33
@jtojnar jtojnar force-pushed the split-file-into-client-and-response branch from 5c4652c to feae95c Compare March 19, 2023 12:46
@jtojnar jtojnar marked this pull request as ready for review March 19, 2023 12:49
@jtojnar jtojnar marked this pull request as draft March 19, 2023 12:58
@jtojnar jtojnar force-pushed the split-file-into-client-and-response branch from feae95c to 8b22f57 Compare March 19, 2023 14:10
@jtojnar jtojnar marked this pull request as ready for review March 19, 2023 14:10
@jtojnar jtojnar marked this pull request as draft March 20, 2023 10:07
@jtojnar jtojnar force-pushed the split-file-into-client-and-response branch 3 times, most recently from a35dc7e to 90ed956 Compare March 20, 2023 11:48
@jtojnar jtojnar marked this pull request as ready for review March 20, 2023 11:51
@jtojnar jtojnar force-pushed the split-file-into-client-and-response branch 3 times, most recently from 4e67dec to 15cac44 Compare March 20, 2023 18:30
@jtojnar jtojnar force-pushed the split-file-into-client-and-response branch 4 times, most recently from db532eb to 1fc0248 Compare March 21, 2023 18:32
@jtojnar jtojnar force-pushed the split-file-into-client-and-response branch from 1fc0248 to f7d8ef1 Compare April 2, 2023 04:19
@jtojnar
Copy link
Member Author

jtojnar commented Apr 2, 2023

Rebased.

@jtojnar jtojnar force-pushed the split-file-into-client-and-response branch from f7d8ef1 to 0d1047c Compare April 2, 2023 05:02
@jtojnar jtojnar force-pushed the split-file-into-client-and-response branch from 0d1047c to deeac78 Compare May 21, 2023 12:33
jtojnar and others added 9 commits May 21, 2023 23:05
There are correctness issues as well as disorganized commit history.
The fixed changes are re-applied in the commits that follow.
Dividing a number may return a float, which will cause a `TypeError` once we add type hints.

The value was `1` originally, when timeouts were introduced along with `SimplePie_File` in
simplepie@5bf1814

We could consider hardcoding it again but we still want to keep it proportional to the master timeout
to allow people with slow connection to increase it, as introduced in
simplepie@fe7e333

In fact, reducing the timeout probably does not make sense in the first place.
Even if we assume majority of cached resources will not get outdated,
waiting for server to reply `HTTP 304 Not Modified` can still take time
(domain name resolution can be particularly slow).
And if it turns out that the cached version is outdated,
the web server will try to send an updated version.
Having too low timeout can mean the request will be terminated,
only to be started again with the original timeout.

The timeout reduction was presumably done to prevent
fetching a cached resource from an unresponsive server without `force_cache_fallback`
from essentially doubling the timeout due to re-running the failed request.

Let’s use the original timeout and prevent repeat requests instead.
- Drop unused variable
- Add white space for clarity
PSR-7 `MessageInterface` expects values of each header is always a list.
But concatenating multiple instances of the same header like we do now
is a lossy operation so we cannot confidently split them.

To allow us to use PSR-7-like semantics internally,
let’s add PSR-7 mode to the HTTP parser that will avoid the imploding.
`Client` is sort of like `ClientInterface` from PSR-18 but the `request()` method expects request parameters as discrete arguments rather than wrapped in PSR-7 `RequestInterface` like `ClientInterface::sendRequest()` does.

`Response` is a pared-down `ResponseInterface` from PSR-7. There are also other notable differences from PSR-7:
- Method names use snake_case.
- The class also provides the URI of the final HTTP request after redirects through `get_final_requested_uri()` and permanent URI of the requested resource through `get_permanent_uri()`
- Body is returned as a string rather than stream in `get_body_content()`.
- `get_headers()` does not have to preserve the exact case in which headers were originally specified.

https://www.php-fig.org/psr/psr-7/
https://www.php-fig.org/psr/psr-18/

Acked-by: Jan Tojnar <jtojnar@gmail.com>
Implementing the PSR-7 inspired `Response` interface gets us one step
closer to pluggable HTTP implementation.

Most of the complexity comes from the fact that PSR-7 `MessageInterface`
expects values of each header to always be a list, while `File` stored
each as a single strings. But we cannot change the type of the `$headers`
property without breaking backwards compatibility (especially for `File`
subclasses that modify the headers), and we cannot just confidently split
the header lines in the methods, since the concatenating multiple instances
of the same header like `HTTP\Parser` currently does is a lossy operation.

As a result, we have to check if `$headers` property changed every time
we access the internal header state.

Co-authored-by: Jan Tojnar <jtojnar@gmail.com>
Mostly extract repeat use of values from `File` object into local variables.
This will make the changes in the next commit cleaner.

Acked-by: Jan Tojnar <jtojnar@gmail.com>
@jtojnar jtojnar force-pushed the split-file-into-client-and-response branch from f110ea7 to 4a56b21 Compare May 21, 2023 22:33
Copy link
Member Author

@jtojnar jtojnar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review, fixed.

@jtojnar jtojnar force-pushed the split-file-into-client-and-response branch 2 times, most recently from faf8181 to 573e67d Compare May 22, 2023 13:08
return $iri->get_uri();
}

public static function is_remote_uri(string $uri): bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add a note at the CHANGELOG.md about this new method.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I have decided to mark it as internal instead to avoid increasing the API surface and having to deprecate it later.

Art4 and others added 7 commits May 22, 2023 17:44
Acked-by: Jan Tojnar <jtojnar@gmail.com>
Introduces a trivial `Client` that just creates a `File` object.

Acked-by: Jan Tojnar <jtojnar@gmail.com>
Acked-by: Jan Tojnar <jtojnar@gmail.com>
Acked-by: Jan Tojnar <jtojnar@gmail.com>
Acked-by: Jan Tojnar <jtojnar@gmail.com>
Methods should be used instead.
Acked-by: Jan Tojnar <jtojnar@gmail.com>
@jtojnar jtojnar force-pushed the split-file-into-client-and-response branch from 573e67d to 7a48ae6 Compare May 22, 2023 15:45
@Art4
Copy link
Contributor

Art4 commented May 26, 2023

@mblaney This is ready for merge.

@mblaney mblaney merged commit 1d776d5 into simplepie:master May 30, 2023
@jtojnar jtojnar deleted the split-file-into-client-and-response branch May 30, 2023 12:15
@Art4 Art4 added this to the 1.9.0 milestone Sep 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants