Skip to content

FileClient: Expose HTTP status as exception code#905

Closed
pe1uca wants to merge 2 commits intosimplepie:masterfrom
pe1uca:status_code_fix
Closed

FileClient: Expose HTTP status as exception code#905
pe1uca wants to merge 2 commits intosimplepie:masterfrom
pe1uca:status_code_fix

Conversation

@pe1uca
Copy link

@pe1uca pe1uca commented Dec 11, 2024

@jtojnar jtojnar changed the title Status code fix FileClient: Expose HTTP status as exception code Dec 12, 2024
Copy link
Member

@jtojnar jtojnar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the pull request, including tests.

Would you mind changing the commit messages to something like the following so that the context is clearer in the commit log:

  • FileClient: Expose HTTP status as exception code
  • SimplePie: Propagate HTTP status code from exception

} catch (HttpException $th) {
$this->check_modified = false;
$this->status_code = 0;
$this->status_code = $th->getCode();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this makes sense for genuine HTTP errors but we sometimes promoted a random Exception to HTTPException, even when there is no meaningful error code.

So I wonder if we should add a 1000 to the error code of promoted exceptions like this:

and then only change the status code when it does not:

Suggested change
$this->status_code = $th->getCode();
$this->status_code = $th->getCode() < 1000 ? $th->getCode() : 0;

Though of course this will be easy to forget if we ad more HTTPException promotions.

@jtojnar
Copy link
Member

jtojnar commented Dec 12, 2024

Actually, thinking about it some more, it seems to me like HttpException is a misnomer and should be named something like ClientException. According to PSR-18:

A Client MUST NOT treat a well-formed HTTP request or HTTP response as an error condition. For example, response status codes in the 400 and 500 range MUST NOT cause an exception and MUST be returned to the Calling Library as normal.

And our Psr18Client does appear to preserve the semantics, only promoting ClientExceptionInterface implementations to HttpException.

Similarly, FileClient throws HttpException when success property is false but that only appears to be the case in errors other than HTTP error responses. In fact, File never seems to set status_code property when setting success = false, so get_status_code() there should always return the default 0 value.

@jtojnar
Copy link
Member

jtojnar commented Dec 13, 2024

Actually, we do enable CURLOPT_FAILONERROR in File so the success will be false on HTTP error in this case:

simplepie/src/File.php

Lines 139 to 142 in 72b4293

$this->status_code = curl_getinfo($fp, CURLINFO_HTTP_CODE);
if (curl_errno($fp)) {
$this->error = 'cURL error ' . curl_errno($fp) . ': ' . curl_error($fp);
$this->success = false;

So that means Psr18Client is inconsistent with FileClient as the former will not return HttpException on HTTP error.

@jtojnar jtojnar requested a review from Art4 December 13, 2024 01:04
@pe1uca
Copy link
Author

pe1uca commented Dec 16, 2024

I'm guessing all the changes needed to improve the flows and make them consistent should be addressed in another PR, right?

My main fix was in the class SimplePie to get 429 error codes.

The change to FileClient was just an addition in case something sets that value.
If nothing sets the value, as you mentioned, it remains at 0, so it won't change the behavior of HttpException.

@Art4
Copy link
Contributor

Art4 commented Dec 18, 2024

I have some fundamental issues with this PR. The HttpException is not made to represent a response, but to tell you that there was a problem making the request. If you have a response status code (even 500) the pure http request was successful, so no HttpException will be thrown. (Please note that the File/FileClient will be deprecated in favor of PSR-18.)

See PSR-18 error handling https://www.php-fig.org/psr/psr-18/#error-handling:

A Client MUST NOT treat a well-formed HTTP request or HTTP response as an error condition. For example, response status codes in the 400 and 500 range MUST NOT cause an exception and MUST be returned to the Calling Library as normal.

The request could fail for much more reasons as only for an HTTP request, e.g. reading a file from the local file system. One could not rely on the exception code to get a possible response status code. In opposite if you have a response status code the HTTP request was successful (but the response could be a 500 error).

Instead of moving the status code into the HttpException you can use the Psr18Client and get the status code from the response:

try {
    $response = $psr18Client->request(
        'GET',
        'http://example.com/feed.xml',
        ['Accept' => 'application/atom+xml']
    );
} catch (HttpException $e) {
    // Something went wrong making the HTTP request, reading the local file, etc
}

// $response is instance of \SimplePie\HTTP\Response
$statusCode = $response->get_status_code(); // Check for 500, 429, 200 or what else

@jtojnar
Copy link
Member

jtojnar commented Dec 18, 2024

Right. But currently, FileClient is not deprecated yet and it does throw HttpException for HTTP error response:

curl_setopt($fp, CURLOPT_FAILONERROR, 1);

simplepie/src/File.php

Lines 140 to 142 in 72b4293

if (curl_errno($fp)) {
$this->error = 'cURL error ' . curl_errno($fp) . ': ' . curl_error($fp);
$this->success = false;

if (!$file->success) {
throw new HttpException($file->error);
}

We probably cannot drop the CURLOPT_FAILONERROR due to BC so we will probably want FileClient to check if !$file->success && $file->get_status_code() === 0 and only throwing HttpException then.

@Art4
Copy link
Contributor

Art4 commented Dec 18, 2024

We can do this right now. HttpException and FileClient only exists in the master branch so far. We can deprecate it or mark it as internal. (FileClient is marked as internal already for exact this reason. 🙈 ) And we should rename the HttpException to ClientException as you suggested, maybe as part of #826?

@pe1uca
Copy link
Author

pe1uca commented Jan 14, 2025

FileClient is marked as internal already

Probably the issue is if the users of SimplePie don't call set_http_client() the default behavior is to use FileClient.

simplepie/src/SimplePie.php

Lines 3437 to 3454 in 72b4293

private function get_http_client(): Client
{
if ($this->http_client === null) {
$this->http_client = new FileClient(
$this->get_registry(),
[
'timeout' => $this->timeout,
'redirects' => 5,
'useragent' => $this->useragent,
'force_fsockopen' => $this->force_fsockopen,
'curl_options' => $this->curl_options,
]
);
$this->http_client_injected = true;
}
return $this->http_client;
}

The documentation for set_http_client() says

This will become required with SimplePie 2.0.0.

Is there any word on when this version is planned to be released?
Just wondering in case it's worth shipping this minor fix for FileClient


I've never totally understood PSR wording, so please bare with me 😅
The PSR-18 page says

Client - A Client is a library that implements this specification [...]

And

A Client is an object implementing ClientInterface.

But in the repo there's no class which implements this interface.
The name of the class Psr18Client makes me believe this already implements the PSR, but instead accepts a parameter ClientInterface $httpClient which should be the object actually implementing the specification.
Is this correct? Or am I missing something?

So, this will make every user of SimplePie migrating to 2.0.0 (when it comes out) require to implement their own way of making http requests instead of being already included out of the box.
Wouldn't this cause a lot of friction of adoption of the new version?
Or is this the intended flow?

@jtojnar
Copy link
Member

jtojnar commented Jan 14, 2025

Just wondering in case it's worth shipping this minor fix for FileClient

I think it is still worth it. It is hard to predict volunteer availability.

I've never totally understood PSR wording, so please bare with me 😅
The PSR-18 page says

Client - A Client is a library that implements this specification [...]

And

A Client is an object implementing ClientInterface.

I would call the first a PSR-18 Client library but I guess they decided to keep it short. Such shortening is fine in unambiguous cases, but ambiguity is often easy to miss, which leads to confusion, as you have experienced.

Also, since HTTP is a client-server protocol, even more confusingly, client can also refer to the abstract role in the HTTP protocol, or any implementation the role, such as curl.

The purpose of PSR-18 is to make the HTTP client implementations interchangeable. Without that, each library that wants to use HTTP would need to bring its own HTTP client implementation. As a result, an application that uses multiple such libraries (e.g. RSS reader can use SimplePie and Graby) would transitively depend on multiple, possibly slightly different HTTP libraries.

An example of a client library is https://github.com/guzzle/guzzle. You can see the full list at https://packagist.org/providers/psr/http-client-implementation.

But in the repo there's no class which implements this interface.

SimplePie is not a PSR-18 client. It supports using such library to allow consumers to select their preferred HTTP client implementation.

The name of the class Psr18Client makes me believe this already implements the PSR, but instead accepts a parameter ClientInterface $httpClient which should be the object actually implementing the specification.

Yes, the name is a bit confusing since language is always a bit ambiguous when we do not want overly long class names. The Psr18Client class is a HTTP client that is implemented as a proxy class for PSR-18 HTTP Client. The Psr18Client is parametric and must be provided by the consumer of SimplePie.

We introduced Psr18Client and FileClient as a temporary abstraction so that we can support PSR-18 in addition to the legacy File class. In the future, it is likely going away in favour of using ClientInterface directly.

So, this will make every user of SimplePie migrating to 2.0.0 (when it comes out) require to implement their own way of making http requests instead of being already included out of the box. Wouldn't this cause a lot of friction of adoption of the new version?

Yes, there is some upgrade cost but projects do not need to implement their own HTTP library, there are tons of ready made PSR-18 HTTP clients. And most modern projects will already be using one of them.

@Art4
Copy link
Contributor

Art4 commented Jan 14, 2025

I created #909 to rename the misleading HttpException to ClientException and marked it as @internal because it should always catched internally. I also made sure that the response from FileClient and Psr18Client always returns the correct status code. The ClientException is only thrown if there was an error connecting the server or reading a local file. In both cases there is no response status code, so it defaults to 0.

SimplePie stores the status code from the response, so one can use $simplePie->status_code() to get the status code of the latest HTTP request. If $simplePie->status_code() returns 0 there was an error and one can use $simplePie->error() to get the error message why the HTTP request fails.

I don't see any reason for changes in this PR.

@Art4
Copy link
Contributor

Art4 commented Jan 14, 2025

Just to be clear: I understand that the real problem of FreshRSS/FreshRSS#7038 is that in FreshRSS SimplePie logs 429 status code but $simplePie->status_code returns 0. I'm not sure why this happens and I cannot reproduce it in #909. So the problem must be somewhere else. Please investigate this problem to make sure you found the real reason so we can work on a solution.

@jtojnar
Copy link
Member

jtojnar commented Jan 14, 2025

As mentioned above, I believe the issue is that FileClient promotes HTTP errors to ClientError because File enables CURLOPT_FAILONERROR.

@Art4
Copy link
Contributor

Art4 commented Jan 14, 2025

@jtojnar Thank you. 👍 I should have read the whole conversion again after such a long time. I will try to reproduce it again.

Alkarex added a commit to Alkarex/FreshRSS that referenced this pull request Jun 15, 2025
Alkarex added a commit to FreshRSS/FreshRSS that referenced this pull request Jun 15, 2025
* SimplePie: Fix propagation of HTTP error codes
fix #7038
FreshRSS/simplepie#36
upstream simplepie/simplepie#905

Co-authored-by: Edgar Alvarado <15692727+pe1uca@users.noreply.github.com>
Copy link
Contributor

@Art4 Art4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In #909 I've added integration tests to show that $simplepie->status_code() always returns the correct status code if the request was successful (if the server had send a response, even a 429 or 500 code). If the request failed, $simplepie->status_code() returns 0. In this case one can use $simplePie->error() to get the error message why the HTTP request fails.

The HttpException/ClientException is an internal class for now and should not be used and should not contain the status code as exception code to be compatible with PSR-18 for future changes.

jtojnar pushed a commit that referenced this pull request Jun 25, 2025
…or in FileClient (#909)

This PR renames the `SimplePie\Exception\HttpException` to `SimplePie\HTTP\ClientException` to avoid confusion (#905) that this exception is thrown on an 4xx or 5xx response from the server. Because the `ClientException` is always caught internally, I marked the class as `@internal`.

Additionally, change the `FileClient` to only throw the exception on non-HTTP errors to match PSR-18 and our `Psr18Client`. This is not a BC break since this API is not part of any release yet.

Also add unit tests to show that the responses (`File` or `Psr7Response` class) always contain the status code, even if it is 429. Plus integration tests using a local HTTP server.
@Art4
Copy link
Contributor

Art4 commented Jun 26, 2025

@pe1uca Thank you for pointing out this issue. Does #909 and using $simplepie->status_code() work for you? Can we close this PR?

@Alkarex
Copy link
Contributor

Alkarex commented Jul 26, 2025

Actually, we also need to expose the HTTP headers, for instance to access the Retry-After header after an HTTP 429 Too Many Requests.
Any suggestion for that?

@jtojnar
Copy link
Member

jtojnar commented Jul 29, 2025

Maybe #930

@Art4 Art4 closed this Jul 29, 2025
Alkarex added a commit to Alkarex/simplepie that referenced this pull request Aug 1, 2025
Alkarex added a commit to Alkarex/FreshRSS that referenced this pull request Aug 1, 2025
Alkarex added a commit to FreshRSS/FreshRSS that referenced this pull request Aug 1, 2025
* Bump SimplePie with PHPStan Level 8
* FreshRSS/simplepie#45
SimplePie increased to PHPStan Level 8:
* simplepie/simplepie#857

* Merge upstream
Including my two PRs:
* simplepie/simplepie#932
* simplepie/simplepie#933

* Resolve upstream sync of Expose HTTP status
* FreshRSS/simplepie#47
Finalise merge, following:
* simplepie/simplepie#905 (comment)
* simplepie/simplepie#909
* #7038
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Not all Client requests properly propagate the http code of the response

4 participants