Skip to content

Disable cURL encoding/compression handling when a bad Content-Encoding header is received#962

Merged
jtojnar merged 2 commits intosimplepie:masterfrom
mmeier86:disable-curl-encoding-handling-for-bad-content-encoding
Feb 7, 2026
Merged

Disable cURL encoding/compression handling when a bad Content-Encoding header is received#962
jtojnar merged 2 commits intosimplepie:masterfrom
mmeier86:disable-curl-encoding-handling-for-bad-content-encoding

Conversation

@mmeier86
Copy link
Contributor

@mmeier86 mmeier86 commented Jan 1, 2026

This PR relates to #961 and FreshRSS/FreshRSS#8374.

When cURL has already returned a bad encoding error, this is a good indication that the server ignores Accept-Encoding headers, as the CURLOPT_ENCODING option is already set to '' for the first request. This results in cURL sending an Accept-Encoding header with all supported encodings (compression algorithms, in practice) already in the first request. And the fact that this request results in a CURLE_BAD_CONTENT_ENCODING error indicates that the server ignored that header already. So trying again, but this time with CURLOPT_ENCODING = 'none' seems unlikely to result in success?

Instead, my proposal here is to set CURLOPT_ENCODING to null, which tells cURL to disable compression handling altogether and to ignore the bad Content-Encoding header send by the server.

I was able to confirm that this works in tests with FreshRSS against a test setup where the server sends a valid index.xml with an RSS feed, but with the Content-Encoding header set to aws-chunked. With CURLOPT_ENCODING = 'none', the fetch fails and the body is empty, leading to FreshRSS not being able to add the feed. But with CURLOPT_ENCODING = null set, the fetch succeeds and the feed can be added.

@Art4 Art4 added this to the 1.10.0 milestone Jan 1, 2026
Alkarex added a commit to FreshRSS/simplepie that referenced this pull request Jan 3, 2026
Alkarex added a commit to FreshRSS/FreshRSS that referenced this pull request Jan 3, 2026
…LOPT_ENCODING (#8376)

* Replace deprecated CURLOPT_ENCODING

The CURLOPT_ENCODING setting has been deprecated in favor of
CURLOPT_ACCEPT_ENCODING.

Signed-off-by: Michael Meier <mmeier1986@gmail.com>

* Sync with our SimplePie fork PR
FreshRSS/simplepie#67
simplepie/simplepie#960
simplepie/simplepie#962

* Our SimplePie PR merged

---------

Signed-off-by: Michael Meier <mmeier1986@gmail.com>
Co-authored-by: Alexandre Alapetite <alexandre@alapetite.fr>
@jtojnar
Copy link
Member

jtojnar commented Jan 15, 2026

Thanks.

Looks like this was introduced in f5d76cd to fix specific bug in another site but the feed mentioned in the issue no longer works so hard to tell if it is needed nowadays.

Accept-Encoding does not support none, only identity so I agree falling back to the former does not make sense, except as a quirk for some non-compliant server. But without a reproducible example, I would go with your fix.

It would also be helpful to include a test. Something like the following (with #966):

--- a/tests/Integration/SimplePieTest.php
+++ b/tests/Integration/SimplePieTest.php
@@ -553,4 +553,37 @@ HTML
         self::assertLessThanOrEqual(1, count($feed->get_links('self') ?? []), 'Link rel=self should not be promoted from HTML when it is already present in headers');
         self::assertSame($bogoUrl, $feed->get_link(0, 'bogo'), 'Link rel=bogo does not match');
     }
+
+    public function testSimplePieIgnoresBadContentEncodingHeader(): void
+    {
+        $server = new MockWebServer();
+        $server->start();
+
+        $filepath = dirname(__FILE__, 2) . '/data/feed_rss-2.0.xml';
+        $body = file_get_contents($filepath);
+        \assert($body !== false); // For PHPStan
+
+        $url = $server->setResponseOfPath(
+            '/bad-content-encoding',
+            new MockWebServerResponse($body, [
+                'content-type: application/rss+xml',
+                'content-encoding: aws-chunked',
+            ], 200)
+        );
+
+        $feed = new SimplePie();
+        $feed->enable_cache(false);
+
+        $feed->set_feed_url($url);
+
+        // For some reason, without this, Sniffer thinks we have text/plain on error.
+        $feed->force_feed(true);
+
+        $return = $feed->init();
+        $server->stop();
+
+        $error = implode("\n", (array) ($feed->error() ?? '')); // For PHPStan
+        self::assertTrue($return, 'Failed fetching feed: ' . $error);
+        self::assertSame(200, $feed->status_code());
+    }
 }

@mmeier86
Copy link
Contributor Author

I'm not exactly fluent in PHP, but I will give it a good try.

@mmeier86 mmeier86 force-pushed the disable-curl-encoding-handling-for-bad-content-encoding branch from 6340688 to 74c5496 Compare January 18, 2026 11:28
@mmeier86
Copy link
Contributor Author

It looks like the fix doesn't work for older PHP versions. I'm trying to figure out what the issue is there. I've also added an assert to make sure that cURL is actually available in the test, as I initially didn't have cURL support in my local PHP setup and couldn't get the test to fail at all, as it used the non-cURL code path without me realising.

@jtojnar
Copy link
Member

jtojnar commented Jan 18, 2026

Looks like setting NULL does not work on PHP < 8.1 because of php/php-src#11433, which leads to curl returning:

string(113) "cURL error 61: Unrecognized content encoding type. libcurl understands deflate, gzip, br, zstd content encodings."

@jtojnar
Copy link
Member

jtojnar commented Jan 18, 2026

I managed to work around it by creating a new curl handle from scratch but PHPStan does not like that on older versions of PHP where curl_init returns resource instead of CurlHandle:

https://github.com/simplepie/simplepie/compare/master...jtojnar:simplepie:disable-curl-encoding-handling-for-bad-content-encoding?expand=1

@mmeier86
Copy link
Contributor Author

I think I see what you're getting at with that commit, let me see whether I can stretch my non-existent PHP skills far enough to make it work in this PR.

@jtojnar
Copy link
Member

jtojnar commented Jan 18, 2026

I managed to get it working using conditional PHPStan config, you can just cherry pick both commits if you want.

jtojnar and others added 2 commits January 18, 2026 19:39
On PHP < 8.1.21, we will not be able to disable `CURLOPT_ACCEPT_ENCODING` by setting it to `NULL`:
php/php-src#11433
`curl_setopt` with such values will be a no-op so the secound `curl_exec` will still fail with `CURLE_BAD_CONTENT_ENCODING` and return an empty body.
Even worse, it will still report 200 in `CURLINFO_HTTP_CODE` so `FileClient` will not throw an exception because the `status_code` property is not 0.

Let’s extract the code so that we can create a fresh `CurlHandle` and later make the `CURLOPT_ACCEPT_ENCODING` conditional.

Unfortunately, since PHP 8.0 changed `curl_init` to return `CurlHandle` instead of `resource`, and PHPStan has no simple way to make the return type depend on PHP version, we have to resort to an ugly hack using `typeAliases`.
Setting the CURLOPT_ENCODING to 'none' is not officially supported by
cURL. Setting it in case of cURL receiving an invalid Content-Encoding
header from the server and re-trying would not fix the issue.

The current implementation already sets cURL up to send an Accept-Encoding
header with all supported encodings a couple of lines above this change.
If the fetch still returns a BAD_CONTENT_ENCODING error, the server already
ignored the Accept-Encoding headers once.

This change, instead of sending 'none' in a re-try, disables cURL's content
encoding handling (in practice, handling compression).

Co-authored-by: Jan Tojnar <jtojnar@gmail.com>
@mmeier86 mmeier86 force-pushed the disable-curl-encoding-handling-for-bad-content-encoding branch from 74c5496 to 98fb5c3 Compare January 18, 2026 18:41
@pull-request-size pull-request-size bot added size/L and removed size/M labels Jan 18, 2026
@mmeier86
Copy link
Contributor Author

That seems to have done the trick.

Copy link
Member

@jtojnar jtojnar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

@jtojnar jtojnar requested a review from Art4 January 18, 2026 19:30
Copy link
Contributor

@Art4 Art4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mmeier86 Thanks for your contribution!

@jtojnar nice solution for curl_init() and PHPStan, thanks!

@jtojnar jtojnar merged commit 4f59520 into simplepie:master Feb 7, 2026
10 checks passed
Alkarex referenced this pull request Mar 3, 2026
On PHP < 8.1.21, we will not be able to disable `CURLOPT_ACCEPT_ENCODING` by setting it to `NULL`:
php/php-src#11433
`curl_setopt` with such values will be a no-op so the secound `curl_exec` will still fail with `CURLE_BAD_CONTENT_ENCODING` and return an empty body.
Even worse, it will still report 200 in `CURLINFO_HTTP_CODE` so `FileClient` will not throw an exception because the `status_code` property is not 0.

Let’s extract the code so that we can create a fresh `CurlHandle` and later make the `CURLOPT_ACCEPT_ENCODING` conditional.

Unfortunately, since PHP 8.0 changed `curl_init` to return `CurlHandle` instead of `resource`, and PHPStan has no simple way to make the return type depend on PHP version, we have to resort to an ugly hack using `typeAliases`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants