Skip to content

Conversation

@bwoebi
Copy link
Member

@bwoebi bwoebi commented Jul 7, 2017

See also https://bugs.php.net/74876.

Note that apart from the max_size option to inflate_init(), there is also a parameter to inflate_add(), so that one can know exactly until where was read and resume from there as well as increase the current limit (without having to start over).

zend_string *out;
char *in_buf;
size_t in_len, buffer_used = 0, CHUNK_SIZE = 8192;
size_t in_len, buffer_used = 0, CHUNK_SIZE = 4096 /* power of 2 */;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you change this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

look a bit down, I'm using CHUNK_SIZE * 2 as initial size.

@krakjoe krakjoe added the Bug label Jul 8, 2017
@krakjoe
Copy link
Member

krakjoe commented Jul 10, 2017

I'm unsure if we would consider this an ABI break, the sizeof of a public struct changing is strictly speaking an ABI break, but we may be able to accept it because the field was added to the end of the struct, and we don't expect anyone to be using sizeof on the struct directly for any reason ...

Thoughts ?

@bwoebi
Copy link
Member Author

bwoebi commented Jul 10, 2017

Well, then 0c4f11e is also an ABI break to these rules. The ABI restrictions should apply only to structures which are also used externally.
There is no API which uses that struct at all. Thus it should be safe.

@nikic
Copy link
Member

nikic commented Jul 10, 2017

@bwoebi Isn't that commit master only?

@bwoebi
Copy link
Member Author

bwoebi commented Jul 10, 2017

@nikic nope, it's in 7.0.

@weltling
Copy link
Contributor

@bwoebi well, that's bad. We usually discuss addition of new functions and struct changes with possible impac, especially for stable branches. I really consider this as a bad behavior to silently merging such cases like 0c4f11e and then pointing to them a as a proof of another possibly breaching change. As the patch is not released yet, we should evaluate whether it's acceptable for non master. ABI compatibility should be held not only because things are used in the core, but because the exported definition can be used in non core. It is clear, that sometimes patches can go through unreviewed, but it is not a reason to proceed this way, really :(

Thanks.

@weltling
Copy link
Contributor

weltling commented Jul 10, 2017

@bwoebi I was checking this PR and 0c4f11e with Joe, and we believe that both have to target master. It's going to be reverted in 7.0 and 7.1. Please see also 0c4f11e#diff-b2fb0c574cad6c75d02550ab1e1519f4R51 which is a bug if you look at the zilb.h definitions.

Thanks.

@kelunik
Copy link
Member

kelunik commented Jul 10, 2017

@weltling Is there anything so we could support this in 7.0 / 7.1? Otherwise anything like https://blog.haschek.at/2017/how-to-defend-your-website-with-zip-bombs.html could crash any gzip decoding HTTP client.

@bwoebi
Copy link
Member Author

bwoebi commented Jul 10, 2017

@weltling At least that's what I assumed, nobody said anything and nobody mentioned it on the PR either. And I saw no reason why to avoid it too.

Ah, it should probably be an int. Not sure why this was made size_t … It just happens to not matter if sizeof(zend_long) == sizeof(size_t).

Also, why do we need to revert that in 7.0/7.1? That struct isn't part of any public API...

@weltling
Copy link
Contributor

@bwoebi I see, no headers installed. The new functions might be also OK, whereby there's a potential for a behavior change in inflate_init() with 0c4f11e. Well, alone the change on the struct, which looks suspicious without knowing the header is not installed, and anyway the introduction of new functions - that's where i see other people getting proactive to avoid situations like this. People are busy with their lifes, work an whatsoever, so sometimes yes - need to gain attention.

That's the common practice new features are done after the feature freeze, i'd wonder it were not known. After the feature freeze, the stable branches are actively bug fixed, but new features are added on case-by-case base. Now, with this PR it's a further development, building upon the previous patch, with more potential issues to the existing functionality. Please, lets stay at targeting 7.2 which is the suitable place for the new things.

@kelunik we could protect by educating people, otherwise there's no chance. There are also many places above PHP, that are supposed to care. Web servers, load balancers, proxies, etc. There are however many cases where compressing and sending huge files over is a completely legit scenario. IMO it is great you guys popularize these topics. The most certain way is to get this into master, create a good documentation and ensure stability.

Thanks.

@kelunik
Copy link
Member

kelunik commented Jul 10, 2017

@weltling Educating people? How? What do you want them to know? That there's no possibility to protect against these in any released version of PHP?

@weltling
Copy link
Contributor

@kelunik how to educate - at least by documenting the new functionality and providing examples showing where it is useful. I'm sure also, that the topic comes out from a real project, where the information can be spread as well.

Which kind of protection do you have in mind? There are many ways to secure the network, delegating those tasks to PHP is sometimes doable. The zip momb issue, except if it's about a malicious site owner, is based on the assumption one would hack into server. If that's not the case, uncompressed size >= compressed size, which can be figured out before. In practice, not only this, but many other factors are protective in PHP. Fe time limit - with default 30 seconds transferring 10G of data is likely not happening. Another case, where one transfers from one host to another - well, that can be exactly what is needed. Or when one compresses a huge local file, intentionally.

Another topic - i think one should get loud on internals about discussing the exact way these issues were fixed. If it's going to concern default behaviors or impact BC in any way, it is always good to get some opinions for the best solution.

Thanks.

@kelunik
Copy link
Member

kelunik commented Jul 10, 2017

@kelunik how to educate - at least by documenting the new functionality and providing examples showing where it is useful. I'm sure also, that the topic comes out from a real project, where the information can be spread as well.

Agreed, but that's only possible for released versions of PHP, so it would have to wait until November until a fix can be implemented.

Which kind of protection do you have in mind?

A maximum size, which is the feature this PR implements. We have a maximum size in Artax for the HTTP body bytes, but if a few kB can totally explode the whole process, because it reaches its memory limit, that's definitely not good. While PHP has protections in place, there's no way to avoid a potential cash when downloading an HTTP resource from a potential malicious source without such an option. This is especially important in environments that do other things concurrently, but it's a problem for regular applications, too.

Another topic - i think one should get loud on internals about discussing the exact way these issues were fixed. If it's going to concern default behaviors or impact BC in any way, it is always good to get some opinions for the best solution.

Sure, totally agree there.

@weltling
Copy link
Contributor

weltling commented Jul 10, 2017

@kelunik i see now what it is about a bit better. The link talks about browsers, while this issue is something else. In general case

$s = file_get_contents("http://www.ccel.org/ccel/bible/kjv.txt");
$compressed   = gzdeflate($s);
$uncompressed = gzinflate($compressed, 1024); /* insufficient memory */ 

If gzinflate() can't do the job, so it fails.

For PHP as a client application, even without the patch being invented, it is a low severity issue. Where one would stream - it is another question, which is likely application specific and I don't see this PR touching gzip streams yet. Where one would code the way it exhausts memory_limit - it's a coding issue. I'm not sure browsers have such a limit at all, fe the page from the snippet above made mine nearly hang for some time. Still PHP is not a browser, and especially with a stable it is more convenient to target the major use case.

If using PHP as a client, the thing to do would seem in first place to use streams. Otherwise, in case of a huge file, deflated content still needs to be in memory. IMO an approach could be to read the incoming stream in chunks so inflated content size can be controlled. Just my two cents actually.

Thanks.

@cmb69
Copy link
Member

cmb69 commented Jun 16, 2018

Agreed, but that's only possible for released versions of PHP, so it would have to wait until November until a fix can be implemented.

Now, it waited for nearly a year …

@nikic
Copy link
Member

nikic commented Jul 7, 2018

This seems like a reasonable addition for 7.3. The part I don't get is the extra argument to inflate_add(). It seems to serve a weird double function (original value is used to increase maximum size, this value refers to the output, new value is used to tell how much of the input was decoded, refers to the input -- so both numbers have nothing to do with each other?) Can you please explain in more detail how that argument is supposed to be used and add a test for it?

@kelunik
Copy link
Member

kelunik commented Jul 7, 2018

The original reason for this PR was protecting Artax from ZIP bombs, but we've solved that in userland: amphp/artax@afb0d7f. Maybe this PR isn't really needed at all.

@bwoebi
Copy link
Member Author

bwoebi commented Jul 13, 2018

This userpace "solution" does not really guard against true inflate bombs, if the data is already in the buffer of the inflate context, it may expand (nearly) arbitrarily much. It is just comparing the sizes after an inflation round, at which point it may already be too late.

This patch however puts an effective internal buffer limit on the resulting string size - if it's reached, it immediately stops inflating, never actually even creating the large data string.

php_ctx->maxSize += maxSizeAdd;
if (maxSizeAdd < php_ctx->maxSize) { /* overflow check */
php_ctx->maxSize = ~0;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bwoebi This overflow check doesn't look right, since if no overflow occurs, the check is always true (n < n+m ∀ m >= 0).

@cmb69
Copy link
Member

cmb69 commented Jul 18, 2018

The $sizeDelta parameter looks strange indeed. Is there any practical purpose for its out-value?

Furthermore, I wonder what would be if we didn't add the max_size option to inflate_init(). This way we wouldn't have to modify struct _php_zlib_context at all, and a patch could address PHP-7.1. And perhaps it might be sensible to introduce another function, say inflate_add_max() (and leave inflate_add() alone), which would have a mandatory $maxSize parameter (and maybe an optional $bytesRead out-param).

@nikic
Copy link
Member

nikic commented Jul 29, 2018

Removing 7.3 nomination as per [https://github.com//pull/2614#issuecomment-314136450] and the following comments there actually is no ABI break here, so we can land this anytime (including on old versions).

@krakjoe
Copy link
Member

krakjoe commented Jan 30, 2019

@bwoebi can you take a fresh look at this please

/cc @cmb69

@krakjoe
Copy link
Member

krakjoe commented Oct 2, 2019

There's been no movement for more than a year here, the discussion also raised unanswered questions.

I'm closing this, if there is to be more work on this issue please reference this PR from the new pull request.

@krakjoe krakjoe closed this Oct 2, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants