Fix unchecked vector pre-allocation #221
Conversation
Not sure if https://github.com/3Hren/msgpack-rust/blob/master/rmp-serde/src/decode.rs#L807 needs to be adapted as well (I don't use the serde bindings).
I'm concerned that this may be too drastic. Would it suffice to put some arbitrary limit on it? That would reduce the problem from outright DoS to mere amplification of memory usage in the worst case of infinitely recursive arrays overpromising their size. Amplification seems like the lesser problem, and it likely happens already anyway: all memory management has some overhead, and even a non-preallocated vec grows by doubling its size, so there's always up to 2x amplification there.

For a completely robust solution we'd need a completely different approach, e.g. allowing the caller to specify a maximum total memory usage for the decoding process, tracking it precisely, and bailing out when the limit is reached.
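The "arbitrary limit" idea above can be sketched roughly like this. This is illustrative only, not rmp's actual API: `capped_capacity`, `read_array_sketch`, and `PREALLOC_LIMIT` are made-up names, and the loop body stands in for "decode one element".

```rust
// Hypothetical cap on how much of a claimed length we trust up front.
const PREALLOC_LIMIT: usize = 64 * 1024;

// Trust the header's claimed length only up to the cap; beyond that the
// Vec grows normally (by doubling), so a lying header can only make us
// over-reserve by at most PREALLOC_LIMIT slots.
fn capped_capacity(claimed_len: usize) -> usize {
    claimed_len.min(PREALLOC_LIMIT)
}

// Stand-in for an array decode loop: reserve the capped amount, then push
// elements as they are actually decoded from the input.
fn read_array_sketch(claimed_len: usize) -> Vec<u64> {
    let mut out = Vec::with_capacity(capped_capacity(claimed_len));
    for i in 0..claimed_len.min(3) {
        // placeholder for "decode one element from the input"
        out.push(i as u64);
    }
    out
}

fn main() {
    assert_eq!(capped_capacity(10), 10);
    assert_eq!(capped_capacity(usize::MAX), PREALLOC_LIMIT);
    // An input claiming a billion elements but containing only a few
    // no longer triggers a billion-slot allocation.
    assert_eq!(read_array_sketch(1_000_000_000).len(), 3);
}
```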
Yeah, a limit of 4K would probably cover most of the cases and make the amplification much less serious. However, I don't think the spec limits the nesting depth, so if unlimited nesting is possible, an attacker with a 2 MiB input of nested array16 items (1048576 nested arrays) could still cause an allocation of at least 4 GiB (amplification factor 2048). (I think it's even worse, since it's a …)

What if we don't preallocate (with a limit) for arrays, but only for bin/str types? The case is much less serious for binary data than for arrays, because binary data cannot be nested. There could also be a nesting limit for arrays, but that might need to be user-configurable.
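The back-of-the-envelope math above works out as follows, using the numbers as stated in the comment (counting one byte per reserved slot as a lower bound; the exact header size per nested array depends on the format markers used):

```rust
fn main() {
    // Numbers as stated in the comment above.
    let nested_arrays: u64 = 1_048_576; // claimed nesting depth from a ~2 MiB input
    let per_array_cap: u64 = 4_096;     // hypothetical 4K pre-allocation limit

    // Each nested array reserves up to the cap, so the total reserved
    // memory is depth * cap.
    let reserved = nested_arrays * per_array_cap;
    assert_eq!(reserved, 4 * 1024 * 1024 * 1024); // at least 4 GiB reserved...

    let input: u64 = 2 * 1024 * 1024; // ...from a 2 MiB input
    assert_eq!(reserved / input, 2048); // amplification factor
}
```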
And in any case, we are speculating and should probably have some benchmarks. @kornelski the current rmpv benchmarks are from 2017 and broken. Do we still need them, or can they be disabled or removed for now?
It'd be nice to fix the benchmarks, so we don't have to speculate :) Reserving capacity with a limit for non-nested types, and no reservation for nested types, sounds safe and fast enough.
Force-pushed from eed631a to 02856a8
Rebased on #222. I added a few benchmarks.

Before the change: … With no pre-allocation at all: …

I can't quite explain why, but the array 50/100/1024 KiB cases were actually faster without the preallocation, while the array 20 MiB case was the same speed. The same goes for bin32: the cases up to 1 MiB were faster without preallocation, the 20 MiB bin32 test was about 5 times slower, and the 100 MiB bin32 test was again the same speed. Can someone explain this? In any case, it suggests that those preallocations were probably premature optimization.
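To make the comparison concrete, here is a minimal timing sketch of the two strategies. This is not the crate's actual benchmark suite: `fill` is a made-up stand-in for the decode loop, and single wall-clock runs like this vary a lot with allocator and OS, so treat any one result as indicative only.

```rust
use std::time::Instant;

// Build a vector of n bytes, either reserving the full capacity up front
// or letting the Vec grow by doubling as elements are pushed.
fn fill(n: usize, prealloc: bool) -> Vec<u8> {
    let mut v = if prealloc { Vec::with_capacity(n) } else { Vec::new() };
    for i in 0..n {
        v.push(i as u8);
    }
    v
}

fn main() {
    let n = 1024 * 1024; // 1 MiB of elements
    for &prealloc in &[true, false] {
        let t = Instant::now();
        let v = fill(n, prealloc);
        println!("prealloc={}: {:?} (len {})", prealloc, t.elapsed(), v.len());
    }
}
```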
When preallocating up to 64 KiB for bin data: … Comparison: … It doesn't hurt, but it doesn't help a lot either.
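The capped pre-allocation for bin data could look roughly like this. A sketch only, not the actual patch: `read_bin` and `BIN_PREALLOC_LIMIT` are hypothetical names. The point is that the allocation grows with bytes actually received, so a lying length header can no longer force a huge up-front allocation.

```rust
use std::io::{self, Read};

// Hypothetical cap: reserve at most 64 KiB up front for a bin payload.
const BIN_PREALLOC_LIMIT: usize = 64 * 1024;

fn read_bin<R: Read>(rd: &mut R, claimed_len: usize) -> io::Result<Vec<u8>> {
    // Reserve only up to the cap, regardless of what the header claims.
    let mut buf = Vec::with_capacity(claimed_len.min(BIN_PREALLOC_LIMIT));
    // `take` bounds the read at the claimed length; `read_to_end` grows
    // the buffer incrementally as real bytes arrive.
    let n = rd.take(claimed_len as u64).read_to_end(&mut buf)?;
    if n != claimed_len {
        return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "truncated bin"));
    }
    Ok(buf)
}

fn main() -> io::Result<()> {
    // Honest input: 5 bytes claimed, 5 bytes present.
    let data = [1u8, 2, 3, 4, 5];
    let v = read_bin(&mut &data[..], 5)?;
    assert_eq!(v, data);
    // Lying header: claims 1 GiB but only 5 bytes exist; we error out
    // instead of allocating a gigabyte up front.
    assert!(read_bin(&mut &data[..], 1 << 30).is_err());
    Ok(())
}
```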
Force-pushed from 8ad31a7 to 1c384a0
Rebased against master.
Thanks for fixing this |
No more vector pre-allocation when the claimed data size cannot be verified in advance. See discussion in #151.
Fixes #151 / RUSTSEC-2017-0006.