This repository was archived by the owner on Aug 15, 2025. It is now read-only.

Feature/config limit #439

Merged
VictorKoenders merged 6 commits into trunk from feature/config-limit
Dec 11, 2021

Conversation

@VictorKoenders (Contributor)

Adds a limit option to the configuration and keeps track of how many bytes a decode operation claims it will read, returning a DecodeError::LimitExceeded when the claimed total exceeds that limit.

The reason it is set up this way is that we use a lot of Container::with_capacity when decoding containers. The limit exists primarily to avoid allocating too much memory when the data is garbage; for example, we don't want to accidentally allocate 10 GB of memory when the input is invalid.

I've added an unclaim system as well. As per the Decoder comments:

When decoding container types, a typical implementation would claim to read len * size_of::<T>() bytes.
This is to ensure that bincode won't allocate several GB of memory while constructing the container.

Because the implementation claims len * size_of::<T>() but then has to decode each T, those bytes would be counted double. This function allows us to un-claim each T that gets decoded.

We cannot check if len * size_of::<T>() is valid without claiming it, because this would mean that if you have
a nested container (e.g. Vec<Vec<T>>), it does not know how much memory is already claimed, and could easily
allocate much more than the user intends.

@VictorKoenders (Contributor, Author) commented Nov 29, 2021

TODO(self):

  • Add tests to verify that the limit actually gets hit
  • Maybe add a check to SliceReader: if the slice is larger than the current limit, error immediately?
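That second TODO could look something like the sketch below. The function name and error type are illustrative only, not bincode's API:

```rust
// Hypothetical upfront check for a slice-backed reader: if the input slice
// is already longer than the configured limit, fail before decoding anything.
fn check_slice_within_limit(input: &[u8], limit: Option<usize>) -> Result<(), String> {
    match limit {
        Some(max) if input.len() > max => Err(format!(
            "LimitExceeded: input is {} bytes, limit is {}",
            input.len(),
            max
        )),
        _ => Ok(()), // no limit configured, or the slice fits
    }
}

fn main() {
    assert!(check_slice_within_limit(&[0u8; 10], Some(5)).is_err());
    assert!(check_slice_within_limit(&[0u8; 10], Some(20)).is_ok());
    assert!(check_slice_within_limit(&[0u8; 10], None).is_ok());
}
```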

@codecov-commenter commented Nov 29, 2021

Codecov Report

Merging #439 (afc6719) into trunk (882e227) will increase coverage by 0.68%.
The diff coverage is 92.30%.


```
@@            Coverage Diff             @@
##            trunk     #439      +/-   ##
==========================================
+ Coverage   62.19%   62.87%   +0.68%
==========================================
  Files          46       46
  Lines        3383     3456      +73
==========================================
+ Hits         2104     2173      +69
- Misses       1279     1283       +4
```
| Impacted Files | Coverage Δ |
|---|---|
| src/error.rs | 0.00% <ø> (ø) |
| src/config.rs | 80.95% <50.00%> (-7.29%) ⬇️ |
| src/de/decoder.rs | 65.00% <76.92%> (+22.14%) ⬆️ |
| src/de/impls.rs | 79.07% <100.00%> (+1.80%) ⬆️ |
| src/de/mod.rs | 79.31% <100.00%> (+10.88%) ⬆️ |
| src/features/impl_alloc.rs | 98.24% <100.00%> (+0.16%) ⬆️ |
| src/features/impl_std.rs | 75.86% <100.00%> (+0.33%) ⬆️ |
| tests/alloc.rs | 100.00% <100.00%> (ø) |

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 882e227...afc6719. Read the comment docs.

@VictorKoenders VictorKoenders changed the title [WIP] Feature/config limit Feature/config limit Dec 9, 2021
@VictorKoenders VictorKoenders requested a review from a user December 9, 2021 08:28
```rust
// before
super::impl_core::collect_into_array(&mut (0..N).map(|_| T::decode(&mut decoder)));

// after
let result = super::impl_core::collect_into_array(&mut (0..N).map(|_| {
    // See the documentation on `unclaim_bytes_read` as to why we're doing this here
    decoder.unclaim_bytes_read(core::mem::size_of::<T>());
```
Rather than unclaiming in a loop, we could just unclaim right after claiming. That should reduce the per-iteration arithmetic.

@VictorKoenders (Contributor, Author):

The reason I unclaim per item is because of nested containers.

Consider a data type Vec<Vec<u8>> where the outer Vec has 100 entries and each inner Vec<u8> holds 100 bytes. Decoding the entire container would allocate about 10 kB of memory.

If our limit is set to 2500, the current impl would:

  • Allocate the outer Vec<Vec<_>> (100 * 24), count = 2400
  • Deallocate 1 inner Vec<u8> (24 bytes), count = 2376
  • Allocate the inner Vec<u8> (100 bytes), count = 2476
  • Deallocate 1 inner Vec<u8> (24 bytes), count = 2452
  • Allocate the inner Vec<u8> (100 bytes), count = 2552

This would fail at the 3rd allocation.

Whereas if we unclaim the entire vec immediately, we would get:

  • Allocate the outer Vec<Vec<_>> (100 * 24), count = 2400
  • Deallocate this, count = 0
  • 25x:
    • Allocate the inner Vec<u8> (100 bytes), count = 100, 200, etc

This would fail at the 25th allocation.
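The two strategies can be compared with plain arithmetic. The helper below is a stand-in for the decoder's byte counter, not bincode's real API, and it assumes a strict greater-than check against the limit (the exact step where the limit trips depends on that boundary choice):

```rust
// Simulation of the two unclaim strategies, using the comment's numbers:
// limit 2500, outer Vec claims 100 * 24 = 2400 bytes (24 assumed as
// size_of::<Vec<u8>>()), and each of the 100 inner Vec<u8>s holds 100 bytes.
// Returns how many inner vecs decode before the limit trips.
fn count_inner_vecs(limit: usize, unclaim_per_item: bool) -> usize {
    let mut claimed = 0usize;
    claimed += 100 * 24; // outer Vec<Vec<u8>> claims 100 elements up front
    if !unclaim_per_item {
        claimed -= 100 * 24; // eager strategy: unclaim the whole claim at once
    }
    let mut decoded = 0;
    for _ in 0..100 {
        if unclaim_per_item {
            claimed -= 24; // per-item strategy: give back one inner Vec header
        }
        claimed += 100; // the inner Vec's own 100 bytes
        if claimed > limit {
            break; // DecodeError::LimitExceeded in the real decoder
        }
        decoded += 1;
    }
    decoded
}

fn main() {
    assert_eq!(count_inner_vecs(2500, true), 1);   // per-item: trips almost at once
    assert_eq!(count_inner_vecs(2500, false), 25); // eager: 25 inner vecs fit
}
```

Under this strict check, the per-item strategy decodes only one inner Vec before tripping (the comment's 3rd allocation), while the eager strategy gets through 25, illustrating why the author kept the per-item accounting: it stays conservative for nested containers.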

```rust
{
    fn decode<D: Decoder>(mut decoder: D) -> Result<Self, DecodeError> {
        let len = crate::de::decode_slice_len(&mut decoder)?;
        decoder.claim_container_read::<T>(len)?;
```

Why is this used sometimes, while other times (like the array impl) it's done manually? Also, same comment here about immediately unclaiming.

@VictorKoenders (Contributor, Author):

This is specifically for containers (e.g. Vec, HashMap, etc.). An array [T; N] is, in my eyes, not a container.

@VictorKoenders VictorKoenders merged commit 240558d into trunk Dec 11, 2021
@VictorKoenders VictorKoenders deleted the feature/config-limit branch December 11, 2021 14:44