Skip to content

Fuzz: added fuzz test for network filter "local_ratelimit"#11608

Merged
htuch merged 25 commits intoenvoyproxy:masterfrom
jianwen612:local_rate_limit_fuzz
Jun 23, 2020
Merged

Fuzz: added fuzz test for network filter "local_ratelimit"#11608
htuch merged 25 commits intoenvoyproxy:masterfrom
jianwen612:local_rate_limit_fuzz

Conversation

@jianwen612
Copy link
Copy Markdown
Contributor

Commit Message:
Additional Description:

Added fuzz test code for network filter "local_ratelimit" which is defined in extensions/filters/network/local_ratelimit/local_ratelimit.h

Added protobuf as the fuzz test input.

Added a basic test case to cover the code.

Risk Level: low
Testing:
Increased the coverage from 1.2% to 100% on the filter's .cc and .h files.

/cc @asraa
/cc @samkerner

Auni Ahsan and others added 10 commits June 8, 2020 16:15
)

Signed-off-by: Auni Ahsan <auni@google.com>
Signed-off-by: jianwen <jianwendong@google.com>
…s branch

merge up-to-date master branch
Signed-off-by: jianwen <jianwendong@google.com>

Signed-off-by: jianwen <jianwendong@google.com>
Signed-off-by: jianwen <jianwendong@google.com>
Signed-off-by: jianwen <jianwendong@google.com>
50 ms

Signed-off-by: jianwen <jianwendong@google.com>
Signed-off-by: jianwen <jianwendong@google.com>
activeFilter.

Signed-off-by: jianwen <jianwendong@google.com>
Signed-off-by: jianwen <jianwendong@google.com>
Signed-off-by: jianwen <jianwendong@google.com>
Signed-off-by: jianwen <jianwendong@google.com>
Signed-off-by: jianwen <jianwendong@google.com>
added protobuf file.

Signed-off-by: jianwen <jianwendong@google.com>
Signed-off-by: jianwen <jianwendong@google.com>
@jianwen612 jianwen612 marked this pull request as draft June 16, 2020 21:54
Signed-off-by: jianwen <jianwendong@google.com>
@jianwen612 jianwen612 marked this pull request as ready for review June 16, 2020 22:03
Signed-off-by: jianwen <jianwendong@google.com>
Signed-off-by: jianwen <jianwendong@google.com>
Signed-off-by: jianwen <jianwendong@google.com>
sqkerner
sqkerner previously approved these changes Jun 16, 2020
Copy link
Copy Markdown
Contributor

@sqkerner sqkerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change looks good to me.

Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks great, a couple comment. Thanks.
/wait

}
}
} catch (const EnvoyException e) {
ENVOY_LOG_MISC(debug, "EnvoyException in fuzz test: {}", e.what());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is an arbitrary EnvoyException OK behavior here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is an arbitrary EnvoyException OK behavior here?

I catch EnvoyException because in
source/extensions/filters/network/local_ratelimit/local_ratelimit.cc
There is an EnvoyException thrown:
if (fill_interval_ < std::chrono::milliseconds(50)) { throw EnvoyException("local rate limit token bucket fill timer must be >= 50ms"); }

Do you have any suggestion on how to deal with this case?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jianwen has some history about this --
I think it's OK to add this constraint in the token bucket proto (which is only used in this filter), but the question came up of having 100% line coverage in the filter.

But the middleground might be just to catch the exception when creating the config

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jianwen has some history about this --
I think it's OK to add this constraint in the token bucket proto (which is only used in this filter), but the question came up of having 100% line coverage in the filter.

But the middleground might be just to catch the exception when creating the config

That makes sense. Modified.

@htuch htuch self-assigned this Jun 16, 2020
Copy link
Copy Markdown
Contributor

@asraa asraa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Would love to see some stats like exec/sec as well.

added TODO
removed unused deps and headers

Signed-off-by: jianwen <jianwendong@google.com>
@jianwen612
Copy link
Copy Markdown
Contributor Author

jianwen612 commented Jun 17, 2020

Thanks! Would love to see some stats like exec/sec as well.

At the beginning, exec/s: 300-500
#453 NEW cov: 5713 ft: 16233 corp: 127/28Kb lim: 4096 exec/s: 453 rss: 333Mb L: 415/935 MS: 8 CopyPart-ShuffleBytes-CopyPart-ShuffleBytes-ChangeBit-Custom-InsertByte-Custom-

After #[3000] : exec/s: 170
#3006 REDUCE cov: 80648 ft: 115428 corp: 297/94Kb lim: 4096 exec/s: 176 rss: 479Mb L: 461/1163 MS: 1 CustomCrossOver-

After #[10000]: exec/s: 111
#10116 REDUCE cov: 80653 ft: 118605 corp: 411/153Kb lim: 4096 exec/s: 111 rss: 768Mb L: 994/1582 MS: 3 CrossOver-Custom-CustomCrossOver-

Update: I ran it for additional several times, each time I deleted the corpus but the exec/s is different. Sometimes when #[4000] it slows down to 100. I remember the first time when I ran it, the exec/s slowed down to 30 exec/s. Is this normal?

@jianwen612 jianwen612 requested a review from asraa June 17, 2020 17:27
Signed-off-by: jianwen <jianwendong@google.com>
@sqkerner
Copy link
Copy Markdown
Contributor

Thanks! Would love to see some stats like exec/sec as well.

At the beginning, exec/s: 300-500
#453 NEW cov: 5713 ft: 16233 corp: 127/28Kb lim: 4096 exec/s: 453 rss: 333Mb L: 415/935 MS: 8 CopyPart-ShuffleBytes-CopyPart-ShuffleBytes-ChangeBit-Custom-InsertByte-Custom-

After #[3000] : exec/s: 170
#3006 REDUCE cov: 80648 ft: 115428 corp: 297/94Kb lim: 4096 exec/s: 176 rss: 479Mb L: 461/1163 MS: 1 CustomCrossOver-

After #[10000]: exec/s: 111
#10116 REDUCE cov: 80653 ft: 118605 corp: 411/153Kb lim: 4096 exec/s: 111 rss: 768Mb L: 994/1582 MS: 3 CrossOver-Custom-CustomCrossOver-

Update: I ran it for additional several times, each time I deleted the corpus but the exec/s is different. Sometimes when #[4000] it slows down to 100. I remember the first time when I ran it, the exec/s slowed down to 30 exec/s. Is this normal?

It is common for a fuzzer to have a high exec rate at first, then slow down. The reason is that only "shallow" code paths are explored at first. If only a small set of specific inputs cause coverage of a code path that is slow, the fuzzer will not execute it until it runs long enough to find such an input.

The simplest way to improve fuzzer speed is to write less text to logs.

Signed-off-by: jianwen <jianwendong@google.com>
@jianwen612 jianwen612 requested review from asraa and htuch June 18, 2020 21:04
}

} // NOLINT
// Silence clang-tidy here because it thinks there is a memory leak for "fill_timer_"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you seeing this and we don't see this in other tests? Can we just destruct at the end?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you seeing this and we don't see this in other tests? Can we just destruct at the end?

Actually we can't. It will crash and tell you that you are trying to free it twice.

This is the comment of this class about whether to delete or not(it also confuses me why it is designed like this): https://github.com/envoyproxy/envoy/blob/master/test/mocks/event/mocks.cc#L41

I think the reason why I'm seeing this is because this is the only place to use this(Event::MockTimer*) outside of unit test(gTest).

It's possible that gTest will clear all the memory inside a test case after the test case is run, and thus the clang-tidy won't give a warning like this.

But libfuzzer doesn't do this. It's running all the time and may not clear the memory after each test case(after n runs it will check memory leak), so clang-tidy gives this warning.

Currently all the usages of Event::MockTimer* are inside gtest(unit test). All the test cases don't destruct it at the end.

If you have any idea on how to deal with it, please let me know, thanks!

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, it looks like the local ratelimit Config object has a unique_ptr to the fill timer. So, it should be destructed, along with ActiveFilter, and free the timer. Do you see the Config object going through its destructor?

Copy link
Copy Markdown
Contributor Author

@jianwen612 jianwen612 Jun 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, it looks like the local ratelimit Config object has a unique_ptr to the fill timer. So, it should be destructed, along with ActiveFilter, and free the timer. Do you see the Config object going through its destructor?

In these classes there are no explicit destructors. I tried using a {} block to wrap all the code and observe through a weak_ptr<Config>, the use_count shows 0 after the code block(inside the block the count is 2).
So Config_ should have been destructed after each run. Since fill_timer_ is a unique_ptr inside Config_, it should also have been deleted. I think this explains why when I delete it manually it says I'm trying to free twice.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think clang-tidy is just having a false positive; there is no leak. Is there a way to narrow the NO_LINT to just this variable?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think clang-tidy is just having a false positive; there is no leak. Is there a way to narrow the NO_LINT to just this variable?

I tried to use NO_LINT on the variable but it doesn't have any effect. The error given by clang-tidy is in the last line of the block which has potential memory leak issues. NO_LINT could only have effect on one line instead of one variable. After reading the documentation about NO_LINT, I think the only param we can specify is the error type. We could use NOLINT(clang-analyzer-cplusplus.NewDeleteLeaks) instead.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the more specific one works, thanks.

Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just want to figure out what is going on with NOLINT, this shouldn't be needed.
/wait

Signed-off-by: jianwen <jianwendong@google.com>
@jianwen612 jianwen612 requested a review from htuch June 19, 2020 23:31
Signed-off-by: jianwen <jianwendong@google.com>
Signed-off-by: jianwen <jianwendong@google.com>
Signed-off-by: jianwen <jianwendong@google.com>
Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@htuch htuch merged commit 658ef43 into envoyproxy:master Jun 23, 2020
songhu pushed a commit to songhu/envoy that referenced this pull request Jun 25, 2020
…y#11608)

Added fuzz test code for network filter "local_ratelimit" which is defined in extensions/filters/network/local_ratelimit/local_ratelimit.h

Added protobuf as the fuzz test input.

Added a basic test case to cover the code.

Risk Level: low
Testing: Increased the coverage from 1.2% to 100% on the filter's .cc and .h files.

Signed-off-by: jianwen <jianwendong@google.com>
yashwant121 pushed a commit to yashwant121/envoy that referenced this pull request Jul 24, 2020
…y#11608)

Added fuzz test code for network filter "local_ratelimit" which is defined in extensions/filters/network/local_ratelimit/local_ratelimit.h

Added protobuf as the fuzz test input.

Added a basic test case to cover the code.

Risk Level: low
Testing: Increased the coverage from 1.2% to 100% on the filter's .cc and .h files.

Signed-off-by: jianwen <jianwendong@google.com>
Signed-off-by: yashwant121 <yadavyashwant36@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants