HTTP filter fuzzer base class and POC buffer filter fuzzer by asraa · Pull Request #9400 · envoyproxy/envoy

asraa · 2019-12-18T20:48:35Z

Generic HTTP filter fuzzer framework that aims to provide coverage over any registered HTTP filter.

This fuzz target fuzzes the decode methods of an HTTP Filter. The fuzzed data is:

A configuration for an HttpFilter
Untrusted HTTP data (headers, data, trailers)

A post-processor mutation sanitizes the configuration to have a registered name pulled from the NamedHttpFilterConfigFactory with a corresponding type_url.

The fuzzer then ingests these two inputs, creates the appropriate filter (if the serialized configuration can be parsed), and executes decode(Headers/Data/Trailers) on the fuzzed data.

Runs at about 250 exec/sec

Signed-off-by: Asra Ali asraa@google.com

Signed-off-by: Asra Ali <asraa@google.com>

asraa · 2019-12-18T20:49:32Z

cc @nareddyt @htuch @mattklein123
Would love some initial comments. This is just a skeleton / POC, and I'm opening up a GH Issue to discuss more the objectives and goals for these fuzzers.

qiwzhang · 2019-12-18T22:08:42Z

test/extensions/filters/http/buffer/buffer_filter_fuzz_test.cc

+
+DEFINE_PROTO_FUZZER(
+    const test::extensions::filters::http::buffer::BufferFilterFuzzTestCase& input) {
+  BufferFilterFuzz filter(input.config());


Another way is:

class DecoderFuzzer {
public:
DecoderFuzz(DecodeFilter* filer) : filter_(filter) {
filter_->setDcoderFilterCallbacks(calback_);
}

void fuzz() {
filter_->...
}

DocoderFilter* filter_{};
callback_;
};

Each custom filter not need to derive from it.

DEFINE_PROTO_FUZZER() {
filter = create custom filter;
fuzzer = DoecoerFuzzer(&filter);
fuzzer.fuzz(input.data());
}

^ Ooh. I like this a lot more. It leaves the filter fuzz writer with less responsibility to look at the internal of the DecoderFuzzer class. I'll update the POC.

mattklein123 · 2019-12-19T16:47:07Z

@asraa looks super cool. I think this is very worthwhile but it will get complicated since in the general case the filter fuzzer is going to have to replicate a lot of the logic that lives inside the HCM (might be a good opportunity to refactor some of it out). I will take a look at the GH issue that you referred to and we can discuss there.

nareddyt · 2019-12-19T17:41:49Z

test/extensions/filters/http/common/fuzz/decoder_filter_fuzz.h

+    // TODO: add end_stream handling and decodeTrailers
+    filter_->decodeHeaders(headers, false);
+    Buffer::OwnedImpl buffer(data.data());
+    filter_->decodeData(buffer, true);


For decodeData, could we call it multiple times for the buffer, with the last call being end_stream=true? Some of our custom filters have logic to handle streaming grpc data, so a single call would not encapsulate this case.

Yeah, that's a good point I asked about in the GH issue. We could make the inputs look more like an action stream with repeated Headers and Data, and call decode multiple times with the last call ending the stream. WDYT? It's useful to know that's good, I'll change at least the data to be repeated strings.

htuch

This is great. I think this is definitely the right direction, my comments are mostly trying to see how far we could take the removal of boiler plate to make it easier/cheaper to fuzz additional filters.

htuch · 2019-12-19T23:40:19Z

test/extensions/filters/http/buffer/buffer_filter_fuzz.proto

+
+import "validate/validate.proto";
+
+message BufferFilterFuzzTestCase {


Is it possible to use proto reflection and something like DynamicMessage with libproto-mutator? This could allow us to synthesize the HttpData + per-filter config type message on-the-fly, avoiding the need to have these boiler plate protos.

htuch · 2019-12-19T23:42:26Z

test/extensions/filters/http/buffer/buffer_filter_fuzz_test.cc

+      : config_(std::make_shared<BufferFilterConfig>(proto_config)) {
+    // Initialize filter and setup callbacks.
+    filter_ = std::make_unique<BufferFilter>(config_);
+    filter_->setDecoderFilterCallbacks(callbacks_);


Can this be in the base somehow? Ideally what we are shooting for here is as few lines of boiler plate as possible for every new filter we want to fuzz. I would think it might even be possible to build an Uber Filter Fuzzer that uses the factories to find all the filters, consumes some fuzz data to pick the filter, synthesizes a reasonable config, and then runs through this process, without having any per-filter boiler plate. That way, every single filter gets fuzzed the moment it is linked into Envoy :)

I would think it might even be possible to build an Uber Filter Fuzzer that uses the factories to find all the filters,

Moving in to the base is in the POC now.

I was working and researching a way to Uber Filter Fuzz, but am running in to a problem dissociating ActiveStream callbacks from the Filter instantiation. (There is a createFilter method that each Filter implements, and the appropriate one is called with createFilterFactoryFromProto. But these methods return a Http::FilterFactoryCb rather than an Http::Filter, so I can't dissociate the creation of the HTTP filter and an active stream. This may be related to the general issue of breaking down the complexity of HCM discussed here #9401. I could imagine the createFilter creating an Http::Filter, and wrapping the creation of the filter chain for a new stream separately.

Just made a version of the UberFilterFuzzer that hacks the mock expectation to grab the filter when adding to the stream and executes the code on that. It runs at about 250 exec/sec, which is fast as I can seem to get it by reducing the number of mock expectations per call to 0 (and just one static one)

I hardcoded the filter for now, but TODO to make a directory of filter configs and use that.

Signed-off-by: Asra Ali <asraa@google.com>

stale · 2019-12-27T23:28:56Z

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

Signed-off-by: Asra Ali <asraa@google.com>

htuch · 2020-01-02T16:19:19Z

test/extensions/filters/http/common/fuzz/filter_fuzz_test.cc

+)EOF";
+  // Create proto_config from the string.
+  envoy::config::filter::network::http_connection_manager::v2::HttpFilter proto_config;
+  TestUtility::loadFromYaml(config_string, proto_config);


I like the new approach to making this generic. What I think we can do to make this more robust is:

Use some initial entropy to index into the list of available HTTP filters. There's probably no convenient integer index mechanism right now, but it should be easy to add to the registry.

Once we have the filter factory, we can then ask it it createEmptyConfigProto(). This will give us a Protobuf::Message.

With the message obtained in (2) (and/or its descriptor), we ask libprotobuf-mutator to populate with structurally valid content.

We fuzz with UberFilterFuzzer as in this PR, that bit looks great.

All set. Here are some things:
3. The mutator sometimes produces an invalid config, and so this code is in a try/catch block. Sometimes PGV constraints are violated, sometimes exceptions like LuaExceptions are thrown for bad code. I'm wondering if it's better to run this a few times rather than abort the fuzz test immediately, although I think it doesn't make a difference.
4. The class is starting to get complicated. I could move it out in to a library.

htuch · 2020-01-02T16:20:19Z

test/extensions/filters/http/common/fuzz/filter_fuzz_test.cc

+        end_stream = true;
+      }
+      Buffer::OwnedImpl buffer(data.data().Get(i));
+      filter->decodeData(buffer, end_stream);


I think we need to consider the decode* return value, which indicates the stream status. If we're told to stop, we shouldn't provide more data, or we violate the filter's preconditions on these methods.

Signed-off-by: Asra Ali <asraa@google.com>

htuch

Cool, definitely getting closer to the generic push button filter fuzzer!

bazel/external/libprotobuf_mutator.BUILD

htuch · 2020-01-07T04:14:26Z

test/extensions/filters/http/common/fuzz/filter_fuzz_test.cc

+    protobuf_mutator::Mutator mutator;
+    try {
+      // Mutate and validate the chosen filter protobuf directly.
+      mutator.Mutate(proto_config.get(), 200);


I think we're just running the mutator here directly on an empty config. I wonder if instead we could make use of something in the input proto, so that the mutation is now a function of the selected filter's config type and some input entropy.

In an ideal world, there would be some type magic so that the outer LPM, driving the fuzz test, had some oracle to know the proto type that was going to be selected fro the test run. This magic doesn't exist, so I think the next best thing is for the outer LPM to produce some bit vector that we combine with the known proto type here with an inner LPM.

Maybe there is a smarter way to do this that allows the outer LPM to be the only LPM and/or allows it to do smarter cross-over and mutations, but I'm not seeing it.

Instead of fuzzing a filter_index and mutating an empty config, I opted for fuzzing configs, and registering a post processor mutation that will always ensure that we at least consume valid named configs. There is no more "inner mutator".

I think it does get closer to producing both random and mostly valid configs (mostly by the hopes that the fuzzer can find the right ones).

It doesn't solve the problem of the typed_config containing an Any which may or may not be parsed into the selected filter config, or that the text proto representation of the Any is binary encoded and ugly. If we do use this approach, I think automagically parsing through unit tests to create the corpus entries is a good way to go.

This makes sense, thanks. I think there are two things to think about here:

Agree on the corpus and generation, we could get a good baseline by generating this from unit tests. I think we would want a framework that makes this basically automatic any time you write a unit test for an HTTP filter.

The mutation of the Any seems to me to be an area that is still a bit hard to understand. LPM is going to see a sequence of bytes and do byte mutation rather than structured proto mutation I think. Could you tag the LPM author on this thread? Would be good to get their take on this.

@vitalybuka Would love some feedback on this fuzzer. We have a proto fuzzer that takes as input a message that has an embedded google.protobuf.Any.

envoy/api/envoy/extensions/filters/network/http_connection_manager/v3alpha/http_connection_manager.proto

Line 640 in c3bddae

message HttpFilter {

The PostProcessorRegistration constrains the filter name to match a supported filter.

The type of the typed_config field depends on the filter being instantiated. Because it is an Any, the value is a serialized message. Assuming LPM will perform byte mutation on this field, is there a better way to perform structured fuzzing on the value?

Update: I added some valid typed_configs to represent the buffer_filter into a corpus directory, and some ASSERT(false) in the source code to check if we ever generate a config with certain values for fields, and running the fuzzer hundreds of thousands of runs doesn't catch the assert. I logged the config we produce when fuzzing, and too many aren't valid.

@htuch specifically, I crashed if we ever set max request bytes to 0, and restricted to only fuzzer buffer filter. Too many inputs turned into hidden_envoy_deprecated_config with deeper and deeper struct definitions.

Signed-off-by: Asra Ali <asraa@google.com>

vitalybuka · 2020-01-10T21:02:01Z

test/extensions/filters/http/common/fuzz/filter_fuzz_test.cc

+      [](test::extensions::filters::http::FilterFuzzTestCase* input, unsigned int seed) {
+        // This ensures that the mutated configs all have valid filter names.
+        static const std::vector<absl::string_view> filter_names = Registry::FactoryRegistry<
+            Server::Configuration::NamedHttpFilterConfigFactory>::registeredNames();


Looks like LPM is not very helpful with google.protobuf.Any
We should add support for that. google/libprotobuf-mutator#151
For now you can either keep this post processor assuming it will not conflict with future LPM support or you can just remove it assuming fuzzing performance will improve in future on it's own.

Actually we need somehow to find supported types for any without changing the proto definition. I guess this callback can be useful for that. I don't know for sure yet how this is going to work on LPM side.

Yeah, I would guess it would essentially lookup the proto in the descriptor pool with type URL, unpack to it, and then do whatever is normally done by LPM, then repack. But, that's without any knowledge of LPM details. I think this is what we need to make this work efficiently, I know @asraa has been doing some experiments to see what can be done with the existing mutator, so there might be something useful even without that.

So the message with embedded Any must have (1) valid name, (2) appropriate type_url based on name, (3) Any of the matching type. With the existing mutator what I've done is:

After setting valid name, will LPM eventually create a non-trivial Any and appropriate type_url? -> Did not seem to after millions of runs

After setting valid name, and adding valid type_urls and serialized protos to a dictionary, will LPM combine appropriately? Nope.

After setting valid name and matching type_url, and adding serialized protos to a dictionary, will LPM populate the value field with the dictionary string? -> Yes! I got much more lucrative results.

So for now, I'm relying on dictionaries having full config strings pulled from unit test examples. I think we can iterate further later.

Signed-off-by: Asra Ali <asraa@google.com>

htuch · 2020-01-23T00:02:50Z

@asraa can you also update the comment at the top which still reads "Very wip.."? Thanks.

Signed-off-by: Asra Ali <asraa@google.com>

htuch · 2020-02-05T17:34:35Z

@htuch as a reminder.

stale · 2020-02-12T17:42:56Z

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

Signed-off-by: Asra Ali <asraa@google.com>

asraa · 2020-02-13T20:14:33Z

@yanavlasov can you please take a look any time you have a few min? thank you!

Signed-off-by: Asra Ali <asraa@google.com>

asraa · 2020-02-19T17:32:35Z

Master changes merged, applied changes to header map type specifications. @yanavlasov

yanavlasov · 2020-02-19T20:06:40Z

test/extensions/filters/http/common/fuzz/BUILD

+        "//test/mocks/buffer:buffer_mocks",
+        "//test/mocks/http:http_mocks",
+        "//test/mocks/server:server_mocks",
+    ] + envoy_all_extensions(),


Comment that this links all of innternal extensions.

yanavlasov · 2020-02-19T20:08:35Z

test/extensions/filters/http/common/fuzz/filter_fuzz_test.cc

+      headers.setMethod("GET");
+    }
+    if (headers.Host() == nullptr) {
+      headers.setHost("authority");


Make it foo.com?

Signed-off-by: Asra Ali <asraa@google.com>

mattklein123

Nice. Let's ship and iterate!

WIP filter fuzzer

2471e1a

Signed-off-by: Asra Ali <asraa@google.com>

asraa mentioned this pull request Dec 18, 2019

HTTP Filter Fuzzing #9401

Closed

qiwzhang reviewed Dec 18, 2019

View reviewed changes

nareddyt reviewed Dec 19, 2019

View reviewed changes

htuch reviewed Dec 19, 2019

View reviewed changes

address some comments

4d64e92

Signed-off-by: Asra Ali <asraa@google.com>

stale bot added stale stalebot believes this issue/PR has not been touched recently and removed stale stalebot believes this issue/PR has not been touched recently labels Dec 27, 2019

asraa added 4 commits December 30, 2019 13:49

poc dissociating filter from fuzzer

bc9778c

Signed-off-by: Asra Ali <asraa@google.com>

Faster version of the UberFilterFuzzer

8d9392e

Signed-off-by: Asra Ali <asraa@google.com>

Merge remote-tracking branch 'upstream/master' into httpfilterbase

50d94d7

Signed-off-by: Asra Ali <asraa@google.com>

clean up the UberFilterFuzzer

1a76168

Signed-off-by: Asra Ali <asraa@google.com>

htuch reviewed Jan 2, 2020

View reviewed changes

asraa added 5 commits January 2, 2020 12:19

cleanup this PR and remove the buffer specific fuzzer

097def8

Signed-off-by: Asra Ali <asraa@google.com>

mutate an empty proto config

84bfb3f

Signed-off-by: Asra Ali <asraa@google.com>

add randomizing the filter config

529b57d

Signed-off-by: Asra Ali <asraa@google.com>

catch lua exceptions for bad code, and move the decode out of try/except

d0c76c0

Signed-off-by: Asra Ali <asraa@google.com>

run fuzzer, add some testcases

59ad89c

Signed-off-by: Asra Ali <asraa@google.com>

htuch reviewed Jan 7, 2020

View reviewed changes

asraa added 3 commits January 7, 2020 12:43

Merge remote-tracking branch 'upstream/master' into httpfilterbase

74d6330

Signed-off-by: Asra Ali <asraa@google.com>

update v3alpha

de4b20d

Signed-off-by: Asra Ali <asraa@google.com>

fuzz filter configs with valid names

bdaa814

Signed-off-by: Asra Ali <asraa@google.com>

mattklein123 assigned htuch Jan 9, 2020

vitalybuka mentioned this pull request Jan 10, 2020

Support google.protobuf.Any google/libprotobuf-mutator#151

Closed

vitalybuka reviewed Jan 10, 2020

View reviewed changes

asraa added 2 commits January 14, 2020 13:01

Merge remote-tracking branch 'upstream/master' into httpfilterbase

e71d04b

Signed-off-by: Asra Ali <asraa@google.com>

add dictionary with serialized proto config

53c2d14

Signed-off-by: Asra Ali <asraa@google.com>

htuch changed the title ~~WIP: HTTP filter fuzzer base class and POC buffer filter fuzzer~~ HTTP filter fuzzer base class and POC buffer filter fuzzer Jan 23, 2020

asraa mentioned this pull request Jan 23, 2020

Automatic creation of the HTTP Filter Fuzzer dictionary #9798

Closed

asraa added 7 commits January 23, 2020 10:54

address comments and TODO

f5a2879

Signed-off-by: Asra Ali <asraa@google.com>

add dictionary word

954259f

Signed-off-by: Asra Ali <asraa@google.com>

Merge remote-tracking branch 'upstream/master' into httpfilterbase

8ea70c3

Signed-off-by: Asra Ali <asraa@google.com>

fix mutator spelling

6f7997d

Signed-off-by: Asra Ali <asraa@google.com>

patch in envoyproxy#156 LPM Any support

113fe4c

Signed-off-by: Asra Ali <asraa@google.com>

format

5af8e06

Signed-off-by: Asra Ali <asraa@google.com>

fix ext authz bugs

178758d

Signed-off-by: Asra Ali <asraa@google.com>

asraa mentioned this pull request Jan 30, 2020

CacheFilter: Fuzz inputs from downstream, upstream, and cache #9871

Open

update lpm and remove serialized strings

52ca2c7

Signed-off-by: Asra Ali <asraa@google.com>

yanavlasov self-assigned this Feb 5, 2020

stale bot added the stale stalebot believes this issue/PR has not been touched recently label Feb 12, 2020

asraa added 2 commits February 13, 2020 14:33

fix more crashes

745df8a

Signed-off-by: Asra Ali <asraa@google.com>

Merge remote-tracking branch 'upstream/master' into httpfilterbase

fc9aede

Signed-off-by: Asra Ali <asraa@google.com>

stale bot removed the stale stalebot believes this issue/PR has not been touched recently label Feb 13, 2020

fix format

ace39fb

Signed-off-by: Asra Ali <asraa@google.com>

asraa added 2 commits February 19, 2020 11:30

Merge remote-tracking branch 'upstream/master' into httpfilterbase

922b00a

Signed-off-by: Asra Ali <asraa@google.com>

fix test for merged change

20dca35

Signed-off-by: Asra Ali <asraa@google.com>

yanavlasov requested changes Feb 19, 2020

View reviewed changes

address comments

de26e08

Signed-off-by: Asra Ali <asraa@google.com>

yanavlasov approved these changes Feb 20, 2020

View reviewed changes

mattklein123 approved these changes Feb 20, 2020

View reviewed changes

mattklein123 merged commit fca0f0b into envoyproxy:master Feb 20, 2020


		import "validate/validate.proto";

		message BufferFilterFuzzTestCase {

Conversation

asraa commented Dec 18, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

asraa commented Dec 18, 2019

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mattklein123 commented Dec 19, 2019

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

htuch left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

asraa Dec 30, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

asraa Dec 30, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

stale bot commented Dec 27, 2019

Uh oh!

Choose a reason for hiding this comment

Uh oh!

asraa Jan 6, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

htuch left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

asraa Jan 8, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

htuch commented Jan 23, 2020

Uh oh!

htuch commented Feb 5, 2020

Uh oh!

stale bot commented Feb 12, 2020

Uh oh!

asraa commented Feb 13, 2020

Uh oh!

asraa commented Feb 19, 2020

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mattklein123 left a comment

asraa commented Dec 18, 2019 •

edited

Loading

asraa Dec 30, 2019 •

edited

Loading

asraa Dec 30, 2019 •

edited

Loading

asraa Jan 6, 2020 •

edited

Loading

asraa Jan 8, 2020 •

edited

Loading