libFuzzer support (WIP) by dvyukov · Pull Request #217 · dvyukov/go-fuzz

dvyukov · 2019-03-04T11:08:31Z

See #213
@guidovranken
@josharian please take a look as well, at least from the perspective that you are both touching the same files

dvyukov · 2019-03-04T11:09:34Z

+	return &ast.AssignStmt{
+		Lhs: []ast.Expr{counter},
+		Tok: token.ASSIGN,
+		Rhs: []ast.Expr{ &ast.BasicLit{Kind: token.INT, Value: "1"} },


Is this required? Or what's the motivation behind this change?

Because otherwise each unique amount of branches executed registers as a new coverage signal. This leads to almost every new input getting added to the corpus.

go-fuzz can operate in two modes (following AFL, as I understand it): either looking at which basic blocks are reached, or looking at a quantized count of how many times basic blocks are reached. See func roundUpCover and this quote from the AFL whitepaper:

In addition to detecting new tuples, the fuzzer also considers coarse tuple
hit counts. These are divided into several buckets:

1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+

But I'm still new here, so I might have gotten this wrong.
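
To make the quoted bucketing concrete, here is a minimal Go sketch. It is not go-fuzz's roundUpCover; the function name bucket and the choice to represent each bucket by its lower bound are assumptions made for illustration only.

```go
package main

import "fmt"

// bucket maps a raw hit count to the lower bound of its bucket, using the
// ranges quoted from the AFL whitepaper: 1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+.
// A count of 0 (block never reached) stays 0.
// Hypothetical helper, not part of go-fuzz.
func bucket(count uint8) uint8 {
	switch {
	case count <= 3:
		return count
	case count <= 7:
		return 4
	case count <= 15:
		return 8
	case count <= 31:
		return 16
	case count <= 127:
		return 32
	default:
		return 128
	}
}

func main() {
	for _, c := range []uint8{0, 1, 5, 9, 40, 200} {
		fmt.Printf("count %3d -> bucket %3d\n", c, bucket(c))
	}
}
```

With this kind of quantization, two runs that hit a block 5 and 6 times produce the same signal, while runs that hit it 5 and 9 times do not.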

In any case, if this is what libFuzzer needs, this behavior should probably be switched by flagLibFuzzer.

I still don't understand the full picture -- libfuzzer should support counters, but maybe it doesn't support them via __libfuzzer_extra_counters, or maybe the problem is that you dropped:

for i := range CoverTab { CoverTab[i] = 0 }

before each test.

Try to zero CoverTab before each test. If it does not help, then let's just switch between increments and setting to 1 depending on -libfuzzer flag.
But do we need to zero CoverTab anyway?... What does libfuzzer expect here?

I'm afraid that using actual counters (used in standard go-fuzz, and as opposed to a binary CoverTab as implemented in my PR), will lead to a "coverage explosion".

For instance:

for i := 0; i < input1; i++ {
	/* CoverTab updating inserted by go-fuzz here */
	for j := 0; j < input2; j++ {
		/* CoverTab updating inserted by go-fuzz here */
		for k := 0; k < input3; k++ {
			/* CoverTab updating inserted by go-fuzz here */
			for l := 0; l < input4; l++ {
				/* CoverTab updating inserted by go-fuzz here */
			}
		}
	}
}

With the standard go-fuzz CoverTab updating mechanism (incrementing the counter), this would lead to 2^32 different coverage signals, leading to 2^32 different files written to the corpus directory. This is obviously counter-productive since every unique input leads to new "coverage".

When we use my code that sets CoverTab to 1 if a branch is hit, and leaves it 0 as long as it is not, the max code coverage is 4 instead of 2^32, which seems a lot more useful, especially if you consider complex targets that have enough branches of their own to give off a useful coverage signal.

I'm not a proponent of zeroing CoverTab each iteration because this incurs an unnecessary speed penalty that can be simply avoided by doing CoverTab[N] = 1 in libFuzzer mode.

So I'll just make changes to the instrumentation code to insert my customized CoverTab updater.

In the future we can look at using a hybrid in the form of using 2 or 3 bits of coverage data per branch instead of 8 bits (standard go-fuzz) or 1 bit (my PR), depending on A/B testing.
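
The difference between the two update styles can be sketched with a small simulation. This is an illustration under stated assumptions, not go-fuzz code: it uses two nested loops instead of four, a two-entry table, and hypothetical helpers run and distinct that count how many different table states a range of inputs produces.

```go
package main

import "fmt"

const tabSize = 2

// run executes two nested loops and updates a tiny coverage table.
// In binary mode each entry is set to 1 when hit; otherwise it is
// incremented, as the standard go-fuzz instrumentation does.
func run(a, b int, binary bool) [tabSize]uint8 {
	var tab [tabSize]uint8
	hit := func(i int) {
		if binary {
			tab[i] = 1
		} else {
			tab[i]++
		}
	}
	for i := 0; i < a; i++ {
		hit(0)
		for j := 0; j < b; j++ {
			hit(1)
		}
	}
	return tab
}

// distinct counts the different table states seen for all a, b in [0, max).
func distinct(binary bool, max int) int {
	seen := map[[tabSize]uint8]bool{}
	for a := 0; a < max; a++ {
		for b := 0; b < max; b++ {
			seen[run(a, b, binary)] = true
		}
	}
	return len(seen)
}

func main() {
	fmt.Println("increment:", distinct(false, 8)) // 57 distinct states
	fmt.Println("set to 1: ", distinct(true, 8))  // 3 distinct states
}
```

Even with only two loops and inputs below 8, raw counters yield 57 distinct states against 3 for the binary table, and the gap widens with every additional loop level.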

dvyukov · 2019-03-04T11:14:00Z

-			syscall.Exit(1)
-		}
-		wr += n
+func Initialize(coverTabPtr unsafe.Pointer, coverTabSize uint64) {


Could we just add this function to the package and leave everything else intact?

I'd like to, but the init function is executed as soon as the binary is executed, and fails

failed to mmap fd = 3 errno = 9

Yes, we should skip that init in libfuzzer mode.
Perhaps we could put code for normal mode and libfuzzer into separate files and use libfuzzer tag to select one or another in go-fuzz-build.

dvyukov · 2019-03-04T11:27:14Z

+}
+
+int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
+    uint8_t* datacopy = malloc(size);


Why do we need to make a copy of the data rather than pass the data as is?

Because const uint8_t *data may not be passed to a function that does not treat the pointer as const. But implementing LLVMFuzzerTestOneInput in Go and using SliceHeader solves this, I think.

dvyukov · 2019-03-04T11:28:07Z

@@ -0,0 +1,38 @@
+#include <stdio.h>


This file needs the standard copyright header. Check out the latest version, it has just changed.

Addressed in latest commit.

dvyukov · 2019-03-04T11:29:34Z

+extern void gofuzz_Run(GoSlice p0);
+
+
+#ifdef __linux__


This will silently break on other OSes with obscure failure mode. It's better to make the build fail loudly instead.

Addressed in latest commit.

Push? I don't see it yet.

If we are looking at the same source (which is not always clear on github), then there is below:

#else
#error Currently only Linux is supported

dvyukov · 2019-03-04T11:30:46Z

+
+typedef long long GoInt64;
+typedef GoInt64 GoInt;
+typedef struct { void *data; GoInt len; GoInt cap; } GoSlice;


Let's pass data pointer/size to Go code instead and reconstruct the slice in Go code using https://golang.org/pkg/reflect/#SliceHeader. The less C code we have, the better.

Addressed in latest commit.

dvyukov · 2019-03-04T11:31:38Z

+#include <stddef.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <string.h>


How is this file involved in the build? I fail to understand how it becomes part of the build.

josharian · 2019-03-04T21:21:17Z

+extern void gofuzz_Run(GoSlice p0);
+
+
+#ifdef __linux__


Push? I don't see it yet.

josharian · 2019-03-04T21:22:06Z

+	return &ast.AssignStmt{
+		Lhs: []ast.Expr{counter},
+		Tok: token.ASSIGN,
+		Rhs: []ast.Expr{ &ast.BasicLit{Kind: token.INT, Value: "1"} },


In any case, if this is what libFuzzer needs, this behavior should probably be switched by flagLibFuzzer.

josharian · 2019-03-04T21:24:06Z

+	if *flagLibFuzzer == true {
+		archive := c.buildInstrumentedBinary(&blocks, nil)
+
+		if *flagOut == "" {


How about we unify the decision about what to use for *flagOut. So above:

if *flagOut == "" {
	suffix := ".zip"
	if *flagLibFuzzer {
		suffix = ".a"
	}
	*flagOut = c.pkgs[0].Name+"-fuzz"+suffix
}

Then if/when we add the Fuzz method name as context, we won't have to add it in three places. (We already need to add it in go-fuzz-build and go-fuzz.)
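
As a self-contained version of that suggestion, the default name can be computed in one place. The function name defaultOut and its parameters are assumptions made for this sketch; the real code reads *flagOut, *flagLibFuzzer and c.pkgs[0].Name directly.

```go
package main

import "fmt"

// defaultOut returns the output file name to use when -o is not given.
// libFuzzer mode produces an archive, the standard mode a zip file.
// Hypothetical helper following the naming proposed in the review.
func defaultOut(pkgName string, libFuzzer bool) string {
	suffix := ".zip"
	if libFuzzer {
		suffix = ".a"
	}
	return pkgName + "-fuzz" + suffix
}

func main() {
	fmt.Println(defaultOut("fmt", false))
	fmt.Println(defaultOut("fmt", true))
}
```

Any later change to the naming scheme, such as adding the Fuzz function name, would then touch only this helper.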

Since libFuzzer doesn't use the literals, let's move this whole thing up higher above the giant comment and make it standalone. Something like:

if *flagLibFuzzer {
	var blocks []CoverBlock
	archive := ...
	os.Rename ...
	return
}

Does the literal finder extract hardcoded strings from the source files?
That might actually be very useful to create a dictionary file from, that libFuzzer can consume via the -dict= parameter. But I'll have to test this.

Yes, it does.
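
A rough sketch of how extracted literals could be turned into dictionary lines for -dict=. It assumes the AFL-style syntax that libFuzzer dictionaries use, with one quoted entry per line and \xNN escapes for arbitrary bytes. The helper dictEntry is hypothetical and does not call go-fuzz's literal finder.

```go
package main

import (
	"fmt"
	"strings"
)

// dictEntry renders one literal as a quoted dictionary line.
// Printable ASCII is kept as is; quotes, backslashes and all other
// bytes are written as \xNN escapes.
// Hypothetical helper for illustration.
func dictEntry(lit []byte) string {
	var sb strings.Builder
	sb.WriteByte('"')
	for _, c := range lit {
		if c >= 0x20 && c < 0x7f && c != '"' && c != '\\' {
			sb.WriteByte(c)
		} else {
			fmt.Fprintf(&sb, "\\x%02X", c)
		}
	}
	sb.WriteByte('"')
	return sb.String()
}

func main() {
	literals := [][]byte{[]byte("GET "), {0x00, 0xff}, []byte(`a"b`)}
	for _, lit := range literals {
		fmt.Println(dictEntry(lit))
	}
}
```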

Should we also do the last step and run the following? This gives me a working fuzzer, which is nice.

clang target.a -fsanitize=fuzzer

This is interesting but won't work with oss-fuzz which requires that you link against -lFuzzingEngine.

OK, let's leave it as is for now.
We can get back to this if/when somebody has an interest in using it. We could build both, or have an additional flag.

josharian · 2019-03-04T21:26:32Z

+			*flagOut = c.pkgs[0].Name + ".a"
+		}
+
+		os.Rename(archive, *flagOut)


Check the returned error here

josharian · 2019-03-04T21:27:01Z

 	// GOROOT/pkg/tool and GOROOT/pkg/include.
 	// Even better, see if we can avoid making some copies
 	// at all, using some combination of env vars and toolexec.
+	c.copyDir(filepath.Join(c.GOROOT, "src", "runtime", "cgo"), filepath.Join(c.workdir, "goroot", "src", "runtime", "cgo"))


Why? Maybe only do this for libfuzzer?

josharian · 2019-03-04T21:28:30Z

 	return f
 }

+func getExtraBuildFlags() string {


josharian · 2019-03-04T21:29:05Z

 }

+func getMainSrc() string {
+	if *flagLibFuzzer == false {


Remove == false, use ! instead.

josharian · 2019-03-04T21:29:40Z

 	}
 }

+func getMainSrc() string {


s/getMainSrc/funcMain/

Or some such. But no leading "get".

josharian · 2019-03-04T21:29:59Z

+func getMainSrc() string {
+	if *flagLibFuzzer == false {
+		return mainSrc
+	} else {


Drop the else, outdent.
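
Taken together, the three comments on this function (no == false, no leading "get", no else) point to a shape like the following. This is a simplified sketch: the flag is passed as a parameter and the two source templates are placeholder strings, both assumptions made so that the example runs on its own.

```go
package main

import "fmt"

// Placeholder templates; the real ones are large source strings.
const (
	mainSrc          = "// standard go-fuzz main (placeholder)"
	mainSrcLibFuzzer = "// libFuzzer main (placeholder)"
)

// funcMain returns the source of the generated main function.
// The early return replaces the if/else, and !libFuzzer replaces
// the comparison with false.
func funcMain(libFuzzer bool) string {
	if !libFuzzer {
		return mainSrc
	}
	return mainSrcLibFuzzer
}

func main() {
	fmt.Println(funcMain(false))
	fmt.Println(funcMain(true))
}
```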

josharian · 2019-03-04T21:33:34Z

-}

-func Main(f func([]byte) int) {
-	runtime.GOMAXPROCS(1) // makes coverage more deterministic, we parallelize on higher level


If this is going to get upstreamed, we obviously need to find some way not to delete all this code, but instead have it live harmoniously side-by-side. :)

It might even make sense just to make a new package instead, and have go-fuzz-build switch as needed; it doesn't seem like there's much code overlap.

Could build tags be of any help here? So that if go-fuzz-build runs the compilation process, the tag 'libFuzzer' would use a special version of go-fuzz-dep/main.go, and without a build tag it uses the default?
IDK, I'm not really a Go guy.

Yes, we could use build tags. It depends on how much code overlap there is. If there's substantive code overlap, build tags would be a fine idea. If the code is mostly disjoint, we may as well just have a go-fuzz-dep-libfuzzer instead.

Let's do 2 files and build tags. go-fuzz-dep is mentioned a bunch of times in go-fuzz-build, so just adding a tag looks simpler. Tag may turn out to be useful in other places as well (maybe even in user code?).
And in the end this is all strictly internal to go-fuzz, so we can change it in future.
Since we currently define gofuzz, let's go with gofuzz-libfuzzer (or gofuzz_libfuzzer if dashes are not allowed) to keep all our tags prefixed with gofuzz.

Now supporting 2 files using build tags.
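
How a tag selects one of two files can be checked with the standard go/build/constraint package. The two constraint lines below are assumptions chosen for illustration, not copied from the PR, and selected is a hypothetical helper.

```go
package main

import (
	"fmt"
	"go/build/constraint"
)

// selected reports whether a file carrying the given constraint line
// would be compiled when exactly the listed tags are set.
// Hypothetical helper for illustration.
func selected(line string, tags ...string) (bool, error) {
	expr, err := constraint.Parse(line)
	if err != nil {
		return false, err
	}
	set := map[string]bool{}
	for _, t := range tags {
		set[t] = true
	}
	return expr.Eval(func(tag string) bool { return set[tag] }), nil
}

func main() {
	// Assumed constraint lines for the two go-fuzz-dep variants.
	standard := "// +build gofuzz,!gofuzz_libfuzzer"
	libFuzzer := "// +build gofuzz_libfuzzer"

	for _, tags := range [][]string{{"gofuzz"}, {"gofuzz", "gofuzz_libfuzzer"}} {
		s, _ := selected(standard, tags...)
		l, _ := selected(libFuzzer, tags...)
		fmt.Printf("tags %v: standard file=%v, libFuzzer file=%v\n", tags, s, l)
	}
}
```

With constraints of this form, exactly one of the two files is compiled for any tag set that includes gofuzz, so both can define the same package-level names.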

dvyukov · 2019-03-05T12:02:30Z

+	return &ast.AssignStmt{
+		Lhs: []ast.Expr{counter},
+		Tok: token.ASSIGN,
+		Rhs: []ast.Expr{ &ast.BasicLit{Kind: token.INT, Value: "1"} },


I still don't understand the full picture -- libfuzzer should support counters, but maybe it doesn't support them via __libfuzzer_extra_counters, or maybe the problem is that you dropped:

for i := range CoverTab { CoverTab[i] = 0 }

before each test.

Try to zero CoverTab before each test. If it does not help, then let's just switch between increments and setting to 1 depending on -libfuzzer flag.
But do we need to zero CoverTab anyway?... What does libfuzzer expect here?

dvyukov · 2019-03-05T12:08:41Z

-			syscall.Exit(1)
-		}
-		wr += n
+func Initialize(coverTabPtr unsafe.Pointer, coverTabSize uint64) {


Yes, we should skip that init in libfuzzer mode.
Perhaps we could put code for normal mode and libfuzzer into separate files and use libfuzzer tag to select one or another in go-fuzz-build.

dvyukov · 2019-03-05T12:14:04Z

-}

-func Main(f func([]byte) int) {
-	runtime.GOMAXPROCS(1) // makes coverage more deterministic, we parallelize on higher level


Let's do 2 files and build tags. go-fuzz-dep is mentioned a bunch of times in go-fuzz-build, so just adding a tag looks simpler. Tag may turn out to be useful in other places as well (maybe even in user code?).
And in the end this is all strictly internal to go-fuzz, so we can change it in future.
Since we currently define gofuzz, let's go with gofuzz-libfuzzer (or gofuzz_libfuzzer if dashes are not allowed) to keep all our tags prefixed with gofuzz.

dvyukov · 2019-03-05T12:17:36Z

+extern void gofuzz_Run(GoSlice p0);
+
+
+#ifdef __linux__


If we are looking at the same source (which is not always clear on github), then there is below:

#else
#error Currently only Linux is supported

dvyukov · 2019-03-05T12:18:48Z

+}
+
+//export LLVMFuzzerTestOneInput
+func LLVMFuzzerTestOneInput(data uintptr, size uint64) int {


Nice!
This improves a whole bunch of things -- no copy of data, no construction of Go slice in C code, less C code.

dvyukov · 2019-03-05T12:57:51Z

-	}
-	return deserialize64(buf[:])
+func init() {
+	CoverTab = (*[CoverSize]byte)(unsafe.Pointer(&CoverTabTmp[0]))


It would be super nice to put this directly into go-fuzz-dep package:

// #cgo CFLAGS: -Wall -Werror
/*
#ifdef __linux__
__attribute__((weak, section("__libfuzzer_extra_counters")))
#else
#error Currently only Linux is supported
#endif
unsigned char __go_fuzz_counters[65536];
*/
import "C"

func init() {
	CoverTab = (*[CoverSize]byte)(unsafe.Pointer(&C.__go_fuzz_counters[0]))
}

Then we would not need the bootstrap CoverTab and the Initialize function.
But unfortunately cgo can't be used here:

go-fuzz$ go install ./... && go-fuzz-build -libfuzzer ../go-fuzz-corpus/fmt && clang fmt.a -fsanitize=fuzzer
gopath/src/github.com/dvyukov/go-fuzz/go-fuzz-dep/main.go:9:2: could not import C (no metadata for C)
gopath/src/github.com/dvyukov/go-fuzz/go-fuzz-dep/main.go:24:8: C redeclared in this block
	gopath/src/github.com/dvyukov/go-fuzz/go-fuzz-dep/main.go:9:2: other declaration of C
typechecking of ../go-fuzz-corpus/fmt failed

@josharian do you know if it's theoretically possible to support cgo here?
It's not a super big deal because the current scheme works. But if it's easy to do, then it would be nice.

dvyukov · 2019-03-06T09:23:24Z

I think we are almost there.

It has merge conflicts and does not seem to be based on current HEAD.
Please squash it all into a single commit, rebase on top of master HEAD and force push.

josharian · 2019-03-06T14:24:53Z

Please squash it all into a single commit, rebase on top of master HEAD and force push.

+1. And I’ll take a last look then as well. Thanks!

dvyukov · 2019-03-06T09:19:22Z

+	    if err != nil {
+	        c.failf("failed to rename file: %v", err)
+	    }
+		return


This looks unformatted, please run go fmt ./...

dvyukov · 2019-03-06T09:20:26Z

+)
+
+
+


Please remove 2 out of 3 new lines.

dvyukov · 2019-03-06T09:21:53Z

+	CoverTab = (*[CoverSize]byte)(coverTabPtr)
+}
+
+func Main(f func([]byte) int) {


Remove this, but only in this file; it does not do anything useful and can confuse future readers as to what its purpose is.

If I remove it, I get this:

failed to execute go build: exit status 2
# internal/testlog
/home/jhg/gofuzzpr/go/src/internal/testlog/log.go:69: undefined: gofuzzdep.Main
# errors
/home/jhg/gofuzzpr/go/src/errors/errors.go:20: undefined: gofuzzdep.Main
# math/bits
/home/jhg/gofuzzpr/go/src/math/bits/bits.go:535: undefined: gofuzzdep.Main
/home/jhg/gofuzzpr/go/src/math/bits/bits_tables.go:83: undefined: gofuzzdep.Main
# unicode/utf8
/home/jhg/gofuzzpr/go/src/unicode/utf8/utf8.go:521: undefined: gofuzzdep.Main
# internal/syscall/unix
/home/jhg/gofuzzpr/go/src/internal/syscall/unix/at.go:58: undefined: gofuzzdep.Main
/home/jhg/gofuzzpr/go/src/internal/syscall/unix/at_sysnum_linux.go:13: undefined: gofuzzdep.Main
/home/jhg/gofuzzpr/go/src/internal/syscall/unix/at_sysnum_newfstatat_linux.go:11: undefined: gofuzzdep.Main
/home/jhg/gofuzzpr/go/src/internal/syscall/unix/getrandom_linux.go:46: undefined: gofuzzdep.Main
/home/jhg/gofuzzpr/go/src/internal/syscall/unix/getrandom_linux_amd64.go:9: undefined: gofuzzdep.Main
/home/jhg/gofuzzpr/go/src/internal/syscall/unix/nonblocking.go:17: undefined: gofuzzdep.Main
# unicode
/home/jhg/gofuzzpr/go/src/unicode/casetables.go:20: undefined: gofuzzdep.Main
/home/jhg/gofuzzpr/go/src/unicode/digit.go:13: undefined: gofuzzdep.Main
/home/jhg/gofuzzpr/go/src/unicode/graphic.go:144: undefined: gofuzzdep.Main
/home/jhg/gofuzzpr/go/src/unicode/letter.go:370: undefined: gofuzzdep.Main
/home/jhg/gofuzzpr/go/src/unicode/tables.go:7761: undefined: gofuzzdep.Main

See #217 (comment)

dvyukov · 2019-03-06T15:02:23Z

@@ -0,0 +1,41 @@
+// Copyright 2015 Dmitry Vyukov. All rights reserved.


Note: the copyright message has changed on HEAD to a more appropriate one.

guidovranken · 2019-03-08T15:02:42Z

I will finish this tonight.

dvyukov · 2019-03-11T08:24:45Z

 	// Question: Do it here or in copyDir?

-	// TODO: See if we can avoid making toolchain copies,
+  // TODO: See if we can avoid making toolchain copies,


This looks unformatted. Please run go fmt ./...

dvyukov · 2019-03-11T08:27:58Z

Thanks, Guido!
@josharian do you have any other comments?
github still says "This branch cannot be rebased due to conflicts". Please do git rebase origin/master.

josharian · 2019-03-11T13:21:00Z

@josharian do you have any other comments?

I've been waiting for the branch to be rebased and squashed to a single commit before I do another review. I think we're almost there, though.

libFuzzer C shim: fail if host OS is not Linux
Address several issues mentioned in github.com//pull/217
- Use camel case where it is customary
- Implement C shim code in Go
- Use SliceHeader to convert fuzzer input to Go slice
go-fuzz-build: Better function names
go-fuzz-build: Remove else in funcMain()
go-fuzz-build: Condense testing of flagLibFuzzer
go-fuzz-build: Remove else in extraBuildFlags()
go-fuzz-build: Deal with os.Rename() return value
Move remaining C functionality to Go
Use build tags to select customized go-fuzz-dep code for -libfuzzer
Build tag libfuzzer -> gofuzz_libfuzzer
go-fuzz-build/main.go: go fmt
go-fuzz-dep/main_libFuzzer.go: Remove excess newlines
Update copyright notice
go-fuzz-build populateWorkdir(): Copy cgo directory only in libFuzzer mode

guidovranken · 2019-03-11T20:36:31Z

Done.

josharian

This LGTM. There are a few nits remaining, but I can just take care of those myself afterwards, since @guidovranken has been patient enough with review cycles as it is.

Before committing, though, I would like the commit message to be cleaned up a bit and force pushed.

Frustratingly, GitHub doesn't let me write line comments on the commit message (do they not care‽), but:

Please remove " (WIP)" from the title.
Instead of having a list of all the messy work we did along the way in the body, please write a short description instead. Something like: "This change adds a -libfuzzer flag to go-fuzz-build. When provided, go-fuzz-build generates an archive file that can be used with libFuzzer. Sample usage: . It also adds a new build tag, gofuzz_libfuzzer, that will be provided when building code for use with libFuzzer."

I will probably take that terminal transcript and send a follow-up change adding something to the README about this.

Thanks again for your patience with the review.

josharian · 2019-03-11T20:50:45Z

+	if *flagLibFuzzer {
+		archive := c.buildInstrumentedBinary(&blocks, nil)
+
+		if *flagOut == "" {


I still think we should do my first suggestion in this thread, about adjusting *flagOut only once. But we can merge without that and I can do it as a follow-up.

josharian · 2019-03-11T21:04:18Z

 }
 `
+
+var mainSrcLibFuzzer = `


I will probably rewrite these to use text/template at some point in the future, since they are pretty large, and it is hard to easily see the formatting directives. Fine for now, though.

josharian · 2019-03-11T21:07:16Z

+	. "github.com/dvyukov/go-fuzz/go-fuzz-defs"
+)
+
+// Bool is just a bool.


I'd like to factor the shared code into a third file that is only protected by the gofuzz build tag.

But again, I'm happy to do that myself as a follow-up after this is merged.

dvyukov · 2019-03-12T08:48:34Z

This LGTM. There are a few nits remaining, but I can just take care of those myself afterwards, since @guidovranken has been patient enough with review cycles as it is.

Agree.
I will merge this now with changed commit message.
@josharian please do the cleanups afterwards.

dvyukov · 2019-03-12T08:51:41Z

Btw github allows me to edit commit title/description when squashing:
897eea5

Doh! This is not attributed to @guidovranken. Is it because I created the PR? I kinda created it just to see the diff myself. But since I created it off the @guidovranken tree, all pushes updated the PR so it became the review PR...

josharian · 2019-03-12T17:59:26Z

Turns out this broke non-libfuzzer builds. I'm working on it now.

josharian · 2019-03-12T21:25:46Z

Follow-ups: #220

dvyukov mentioned this pull request Mar 4, 2019

libFuzzer support #213

Closed

josharian approved these changes Mar 11, 2019

dvyukov merged commit 897eea5 into dvyukov:master Mar 12, 2019

josharian mentioned this pull request Mar 12, 2019

cannot find "-o" package on vanilla installation/usage #221

Closed
