AFL: segfault and lock resetting (fixes #497) by jmid · Pull Request #731 · ocaml-multicore/ocaml-multicore

jmid · 2021-11-03T23:43:36Z

This is a fix to get AFL-instrumentation running on multicore again.

First off, the segfault reported in #497 is caused by a NULL-pointer dereference during error reporting.
The error is triggered because afl-fuzz runs the tested program under a memory budget with setrlimit:
https://github.com/google/AFL/blob/61037103ae3722c8060ff7082994836a794f978e/afl-fuzz.c#L2034-L2040
The allocation of the minor heap space in caml_init_domains thus fails and calls caml_raise_out_of_memory.
It eventually ends up in caml_raise where it dereferences Caml_state which hasn't been initialized yet:

ocaml-multicore/runtime/fail_nat.c, line 74 in 13a6be2:

    exception_pointer = (char*)Caml_state->c_stack;

The PR changes caml_init_domains to use caml_fatal_error consistently.

Secondly, the AFL-instrumentation uses fork as an optimization to push more tests through.
For a simple setup, afl-fuzz can fork and execute a program repeatedly for each mutated input.
This is the behaviour you get with AFL_NO_FORKSRV=1 afl-fuzz -i input -o output a.out (also mentioned in #497).
To save time on redundant linking and initialization, AFL also offers a "fork server" mode. This mode instead pauses the instrumented program at the point just before "proper execution" begins, and then cheaply forks test programs whenever afl-fuzz tells it to:

ocaml-multicore/runtime/afl.c, lines 116 to 120 in 13a6be2:

    while (1) {
      int child_pid = fork();
      if (child_pid < 0) caml_fatal_error("afl-fuzz: could not fork");
      else if (child_pid == 0) {
        /* Run the program */

This setup is described in more detail in https://lcamtuf.blogspot.com/2014/10/fuzzing-binaries-without-execve.html
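A minimal sketch of one round of such a fork server, assuming the usual shape of the protocol (a 4-byte request on a control descriptor, then the child's pid and wait status written back on a status descriptor). It uses ordinary pipes rather than AFL's fixed descriptors, and the names fork_server_once, exits_with_two and demo_exit_code are hypothetical. In the real runtime the child simply continues into the OCaml program instead of calling a function pointer.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Serve one request: wait for a 4-byte "go" on ctl_fd, fork, run `program`
   in the child, then report the child's pid and wait status on st_fd.
   Returns 0 on success, -1 on any I/O or fork failure. */
int fork_server_once(int ctl_fd, int st_fd, int (*program)(void)) {
  uint32_t go;
  if (read(ctl_fd, &go, 4) != 4) return -1;

  pid_t child = fork();
  if (child < 0) return -1;
  if (child == 0) {
    /* The child must not keep the server's descriptors. */
    close(ctl_fd);
    close(st_fd);
    _exit(program());
  }

  int32_t pid32 = (int32_t)child;
  if (write(st_fd, &pid32, 4) != 4) return -1;

  int status = 0;
  if (waitpid(child, &status, 0) < 0) return -1;
  int32_t st32 = (int32_t)status;
  if (write(st_fd, &st32, 4) != 4) return -1;
  return 0;
}

/* Stand-in for the tested program: exits with status 2. */
static int exits_with_two(void) { return 2; }

/* Play the fuzzer's side over two pipes and return the child's exit code. */
int demo_exit_code(void) {
  int ctl[2], st[2];
  if (pipe(ctl) != 0 || pipe(st) != 0) return -1;

  uint32_t go = 0;
  if (write(ctl[1], &go, 4) != 4) return -1;
  if (fork_server_once(ctl[0], st[1], exits_with_two) != 0) return -1;

  int32_t pid32 = 0, st32 = 0;
  if (read(st[0], &pid32, 4) != 4 || read(st[0], &st32, 4) != 4) return -1;

  close(ctl[0]); close(ctl[1]); close(st[0]); close(st[1]);
  return WIFEXITED(st32) ? WEXITSTATUS(st32) : -1;
}
```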
Note: Because we are still in initialization, this fork will only execute with 1 domain running.
The child, however, has problems with unlocking, akin to #471. Since afl-fuzz silences the tested program
https://github.com/google/AFL/blob/61037103ae3722c8060ff7082994836a794f978e/afl-fuzz.c#L2067-L2068
an strace -f helped reveal the problem:

...
[pid 97880] close(198)                  = 0
[pid 97880] close(199)                  = 0
[pid 97880] write(2, "Fatal error during unlock", 25) = 25
[pid 97880] write(2, ": Operation not permitted\n", 26) = 26
[pid 97880] exit_group(2)               = ?
[pid 97880] +++ exited with 2 +++
[pid 97879] <... wait4 resumed>[{WIFEXITED(s) && WEXITSTATUS(s) == 2}], WSTOPPED, NULL) = 97880
[pid 97879] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=97880, si_uid=1000, si_status=2, si_utime=0, si_stime=0} ---
[pid 97879] write(199, "\0\2\0\0", 4)   = 4
...
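The final write(199, "\0\2\0\0", 4) in the trace appears to be the fork server passing the child's raw wait status back to afl-fuzz. A small sketch of how those four bytes decode, assuming the traced host is little-endian and uses the usual POSIX wait-status macros (the function name is hypothetical):

```c
#include <stdint.h>
#include <sys/wait.h>

/* Rebuild the 32-bit wait status from the four bytes written to the status
   descriptor (little-endian assumed), then extract the exit code.
   Returns -1 if the status does not describe a normal exit. */
int exit_code_from_status_bytes(const unsigned char b[4]) {
  uint32_t raw = (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
                 ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
  int status = (int)raw;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Applied to the bytes from the trace, this gives exit code 2, consistent with the exit_group(2) and WEXITSTATUS(s) == 2 lines above it.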

Because the child quickly and silently fails during afl-fuzz's perform_dry_run, afl-fuzz reports it as
"No instrumentation detected", which is a bit confusing: https://github.com/google/AFL/blob/61037103ae3722c8060ff7082994836a794f978e/afl-fuzz.c#L2629-L2632

To fix the unlock issue, the PR uses the caml_atfork_hook from 1a2c612.
With this fix the test case from #497 runs again using -m none (no memory budget).
I'll submit a separate PR regarding heap memory requirements.

ctk21

So using caml_fatal_error in caml_init_domains is a good thing.

I'm a bit worried about allowing fork in caml_setup_afl without checking that there are not multiple domains running, but I'm not an AFL expert. From what I could tell, caml_setup_afl is called from 'instrumented modules during initialization'. Is it not possible for the following to happen:

module A: uninstrumented and starts a domain
module B: instrumented but initialized after module A at which point we have multiple domains

If this could happen, it could be fixed by inserting a caml_domain_alone() guard and erroring out if multiple domains are running?

kayceesrk · 2021-11-08T07:51:14Z

> If this could happen, it could be fixed by inserting a caml_domain_alone() guard and erroring out if multiple domains are running?

This seems like a reasonable step to move forward.

jmid · 2021-11-08T16:28:35Z

Following @ctk21's suggestion I've added a caml_domain_alone guard to ensure that the afl-instrumentation's "fork-server" does not invoke fork after other domains have already spawned.
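The shape of such a guard can be sketched as follows. This is a simplified illustration under assumptions: the counter and function names are hypothetical, and it returns 0 where the runtime would presumably stop with an error, so that the behaviour is easy to check.

```c
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical count of running domains; the main domain counts as 1. */
static atomic_int running_domains = 1;

/* Hypothetical analogue of a "domain alone" check. */
static int domain_alone(void) {
  return atomic_load(&running_domains) == 1;
}

/* Record that another domain has been spawned. */
void note_domain_spawned(void) {
  atomic_fetch_add(&running_domains, 1);
}

/* Returns 1 if the fork server may call fork, 0 otherwise.
   A real runtime would likely abort with a fatal error instead. */
int fork_server_may_fork(void) {
  if (!domain_alone()) {
    fprintf(stderr, "afl-fuzz: cannot fork with multiple domains running\n");
    return 0;
  }
  return 1;
}
```

This covers the scenario raised in the review: an uninstrumented module spawns a domain first, and the instrumented module initialized afterwards is then refused the fork.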

jmid · 2021-11-08T23:06:14Z

I noticed that ./configure -afl-instrument and make fails with:

../boot/ocamlrun ../ocamlopt -strict-sequence -absname -w +a-4-9-41-42-44-45-48-70 -g -warn-error +A -bin-annot -nostdlib -principal -safe-string -strict-formats  -function-sections  -afl-inst-ratio 0 -c camlinternalLazy.ml
../boot/ocamlrun ../ocamlopt -strict-sequence -absname -w +a-4-9-41-42-44-45-48-70 -g -warn-error +A -bin-annot -nostdlib -principal -safe-string -strict-formats  -function-sections   \
           -o stdlib__Lazy.cmx -c lazy.ml
>> Fatal error: Primitive CamlinternalLazy.force not found.
Fatal error: exception Misc.Fatal_error
make[4]: *** [Makefile:237: stdlib__Lazy.cmx] Error 2

This is caused by c0ef11f AFAICS which unified force and force_val in one generic force_gen operation (CC @gadmm). For now I've gone with just calling that, but one could also roll back the unified interface to minimize interface changes to trunk. I've furthermore updated the comment in camlinternalLazy.ml to match the updated comment in trunk OCaml, so that it is clearer that the operation is actually used.

Thematically this is part of getting "AFL instrumentation working again" - but this can also be submitted as a separate PR if you prefer.

gadmm · 2021-11-08T23:22:54Z

> This is caused by c0ef11f AFAICS which unified force and force_val in one generic force_gen operation (CC @gadmm). For now I've gone with just calling that, but one could also roll back the unified interface to minimize interface changes to trunk. I've furthermore updated the comment in camlinternalLazy.ml to match the updated comment in trunk OCaml, so that it is clearer that the operation is actually used.

Thanks, this seems correct to me, but I am not an expert of the affected code so it is best proof-read by someone else as well.

kayceesrk

The changes look good to me.

jmid added 2 commits November 3, 2021 21:45

reset child's lock under afl-instrumentation following 1a2c612

6726edd

print error when mmap fails, rather than NULL-deref Caml_state in fai…

8a93d45

…l_nat.c

jmid mentioned this pull request Nov 4, 2021

Reduce minor heap reservation #732

Closed

Fix 80-column overflow

0fa0157

ctk21 reviewed Nov 5, 2021

add caml_domain_alone guard of fork

a2e4e7a

update CamlinternalLazy.force to force_gen

a62d9f5

kayceesrk approved these changes Nov 9, 2021

jmid merged commit b47bdeb into ocaml-multicore:5.00 Nov 9, 2021

jmid mentioned this pull request Nov 11, 2021

Crash running multicore binary under AFL #611

Closed

sadiqj pushed a commit to sadiqj/ocaml that referenced this pull request Jan 10, 2022

Merge pull request ocaml-multicore/ocaml-multicore#731 from jmid/afl-…

8b18477

…fix-fatal-unlock AFL: segfault and lock resetting (fixes ocaml#497). Also fixes broken ./configure -afl-instrument && make

xavierleroy mentioned this pull request Jan 10, 2022

AFL in OCaml 5.00 ocaml/ocaml#10864

Closed

ctk21 pushed a commit to ctk21/ocaml that referenced this pull request Jan 11, 2022

Merge pull request ocaml-multicore/ocaml-multicore#731 from jmid/afl-…

d9bdc3c

…fix-fatal-unlock AFL: segfault and lock resetting (fixes ocaml#497). Also fixes broken ./configure -afl-instrument && make

jmid merged 5 commits into ocaml-multicore:5.00 from jmid:afl-fix-fatal-unlock
