Skip to content

Fix AFL show-map test#1789

Closed
ncik-roberts wants to merge 1 commit intomainfrom
fix-afl-test
Closed

Fix AFL show-map test#1789
ncik-roberts wants to merge 1 commit intomainfrom
fix-afl-test

Conversation

@ncik-roberts
Copy link
Copy Markdown
Contributor

@ncik-roberts ncik-roberts commented Aug 31, 2023

Fix an AFL instrumentation test, which is failing locally and skipped in CI. I believe it's failing for an uninteresting reason, so this PR is just a band-aid to get it passing again.

Review: @mshinwell says that @stedolan is familiar with this test — could you take a look?

What is this test anyway?

The AFL testsuite consists of:

  • some top-level startup code that's shared among all tests
  • a series of tests (really just functions of type unit -> unit)

A run of a test consists of:

  • Running the startup code
  • (I) Gathering afl-show-map output for a single invocation of the test function
  • (II) Gathering afl-show-map output for two serial invocations of the test function
  • Checking that the afl-show-map output exactly doubles from (I) to (II).

I'm not very familiar with afl-show-map but it looks like it prints some stats about which basic blocks are explored by the run of the instrumented program. The intuition is: if you run the same (deterministic) code twice, if a basic block is explored once in (I), it should be explored twice in (II).

What test is failing and why?

The laziness test is failing:

(* Top-level startup code *)
let already_forced = lazy (ref 42)
let _ = Lazy.force already_forced

(* Test function *)
let laziness () = opaque @@
  let _ = Lazy.force already_forced in
  Gc.major ()

The reason the test fails is that the count of basic blocks explored for the laziness does not exactly double from 1 invocation to 2 invocations.

For 1 invocation:

026649:1
051424:1

For 2 invocations:

026649:2
040923:1
051424:1
053443:1

I suspect that the first call to Gc.major () is doing something "different enough" to later calls (maybe just more work?), and that's why 051424 is hit in the first call and 053443 and 040923 are hit in the second call. A magic trace suggests that the first call is doing a lot more work in caml_empty_minor_heap, probably collecting the other garbage generated by top-level startup code. Indeed, if I call Gc.minor () once at top-level before laziness runs, then the output doubles as expected:

$ afl-showmap -q -o /dev/stdout -- ./test 1 7
025003:1
038699:1
042577:1
$ afl-showmap -q -o /dev/stdout -- ./test 2 7
025003:2
038699:2
042577:2

@stedolan
Copy link
Copy Markdown
Contributor

At first glance, it looks like this is actually detecting a real issue, so I'd prefer to either fix it or leave it failing rather than change the test in this way. It's important that afl instrumentation output does not depend on when GC runs, and it looks like GC is actually affecting it here.

@ncik-roberts
Copy link
Copy Markdown
Contributor Author

OK, I'll bisect at some point.

This was referenced Sep 12, 2023
@ncik-roberts
Copy link
Copy Markdown
Contributor Author

Closing in favor of #1824

@ncik-roberts ncik-roberts deleted the fix-afl-test branch September 12, 2023 18:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants