Skip to content

Soliciting feedback on a (currently failing) attempt to improve "stability" by tweaking the compiler #2

@gasche

Description

@gasche

I noticed that afl-persistent and crowbar have run into issues with
"stability" that are to be handled at the level of the compiler code
generation strategy:

ocaml/ocaml#1345
(disabling class caching in afl-instrument mode)

and

stedolan/crowbar#14
(yet-unfixed stability issue with `Lazy.force)

are instances of this.

Energized by my use of Crowbar during the excellent Mirage retreat
this year, I have wondered if it was something I could help with, and
tried to implement pieces of a solution at the level of the OCaml
compiler. Specifically I would like to make it easy to say, in the
compiler, "don't instrument this branch" or something like that, and
then this opt-out mechanism could be used to solve those stability
issues. This is the dream; what I have right now is a branch that does
what I wanted it to do, but does not solve stability issues in any way
that I was able to measure (so maybe I wanted the wrong thing). So,
very Work In Progress. I'm posting here to solicit feedback and
comments (cc @stedolan and @yomimono); in particular, I'm curious if
someone think that this approach has a chance to ever work.

The branch is afl on my personal fork:

https://github.com/gasche/ocaml/tree/afl

there are commits that describe what it does

https://github.com/gasche/ocaml/commits/afl
https://github.com/gasche/ocaml/compare/trunk...gasche:afl?expand=1

Understanding of the problem

The stability issues, if I understand correctly (but then I don't
actually understand what "stability" means in afl-fuzz parlance), come
from code that will run into different control path if they are
executed several times from the same input, when the change-of-control
mechanism are outside the view of the tool. (In the classes case, it
is caching; in the Lazy.force case, if @stedolan's analysis of it is
correct, it is the non-deterministic effect of a GC removing a Forward
tag before or after the user tries to force the already-evaluated
thunk for a second time).

So if we don't want to change how the compiler produces code (I would
rather not), we have to find how to selectively disable the
instrumentation point that are involved in these un-stable
conditionals. For class caching, I had the impression that if we could
make sure that neither the "has the method table already been
computed" test nor any of the method-table-computation code was
instrumented, we would recover stability. For the lazy forcing, I had
the impression that if the only instrumentation points to run were the
ones involved in forcing the trunk, and one when the final value is
returned, then that would be correct (I'm not convinced anymore).

Design of a solution attempt

It seems impossible or at least extremely fragile to try to annotate
individual control-flow expressions at the level of OCaml or Lambda
code, and hope to robustly control instrumentation with that, because
instrumentation happens at the Cmm level and further control-flow
constructs may be inserted by the compiler in the middle.

So I think one would need a way to disable all instrumentation within
one portion (defined as static code zone or dynamic fragment of
execution) of the program. (It would also be useful to be able to
share instrumentation points across two program locations, but that
sounded more difficult to implement so I haven't worked on it.)

My solution attempt is to introduce two primitives that would look
essentially as follows:

val suspend_afl : unit -> unit
val restore_afl : unit -> unit

Between a call to suspend_afl and the nest call to restore_afl, no
instrumentation happens.

I implement these two primitives in the commits

  • "add {suspend,restore}afl primitives to selectively silence path collection"
    (this propagates the primitives from the frontend to cmm,
    where I expect that they will be given a semantics in afl_instrument.ml)
  • "selectgen: interpret {suspend,restore}afl as no-ops"
    (this ignores then in instruction selection, that is after cmm)

(I'm using commit names instead of hashes as I expect the hashes to
change as I keep rebasing the PR.)

Attempt 1: disabling instrumentation dynamically

The dumbest possible implementation for this is to have a piece of
global state, a boolean, to indicate whether instrumentation is
currently active or suspended:

  • the {suspend,restore} operations are compiled into writes that
    change that global state

  • the instrumentation code is changed to first check this global flag
    before performing any action

I implemented this approach in

  • "semantics of {suspend,restore}afl"

(I realize as I'm writing this report that this implementation does
not bracket well, calling "suspend" twice and then "restore" only once
restores execution, and that incrementing and decrementing an integer
would work better. Easy to implement.)

Attempt 2: disabling instrumentation statically

When I started thinking about {suspend,restore}_afl, I had in mind
something closer to static annotations that would direct the
non-emission of afl instrumentation code. Basically the idea is that
after you encounter suspend_afl, you stop emitting insturmentation
code, and you start again on the next restore_afl.

The naive implementation of this assumes that function calls always
start with instrumentation enabled, so that instrumentation always
restarts during inner function calls (so it is not equivalent to the
static semantics above). This is not good for the class caching
scenario, when we want the no-insturmentation zone to cross function
calls. But it would be easy to add a blacklist of specific functions
from CamlinternalOO to start in suspended mode, or to just add
well-placed (suspendafl) instructions at the beginning of those
functions (as a ppx extension or what not).

The way this "static" policy is implemented may qualify as a "cool
hack", as in: I think it's cool, but it is also a horrible
hack. Anyway, this was just for experiment, and I checked the output
with -dcmm and it seems to be doing what I want:

  • "afl: make the afl_status a purely static property"

Negative results

I haven't tried to measure stability of object method table
caching. I tried to measure stability of thunk forcing, and the
results are terrible: none of what I have done improves stability, in
fact it decreases it.

I made two attempts at artfully placing {suspend,restore}_afl
instructions in the Lazy.force generated code:

  • "try to use {suspend,restore}afl primitives to stabilize Lazy forcing"
  • "{suspend,restore}afl in lazy_force: try to better bracket the instructions"

The code emitted by the compiler for Lazy.force looks as follows:

match Obj.tag obj with
| Lazy_tag -> force_the_thunk_of obj
| Forward_tag -> obj.(0)
| _ -> obj

and I tried to change it as follows (in the first commit):

suspend_afl ();
match Obj.tag obj with
| Lazy_tag -> restore_afl (); force_the_thunk_of obj
| Forward_tag -> obj.(0)
| _ -> obj

or as follows (in the second commit)

suspend_afl ();
let result =
  match Obj.tag obj with
  | Lazy_tag -> restore_afl (); force_the_thunk_of obj
  | Forward_tag -> obj.(0)
  | _ -> obj
in
restore_afl ();
result

Finally, I decided that what I really wanted was to have both the
Forward_tag and the _ branches assigned the same instrumentation
point, so I wrote a different version of the lazy-forcing code
generation strategy that looks like this:

if Obj.tag obj = Lazy_tag
then begin
  force_the_thunk of obj
end else begin
  suspend_afl ();
  let result =
    if Obj.tag obj = Forward_tag then obj.(0) else obj
  in
  restore_afl ();
  result
end

None of that seems to work: without my changes, my reproduction code
(below) sports a stability of 57.14%, and some of what I've done bring
stability down to 0%, or 33%, or at best 52.94%.

It was not easy for me to find code to reproduce the stability issue
for laziness, below is my repro-case. Is it sensible?

let test = lazy (print_endline "foo")

let f () =
  ignore (List.init 10 (fun _ -> Lazy.force test))

let _ = AflPersistent.run f

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions