Skip to content

Add "phantom let" support and provenance tracking to Flambda#855

Closed
mshinwell wants to merge 1 commit intoocaml:trunkfrom
mshinwell:flambda_phantom
Closed

Add "phantom let" support and provenance tracking to Flambda#855
mshinwell wants to merge 1 commit intoocaml:trunkfrom
mshinwell:flambda_phantom

Conversation

@mshinwell
Copy link
Copy Markdown
Contributor

This pull request deals with the addition of a new construction called "phantom let" to Flambda. Phantom lets behave like normal lets except that they do not generate any code at runtime. They are used as a way of encoding optimised-out parts of the program entirely within the language rather than using some kind of out-of-band mechanism.

This patch also adds support to Flambda for provenance tracking: identifying which variables in the Flambda code came from which variables (equipped with full module paths and locations of definitions) in the source program. The tracking back to original Ident.t values is necessary so that their original stamps can be retrieved; these are used for lookup of types in .cmt files in libmonda. The propagation of location information is required so that variables that do not have unique names can be properly disambiguated in the debugger. This location information keeps track of inlined-out frames.

In order for it to compile this pull request has a couple of stubs on the Lambda side and a version of Flambda_to_clambda that does not deal with phantom lets. This will be added via a subsequent pull request that adds support for them in Clambda and beyond.

Code in this pull request scrapes Lambda expressions for location information. It would probably be better in future to equip Llet with the necessary location information. There will be another pull request coming in due course in this area, so let's save any discussion on this point for later.

Phantom lets only permit certain kinds of defining expressions for simplicity. @lpw25 suggested this might be a mistake (we may still gain more information about an optimised-out piece of code if it were subsequently inlined), but I think this should stay as-is for the moment. It's probably actually good enough in any case.

When the simplifier determines that a let binding is no longer required it will become eligible for phantomisation. This uses both the structure of the redundant defining expression together with its approximation to try to construct a phantom let which will later be describable in DWARF. The backend support permits the description in DWARF (using "implicit pointers") of structured values that have been optimised out even in the case where such values may reference values that were not optimised out. For example, the structure of a pair may be described in the following manner: the pair was optimised out, but it had size 2 and tag 0, and its contents were the parameters x and y of the function, which may be found in some particular registers (also described). At runtime, the debugger will reconstitute the structured value in the address space of the debugger (not the address space of the target program), and the libmonda library's value printers will be able to move seamlessly through the value, unaware that in fact part of it doesn't exist in the target program.

Once case that is not currently supported in the backend is the description of projections from variables known to represent blocks where those variables may in turn be phantom. This problem, together with the fact that we end up generating aliases in the DWARF information between phantom lets (e.g phantom_let x = y where y is phantom), is planned to be solved in the future by making Inline_and_simplify simplify the defining expressions of phantom lets using the techniques it uses for normal lets. This will eliminate aliases and enable us to avoid the case that is unsupported in the backend. (The reason it is unsupported is due to restrictions in the DWARF language.)

Along the way this patch introduces a new module Free_names for tracking of the various free name sets that we now have. There are some other minor cleanups (e.g. a record type for Let_rec).

@chambart is on the hook to review this one :)

@alainfrisch
Copy link
Copy Markdown
Contributor

Do you believe that the extra bookkeeping could affect flambda compilation-time in a non-negligible way?

@mshinwell
Copy link
Copy Markdown
Contributor Author

It's hard to say -- there is possibly going to be some degradation, but I'm not sure I'm that worried. What concerns me about compilation time for Flambda is really performance in -Oclassic mode (which is about to be subject to significant scrutiny, by the way)---and in that mode, I don't think this should make too much difference. We can also turn the phantom lets off entirely if not in debug mode.

@chambart
Copy link
Copy Markdown
Contributor

chambart commented Nov 4, 2016

I benchmarked a bit and this seems to increase the flambda-related part of the compilation time by ~10% on my "normal OCaml" files in both -Oclassic and -O3. When compiling with -g this cost some more ~10%.

It can be made arbitrarily slower by a code like:

let g () =
  let rec loop n acc =
    if n = 0 then
      acc
    else
      let n' = n - 1 in
      let tmp = acc + 1 in
      loop n' tmp
  in
  (loop [@unrolled 3000]) 3000 0

With -g this is obviously vastly slower as expected, but this is also really slower without -g.

@chambart
Copy link
Copy Markdown
Contributor

chambart commented Nov 4, 2016

The CI error is due to a change of the order of the print of some [@inlined] related warnings. This is ok, but should be fixed by a make promote

Copy link
Copy Markdown
Contributor

@chambart chambart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a first round of comments. I didn't see anything problematic, but I'd like to continue to dig in this a bit more before merging.

let args = List.map (fun arg -> Flambda.Expr (Var arg)) args in
Flambda_utils.bind ~body ~bindings:(List.combine freshened_params args)
Flambda_utils.bind ~provenance ~body
~bindings:(List.combine freshened_params args) ()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it really needed to provide the provenance here ?
I would rather have none than one with a buggy path.

}

type let_provenance = {
module_path : Path.t;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it really going to be a real module path or only the name of the surrounding modules ?
If this is the case I would rather use another type. In particular, one that does not contain the apply constructor.

What is the expected use case ? If it is only for display / breakpoint (i.e. not type related) then

type module_name = string
type module_path = module_name list

Should be enough.

is_a_functor : bool;
(** Whether the function is known definitively to be a functor. *)
module_path : Path.t;
(** The module path under which the function is defined. *)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same remark as for let_provenance

mutable free_phantom_variables : Variable.Set.t;
mutable free_symbols : Symbol.Set.t;
mutable free_phantom_symbols : Symbol.Set.t;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since there is no code sharing between mutable and immutable version, there is no problem with duplicating this type into the mutable and immutable version.

By the way, this free names with symbols might be useful for other uses and is a good change.

@damiendoligez damiendoligez modified the milestone: 4.05-or-later Feb 15, 2017
@damiendoligez damiendoligez added this to the 4.07-or-later milestone Sep 29, 2017
@damiendoligez damiendoligez removed this from the consider-for-4.07 milestone Jun 4, 2018
@chambart chambart self-assigned this Aug 29, 2018
@mshinwell
Copy link
Copy Markdown
Contributor Author

This isn't going to see the light of day for Flambda 1. Flambda 2 has a more advanced concept of "name mode" built in right from the start, to distinguish between names that are bound because they occur in terms, types (of the Flambda type system) or phantom bindings. (We will explain such things in detail in forthcoming RFCs.)

@mshinwell mshinwell closed this Apr 20, 2020
stedolan pushed a commit to stedolan/ocaml that referenced this pull request Oct 25, 2022
* add value<->native_pointer builtins
* add Cintofvalue/Cvalueofint and onwards
* update cmm parser
* allow allocating two registers
* reload intofvalue/valueofint allow stack operands
* make the new ops not pure

Co-authored-by: Xavier Clerc <xclerc@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants