Conversation
|
I find your presentation of the change informative and interesting, but it is difficult to review the code itself because the whole diff starts with lines and lines of (I don't know where to look to easily read the changes to I guess the person whose feedback would matter most in this case is @xavierleroy , that I suppose decided the original organisation of this part of the codebase. It's a minor detail, but I think the signature |
|
Hi @gasche, thanks for the comments. I will clean up the pull request to make it easier to read. |
We add two type arguments:
type ('addr, 'op) operation
where 'addr corresponds to Arch.addressing_mode and 'op to
Arch.specific_operation (for any architecture-dependent Arch module).
Printing functions for Mach.{operation, instruction, ...} (idem for
Linearize) now take extra printer parameters to print the 'addr, 'op
type arguments.
Additionally they take an argument of type
int -> string
which is used to print register names (this corresponds to
Proc.register_name). This is done to avoid a dependency on
architecture-dependent Proc modules.
These interfaces define the interface between the architecture-independent part of the native-code generator and the architecture-dependent parts.
The following modules are made functors of Arch and/or Proc: CSEgen Schedgen Selectgen (Reloadgen does not need to be functorized) Interf Spill Comballoc Coloring Deadcode Liveness The first group of modules are used by each native-code backend to implement their own functionality. The rest of the modules are used "internally" by the Asmgen module.
6ece5ad to
62f0538
Compare
This is the signature to be implemented by native-code backends. It corresponds to the files normally found in asmcomp/amd64, asmcomp/arm, etc. This signature will be used as the argument type for the Asmgen functor.
Asmgen becomes a functor of Native_backend_intf. Asmlink and Asmpackager become functors of Asmgen. Instead of functorizing these modules we could use first class modules passed as function arguments.
We rename every module in asmcomp/amd64 with a prefix since they are to coexist with other backends. Also update Makefile not to symlink these modules into asmcomp and to generate the *_emit.ml files at asmcomp/<arch>/<arch>_emit.ml.
These modules all depend on Arch or Proc. Since the dependence is only on values (not on types), it may be better to pass the necessary modules as a function argument as needed. For simplicity we use functors for now.
This is better than making Closure a functor of Arch.
62f0538 to
ca1327a
Compare
|
|
Also found the following slides by @xavierleroy which are relevant for this PR: http://gallium.inria.fr/~xleroy/talks/icfp99.ps.gz. |
|
I find the use of first-class module in ca1327a a regression rather than an improvement.
I think it's very good that you made this experiment but, to my personal taste, the results are less convincing than the pure-functor approach. |
| val contains_calls: bool ref | ||
| val assemble_file: string -> string -> int | ||
| val init : unit -> unit | ||
| end |
There was a problem hiding this comment.
This file should contain the documentation that was previously in Proc.
Note that it is often a good idea to have both an .ml and .mli file in the compiler sources to avoid parallel build problems.
There was a problem hiding this comment.
Indeed! I will put it back and add the .mli as well. Thanks!
|
This feature is a clear improvement without serious problems. It will also simplify the maintenance of the various less used backends. I am strongly in favor. For the code itself I didn't seriously review this patch yet, but from a quick scan, this looks like a safe change: this is moving things around, the scary parts of the backend is unchanged, and there is no risk of mix between backends as the specific instruction type prevents it. There may be some concerns with integers size, and the value of some backend specific values (like compile time constants). But this can be addressed in a later patch. |
aa8f137 to
463bb50
Compare
|
Functorizing the native-code generator over |
|
Thanks for the comment. That was the impression I got as well. Closing! |
Motivation
Current approach
The native-code generator is organised in two parts: a more-or-less architecture-independent part in
asmcomp/and architecture-dependent files in each one of the subdirectoriesasmcomp/<arch>(wherearchisamd64,arm, etc.).When
./configureis run it symlinks the files of the host architectureasmcomp/<arch>toasmcomp/itself. These modules are:Arch,Proc,Selection,Reload,CSE,Emit,Scheduling. The compiler never needs to know that there is more than one native-code backend available. This is simple and convenient because the architecture-independent and dependant parts are tightly wound together.Instead of doing this "linking trick" we want to be able to consider all backends at the same time and to choose them at run-time.
Proposed approach
We propose to use functors. However it is not completely trivial to do this because there are circular dependencies between the architecture-independent parts and architecture dependent ones as a whole.
Furthermore, one of the main types used throughout the compiler back-end,
Mach.{instruction, operation, ...}uses in its definition the typesArch.addressing_modeandArch.specific_operationwhich are architecture-dependent. If we makeMachinto a functor then every single module out there that usesMachwill need to become a functor ofMachor re-instantiateMachin order to do its work. This quickly leads to a lot of complexity.Mach.operationpolymorphic:PrintmachandPrintlinearto deal with this extra polymorphism (extra arguments for'addressing_modeand'specific_operationprinters, etc.).Arch_intfandProc_intfcontaining signatures forArchandProcmodules.Selectgen,Schedgen,Reloadgen,CSEgenwhich are used to build the architecture-dependentSelectionSchedulingReload,CSEnow are functors ofArchand/orProc.Native_backend_intf:Asmgenis a functor ofNative_backend_intf.S. Its body instantiates:AmspackagerandAsmlinkbecome functors parametrised byAsmgen.Current state
amd64, seedriver/optmain.ml).ArchorProcis very small (just a few constants). We could turn some of these functors into functions taking first class modules, which seems lighter.Thanks!
/cc @alainfrisch