Add counters to profiler (for CFG spill and reload instructions)#2786

Merged
milan-tom merged 17 commits into oxcaml:main from milan-tom:add-cfg-spill-reload-counters
Jul 18, 2024

@milan-tom
Contributor

milan-tom commented Jul 16, 2024

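This PR adds counters to the profiler, so that values computed from the result of a compiler pass (here, the number of CFG spill and reload instructions) can be reported alongside the existing measurements.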

Options for API:

  1. Existing Profile.record function with optional ?counter_f argument
    • This function is used in many files, so changing its signature risks leaving implementations that
      no longer match their interfaces.
  2. New Profile.record_with_counters function
    • Shows intent better and gives a hint about the purpose of the counter_f argument.
    • Allows providing specific documentation for this function.
    • New function so won't break any other files.
    • Intertwines logic for timing/memory profiling with counter profiling. These operate differently:
      timing and memory are measured as differences between the start and end of a stage and have a
      fixed number of measures, whereas counters are based purely on the result of a stage and there
      may be many of them.
  3. Completely separate function
    • No intertwining of logic for timing/memory profiling with counter profiling
    • May lead to code duplication (many aspects in common between these different types of profiling)
    • Harder to print all profile columns for -dprofile option (need to manage integration of the two profiling
      mechanisms)

The current implementation uses option 2.
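
To make option 2 concrete, here is a minimal sketch of what a record_with_counters wrapper could look like. It is not the code from this PR: the Counters module, the measurement record, the log reference, and the instr type are all hypothetical names introduced for illustration, and only the names record, record_with_counters, and counter_f come from the description above. The sketch shows the asymmetry the list mentions: timing is a difference taken around the stage, while counters are computed from the stage's result.

```ocaml
(* Hypothetical sketch of option 2. None of these definitions are taken from
   the compiler's Profile module; they only illustrate the shape of the API. *)
module Counters = struct
  (* Counter name -> value, kept sorted by name. *)
  type t = (string * int) list

  let empty : t = []

  let set name value (t : t) : t =
    List.sort compare ((name, value) :: List.remove_assoc name t)
end

type measurement = { pass : string; seconds : float; counters : Counters.t }

(* Most recent measurement first. *)
let log : measurement list ref = ref []

(* Timing is measured as end minus start; counters are derived from the
   result of the pass by the caller-supplied counter_f. *)
let record_with_counters ~counter_f pass f x =
  let start = Sys.time () in
  let result = f x in
  let seconds = Sys.time () -. start in
  log := { pass; seconds; counters = counter_f result } :: !log;
  result

(* The existing entry point keeps its signature and records no counters. *)
let record pass f x =
  record_with_counters ~counter_f:(fun _ -> Counters.empty) pass f x

(* Toy stand-in for the instructions produced by register allocation. *)
type instr = Spill | Reload | Other

let count_spills_and_reloads instrs =
  let count kind = List.length (List.filter (fun i -> i = kind) instrs) in
  Counters.empty
  |> Counters.set "spill" (count Spill)
  |> Counters.set "reload" (count Reload)

let print_measurement m =
  let fields =
    List.map (fun (name, v) -> Printf.sprintf "%s = %d" name v) m.counters
  in
  Printf.printf "[%s] %s\n" (String.concat "; " fields) m.pass

let () =
  (* The "pass" here is the identity function on a fixed instruction list. *)
  let _ =
    record_with_counters ~counter_f:count_spills_and_reloads "cfg_irc"
      (fun instrs -> instrs)
      [ Spill; Other; Reload; Spill ]
  in
  List.iter print_measurement !log
```

Defining record in terms of record_with_counters is one way to keep existing call sites unchanged, which is the main advantage listed for option 2. It also makes the listed drawback visible: the timing and counter logic live in the same function.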

Examples of new outputs

-dcounters

 test.ml
   generate
     compile_phrases
       regalloc
         cfg
          [reload = 30; spill = 30] cfg_irc

-dcounters -dgranularity func

 test.ml
   generate
     compile_phrases
       camlTest__add_0_14_code
         regalloc
           cfg
            [reload = 0; spill = 0] cfg_irc
       camlTest__subtract_1_15_code
         regalloc
           cfg
            [reload = 0; spill = 0] cfg_irc
       camlTest__multiply_2_16_code
         regalloc
           cfg
            [reload = 0; spill = 0] cfg_irc
       camlTest__divide_3_17_code
         regalloc
           cfg
            [reload = 0; spill = 0] cfg_irc
       camlTest__map_4_18_code
         regalloc
           cfg
            [reload = 3; spill = 3] cfg_irc
       camlTest__filter_5_19_code
         regalloc
           cfg
            [reload = 4; spill = 4] cfg_irc
       camlTest__fold_left_6_20_code
         regalloc
           cfg
            [reload = 2; spill = 2] cfg_irc
       camlTest__MakeSet_7_21_code
         regalloc
           cfg
            [reload = 0; spill = 0] cfg_irc
       camlTest__add_8_26_code
         regalloc
           cfg
            [reload = 3; spill = 3] cfg_irc
       camlTest__add_8_22_code
         regalloc
           cfg
            [reload = 4; spill = 4] cfg_irc
       camlTest__mem_9_27_code
         regalloc
           cfg
            [reload = 2; spill = 2] cfg_irc
       camlTest__mem_9_23_code
         regalloc
           cfg
            [reload = 3; spill = 3] cfg_irc
       camlTest__compare_10_10_code
         regalloc
           cfg
            [reload = 0; spill = 0] cfg_irc
       camlTest__fn[test.ml:79,23--39]_11_28_code
         regalloc
           cfg
            [reload = 0; spill = 0] cfg_irc
       camlTest__fn[test.ml:80,28--50]_12_29_code
         regalloc
           cfg
            [reload = 0; spill = 0] cfg_irc
       camlTest__fn[test.ml:81,22--27]_13_13_code
         regalloc
           cfg
            [reload = 0; spill = 0] cfg_irc
       camlTest__entry
         regalloc
           cfg
            [reload = 9; spill = 9] cfg_irc
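
As a sanity check on the two outputs above, the per-function counters can be summed and compared with the whole-file figures. The sketch below assumes both outputs were produced from the same test.ml, which the description does not state explicitly; the list is transcribed by hand from the -dgranularity func output.

```ocaml
(* (reload, spill) pairs copied from the -dgranularity func output, in order. *)
let per_function =
  [ (0, 0); (0, 0); (0, 0); (0, 0); (* add, subtract, multiply, divide *)
    (3, 3); (4, 4); (2, 2); (0, 0); (* map, filter, fold_left, MakeSet *)
    (3, 3); (4, 4); (2, 2); (3, 3); (* add_8_26, add_8_22, mem_9_27, mem_9_23 *)
    (0, 0); (0, 0); (0, 0); (0, 0); (* compare and the three anonymous functions *)
    (9, 9) ]                        (* entry *)

(* Sum both components across all functions. *)
let total =
  List.fold_left (fun (r, s) (r', s') -> (r + r', s + s')) (0, 0) per_function

(* Matches [reload = 30; spill = 30] in the plain -dcounters output. *)
let () = assert (total = (30, 30))
```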

-dtimings -dgranularity func

0.067s test.ml
  0.001s parsing
    0.001s parser
  0.009s typing
    0.008s infer
  0.004s transl
  0.053s generate
    0.035s flambda2
      0.002s lambda_to_flambda
      0.031s simplify
        0.009s data_flow
        0.021s other
      0.001s flambda_to_cmm
      0.001s other
    0.008s compile_phrases
      0.001s camlTest__filter_5_19_code
        0.001s regalloc
          0.001s cfg
      0.001s camlTest__add_8_26_code
        0.001s regalloc
          0.001s cfg
      0.001s camlTest__add_8_22_code
      0.002s camlTest__entry
        0.002s regalloc
          0.002s cfg
            0.001s cfg_irc
              0.001s split
            0.001s cfg_validate_description
    0.007s assemble
    0.003s other
0.009s other

@milan-tom milan-tom requested a review from xclerc July 16, 2024 10:05
@milan-tom milan-tom self-assigned this Jul 16, 2024
@xclerc xclerc added the backend label Jul 16, 2024
Contributor

@xclerc xclerc left a comment

I think it would be worthwhile to also
amend /ocaml/driver/compenv.ml
(in particular the read_one_param
function), so that the new behavior
can also be requested through the
OCAMLRUNPARAM environment
variable.

@xclerc
Contributor

xclerc commented Jul 17, 2024

(@gretay-js will also review)

@milan-tom milan-tom force-pushed the add-cfg-spill-reload-counters branch from 3cf38c2 to 01a0abf Compare July 17, 2024 11:30
@milan-tom
Contributor Author

> I think it would be worthwhile to also amend /ocaml/driver/compenv.ml (in particular the read_one_param function), so that the new behavior can also be requested through the OCAMLRUNPARAM environment variable.

Done.

@xclerc
Contributor

xclerc commented Jul 17, 2024

Thanks - there was a typo in my comment,
the environment variable is actually named
OCAMLPARAM (no RUN).

@milan-tom milan-tom force-pushed the add-cfg-spill-reload-counters branch from 0717ccf to ea7df88 Compare July 17, 2024 15:20
@milan-tom
Contributor Author

> Thanks - there was a typo in my comment, the environment variable is actually named OCAMLPARAM (no RUN).

Corrected in commit name.

@milan-tom milan-tom marked this pull request as ready for review July 17, 2024 16:31
@milan-tom milan-tom force-pushed the add-cfg-spill-reload-counters branch from 9585a3e to 3dc187e Compare July 18, 2024 09:02
@milan-tom milan-tom requested a review from gretay-js July 18, 2024 09:07
@milan-tom milan-tom enabled auto-merge (squash) July 18, 2024 09:58
@milan-tom milan-tom requested a review from xclerc July 18, 2024 09:59
Contributor

@xclerc xclerc left a comment

If you can confirm that the output
of -dtimings is unaffected when
-dcounters is not passed, I think
it is good to go.

(@gretay-js also suggested off-line
it would be useful to give an
example of the new output in the
description of the pull request.)

@milan-tom milan-tom merged commit 75a0761 into oxcaml:main Jul 18, 2024
@milan-tom milan-tom deleted the add-cfg-spill-reload-counters branch July 18, 2024 13:13
@xclerc xclerc mentioned this pull request Aug 16, 2024
lukemaurer pushed a commit to lukemaurer/flambda-backend that referenced this pull request Oct 23, 2024
…nstructions) (oxcaml#2786)

* Add counter column to profiler

* Add option for function level profiling

* Implement CFG spill and reload counter functionality

* Clean up formatting

* Always print ancestors of stages determined worth displaying

* Remove unnecessary string to int conversion

Co-authored-by: Xavier Clerc <xclerc@users.noreply.github.com>

* Prevent Counter methods from raising exceptions

* Only compute counters if requested by user

* Avoid underscore in function name

* Fix accumulation of counters

* Correct reference to Function_level in codegen_main.ml

* Change -dfunc-level to -dgranularity and make profile granularity settable by OCAMLPARAM

* Remove catch-all pattern for profile granularity

* Fix dynamic linking dependencies

* Accumulate spill and reload counts locally before passing to Counter

* Improve readability of profile wrapper for function declaration compilation

* Move counter profiling for regalloc outside pipeline

---------

Co-authored-by: Xavier Clerc <xclerc@users.noreply.github.com>