Skip to content

Add peephole optimizations for CFG blocks.#1666

Merged
xclerc merged 12 commits intooxcaml:mainfrom
GabiTulba:backend-peephole-optimizations-without-tracking
Aug 3, 2023
Merged

Add peephole optimizations for CFG blocks.#1666
xclerc merged 12 commits intooxcaml:mainfrom
GabiTulba:backend-peephole-optimizations-without-tracking

Conversation

@GabiTulba
Copy link
Copy Markdown
Contributor

The optimizations added are as follows:

  1. remove_useless_mov which treats the case:

     mov r1, r2
     mov r2, r1
    
     In this case the second move can be deleted
    
  2. fold_intop_imm which treats the case:

     op1 r, imm1
     op2 r, imm2
    
     We can rewrite this as:
    
     op1 r, (imm1 op2 imm2)
    
     Iff op1 and op2 are associative and the result (imm1 op2 imm2) does not overflow.
    

Added the following extra functionality for the Flambda_backend.Doubly_linked_list data structure:

  • next cell -> get the next cell
  • hd_cell t -> get a cell to the first element in the list
  • last_cell t -> get a cell to the last element of the list
  • delete_curr/delete_after/delete_before cell -> delete the current element, the one after or before a cell. asserts if there is no cell to delete (i.e. we try to delete before the first element or after the last)
  • extra functionlity for inserting and getting back the newly inserted cell in a Doubly_linked_list

Added:

  • Flags for enabling peephole optimizations (-cfg-peephole-optimize)
  • A generic CSV data structure

This was benchmarked on the following libraries with the following results:

  • Base:

    • remove_useless_mov hits: 4145
    • fold_intop_arith hits: 4
    • fold_intop_bitwise hits: 8
    • total build time without optimizations (avg of 10 runs): 171.54 sec
    • total build time with optimizations (avg of 10 runs): 171.84 sec
    • total .o file size reduction from optimizations: 2288 bytes
  • Core:

    • remove_useless_mov hits: 4473
    • fold_intop_arith hits: 9
    • fold_intop_bitwise hits: 51
    • total build time without optimizations (avg of 10 runs): 177.83 sec
    • total build time with optimizations (avg of 10 runs): 177.43 sec
    • total .o file size reduction from optimizations: 5968 bytes

…ded are:

      * remove_useless_mov which treats the case:
        mov r1, r2
        mov r2, r1

        In this case the second move can be deleted

      * fold_intop_imm which treats the case:
        op1 r, imm1
        op2 r, imm2

        We can rewrite this as:

        op1 r, (imm1 op2 imm2)

        Iff op1 and op2 are associative and the result (imm1 op2 imm2) does not overflow.

Added the following extra functionality for the Flambda_backend.Doubly_linked_list data structure:
      * next cell -> get the next cell
      * hd_cell t -> get a cell to the first element in the list
      * last_cell t -> get a cell to the last element of the list
      * delete_curr/delete_after/delete_before cell -> delete the current element, the one after or before a cell. asserts if there is no cell to delete (i.e. we try to delete before the first element or after the last)
      * extra functionlity for inserting and getting back the newly inserted cell in a Doubly_linked_list

Added:
  * Flags for enabling peephole optimizations (-cfg-peephole-optimize)
  * A generic CSV data structure
@GabiTulba GabiTulba force-pushed the backend-peephole-optimizations-without-tracking branch from 7594cff to a9a9939 Compare July 31, 2023 11:01
lukemaurer added a commit that referenced this pull request Jul 31, 2023
A couple of small simplifications that are obvious by themselves but would have
made #1666 a bit harder to review:

  * `Env.sign_of_cmi` is only called via `read_sign_of_cmi` with
    `~freshen:true`, so just rename `sign_of_cmi` to `read_sign_of_cmi` and
    freshen unconditionally

  * `Persistent_env.save_pers_struct` requires a `pers_struct` but doesn't use
    all of it, so just inline it into its only call site
…hub.com:GabiTulba/flambda-backend into backend-peephole-optimizations-without-tracking
Copy link
Copy Markdown
Contributor

@xclerc xclerc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good; most of the suggested
changes are about unused elements and
typos.

Did not put a comment on the diff, but I
think it would make sense to use the
same set of warnings in all files. With
one possible exception: the "fragile
pattern" warning, but maybe we can
disable it locally rather than at the
top of some files.

Copy link
Copy Markdown
Contributor

@gretay-js gretay-js left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some tests, including overflow tests?

Additionally, there should be a check to guarantee termination of rewriting (e.g., if one pattern removes instruction and another adds it back). Currently, patterns only remove instructions so this is implicit, but may worth a comment somewhere.

Separate architectural constraints into a helper function, and leave refactoring it into target specific folder to another PR.

@xclerc xclerc added the cfg label Aug 3, 2023
GabiTulba and others added 4 commits August 3, 2023 13:43
Co-authored-by: Greta Yorsh <45005955+gretay-js@users.noreply.github.com>
Co-authored-by: Greta Yorsh <45005955+gretay-js@users.noreply.github.com>
…hub.com:GabiTulba/flambda-backend into backend-peephole-optimizations-without-tracking
@xclerc xclerc merged commit ec08424 into oxcaml:main Aug 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants