Skip to content

Optimise Switch code generation on booleans#8672

Closed
stedolan wants to merge 1 commit intoocaml:trunkfrom
stedolan:switch-bool
Closed

Optimise Switch code generation on booleans#8672
stedolan wants to merge 1 commit intoocaml:trunkfrom
stedolan:switch-bool

Conversation

@stedolan
Copy link
Copy Markdown
Contributor

This patch optimises Switch to avoid introducing an extra comparison when matching on booleans (or other types with two constant constructors). If the value being matched is known to be 0 or 1, we need not do a further comparison with 0 before dispatching.

There was already a special case to detect branches that distinguish constructor 0 from other constructors (these are compiled as a comparison against 0, rather than the usual less-than comparison). Here, this special case is extended to detect the case when there is only one other constructor, with value 1.

An example of matching on a boolean is this function:

let f x =
  match x = 27 with
  | true -> 7
  | false -> 77

Currently, it is compiled to this Lambda:

(function x/81[int] : int
  (let (*match*/153 = (== x/81 27)) (if (!= *match*/153 0) 7 77)))

With this patch, the != 0 is removed, and it becomes:

(function x/82[int] : int
  (let (*match*/154 = (== x/82 27)) (if *match*/154 7 77)))

This doesn't make much difference for the non-Flambda compilation pipeline. With Flambda enabled, the resulting Cmm is much simpler. Instead of:

(function camlMatchtest__f_5 (x/158: val)
   (if (!= (+ (<< (== x/158 55) 1) 1) 1) 15 155))

we get:

(function camlMatchtest__f_5 (x/158: val)
 (if (== x/158 55) 15 155))

This translates into smaller code:

camlMatchtest__f_5:
        cmpq    $55, %rax
        sete    %al
        movzbq  %al, %rax
        leaq    1(%rax,%rax), %rax
        cmpq    $1, %rax
        je      .L100
        movq    $15, %rax
        ret
        .align  4
.L100:
        movq    $155, %rax
        ret

vs.

camlMatchtest__f_5:
        cmpq    $55, %rax
        jne     .L100
        movq    $15, %rax
        ret
        .align  4
.L100:
        movq    $155, %rax
        ret

I haven't found a significant speed difference, but this seems worthwhile for the code size reduction alone.

@gasche
Copy link
Copy Markdown
Member

gasche commented May 15, 2019

Naively I would have the following expectations:

  • Exhausitivity-based reasoning suffices to known when only one possible case remain, and not generate any test for it (I'm not sure how that knowledge transfers to the Switch level, though).
  • "Just 2 constructors" behaves in the same way as "Just N constructors", at least as long as the same switch-compilation strategy is preserved (tests instead of a jump table, etc.).

Both assumptions are violated by the situation you point out, and not restored by the fix you propose. Is there a reason why we can't have these nice things?

@stedolan
Copy link
Copy Markdown
Contributor Author

I don't really understand either of these points, could you explain a bit more?

Exhausitivity-based reasoning suffices to known when only one possible case remain, and not generate any test for it (I'm not sure how that knowledge transfers to the Switch level, though).

Switch already optimises this: If there is only one possible action, no tests are generated. (In Switch, "case" means possible input value, and "action" means possible branch to take. There may be multiple cases yet only a single action, e.g. function true | false -> 42). This patch concerns the situtation where there are two possible actions.

"Just 2 constructors" behaves in the same way as "Just N constructors", at least as long as the same switch-compilation strategy is preserved (tests instead of a jump table, etc.).

What do you mean by "behaves in the same way" here? The type of branch is selected according to the values that are being tested. Isn't that the right thing to do here?

@chambart
Copy link
Copy Markdown
Contributor

Looks good. To be honest I would prefer @maranget to give the thumb up on that one given that he was the one who wrote the make_if_ne ctx.arg 0 and that I think he knew the special case semantics Lifthenelse.
I verified the runtime and selectgen to be certain, so I'm quite confident, but anyway.

@gasche
Copy link
Copy Markdown
Member

gasche commented May 22, 2019

(If @maranget doesn't give an opinion here I'll ask him in person next week.)

@stedolan
Copy link
Copy Markdown
Contributor Author

Any updates on this?

@maranget
Copy link
Copy Markdown
Contributor

maranget commented Jun 11, 2019

I tend to approve the change, especially since I have always thought that != 0 and true were the same thing. So good catch by @stedolan. I'll merge happily.

If the value being matched is known to be 0 or 1, we need not do
a further comparison with 0 before dispatching.
@maranget
Copy link
Copy Markdown
Contributor

I have just merged the PR.

@stedolan
Copy link
Copy Markdown
Contributor Author

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants