Require less stack compiling 150 field case class (with a balanced AND)#9635
Require less stack compiling 150 field case class (with a balanced AND)#9635dwijnand merged 1 commit intoscala:2.13.xfrom
Conversation
|
Not sure if this should live in EDIT: I moved the code over, that gets rid of one useless |
| gen.mkAnd(binaryTreeAnd(before), binaryTreeAnd(after)) | ||
| } | ||
|
|
||
| assert(guards.nonEmpty, "Need at least one tree to AND() together, not zero") |
There was a problem hiding this comment.
This case never happens AFAICT, so better to fail loudly if it does rather than returning a rubbish AST node
There was a problem hiding this comment.
"AFAICT"? The types allow it and there's nothing rubbish about EmptyTree. Could you add that back? I'd rather that than an assert and a ???.
There was a problem hiding this comment.
AFAICT = As Far As I Can Tell. I grepped for AND() in the codebase and there is only one callsite (this one), and it doesn't allow an empty Seq to be passed in.
The EmptyTree branch is basically dead code that is never exercised, so who knows if it does what someone wants if it gets inadvertently hit? Wouldn't it be better to fail loudly rather than proceeding with an untested unexercised branch that nobody has ever thought about? If we really want to put something there, why not put a literal true, since that's the monoid zero of the && operation?
That's my reasoning anyway. If you still think we should put the EmptyTree back in, I'm happy to do so.
There was a problem hiding this comment.
The
EmptyTreebranch is basically dead code that is never exercised, so who knows if it does what someone wants if it gets inadvertently hit? Wouldn't it be better to fail loudly rather than proceeding with an untested unexercised branch that nobody has ever thought about?
It's a toss-up between returning something one could call reasonable vs loudly and clearly failing early on unexpected input. Given this is out of scope for the objective here, I'd opt for the status quo (EmptyTree).
Even better I'm not against making AND take (guard: Tree, guards: Tree*) and avoid this dilemma entirely.
There was a problem hiding this comment.
Actually don't mind this returning TRUE (above in the file) on the empty case, instead.
|
Test failure seems to benign, looks like we just need to update some golden files |
This comment has been minimized.
This comment has been minimized.
|
@dwijnand I ended up just copy-pasting it manually but it seems to have worked, |
|
Wouldn't this mess up the order of comparisons? If you check
Edit: to preserve that optimisation one would need two binary trees - one for primitives and one for other fields and then I also wonder why |
| def AND(guards: Tree*) = { | ||
| def binaryTreeAnd(tests: Seq[Tree]): Tree = tests match{ | ||
| case List() => ??? | ||
| case List(single) => single |
There was a problem hiding this comment.
I don't think you can match on List here.
scala> def f(i: Int*) = i match {
| case List() => 0
| case List(i) => 1
| case is => 2
| }
def f(i: Int*): Int
scala> f()
val res11: Int = 0
scala> f(1)
val res12: Int = 2
scala> f(2,3)
val res13: Int = 2There was a problem hiding this comment.
Whereas:
scala> def f(i: Int*) = i match {
| case Seq() => 0
| case Seq(i) => 1
| case is => 2
| }
def f(i: Int*): Int
scala> f()
val res18: Int = 0
scala> f(1)
val res19: Int = 1
scala> f(2,3)
val res20: Int = 2There was a problem hiding this comment.
(Note: that the type of ArraySeq.splitAt is (ArraySeq, ArraySeq))
scala> def f(ints: Int*) = ints
def f(ints: Int*): Seq[Int]
scala> f(1,2,3).splitAt(2)
val res7: (Seq[Int], Seq[Int]) = (ArraySeq(1, 2),ArraySeq(3))
scala> f(1,2,3).splitAt(0)
val res8: (Seq[Int], Seq[Int]) = (ArraySeq(),ArraySeq(1, 2, 3))
scala> f(1,2,3).splitAt(3)
val res9: (Seq[Int], Seq[Int]) = (ArraySeq(1, 2, 3),ArraySeq())| def NOT(tree: Tree) = Select(tree, Boolean_not) | ||
| def AND(guards: Tree*) = if (guards.isEmpty) EmptyTree else guards reduceLeft gen.mkAnd | ||
| def AND(guards: Tree*) = { | ||
| def binaryTreeAnd(tests: Seq[Tree]): Tree = tests match{ |
There was a problem hiding this comment.
I think it is better to require List here and recurse on Nil, a :: Nil, twoOrMore convert to List before the outer call.
Also you could have this method be:
def binaryTreeAnd(head: Tree, tail: List[Tree]): TreeAnd then this method clearly never has nested EmptyTree.
The EmptyTree case can be handled once at the outer level of the input is empty.
There was a problem hiding this comment.
Actually the use of splitAt recursively would make List version quadratic. I don't know if ArraySeq has O(1) splitAt. If it doesn't, the current code is also quadratic.
A better approach might be to copy to an Array and recurse on indices representing the upper and lower bounds of the array which will certainly only N log N.
There was a problem hiding this comment.
@johnynek varargs are instantiated as ArraySeq. splitAt for that class seems to be implemented by s.c.AbstractIterable, which I presume uses take + drop as implementation. I'd probably go with getting the array from the ArraySeq and work with offsets, as you suggested, as then you can avoid allocating tuples and new collections.
|
I removed the assert, added back the @joroKr21 I don't think this would change the order of evaluation; we're re-balancing the tree of comparisons, but the comparisons should still be evaluated from left to right, and the left-to-right order of the tree is unchanged. @johnynek Given the O(n) copying I'm doing in |
|
Yeah, I made a mistake. Since the depth is log, doing O(N) and work at each stage is N log N, which I think should be fine. I think you can optimize it down to O(N) at the cost of less clear code. Probably not worth it since N is small in virtually all cases. |
@lihaoyi indeed, thanks for explaining 👍 I had some non-profit fun with binary trees (untested):
def AND(guards: Tree*) =
if (guards.isEmpty) EmptyTree else {
val trees = guards.toArray
var n = trees.length
while (n > 1) {
var i, j = n % 2
while (j < n) {
trees(i) = gen.mkAnd(trees(j), trees(j + 1))
i += 1; j += 2
}
n = i
}
trees.head
}
def AND(guards: Tree*) = {
@tailrec def binaryTreeAnd(queue: List[Tree], frontier: List[Tree], flip: Boolean): Tree =
queue match {
case x :: y :: queue =>
val xy = if (flip) gen.mkAnd(y, x) else gen.mkAnd(x, y)
binaryTreeAnd(queue, xy :: frontier, flip)
case x :: Nil =>
binaryTreeAnd(Nil, x :: frontier, flip)
case Nil =>
frontier match {
case Nil => EmptyTree
case x :: Nil => x
case _ => binaryTreeAnd(frontier, Nil, !flip)
}
}
binaryTreeAnd(guards.toList, Nil, flip = false)
} |
|
If there's no response feedback, this just needs a squash now. |
@joroKr21 you said your order-of-operations concerns were addressed. I just want to double check, are we sure that this optimization wasn't lost here? |
Yes, left-to-right is left-to-right regardless of the branching structure. |
scala> ((((((prim1 && prim2) && prim3) && prim4) && ref1) && ref2) && ref3) && ref4
prim1 standing by
prim2 standing by
prim3 standing by
prim4 standing by
ref1 standing by
ref2 standing by
ref3 standing by
ref4 standing by
val res0: Boolean = true
scala> ((prim1 && prim2) && (prim3 && prim4)) && ((ref1 && ref2) && (ref3 && ref4))
prim1 standing by
prim2 standing by
prim3 standing by
prim4 standing by
ref1 standing by
ref2 standing by
ref3 standing by
ref4 standing by
val res1: Boolean = true |
Sorry, I didn't phrase my question clearly. I understand that the branching structure doesn't affect order of operations. What I meant to ask is: this PR retains the partitioning into primitive and not-primitive, right? |
Ah yeah it's done in |
cdc13fc to
29ae6fa
Compare
…hetic `def equals` method:
```scala
a && b && c && d && e && f && g && h
```
Which currently parses into an unbalanced depth O(n) tree as follows:
```scala
(((((((a && b) && c) && d) && e) && f) && g) && h)
```
into a binary tree of depth O(log n):
```scala
(((a && b) && (c && d)) && ((e && f) && (g && h)))
```
Tested manually by pasting the following snippet into the `sbt scala` interpreter:
```scala
case class Big150(_0: Int, _1: Int, _2: Int, _3: Int, _4: Int, _5: Int, _6: Int, _7: Int,
_8: Int, _9: Int, _10: Int, _11: Int, _12: Int, _13: Int, _14: Int,
_15: Int, _16: Int, _17: Int, _18: Int, _19: Int, _20: Int, _21: Int,
_22: Int, _23: Int, _24: Int, _25: Int, _26: Int, _27: Int, _28: Int,
_29: Int, _30: Int, _31: Int, _32: Int, _33: Int, _34: Int, _35: Int,
_36: Int, _37: Int, _38: Int, _39: Int, _40: Int, _41: Int, _42: Int,
_43: Int, _44: Int, _45: Int, _46: Int, _47: Int, _48: Int, _49: Int,
_50: Int, _51: Int, _52: Int, _53: Int, _54: Int, _55: Int, _56: Int,
_57: Int, _58: Int, _59: Int, _60: Int, _61: Int, _62: Int, _63: Int,
_64: Int, _65: Int, _66: Int, _67: Int, _68: Int, _69: Int, _70: Int,
_71: Int, _72: Int, _73: Int, _74: Int, _75: Int, _76: Int, _77: Int,
_78: Int, _79: Int, _80: Int, _81: Int, _82: Int, _83: Int, _84: Int,
_85: Int, _86: Int, _87: Int, _88: Int, _89: Int, _90: Int, _91: Int,
_92: Int, _93: Int, _94: Int, _95: Int, _96: Int, _97: Int, _98: Int,
_99: Int, _100: Int, _101: Int, _102: Int, _103: Int, _104: Int,
_105: Int, _106: Int, _107: Int, _108: Int, _109: Int, _110: Int,
_111: Int, _112: Int, _113: Int, _114: Int, _115: Int, _116: Int,
_117: Int, _118: Int, _119: Int, _120: Int, _121: Int, _122: Int,
_123: Int, _124: Int, _125: Int, _126: Int, _127: Int, _128: Int,
_129: Int, _130: Int, _131: Int, _132: Int, _133: Int, _134: Int,
_135: Int, _136: Int, _137: Int, _138: Int, _139: Int, _140: Int,
_141: Int, _142: Int, _143: Int, _144: Int, _145: Int, _146: Int,
_147: Int, _148: Int, _149: Int)
```
This semi-reliably crashes the interpreter with a StackOverflow on 2.13.x, and works without issue on this PR.
I'm not sure where the tests should go, but let me know and I'll happily paste that snippet into your test suite (or you guys could do it on my behalf when merging!)
It's not clear to me if the other generated methods suffer the same unbalanced-AST issue, but glancing over the code it seems they don't: e.g. `.hashCode` has a long chain of `val` assignments of AST depth O(1), `.productElement` is one big pattern match of depth O(1), etc. The fact that this seems to fix the StackOverflow without it turning up somewhere else also supports the idea that `.equals` is the only generated method with this issue
Seems the problematic behavior was introduced 14 years ago in scala@8397c7b#diff-205537ac4c08ea690ada72e398df0018dcaf2a7c4987c0d8d8df322314469578R162
|
@dwijnand squashed it |
Fixes scala/bug#12397 by turning the long change of
&&s in the syntheticdef equalsmethod:Which currently parses into an unbalanced depth O(n) tree as follows:
into a binary tree of depth O(log n):
Tested manually by pasting the following snippet into the
sbt scalainterpreter:This semi-reliably crashes the interpreter with a StackOverflow on 2.13.x, and works without issue on this PR.
I'm not sure where the tests should go, but let me know and I'll happily paste that snippet into your test suite (or you guys could do it on my behalf when merging!)
It's not clear to me if the other generated methods suffer the same unbalanced-AST issue, but glancing over the code it seems they don't: e.g.
.hashCodehas a long chain ofvalassignments of AST depth O(1),.productElementis one big pattern match of depth O(1), etc. The fact that this seems to fix the StackOverflow without it turning up somewhere else also supports the idea that.equalsis the only generated method with this issueSeems the problematic behavior was introduced 14 years ago in 8397c7b#diff-205537ac4c08ea690ada72e398df0018dcaf2a7c4987c0d8d8df322314469578R162