Skip to content

[nomerge] optimise the addition of immutable HashMap#8466

Merged
retronym merged 4 commits intoscala:2.12.xfrom
rorygraves:mike/2.12.x_quickHashMap
Feb 25, 2020
Merged

[nomerge] optimise the addition of immutable HashMap#8466
retronym merged 4 commits intoscala:2.12.xfrom
rorygraves:mike/2.12.x_quickHashMap

Conversation

@mkeskells
Copy link
Contributor

@mkeskells mkeskells commented Oct 14, 2019

Note - for 2.12 only
We can duplicate the functionality to 2.13, but I think it is with a seperate PR as 2.13 collections are differently structured

Prefer to use merge when we can
optimised the builder to allow ++ directly

small collection of other optimisations

need some additional tests for the edge cases introduced

@scala-jenkins scala-jenkins added this to the 2.12.11 milestone Oct 14, 2019
@mkeskells mkeskells force-pushed the mike/2.12.x_quickHashMap branch 3 times, most recently from a7fc744 to 7a6ee73 Compare November 1, 2019 00:09
@retronym
Copy link
Member

retronym commented Nov 1, 2019

Minimization of the bootstrap failure:

scala> val breaker = collection.breakOut[scala.collection.immutable.Set[Int], Int, scala.collection.immutable.BitSet](scala.collection.immutable.BitSet.canBuildFrom)
breaker: scala.collection.generic.CanBuildFrom[scala.collection.immutable.Set[Int],Int,scala.collection.immutable.BitSet] = scala.collection.generic.BitSetFactory$$anon$1@3dcb28de

scala> val f = java.lang.Math.abs _
f: Int => Int = $$Lambda$1331/1191964420@1ab08e83

scala> Set[Int]().map[Int, scala.collection.immutable.BitSet](f)(breaker)
java.lang.ClassCastException: scala.collection.immutable.Set$EmptySet$ cannot be cast to scala.collection.BitSet
  at scala.collection.generic.BitSetFactory$$anon$1.apply(BitSetFactory.scala:36)
  at scala.collection.TraversableLike.builder$1(TraversableLike.scala:233)
  at scala.collection.TraversableLike.map(TraversableLike.scala:237)
  at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
  at scala.collection.AbstractSet.scala$collection$SetLike$$super$map(Set.scala:51)
  at scala.collection.SetLike.map(SetLike.scala:104)
  at scala.collection.SetLike.map$(SetLike.scala:104)
  at scala.collection.AbstractSet.map(Set.scala:51)

@mkeskells mkeskells force-pushed the mike/2.12.x_quickHashMap branch from 7a6ee73 to cc549b7 Compare November 1, 2019 07:59
@mkeskells mkeskells changed the title use HashMap merge where possible optimise the addition of immutable HashMap Nov 1, 2019
@diesalbla diesalbla added the library:collections PRs involving changes to the standard collection library label Nov 2, 2019
@mkeskells mkeskells force-pushed the mike/2.12.x_quickHashMap branch 4 times, most recently from 854edb4 to 133bc9b Compare December 2, 2019 00:01
@mkeskells mkeskells changed the title optimise the addition of immutable HashMap [nomerge] optimise the addition of immutable HashMap Dec 4, 2019
@mkeskells mkeskells force-pushed the mike/2.12.x_quickHashMap branch from 133bc9b to 4babb8a Compare December 6, 2019 01:05
@mkeskells
Copy link
Contributor Author

mkeskells commented Dec 6, 2019

rebased to resolve expected conflict with #8467

@SethTisue SethTisue added the performance the need for speed. usually compiler performance, sometimes runtime performance. label Dec 6, 2019
@mkeskells mkeskells force-pushed the mike/2.12.x_quickHashMap branch 3 times, most recently from 5e1f061 to e0386ec Compare December 10, 2019 07:58
@lrytz lrytz assigned retronym and szeiger and unassigned retronym Jan 31, 2020
@lrytz lrytz requested a review from szeiger January 31, 2020 12:36
Comment on lines +143 to +144
val unwrapped = w.unwrap
(unwrapped eq HashMap.canBuildFrom) || (unwrapped eq Map.canBuildFrom)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if you can ever get multiple levels of wrapping but isCompatibleCBF(w.unwrap) would also be more concise and avoid repeating the test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


override def ++[C >: (A, B), That](that: GenTraversableOnce[C])(implicit bf: CanBuildFrom[HashMap[A, B], C, That]): That =
addImpl(that, bf)
private def addImpl[C >: (A, B), That](that: GenTraversableOnce[C], bf: CanBuildFrom[HashMap[A, B], C, That]): That = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the separate method necessary? Couldn't you put the implementation directly into the second ++ method?

Copy link
Contributor Author

@mkeskells mkeskells Feb 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if I chnage to

  override def ++[B1 >: B](xs: GenTraversableOnce[(A, B1)]): Map[A, B1] = ++(xs)(HashMap.canBuildFrom[A, B1])

I get a compile error

Error:(166, 102) type mismatch;
 found   : scala.collection.generic.CanBuildFrom[scala.collection.immutable.HashMap.Coll,(A, B1),scala.collection.immutable.HashMap[A,B1]]
    (which expands to)  scala.collection.generic.CanBuildFrom[scala.collection.immutable.HashMap[_, _],(A, B1),scala.collection.immutable.HashMap[A,B1]]
 required: A
  override def ++[B1 >: B](xs: GenTraversableOnce[(A, B1)]): Map[A, B1] = ++(xs)(HashMap.canBuildFrom[A, B1])

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It works if you specify the type parameters:

  override def ++[B1 >: B](xs: GenTraversableOnce[(A, B1)]): Map[A, B1] = ++[(A, B1), Map[A, B1]](xs)(HashMap.canBuildFrom[A, B1])

  override def ++[C >: (A, B), That](that: GenTraversableOnce[C])(implicit bf: CanBuildFrom[HashMap[A, B], C, That]): That = {

The difference must be due to ++ being overloaded. It's strange that it fails with this type error rather than an ambiguous overload error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done - thanks for the type magic :-)

Comment on lines +161 to +184
//default Merge prefers to keep than replace
//so we merge from thatHash
(thatHash.merged(this) (null) ).asInstanceOf[That]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the handling of key and value identities here and in ++: below compatible with the current default (and new fallback) implementation consisting of successive + calls?

Copy link
Contributor Author

@mkeskells mkeskells Feb 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe so - that was my intention

var result: HashMap[A, B] = this.asInstanceOf[HashMap[A, B]]
that foreach { case kv: (A, B) =>
val key = kv._1
result = result.updated0(key, computeHash(key), 0, kv._2, kv, HashMap.liftMerger[A,B](null))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the default merger (aka liftMerger(null)) different from a null merger (as used when calling +)?

Copy link
Contributor Author

@mkeskells mkeskells Feb 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes - liftMerger(null) uses a merger with retainIdentical = true

  private def liftMerger[A1, B1](mergef: MergeFunction[A1, B1]): Merger[A1, B1] =
    if (mergef == null) defaultMerger.asInstanceOf[Merger[A1, B1]] else liftMerger0(mergef)

  private val defaultMerger : Merger[Any, Any] = new Merger[Any, Any] {
    override def apply(a: (Any, Any), b: (Any, Any)): (Any, Any) = a
    override def retainIdentical: Boolean = true
    override val invert: Merger[Any, Any] = new Merger[Any, Any] {
      override def apply(a: (Any, Any), b: (Any, Any)): (Any, Any) = b
      override def retainIdentical: Boolean = true
      override def invert = defaultMerger
    }
  }

this means that the merge code can ignore structurally shared trees

other than that the behaviour is , or should be, identical

Comment on lines +194 to +226
if (isCompatibleCBF(bf)) {
//here we know that That =:= HashMap[_, _], or compatible with it
if (this eq that) that.asInstanceOf[That]
else if (that.isEmpty) this.asInstanceOf[That]
else that match {
case thatHash: HashMap[A, B] =>
//default Merge prefers to keep than replace
//so we merge from this
(this.merged(thatHash)(null)).asInstanceOf[That]
case that =>
var result: HashMap[A, B] = this.asInstanceOf[HashMap[A, B]]
that foreach { case kv: (A, B) =>
val key = kv._1
result = result.updated0(key, computeHash(key), 0, kv._2, kv, HashMap.liftMerger[A,B](null))
}
result.asInstanceOf[That]
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole branch appears to be identical to the one from the other ++: method above. It could be factored out and shared.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done - and removed the ObjectRef
TODO - should cope with the Maps that now have forEachEntry

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did the forEachEntry change, and copied most of that to another PR to reduce the noise in the one


override def newBuilder[A, B]: mutable.Builder[(A, B), HashMap[A, B]] = new HashMapBuilder[A,B]
private class HashMapBuilder[A, B] extends mutable.MapBuilder[A, B, HashMap[A, B]](HashMap.empty) {
//not sure if this should be part of MapBuilder
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably, but the override wouldn't be binary compatible and there's a risk of breaking existing implementations that make ++ in Map call MapBuilder.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could do something in 2.15, when that branch starts

// condition below is due to 2 things:
// 1) no unsigned int compare on JVM
// 2) 0 (no lsb) should always be greater in comparison
if (unsignedCompare(thislsb - 1, thatlsb - 1)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Off-topic: I wonder if unsignedCompare (for Int) was ever benchmarked against a more straight-forward version that performs unsigned conversions to Long. It's also confusingly named (compare? for what?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think (based in the HashSet changes, this can probably go from this implementation later

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed reference to unsignedCompare from HashMap. There are still some from HashSet until #8589 merges

def + [V1](kv: (Any, V1)): Map[Any, V1] = updated(kv._1, kv._2)
override def ++[V1 >: Nothing](xs: GenTraversableOnce[(Any, V1)]): Map[Any, V1] = addImpl(xs)
override def ++[B >: (Any, Nothing), That](that: GenTraversableOnce[B])(implicit bf: CanBuildFrom[Map[Any, Nothing], B, That]): That = super.++(that)
private def addImpl[B >: (Any, Nothing), That](that: GenTraversableOnce[B])(implicit bf: CanBuildFrom[Map[Any, Nothing], B, That]): That = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this is only called from the first ++ above where the cbf instance is statically known.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doh - changed to be similar to HashMap ++ and addImpl

def apply(from: From) = wrapped.apply()

def apply() = wrapped.apply()
final def unwrap: CanBuildFrom[_, T, To] = wrapped
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this method that drops the From type instead of making wrapped a val and keeping Nothing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no idea. Changed it to be a val

@mkeskells mkeskells force-pushed the mike/2.12.x_quickHashMap branch from e0386ec to 1408bce Compare February 4, 2020 08:26
@mkeskells
Copy link
Contributor Author

addressed some of @szeiger comments
rebased - (removed the mima changes - easier to start again on those from the build)

@mkeskells mkeskells force-pushed the mike/2.12.x_quickHashMap branch from a0d9859 to f3bd263 Compare February 5, 2020 23:22
@mkeskells
Copy link
Contributor Author

rebased and squashed

@mkeskells mkeskells force-pushed the mike/2.12.x_quickHashMap branch 2 times, most recently from d765a5a to 30db477 Compare February 9, 2020 22:09
@retronym retronym force-pushed the mike/2.12.x_quickHashMap branch from 8c45087 to c042057 Compare February 17, 2020 02:57
@retronym
Copy link
Member

Pushed a commit to rebase atop the latest 2.12.x (which has HasForEach) and to squash the last two commits.

@retronym
Copy link
Member

/synch

@retronym
Copy link
Member

/nothingtoseehere

(Clearing spurious "cla Expected — Waiting for status to be reported ")

@szeiger szeiger requested a review from retronym February 19, 2020 16:53
@szeiger
Copy link
Contributor

szeiger commented Feb 19, 2020

LGTM but it's hard to judge the correctness of all the new code so I'd prefer to get a 2nd approval from @retronym before merging.

@lrytz lrytz assigned retronym and unassigned szeiger Feb 25, 2020
@retronym retronym merged commit 466f92f into scala:2.12.x Feb 25, 2020
retronym added a commit to retronym/scala that referenced this pull request Feb 27, 2020
This avoids intermediate Tuple2 allocation.

Selective forward port of scala#8466 with a few additional
opportunities seized.
retronym added a commit to retronym/scala that referenced this pull request Feb 27, 2020
This avoids intermediate Tuple2 allocation.

Selective forward port of scala#8466 with a few additional
opportunities seized.
@SethTisue
Copy link
Member

SethTisue commented Feb 27, 2020

after this PR, scala-ide/scalariform has test failures in the community build: scala/scala-dev#672 . let's discuss further there

@retronym
Copy link
Member

Fixed in #8764

retronym added a commit to retronym/scala that referenced this pull request Feb 28, 2020
SethTisue pushed a commit to retronym/scala that referenced this pull request Mar 9, 2020
This avoids intermediate Tuple2 allocation.

Selective forward port of scala#8466 with a few additional
opportunities seized.
hamzaremmal pushed a commit to hamzaremmal/scala3 that referenced this pull request May 2, 2025
…HashMap

[nomerge] optimise the addition of immutable HashMap
hamzaremmal pushed a commit to hamzaremmal/scala3 that referenced this pull request May 2, 2025
hamzaremmal pushed a commit to hamzaremmal/scala3 that referenced this pull request May 2, 2025
This avoids intermediate Tuple2 allocation.

Selective forward port of scala/scala#8466 with a few additional
opportunities seized.
hamzaremmal pushed a commit to scala/scala3 that referenced this pull request May 7, 2025
…HashMap

[nomerge] optimise the addition of immutable HashMap
hamzaremmal pushed a commit to scala/scala3 that referenced this pull request May 7, 2025
hamzaremmal pushed a commit to scala/scala3 that referenced this pull request May 7, 2025
This avoids intermediate Tuple2 allocation.

Selective forward port of scala/scala#8466 with a few additional
opportunities seized.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

library:collections PRs involving changes to the standard collection library performance the need for speed. usually compiler performance, sometimes runtime performance.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants