Skip to content

A simple TreeSetBuilder#8736

Merged
retronym merged 1 commit intoscala:2.12.xfrom
mkeskells:2.12.x_sortedSet-2
Feb 26, 2020
Merged

A simple TreeSetBuilder#8736
retronym merged 1 commit intoscala:2.12.xfrom
mkeskells:2.12.x_sortedSet-2

Conversation

@mkeskells
Copy link
Contributor

Avoid the creation of TreeSet objects tat are never visible, so just create one when result is called

Simple optimisation to ++= to do less work

@scala-jenkins scala-jenkins added this to the 2.12.12 milestone Feb 21, 2020
@mkeskells mkeskells force-pushed the 2.12.x_sortedSet-2 branch 2 times, most recently from 3bd32f2 to 486b6d7 Compare February 22, 2020 09:17
@diesalbla diesalbla added the library:collections PRs involving changes to the standard collection library label Feb 22, 2020
@mkeskells
Copy link
Contributor Author

benchmark
baseline - (Map and set) - https://gist.github.com/mkeskells/b2519ea2577d118fc93be65770fab705
results for this PR (Set only) - https://gist.github.com/mkeskells/bc7968fc375f8314387322cc0f786991

summary for each benchmark
builderPlus - calling += on the builder

Measure before After
CPU (ns) 432 414
Allocation(Bytes) 482 458

builderPlusPlusInitial - calling the first ++= on the builder - for the case where the parameters is a compatible TreeSet
before this change its O(n), after its constant and trivial
Its a bit of a special case - not typical coding

Measure before After
CPU (ns) 461546 14
Allocation(Bytes) 500992 24

builderPlusPlusLargeLarge - calling two ++= on the builder, with large non overlapping keys
Most of the effect is the initial set saving from the previous benchmark. the rest is from the saving in the first benchmark

Measure before After
CPU (ns) 951810 360917
Allocation(Bytes) 888817 363688

builderPlusPlusSmallLarge and builderPlusPlusLargeSmall builderPlusPlusSame- calling two ++= on the builder, with a large and small datase, non overlapping keys
Some of the effect if from the initial values, but additionally working out what which to add to which

builderPlusPlusLargeSmall

Measure before After
CPU (ns) 494615 2768
Allocation(Bytes) 503808 2952

builderPlusPlusSmallLarge

Measure before After
CPU (ns) 497649 2528
Allocation(Bytes) 502176 2952

builderPlusPlusSame

Measure before After
CPU (ns) 832849 338073
Allocation(Bytes) 793041 265992

plusPlus - not affected by this PR - but for completeness
calling ++ on the TreeSet with another TreeSet with non overlapping data

Measure before After
CPU (ns) 425446 429665
Allocation(Bytes) 385832 385832

final class TreeSet[A] private (tree: RB.Tree[A, Unit])(implicit val ordering: Ordering[A])
extends SortedSet[A] with SortedSetLike[A, TreeSet[A]] with Serializable {

//for serialisation computability
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you mean?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If instead we make tree: RB.Tree[A, Unit] above into private val tree: RB.Tree[A, Unit], we end up renaming the its backing field to an expanded name (expanded because it is accessed from another class and madeNotPrivate). This changes the serialization fingerprint of the class.

2.13 collections use the serialization proxy pattern pervasively to avoid relying on default serialization.

this
}

private object adder extends Function1[A, Unit] {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course, you could make TreeSetBuilder extend A => Unit and save the object overhead, but that's too clever by at least half

@retronym retronym requested a review from szeiger February 25, 2020 06:59
Copy link
Member

@retronym retronym left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I'd be comfortable with this in 2.12.11 ahead of the full backport of the 2.13 RB trees Mike is working towards for 2.12.12.

@retronym retronym merged commit 348c541 into scala:2.12.x Feb 26, 2020
@SethTisue SethTisue modified the milestones: 2.12.12, 2.12.11 Feb 27, 2020
@lrytz lrytz added the performance the need for speed. usually compiler performance, sometimes runtime performance. label Mar 16, 2020
hamzaremmal pushed a commit to hamzaremmal/scala3 that referenced this pull request May 2, 2025
hamzaremmal pushed a commit to scala/scala3 that referenced this pull request May 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

library:collections PRs involving changes to the standard collection library performance the need for speed. usually compiler performance, sometimes runtime performance.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants