Skip to content
This repository was archived by the owner on Aug 23, 2023. It is now read-only.

cassandra-idx: load partitions in parallel#1270

Merged
Dieterbe merged 6 commits intografana:masterfrom
bloomberg:loadIdxInParallel
Apr 10, 2019
Merged

cassandra-idx: load partitions in parallel#1270
Dieterbe merged 6 commits intografana:masterfrom
bloomberg:loadIdxInParallel

Conversation

@shanson7
Copy link
Copy Markdown
Collaborator

@shanson7 shanson7 commented Apr 4, 2019

Loading the cassandra index takes up to 10 minutes for our nodes. With the changes for partitioned index it used less memory, but wasn't any faster. This change brings it down to 3 minutes which helps reduce our recovery window when we lose a node (also speeds up our rollouts)

@woodsaj
Copy link
Copy Markdown
Contributor

woodsaj commented Apr 4, 2019

I think it would be beneficial to make the concurrency configurable. We have some instances that consume a lot of memory if they try and load all partitions at once. So having the concurrency configurable allows us to balance speed vs memory utilisation.

@shanson7
Copy link
Copy Markdown
Collaborator Author

shanson7 commented Apr 4, 2019

That makes sense. What would a reasonable default value be? I feel like 1 is the safest, but also slowest.

@shanson7
Copy link
Copy Markdown
Collaborator Author

shanson7 commented Apr 4, 2019

Ok, if we can agree on the anem of the param and the default, I'll do another commit and update the docs/sample ini files.

@shanson7 shanson7 force-pushed the loadIdxInParallel branch 2 times, most recently from 4e311f5 to 43c49ef Compare April 4, 2019 22:04
@replay
Copy link
Copy Markdown
Contributor

replay commented Apr 5, 2019

That's a great idea.
scripts/dev/tools-to-doc.sh still needs to be updated to make the tests pass.

@shanson7
Copy link
Copy Markdown
Collaborator Author

shanson7 commented Apr 5, 2019

Yep, wanted to make sure the naming / defaults were agreed upon

wg.Done()
<-gate
}()
var defs []schema.MetricDefinition
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should try and reuse these slices where possible. If concurrency is set to 1, then we can re-use the same slice for every partition and reduce allocations.

Perhaps a sync.Pool could be used.

@woodsaj
Copy link
Copy Markdown
Contributor

woodsaj commented Apr 5, 2019

init-load-concurrency sounds like the right config name to me.

@shanson7
Copy link
Copy Markdown
Collaborator Author

shanson7 commented Apr 5, 2019

I should also mention that this change really shines with the PartitionedIndex because lock contention is lower. We aren't using that yet though.

@Dieterbe
Copy link
Copy Markdown
Contributor

Dieterbe commented Apr 8, 2019

I have disabled codelingo. too unreliable for now. may need to re-push to get it to get rid of the codelingo step.

@shanson7 shanson7 closed this Apr 9, 2019
@shanson7 shanson7 reopened this Apr 9, 2019
@shanson7 shanson7 force-pushed the loadIdxInParallel branch from e104b43 to 6720297 Compare April 9, 2019 15:14
@shanson7
Copy link
Copy Markdown
Collaborator Author

shanson7 commented Apr 9, 2019

Got a test failure in unrelated test. Possibly due to the latest rebase?

@Dieterbe
Copy link
Copy Markdown
Contributor

Dieterbe commented Apr 9, 2019

CI is all green.

@shanson7
Copy link
Copy Markdown
Collaborator Author

shanson7 commented Apr 9, 2019

Yes, I clicked "Re-run". Seems like a flaky test.

<-gate
}()
defs = c.LoadPartitions([]int32{p}, defs, pre)
num += c.MemoryIndex.LoadPartition(p, defs)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

concurrent write to num? maybe use an atomic here?

@Dieterbe
Copy link
Copy Markdown
Contributor

@shanson7 i pushed minor tweaks. if my commits looks good to you, then I think this is good to go!

@shanson7
Copy link
Copy Markdown
Collaborator Author

LGTM

Copy link
Copy Markdown
Contributor

@Dieterbe Dieterbe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good. thanks @shanson7 !

@Dieterbe Dieterbe changed the title Load partitions in parallel cassandra-idx: load partitions in parallel Apr 10, 2019
@Dieterbe Dieterbe merged commit ebcbb7b into grafana:master Apr 10, 2019
@shanson7 shanson7 deleted the loadIdxInParallel branch February 24, 2020 11:32
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants