Cache labels.Labels to Identify Series in Remote Write #5849
csmarchbanks merged 2 commits into prometheus:master
Conversation
storage/remote/queue_manager.go
Outdated
Does this really help? Changing the number of shards doesn't happen that often.
Benchmarks improved some when I first added it. The pool is used for every http request, not just the number of shards, though it could probably be done by keeping a single []prompb.TimeSeries per shard instead of using a pool, similar to the compression buffer.
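The per-shard buffer alternative mentioned here can be sketched as follows. This is a minimal illustration of the idea, not the actual queue manager code; `timeSeries` and `shard` are simplified stand-ins for `prompb.TimeSeries` and the remote-write shard type:

```go
package main

import "fmt"

// timeSeries is a simplified stand-in for prompb.TimeSeries.
type timeSeries struct {
	name  string
	value float64
}

// shard owns one reusable pendingData slice instead of drawing from a
// sync.Pool: the slice is truncated (not reallocated) between requests,
// so the same backing array is reused for every send.
type shard struct {
	pendingData []timeSeries
}

func newShard(maxSamplesPerSend int) *shard {
	return &shard{pendingData: make([]timeSeries, 0, maxSamplesPerSend)}
}

func (s *shard) append(ts timeSeries) {
	s.pendingData = append(s.pendingData, ts)
}

// send reports the batch size, then resets length to zero while keeping
// capacity, so the next batch appends into the same backing array.
func (s *shard) send() int {
	n := len(s.pendingData)
	// ... encode and send s.pendingData over HTTP here ...
	s.pendingData = s.pendingData[:0]
	return n
}

func main() {
	sh := newShard(100)
	sh.append(timeSeries{"up", 1})
	sh.append(timeSeries{"up", 0})
	fmt.Println(sh.send(), cap(sh.pendingData)) // batch of 2, capacity retained
}
```

Since shards are long-lived and each is driven by a single goroutine, the buffer needs no synchronization, which is what makes it a plausible replacement for the pool.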
storage/remote/codec.go
Outdated
You can be smarter here I think, I presume this is where the allocations are.
👍 Many were here, and some still in buildWriteRequest from making []prompb.Sample. Latest update helps both of these.
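The `[]prompb.Sample` allocations mentioned here can be reduced by sizing the slice up front, since the batch size is known before the slice is built. A minimal sketch with a simplified stand-in type (not the real prompb struct or the actual `buildWriteRequest` signature):

```go
package main

import "fmt"

// sample is a simplified stand-in for prompb.Sample.
type sample struct {
	timestamp int64
	value     float64
}

// buildSamples preallocates exactly len(values) entries, so append never
// reallocates: one allocation per batch instead of repeated grow-and-copy
// steps as the slice expands.
func buildSamples(timestamps []int64, values []float64) []sample {
	out := make([]sample, 0, len(values))
	for i, v := range values {
		out = append(out, sample{timestamp: timestamps[i], value: v})
	}
	return out
}

func main() {
	s := buildSamples([]int64{1000, 2000}, []float64{0.5, 0.7})
	fmt.Println(len(s), cap(s)) // length equals capacity: no wasted growth
}
```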
storage/remote/queue_manager.go
Outdated
defer causes an alloc. I'd avoid sync.Pool entirely, it has interesting interactions with GC and in this case we know exactly how much memory we need.
👍 Removed the pool.
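The "we know exactly how much memory we need" point applies to the compression scratch buffer too: a caller-owned byte slice, sized once and truncated between requests, avoids both `sync.Pool`'s GC interactions and the `defer` allocation. A minimal sketch of the pattern (`encodeTo` is a hypothetical placeholder that just copies; real code would compress):

```go
package main

import "fmt"

// encodeTo appends an encoded copy of src into dst's backing array.
// Callers own dst and reuse it across requests, which is the pattern
// used here instead of drawing buffers from a sync.Pool.
func encodeTo(dst, src []byte) []byte {
	dst = dst[:0] // keep capacity, drop old contents
	// Real code would compress src here; this sketch just copies it.
	return append(dst, src...)
}

func main() {
	buf := make([]byte, 0, 1024) // sized once, reused for every request
	for _, payload := range []string{"req1", "longer-request-2"} {
		buf = encodeTo(buf, []byte(payload))
		fmt.Println(len(buf), cap(buf)) // capacity stays at 1024
	}
}
```

Because the buffer lives on the shard rather than in a pool, the GC never reclaims it mid-flight and its size is fully predictable.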
Force-pushed 27823cd to cd83e11
That's looking okay. What do the benchmarks say now?
Force-pushed cd83e11 to 908afe5
I resolved the merge conflict and updated the benchmarks. There are slightly fewer allocations than last time; CPU jumps around between runs but is about the same before and after. I don't expect the graphs to change much, but I can repost them in a few hours if there's interest.
@brian-brazil would you mind taking another look when you have a moment?
👍 |
Force-pushed 908afe5 to 94a9f60
storage/remote/codec.go
Outdated
@bwplotka I just updated this PR to make it build after the streaming remote read work was merged. Is it ok that labelsToLabelsProto no longer interns all the labels?
I don't see any calls to interner.release, so it is probably better not to intern during the remote read path anyway.
Looks like remote read only uses labelsToLabelsProto in tests, so it should be okay.
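The intern/release concern above is about balanced reference counting: every `intern` must eventually be paired with a `release`, or entries leak forever. A simplified, non-concurrent sketch of that idea (not the actual Prometheus interner, which also handles concurrency and metrics):

```go
package main

import "fmt"

// entry holds an interned string plus its reference count.
type entry struct {
	s    string
	refs int
}

// interner deduplicates strings so identical label values share one
// allocation, and frees them when the last reference is released.
type interner struct {
	pool map[string]*entry
}

func newInterner() *interner {
	return &interner{pool: map[string]*entry{}}
}

// intern returns the canonical copy of s, bumping its reference count.
func (in *interner) intern(s string) string {
	if e, ok := in.pool[s]; ok {
		e.refs++
		return e.s
	}
	in.pool[s] = &entry{s: s, refs: 1}
	return s
}

// release drops one reference. Without matching release calls (the concern
// raised in the comment above), entries would accumulate indefinitely.
func (in *interner) release(s string) {
	e, ok := in.pool[s]
	if !ok {
		return
	}
	if e.refs--; e.refs == 0 {
		delete(in.pool, s)
	}
}

func main() {
	in := newInterner()
	in.intern("job")
	in.intern("job")
	fmt.Println(len(in.pool)) // one shared entry, refs == 2
	in.release("job")
	in.release("job")
	fmt.Println(len(in.pool)) // entry reclaimed after the last release
}
```

This is why interning only makes sense on the write path, where series lifetimes are tracked and releases can actually happen.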
cstyan
left a comment
👍
I saw similar results to the benchmarks: about the same CPU usage, and anywhere from 15-30% less memory usage.
storage/remote/queue_manager.go
Outdated
Hmmm, it would be nice if we could use a different name. To me sample implies just a timestamp and value, but I can't think of anything better right now.
I could call it something like queueSample to differentiate it a bit, but I don't know if that gains anything.
This will use half the steady-state memory required by prompb.Label.
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
Force-pushed 94a9f60 to 791a240
This is an experiment to have the remote write component use labels.Labels instead of []prompb.Label when passing samples around and identifying series. A labels.Label uses less space, so steady-state memory usage should be lower, with the tradeoff of transforming labels.Labels to []prompb.Label for each remote request.
Results after removing some extra sources of allocations look very good. This is for a server with three remote-write targets and ~8k samples per second ingestion.
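The per-request conversion described above can be sketched as follows. `Label` and `protoLabel` are simplified stand-ins for the actual `labels.Label` and `prompb.Label` types, and `labelsToProto` is an illustrative placeholder, not the real labelsToLabelsProto:

```go
package main

import "fmt"

// Label is a simplified stand-in for labels.Label; labels.Labels in
// Prometheus is a sorted slice of these name/value pairs.
type Label struct{ Name, Value string }

// protoLabel is a simplified stand-in for prompb.Label.
type protoLabel struct{ Name, Value string }

// labelsToProto converts the compact in-memory representation to the
// protobuf form only at request time. This is the tradeoff described in
// the PR: lower steady-state memory for the series cache, at the cost of
// one conversion per remote request.
func labelsToProto(lbls []Label) []protoLabel {
	out := make([]protoLabel, 0, len(lbls))
	for _, l := range lbls {
		out = append(out, protoLabel{Name: l.Name, Value: l.Value})
	}
	return out
}

func main() {
	// Series identity is held as the compact form...
	series := []Label{{"__name__", "up"}, {"job", "node"}}
	// ...and expanded to the protobuf form only when building a request.
	fmt.Println(len(labelsToProto(series)))
}
```

The conversion allocates, but only for the duration of a request, whereas the cached form previously held the larger protobuf representation for the lifetime of every tracked series.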
Benchmarks
Heap profiles
Heap profiles show a 50% reduction in memory used by StoreSeries as expected.
v2.11.1: (heap profile image)
rw-use-labels branch: (heap profile image)
Other metrics
CPU usage is very similar between the servers. However, there is a reduction in allocations:
