Optimise VectorAnd for step invariant expression#9577
Optimise VectorAnd for step invariant expression#9577darshanime wants to merge 1 commit intoprometheus:mainfrom
Conversation
b4bffb6 to
1b567e2
Compare
|
there are similar opportunities to optimize |
codesome
left a comment
There was a problem hiding this comment.
Thanks! Could you please benchmark this change to see it in action? You might need additional test cases in existing benchmark if it does not exist yet.
1b567e2 to
fecb642
Compare
|
benchmarks look great for the somewhat contrived queries: but good overall too... see benchmark |
ec72f4d to
d2818bb
Compare
|
@codesome (and maybe @bboreham ), we have run into this during the bug scrub, and it seems to us this is kind-of good to go. Would you agree? Of course, by now, it needs a rebase. @darshanime would you be able to do that? |
|
If I understand it correctly, this kind of optimization would be very helpful for Home Assistant users, since the current method for filtering out unavailable values is very expensive and needs a recorded rule to be efficient. https://www.home-assistant.io/integrations/prometheus/#metrics-in-unavailable-or-unknown-states |
|
Discussed at the bug scrub. I will take a closer look. |
d2818bb to
f55d906
Compare
|
rebased. this patch adds some optimizations for the relevant benchmarks see benchmark |
b87c9f0 to
119b62d
Compare
When sensors go offline, this component will continue to report its last value, until Home Assistant itself restarts, or the sensor returns. The `entity_available` metric is used to filter out unavailable metrics, but this is slow with current versions of prometheus (see prometheus/prometheus#9577). This new option will automatically withdraw metrics when the entity becomes unavailable, which matches the behavior on restart and makes it easier to see missing metrics without using an `unless`.
When sensors go offline, this component will continue to report its last value, until Home Assistant itself restarts, or the sensor returns. The `entity_available` metric is used to filter out unavailable metrics, but this is slow with current versions of prometheus (see prometheus/prometheus#9577). This new option will automatically withdraw metrics when the entity becomes unavailable, which matches the behavior on restart and makes it easier to see missing metrics without using an `unless`.
When sensors go offline, this component will continue to report its last value, until Home Assistant itself restarts, or the sensor returns. The `entity_available` metric is used to filter out unavailable metrics, but this is slow with current versions of prometheus (see prometheus/prometheus#9577). This new option will automatically withdraw metrics when the entity becomes unavailable, which matches the behavior on restart and makes it easier to see missing metrics without using an `unless`.
bboreham
left a comment
There was a problem hiding this comment.
Thanks for this. I think the idea is valid, but I'm afraid I could not understand the code.
Suggest adding some comments explaining key features.
|
|
||
| // optimisations for step invariant series | ||
| stepInvariant []bool | ||
| stepInvariantSigSet []map[string]struct{} |
There was a problem hiding this comment.
Please state what is the index of this slice, and the key of the map.
There was a problem hiding this comment.
clarified in code comment
There was a problem hiding this comment.
I don't think it is clear what a "signature set" is.
There was a problem hiding this comment.
"index of exprs passed to rangeEval" could perhaps be simplified to "0 for LHS and 1 for RHS".
Do we need a map where both are step-invariant?
promql/engine.go
Outdated
| stepInvariant: make([]bool, len(exprs)), | ||
| stepInvariantSigSet: make([]map[string]struct{}, len(exprs)), |
There was a problem hiding this comment.
Do we need to allocate in every case?
There was a problem hiding this comment.
yes, we need to allocate a stepInvariantSigSet slice for each expr (just like we allocate a Vector for each expr). refactored to append a map to this slice only when an expr is step invariant.
promql/engine.go
Outdated
| for _, sh := range lhsh { | ||
| enh.stepInvariantSigSet[0][sh.signature] = struct{}{} | ||
| } |
There was a problem hiding this comment.
I got lost here. Why are we looping?
There was a problem hiding this comment.
originally, we created a set of signatures for each series in lhs, rhs, for each timestamp.
if lhs/rhs is step invariant, all the timestamps are guaranteed to have the same series, so we create the signature set only once, and reuse it for subsequent timestamps...
There was a problem hiding this comment.
Can this be done before entering the range-values loop?
There was a problem hiding this comment.
i don't see how it could be done outside of the range-values loop
When sensors go offline, this component will continue to report its last value, until Home Assistant itself restarts, or the sensor returns. The `entity_available` metric is used to filter out unavailable metrics, but this is slow with current versions of prometheus (see prometheus/prometheus#9577). This new option will automatically withdraw metrics when the entity becomes unavailable, which matches the behavior on restart and makes it easier to see missing metrics without using an `unless`.
58cc80d to
0337df1
Compare
When sensors go offline, this component will continue to report its last value, until Home Assistant itself restarts, or the sensor returns. The `entity_available` metric is used to filter out unavailable metrics, but this is slow with current versions of prometheus (see prometheus/prometheus#9577). This new option will automatically withdraw metrics when the entity becomes unavailable, which matches the behavior on restart and makes it easier to see missing metrics without using an `unless`.
When sensors go offline, this component will continue to report its last value, until Home Assistant itself restarts, or the sensor returns. The `entity_available` metric is used to filter out unavailable metrics, but this is slow with current versions of prometheus (see prometheus/prometheus#9577). This new option will automatically withdraw metrics when the entity becomes unavailable, which matches the behavior on restart and makes it easier to see missing metrics without using an `unless`.
When sensors go offline, this component will continue to report its last value, until Home Assistant itself restarts, or the sensor returns. The `entity_available` metric is used to filter out unavailable metrics, but this is slow with current versions of prometheus (see prometheus/prometheus#9577). This new option will automatically withdraw metrics when the entity becomes unavailable, which matches the behavior on restart and makes it easier to see missing metrics without using an `unless`.
When sensors go offline, this component will continue to report its last value, until Home Assistant itself restarts, or the sensor returns. The `entity_available` metric is used to filter out unavailable metrics, but this is slow with current versions of prometheus (see prometheus/prometheus#9577). This new option will automatically withdraw metrics when the entity becomes unavailable, which matches the behavior on restart and makes it easier to see missing metrics without using an `unless`.
When sensors go offline, this component will continue to report its last value, until Home Assistant itself restarts, or the sensor returns. The `entity_available` metric is used to filter out unavailable metrics, but this is slow with current versions of prometheus (see prometheus/prometheus#9577). This new option will automatically withdraw metrics when the entity becomes unavailable, which matches the behavior on restart and makes it easier to see missing metrics without using an `unless`.
When sensors go offline, this component will continue to report its last value, until Home Assistant itself restarts, or the sensor returns. The `entity_available` metric is used to filter out unavailable metrics, but this is slow with current versions of prometheus (see prometheus/prometheus#9577). This new option will automatically withdraw metrics when the entity becomes unavailable, which matches the behavior on restart and makes it easier to see missing metrics without using an `unless`.
When sensors go offline, this component would continue to report its last value, until Home Assistant itself restarts, or the sensor returns. The `entity_available` metric can be used to filter out unavailable metrics, but this is slow with current versions of prometheus (see prometheus/prometheus#9577). Now, the component will automatically withdraw metrics when the entity becomes unavailable, which matches the behavior on restart and makes it easier to see missing metrics without using an `unless`.
When sensors go offline, this component would continue to report its last value, until Home Assistant itself restarts, or the sensor returns. The `entity_available` metric can be used to filter out unavailable metrics, but this is slow with current versions of prometheus (see prometheus/prometheus#9577). Now, the component will automatically withdraw metrics when the entity becomes unavailable, which matches the behavior on restart and makes it easier to see missing metrics without using an `unless`.
When sensors go offline, this component would continue to report its last value, until Home Assistant itself restarts, or the sensor returns. The `entity_available` metric can be used to filter out unavailable metrics, but this is slow with current versions of prometheus (see prometheus/prometheus#9577). Now, the component will automatically withdraw metrics when the entity becomes unavailable, which matches the behavior on restart and makes it easier to see missing metrics without using an `unless`.
|
Hello from the bug-scrub! Looks like no update since this PR was reviewed in May; do you think you will come back to it @darshanime ? |
When sensors go offline, this component would continue to report its last value, until Home Assistant itself restarts, or the sensor returns. The `entity_available` metric can be used to filter out unavailable metrics, but this is slow with current versions of prometheus (see prometheus/prometheus#9577). Now, the component will automatically withdraw metrics when the entity becomes unavailable, which matches the behavior on restart and makes it easier to see missing metrics without using an `unless`.
When sensors go offline, this component would continue to report its last value, until Home Assistant itself restarts, or the sensor returns. The `entity_available` metric can be used to filter out unavailable metrics, but this is slow with current versions of prometheus (see prometheus/prometheus#9577). Now, the component will automatically withdraw metrics when the entity becomes unavailable, which matches the behavior on restart and makes it easier to see missing metrics without using an `unless`.
When sensors go offline, this component would continue to report its last value, until Home Assistant itself restarts, or the sensor returns. The `entity_available` metric can be used to filter out unavailable metrics, but this is slow with current versions of prometheus (see prometheus/prometheus#9577). Now, the component will automatically withdraw metrics when the entity becomes unavailable, which matches the behavior on restart and makes it easier to see missing metrics without using an `unless`.
When sensors go offline, this component would continue to report its last value, until Home Assistant itself restarts, or the sensor returns. The `entity_available` metric can be used to filter out unavailable metrics, but this is slow with current versions of prometheus (see prometheus/prometheus#9577). Now, the component will automatically withdraw metrics when the entity becomes unavailable, which matches the behavior on restart and makes it easier to see missing metrics without using an `unless`.
When sensors go offline, this component would continue to report its last value, until Home Assistant itself restarts, or the sensor returns. The `entity_available` metric can be used to filter out unavailable metrics, but this is slow with current versions of prometheus (see prometheus/prometheus#9577). Now, the component will automatically withdraw metrics when the entity becomes unavailable, which matches the behavior on restart and makes it easier to see missing metrics without using an `unless`.
When sensors go offline, this component would continue to report its last value, until Home Assistant itself restarts, or the sensor returns. The `entity_available` metric can be used to filter out unavailable metrics, but this is slow with current versions of prometheus (see prometheus/prometheus#9577). Now, the component will automatically withdraw metrics when the entity becomes unavailable, which matches the behavior on restart and makes it easier to see missing metrics without using an `unless`.
When sensors go offline, this component would continue to report its last value, until Home Assistant itself restarts, or the sensor returns. The `entity_available` metric can be used to filter out unavailable metrics, but this is slow with current versions of prometheus (see prometheus/prometheus#9577). Now, the component will automatically withdraw metrics when the entity becomes unavailable, which matches the behavior on restart and makes it easier to see missing metrics without using an `unless`.
When sensors go offline, this component would continue to report its last value, until Home Assistant itself restarts, or the sensor returns. The `entity_available` metric can be used to filter out unavailable metrics, but this is slow with current versions of prometheus (see prometheus/prometheus#9577). Now, the component will automatically withdraw metrics when the entity becomes unavailable, which matches the behavior on restart and makes it easier to see missing metrics without using an `unless`.
When sensors go offline, this component would continue to report its last value, until Home Assistant itself restarts, or the sensor returns. The `entity_available` metric can be used to filter out unavailable metrics, but this is slow with current versions of prometheus (see prometheus/prometheus#9577). Now, the component will automatically withdraw metrics when the entity becomes unavailable, which matches the behavior on restart and makes it easier to see missing metrics without using an `unless`.
When sensors go offline, this component would continue to report its last value, until Home Assistant itself restarts, or the sensor returns. The `entity_available` metric can be used to filter out unavailable metrics, but this is slow with current versions of prometheus (see prometheus/prometheus#9577). Now, the component will automatically withdraw metrics when the entity becomes unavailable, which matches the behavior on restart and makes it easier to see missing metrics without using an `unless`.
Signed-off-by: darshanime <deathbullet@gmail.com>
c750e8a to
73d66f9
Compare
|
@bboreham i have rebased and addressed your comments.
i'd like to do that in a separate pr if that's okay with you |
|
Hello again from the bug-scrub! @bboreham are you able to continue the review on this? |
bboreham
left a comment
There was a problem hiding this comment.
I apologise it took me a really long time to get back to this.
It also took me a long time to pin down a problem. I still feel the code is hard to understand, but that could be acceptable.
However it doesn't always pass its own tests.
go test -count=20 -run=TestEvaluations/testdata/operators.test//@ ./promql
--- FAIL: TestEvaluations (0.08s)
--- FAIL: TestEvaluations/testdata/operators.test (0.08s)
--- FAIL: TestEvaluations/testdata/operators.test/line_17/vector_matching_a_@_start()_and_vector_matching_b (0.00s)
test.go:1122:
Error Trace: /home/bryan/src/github.com/prometheus/prometheus/promql/promqltest/test.go:1122
Error: Received unexpected error:
error in eval vector_matching_a @ start() and vector_matching_b (line 17): expected metric {__name__="vector_matching_a", l="x"} not found
Test: TestEvaluations/testdata/operators.test/line_17/vector_matching_a_@_start()_and_vector_matching_b
--- FAIL: TestEvaluations/testdata/operators.test/line_20/vector_matching_a_@_end()_and_vector_matching_b (0.00s)
test.go:1122:
Error Trace: /home/bryan/src/github.com/prometheus/prometheus/promql/promqltest/test.go:1122
Error: Received unexpected error:
error in eval vector_matching_a @ end() and vector_matching_b (line 20): expected metric {__name__="vector_matching_a", l="x"} not found
Test: TestEvaluations/testdata/operators.test/line_20/vector_matching_a_@_end()_and_vector_matching_b
I also tried reversing the new tests, and got a different failure:
eval range from 0 to 30m step 5m vector_matching_b and vector_matching_a @ start()
vector_matching_b{l="x"} 0 4 8 12 16 20 24
eval range from 0 to 30m step 5m vector_matching_b and vector_matching_a @ end()
vector_matching_b{l="x"} 0 4 8 12 16 20 24
eval range from 0 to 30m step 5m vector_matching_b @ end() and vector_matching_a
vector_matching_b{l="x"} 24 24 24 24 24 24 24
eval range from 0 to 30m step 5m vector_matching_b @ start() and vector_matching_a @ end()
vector_matching_b{l="x"} 0 0 0 0 0 0 0
--- FAIL: TestEvaluations/testdata/operators.test/line_35/vector_matching_b_@_end()_and_vector_matching_a (0.00s)
test.go:1122:
Error Trace: /home/bryan/src/github.com/prometheus/prometheus/promql/promqltest/test.go:1122
Error: Received unexpected error:
error evaluating query "vector_matching_b @ end() and vector_matching_a" (line 35): unexpected error: runtime error: index out of range [1] with length 1
Test: TestEvaluations/testdata/operators.test/line_35/vector_matching_b_@_end()_and_vector_matching_a
|
Hello again from the bug-scrub! @darshanime this is with you to fix the inconsistent test results. |
|
Hello again from the bug-scrub! @darshanime do you think you will come back to this? |
closes #9368
Signed-off-by: darshanime deathbullet@gmail.com