Add asPercent function by stivenbb · Pull Request #966 · grafana/metrictank

stivenbb · 2018-07-26T20:17:56Z

Native implementation of asPercent() Graphite function. (http://graphite.readthedocs.io/en/latest/functions.html#graphite.render.functions.asPercent)

Added a new argument type, ArgIn, that accepts any one of several other argument types. This was necessary for the total argument. Some of the code is borrowed from an abandoned PR: #672

In terms of speed improvement:

---------- Native Implementation ----------
Requests      [total, rate]            900, 5.01
Duration      [total, attack, wait]    3m0.13756s, 2m59.799999s, 337.561ms
Latencies     [mean, 50, 95, 99, max]  72.006704ms, 38.065ms, 342.887ms, 472.467ms, 765.657ms
Bytes In      [total, mean]            130300948, 144778.83
Bytes Out     [total, mean]            0, 0.00
Success       [ratio]                  100.00%
Status Codes  [code:count]             200:900
Error Set:
---------- Graphite (Python) Implementation ----------
Requests      [total, rate]            900, 5.01
Duration      [total, attack, wait]    3m6.337648s, 2m59.799999s, 6.537649s
Latencies     [mean, 50, 95, 99, max]  797.282367ms, 167.489ms, 4.789024s, 6.756318s, 8.006429s
Bytes In      [total, mean]            144407224, 160452.47
Bytes Out     [total, mean]            0, 0.00
Success       [ratio]                  100.00%
Status Codes  [code:count]             200:900
Error Set:

On average, the native implementation was 11x faster; the median was 4x faster, p95 and p99 were each 14x faster, and max was over 10x faster.
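
The ratios quoted here follow directly from the two latency lines. A minimal sketch of the arithmetic, using only the figures from the reports (the helper name `speedup` is made up for illustration):

```go
package main

import (
	"fmt"
	"time"
)

// speedup reports how many times faster the native run was than the
// Graphite run for a single latency figure.
func speedup(graphite, native time.Duration) float64 {
	return float64(graphite) / float64(native)
}

func main() {
	// Latency figures copied from the two reports above.
	rows := []struct {
		name             string
		graphite, native string
	}{
		{"mean", "797.282367ms", "72.006704ms"},
		{"p50", "167.489ms", "38.065ms"},
		{"p95", "4.789024s", "342.887ms"},
		{"p99", "6.756318s", "472.467ms"},
		{"max", "8.006429s", "765.657ms"},
	}
	for _, r := range rows {
		g, err := time.ParseDuration(r.graphite)
		if err != nil {
			panic(err)
		}
		n, err := time.ParseDuration(r.native)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%-4s %.1fx\n", r.name, speedup(g, n))
	}
}
```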

shanson7

Would it be possible to add to the tests that this function doesn't modify the inputs?

shanson7 · 2018-07-27T14:36:12Z

expr/func_aspercent.go

+		if len(totals) == 1 {
+			totalsSerie = totals[0]
+		} else if len(totals) == len(series) {
+			sort.Slice(series, func(i, j int) bool {


Maybe add a comment about why these are sorted.

shanson7 · 2018-07-27T14:38:40Z

expr/func_aspercent.go

+		return nil, err
+	}
+
+	var outSeries []models.Series


Could you do a

defer cache[Req{}] = append(cache[Req{}], outSeries...)

here instead of all the individual cache steps?

very nice, I like it

defer args are evaluated at call time, so this won't work as expected.

edit: looks like i'm wrong: https://play.golang.org/p/t8RZ1v8fLXT

You forgot to call foo() in your snippet.
see https://play.golang.org/p/Qh4rp1p2q73
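
For readers following the defer discussion: the arguments of a deferred call are evaluated when the `defer` statement executes, while a deferred closure reads its captured variables only when it finally runs. A small self-contained sketch of the difference (the `record` helper and both function names are invented for this example and are not from the PR):

```go
package main

import "fmt"

// record appends items to *dst; it stands in for "store the result in the cache".
func record(dst *[]int, items []int) {
	*dst = append(*dst, items...)
}

// deferDirect defers record with out as an argument. The argument is
// evaluated when the defer statement runs, while out is still empty.
func deferDirect() []int {
	var cached, out []int
	func() {
		defer record(&cached, out)
		out = append(out, 1, 2, 3)
	}()
	return cached
}

// deferClosure defers a closure instead. The closure reads out only when
// it runs, after the surrounding function has filled it.
func deferClosure() []int {
	var cached, out []int
	func() {
		defer func() { record(&cached, out) }()
		out = append(out, 1, 2, 3)
	}()
	return cached
}

func main() {
	fmt.Println(len(deferDirect()), len(deferClosure())) // prints "0 3"
}
```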

shanson7 · 2018-07-27T15:01:22Z

expr/func_aspercent.go

+					serie1.Target = fmt.Sprintf("asPercent(%s,%s)", serie1.Target, serie2.Target)
+					serie1.Tags = map[string]string{"name": serie1.Target}
+					for i := range serie1.Datapoints {
+						serie1.Datapoints[i].Val = computeAsPercent(serie1.Datapoints[i].Val, serie2.Datapoints[i].Val)


I think this violates https://github.com/grafana/metrictank/blob/master/expr/NOTES#L35

shanson7 · 2018-07-27T15:02:18Z

expr/func_aspercent.go

+				serie1.Target = fmt.Sprintf("asPercent(%s,%s)", serie1.Target, serie2.Target)
+				serie1.Tags = map[string]string{"name": serie1.Target}
+				for i := range serie1.Datapoints {
+					serie1.Datapoints[i].Val = computeAsPercent(serie1.Datapoints[i].Val, serie2.Datapoints[i].Val)


I think this violates https://github.com/grafana/metrictank/blob/master/expr/NOTES#L35

shanson7 · 2018-07-27T15:02:39Z

expr/func_aspercent.go

+			} else {
+				totalVal = s.totalFloat
+			}
+			serie.Datapoints[i].Val = computeAsPercent(serie.Datapoints[i].Val, totalVal)


I think this violates https://github.com/grafana/metrictank/blob/master/expr/NOTES#L35

fixed all and added a test
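
The rule being enforced in these three comments is that a function must not write into the series it received. A rough sketch of the idea, with simplified stand-in types: `Point`, `Series`, `percentOf` and `asPercentOfTotal` are assumptions made for this example and are not the PR's actual code.

```go
package main

import (
	"fmt"
	"math"
)

// Point and Series are simplified stand-ins for the real types.
type Point struct {
	Val float64
	Ts  uint32
}

type Series struct {
	Target     string
	Tags       map[string]string
	Datapoints []Point
}

// percentOf returns in/total*100, or NaN when the result is undefined.
func percentOf(in, total float64) float64 {
	if math.IsNaN(in) || math.IsNaN(total) || total == 0 {
		return math.NaN()
	}
	return in / total * 100
}

// asPercentOfTotal builds a new series instead of writing into in.
func asPercentOfTotal(in Series, total float64) Series {
	out := Series{
		Target:     fmt.Sprintf("asPercent(%s,%g)", in.Target, total),
		Datapoints: make([]Point, 0, len(in.Datapoints)),
	}
	out.Tags = map[string]string{"name": out.Target}
	for _, p := range in.Datapoints {
		out.Datapoints = append(out.Datapoints, Point{Val: percentOf(p.Val, total), Ts: p.Ts})
	}
	return out
}

func main() {
	in := Series{Target: "a", Datapoints: []Point{{Val: 1, Ts: 10}, {Val: 3, Ts: 20}}}
	out := asPercentOfTotal(in, 4)
	fmt.Println(in.Datapoints, out.Datapoints, out.Target)
}
```

A test for "inputs are not modified" can then snapshot the input before the call and compare it afterwards.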

shanson7 · 2018-07-27T16:11:10Z

expr/func_aspercent_test.go

+	// Test if original series was modified
+	for i, orig := range originalSeries {
+		inSerie := in[i]
+		if orig.Target != inSerie.Target {


Could you use something like

metrictank/expr/plan_test.go

Line 186 in 0428430

if !reflect.DeepEqual(err, c.expErr) {

here?

No, because the original series does get modified by metrictank (a "name" tag gets added). Makes sense to compare relevant values only.

Scratch that. the real problem was that reflect.DeepEqual(math.NaN(), math.NaN()) == false, which is not the case when comparing series

shanson7 · 2018-07-27T16:12:19Z

expr/func_aspercent_test.go

+		originalSeries[i].Interval = serie.Interval
+		originalSeries[i].QueryPatt = serie.QueryPatt
+		originalSeries[i].Target = serie.Target
+		originalSeries[i].Datapoints = getCopy(serie.Datapoints)


maybe just

originalSeries[i] = serie
originalSeries[i].Datapoints = getCopy(serie.Datapoints)

Gets all the things like tags etc.

shanson7 · 2018-07-27T16:14:52Z

expr/func_aspercent.go

 			totalsSerie = totals[0]
 		} else if len(totals) == len(series) {
+			// Sorted to match the input series with the total series based on Target.
+			// Mimicks Graphite's implementation


Nit, but 'Mimics' is the more common modern spelling

shanson7 · 2018-07-27T16:16:42Z

expr/func_aspercent.go

+
+func copyDatapoints(serie *models.Series) {
+	out := pointSlicePool.Get().([]schema.Point)
+	for _, p := range serie.Datapoints {


out = append(out, serie.Datapoints...)
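
A sketch of what the suggested one-line copy could look like when the destination slice comes from a pool. The `Point` type, `pointPool` and `copyPoints` are stand-ins invented for this example; only the `append(out, in...)` idiom is taken from the comment.

```go
package main

import (
	"fmt"
	"sync"
)

type Point struct {
	Val float64
	Ts  uint32
}

// pointPool is a stand-in for the pool of point slices used in the diff.
var pointPool = sync.Pool{
	New: func() interface{} { return make([]Point, 0, 8) },
}

// copyPoints returns a copy of in backed by a slice taken from the pool.
func copyPoints(in []Point) []Point {
	out := pointPool.Get().([]Point)
	// Reset the length in case a recycled slice still has one, then copy in one call.
	out = append(out[:0], in...)
	return out
}

func main() {
	in := []Point{{Val: 1, Ts: 10}, {Val: 2, Ts: 20}}
	out := copyPoints(in)
	out[0].Val = 99 // does not touch in
	fmt.Println(in[0].Val, out[0].Val)
	pointPool.Put(out[:0]) // hand the slice back once we are done with it
}
```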

stivenbb · 2018-07-30T16:57:37Z

Note: this requires Go version 1.10+ because it uses math.Round()

@shanson7 any outstanding comments?

stivenbb · 2018-08-07T15:38:31Z

Any chance this can get merged to master?


Dieterbe · 2018-08-07T18:44:11Z

do you plan to make more changes or do you consider this complete?
also, it's kind of hard to see which code came from my old PR. is it possible to clarify the situation by reusing some of those commits? ideally, perhaps, this PR could resume where the other branch left off.
but if all of this is too much work, then never mind.

stivenbb · 2018-08-07T18:51:27Z

This is good to go. I just rebased it to make the merge cleaner.

And I didn't use much, just added an additional argument type (that you originally made). It would be a lot of work to reuse the old commits with little to no payoff imo, since I rewrote most of it.

Dieterbe · 2018-08-07T19:13:04Z

expr/plan.go

+	for _, argExp = range argsExp {
+		if pos >= len(e.args) {
+			break // no more args specified. we're done.
+		}


the removal of cutoff here: is this safe?

Yes. Before, it assumed that series arguments could only be non-optional arguments at the beginning, i.e. function(serie, int, string, opt=string).

With my change the following is possible:
function(serie, int, serie, opt=serie)

Like before, this part only consumes series, and just skips over everything else.

looks good :) the argument consumption/iterating code is probably my least favorite code of MT. FWIW.
if you can handle this, you can handle anything else 👯‍♂️

Dieterbe · 2018-08-07T19:42:55Z

expr/func_aspercent.go

+			totalSeriesLists := groupSeriesByKey(totals, s.nodes, &keys)
+			totalSeries = getTotalSeries(totalSeriesLists)
+		} else {
+			return nil, errors.New("total must be None or a seriesList")


seems like it's fairly easy to trigger this by specifying a serieslist pattern that doesn't match any series.
maybe in that case we can provide a better error message? (basically, above at if len(totals) == 0 {, maybe directly return an error?) what does graphite do in this case?

I'm not sure what you mean here. This error message gets triggered if they pass in a number. In the case where they pass in a nodes argument, only a series or nothing should be passed in for the total argument. The first branch is triggered if neither is passed in. The second branch is triggered if a series is passed in. That leaves the case where a number is passed in (i.e. math.IsNaN(s.totalFloat) == false).

Graphite throws that exact error message in that case.

if you specify a serieslist pattern that doesn't match any series, then I believe this will happen:

if s.totalSeries != nil {
	totals, err = s.totalSeries.Exec(cache)
	if err != nil {
		return nil, err
	}
	if len(totals) == 0 {
		totals = nil // <---- this right here
	}
}

FWIW, Graphite crashes if you do that.

Let me see if I can handle it in some clean way

Ok, so this was just unnecessary:

if len(totals) == 0 { totals = nil }

So I removed it. Now if a series returns empty, it does not assume that there were no arguments
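
The distinction this fix relies on is that a nil slice and an empty, non-nil slice can carry different meanings. A minimal sketch, assuming "no total argument" is represented by nil and "pattern matched nothing" by an empty slice (`describeTotal` is a made-up helper, and whether Exec returns exactly these values is an assumption here):

```go
package main

import "fmt"

// describeTotal reports how a totals argument would be interpreted when
// "not passed" is represented by nil and "matched nothing" by an empty slice.
func describeTotal(totals []string) string {
	switch {
	case totals == nil:
		return "no total argument"
	case len(totals) == 0:
		return "total pattern matched no series"
	default:
		return fmt.Sprintf("%d total series", len(totals))
	}
}

func main() {
	var notPassed []string
	noMatch := []string{}
	fmt.Println(describeTotal(notPassed))
	fmt.Println(describeTotal(noMatch))
	fmt.Println(describeTotal([]string{"a", "b"}))
}
```

Collapsing the empty case into nil, as the removed lines did, makes the two situations indistinguishable further down.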

Dieterbe · 2018-08-07T19:44:34Z

expr/func_aspercent.go

+	return context
+}
+
+func (s *FuncAsPercent) Exec(cache map[Req][]models.Series) ([]models.Series, error) {


this function is rather complex. can we split it up in smaller functions?

Dieterbe · 2018-08-07T20:29:53Z

expr/func_aspercent.go

+	return outSeries, err
+}
+
+func (s *FuncAsPercent) execWithNodes(series []models.Series, totals []models.Series) ([]models.Series, error) {


series, totals []models.Series. also the other function

Dieterbe · 2018-08-07T20:33:54Z

expr/func_aspercent.go

+	}
+}
+
+func deepCopySerieElements(serie *models.Series) {


this API is a bit strange. would it not make more sense to return a new series that is a deep copy?
even though that is slightly more work, seems worth it.

in fact, maybe add a Copy() method to the serie type in the models package

also looking at sumSeries, the copying/newly-allocating seems a bit too eager.
we should only need to allocate point slices:

1. for the totals slice: when we need to sum up values and we need a place to store the totals (not when totals results in a single series, then we can just read from it)
2. for each output series (if it is different from the input slice)

The problem with #1 is that later I modify the total series in some cases such as this (line 99):

// No series for total series
if _, ok := metaSeries[key]; !ok {
	serie2 := totalSeries[key]
	serie2.QueryPatt = fmt.Sprintf("asPercent(MISSING,%s)", serie2.QueryPatt)
	serie2.Target = fmt.Sprintf("asPercent(MISSING,%s)", serie2.Target)
	serie2.Tags = map[string]string{"name": serie2.Target}
	for i := range serie2.Datapoints {
		serie2.Datapoints[i].Val = math.NaN()
	}
	outSeries = append(outSeries, serie2)
	continue
}

I guess I could make a copy right there, not sure which one would be better.

I'll be mostly afk until Thursday.
Please read expr/NOTES carefully if you have not already done so. All this complexity is to make sure we never overwrite data in the memory AggMetric, chunk cache etc.
Some of these fields are by value so harmless but the tags map is interesting as well as it might open an avenue to modify the tags in the MemoryIdx. Not sure if we currently take that into account everywhere or maybe we already have a provision for that. On phone so can't check right now

Ok, well I added a Series.Copy(emptyDatapoints []schema.Point) function (the argument is there so that pointSlicePool can be used if needed). And I only copy the series if I modify values that shouldn't be modified.

Hopefully that's sufficient and what you were looking for 😃
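
For illustration, a deep-copy method along the lines described could look roughly like this. The `Series` and `Point` types are simplified stand-ins, and the body is a guess at the shape rather than the code that was merged; only the signature idea (passing in an empty point slice so the caller can supply one from a pool) comes from the comment.

```go
package main

import "fmt"

type Point struct {
	Val float64
	Ts  uint32
}

type Series struct {
	Target     string
	QueryPatt  string
	Interval   uint32
	Tags       map[string]string
	Datapoints []Point
}

// Copy returns a deep copy of s. The points are appended to emptyDatapoints,
// so the caller decides where the backing array comes from (for example a pool).
func (s Series) Copy(emptyDatapoints []Point) Series {
	out := s // copies the by-value fields
	out.Datapoints = append(emptyDatapoints[:0], s.Datapoints...)
	// The tags map is a reference type, so it needs its own copy too.
	out.Tags = make(map[string]string, len(s.Tags))
	for k, v := range s.Tags {
		out.Tags[k] = v
	}
	return out
}

func main() {
	orig := Series{
		Target:     "a",
		Tags:       map[string]string{"name": "a"},
		Datapoints: []Point{{Val: 1, Ts: 10}},
	}
	c := orig.Copy(make([]Point, 0, len(orig.Datapoints)))
	c.Tags["name"] = "asPercent(a)"
	c.Datapoints[0].Val = 50
	fmt.Println(orig.Tags["name"], orig.Datapoints[0].Val)
}
```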

Dieterbe · 2018-08-07T20:45:47Z

some unit tests for ArgIn would be nice.
with some luck you can just cherry pick 47ff80a

Dieterbe · 2018-08-10T11:32:08Z

expr/func_aspercent_test.go

+		{Val: float64(199) * 100, Ts: 30},
+		{Val: float64(29) / 2 * 100, Ts: 40},
+		{Val: float64(80) / 3.0 * 100, Ts: 50},
+		{Val: float64(250) / 4 * 100, Ts: 60},


shouldn't out2 be the same as in the previous function? because both cases have asPercent(d,a)

No, because when totals is a seriesList with len(totals) == len(series), the series first get sorted by tag before getting matched. So, in this test case it would be asPercent(d,c) and asPercent(b,a). This behavior is same in Graphite
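
A small sketch of the sort-then-pair behaviour described in this reply, using plain strings in place of series targets. The function name `pairByTarget` and the example inputs are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// pairByTarget sorts both lists by target and pairs them positionally.
// It sorts copies so the caller's slices keep their order, and returns
// nil when the lists differ in length.
func pairByTarget(series, totals []string) [][2]string {
	if len(series) != len(totals) {
		return nil
	}
	s := append([]string(nil), series...)
	t := append([]string(nil), totals...)
	sort.Strings(s)
	sort.Strings(t)
	pairs := make([][2]string, 0, len(s))
	for i := range s {
		pairs = append(pairs, [2]string{s[i], t[i]})
	}
	return pairs
}

func main() {
	// Assumed inputs: series d and b, totals a and c.
	for _, p := range pairByTarget([]string{"d", "b"}, []string{"a", "c"}) {
		fmt.Printf("asPercent(%s,%s)\n", p[0], p[1])
	}
}
```

With these inputs the pairs come out as (b,a) and (d,c), regardless of the order in which the series were passed in.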

Dieterbe · 2018-08-10T13:45:46Z

expr/func_aspercent.go

+				totalVal = totalsSerie.Datapoints[i].Val
+			} else {
+				totalVal = s.totalFloat
+			}


isn't totalFloat pretty much guaranteed to be NaN here?

No, because totalFloat is still a valid option here. (the else part of the if statement right before this)

Dieterbe · 2018-08-10T13:47:17Z

expr/func_aspercent.go

+		for i := range serie.Datapoints {
+			var totalVal float64
+			if len(totalsSerie.Datapoints) > i {
+				totalVal = totalsSerie.Datapoints[i].Val


we may want to always just assume this branch is taken and just write this code line directly without the if/else.
that way if we ever have a bug where len(totalsSerie.Datapoints) != len(serie.Datapoints) we can troubleshoot it instead of trying to hide such an error case and returning incorrect data.

this is more of a check of whether there is a totalSeries at all. I'll change it to > 0, that way we get an index out of range panic if there's a bug.

Dieterbe · 2018-08-10T13:49:13Z

expr/func_aspercent.go

+			sort.Slice(totals, func(i, j int) bool {
+				return totals[i].Target < totals[j].Target
+			})
+			for i, serie1 := range series {


lots of duplication here wrt the similar code block further down.
can't we do all the if/else stuff here to just set up the right totals and series variables,
and then do the same processing at the end irrespective of which scenario it was?

Dieterbe · 2018-08-10T16:16:00Z

expr/func_aspercent.go

+	} else if totals != nil {
+		totalSeriesLists := groupSeriesByKey(totals, s.nodes, &keys)
+		totalSeries = getTotalSeries(totalSeriesLists)
+	}


we compute totals even for keys we won't need (e.g. keys not in the input series)

for each missing case we repeatedly get a slice and fill it with NaNs. we could reuse the same slice in this case. my suggestion would be to declare a var nones []schema.Point. then whenever you need it, if it's nil, instantiate it. if it's not, just reuse it.

I don't think reusing a nones slice would play nicely with the pointSlicePool, would it?

oh right, because at the end the same slice would get added to the pool multiple times, which is bad. so never mind that then.

On phone now
best would be to just add the slice to the cache thing once I think (that way we can reuse them)
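
To make the double-recycling concern concrete: if one slice is returned to a pool twice, two later callers can receive the same backing array and overwrite each other's data. The sketch below uses a tiny hand-written LIFO free list (`freeList`, invented for this example) so the effect is deterministic; sync.Pool makes no ordering guarantees, so it is deliberately not used here.

```go
package main

import "fmt"

// freeList is a tiny LIFO pool used only for this illustration.
type freeList struct{ items [][]float64 }

func (f *freeList) Put(s []float64) { f.items = append(f.items, s[:0]) }

func (f *freeList) Get() []float64 {
	if n := len(f.items); n > 0 {
		s := f.items[n-1]
		f.items = f.items[:n-1]
		return s
	}
	return make([]float64, 0, 4)
}

// doublePutAliases returns true when two later Gets hand out the same backing array.
func doublePutAliases() bool {
	var pool freeList
	nones := pool.Get()
	nones = append(nones, 0, 0)
	// The same slice was shared by two output series, and each one recycles it.
	pool.Put(nones)
	pool.Put(nones)

	a := append(pool.Get(), 1, 1)
	b := append(pool.Get(), 2, 2)
	// a was silently overwritten through b.
	return a[0] == 2 && b[0] == 2
}

func main() {
	fmt.Println(doublePutAliases()) // prints "true"
}
```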

Dieterbe · 2018-08-10T16:17:09Z

hi @stivenbb my first pass of review comments is now done :p let me know when you've addressed everything or if you have any questions.


stivenbb · 2018-08-13T15:34:42Z

@Dieterbe ok, I think I've addressed all the requested changes. Let me know if I missed something

Dieterbe · 2018-08-14T10:45:50Z

what is it that requires the new go version?

shanson7 · 2018-08-14T10:53:00Z

#966 (comment)

math.Round

Dieterbe · 2018-08-14T14:06:43Z

I think the last things to do here are:

* I added 3 commits to https://github.com/grafana/metrictank/tree/asPercent2 , please review them and merge / cherry-pick them into your branch if they look good to you.
* in execWithNodes we still compute totals for all aggKeys, even for keys not found in the metaSeries. this is needless extra work

* easier to follow code * consistency with execWitNodes * bugfix: pool-obtained sumseries should be recycled later

stivenbb · 2018-08-14T14:11:39Z

Ok, just cherry picked. Will work on not computing totals when unnecessary.

Dieterbe · 2018-08-14T14:12:55Z

oh i realized my code nones := pointSlicePool.Get().([]schema.Point) is wrong, should be = to prevent scoping issues I think.
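
The scoping issue mentioned here is the usual `:=` shadowing trap: inside a nested block, `:=` declares a new variable and leaves the outer one untouched. A minimal sketch with invented function names:

```go
package main

import "fmt"

// shadowed uses := inside the block, which declares a new inner variable
// and leaves the outer one nil.
func shadowed() []int {
	var nones []int
	if nones == nil {
		nones := make([]int, 3)
		_ = nones // the inner variable goes out of scope at the closing brace
	}
	return nones
}

// assigned uses = so the outer variable is the one that gets set.
func assigned() []int {
	var nones []int
	if nones == nil {
		nones = make([]int, 3)
	}
	return nones
}

func main() {
	fmt.Println(len(shadowed()), len(assigned())) // prints "0 3"
}
```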

stivenbb · 2018-08-14T14:18:08Z

cassandra test seems to be failing after I cherry-picked... Do you know what that could be about?

Dieterbe · 2018-08-14T14:22:44Z

looks like a flaky test. rerunning tests should fix it. but circleCI is not letting me. can you retrigger on https://circleci.com/workflow-run/ba5e3453-48a0-4731-9c34-22381695eb63 ? if not, your next push will.

stivenbb · 2018-08-14T15:09:48Z

Ok, should be good to go now. LMK if my change looks good.

Dieterbe · 2018-08-14T15:29:07Z

great work @stivenbb, thank you very much for your work on this.

stivenbb changed the title ~~Add isPercent function~~ Add asPercent function Jul 26, 2018

stivenbb mentioned this pull request Jul 26, 2018

WIP: asPercent #672

Closed

shanson7 suggested changes Jul 27, 2018

shanson7 reviewed Jul 27, 2018

shanson7 approved these changes Aug 1, 2018

stivenbb added 12 commits August 7, 2018 14:38

wip: Initial asPercent function. works with no arguments

58f3c4b

added arg that accepts different types of args

e32031f

Bug fix: make ArgIn work properly with series AND allow series to be …

c434121

…an optional argument

fixed nodes parameter and added tests

cf3b879

more tests

c01bb11

modified readme

79aeebe

made requested changes

6d8ae12

changed comment

1a17f36

removed unnecessary _

2b147d2

sort output series in tests

8ff0653

ensure that series don't get modified and added tests

653c0ba

copy tags as well

79d81bd

stivenbb force-pushed the asPercent branch from 2d03fbc to 79d81bd Compare August 7, 2018 18:39

Dieterbe reviewed Aug 7, 2018

split up Exec

f5ed45b

Dieterbe reviewed Aug 7, 2018

unit tests for ArgIn (based on asPercent)

d690235

Dieterbe reviewed Aug 10, 2018

Dieterbe and others added 3 commits August 13, 2018 10:25

it shouldn't be up to caller of consumeKwarg to provide half-consumed…

9171b99

… arg

alphabetical ordering of cases

bd84f57

PR changes

b476724

stivenbb added 2 commits August 13, 2018 11:35

format

d350e37

refactored tests and merged into one file

c480969

Dieterbe added 3 commits August 14, 2018 10:10

move pool recycling close to pool getting + bugfix

172dccb

* easier to follow code * consistency with execWitNodes * bugfix: pool-obtained sumseries should be recycled later

can use pool to source none slice

b6abd38

better comments

54afde0

fix scoping issue

22870b9

don't sum unneeded totals

2b34f46

Merge branch 'master' into asPercent

7555e2b

Dieterbe approved these changes Aug 14, 2018

Dieterbe merged commit 6933255 into grafana:master Aug 14, 2018

shanson7 deleted the asPercent branch August 21, 2018 20:33
