Skip to content

Add Tdigest Aggregation Plugin#6887

Closed
PhoenixRion wants to merge 18 commits intoinfluxdata:masterfrom
PhoenixRion:master
Closed

Add Tdigest Aggregation Plugin#6887
PhoenixRion wants to merge 18 commits intoinfluxdata:masterfrom
PhoenixRion:master

Conversation

@PhoenixRion
Copy link
Copy Markdown
Contributor

@PhoenixRion PhoenixRion commented Jan 9, 2020

closes #6440

Required for all PRs:

  • Signed CLA.
  • Associated README.md updated.
  • Has appropriate unit tests.

Copy link
Copy Markdown
Contributor

@danielnelson danielnelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here an initial review. Ultimately my goal is to shrink down the code as much as we can, as this goes a long way on understanding and maintaining the plugin.

/telegraf.exe
/telegraf.gz
/vendor
.idea/
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert this file, place in your global core.excludesfile https://git-scm.com/docs/gitignore/2.22.1

name = "github.com/satori/go.uuid"
revision = "b2ce2384e17bbe0c6d34077efa39dbab3e09123b"

[[constraint]]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We updated master to use Go modules after this PR was opened, kind of surprised this didn't trigger a conflict.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what I would need to change because of this

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will need to either merge or rebase against Telegraf master branch, then these files will be deleted. To add the dependency you just need to run:

go get github.com/influxdata/tdigest@v0.0.1

This should add it to the new go.mod/go.sum files.


```
# Keep the histogram of each metric passing through.
[[aggregators.tdigestagg]]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to tdigest, move files to directory with same name.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This what not initially possible because it conflicted with the tdigest library but that is not longer an issue

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also adjust an imports name if needed:

import (
    foo github.com/influxdata/tdigest
)
var tdigest = foo.TDigest{}

@@ -0,0 +1,172 @@
//Copyright (c) 2019, Groupon, Inc.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove per file copyright notices

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am checking with Legal as the text per file is a current guideline.

ExcludeTags []string `toml:"exclude_tags"`
SourceTagKey string `toml:"source_tag_key"`
AtomReplacementTagKey string `toml:"atom_replacement_tag_key"`
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move the contents of this file into tdigest.go

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would I move this out of the bucketing package when it is only referenced within it?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this referenced from the main plugin struct?

emitting the aggregations and the histogram every `period` seconds.

### Aggregation Concepts
* Tag key ```source``` has special meaning as the primary key of developers destination TSDB
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The measurement aka metric name should be this.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Source is very different from measurement

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems the some of the logic with source and atom is specific to your deployment. For inclusion into the main Telegraf tree I suggest we move these out of this plugin and into separate processor(s) than can check for tags and enforce rules as needed. If the current processors can't handle your case, then we can add new ones to complete the needed functionality.

I'm not exactly sure how you are using the source tag, but keep in mind that it is a somewhat special tag that Telegraf often uses to report the hostname of the system the metrics describe.


### Aggregation Concepts
* Tag key ```source``` has special meaning as the primary key of developers destination TSDB
* Tag key ```atom``` has special meaning to denote the atomic dimension of an aggregation
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this used for?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to denote the atomic dimension of an aggregation

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the units (meters, bytes, etc), the aggregation type (min, max, etc) or something entirely different? Can you add some examples?

@@ -0,0 +1,37 @@
//Copyright (c) 2019, Groupon, Inc.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On all files, rename to use snake_case.go

_time time.Time
}

type ClamAggregation struct {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are going to keep this format in, I'd like to do two things:

  1. Build everything as a single main type, and then do a conversion to you CLAM format at the last moment.
  2. Remove documentation of this format.

If you don't think this will be possible, then we should remove the format. There isn't any way we will be able to maintain the format if it unless it is very simple and self contained.

Comment on lines +57 to +61
_histogram *tdigest.TDigest
_tags map[string]string
_basicName string
_sum float64
_time time.Time
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't very idiomatic for Go, name these without the leading underscore and export the function names in the interface.

@alexrocco
Copy link
Copy Markdown

Any news about this PR? IMO this will be very useful for the project, especially for on-line percentiles.

@sjwang90
Copy link
Copy Markdown
Contributor

@PhoenixRion Thanks for the PR but closing this in preference for #8594 with less code. If there's anything that this PR contains that is missing from #8594 feel free to comment.

@sjwang90 sjwang90 closed this Jan 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Need a more flexable histogram / tdigest

4 participants