Skip to content

Dogstatsd mixer metrics adapter#3368

Merged
istio-merge-robot merged 5 commits intoistio:masterfrom
awwithro:dogstatsd
Feb 17, 2018
Merged

Dogstatsd mixer metrics adapter#3368
istio-merge-robot merged 5 commits intoistio:masterfrom
awwithro:dogstatsd

Conversation

@awwithro
Copy link
Copy Markdown
Contributor

@awwithro awwithro commented Feb 9, 2018

This PR adds a datadog adapter that sends metrics using the dogstatsd flavor of statsd. I modeled this largely using the statsd and prometheus adapters since dogstatsd "tags" are similar to prometheus "labels" applied to statsd metric names.

I have this adapted currently running in a staging environment of mine and can confirm it is working as intended. I've only implemented statsd type metrics with tags from dimensions. There is also the potential to emit checks/events to datadog using istio telemetry but that's out of scope for what I wanted to build for an initial adapter.

Please let me know how this looks or if I've missed anything important.

@awwithro awwithro requested a review from a team February 9, 2018 19:08
@googlebot
Copy link
Copy Markdown
Collaborator

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed, please reply here (e.g. I signed it!) and we'll verify. Thanks.


  • If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address on your commit. Check your existing CLA data and verify that your email is set on your git commits.
  • If your company signed a CLA, they designated a Point of Contact who decides which employees are authorized to participate. You may need to contact the Point of Contact for your company and ask to be added to the group of authorized contributors. If you don't know who your Point of Contact is, direct the project maintainer to go/cla#troubleshoot. The email used to register you as an authorized contributor must be the email used for the Git commit.
  • In order to pass this check, please resolve this problem and have the pull request author add another comment and the bot will run again. If the bot doesn't comment, it means it doesn't think anything has changed.

@googlebot googlebot added the cla: no Set by the Google CLA bot to indicate the author of a PR has not signed the Google CLA. label Feb 9, 2018
@istio-testing
Copy link
Copy Markdown
Collaborator

Hi @awwithro. Thanks for your PR.

I'm waiting for a istio member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@awwithro
Copy link
Copy Markdown
Contributor Author

awwithro commented Feb 9, 2018

I signed it!

@googlebot
Copy link
Copy Markdown
Collaborator

CLAs look good, thanks!

@googlebot googlebot added cla: yes and removed cla: no Set by the Google CLA bot to indicate the author of a PR has not signed the Google CLA. labels Feb 9, 2018
@awwithro awwithro requested a review from a team February 9, 2018 19:26
@istio-merge-robot
Copy link
Copy Markdown

@awwithro PR needs rebase

@istio-merge-robot istio-merge-robot added needs-rebase Indicates a PR needs to be rebased before being merged and removed needs-rebase Indicates a PR needs to be rebased before being merged labels Feb 9, 2018
@douglas-reid
Copy link
Copy Markdown
Contributor

/ok-to-test

Copy link
Copy Markdown
Contributor

@douglas-reid douglas-reid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome. Thanks for contributing!

I had a few initial comments.

Gopkg.lock Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like a merge issue.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we reuse the "default" metrics already defined instead of defining new ones (for example: requestcount.metric) ? These look nearly identical.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add some documentation here? This will form the basis of the website docs, so it would be good to have thorough docs here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah didn't realize that. I'll expand on the comments.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: suggest naming this UNKNOWN_TYPE, per https://cloud.google.com/apis/design/naming_convention#enum_names

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why duplicate this locally?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd seen it in another adapter and figured it was a test fixture of some sort, I'll clean it up

Copy link
Copy Markdown
Contributor

@guptasu guptasu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @awwithro for this adapter. Overall looks great. I have a few comments on validation of the config and general comment about more documentation.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding an example for how to write equivalent yaml config.
example https://github.com/istio/istio/blob/master/mixer/adapter/solarwinds/config/config.proto#L34

This comment will automatically show up on istio documentation page https://istio.io/docs/reference/config/adapters/solarwinds.html

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a default value for this ? Consider adding what is the default for any field in this config, if there is one. Also, consider mentioning, which is required and which is optional

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious: the values for these tags are constant literals right ?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason this cannot be same as instance_name ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find its useful to override the istio metric name to remove things like the "istio-system" namespace or to match a different naming standard for metrics in datadog

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider commenting on what each of these mean

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly this validation of datatype of value can be done during Validate call. The SetMetricType gives exactly what is the datatype of the value field.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider adding more context, what types are allowed for what kind of metrics.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is default, please add to the comments of adapter config.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a real world example for global_tags user would want to add .

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider adding an example for per metric tag in adapter config.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need a copyright banner.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add Istio documentation markers. Look in other adapters for the $location, $title, and $overview markers. See the Istio developer's guide for info on writing docs for protos.

Thanks.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please capitalize Mixer here and elsewhere.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prefix -> Prefix

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain in the comment why one would alter this value? Does Datadog need corresponding config updates when twiddling this value?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took another look at the client and it turns out this buffer is totally different than what I thought it was. I thought it was for adjusting the UDP packet size to fill before sending metrics (similar to the non-datadog statsd client) but instead its a buffer of individual metrics regardless of size. The buffer is flushed every 100ms regardless of size and can result in packet drops if the buffer is large.

Given this I think a default of non-buffering makes more sense but I'm torn on leaving this as a knob for tuning (with a warning) given that it could lead to undesired behavior. Thoughts?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be preferable to generate when this happens. Otherwise, the operator could easily have data being dumped in the toilet and not know why. So just generate errors when things don't line up.

You can discover exactly what you will be handed at runtime during validation. So you can report errors to the user there, before receiving traffic.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need a copyright banner.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for this test, the Validate call took care of it.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty expensive. Could this be done by calling .String on the object instead?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd tried a type assertion initially but since not all dimensions are strings it panicked. I figured that the fmt package would be better than switch on each dimension type

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sprintf is very expensive relatively speaking. The SolarWinds adapter's metric_handler.go has a fast path for common data which would be useful here.

@guptasu Sunny, Seems like we need to add a utility to pkg/adapter that returns a stringified version of all supported value types we support. We could scour all adapters and use as needed. The memquota adapter also needs to compare value types in a generic way, so maybe we need that too. WDYT?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are talking about matchDimension in memquota ?

@geeknoid, I agree, we should do everything possible to make lives easy for adapters + performant. This is a good example. I will send a PR soon, hopefully it can be used for this adapter.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the redis quota adapter has similar logic. Both also contain a makeKey function which converts dimensions into strings which would benefit from a generic function for each value type.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fwiw, the prometheus adapter has the same logic here as well:

labels[i] = fmt.Sprintf("%v", label)

Copy link
Copy Markdown
Contributor

@guptasu guptasu Feb 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update: #3391

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll wait for that commit to be merged to incorporate

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ensure this can't happen by validating the input config up front.

@istio-merge-robot
Copy link
Copy Markdown

@awwithro PR needs rebase

@istio-merge-robot istio-merge-robot added the needs-rebase Indicates a PR needs to be rebased before being merged label Feb 9, 2018
@awwithro
Copy link
Copy Markdown
Contributor Author

PTAL, I've incorporated changes around docs/validation config. I think that covers the primary asks other than the string formatting which I can add when the stringifier is merged

Copy link
Copy Markdown
Contributor

@geeknoid geeknoid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good. Just a few comments.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can just combine with next line as return nil, env.Logger.Errorf(...)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

byt -> by

This check is already done in Validate, right?

The Mixer contract is that it won't call Build unless Validate succeeds.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Validate checks that the adapterConfig metrics exist as a valid metric type while this is looking at the opposite. I was going to add this to validate but 1. didn't think it necessitates failing validation in this case and 2. validate doesn't have access to the env.Logger to generate a warning (unless I've missed something).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, the Validate function doesn't have access to an env logger. All validation errors should be returned in the error.

The idea with Validate is that anything wrong in the config entered by the operator should be reported there. This will prevent bad config from entering the system at all. By the time Build runs, the only errors we should be producing are the transient variety around network connectivity or credentials. Things that are strictly related to operator errors should be in Validate.

In general, we want to encourage config to be well-formed and strongly correct in order to eliminate ambiguities and issues with future upgradability. So we want to be aggressive about flagging errors.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stepping back a bit. When SetMetricsType gets called on the builder, are the metrics that are passed in all of the metric templates that the mixer knows about or just the metrics that have rules that map them to the adapter? My first thought was that it was all metrics and it made sense to me to log and continue in that case.

If SetMetricsType is only passing specific metrics to the handler and the handler doesn't have the corresponding config then, yes a validation error makes sense there.

@geeknoid
Copy link
Copy Markdown
Contributor

geeknoid commented Feb 13, 2018 via email

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

byt -> by

@geeknoid
Copy link
Copy Markdown
Contributor

/lgtm

Copy link
Copy Markdown
Contributor

@guptasu guptasu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments. Almost LGTM

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

by the

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@awwithro you should now be able to replace this code with the new code from pkg/adapter.Stringify

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do. For purposes of rebasing/pushing is the PR "good enough" to force push and lose comment history?

@awwithro awwithro requested a review from a team February 16, 2018 21:12
@googlebot
Copy link
Copy Markdown
Collaborator

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for the commit author(s). If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and have the pull request author add another comment and the bot will run again. If the bot doesn't comment, it means it doesn't think anything has changed.

@googlebot googlebot added cla: no Set by the Google CLA bot to indicate the author of a PR has not signed the Google CLA. and removed cla: yes labels Feb 16, 2018
@istio-merge-robot
Copy link
Copy Markdown

/lgtm cancel //PR changed after LGTM, removing LGTM. @awwithro @geeknoid @guptasu

Alex Withrow added 5 commits February 16, 2018 14:14
PR Feedback
  * Disable buffering by default
  * Validate statsd address
  * Warn on unhandled metrics
  * Update sample/docs

Copy map to avoid a race condition
@googlebot
Copy link
Copy Markdown
Collaborator

CLAs look good, thanks!

@googlebot googlebot added cla: yes and removed cla: no Set by the Google CLA bot to indicate the author of a PR has not signed the Google CLA. labels Feb 16, 2018
@awwithro
Copy link
Copy Markdown
Contributor Author

@ldemailly PTAL I think this is all set

@ldemailly
Copy link
Copy Markdown
Member

/lgtm

@istio-merge-robot
Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: geeknoid, guptasu, ldemailly

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@ldemailly
Copy link
Copy Markdown
Member

thx for going through the new vendor/ process successfully !
(feedback or updates to the FAQ welcome!)

(my lgtm is only about the vendor/toml - I assume the rest is still same as before/approved by others)

@istio-merge-robot
Copy link
Copy Markdown

/test all [submit-queue is verifying that this PR is safe to merge]

@istio-merge-robot
Copy link
Copy Markdown

Automatic merge from submit-queue.

@istio-merge-robot istio-merge-robot merged commit 448436a into istio:master Feb 17, 2018
@istio-testing
Copy link
Copy Markdown
Collaborator

@awwithro: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
prow/istio-pilot-e2e.sh 6f0f88d link /test istio-pilot-e2e
Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@awwithro
Copy link
Copy Markdown
Contributor Author

Thanks again for the help and getting this added to the code base. Biggest feedback would be to consolidate the vendor faq with the contribution repo. I liked the makefile approach for setting up the submodules and extending that to validate/warn the local dev env would be a good thing.

The other thing is that I used the statsd/prometheus adapters as a reference but it looks like they predate much of code improvement/docs of later adapters. It'd be good to have some open issues to increase their quality. The label:"community/good first issue" would be a good fit for them.

@ldemailly
Copy link
Copy Markdown
Member

there are lint failures on master

mixer/adapter/dogstatsd/dogstatsd.go:150:2:error: should write info := h.metrics[mname] instead of info, _ := h.metrics[mname] (S1005) (megacheck)
mixer/adapter/dogstatsd/dogstatsd.go:51:2:error: this value of client is never used (SA4006) (megacheck)
mixer/adapter/dogstatsd/dogstatsd.go:101:14:error: error return value not checked (ce.Appendf("metricName", "%s is a valid metric but is not configured to be handled by the datadog adapter", mname)) (errcheck)
mixer/adapter/dogstatsd/dogstatsd_test.go:225:12:error: this value of err is never used (SA4006) (megacheck)

@awwithro
Copy link
Copy Markdown
Contributor Author

Hmm, not sure how those slipped in.
#3591

@awwithro awwithro mentioned this pull request Feb 17, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants