Skip to content

feat(grpc status codes): report grpc status codes via mixer#20036

Merged
istio-testing merged 8 commits intoistio:masterfrom
douglas-reid:grpc-for-mixer
Feb 7, 2020
Merged

feat(grpc status codes): report grpc status codes via mixer#20036
istio-testing merged 8 commits intoistio:masterfrom
douglas-reid:grpc-for-mixer

Conversation

@douglas-reid
Copy link
Copy Markdown
Contributor

@douglas-reid douglas-reid commented Jan 9, 2020

This PR implements the changes approved in https://tinyurl.com/yy2cwmv5 to report gRPC status codes in metrics (and provide better alignment with logs and metrics when reporting protocol). It also cleans up unused metrics dimensions.

Other PRs will be required to support the same feature set in v2 telemetry and to update the operator/installer config templates.

Partial fix for #12846

[ ] Configuration Infrastructure
[ ] Docs
[ ] Installation
[ ] Networking
[ ] Performance and Scalability
[ X ] Policies and Telemetry
[ ] Security
[ ] Test and Release
[ ] User Experience
[ ] Developer Infrastructure

@douglas-reid douglas-reid requested a review from a team as a code owner January 9, 2020 17:33
@googlebot googlebot added the cla: yes Set by the Google CLA bot to indicate the author of a PR has signed the Google CLA. label Jan 9, 2020
@istio-testing istio-testing added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Jan 9, 2020
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are these removed? cardinality issue?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For months I've been asking if these are used, and have yet to find anyone that is actually using these dimensions. I thought we were not carrying them forward into v2 (though the code now says otherwise).

@mandarjog objections to removing?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liminw @yangminzhu can we remove these?

Copy link
Copy Markdown

@jshaughn jshaughn Jan 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fwiw, kiali does not use these.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes please remove, @yangminzhu last chance to object.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I missed this PR. It should be fine to remove because the new authorization policy doesn't support the permissive mode but I'm not sure if it's used by anyone in the old RBAC policy for now. cc @liminw

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the status an int or a string? I think it's a string (by mistake)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apparently it is a string (based on the test failures). Is "0" the right default, or "unknown" ?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's set if grpc status header is present, and not set at all otherwise (see src/envoy/http/mixer/report_data.h`)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, but do we want the unset to be represented in metrics as "0" (like we default to "200" for http) or "unknown" ?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think empty is better. It shouldn't really be unknown for grpc ever.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So just to confirm, the decision to have separate response_code and grpc_response_status fields is because even if request_protocol=grpc, response.code is still set?

Also, I think 0 is good for unset, just like the 200. I like a default being the same type as an actual value.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@douglas-reid I know the ship has sailed but perhaps the new field should have been "request_protocol_status" or something like that. And contained the response status for the specified request_protocol. Basically replacing request_code, which has now been relegated to effectively being "http_code". So existing consumer code could still assume http codes in request_code and write new code using the new field, which could support the two existing protocols and if necessary, additional protocols somewhere down the line. And "response_code" could have been deprecated.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jshaughn perhaps -- that's closer to my original desire. I think we can rethink that for 1.6/1.7 when maybe we do an overhaul of the metrics (and they are controlled all via v2). however, at this point, i'd like to get the approved design implemented and into istio/istio first.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@douglas-reid yup, totally agree. And I've been coding kiali the past days in anticipation of the new grpc_request_status field, and to be back compatible.

@istio-testing istio-testing added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jan 13, 2020
@douglas-reid
Copy link
Copy Markdown
Contributor Author

Ping. This now includes updates for the merged manifests/.

@howardjohn
Copy link
Copy Markdown
Member

i am not qualified to review, lmk once a mixer owner approves and I can give it the approval

@jshaughn
Copy link
Copy Markdown

@douglas-reid Will this make it into 1.5?

@douglas-reid
Copy link
Copy Markdown
Contributor Author

@jshaughn i hope so. i'm certainly trying to make it happen.

@douglas-reid
Copy link
Copy Markdown
Contributor Author

@jshaughn the biggest issue is making sure we have equivalent functionality with v2.

@douglas-reid douglas-reid requested a review from a team as a code owner February 6, 2020 21:10
@istio-testing istio-testing added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Feb 6, 2020
Signed-off-by: Douglas Reid <douglas-reid@users.noreply.github.com>
Signed-off-by: Douglas Reid <douglas-reid@users.noreply.github.com>
Signed-off-by: Douglas Reid <douglas-reid@users.noreply.github.com>
Signed-off-by: Douglas Reid <douglas-reid@users.noreply.github.com>
@douglas-reid
Copy link
Copy Markdown
Contributor Author

Gentle ping on a review here. This PR brings the v1 telemetry pipeline in line with changes already made to v2.

Copy link
Copy Markdown
Contributor

@mandarjog mandarjog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is one of the last features added Mixer.

@istio-testing istio-testing merged commit be55403 into istio:master Feb 7, 2020
gargnupur pushed a commit to gargnupur/istio that referenced this pull request Feb 19, 2020
)

* feat(grpc status codes): report grpc status codes via mixer

* response.grpc_status is a string

* force string instead of auto-detect duration

* update manifest files as well

* update to match istio/proxy PR

Signed-off-by: Douglas Reid <douglas-reid@users.noreply.github.com>

* update golden files

Signed-off-by: Douglas Reid <douglas-reid@users.noreply.github.com>

* add generated file

Signed-off-by: Douglas Reid <douglas-reid@users.noreply.github.com>

* running make gen a second time produces new results

Signed-off-by: Douglas Reid <douglas-reid@users.noreply.github.com>
istio-testing pushed a commit that referenced this pull request Feb 19, 2020
* DO NOT MERGE Test PR

* Update proxy SHA

* Debug test by printing  all metrics

* Check if failure repros on server side too

* Add debug logging  in integ tests

* Add envoy debug logging

* feat(grpc status codes): report grpc status codes via mixer (#20036)

* feat(grpc status codes): report grpc status codes via mixer

* response.grpc_status is a string

* force string instead of auto-detect duration

* update manifest files as well

* update to match istio/proxy PR

Signed-off-by: Douglas Reid <douglas-reid@users.noreply.github.com>

* update golden files

Signed-off-by: Douglas Reid <douglas-reid@users.noreply.github.com>

* add generated file

Signed-off-by: Douglas Reid <douglas-reid@users.noreply.github.com>

* running make gen a second time produces new results

Signed-off-by: Douglas Reid <douglas-reid@users.noreply.github.com>

* More debug

* Update Istio Proxy SHA (#20745)

* Add TCP Stats and Metadata Filters for Istio 1.6

* Update Istio Proxy SHA

* Run make gen

* Use Istio generated Metadata Exchange and Stats filters in TCP Stats Test (#20818)

* Add TCP Stats and Metadata Filters for Istio 1.6

* Use Istio generated Metadata Exchange and Stats filters in TCP Stats test

* Run make gen

* Remove debugging steps

* Update to latest SHA

Co-authored-by: Douglas Reid <douglas-reid@users.noreply.github.com>
sdake pushed a commit to sdake/istio that referenced this pull request Feb 21, 2020
)

* feat(grpc status codes): report grpc status codes via mixer

* response.grpc_status is a string

* force string instead of auto-detect duration

* update manifest files as well

* update to match istio/proxy PR

Signed-off-by: Douglas Reid <douglas-reid@users.noreply.github.com>

* update golden files

Signed-off-by: Douglas Reid <douglas-reid@users.noreply.github.com>

* add generated file

Signed-off-by: Douglas Reid <douglas-reid@users.noreply.github.com>

* running make gen a second time produces new results

Signed-off-by: Douglas Reid <douglas-reid@users.noreply.github.com>
@israel-hdez
Copy link
Copy Markdown
Contributor

Perhaps, could you update the Default Metrics page? https://istio.io/docs/reference/config/policy-and-telemetry/metrics/

I also see on that page the sentence They can be found here under the section with “kind: metric” while, that kind: metric looks non existent in the current config.yaml.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla: yes Set by the Google CLA bot to indicate the author of a PR has signed the Google CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants