Skip to content

Initial pilot api az functionality#1815

Merged
rshriram merged 3 commits intoistio:masterfrom
liamawhite:master
Nov 30, 2017
Merged

Initial pilot api az functionality#1815
rshriram merged 3 commits intoistio:masterfrom
liamawhite:master

Conversation

@liamawhite
Copy link
Copy Markdown
Member

@liamawhite liamawhite commented Nov 21, 2017

What this PR does / why we need it: A new API path for pilot, az. It is needed to allow proxy agent to set the availability zone of a proxy using the --availabilityZone flag

Which issue this PR fixes:
Continues on from: istio/old_pilot_repo#1230
Working towards: #1473

Special notes for your reviewer: This is a quick and dirty PR to get feedback on the API path and general approach. The PR that allows Proxy Agent to consume this endpoint will come later.

Release note:

Add /v1/az/{ServiceCluster}/{ServiceNode} endpoint to pilot API for retrieving AZ for a given ServiceNode

@codecov
Copy link
Copy Markdown

codecov bot commented Nov 21, 2017

Codecov Report

Merging #1815 into master will increase coverage by 0.5%.
The diff coverage is 83.33%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master    #1815     +/-   ##
=========================================
+ Coverage   81.14%   81.65%   +0.5%     
=========================================
  Files         187      197     +10     
  Lines       18883    19918   +1035     
=========================================
+ Hits        15323    16264    +941     
- Misses       3127     3211     +84     
- Partials      433      443     +10
Flag Coverage Δ
#broker 44.44% <ø> (-1.07%) ⬇️
#mixer 83.17% <ø> (+0.78%) ⬆️
#pilot 80.31% <83.33%> (-0.09%) ⬇️
#security 90.39% <ø> (ø) ⬆️
Impacted Files Coverage Δ
pilot/proxy/envoy/discovery.go 74.59% <83.33%> (+0.6%) ⬆️
pilot/adapter/config/memory/monitor.go 79.48% <0%> (-10.26%) ⬇️
pilot/platform/kube/controller.go 50.84% <0%> (-1.02%) ⬇️
broker/pkg/version/version.go
mixer/pkg/runtime/resolver.go 93.38% <0%> (ø)
mixer/pkg/runtime/handler.go 89.92% <0%> (ø)
mixer/pkg/runtime/context.go 100% <0%> (ø)
mixer/pkg/runtime/logger.go 100% <0%> (ø)
mixer/pkg/runtime/monitor.go 100% <0%> (ø)
mixer/pkg/runtime/dispatcher.go 96.69% <0%> (ø)
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 24ec6a3...d44f70e. Read the comment docs.

@liamawhite
Copy link
Copy Markdown
Member Author

/test istio-presubmit

@rshriram
Copy link
Copy Markdown
Member

@ZackButcher / @nmittler do you guys want to shepherd this through first? This would shed more light into how we name/and control the sidecars.

@liamawhite
Copy link
Copy Markdown
Member Author

/test istio-presubmit

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing a return here?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes I am

Copy link
Copy Markdown
Contributor

@ZackButcher ZackButcher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just need the missing return and another test case.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a test case for a service with no AZ too?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible for us to hit cases where services on the same host report different AZs? In other words, is it possible for the notion of an AZ to vary per service (or, what if two different service registries have different notions of AZs and both have entries for services on the same host)?

@ZackButcher
Copy link
Copy Markdown
Contributor

Also, more general note: I couldn't find Envoy's AZ API definition, I could only find references to AZs in the tags returned by SDS queries. Can you point me to the API def?

@rshriram
Copy link
Copy Markdown
Member

I think the context is missing here. Envoy has no API for AZs. AZs have to be provided during startup. However, during startup, the only way to get pod AZ info is by querying kube api. That would mean adding the kube client-go into the pilot-agent. Doing so increases the footprint of the pilot agent to 40Mb+ (currently its few Mb only). In order to avoid this, we are introducing a AZ api in pilot. Pilot agent can ping Pilot, to get the AZ for the pod. With this information, it can hot restart Envoy with the right AZ info. (since this happens in early stages of a pod's life, impact should be small).

Once Envoy has the AZ info, it will prefer routing to other Envoys in the same AZ over Envoys in other AZs. our EDS already provides Envoy with the AZs of all other pods. What was missing is the AZ of the local pod, which this PR attempts to tackle.

@ZackButcher
Copy link
Copy Markdown
Contributor

So what component will use this API? The pilot-agent? Where does the local AZ actually manifest in Envoy config? I.e. could we just supply it with listener config or somewhere else w/o having to expose the API on pilot proper?

@liamawhite
Copy link
Copy Markdown
Member Author

@rshriram we couldn't pull it in from kube either because that breaks the abstraction for other platforms in the proxy agent right?

@ZackButcher My understanding is that it has to be passed into envoy as a flag at start up. We currently expose it in the envoy config already, but thats not enough. @rshriram feel free to correct me!

@istio-merge-robot
Copy link
Copy Markdown

@liamawhite PR needs rebase

@istio-merge-robot istio-merge-robot added the needs-rebase Indicates a PR needs to be rebased before being merged label Nov 29, 2017
@istio-merge-robot istio-merge-robot removed the needs-rebase Indicates a PR needs to be rebased before being merged label Nov 29, 2017
@liamawhite
Copy link
Copy Markdown
Member Author

/test istio-presubmit

@liamawhite
Copy link
Copy Markdown
Member Author

/test istio-pilot-e2e

@liamawhite
Copy link
Copy Markdown
Member Author

/test e2e-smoke

Copy link
Copy Markdown
Contributor

@ZackButcher ZackButcher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for improving the test coverage!

@ZackButcher
Copy link
Copy Markdown
Contributor

/lgtm
/approve

@istio-merge-robot
Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ZackButcher

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@kyessenov
Copy link
Copy Markdown
Contributor

@rshriram do you also want service cluster still, too? Look to me like the exact same issue.

@rshriram
Copy link
Copy Markdown
Member

Didn't we resolve the service cluster by using the deployment name from kube inject?

@kyessenov
Copy link
Copy Markdown
Contributor

Yes, we did, but it's not related to upstream cluster name. Does Envoy care that it's cluster is named same as upstream cluster?

@rshriram
Copy link
Copy Markdown
Member

service-cluster is for the envoy CLI. where does upstream cluster come in this picture?

@rshriram rshriram merged commit 9386e6c into istio:master Nov 30, 2017
istio-merge-robot pushed a commit that referenced this pull request Jan 10, 2018
Automatic merge from submit-queue.

Add in consul datacenter (AvailabilityZone) support

What this PR does / why we need it: Currently we only support AZ routing for kube based service registries. This enables it for consul.

Which issue this PR fixes:
Continues on from: istio/old_pilot_repo#1230, #2054 and #1815
Working towards: #1473

Release note:

Enable consul intra-datacenter routing with AvailabilityZone
kyessenov pushed a commit to kyessenov/istio that referenced this pull request Aug 13, 2018
1. Stop trashing build's cache.
2. Split ASan and TSan tests into separate targets.
3. Fix output directory, so that caching works on macOS.

Before:
- build     : 49 mins
- macos     : 30 mins

After:
- build     : 27 mins
- linux_asan: 20 mins
- linux_tsan: 16 mins
- macos     : 30 mins

After (with warm cache):
- build     :  3 mins
- linux_asan:  4 mins
- linux_tsan:  2 mins
- macos     :  5 mins

Fixes istio#1815.

Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants