Incremental MCP for SyntheticServiceEntries by Nino-K · Pull Request #12276 · istio/istio

Nino-K · 2019-03-06T01:03:38Z

This PR introduces the following:

New EDS or full config update behaviour, depending on whether the service or its endpoints changed.
Enables incremental MCP in coredatamodel (only for syntheticServiceEntries that are received via Galley).
Supports annotations (serviceVersion & endpointsVersion) received via MCP (Galley only). These annotations are used to decide between a partial update and a full config update to Envoy.
Handles NotReadyEndpoints annotations.
Adds the ability to expose NotReadyEndpoints to the aggregated controller.
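
The decision between an EDS-only push and a full config push described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: the annotation keys, the `UpdateKind` type, and the `classify` function are hypothetical names, and treating a previously unseen resource as a full update is an assumption as well.

```go
package main

import "fmt"

// Hypothetical annotation keys. The PR names the annotations serviceVersion
// and endpointsVersion, but the exact keys used on the wire are assumed here.
const (
	serviceVersionKey   = "serviceVersion"
	endpointsVersionKey = "endpointsVersion"
)

// UpdateKind describes what would need to be pushed to Envoy.
type UpdateKind int

const (
	NoUpdate UpdateKind = iota
	EDSUpdate
	FullConfigUpdate
)

func (k UpdateKind) String() string {
	switch k {
	case EDSUpdate:
		return "EDSUpdate"
	case FullConfigUpdate:
		return "FullConfigUpdate"
	default:
		return "NoUpdate"
	}
}

// classify compares the annotations of the previously stored resource with
// those of the newly received one. A nil prev means the resource is new,
// which this sketch treats as a full config update (an assumption).
func classify(prev, next map[string]string) UpdateKind {
	if prev == nil {
		return FullConfigUpdate
	}
	if prev[serviceVersionKey] != next[serviceVersionKey] {
		// The service definition itself changed.
		return FullConfigUpdate
	}
	if prev[endpointsVersionKey] != next[endpointsVersionKey] {
		// Only the endpoints changed.
		return EDSUpdate
	}
	return NoUpdate
}

func main() {
	prev := map[string]string{serviceVersionKey: "1", endpointsVersionKey: "7"}
	next := map[string]string{serviceVersionKey: "1", endpointsVersionKey: "8"}
	fmt.Println(classify(prev, next))
}
```

Here only the endpoints version differs, so the sketch selects the EDS-only path.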

pilot/pkg/config/coredatamodel/controller.go

pilot/pkg/bootstrap/server.go

pilot/cmd/pilot-discovery/main.go

galley/pkg/metadata/types.go

stale · 2019-04-12T20:07:07Z

This pull request has been automatically marked as stale because it has not had activity in the last 2 weeks. It will be closed in 30 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

hzxuzhonghu · 2019-10-16T02:27:05Z

There are two other questions left:

1. Why is there a NotReadyEndpoints discovery interface, and what is it for?
2. The Istio service is not well populated; it lacks at least several important fields, so I think it is likely to cause some regressions.

Nino-K · 2019-10-16T02:41:02Z

> @Nino-K Please do not squash the commits, it is hard to review.

@hzxuzhonghu I don't usually squash my PRs, but this one has been in flight for a very long time, and because it touches many files there is a conflict almost every time I make a commit. Squashing the commits makes rebasing easier.

hzxuzhonghu · 2019-10-16T02:43:01Z

pilot/pkg/config/coredatamodel/discovery.go

+		for _, h := range se.Hosts {
+			for _, svcPort := range se.Ports {
+				// add not ready endpoint
+				out = append(out, d.notReadyServiceInstance(se, svcPort, conf, h, notReadyIP, notReadyPort))


This is not right. For example, if 192.168.1.1:8080 and 192.168.1.1:8081 are not ready, what is the output?

Can you elaborate a bit more? I have a test case for this case specifically. The expected output would be:

// NetworkEndpoint(endpoint{Address:192.168.1.1, Port:8080, servicePort: &{http-port 80 http}} service:{Hostname: svc.example2.com})
// NetworkEndpoint(endpoint{Address:192.168.1.1, Port:8081, servicePort: &{http-port 80 http}} service:{Hostname: svc.example2.com})

Is that not correct?

Nino-K · 2019-10-16T02:47:12Z

> There are two other questions left:
>
> Why is there a NotReadyEndpoints discovery interface, and what is it for?
>
> The Istio service is not well populated; it lacks at least several important fields, so I think it is likely to cause some regressions.

The NotReadyEndpoint discovery handles the notReady annotations that are received via Galley and exposes them to EDS so that the corresponding updates can be sent out to Envoy. This is a case that we were not handling previously in Pilot over MCP.

As for the critical properties in the service that are not populated, this feature is turned off by default, so there should not be any regression in the normal code path. I will be making follow-up PRs to address some of the concerns.

hzxuzhonghu · 2019-10-16T02:59:32Z

> The NotReadyEndpoint discovery handles the notReady annotations that are received via Galley and exposes them to EDS so that the corresponding updates can be sent out to Envoy. This is a case that we were not handling previously in Pilot over MCP.

I could not find where it is used.

nmittler · 2019-10-16T13:38:33Z

It merged ... Wooh-hoo! :)

Thanks @Nino-K!

seflerZ · 2019-10-23T06:23:25Z

Two questions:

1. Why is this implemented only for SyntheticServiceEntry via Galley? What about other MCP schemas?
2. We have a Nacos registry connected to Pilot via an MCP server, but we are suffering from slow ServiceEntry updates. To benefit from incremental updates, does that mean we should update data through Galley, or should we implement incremental ServiceEntry updates in Pilot?

nmittler · 2019-10-23T16:49:41Z

@seflerZ

> Why is this implemented only for SyntheticServiceEntry via Galley? What about other MCP schemas?

What schemas are you referring to?

> We have a Nacos registry connected to Pilot via an MCP server, but we are suffering from slow ServiceEntry updates. To benefit from incremental updates, does that mean we should update data through Galley, or should we implement incremental ServiceEntry updates in Pilot?

You'd either have to implement incremental MCP on your server or use Galley.

seflerZ · 2019-10-25T12:12:09Z

@nmittler

1. Collections such as ServiceEntry, DestinationRule, etc. As far as I know, MCP is designed to support incremental transfer for all collections listed in the Istio documentation.
2. Yes, we can implement incremental MCP on our server. However, as the PR describes, Pilot only supports receiving incremental MCP data from Galley. I've also checked the code: the SyntheticServiceEntry controller supports only the collection named 'SyntheticServiceEntry'.

nmittler · 2019-10-25T15:56:54Z

@seflerZ right, Pilot is only currently using incremental for the synthetic service entries. That could change, but may be obviated by the move to istiod.

nkorange · 2019-10-29T07:16:12Z

Is this feature released? I don't see its introduction in recent releases.

nkorange · 2019-10-29T11:25:39Z

I have read the code of this PR. I found that in this incremental push mode for synthetic service entries, Pilot replaces all endpoints of a service rather than just adding or removing individual endpoints.

Say Pilot has data: {{svc1:2.2.2.2}, {svc2:3.3.3.3}}. Then if data {collection='synthetic service entries', {svc1: 1.1.1.1}, incremental=true} is sent to Pilot, the data in Pilot changes to {{svc1:1.1.1.1}, {svc2:3.3.3.3}}.

Am I understanding it right?

nmittler · 2019-10-29T15:29:13Z

@nkorange this will be in the 1.4 release.

> Say Pilot has data: {{svc1:2.2.2.2}, {svc2:3.3.3.3}}. Then if data {collection='synthetic service entries', {svc1: 1.1.1.1}, incremental=true} is sent to Pilot, the data in Pilot changes to {{svc1:1.1.1.1}, {svc2:3.3.3.3}}.

That wouldn't be the desired result, although I'm not sure our tests currently verify how Pilot handles endpoints from different service registries for the same service.

@Nino-K have you tested that by any chance? We're going to want to make sure that works properly.

nkorange · 2019-10-30T03:13:29Z

@nmittler Does 1.4.0-beta contain this feature?

Nino-K · 2019-10-30T15:15:56Z

@nkorange yes, I believe this feature will be available in 1.4, however it is turned off by default.

Nino-K · 2019-10-30T15:20:51Z

@nmittler the implementation of incremental MCP in the SyntheticServiceEntry controller currently focuses on only two things: the removal of configs that are received while the incremental flag is set, and sending updates to Envoy incrementally (EDSUpdate vs. ConfigUpdate). The reason I have not invested much time in incremental updates from the source server is that Galley currently never forwards an incremental payload to Pilot, even though the snapshots are configured to support incremental. That prevented me from testing this in the integration tests. I will go ahead and create an issue to investigate this further.

nkorange · 2019-10-31T02:18:52Z

> I have read the code of this PR. I found that in this incremental push mode for synthetic service entries, Pilot replaces all endpoints of a service rather than just adding or removing individual endpoints.
>
> Say Pilot has data: {{svc1:2.2.2.2}, {svc2:3.3.3.3}}. Then if data {collection='synthetic service entries', {svc1: 1.1.1.1}, incremental=true} is sent to Pilot, the data in Pilot changes to {{svc1:1.1.1.1}, {svc2:3.3.3.3}}.
>
> Am I understanding it right?

@Nino-K So this is expected? And can I download 1.4.0-beta.1 to test this incremental feature? The master branch does not seem stable.

Nino-K · 2019-10-31T18:20:15Z

> > I have read the code of this PR. I found that in this incremental push mode for synthetic service entries, Pilot replaces all endpoints of a service rather than just adding or removing individual endpoints.
> > Say Pilot has data: {{svc1:2.2.2.2}, {svc2:3.3.3.3}}. Then if data {collection='synthetic service entries', {svc1: 1.1.1.1}, incremental=true} is sent to Pilot, the data in Pilot changes to {{svc1:1.1.1.1}, {svc2:3.3.3.3}}.
> > Am I understanding it right?
>
> @Nino-K So this is expected? And can I download 1.4.0-beta.1 to test this incremental feature? The master branch does not seem stable.

@nkorange the behaviour that you are expecting is not part of the incremental MCP specification. Currently, incremental MCP operates at the resource level, not at the sub-resource level.
For example, in the non-incremental case all MCP resources in a given collection are delivered in a single update, whereas in the incremental case only the MCP resources that were recently added or changed are included in the update. In both cases, each resource is delivered in its full state. Hope that helps.
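
The resource-level semantics described above can be modelled with a small sketch. It assumes a simplified store keyed by resource name, with one service per resource; `Store` and `Apply` are hypothetical names for illustration and are not Pilot's actual types. The `removed` parameter is likewise an assumption about how deletions would be signalled.

```go
package main

import "fmt"

// Store maps a resource name (one service per resource in this sketch)
// to the full endpoint list carried by that resource.
type Store map[string][]string

// Apply folds one MCP-style update into the store.
// With incremental == false the update is the complete collection, so
// anything not in it is dropped. With incremental == true only the named
// resources are touched, but each one is replaced wholesale: endpoints
// inside a resource are never merged.
func (s Store) Apply(resources map[string][]string, removed []string, incremental bool) {
	if !incremental {
		for name := range s {
			delete(s, name)
		}
	}
	for name, endpoints := range resources {
		// Copy so later changes by the caller do not leak into the store.
		s[name] = append([]string(nil), endpoints...)
	}
	for _, name := range removed {
		delete(s, name)
	}
}

func main() {
	s := Store{"svc1": {"2.2.2.2"}, "svc2": {"3.3.3.3"}}
	// Incremental update that carries the full state of svc1 only.
	s.Apply(map[string][]string{"svc1": {"1.1.1.1"}}, nil, true)
	fmt.Println(s["svc1"], s["svc2"])
}
```

With the data from the example in this thread, svc1 ends up with only 1.1.1.1 while svc2 is left untouched, which matches the full-state-per-resource behaviour.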

nkorange · 2019-11-01T01:03:18Z

> > > I have read the code of this PR. I found that in this incremental push mode for synthetic service entries, Pilot replaces all endpoints of a service rather than just adding or removing individual endpoints.
> > > Say Pilot has data: {{svc1:2.2.2.2}, {svc2:3.3.3.3}}. Then if data {collection='synthetic service entries', {svc1: 1.1.1.1}, incremental=true} is sent to Pilot, the data in Pilot changes to {{svc1:1.1.1.1}, {svc2:3.3.3.3}}.
> > > Am I understanding it right?
> >
> > @Nino-K So this is expected? And can I download 1.4.0-beta.1 to test this incremental feature? The master branch does not seem stable.
>
> @nkorange the behaviour that you are expecting is not part of the incremental MCP specification. Currently, incremental MCP operates at the resource level, not at the sub-resource level.
> For example, in the non-incremental case all MCP resources in a given collection are delivered in a single update, whereas in the incremental case only the MCP resources that were recently added or changed are included in the update. In both cases, each resource is delivered in its full state. Hope that helps.

OK. I see, thanks.

seflerZ · 2019-11-01T08:05:05Z

@Nino-K I don't think it's very practical if it doesn't support incremental updates on sub-resources, and it may lead to misuse because there is no comment or flag that says so.

What do you think about implementing incremental updates on sub-resources? We have very large clusters with many endpoints. We would appreciate incremental updates on every resource and sub-resource. We can join in and contribute.

Nino-K · 2019-11-01T16:12:27Z

> @Nino-K I don't think it's very practical if it doesn't support incremental updates on sub-resources, and it may lead to misuse because there is no comment or flag that says so.
>
> What do you think about implementing incremental updates on sub-resources? We have very large clusters with many endpoints. We would appreciate incremental updates on every resource and sub-resource. We can join in and contribute.

@seflerZ for large clusters with lots of endpoints, you are better off sending the full state of the resource as opposed to just the updated sub-resources (delta). Here is why incremental updates with full state might be more efficient:

1. Pilot would ultimately send the full resource, not only the updated part, to Envoy via EDSUpdate, so what benefit is there in sending an incremental update with only the changes to the sub-resource?
2. Having Pilot apply the delta to the sub-resources and compute the diff on the resources might be more expensive (in terms of operations) on a large-scale cluster with lots of endpoints. I think it would be better to defer that to the config source server, which compiles the new resource with the updated sub-resource (something like what Galley does today).

Having said all that, it might be worth bringing this up in the community meeting to see what others think. I also think @ayj might have more insights on this.
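
For readers following the trade-off, the "compute the diff" step being debated could look roughly like the sketch below. It is only an illustration of a set difference over endpoint addresses; `diffEndpoints` is a hypothetical helper and nothing here reflects how Pilot or Galley actually compare resources.

```go
package main

import (
	"fmt"
	"sort"
)

// diffEndpoints returns the endpoints present only in next (added) and
// the endpoints present only in prev (removed). Both results are sorted
// so the output is deterministic.
func diffEndpoints(prev, next []string) (added, removed []string) {
	prevSet := make(map[string]struct{}, len(prev))
	for _, ep := range prev {
		prevSet[ep] = struct{}{}
	}
	nextSet := make(map[string]struct{}, len(next))
	for _, ep := range next {
		if _, seen := nextSet[ep]; seen {
			continue // ignore duplicates in the input
		}
		nextSet[ep] = struct{}{}
		if _, ok := prevSet[ep]; !ok {
			added = append(added, ep)
		}
	}
	for ep := range prevSet {
		if _, ok := nextSet[ep]; !ok {
			removed = append(removed, ep)
		}
	}
	sort.Strings(added)
	sort.Strings(removed)
	return added, removed
}

func main() {
	prev := []string{"10.0.0.1:8080", "10.0.0.2:8080"}
	next := []string{"10.0.0.2:8080", "10.0.0.3:8080"}
	added, removed := diffEndpoints(prev, next)
	fmt.Println("added:", added, "removed:", removed)
}
```

The cost is linear in the number of endpoints per resource, and it has to be paid by whichever side computes the delta. Which side that should be is the question under discussion.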

ayj · 2019-11-01T18:20:44Z

ServiceEntry is roughly analogous to the k8s Endpoints API. Both contain a list of endpoints for a specific host/service. k8s introduced EndpointSlice as a more scalable and extensible alternative. We might consider something similar if/when performance becomes a problem. Taken to the extreme, one could imagine an API which represents an individual endpoint/host per resource.
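
The slicing idea can be illustrated with a minimal sketch: a long endpoint list is split into fixed-size groups so that a change only affects the group that contains it. `sliceEndpoints` and the slice size are hypothetical choices for illustration, not an existing Istio or MCP API.

```go
package main

import "fmt"

// sliceEndpoints splits endpoints into consecutive groups of at most
// maxPerSlice entries. The returned groups share the backing array of the
// input, so callers should treat them as read-only.
func sliceEndpoints(endpoints []string, maxPerSlice int) [][]string {
	if maxPerSlice <= 0 {
		return nil
	}
	var slices [][]string
	for start := 0; start < len(endpoints); start += maxPerSlice {
		end := start + maxPerSlice
		if end > len(endpoints) {
			end = len(endpoints)
		}
		slices = append(slices, endpoints[start:end])
	}
	return slices
}

func main() {
	eps := []string{"10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4", "10.0.0.5"}
	for i, s := range sliceEndpoints(eps, 2) {
		fmt.Printf("slice %d: %v\n", i, s)
	}
}
```

Each group could then be carried as its own resource, so an incremental update would only need to resend the groups that changed.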

howardjohn · 2019-11-01T18:25:24Z

We will be adding support for EndpointSlice as well.

Anyway, can't you shard your ServiceEntries already? If you have multiple STATIC service entries, they will be merged.
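
A rough model of the sharding suggestion: several entries that name the same host each carry part of the endpoint list, and the consumer merges them by host. The `Shard` type and `mergeByHost` function are hypothetical and simplified; they are not Istio's ServiceEntry types or its merge logic.

```go
package main

import (
	"fmt"
	"sort"
)

// Shard stands in for one service entry that carries a subset of the
// endpoints for a host.
type Shard struct {
	Name      string
	Host      string
	Endpoints []string
}

// mergeByHost combines the endpoints of all shards that share a host.
// Endpoints are sorted per host so the result is deterministic.
func mergeByHost(shards []Shard) map[string][]string {
	merged := make(map[string][]string)
	for _, sh := range shards {
		merged[sh.Host] = append(merged[sh.Host], sh.Endpoints...)
	}
	for host := range merged {
		sort.Strings(merged[host])
	}
	return merged
}

func main() {
	shards := []Shard{
		{Name: "svc-shard-0", Host: "svc.example.com", Endpoints: []string{"10.0.0.2", "10.0.0.1"}},
		{Name: "svc-shard-1", Host: "svc.example.com", Endpoints: []string{"10.0.0.3"}},
	}
	fmt.Println(mergeByHost(shards)["svc.example.com"])
}
```

Under this model, a change to one endpoint only requires resending the shard that contains it, while the merged view still covers every endpoint of the host.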

seflerZ · 2019-11-04T03:57:43Z

@Nino-K
I don't agree that incremental updates with full state might be more efficient.

For point 1: Pilot would ultimately send the full resource through EDSUpdate()
The efficiency of EDSUpdate is a problem of implementation, not architecture. We typically have an application consuming 150+ services, and every service might contain 1000+ endpoints (the number is large because we don't use a load balancer in front of service endpoints and connect to them directly), so at present even slight network turbulence causes a tremendous amount of data to be pushed. So this is the point where I think Pilot should change, to 'EdsIncremental' or something similar that introduces incremental updates to Envoy. Of course, Envoy would also have to change to support that.

For point 2: Having Pilot apply/compute the delta might be expensive
So this is a trade-off: either push a tremendous amount of data and thrash the network, or compute to reduce the traffic. I think computing is better because it only consumes resources on Pilot itself, not on every Envoy. As we know, with full state updates every Envoy receives and rebuilds all the data in memory, which is also expensive. That consumes the application's resources, which are very limited. Moreover, full state updates cause high memory usage on Pilot.

BTW: Should I open a new issue to discuss incremental updates on sub-resources?

@ayj
I've checked the Pilot code on release-1.4 and noticed that 'edsIncremental' has been introduced. Does that mean we're ready to incrementally update EDS to Envoy version 1.12?

@howardjohn
Is the 'EndpointSlice' idea also supported by MCP? I don't understand what you mean by 'multiple STATIC service entries'. Our architecture is like this:

We have a traditional microservice registry named Nacos which holds 10 GiB of services and endpoints. Usually a service may have 1000+ endpoints and 100+ service dependencies. Now we want to move to and join the Istio ecosystem, but most of our services are not K8s-ready yet. So we sync our services and endpoints to Pilot through MCP or a Pilot registry plugin. Currently we're suffering from low performance in both Nacos-Pilot synchronization and Pilot-Envoy data pushing.

ayj · 2019-11-05T23:22:39Z

I'm not familiar with Pilot's incremental EDS support. @hzxuzhonghu may know more.

EndpointSlice should be compatible with MCP, though it's not plumbed through yet. The config server (e.g. Galley, Nacos) is responsible for managing the grouping of endpoints over time.

Nino-K · 2019-11-06T01:13:10Z

> BTW: Should I open a new issue to discuss incremental updates on sub-resources?

@seflerZ I think it would make more sense to create an issue to discuss this further, thanks!

hzxuzhonghu · 2019-11-06T01:31:01Z

Pilot's incremental EDS currently works like this: when the endpoints of one service are updated, we only build and push the EDS associated with the updated service. We cannot build and send just the added/deleted subset of the endpoints, because Envoy doesn't support EDS patching yet.
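
The per-service behaviour described above can be sketched as a small index that remembers which services changed and, on the next push, emits the full endpoint list for those services only. `EndpointIndex`, `Update`, and `Flush` are hypothetical names for illustration and do not correspond to Pilot's actual EDS code.

```go
package main

import "fmt"

// EndpointIndex keeps the full endpoint list per service and tracks which
// services changed since the last push.
type EndpointIndex struct {
	byService map[string][]string
	dirty     map[string]struct{}
}

func NewEndpointIndex() *EndpointIndex {
	return &EndpointIndex{
		byService: make(map[string][]string),
		dirty:     make(map[string]struct{}),
	}
}

// Update replaces the endpoints of one service and marks it as changed.
func (x *EndpointIndex) Update(service string, endpoints []string) {
	x.byService[service] = append([]string(nil), endpoints...)
	x.dirty[service] = struct{}{}
}

// Flush returns what the next push would contain: the complete endpoint
// list of every service updated since the previous Flush. Services that
// did not change are left out. The changed set is cleared afterwards.
func (x *EndpointIndex) Flush() map[string][]string {
	out := make(map[string][]string, len(x.dirty))
	for service := range x.dirty {
		out[service] = x.byService[service]
	}
	x.dirty = make(map[string]struct{})
	return out
}

func main() {
	idx := NewEndpointIndex()
	idx.Update("svc1", []string{"1.1.1.1"})
	idx.Update("svc2", []string{"3.3.3.3"})
	idx.Flush() // initial push covers both services

	idx.Update("svc1", []string{"1.1.1.1", "1.1.1.2"})
	push := idx.Flush()
	fmt.Println(len(push), push["svc1"])
}
```

The second push contains only svc1, but with both of its endpoints rather than just the one that was added, which is the limitation pointed out in the comment.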

seflerZ · 2019-11-07T03:52:31Z

Thanks, everyone. I've opened a proposal to discuss this.

istio-testing added the do-not-merge/work-in-progress label Mar 6, 2019

googlebot added the cla: yes label Mar 6, 2019

istio-testing requested review from ayj and costinm March 6, 2019 01:03

Nino-K requested review from andraxylia, nmittler and ozevren March 6, 2019 01:04

This was referenced Mar 6, 2019

Clarification between Service Entry and Synthetic Service Entry #12311

Closed

[WIP] Add XDSUpdater to coredatamodel controller #10994

Closed

nmittler mentioned this pull request Mar 13, 2019

[Galley] Adding ServiceEntry synthesis #12409

Merged

Nino-K force-pushed the incremental-eds branch from 06fdc73 to 651ee30 March 13, 2019 21:51

Nino-K changed the title ~~[WIP] Incremental EDS updates~~ Incremental EDS updates Mar 13, 2019

istio-testing removed the do-not-merge/work-in-progress label Mar 13, 2019

nmittler reviewed Mar 14, 2019

Nino-K force-pushed the incremental-eds branch from ca36860 to 4b146a4 March 19, 2019 22:01

Nino-K commented Mar 20, 2019

pilot/pkg/bootstrap/server.go (outdated)

golangcibot reviewed Mar 20, 2019

pilot/pkg/config/coredatamodel/controller.go (outdated)

Nino-K force-pushed the incremental-eds branch from cdc2be6 to 0cd1cec March 20, 2019 21:54

Nino-K commented Mar 20, 2019

pilot/cmd/pilot-discovery/main.go (outdated)

nmittler reviewed Mar 20, 2019

pilot/pkg/bootstrap/server.go (outdated)

nmittler reviewed Mar 20, 2019

pilot/pkg/config/coredatamodel/controller.go (outdated)

nmittler reviewed Mar 20, 2019

pilot/pkg/config/coredatamodel/controller.go (outdated)

nmittler reviewed Mar 20, 2019

pilot/pkg/config/coredatamodel/controller.go (outdated)

costinm reviewed Mar 21, 2019

istio-testing added the needs-rebase label Mar 29, 2019

stale bot added the stale label Apr 12, 2019

Nino-K force-pushed the incremental-eds branch from 0cd1cec to 3dd5240 April 17, 2019 22:00

stale bot removed the stale label Apr 17, 2019

istio-testing removed the needs-rebase label Apr 17, 2019

hzxuzhonghu reviewed Oct 16, 2019

sushicw mentioned this pull request Oct 16, 2019

Basic integration tests for istioctl x analyze #17904

Merged

seflerZ mentioned this pull request Nov 7, 2019

Enable sub-resources patch on incremental SyntheticServiceEntry MCP update #18731

Closed
