eds: do not trigger continuous full pushes when a pod is in crash loop #18574
istio-testing merged 13 commits into istio:master
Conversation
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
/test integ-pilot-k8s-tests_istio
howardjohn left a comment
Overall LGTM, thanks for working on this and adding good tests.
I only skimmed it for now, so I'll give Nino or others a chance to look at the SSE code, then give it a proper review during the week.
```go
		t.Fatal("Expecting only EDS update as part of a partial push. But received CDS also +v", upd)
	}

	if len(upd) > 0 && !contains(upd, "eds") {
```
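For context, the `contains` helper referenced in the quoted test presumably works along these lines (a sketch, not necessarily the PR's exact helper):

```go
package main

// contains reports whether the list of push-type names includes the target.
func contains(parts []string, s string) bool {
	for _, p := range parts {
		if p == s {
			return true
		}
	}
	return false
}

func main() {
	upd := []string{"eds"}
	// Mirrors the quoted assertion: fail if we received updates beyond EDS.
	if len(upd) > 0 && !contains(upd, "eds") {
		panic("Expecting only EDS update as part of a partial push")
	}
}
```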
pilot/pkg/proxy/envoy/v2/eds.go (Outdated)
```go
// deleteEndpointShards deletes matching endpoints from EndpointShardsByService map. This is called when
// endpoints are deleted or the service is deleted. If deleteKeys is true, this method will also delete the
// associated entries from map.
func (s *DiscoveryServer) deleteEndpointShards(cluster, serviceName, namespace string, deleteKeys bool) {
```
I don't like the deleteKeys param; this kind of param is confusing.
Can you do the delete only in SvcUpdate? Then we don't need to care whether endpoints == 0 or not.
OK. I changed it to two functions: delete is called only from SvcUpdate, and shard deletion happens in edsUpdate.
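The split described above can be sketched roughly like this (type and function names are assumptions based on the quoted diff, not the exact PR code, and the real code holds additional locks):

```go
package main

import "sync"

// EndpointShards groups endpoints by the shard (cluster) that reported them.
type EndpointShards struct {
	mutex  sync.Mutex
	Shards map[string][]string // shard (cluster id) -> endpoint addresses
}

// hostname -> namespace -> shards, mirroring EndpointShardsByService.
var shardsByService = map[string]map[string]*EndpointShards{}

// deleteEndpointShards is called from edsUpdate: it clears one shard's
// endpoints but keeps the service entry, so endpoints flipping to 0 do
// not look like a brand-new service on the next update.
func deleteEndpointShards(cluster, serviceName, namespace string) {
	if ep, ok := shardsByService[serviceName][namespace]; ok {
		ep.mutex.Lock()
		delete(ep.Shards, cluster)
		ep.mutex.Unlock()
	}
}

// deleteService is called only from SvcUpdate on a service-delete event
// and removes the map entries themselves.
func deleteService(serviceName, namespace string) {
	delete(shardsByService[serviceName], namespace)
	if len(shardsByService[serviceName]) == 0 {
		delete(shardsByService, serviceName)
	}
}

func main() {
	shardsByService["svc"] = map[string]*EndpointShards{
		"ns": {Shards: map[string][]string{"c1": {"10.0.0.1"}}},
	}
	deleteEndpointShards("c1", "svc", "ns")
	if _, ok := shardsByService["svc"]["ns"]; !ok {
		panic("service entry must survive endpoint deletion")
	}
	deleteService("svc", "ns")
	if _, ok := shardsByService["svc"]; ok {
		panic("service entry must be gone after deleteService")
	}
}
```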
```go
// functioning correctly. Currently it is working because on edsUpdate if we set endpoints to 0, we remove
// the service from EndpointShardsByService and subsequent eds updates trigger a full push. That is being
// fixed in https://github.com/istio/istio/pull/18574. Need to fix this issue and re-enable conditional
// full push. For now, any configupdate triggers a full push much like service entries.
```
@ramaraochavali can you please put this in an issue? Being able to do an EDSUpdate vs a ConfigUpdate was one of the main goals of this controller.
```go
oldEpVersion := c.endpointVersion(conf.Namespace, conf.Name)
newEpVersion := version(conf.Annotations, endpointKey)
if oldEpVersion != newEpVersion {
	if err := c.edsUpdate(conf); err != nil {
```
Because it is ineffective now: any update needs a full push.
Didn't we only do an EDS push when the svc had not changed?
Yes, that is what we have to do, but it is not working for SSE. See the TODO below and #18625.
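The gating this exchange refers to can be sketched as follows (names and types here are hypothetical, not the SSE controller's actual code): compare the stored versions with the new ones and trigger an EDS-only push only when endpoints changed but the service itself did not.

```go
package main

// pushScope is a hypothetical enum describing how wide a push must be.
type pushScope int

const (
	noPush pushScope = iota
	edsOnlyPush
	fullPush
)

// decidePush picks the push scope from what changed between two snapshots.
func decidePush(oldSvcVersion, newSvcVersion, oldEpVersion, newEpVersion string) pushScope {
	if oldSvcVersion != newSvcVersion {
		// Service spec changed: CDS/LDS/RDS may be affected, need a full push.
		return fullPush
	}
	if oldEpVersion != newEpVersion {
		// Only endpoints changed: a partial (EDS-only) push is enough.
		return edsOnlyPush
	}
	return noPush
}

func main() {
	// Endpoints flipped (e.g. pod restarted) but the service is unchanged.
	if decidePush("v1", "v1", "e1", "e2") != edsOnlyPush {
		panic("expected EDS-only push")
	}
	// The service itself changed.
	if decidePush("v1", "v2", "e1", "e1") != fullPush {
		panic("expected full push")
	}
}
```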
pilot/pkg/proxy/envoy/v2/eds.go (Outdated)
```go
delete(s.EndpointShardsByService[serviceName][namespace].Shards, cluster)
s.EndpointShardsByService[serviceName][namespace].mutex.Unlock()
delete(s.EndpointShardsByService[serviceName], namespace)
delete(s.EndpointShardsByService, serviceName)
```
This is not right; we can delete EndpointShardsByService[hostname] only when len(EndpointShardsByService[hostname]) == 0.
pilot/pkg/proxy/envoy/v2/eds.go (Outdated)
```go
s.EndpointShardsByService[serviceName][namespace].mutex.Lock()
delete(s.EndpointShardsByService[serviceName][namespace].Shards, cluster)
s.EndpointShardsByService[serviceName][namespace].mutex.Unlock()
delete(s.EndpointShardsByService[serviceName], namespace)
```
Same here: we can only delete when len(s.EndpointShardsByService[serviceName][namespace].Shards) == 0.
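The guarded deletion the reviewer asks for can be sketched like this (hypothetical shapes, locking omitted for brevity): prune each parent map entry bottom-up, and only once it has become empty.

```go
package main

// shards is a simplified stand-in for the per-service endpoint shard map.
type shards struct {
	Shards map[string][]string // shard (cluster id) -> endpoint addresses
}

// pruneShard removes one shard, then deletes the namespace and hostname
// entries only when they no longer hold anything.
func pruneShard(byService map[string]map[string]*shards, serviceName, namespace, cluster string) {
	ep, ok := byService[serviceName][namespace]
	if !ok {
		return
	}
	delete(ep.Shards, cluster)
	if len(ep.Shards) == 0 {
		delete(byService[serviceName], namespace)
	}
	if len(byService[serviceName]) == 0 {
		delete(byService, serviceName)
	}
}

func main() {
	byService := map[string]map[string]*shards{
		"svc": {"ns": {Shards: map[string][]string{
			"c1": {"10.0.0.1"},
			"c2": {"10.0.0.2"},
		}}},
	}
	pruneShard(byService, "svc", "ns", "c1")
	// c2 still has endpoints, so nothing above it may be deleted yet.
	if _, ok := byService["svc"]["ns"]; !ok {
		panic("namespace entry removed while another shard still exists")
	}
	pruneShard(byService, "svc", "ns", "c2")
	if _, ok := byService["svc"]; ok {
		panic("hostname entry should be pruned once everything is empty")
	}
}
```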
pilot/pkg/model/push_context.go (Outdated)
```diff
 // updated to force a EDS and CDS recomputation and incremental push, as it doesn't affect
 // LDS/RDS.
-SvcUpdate(shard, hostname string, ports map[string]uint32, rports map[uint32]string)
+SvcUpdate(shard, hostname string, namespace string, event Event, ports map[string]uint32, rports map[uint32]string)
```
We do not need the ports and rports params.
I think it may be better to leave them for the future? They have been there from the beginning.
Or, if you are OK with cleaning them up, I can do that. I do not have a strong opinion on it.
OK, cleaned it up. PTAL.
/test unit-tests_istio
@rshriram @howardjohn @hzxuzhonghu - can you PTAL? I have addressed all the comments.
istio#18574)

* wip
* add test cases
* revert controller change
* add debug error
* try deleting completely
* trigger full push for sse config update
* change sse to trigger full push
* lint
* review comments
* split delete in to two functions
* address comments
* clean up ports

Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
Prevents a full push from being triggered when a pod is in a crash loop and its endpoints flip-flop between 0 and 1. Fixes #13822
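The gist of the fix can be sketched as follows (hypothetical names, heavily simplified from the real DiscoveryServer): when a crash-looping pod drives endpoints to zero, clear the shard's endpoints in place instead of deleting the service entry, so the next non-zero update is still recognized as an incremental EDS change.

```go
package main

// edsUpdateSketch applies one endpoint update for a shard and reports
// whether a full push is needed.
func edsUpdateSketch(shards map[string][]string, cluster string, endpoints []string, knownService bool) (fullPush bool) {
	if len(endpoints) == 0 {
		// Before the fix: the whole service entry was deleted here, so the
		// next update looked like a brand-new service and forced a full push.
		// After the fix: just clear this shard's endpoints.
		delete(shards, cluster)
		return false
	}
	shards[cluster] = endpoints
	// A full push is only needed the first time we learn about the service.
	return !knownService
}

func main() {
	shards := map[string][]string{}
	// First update for a new service: full push.
	if !edsUpdateSketch(shards, "c1", []string{"10.0.0.1"}, false) {
		panic("expected full push for a new service")
	}
	// Pod crashes: endpoints drop to zero; partial push only.
	if edsUpdateSketch(shards, "c1", nil, true) {
		panic("expected partial push when endpoints drop to 0")
	}
	// Pod restarts: endpoints come back; still only a partial push.
	if edsUpdateSketch(shards, "c1", []string{"10.0.0.2"}, true) {
		panic("expected partial push when endpoints reappear")
	}
}
```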