Make GC occasionally refresh its discovery information#46000
Make GC occasionally refresh its discovery information#46000lavalamp wants to merge 3 commits intokubernetes:masterfrom
Conversation
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: lavalamp DetailsNeeds approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
There was a problem hiding this comment.
We've had issues caching empty returns. If the list is empty, don't cache it. Same fo rresources.
There was a problem hiding this comment.
What causes empty lists?
This is going to be updated every 30s so it should be ok if it forgets stuff for 30s
There was a problem hiding this comment.
What causes empty lists?
This is going to be updated every 30s so it should be ok if it forgets stuff for 30s
Server start seems to start listening for discovery before registering groups. It would be pretty bad if GC didn't work for a 30 second window for core types on startup. It ought not be a difficult check to add.
There was a problem hiding this comment.
seems like the wrong fix, server could be in the middle of registering groups, too, and then we'd get some right and some wrong for 30s.
If an empty list is an error, then apiserver should probably not return 200s on empty lists?
There was a problem hiding this comment.
this looks off. Why are you always fresh? Deserves a comment if its right.
There was a problem hiding this comment.
Alright, I think I see. Setting this means that you're never being invalidated by higher logic. Definitely needs a comment explaining it and how its to be used.
There was a problem hiding this comment.
Honestly the interface didn't document what this is supposed to do...
There was a problem hiding this comment.
Reading the code that calls Fresh I still don't understand what it is supposed to do.
There was a problem hiding this comment.
why? Just for the cached return on error? I'd rather see that handled by outter logic that handles the error.
|
I think that this needs at least one e2e test that creates a TPR and proves that GC collects it. |
|
thanks for feedback. we've got to stop making unwieldy e2e tests. This should be testable as a unit or integration test in the GC, which I should add. |
|
@lavalamp I don't think this is sufficient - the graph builder currently only sets up its monitors to list/watch resources when the GC controller is first instantiated, but it doesn't refresh right now. |
|
@ncdc good point, I will fix that in another PR. |
b67ad9b to
31800f9
Compare
There was a problem hiding this comment.
There was a problem hiding this comment.
The current comment makes no sense to me whatsoever, so ok.
|
@mikedanese any idea why bazel can't find the new import? |
There was a problem hiding this comment.
@lavalamp what does DiscoveryInterface provide ? Is this related to service discovery or api discovery ?
07e95d7 to
89134ac
Compare
|
Looks like you figured it out. |
|
@mikedanese yes, I did, sorry for the noise. (git commit -a doesn't actually add new files, duh...) |
| defer d.lock.RUnlock() | ||
| cachedVal, ok := d.groupToServerResources[groupVersion] | ||
| if !ok { | ||
| return nil, ErrCacheEmpty |
There was a problem hiding this comment.
looks more like a NotFound than ErrCacheEmpty.
|
@lavalamp PR needs rebase |
| deletableResources := discovery.FilteredBy(discovery.SupportsAllVerbs{Verbs: []string{"delete"}}, preferredResources) | ||
| deletableGroupVersionResources, err := discovery.GroupVersionResources(deletableResources) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("Failed to parse resources from server: %v", err) |
|
@lavalamp are you going to come back to this as a bugfix after freeze? |
|
@deads2k I am not sure. My calendar is pretty booked over the next two weeks. :/ Let me see if Dawn will buy this is a bug fix. |
|
Please consider moving the client to client-go repo so that it is more reusable. See kubernetes/client-go/issues/200 |
|
@lavalamp do you have any update on your plans for this PR? Is there anything I can do to help move it forward? |
|
@lavalamp: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
| } | ||
| } | ||
|
|
||
| type monitorSingle struct { |
There was a problem hiding this comment.
// monitorSingle wraps a cache.Controller and adds another stop channel (in addition to that of Run(stopCh)). It will stop
// if either of them is closed.
| } | ||
|
|
||
| func (gb *GraphBuilder) HasSynced() bool { | ||
| for _, monitor := range gb.monitors { |
There was a problem hiding this comment.
this misses a lock. Or at the very least it should be documented that it assumes it being taken.
| func (gb *GraphBuilder) Run(stopCh <-chan struct{}) { | ||
| gb.monitorLock.Lock() | ||
| defer gb.monitorLock.Unlock() | ||
| gb.globalStop = stopCh |
There was a problem hiding this comment.
Something is fishy: NewGarbageCollector calls syncMonitors which depends on globalStop.
| absentOwnerCache *UIDCache | ||
| } | ||
|
|
||
| type stopMux struct { |
There was a problem hiding this comment.
this stopMux thing does not make the code easier to understand IMO. Just inline the mux logic into monitorSingle.Run.
It might even be much simpler to have one Go routine waiting on the global lock and shutting down all monitors in a loop. Any special reason that you want to have this logic per monitor?
| } | ||
|
|
||
| func NewMetadataCodecFactory() serializer.CodecFactory { | ||
| type MetadataCodecFactory struct { |
| @@ -0,0 +1,70 @@ | |||
| /* | |||
| Copyright 2016 The Kubernetes Authors. | |||
Automatic merge from submit-queue (batch tested with PRs 47700, 48464, 48502) Add a refreshing discovery client Introduce a discovery client (implementing `CachedDiscoveryInterface`) which caches discovery information in memory and which can be actively refreshed by the user. This implementation fetches from discovery upon refresh and could later be improved to maintain updates from a watch. Extracted from #47665 and #46000 to help reduce the scope of #48065. ```release-note NONE ```
Automatic merge from submit-queue (batch tested with PRs 49538, 49708, 47665, 49750, 49528) Enable garbage collection of custom resources Enhance the garbage collector to periodically refresh the resources it monitors (via discovery) to enable custom resource definition GC (addressing #44507 and reverting #47432). This is a replacement for #46000. /cc @lavalamp @deads2k @sttts @caesarxuchao /ref #48065 ```release-note The garbage collector now supports custom APIs added via CustomeResourceDefinition or aggregated apiservers. Note that the garbage collector controller refreshes periodically, so there is a latency between when the API is added and when the garbage collector starts to manage it. ```
|
This PR was superseded by #47665 and can be closed. |
ref #44507 - this should unblock AA and TPR even though it's not a full fix.