Skip to content

Enable garbage collection of custom resources#47665

Merged
k8s-github-robot merged 1 commit intokubernetes:masterfrom
ironcladlou:gc-poll-types
Jul 29, 2017
Merged

Enable garbage collection of custom resources#47665
k8s-github-robot merged 1 commit intokubernetes:masterfrom
ironcladlou:gc-poll-types

Conversation

@ironcladlou
Copy link
Copy Markdown
Contributor

@ironcladlou ironcladlou commented Jun 16, 2017

Enhance the garbage collector to periodically refresh the resources it monitors (via discovery) to enable custom resource definition GC (addressing #44507 and reverting #47432).

This is a replacement for #46000.

/cc @lavalamp @deads2k @sttts @caesarxuchao

/ref #48065

The garbage collector now supports custom APIs added via CustomResourceDefinition or aggregated apiservers. Note that the garbage collector controller refreshes periodically, so there is a latency between when the API is added and when the garbage collector starts to manage it.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jun 16, 2017
@k8s-github-robot k8s-github-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. release-note-label-needed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jun 16, 2017
@ironcladlou ironcladlou changed the title WIP: Make GC occasionally refresh its discovery information Enable garbage collection of custom resources Jun 16, 2017
@ironcladlou
Copy link
Copy Markdown
Contributor Author

@k8s-bot pull-kubernetes-unit test this

@ironcladlou
Copy link
Copy Markdown
Contributor Author

@k8s-bot pull-kubernetes-verify test this

@jszczepkowski jszczepkowski removed their assignment Jun 22, 2017
@ironcladlou ironcladlou force-pushed the gc-poll-types branch 2 times, most recently from 0a87fee to 2f28fce Compare June 22, 2017 14:49
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I preferred the construct that got back a caching discovery client and simply started a gofunc that called reset.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So building a restmapper gives me back a function that somehow handles the garbage collector? that feels weirdly intertwined.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part of the code was authored by @lavalamp so I can't speak to the decisions leading to the current design, but it works and so I kept it. If you feel strongly we can continue the discussion about a different factoring.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you feel strongly we can continue the discussion about a different factoring.

I feel strongly. I'm not certain that the monitor sync has to line up exactly with the restmapper sync, but if it does, pass both through to the garbage collector and have the function live there.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you feel strongly we can continue the discussion about a different factoring.

still outstanding.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we make them stop channels here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using a nil stop channel to indicate the monitor's start/stopped state.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using a nil stop channel to indicate the monitor's start/stopped state.

doced and I didn't read to it before commenting her?

@deads2k
Copy link
Copy Markdown
Contributor

deads2k commented Jun 22, 2017

Macro-level observation. This makes creating links to resources hosted outside of the main API server exciting. If the API server goes down, then every resource with an owner-ref to it will be reaped.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is HasSynced is different than a normal HasSynced method which only moves to true and never back to false. I think this method is IsSynced.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd expect this in client-go.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this ever fail for legitimate reasons, like a CRD being deleted for instance.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to think about this a little more, but in the meantime maybe @lavalamp can comment more quickly since this whole file is his original unmodified work from #46000.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to think about this a little more, but in the meantime maybe @lavalamp can comment more quickly since this whole file is his original unmodified work from #46000.

He probably won't comment here.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this will prevent proper handling higher up the chain, it seems worth a utilruntime.HandleError.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

accidental date change?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You bet!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about getting this merged separately and see about making use of it in the namespace cleanup controller? It's a smaller controller and it would reduce the size of this pull.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No objection from me. Who else might have an opinion on this?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No objection from me. Who else might have an opinion on this?

Me, @sttts (who will agree with me), and @caesarxuchao who will also likely agree.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a fan of this construct. The garbagecollector already has a reference to this RESTMapper, that seems like a more proper place to tie these things together. You could isolate this to a member function there instead of returning a function to be called later with the thing that requested this.

k8s-github-robot pushed a commit that referenced this pull request Jul 28, 2017
Automatic merge from submit-queue (batch tested with PRs 49712, 49694, 49714, 49670, 49717)

Reduce GC e2e test flakiness

Increase GC wait timeout in a flaky e2e test. The test expects a GC
operation to be performed within 30s, while in practice the operation
often takes longer due to a delay between the enqueueing of the owner's
delete operation and the GC's actual processing of that event. Doubling
the time seems to stabilize the test. The test's assumptions can be
revisited, and the processing delay under load can be investigated in
the future.

Extracted from #47665 per #47665 (comment).

/cc @sttts @caesarxuchao @deads2k @kubernetes/sig-api-machinery-bugs


```release-note
NONE
```
@caesarxuchao
Copy link
Copy Markdown
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 28, 2017
@caesarxuchao
Copy link
Copy Markdown
Contributor

@ironcladlou could you open an issue saying that we need to add tests to validate that GC works with aggregated apis?

@caesarxuchao
Copy link
Copy Markdown
Contributor

/assign @deads2k
for approval

@deads2k
Copy link
Copy Markdown
Contributor

deads2k commented Jul 28, 2017

/approve no-issue

@k8s-github-robot
Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: caesarxuchao, deads2k, ironcladlou

Associated issue: 44507

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 28, 2017
@ironcladlou
Copy link
Copy Markdown
Contributor Author

Opened #49801 for followup.

@caesarxuchao
Copy link
Copy Markdown
Contributor

Thank you @ironcladlou for the hard work.

@ironcladlou
Copy link
Copy Markdown
Contributor Author

@caesarxuchao

Thank you @ironcladlou for the hard work.

Team effort! ✋

@k8s-github-robot
Copy link
Copy Markdown

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

@ironcladlou: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
pull-kubernetes-e2e-gce-etcd3 d08dfb9 link /test pull-kubernetes-e2e-gce-etcd3

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-github-robot
Copy link
Copy Markdown

Automatic merge from submit-queue (batch tested with PRs 49538, 49708, 47665, 49750, 49528)

@k8s-github-robot k8s-github-robot merged commit 7be28a1 into kubernetes:master Jul 29, 2017
// TODO: Consider how this could be represented with table-style tests (e.g. a
// before/after expected object graph given a delete operation targetting a
// specific node in the before graph with certain delete options).
func TestMixedRelationships(t *testing.T) {
Copy link
Copy Markdown
Member

@liggitt liggitt Jul 31, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks- will look into it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants