Skip to content

Reworking kube-proxy to only compute endpointChanges on apply#83206

Merged
k8s-ci-robot merged 1 commit intokubernetes:masterfrom
robscott:endpointslice-proxy-endpointchange-perf
Oct 16, 2019
Merged

Reworking kube-proxy to only compute endpointChanges on apply#83206
k8s-ci-robot merged 1 commit intokubernetes:masterfrom
robscott:endpointslice-proxy-endpointchange-perf

Conversation

@robscott
Copy link
Copy Markdown
Member

What type of PR is this?
/kind cleanup

What this PR does / why we need it:
Computing EndpointChanges is a relatively expensive operation for kube-proxy when Endpoint Slices are used. This had been computed on every EndpointSlice update which became quite inefficient at high levels of scale when multiple EndpointSlice update events would be triggered before a syncProxyRules call.

Profiling results showed that computing this on each update could consume ~80% of total kube-proxy CPU utilization at high levels of scale. This change reduced that to as little as 3% of total kube-proxy utilization at high levels of scale.

It's worth noting that the difference is minimal when there is a 1:1 relationship between EndpointSlice updates and proxier syncs. This is primarily beneficial when there are many EndpointSlice updates between proxier sync loops.

Does this PR introduce a user-facing change?:

Significant kube-proxy performance improvements when using Endpoint Slices at scale.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

/sig network
/cc @freehan
/priority important-longterm

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. sig/network Categorizes an issue or PR as relevant to SIG Network. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Sep 26, 2019
@robscott robscott force-pushed the endpointslice-proxy-endpointchange-perf branch from 7442a93 to 24dc65f Compare September 26, 2019 20:12
@robscott
Copy link
Copy Markdown
Member Author

/retest

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a unit test for this?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may need a more significant name than getChangesAndMarkApplied. This function essentailly emits the changes in cache and refresh the cache. Plus, maybe other stuff

Naming Idea:
CheckoutChanges

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, CheckoutChanges is a much better name. Added some unit tests for this as well.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may helped the performance more by not comparing the difference and just trigger sync as always.

But it may not make any difference given the test result.

Copy link
Copy Markdown
Member Author

@robscott robscott Oct 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately both the EndpointChangesPending metric and the last change trigger times rely on this being available on EndpointSlice addition (computing at proxier sync would not be helpful). The profiling I've done suggests that the full updatePending function, including this diffing, takes ~3% of total kube-proxy compute time (0.12s of 3.61s scaling to 10k endpoints). The actual esInfoChanged call within that is small enough to not be reported. Given the relative insignificance of that time I think it's worth maintaining the metrics as they exist.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider moving all these into getChangesAndMarkApplied

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't find a great way to integrate it there, but I did move this into a similarly named function.

@freehan
Copy link
Copy Markdown
Contributor

freehan commented Oct 3, 2019

Consider putting endpointslicecache.go into a different package (e.g. endpointslice/util or apimachinery), and define the interface better for reuse. Other consumer will need similar logic for this.

@robscott robscott force-pushed the endpointslice-proxy-endpointchange-perf branch 4 times, most recently from 98040fc to e59fdf7 Compare October 7, 2019 17:40
@k8s-ci-robot k8s-ci-robot added the sig/apps Categorizes an issue or PR as relevant to SIG Apps. label Oct 7, 2019
@robscott
Copy link
Copy Markdown
Member Author

robscott commented Oct 7, 2019

/retest

@robscott robscott force-pushed the endpointslice-proxy-endpointchange-perf branch from e59fdf7 to 8178d13 Compare October 7, 2019 20:23
@robscott
Copy link
Copy Markdown
Member Author

robscott commented Oct 8, 2019

/retest

Copy link
Copy Markdown
Contributor

@freehan freehan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/assign @freehan

The interface of EndpointSliceCache, EndpointTracker is not too isolated. Hence making the synchronization and locking more unclear. But it seems not changing the existing flow.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: consider enclose the critical section into a built in func

func(){
  ect.lock.Lock()
  defer ect.lock.Unlock()

  ...
}()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you have lock in endpointSliceCache already. Do you still need lock here?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After discussing this a bit more I think it makes sense to hold off on any additional changes here until endpointslicecache can be moved into its own package.

Computing EndpointChanges is a relatively expensive operation for
kube-proxy when Endpoint Slices are used. This had been computed on
every EndpointSlice update which became quite inefficient at high levels
of scale when multiple EndpointSlice update events would be triggered
before a syncProxyRules call.

Profiling results showed that computing this on each update could
consume ~80% of total kube-proxy CPU utilization at high levels of
scale. This change reduced that to as little as 3% of total kube-proxy
utilization at high levels of scale.

It's worth noting that the difference is minimal when there is a 1:1
relationship between EndpointSlice updates and proxier syncs. This is
primarily beneficial when there are many EndpointSlice updates between
proxier sync loops.
@robscott robscott force-pushed the endpointslice-proxy-endpointchange-perf branch from 8178d13 to 8e7de45 Compare October 15, 2019 23:31
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 15, 2019
Copy link
Copy Markdown
Contributor

@freehan freehan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 16, 2019
@k8s-ci-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: freehan, robscott

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 16, 2019
@k8s-ci-robot k8s-ci-robot merged commit 30ba9f6 into kubernetes:master Oct 16, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/network Categorizes an issue or PR as relevant to SIG Network. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants