Skip to content

Optimize route reconciliation logic#2090

Merged
k8s-ci-robot merged 1 commit intokubernetes:masterfrom
kayrus:optimize-ip-resolve
Mar 2, 2023
Merged

Optimize route reconciliation logic#2090
k8s-ci-robot merged 1 commit intokubernetes:masterfrom
kayrus:optimize-ip-resolve

Conversation

@kayrus
Copy link
Copy Markdown
Contributor

@kayrus kayrus commented Feb 8, 2023

What this PR does / why we need it:

the openstack.Routes' ListRoutes method must not loop over all project servers just to resolve a couple of route's next hops to node names. This PR resolves the node name only when it's found in the route's next hop using cached k8s API calls for nodes listing.

Which issue this PR fixes(if applicable):
fixes #2089

Special notes for reviewers:

Release note:

Improve performance of the `ListRoutes` method, especially when an OpenStack project contains hundreds of non-k8s nodes

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Feb 8, 2023
@k8s-ci-robot
Copy link
Copy Markdown
Contributor

Hi @kayrus. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Feb 8, 2023
@jichenjc
Copy link
Copy Markdown
Contributor

jichenjc commented Feb 9, 2023

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Feb 9, 2023
@jichenjc
Copy link
Copy Markdown
Contributor

jichenjc commented Feb 9, 2023

still fixing gate, we can continue the code review before that..

@kayrus kayrus force-pushed the optimize-ip-resolve branch 2 times, most recently from e8feac3 to 9fef59e Compare February 10, 2023 18:14
@kayrus kayrus changed the title Resolve only existing route hops in ListRoutes method Optimize route reconciliation logic Feb 10, 2023
@kayrus kayrus force-pushed the optimize-ip-resolve branch from 9fef59e to f93f0da Compare February 16, 2023 13:15
@kayrus
Copy link
Copy Markdown
Contributor Author

kayrus commented Feb 16, 2023

/retest-required

@kayrus kayrus mentioned this pull request Feb 16, 2023
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 17, 2023
@kayrus kayrus force-pushed the optimize-ip-resolve branch from f93f0da to a413699 Compare February 17, 2023 09:43
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 17, 2023
@zetaab
Copy link
Copy Markdown
Member

zetaab commented Feb 18, 2023

/test openstack-cloud-csi-manila-e2e-test

@zetaab
Copy link
Copy Markdown
Member

zetaab commented Feb 20, 2023

/override openstack-cloud-csi-manila-e2e-test

manila is broken currently

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

@zetaab: Overrode contexts on behalf of zetaab: openstack-cloud-csi-manila-e2e-test

Details

In response to this:

/override openstack-cloud-csi-manila-e2e-test

manila is broken currently

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@kayrus
Copy link
Copy Markdown
Contributor Author

kayrus commented Feb 20, 2023

/test openstack-cloud-csi-manila-e2e-test

@kayrus
Copy link
Copy Markdown
Contributor Author

kayrus commented Feb 20, 2023

Manila e2e tests are broken. What is the process for the PR review?

@jichenjc
Copy link
Copy Markdown
Contributor

/override openstack-cloud-csi-manila-e2e-test

that's something we need fix later, for now, let's ignore this

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

@jichenjc: Overrode contexts on behalf of jichenjc: openstack-cloud-csi-manila-e2e-test

Details

In response to this:

/override openstack-cloud-csi-manila-e2e-test

that's something we need fix later, for now, let's ignore this

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@kayrus
Copy link
Copy Markdown
Contributor Author

kayrus commented Feb 21, 2023

@jichenjc could you please review this PR? Or do I need to wait for assigned reviewers approval?

@jichenjc
Copy link
Copy Markdown
Contributor

no, I will read this along with other reviewers, will do soon, thanks for the reminder

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

@jichenjc: Overrode contexts on behalf of jichenjc: openstack-cloud-csi-manila-e2e-test

Details

In response to this:

/override openstack-cloud-csi-manila-e2e-test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jichenjc
Copy link
Copy Markdown
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 28, 2023
@kayrus
Copy link
Copy Markdown
Contributor Author

kayrus commented Mar 1, 2023

Copy link
Copy Markdown
Member

@zetaab zetaab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some comments, but like mentioned - I do not use this feature so I have not 100% understanding what is even needed

return nil, err
}

var networkIDs []string
Copy link
Copy Markdown
Member

@zetaab zetaab Mar 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this does not contain unique values for networkid? is it possible to have duplicate values?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

duplicate values are less probable, since router cannot handle the same network more than once.


// detect router's private network ID for further VM ports filtering
r.Lock()
r.networkIDs, err = getRouterNetworkIDs(r.network, r.os.routeOpts.RouterID)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this really changing so often that we need to fetch this all the time? In general networks are not changing quite often? (spoiler: I am not using this feature at all)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a low probability that this may happen. Nevertheless, I decided to be on the safe side, since it's just one API call.

Copy link
Copy Markdown
Member

@zetaab zetaab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: zetaab

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 2, 2023
@kayrus
Copy link
Copy Markdown
Contributor Author

kayrus commented Mar 2, 2023

I do not use this feature so I have not 100% understanding what is even needed

hm. IMHO this is used in every openstack based k8s cluster, and the router-id is a required config option.

@zetaab
Copy link
Copy Markdown
Member

zetaab commented Mar 2, 2023

its not? example from kops https://github.com/kubernetes/kops/blob/master/upup/pkg/fi/cloudup/openstack/cloud.go#L835-L892 like I said we are not using any "router" things from occm. Instead we do have CNIs

@kayrus
Copy link
Copy Markdown
Contributor Author

kayrus commented Mar 2, 2023

I was confused by

RouterID string `gcfg:"router-id"` // required
and
return nil, errors.ErrNoRouterID
comment.

And apologies, I've never used kops.

@zetaab
Copy link
Copy Markdown
Member

zetaab commented Mar 2, 2023

Interesting, why that is required? Why anyone should use it? What is the purpose of it? I have used cloud provider openstack for 7 years without that.

I0302 21:25:15.943948       1 openstack.go:439] openstack.Routes() called
W0302 21:25:16.355725       1 openstack.go:466] Error initialising Routes support: router-id not set in cloud provider config

but everything works like should?

actually after that in next row https://github.com/kubernetes/cloud-provider-openstack/blob/master/pkg/openstack/openstack.go#L467 it just disables the cloudprovider.Routes support.

@kayrus
Copy link
Copy Markdown
Contributor Author

kayrus commented Mar 2, 2023

but everything works like should?

in my test environment the amount of API calls has been decreased by an order of magnitude, if not by two.

Could you please skip openstack-cloud-csi-manila-e2e-test?

@zetaab
Copy link
Copy Markdown
Member

zetaab commented Mar 2, 2023

/override openstack-cloud-csi-manila-e2e-test

yeah but I mean, the interesting question is that why that is marked as required if its not really needed.

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

@zetaab: Overrode contexts on behalf of zetaab: openstack-cloud-csi-manila-e2e-test

Details

In response to this:

/override openstack-cloud-csi-manila-e2e-test

yeah but I mean, the interesting question is that why that is marked as required if its not really needed.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@kayrus
Copy link
Copy Markdown
Contributor Author

kayrus commented Mar 2, 2023

yeah but I mean, the interesting question is that why that is marked as required if its not really needed.

I'll cover this in my next PR: #2134. Thank you for paying attention on this.

@k8s-ci-robot k8s-ci-robot merged commit c8b8494 into kubernetes:master Mar 2, 2023
@kayrus kayrus deleted the optimize-ip-resolve branch March 3, 2023 06:48
@kayrus
Copy link
Copy Markdown
Contributor Author

kayrus commented Mar 3, 2023

@zetaab @jichenjc what is the process for backporting this fix to earlier releases?

@jichenjc
Copy link
Copy Markdown
Contributor

jichenjc commented Mar 3, 2023

just cherry pick ,but we ususally backport a bug instead of a feature ..

@kayrus
Copy link
Copy Markdown
Contributor Author

kayrus commented Mar 3, 2023

@jichenjc well, our API endpoints are overloaded by useless API requests, which cause nasty performance issues and sometimes outages. Let me try to cherry-pick and let's see how will it go.

P.S. #2113 doesn't look like a real bug as well, it's an improvement.

kayrus added a commit to kayrus/cloud-provider-openstack that referenced this pull request Mar 3, 2023
kayrus added a commit to kayrus/cloud-provider-openstack that referenced this pull request Mar 3, 2023
kayrus added a commit to kayrus/cloud-provider-openstack that referenced this pull request Mar 3, 2023
Comment on lines +149 to +152
if needIPv6 && ip.To4() == nil {
return v.Address
}
return v.Address
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This if-branch is ineffective because the return value is the same regardless of whether the condition is true.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, good catch!
Fixed in atomic routes PRs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[oocm] generates an excessive number of os-interface api requests

5 participants