Skip to content

Prow script for pilot-multicluster-e2e test#5516

Closed
tiswanso wants to merge 13 commits intoistio:masterfrom
tiswanso:noclusreg_e2e-suite
Closed

Prow script for pilot-multicluster-e2e test#5516
tiswanso wants to merge 13 commits intoistio:masterfrom
tiswanso:noclusreg_e2e-suite

Conversation

@tiswanso
Copy link
Copy Markdown
Contributor

@tiswanso tiswanso commented May 9, 2018

Script called by CI to runs multicluster e2e test. Uses mason to get GKE resources in
the same GCP project. The boskos resource_type already creates multiple GKE clusters. Use kubectl config get-contexts to determine all k8s clusters to use for the multicluster env.

The script can be run with pre-allocated clusters by having a pre-setup
kubeconfig for multiple clusters. It will choose the pilot cluster to
be the first cluster in the kubectl config get-contexts or the context set in the PILOT_CLUSTER env var.

Example for use with 2 precreated GKE clusters and designating "gke_istio-ci-testing_us-east1-c_cluster-1" context as the cluster to host the istio control-plane (pilot, etc).

PILOT_CLUSTER=gke_istio-ci-testing_us-east1-c_cluster-1 \
  KUBE_USER=tiswanso@cisco.com ROOT=$(pwd) USE_MASON_RESOURCE=False \
  prow/istio-pilot-multicluster-e2e.sh

for non-GKE clusters:

PILOT_CLUSTER=cluster-1 USE_GKE=False \
  KUBE_USER=tiswanso@cisco.com ROOT=$(pwd) USE_MASON_RESOURCE=False \
  prow/istio-pilot-multicluster-e2e.sh

@tiswanso
Copy link
Copy Markdown
Contributor Author

tiswanso commented May 9, 2018

/test istio-pilot-multicluster-e2e

@tiswanso
Copy link
Copy Markdown
Contributor Author

tiswanso commented May 9, 2018

e2e multicluster failed because the gke/gcp user associated with the prow job gke doesn't have permission to create firewall rules 👎 ... need to figure out whether the test jobs' user can get those permissions bumped or if we can make mason do it.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we running this with auth disabled. If this is temporary to get the easiest environment to prove out the process then I think it is fine and we can increment in a later PR, but with auth should be the default case.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure... was just trying to get it going with less complexity. I'll change it.

But for the CI the more pressing problem is the prow env's lack of permissions to set firewall rules in the projects/istio-testing GCP project.

@codecov
Copy link
Copy Markdown

codecov bot commented May 10, 2018

Codecov Report

Merging #5516 into master will increase coverage by 1%.
The diff coverage is n/a.

Impacted file tree graph

@@          Coverage Diff           @@
##           master   #5516   +/-   ##
======================================
+ Coverage      71%     71%   +1%     
======================================
  Files         369     369           
  Lines       32122   32040   -82     
======================================
- Hits        22607   22577   -30     
+ Misses       8588    8532   -56     
- Partials      927     931    +4
Impacted Files Coverage Δ
galley/pkg/mcp/client/client.go 71% <0%> (-29%) ⬇️
mixer/adapter/signalfx/cumulative.go 67% <0%> (-4%) ⬇️
pilot/pkg/serviceregistry/kube/controller.go 66% <0%> (-2%) ⬇️
mixer/adapter/list/stringList.go 100% <0%> (ø) ⬆️
mixer/adapter/denier/denier.go 100% <0%> (ø) ⬆️
mixer/adapter/servicecontrol/client.go 0% <0%> (ø) ⬆️
mixer/adapter/stdio/stdio.go 100% <0%> (ø) ⬆️
mixer/adapter/solarwinds/solarwinds.go 0% <0%> (ø) ⬆️
mixer/adapter/memquota/memquota.go 100% <0%> (ø) ⬆️
mixer/adapter/cloudwatch/client.go 0% <0%> (ø) ⬆️
... and 7 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7bba996...2ffe236. Read the comment docs.

@tiswanso
Copy link
Copy Markdown
Contributor Author

trying a run with removing use of gcloud compute instance-template list API and instead replacing with gcloud compute instances list since according to GCE IAM docs the compute instance-templates APIs are only available to compute-admin role. Possibly instance list and (maybe?) firewall-rules are available to the current role the istio-testing project user/service-account is set up for.

@tiswanso
Copy link
Copy Markdown
Contributor Author

/test istio-pilot-multicluster-e2e

@tiswanso
Copy link
Copy Markdown
Contributor Author

Run failed with same permissions error when trying with removing use of gcloud compute instance-template list API and instead replacing with gcloud compute instances list. Need to figure out how we can setup firewall rules... here or in boskos / mason.

@tiswanso tiswanso force-pushed the noclusreg_e2e-suite branch from 9336a14 to 45a16c4 Compare May 17, 2018 17:36
@istio-testing istio-testing added the needs-rebase Indicates a PR needs to be rebased before being merged label Jun 6, 2018
@tiswanso tiswanso force-pushed the noclusreg_e2e-suite branch from 45a16c4 to 3f77a92 Compare June 15, 2018 17:05
@istio-testing istio-testing removed the needs-rebase Indicates a PR needs to be rebased before being merged label Jun 15, 2018
@tiswanso tiswanso force-pushed the noclusreg_e2e-suite branch 3 times, most recently from c33c542 to e1b1004 Compare June 21, 2018 16:50
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be done by the framework.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sebastienvas -- by "framework" are you meaning Mason?

The main thing that needs to happen in unsetup_clusters is that the firewall rule needs to be removed from the project since Mason won't do that just by destroying clusters. The other stuff I added in so I could rerun the tests over and over without always deleting and recreating clusters in my own GKE proj.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant to go e2e framework.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The e2e framework won't be able to delete the gcloud firewall rule so something like unsetup_clusters is needed for GKE projs usage.

Also, the service-account stuff was done by cluster_lib.sh and unsetup_clusters removes that from the remote cluster. There's a PR to put the service account create inside the helm chart used to setup the remote cluster. But I'd like to get this merged and update the script later along with a fix to the e2e multicluster.go after the helm merges.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are actually running gcloud on istio-testing here, not on the project where the cluster are created.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch!

@sebastienvas
Copy link
Copy Markdown
Contributor

You can take my patch from #6549

@costinm costinm added this to the Nebulous Future milestone Jun 25, 2018
@tiswanso tiswanso force-pushed the noclusreg_e2e-suite branch from 8558f99 to 0373644 Compare June 26, 2018 15:07
@tiswanso
Copy link
Copy Markdown
Contributor Author

@sebastienvas -- I merged in your version of the fix for the proj from #6549

@tiswanso tiswanso force-pushed the noclusreg_e2e-suite branch from 0373644 to c05f944 Compare June 26, 2018 16:03
Copy link
Copy Markdown
Contributor

@sebastienvas sebastienvas Jun 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use {} around variable names for consistency.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment here about {}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what this is ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leftover comment ... will remove

I should be able to remove that GIT_SHA stuff as well since it's setup in e2e-suite.sh

@tiswanso
Copy link
Copy Markdown
Contributor Author

/test istio-pilot-multicluster-e2e

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know I did that but could please you change cat "${info_path}" | head -n 1 to head -n 1 "${info_path}"

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

Copy link
Copy Markdown
Contributor

@sebastienvas sebastienvas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see comments

@tiswanso tiswanso force-pushed the noclusreg_e2e-suite branch from 2d0cc28 to 2ffe236 Compare July 10, 2018 12:20
@istio-testing
Copy link
Copy Markdown
Collaborator

New changes are detected. LGTM label has been removed.

@tiswanso
Copy link
Copy Markdown
Contributor Author

rebased and lint failures finally went away :-)

@sebastienvas
Copy link
Copy Markdown
Contributor

@hklai this is ready to go in. Can we merge it ?

@hklai
Copy link
Copy Markdown
Contributor

hklai commented Jul 10, 2018

This will be merge once the 1.0 code freeze is over (hopefully next week), when we re-enable the merge bot.

And if this is needed for 1.0, please rebase and merge to the release-1.0 branch instead of master.

@rshriram rshriram added the stale label Jul 20, 2018
@rshriram rshriram closed this Jul 20, 2018
@hklai
Copy link
Copy Markdown
Contributor

hklai commented Jul 20, 2018

Sorry for dropping the ball here. Had an offline discussion with @sebastienvas and was wondering if this should go to releaae-1.0 instead of master.

@tiswanso is this still needed?

@tiswanso
Copy link
Copy Markdown
Contributor Author

@hklai -- Yes... this is still needed. It could go into release-1.0 instead. It is only effecting testing.

@tiswanso tiswanso changed the base branch from master to release-1.0 July 23, 2018 20:26
@tiswanso tiswanso changed the base branch from release-1.0 to master July 24, 2018 21:03
@istio-testing
Copy link
Copy Markdown
Collaborator

istio-testing commented Jul 24, 2018

@tiswanso: The following tests failed, say /retest to rerun them all:

Test name Commit Details Rerun command
prow/e2e-bookInfoTests.sh 7f1b99396f52e80b47068bb78aaccfd36c439858 link /test e2e-bookInfo
prow/istio-pilot-e2e.sh 7f1b99396f52e80b47068bb78aaccfd36c439858 link /test istio-pilot-e2e
prow/istio-pilot-multicluster-e2e.sh 7f025c0aaf4b66868d5ff0b73a02fbf38d1bc613 link /test istio-pilot-multicluster-e2e
prow/istio-pilot-e2e-envoyv2-v1alpha3.sh 2ffe236 link /test istio-pilot-e2e-envoyv2-v1alpha3
Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@tiswanso
Copy link
Copy Markdown
Contributor Author

I wasn't able to reopen this PR so I recreated the PR -- #7376

@rlenglet rlenglet modified the milestones: Nebulous Future, 1.1 Jul 9, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants