Bump up timeout for pilot e2e tests #3653

Merged
istio-merge-robot merged 1 commit into istio:master from nmittler:test-timeout
Feb 21, 2018

Conversation

@nmittler
Contributor

Tests on prow seem to be timing out with 20m. Bumping up to 30m.

@nmittler nmittler requested review from a team and ldemailly February 21, 2018 18:01
@ldemailly
Member

/lgtm
did you look into why it is taking so long? maybe there is something wrong?

@istio-merge-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ldemailly

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@nmittler
Contributor Author

@ldemailly it looks like prow is running really slowly for some reason. Also, #3516 (merged on 2/16) fixed the fact that we were not previously running the auth tests. Now that we're running the full suite of pilot tests, it's clear that the auth tests are taking a ridiculously long time on prow. When I run them locally on a GKE cluster, all of the tests take ~5 min (with the same parameters as prow).

@nmittler
Contributor Author

/test istio-presubmit

@istio-merge-robot

/test all [submit-queue is verifying that this PR is safe to merge]

@ldemailly
Member

can you raise this with @sebastienvas @chxchx

cc @hklai

@istio-merge-robot

Automatic merge from submit-queue.

@istio-merge-robot istio-merge-robot merged commit 6c299dc into istio:master Feb 21, 2018
@nmittler
Contributor Author

@sebastienvas @chxchx @hklai

I've just managed to replicate the slowness of the pilot e2e tests from my machine on a GKE cluster. The problem appears to be that our test cluster has only 1 node. Previously, I was using a cluster with 4 nodes (nodeType=n1-standard-4) and the tests completed in ~5 min. With an otherwise identical cluster running a single node, the same tests took 1 h 12 min and several of them failed.

As a short-term solution, we could bump up the cluster size for these tests to something reasonable (e.g. 4 nodes).

Going forward, we might consider using a large shared cluster, separating the test resources by namespace.

WDYT?

@nmittler
Contributor Author

I've created a short-term fix in #3663

@istio-testing
Collaborator

@nmittler: The following test failed, say /retest to rerun them all:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| prow/istio-pilot-e2e.sh | 546ab38 | link | /test istio-pilot-e2e |

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@ldemailly
Member

thanks for the investigation. in the past we've had the issue where some tasks were run on prow itself instead of being scheduled on the target cluster. is this another example of that?
(though I guess 1 node is not much either way)

@nmittler
Contributor Author

@ldemailly I think so ... the logs showed that the cluster had a single node.

@nmittler
Contributor Author

@ldemailly I've added details in #3663 (comment)

@ldemailly
Member

ty!
