kubemark cluster starts partition servers in parallel by h-w-chen · Pull Request #1113 · CentaurusInfra/arktos

h-w-chen · 2021-07-06T20:56:17Z

What type of PR is this?
/kind feature

What this PR does / why we need it:
This PR starts the Arktos scale-out partitions in parallel when bringing up the kubemark cluster.

The current start-kubemark scripts start the scale-out partitions (TP or RP) sequentially, which is time-consuming when more than the minimum number of TP or RP servers is configured. There are various ways to reduce the cluster setup time and overhead; an obvious one is to start these partition servers in parallel, which can significantly reduce the setup time. This PR starts multiple TPs in parallel and, once all TPs are started, starts multiple RPs in parallel.

This PR does not try to start TPs and RPs at the same time.
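The ordering described above (all TPs in parallel, then all RPs in parallel) can be sketched with background jobs and `wait`. This is only an illustration, not the code in this PR: `start_partition` and `start_partitions_in_parallel` are hypothetical names, and the `sleep` stands in for the real provisioning work.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for the real per-partition startup routine.
start_partition() {
  local kind="$1" index="$2"
  sleep 1   # simulate slow provisioning
  echo "${kind}-${index} started"
}

# Start COUNT partitions of one kind as background jobs, then wait for
# every job and report failure if any of them failed.
start_partitions_in_parallel() {
  local kind="$1" count="$2"
  local pids=() failed=0 i pid
  for (( i = 1; i <= count; i++ )); do
    start_partition "${kind}" "${i}" &
    pids+=("$!")
  done
  for pid in "${pids[@]}"; do
    wait "${pid}" || failed=1
  done
  return "${failed}"
}

# TPs first; RPs begin only after every TP has finished starting.
start_partitions_in_parallel tp 2
start_partitions_in_parallel rp 2
```

Waiting on each PID individually, rather than calling a bare `wait`, lets the caller notice when any single partition fails to start.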

Does this PR introduce a user-facing change?:
NONE

cluster/gce/util.sh

yb01 · 2021-07-07T17:56:00Z

test/kubemark/start-kubemark.sh

 #
 MASTER_METADATA=""

+### to upload etcd image / binary tar once


test_resource_upload() could be called at the very beginning of the test setup to ensure the upload is done, so that this check can be removed.

The admin cluster also needs this upload, and it is set up directly by the kube-up.sh script. Starting the partition clusters eventually calls into kube-up.sh as well, so unless we want to change this structure, we need some sort of condition check in place.
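One way such a condition check could look is a guard around the upload, so that whichever caller reaches it first does the work and later callers skip it. This is a sketch under assumptions, not the check used in the PR: `RESOURCES_UPLOADED`, `upload_resources` and `upload_resources_once` are hypothetical names.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical marker; the real scripts may track this differently.
RESOURCES_UPLOADED="${RESOURCES_UPLOADED:-false}"

# Hypothetical stand-in for uploading the etcd image / binary tar.
upload_resources() {
  echo "uploading resources"
}

# Upload only if no earlier caller has already done so. The marker is
# exported so that child processes started afterwards inherit it.
upload_resources_once() {
  if [[ "${RESOURCES_UPLOADED}" == "true" ]]; then
    echo "upload skipped"
    return 0
  fi
  upload_resources
  export RESOURCES_UPLOADED=true
}

upload_resources_once   # first caller: performs the upload
upload_resources_once   # later caller: skipped
```

A marker like this only helps if the first upload completes before any parallel jobs are forked, since background jobs cannot update the parent shell's variables.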

test/kubemark/start-kubemark.sh

yb01

/lgtm
with a few minor comments

zmn223 · 2021-07-13T17:24:00Z

/lgtm

zmn223 · 2021-07-13T19:43:31Z

/lgtm

zmn223 · 2021-07-13T19:43:38Z

/approve

centaurus-cloud-bot · 2021-07-13T19:43:42Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: yb01, zmn223

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [zmn223]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

* design doc: DaemonSet support in scale-out Arktos (#1109)
  * doc: DaemonSet support in scale-out Arktos
  * minor: rephrased daemonset managability of failed TP
  * added notes of scale-up arktos supporting system tenanted DS
  * put daemonset design doc in separate file
  * trivial: resource manager reworded as resource partition
  * added design alternatives based on peer feedback
  * emphasized on DS and supporting resources in unit of TP; put more detail of kubelet secret/configmap local store changes
  * added clarifications based on peer feedback
  * minor: revised based on peer review
* kubemark cluster starts partition servers in parallel (#1113)
  * multiple partitions of same kind (tp/rp) able to start in parallel
  * eliminates /tmp/saved_tenant_ips.txt and TP_IP_CONCAT var from kubemark setup scripts
  * minor: todo comments for dedicated log stream of parallel calls
* Bump Arktos to v0.8.0 (#1116)
* Fix a bug that event client was created with wrong user agent (#1120)

Co-authored-by: hwchen <hong.chen@futurewei.com>

Hongwei Chen added 2 commits July 6, 2021 11:33

multiple partitions of same kind (tp/rp) able to start in parallel

854d031

eliminates /tmp/saved_tenant_ips.txt and TP_IP_CONCAT var from kubema…

9c5c178

…rk setup scripts

h-w-chen requested review from Sindica, q131172019 and yb01 July 6, 2021 20:56

centaurus-cloud-bot added the size/L label Jul 6, 2021

h-w-chen changed the title ~~Hw kubemark start xp in parallel~~ kubemark cluster starts partition servers in parallel Jul 6, 2021

yb01 reviewed Jul 7, 2021

cluster/gce/util.sh

yb01 reviewed Jul 7, 2021

test/kubemark/start-kubemark.sh

yb01 reviewed Jul 7, 2021

test/kubemark/start-kubemark.sh

yb01 approved these changes Jul 7, 2021

centaurus-cloud-bot assigned yb01 Jul 7, 2021

centaurus-cloud-bot added the lgtm label Jul 7, 2021

minor: todo comments for dedicated log stream of parallel calls

e352d9c

centaurus-cloud-bot removed the lgtm label Jul 8, 2021

centaurus-cloud-bot assigned zmn223 Jul 13, 2021

centaurus-cloud-bot added the lgtm label Jul 13, 2021

Merge branch 'master' into hw-kubemark-start-xp-in-parallel

ae5322d

centaurus-cloud-bot removed the lgtm label Jul 13, 2021

centaurus-cloud-bot added the lgtm label Jul 13, 2021

centaurus-cloud-bot added the approved label Jul 13, 2021

centaurus-cloud-bot merged commit 1dfd6c4 into CentaurusInfra:master Jul 13, 2021

h-w-chen deleted the hw-kubemark-start-xp-in-parallel branch July 13, 2021 19:58

centaurus-cloud-bot merged 4 commits into CentaurusInfra:master from h-w-chen:hw-kubemark-start-xp-in-parallel

4 participants
