
[vm_set] Reduce the testbed-cli.sh start-vms time from 3 hours to 20 minutes#962

Merged
liat-grozovik merged 2 commits intosonic-net:masterfrom
wangxin:start-vms
Jul 9, 2019

Conversation

@wangxin
Collaborator

@wangxin wangxin commented Jun 17, 2019

Description of PR

Summary:
Fixes # (issue)

The original approach starts and configures the VMs sequentially.
It takes more than 3 hours to start 32 virtual machines. This
change is to start all the VMs, then configure them one by one.
With this change, starting 32 VMs needs around 20 minutes.

Another change in this commit is to configure 'autostart' for
VMs so that the VMs will automatically start running after host
server is rebooted.

Type of change

  - [x] Bug fix
  - [ ] Testbed and Framework(new/improvement)
  - [ ] Test case(new/improvement)

Approach

How did you do it?

Changed the approach from starting and configuring the VMs one by one to starting all the VMs in a batch, then configuring them one by one.
After a VM is started, run "virsh autostart <VM_name>" to set it to autostart.
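The reordering above can be sketched as follows. This is a hypothetical illustration of the control flow, not the actual Ansible playbook; the action names and `old_flow`/`new_flow` helpers are invented for clarity:

```python
# Hypothetical sketch (not the sonic-mgmt playbook itself): the old flow
# interleaves start and configure per VM, so each slow boot blocks the
# next start; the new flow triggers every start first, then configures.
def old_flow(vms):
    ops = []
    for vm in vms:
        ops.append(("start", vm))      # virsh start <vm>
        ops.append(("configure", vm))  # per-VM configuration blocks the next start
    return ops

def new_flow(vms):
    ops = [("start", vm) for vm in vms]       # all starts are triggered up front
    ops += [("autostart", vm) for vm in vms]  # virsh autostart <vm>, survives host reboot
    ops += [("configure", vm) for vm in vms]  # kickstart each VM while others boot in parallel
    return ops
```

Because the boots overlap in `new_flow`, total wall-clock time drops from roughly (boot + configure) x N to roughly boot + configure x N.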

How did you verify/test it?

Verified in Mellanox lab.

Any platform specific information?

NA

Supported testbed topology if it's a new test case?

Documentation

The original approach starts and configures the VMs sequentially.
It takes more than 3 hours to start 32 virtual machines. This
change is to start all the VMs, then configure them one by one.
With this change, starting 32 VMs needs around 30-40 minutes.

Another change in this commit is to configure 'autostart' for
VMs so that the VMs will automatically start running after host
server is rebooted.

Signed-off-by: Xin Wang <xinw@mellanox.com>
@wangxin wangxin changed the title [vm_set] Improve the start-vms performance [vm_set] Reduce the testbed-cli.sh start-vms time from 3 hours to 20 minutes Jun 20, 2019
Contributor

@pavel-shirshov pavel-shirshov left a comment


Thank you for your contribution

  1. Can you please add a batch size for starting VMs? When I tested this some years ago, starting more than 4 VMs in parallel got the whole host stuck because of I/O contention. So let's allow others to start VMs one at a time, two at a time, and so on.

  2. Please make the autostart option optional. If someone needs it, they can enable it; otherwise the old behavior is kept.

@pavel-shirshov pavel-shirshov self-assigned this Jun 26, 2019
@wangxin
Collaborator Author

wangxin commented Jun 26, 2019

@pavel-shirshov Thanks for your suggestions. I'll update and push a new commit.

@wangxin
Collaborator Author

wangxin commented Jul 4, 2019

@pavel-shirshov

I added two enhancements in the new commit:

  1. Add batch_size support for starting VMs. For example:
./testbed-cli.sh start-vms server_name vault -e batch_size=2 -e interval=60
  • Parameter batch_size is for specifying the number of VMs to be started in a batch. If the parameter is not specified, all VMs will be started one by one.
  • Parameter interval is for specifying the number of seconds to wait between starting each batch of VMs.

A better solution would have been to trigger the start of a batch of VMs, wait until that batch has started, do the basic configuration on them (kickstart them), and only then move on to the next batch. That solution requires a nested loop: the outer loop over the batches of VMs, the inner loop over each VM in a batch. However, the Ansible version (v2.0) used in sonic-mgmt has an issue that blocks us from using nested loops: ansible/ansible#14146. So I used a simple workaround: trigger the start of a batch of VMs, then pause for some time. After the starts of all VMs have been triggered, kickstart them.

  2. Make autostart optional. By default, autostart will not be set.
    Example:
./testbed-cli.sh start-vms server_name vault -e autostart=yes
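The batch_size/interval workaround can be sketched as below. This is an illustrative model of the scheduling logic only (the `plan_start` helper and action tuples are invented here, not part of the playbook); it shows starts fired in batches with a pause between them, and kickstart deferred until every start has been triggered:

```python
# Hypothetical sketch of the batch_size/interval workaround described
# above (not the actual Ansible tasks): trigger VM starts in batches of
# `batch_size`, pausing `interval` seconds between batches, and only
# kickstart (configure) the VMs after all starts have been triggered.
def plan_start(vms, batch_size=None, interval=60):
    """Return the ordered actions the playbook would perform."""
    actions = []
    if batch_size is None:
        # No batch_size given: trigger all starts one by one, no pauses.
        actions = [("start", vm) for vm in vms]
    else:
        batches = [vms[i:i + batch_size] for i in range(0, len(vms), batch_size)]
        for i, batch in enumerate(batches):
            actions += [("start", vm) for vm in batch]
            if i < len(batches) - 1:
                # Simple pause instead of "wait until booted"; a nested
                # loop per batch is blocked by ansible/ansible#14146.
                actions.append(("pause", interval))
    # Kickstart every VM only after all starts are in flight.
    actions += [("kickstart", vm) for vm in vms]
    return actions
```

The pause throttles I/O contention on the host (the concern raised in review) without needing the nested batch-then-wait loop that Ansible v2.0 cannot express.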

@liat-grozovik liat-grozovik merged commit bbe1111 into sonic-net:master Jul 9, 2019
yxieca pushed a commit that referenced this pull request Jul 18, 2019
…minutes (#962)

* [vm_set] Improve the start-vms performance

The original approach starts and configures the VMs sequentially.
It takes more than 3 hours to start 32 virtual machines. This
change is to start all the VMs, then configure them one by one.
With this change, starting 32 VMs needs around 30-40 minutes.

Another change in this commit is to configure 'autostart' for
VMs so that the VMs will automatically start running after host
server is rebooted.

Signed-off-by: Xin Wang <xinw@mellanox.com>

* Add batch_size support, make autostart optional
@wangxin wangxin deleted the start-vms branch September 26, 2019 12:36
fraserg-arista pushed a commit to fraserg-arista/sonic-mgmt that referenced this pull request Feb 24, 2026

### Description of PR
Skip v4 neighbor checks for the v6 topo.
Add a new lookback_ipv6 fixture because the IPv6 loopback IP is not directly used for route advertisement. Also add test_bgp_router_id_set_ipv6 for the v6 topo only.
Delete xfail and add a skip for test_bgp_router_id_set/test_bgp_router_id_set_ipv6 based on v6/non-v6 topo.

Summary:
Fixes sonic-net#21454

### Type of change


- [ ] Bug fix
- [ ] Testbed and Framework(new/improvement)
- [ ] New Test case
 - [ ] Skipped for non-supported platforms
- [x] Test case improvement

### Back port request
- [x] 202412
- [x] 202505

### Approach
#### What is the motivation for this PR?
test_bgp_router_id.py failed on v6 topo

#### How did you do it?
Skip v4 neighbor checks for v6 topo
Get correct IPv6 announced routes
Delete xfail and add proper skip
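The xfail-to-skip change above can be sketched as a small predicate. This is a hypothetical illustration: the real sonic-mgmt fixtures and topology naming may differ, and `is_v6_topo`/`skip_reason` are names invented here:

```python
# Hypothetical sketch of the skip logic (actual sonic-mgmt fixture and
# topology names may differ): instead of marking the tests xfail, skip
# each router-id test explicitly based on whether the topo is v6.
def is_v6_topo(topo_name):
    # Illustrative assumption: v6 topologies carry a "v6" tag in the name.
    return "v6" in topo_name

def skip_reason(test_name, topo_name):
    """Return a skip reason when the test does not apply to this topo, else None."""
    if test_name == "test_bgp_router_id_set" and is_v6_topo(topo_name):
        return "IPv4 router-id check does not apply on a v6 topo"
    if test_name == "test_bgp_router_id_set_ipv6" and not is_v6_topo(topo_name):
        return "IPv6 router-id check only runs on a v6 topo"
    return None
```

A skip makes the intent explicit (the test is out of scope for that topo), whereas xfail would hide a real regression on topologies where the test should pass.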

#### How did you verify/test it?
The test passed after the fix

#### Any platform specific information?

#### Supported testbed topology if it's a new test case?

### Documentation
