Merge Azure Master into IGNW Master #12
Merged
joeslazaro merged 62 commits into IGNW:master on Jul 25, 2019
* Extend warm-reboot test to include the BGP sad pass
Create MIRRORV6 ACL table by default Signed-off-by: Shu0T1an ChenG <shuche@microsoft.com>
* Do not crash in case data plane never stop on fast-reboot
…y smaller values (#946)
Stabilize the test by adding a pause after the route change Signed-off-by: Shu0T1an ChenG <shuche@microsoft.com>
* preboot LAG sad path automation for neigh_lag_down and dut_lag_down scenarios
* [vnet_vxlan]: Enhance vnet_vxlan to test ipv6 vxlan tunnels Signed-off-by: Anish Narsian anish.narsian@microsoft.com
…n when deleting config_service_acls.sh (#961)
* Fix testbed_mtu for tasks that invoke fib_test * Set socket buffer size to 16k
In case the config reload operation takes longer than the PTF script's running time, checking PTF script PID after config reload may fail. This change is to improve the robustness of checking PTF script PID after config reload.
- Improve the data test warm up code: Let the data plane IO stabilize for 30 seconds before testing. We observed ptf instability causing the test to fail.
- Remove config_db.json when fast-rebooting into a new image. We want the new image to reload minigraph in this case.
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
* [platform] Implement platform phase 1 cases
  Signed-off-by: Xin Wang <xinw@mellanox.com>
* [platform] Add mellanox_psu_controller.py
  Changes:
  * Add mellanox_psu_controller.py, which has the Mellanox implementation of the PSU controller.
  * Increase the delay between resetting the SFP and checking SFP presence so that the SFP can fully recover.
  * Improve the checking of PSU status.
  * Correct spelling errors.
  Signed-off-by: Xin Wang <xinw@mellanox.com>
* [platform] Improve scripts according to review comments
  * Replace inline command strings with predefined variables
  * Add test case for testing SFP low power mode
  Signed-off-by: Xin Wang <xinw@mellanox.com>
* [platform] Fix the issue of comparing syseeprom output
  The order of the information output by the command "show platform syseeprom" is not guaranteed. This commit improves the method of comparing the content output by the syseeprom plugin and the show command to avoid failures caused by inconsistent output order.
  Signed-off-by: Xin Wang <xinw@mellanox.com>
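The order-independent comparison described in the syseeprom commit above could look roughly like the sketch below. It is only an illustration and not the code from the commit: the function names and the sample lines are hypothetical, and the real output format of "show platform syseeprom" is not reproduced here.

```python
def normalize(output):
    """Return the non-empty, whitespace-trimmed lines of an output, sorted."""
    return sorted(line.strip() for line in output.splitlines() if line.strip())


def same_content(plugin_output, show_output):
    """Compare two outputs line by line while ignoring the order of the lines."""
    return normalize(plugin_output) == normalize(show_output)


# Hypothetical sample data: the same fields reported in a different order.
plugin_output = "Serial Number: ABC123\nProduct Name: demo-switch\n"
show_output = "Product Name: demo-switch\nSerial Number: ABC123\n"
print(same_content(plugin_output, show_output))
```

Sorting the lines, rather than converting them to a set, keeps duplicate lines significant, so an output that repeats a field is not treated as equal to one that reports it once.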
…analyzer.yml (#963) The copy files task was after the fail tests. In case of failure, the copy task would never get a chance to run. This commit adjusted the task sequence. In case of failure, copy the files, then fail the test. The original copy task copies files with deep folder structure. This issue was also fixed in this commit. Signed-off-by: Xin Wang <xinw@mellanox.com>
…' state (#951) Recently the bgp test has frequently failed due to a non-established neighbor state. We strongly suspect a topo deployment issue that leaves the DUT unable to learn the arp/nd of its bgp neighbor. Displaying ip neigh info lets us easily distinguish this cause from others, which saves diagnostic effort.
Signed-off-by: Shu0T1an ChenG <shuche@microsoft.com>
Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>
* [fdb_mac_expire.yml]: FDB MAC Expire test case. [fdb_mac_expire_test.py]: PTF helper to add Mac in L2 table. [fdb.yml]: include fdb_mac_expire.yml.
  This test case verifies that MAC expires within 10 mins if traffic is not flowing using it.
  Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
* [fdb_mac_expire.yml]: FDB MAC Expire test case. [fdb_mac_expire_test.py]: PTF helper to add Mac in L2 table. [testcases.yml]: include fdb_mac_expire.yml.
  This test case verifies that MAC expires within 10 mins if traffic is not flowing using it.
  Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
* [fdb_mac_expire.yml]: Incorporate swssconfig step to set fdb_aging_timer in fdb_mac_expire.yml
  Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
* [fdb_mac_expire.yml]: minor changes in logs
  Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
* [fdb_mac_expire.yml]: minor log changes to show time correctly. Example: "MAC Entires are Cleared within 100 secs." instead of "MAC Entires are Cleared within 2*50 secs."
  Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
* [fdb_mac_expire.yml]: Address review comments related to sonic-clear, -it option and block-always.
  Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
* [fdb_mac_expire.yml]: Change "sonic-clear fdb all" to "Clear FDB table".
  Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
…ore rebooting (#975) - fast-reboot script is an adapted version from 201811 branch. The change is around syncd stop: in 201803 branch, if it is Broadcom platform, request syncd to perform cold shutdown. - Mellanox 201803 branch has a vlan FDB issue causing all vlan IO to flood. Add a knob allow_vlan_flooding to ignore this symptom and continue with fast-reboot. Signed-off-by: Ying Xie <ying.xie@microsoft.com>
…iC update (#970) * Added New test case to verify MAC addr is correct after SONiC to SONiC update. * Added fixes and additional verifications. * Added missed fix. * Increased reboot wait timeout in test case to 300 sec.
Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>
…#968) * fix grep ipv6 addr issue * Add Mellanox onyx fanout switch deploy yml and template * fix typo * remove debug code * revert the change to check_pfcwd_fanout.yml and deploy_pfcwd_fanout.yml * fix typo
Signed-off-by: Shu0T1an ChenG <shuche@microsoft.com>
Running the dir_bcast test case with the topology 't0-16' or 't0-56' fails with the error below:
task path: /var/sonic/sonic-mgmt/ansible/roles/test/tasks/dir_bcast.yml:9
fatal: [str-dut-01]: FAILED! => {"changed": false, "failed": true, "invocation": {"module_args": {"msg": "testbed_type t0-64-32 is invalid."}, "module_name": "fail"}, "msg": "testbed_type t0-56is invalid."}
* [continuous_link_flap.yml]: Continuous link flap test.
  In this test:
  1.) Flap all interfaces one by one to cause BGP flaps (3 iterations).
  2.) Flap all interfaces on the peer (FanOutLeaf) one by one to cause BGP flaps (3 iterations).
  3.) Watch memory (show system-memory), orchagent CPU utilization and Redis_memory.
  Pass criteria: All routes must be re-learned with < 5% increase in Redis memory and with orchagent CPU consumption below 10% after 3 mins of stopping flaps.
* [continuous_link_flap.yml]: Address review comments for orchagent cpu.
* [continuous_link_flap_helper.yml]: use config cli instead of ifdown/up. [testcases.yml]: run only on t1 to bring l3 interfaces down.
* [testcases.yml]: Add T0 topo to supported topology for Continuous link flap test.
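The numeric part of the pass criteria above (Redis memory and orchagent CPU) can be expressed as a small check. This is a sketch under stated assumptions rather than the logic in continuous_link_flap.yml: the function name, parameter names and units are hypothetical, and the "all routes re-learned" condition is left out.

```python
def link_flap_passed(redis_mem_before, redis_mem_after, orchagent_cpu_percent,
                     max_mem_increase_percent=5.0, max_cpu_percent=10.0):
    """Return True if Redis memory grew by less than the allowed percentage
    and orchagent CPU usage is below the allowed percentage."""
    if redis_mem_before <= 0:
        raise ValueError("redis_mem_before must be positive")
    # Relative growth of Redis memory compared with the pre-test baseline.
    increase_percent = (redis_mem_after - redis_mem_before) * 100.0 / redis_mem_before
    return (increase_percent < max_mem_increase_percent
            and orchagent_cpu_percent < max_cpu_percent)


# Hypothetical measurements taken before the flaps and 3 minutes after they stop.
print(link_flap_passed(redis_mem_before=1000, redis_mem_after=1040,
                       orchagent_cpu_percent=3.0))
```

Both comparisons are strict, matching the "< 5%" and "below 10%" wording of the commit message.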
* [loganalyzer] Fix the IOError of opening match file * [loganalyzer] Improve the logic of generating match file option
…minutes (#962)
* [vm_set] Improve the start-vms performance
  The original approach starts and configures the VMs sequentially. It takes more than 3 hours to start 32 virtual machines. This change starts all the VMs first, then configures them one by one. With this change, starting 32 VMs takes around 30-40 minutes. Another change in this commit is to configure 'autostart' for the VMs so that they start running automatically after the host server is rebooted.
  Signed-off-by: Xin Wang <xinw@mellanox.com>
* Add batch_size support, make autostart optional
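The "start everything, then configure one by one" ordering, combined with a batch size, might be sketched as follows. This is an assumption about how batching could work, not the vm_set implementation: `batches`, `start_and_configure`, and the two callbacks are hypothetical names, and the VM names are placeholders.

```python
def batches(items, batch_size=None):
    """Yield consecutive slices of items; batch_size=None yields one batch with everything."""
    if batch_size is None:
        batch_size = len(items) or 1
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]


def start_and_configure(vm_names, start_vm, configure_vm, batch_size=None):
    """Start every VM in a batch first, then configure the VMs of that batch one by one."""
    for batch in batches(vm_names, batch_size):
        for vm in batch:
            start_vm(vm)        # kick off the boot without waiting for configuration
        for vm in batch:
            configure_vm(vm)    # configure once the whole batch has been started


# Placeholder callbacks that only record the order of operations.
events = []
start_and_configure(["vm-a", "vm-b", "vm-c"],
                    start_vm=lambda vm: events.append(("start", vm)),
                    configure_vm=lambda vm: events.append(("configure", vm)),
                    batch_size=2)
print(events)
```

The time saving comes from the VMs booting in parallel while earlier ones are being configured; a batch size bounds how many VMs boot at once on a resource-limited host.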
The dhcp_relay daemon uses the port interface's alias, not the port interface's name, to fill the option82 circuit ID. The current yml uses the members of minigraph_vlans as client port alias names, but minigraph_vlans contains only port interface names. Therefore, use minigraph_port_name_to_alias_map to obtain the port interface's alias.
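The name-to-alias lookup described above can be illustrated with a short sketch. The variable `minigraph_port_name_to_alias_map` comes from the commit message; the function name and all sample values (port names, aliases) are made up for illustration and do not reflect any particular platform.

```python
def vlan_member_aliases(vlan_members, port_name_to_alias):
    """Translate VLAN member port names into the aliases used for the circuit ID."""
    return [port_name_to_alias[name] for name in vlan_members]


# Hypothetical data shaped like the minigraph facts mentioned in the commit.
minigraph_port_name_to_alias_map = {
    "Ethernet0": "alias-port-1",
    "Ethernet4": "alias-port-2",
}
vlan_members = ["Ethernet0", "Ethernet4"]
print(vlan_member_aliases(vlan_members, minigraph_port_name_to_alias_map))
```

A plain dictionary lookup raises `KeyError` for a port that has no alias entry, which surfaces a mismatch early instead of silently sending the wrong circuit ID.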
…#997) By default, the log analyzer generates a dump that collects all the available log files in case of failure. This is unnecessary, and the dump file could be too big. This fix generates a dump that collects only the logs from the last hour by default. If more logs are needed, the 'dump_since' parameter can be used. Signed-off-by: Xin Wang <xinw@mellanox.com>
* Upgrade FW for mellanox before fast-reboot * Move some condition check to the main file
* [warm/fast reboot] make sure that /etc/sonic/config_db.json exists after upgrade
  Signed-off-by: Ying Xie <ying.xie@microsoft.com>
* [warm reboot] save config after warm reboot into new image
  When a new image is defined, the test removes /host/config_db.json before warm rebooting. So after the device boots up, /etc/sonic/config_db.json is missing. This is not an issue for the device to stay up, but it becomes an issue when the device reboots again (cold or fast).
  Signed-off-by: Ying Xie <ying.xie@microsoft.com>
* review comments
Signed-off-by: Yuriy Volynets <yuriyv@mellanox.com>
When a test failed due to a dataplane disruption issue, config save was skipped, leaving the device in a vulnerable state. Move config save to the always block. Signed-off-by: Ying Xie <ying.xie@microsoft.com>
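The effect of moving a step into an Ansible `always` block resembles `try`/`finally` in Python: the step runs whether or not the preceding steps failed. The sketch below is only an analogy with hypothetical function names; it is not the playbook change itself.

```python
def run_test_then_save(run_test, save_config):
    """Run the test and always save the configuration, even if the test raises."""
    try:
        run_test()
    finally:
        # Equivalent in spirit to a task placed in the 'always' section of a block.
        save_config()


def failing_test():
    """Placeholder standing in for a test that hits a dataplane disruption."""
    raise RuntimeError("dataplane disruption detected")


saved = []
try:
    run_test_then_save(failing_test, lambda: saved.append("config saved"))
except RuntimeError:
    pass  # the failure still propagates, but only after the save has run
print(saved)
```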
…ms doesnt have LOSSY profile (#747)
…2.5 (#1005) * Adding the ability to use yaml plugin with stdout content Signed-off-by: Zhiqian Wu <zhiqian.wu@nephosinc.com>
The check is to gate removing a line in known_hosts file, so the check needs to be checking /root/.ssh/known_hosts. Signed-off-by: Ying Xie <ying.xie@microsoft.com>
As the hard drive permits, we keep a few past images so that we can easily go back to them. A recent test failure shows a downside of that decision. When a test fails and leaves an installed image in a broken state, we will likely restore the system by booting into another working image. However, if the broken image is not removed before installation happens again, the installation could be skipped, or could fail to fix the existing issue, because the target image already exists. So when we boot into that image again, the device is still in a broken state. Removing all non-current and non-next images gives the DUT a better chance to start a clean test. Signed-off-by: Ying Xie <ying.xie@microsoft.com>
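The selection rule "remove every image that is neither current nor next" can be sketched as a simple filter. The function name and the image names are hypothetical, and the sketch deliberately leaves out how the image list is obtained and how images are actually removed on the DUT.

```python
def images_to_remove(installed, current, next_boot):
    """Return the installed images that are neither the current nor the next-boot image."""
    keep = {current, next_boot}
    return [image for image in installed if image not in keep]


# Hypothetical image names for illustration only.
installed = ["image-old-1", "image-old-2", "image-current", "image-next"]
print(images_to_remove(installed, current="image-current", next_boot="image-next"))
```

When the current and next-boot images are the same, the set collapses to one entry and every other image is selected for removal.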
…#1019) * [minigraph] allow generating minigraph without data plane acl defined Signed-off-by: Ying Xie <ying.xie@microsoft.com> * Change the default behavior to enable data plane acl
…r function (#996)
* [bgp-gr-helper] Add bgp-gr-helper test case
  Add script for testing the BGP graceful restart helper function.
  Signed-off-by: Xin Wang <xinw@mellanox.com>
* [bgp-gr-helper] Add supported topo t1-64-lag
* [bgp-gr-helper] Improve the wording
  * Add checking IPv6 route
* [bgp-gr-helper] Enable graceful restart for t1 topo
* [bgp-gr-helper] Improve script structure
  * Add more comments
  * Organize the code to make the two test cases more obvious
  * Remove the unnecessary configuration change of graceful-restart stalepath-time
  Signed-off-by: Xin Wang <xinw@mellanox.com>
* [platform] Implement platform phase 2 cases Implement the SONiC platform phase 2 test cases using the pytest-ansible framework. Signed-off-by: Xin Wang <xinw@mellanox.com> * [platform] Add interface status checking using the interface_facts module * [platform] Fix some minor issues * Run reboot command in background to avoid command failure caused by SSH connection broken before command returns * Fine tune the reboot wait timeout values * Add delay before checking interface status because the intfutil command may have no output in time Signed-off-by: Xin Wang <xinw@mellanox.com>
Signed-off-by: Volodymyr Samotiy <volodymyrs@mellanox.com>
Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>
Signed-off-by: Nazarii Hnydyn <nazariig@mellanox.com>
Signed-off-by: Neetha John <nejo@microsoft.com>
* Add Preboot n BGP member down and n Lag down tests Signed-off-by: Neetha John <nejo@microsoft.com>
joeslazaro
approved these changes
Jul 25, 2019
Merging the current state of Azure / Master into our branch before I fork and get t0-8 and t1-8 added.