[fast reboot] Revert fast-reboot script changes #982

Merged
qiluo-msft merged 1 commit into sonic-net:master from neethajohn:revert_fast_boot_script
Jun 27, 2019

@neethajohn
Contributor

Revert part of the changes made in PR #975. Remove the fast-reboot script and the corresponding changes made for its use.

Type of change

  • [] Bug fix
  • [] Testbed and Framework(new/improvement)
  • Test case(new/improvement)

@qiluo-msft qiluo-msft merged commit a383e46 into sonic-net:master Jun 27, 2019
@neethajohn neethajohn deleted the revert_fast_boot_script branch July 16, 2019 23:08
fraserg-arista pushed a commit to fraserg-arista/sonic-mgmt that referenced this pull request Feb 24, 2026
### Description of PR

Summary:
Fixes # (issue)
This PR addresses **non‑linear dataplane downtime behavior** observed in
high‑scale BGP IPv6 scenarios when running the port and session flapping
tests. When the number of connections to flap doubled, the dataplane
downtime increased by 450x.

This change refines the tests and helper logic to ensure that downtime
measurements:

- More accurately reflect real control‑plane and data‑plane outage
intervals,
- Scale more predictably with load and iterations, and
- Avoid over‑counting or under‑counting downtime due to measurement
artifacts and overlapping events.

### Type of change


- [x] Bug fix
- [ ] Testbed and Framework(new/improvement)
- [ ] New Test case
    - [ ] Skipped for non-supported platforms
- [ ] Test case improvement


### Back port request
- [ ] 202205
- [ ] 202305
- [ ] 202311
- [ ] 202405
- [ ] 202411
- [ ] 202505

### Approach
#### What is the motivation for this PR?

While validating high‑scale BGP convergence, flap, and route‑programming
tests, we observed that:

- Dataplane downtime did not scale linearly with:
  - The number of flap iterations,
  - The number of routes or neighbors.

This was traced to the tests running sequentially while the PTF
dataplane packet‑filtering/counter state was never cleared between runs.
Masks and counters kept accumulating, so each subsequent run, especially
one with a larger number of ports to flap, saw an artificially inflated
dataplane downtime.

In other words, the measured non‑linear increase in downtime was caused
by PTF dataplane state rather than actual BGP control‑plane behavior.
The goal of this PR is to:

- Properly reset/clean relevant PTF dataplane state between runs,
- Ensure that measured dataplane downtime reflects only the real BGP and
data‑plane behavior,
- Restore a linear and predictable relationship between test scale
(routes/neighbors/iterations) and observed downtime.

#### How did you do it?

- Added logic to explicitly **clear PTF dataplane state between runs**,
  including:
  - Flushing or re‑initializing PTF packet filters used for counting
    traffic to the prefixes under test.
  - Resetting relevant PTF counters so that each run starts with a clean
    environment.
- Updated the test flow so that:
  - Each scale/iteration configuration first ensures PTF dataplane state
    is clean before starting flaps and dataplane measurements.
  - Dataplane downtime is computed only from counters and observations
    collected **within** the current run, avoiding any contamination from
    previous runs.
- Adjusted/factored helper utilities (where appropriate) so that the PTF
  cleanup is:
  - Centralized and reusable across the convergence, flap, and
    route‑programming tests,
  - Invoked consistently whenever a new test scenario or iteration is
    started.
- Enhanced logging around:
  - When PTF dataplane state is cleared,
  - Per‑iteration dataplane downtime measurements after the fix, so it is
    easy to verify that:
    - Counters are reset when expected, and
    - The resulting downtime scales linearly with the number of
      ports/routes/iterations, reflecting actual BGP and dataplane
      behavior.
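
The effect of clearing state between runs can be illustrated with a small, self-contained toy model. This is only a sketch: `DataplaneState`, `run_flap_iteration`, and `run_sequence` are hypothetical names invented for illustration and are not the PTF API or the helpers changed in this PR. The model assumes that each run registers one mask per flapped port and that the per-packet matching work grows with the total number of registered masks.

```python
from dataclasses import dataclass, field


@dataclass
class DataplaneState:
    """Hypothetical stand-in for PTF packet-filter/counter state."""
    masks: list = field(default_factory=list)
    rx_count: int = 0

    def reset(self):
        # Drop every registered mask and zero the counter.
        self.masks.clear()
        self.rx_count = 0


def run_flap_iteration(state, ports_to_flap, clean_first):
    """Simulate one run and return how many masks are active during it."""
    if clean_first:
        state.reset()
    # Each flapped port registers one mask (an assumption of this model).
    state.masks.extend(f"mask-port-{p}" for p in range(ports_to_flap))
    return len(state.masks)


def run_sequence(scales, clean_first):
    """Run several scales back to back against one shared state object."""
    state = DataplaneState()
    return [run_flap_iteration(state, n, clean_first) for n in scales]


if __name__ == "__main__":
    # Without cleanup, masks left over from earlier runs pile up.
    print(run_sequence([10, 20, 40], clean_first=False))  # [10, 30, 70]
    # With cleanup, each run only sees the masks it registered itself.
    print(run_sequence([10, 20, 40], clean_first=True))   # [10, 20, 40]
```

In this model the active mask count tracks the configured scale only when the state is reset first, which mirrors the intent of invoking the cleanup at the start of every scenario or iteration.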

#### How did you verify/test it?
- Re‑ran the high‑scale BGP convergence, flap, and route‑programming
  tests with the fixes applied:
  - Topology: `t0-isolated-d2u510s2`
  - Platform: Broadcom Arista-7060X6-64PE-B-C512S2
- Verified that:
  - Measured downtime per iteration is stable and scales predictably with
    load and iteration count.
  - Spurious spikes caused by measurement artifacts are eliminated:
    downtime now stays in the millisecond range, compared with tens of
    seconds previously.
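
One way to make "scales predictably" checkable is to compare the growth in downtime against the growth in scale between consecutive runs. The helper below is a hedged sketch, not code from this PR: `scaling_ratio` and `is_roughly_linear` are hypothetical names, the tolerance is an arbitrary choice, and the sample downtimes in the usage lines are made-up illustrative values rather than measured results. Only the "doubling the scale gave a 450x increase" case comes from the description above.

```python
def scaling_ratio(scale_a, downtime_a, scale_b, downtime_b):
    """Return downtime growth divided by scale growth; about 1.0 means linear."""
    return (downtime_b / downtime_a) / (scale_b / scale_a)


def is_roughly_linear(samples, tolerance=0.5):
    """Check consecutive (scale, downtime_seconds) pairs, sorted by scale.

    The tolerance is an illustrative assumption, not a project threshold.
    """
    ratios = [
        scaling_ratio(s0, d0, s1, d1)
        for (s0, d0), (s1, d1) in zip(samples, samples[1:])
    ]
    return all(abs(r - 1.0) <= tolerance for r in ratios)


if __name__ == "__main__":
    # Doubling the scale with a 450x downtime increase is far from linear.
    print(scaling_ratio(1, 1.0, 2, 450.0))  # 225.0
    # Made-up, illustrative post-fix samples: (ports flapped, downtime in s).
    print(is_roughly_linear([(10, 0.010), (20, 0.021), (40, 0.040)]))  # True
```

A ratio near 1.0 indicates that downtime grew in proportion to the scale, while the pre-fix behavior described in the summary corresponds to a ratio of 225.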
 
#### Any platform specific information?

#### Supported testbed topology if it's a new test case?

### Documentation

---------

Signed-off-by: Priyansh Tratiya <ptratiya@microsoft.com>
kazinator-arista pushed a commit to kazinator-arista/sonic-mgmt that referenced this pull request Mar 4, 2026
* [201811][sairedis][swss] advance sub modules head

Submodule src/sonic-sairedis 18ad5f9..4c75b7f:
  > Fixed conditional operator. (sonic-net#487)

Submodule src/sonic-swss 1e99c93..cd12d48:
  > [teamsyncd]: Add information for LAG membership changes (sonic-net#982)
  > Fix vlan incremental config and add vs test cases (sonic-net#799)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>

* [swss] include more swss changes

Submodule src/sonic-swss cd12d48..f44029d:
  > [MirrorOrch]: Init the next hop ip with 0 instead of default constructor (sonic-net#953)
  > [AclOrch]: Fix the acl mirror counter doubled by inactive mirror and active again (sonic-net#952)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
Pterosaur pushed a commit to Pterosaur/sonic-mgmt that referenced this pull request Mar 26, 2026
```
*   d849917 (HEAD -> user/r12f/merge-202412, origin/user/r12f/merge-202412, internal-202412) r12f 260209:0619 - Merge remote-tracking branch 'azure/202412' into internal-202412
|\
| * c4c429c (azure/202412) mssonicbld 260209:0115 - [action] [PR:21843] Fix test_bgp_suppress_fib.py flakiness on scale topos (sonic-net#1011)
| * 46503f1 Mark Xiao 260205:1058 - [202412] Fix test_acl.py [ipv6-ingress-uplink->downlink-*] cases for v6 topo (sonic-net#942)
| * b6c58d4 mssonicbld 260130:1215 - [action] [PR:21772] Fix test_bgp_suppress_fib.py for v6 topos (sonic-net#985)
| * e762451 mssonicbld 260130:1016 - [action] [PR:21762] Fix test_disk_exhaustion.py for v6 topos (sonic-net#983)
| * 02cadbb mssonicbld 260130:1016 - [action] [PR:21771] Fix test_bgp_bounce.py for v6 topos (sonic-net#984)
| * 7eab3e7 Priyansh 260129:1437 - [202412] Fix/nonlinear dataplane downtime (sonic-net#982)
| * de89a46 mssonicbld 260130:0616 - [action] [PR:21143] configlet/test_add_rack.py Add comparison ignore for extra entries added by generic patcher (sonic-net#977)
| * c731442 mssonicbld 260130:0616 - [action] [PR:21647] Remove test_route_flow_counter.py xfail for v6 topos (sonic-net#979)
| * 0ff3ecc mssonicbld 260130:0116 - [action] [PR:21717] [BGP][test_bgp_session.py::test_bgp_session_interface_down] - Increase BGP Session State Timeout Window when restarting SWSS Container (sonic-net#978)
| * e0a408c Mark Xiao 260129:0914 - [202412] Fix fib/test_fib.py test_vxlan_hash [ipv6-*] (sonic-net#971)
| * 51b0525 mssonicbld 260124:0615 - [action] [PR:21523] Feature/route programming data (sonic-net#974)
| * 2bb2a53 mssonicbld 260124:0615 - [action] [PR:21939] Fix/nonlinear high nexthop dataplane downtime (sonic-net#975)
| * 1fe73bc Mark Xiao 260123:1323 - [202412] Fix setup_interfaces fixture and related bgp tests for v6 topo (sonic-net#941)
| * fda4ac2 Chuan Wu 260124:0520 - Enhance PTF function at scale testbed (sonic-net#913)
| * e8edcae gshemesh2 260123:2317 - Manual cherry-pick PR: Adjust test_bgpmon.py to handle running over ipv6 only topologies sonic-net#21377 (sonic-net#888)
| * 24b3dc7 gshemesh2 260123:2313 - Manual cherry-pick of PR: Adjusting test_stress_arp is it can handle cases of ipv6-only topologies . sonic-net#20932 (sonic-net#923)
```

----
#### AI description  (iteration 1)
#### PR Classification
This pull request is a feature enhancement that adds and refines IPv6 compatibility across multiple BGP, ACL, and related test modules.

#### PR Summary
The changes extend the testing framework and helper utilities to support IPv6-only and dual-stack environments by conditionally switching commands, packet generation, and route verifications.
- `tests/bgp/test_bgpmon.py` and `tests/bgp/test_bgp_update_timer.py`: Updated function signatures and logic to construct, verify, and clean up IPv6 routes and packets (e.g., using /64 and /128 prefixes).
- `tests/bgp/test_bgp_suppress_fib.py` and `tests/bgp/test_bgp_peer_shutdown.py`: Modified commands and packet filters to support IPv6 addressing and proper neighbor session handling.
- `tests/common/...` files and generators: Integr...