Add L3VNI cross-DC test cases (L3VNI_dci:1-2, 6-30, 39-43, 91, 101) + 5 trigger classes #12

Open
bpar9 wants to merge 64 commits into master from devin/1773125707-l3vni-cross-dc-tests

@bpar9
Owner

@bpar9 bpar9 commented Mar 10, 2026

Description of PR

Summary: Adds thirty-two L3VNI cross-DC test cases to the VXLAN DCI test suite with control plane and data plane verification aligned to the DCI Solution Testplan, driven by detailed configuration context from l3vni_config_diff.txt and vxlan_dci_input_file.yaml. Enables cross-DC L3 stream generation.

Incorporates L3VNI-specific BGW configuration (SONiC CLI + FRR) into config_bgw_nodes() and leaf VRF route-target imports from local BGWs into config_l2l3vni(), so all L3VNI configuration from l3vni_config_diff.txt is applied before any L3VNI test verification runs. All test case verification follows a consistent pattern using verify_base_setup_bgw() with testcase-specific check lists. BGW L3VNI VLAN/VRF bindings are now declared in vxlan_dci_input_file.yaml and consumed by the helper code.

Cross-DC L3 traffic now uses 23 specific DCI flows from l3vni_dci_traffic_flows.txt instead of generic full-mesh filtering.

Type-5 route verification uses a unified boolean-based model with differentiated leaf/BGW expectations: has_remote_class_path is n/a on leaf nodes (leaves don't see BGW ASNs), has_local_class_path counts self-originated routes (weight 32768 + empty AS path) for single-leaf DCs, and the FIB check accepts both connected (C>*) and BGP (B>*) routes. RIB/FIB install verification is integrated; strict checks are removed. The old basic/detail Type-5 verification functions have been removed per reviewer feedback.

Additionally, adds five trigger test classes (restart, reload, reboot, power cycle, BGP reset, DCI link flap/shut, VLAN/PortChannel operations) with L3VNI traffic verification. All trigger test classes now verify traffic only after the trigger; pre-trigger traffic verification has been removed per reviewer feedback.

Link to Devin Session: https://cisco-demo.devinenterprise.com/sessions/8fabef50d24246fd9573c19e56e512c6
Requested by: @bpar9

Changes:

vxlan_dci_input_file.yaml:

  • Added l3vni sections to all 5 BGW nodes with uniform cross-DC L3VNI VRF-VNI values: Vrf101→10101, Vrf102→10102
  • All BGWs share vlan_bindings: [11, 12, 13, 14, 15, 101] for Vrf101 and [16, 17, 18, 19, 20, 102] for Vrf102
  • Traffic rate_percent updated from 0.001 to 0.1 for both l2l3 and bum traffic parameters

test_vxlan_dci.py:

  • ENABLE_L3_ACROSS_DCI flag changed from False to True (line 1048) to enable cross-DC L3 stream generation in tgen_preconfig
  • tgen_preconfig() L3 traffic generation refactored to split within-DC and cross-DC into separate streams: within-DC uses generic find_l3_traffic_endpoints + filtering, cross-DC uses new find_l3_dci_traffic_endpoints with 23 specific flows
  • L3 IPv4/IPv6 traffic items now split into: Within-DC: L3-SH-WITHIN (orphan sources) and L3-MH-WITHIN (PortChannel sources); Cross-DC: L3-SH-CROSS (orphan sources) and L3-MH-CROSS (PortChannel sources)
  • stream_handles['l3_v4'] and stream_handles['l3_v6'] are now lists (accumulating within-SH, within-MH, cross-SH, and cross-MH streams) instead of single traffic item returns
  • Added dci_flap_continuous stream creation in tgen_preconfig — continuous cross-DC L2 IPv4/IPv6 streams (SH + MH per VLAN) with transmit_mode='continuous' for DCI link flap/shut tests
  • Added _dci_merge_flap_continuous() helper to merge created stream handles into stream_handles['dci_flap_continuous'] dict
  • Updated _flatten_stream_ids and stream summary to handle dci_flap_continuous type
  • config_l2l3vni() now filters out BGW nodes ('bgw' not in n) before applying leaf-specific configuration. BGW nodes are configured separately via config_bgw_nodes().
  • config_l2l3vni() updated to apply leaf VRF route-target imports from local BGWs (l3vni_leaf_rt_dci) after bgp_l3vni_config_dci
  • unconfig_l2l3vni() also filters BGW nodes and removes leaf RT imports (delete_l3vni_leaf_rt_dci) before delete_bgp_l3vni_config_dci
  • config_bgw_nodes() now retrieves and passes bgp_info to route-map and L3VNI configuration
  • unconfig_bgw_nodes() updated to remove L3VNI BGW config in reverse order
  • pretest fixture verifies remote VTEPs on all nodes (including BGW nodes)
  • verify_base_setup_bgw() enhanced with six new checks for L3VNI verification and Type-5 CLI fetch consolidation
  • verify_traffic() enhanced with simultaneous=False parameter for dual-stack tests
  • Twenty-five new test methods in TestVxlanDCIBase (L3VNI_dci:1-2, 6-30)
  • Five new trigger test classes (inserted between verify_base_setup_bgw and TestVxlanDCIBase): TestVxlanRestartTriggers, TestVxlanReloadTriggers, TestVxlanBGPTriggers, TestVxlanInterfaceTriggers, TestVxlanAddRemoveVlan
  • All trigger classes now include L3VNI traffic verification and verify traffic only AFTER the trigger
  • ALL_CHECKS updated: removed rib_fib (integrated into evpn_type5_comprehensive), updated docstring for unified Type-5 model
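
Because the stream handles above now come in more than one shape (lists for l3_v4/l3_v6, a dict for dci_flap_continuous), any code that collects stream IDs has to walk both. The following is only a sketch of that idea: flatten_stream_ids is a hypothetical stand-in rather than the PR's _flatten_stream_ids, and the handle IDs are made up.

```python
def flatten_stream_ids(handles):
    """Return a flat list of stream IDs from nested lists/dicts of handles."""
    if handles is None:
        return []
    if isinstance(handles, str):
        return [handles]
    if isinstance(handles, dict):
        values = handles.values()
    else:
        values = handles  # list, tuple, or another iterable of handles
    flat = []
    for item in values:
        flat.extend(flatten_stream_ids(item))
    return flat


# Hypothetical layout: list-valued L3 entries plus a dict-valued continuous entry.
stream_handles = {
    "l3_v4": ["TI-1", "TI-2", "TI-3", "TI-4"],
    "dci_flap_continuous": {"v4": ["TI-9"], "v6": ["TI-10"]},
}
all_ids = flatten_stream_ids(stream_handles)
```

Recursing on the container type keeps the helper indifferent to whether a traffic type stores a single handle, a list, or a dict of lists.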

vxlan_helper.py:

  • find_l3_traffic_endpoints(host_info_dict, config_dict): Enhanced to generate both SH (orphan P1) and MH (PortChannel) source flows for within-DC L3 traffic
  • find_l3_dci_traffic_endpoints(host_info_dict, config_dict, vrf_vlan_dict=None): New function to generate exactly 23 L3 DCI cross-DC traffic flows
  • get_dci_link_interfaces(dut, test_cfg): New helper to retrieve DCI-facing link interfaces on BGW nodes
  • get_evpn_vni(dut): Fixed to use st.show(dut, cmd, type='vtysh', skip_tmpl=True) instead of config_dut()
  • DELETED: _verify_type5_leaf() and _verify_type5_bgw() — replaced with unified _verify_type5_unified()
  • NEW: _verify_type5_unified(): Single verification function using 10 boolean checks on both leaf and BGW nodes (present, has_best, has_rt, has_et, has_rmac, has_ipv6_nh, installed_in_rib, installed_in_fib, has_local_class_path, has_remote_class_path)
  • verify_evpn_type5_comprehensive(): Refactored to fetch RIB per-VRF and call unified function; RIB/FIB check now inline
  • DELETED: verify_evpn_type5_rib_fib() — RIB/FIB now integrated into comprehensive check
  • get_expected_type5_routes(): Unified model for both node types — emits local_leaf_asns and remote_bgw_asns for all entries (leaf nodes get empty remote_bgw_asns set to make has_remote_class_path n/a)
  • verify_type5_route_presence_dci(): Updated to use boolean checks only (removed path count range validation)
  • Multiple L3VNI configuration and verification functions added/enhanced
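
To illustrate the unified expectation model described in this list, here is a minimal sketch of how one expected entry might be built. build_expected_entry and its arguments are hypothetical, not the signature of get_expected_type5_routes(); the ASNs are the ones quoted elsewhere in this PR (leaf0_dc3 = 65206, BGWs = 65102-65106).

```python
def build_expected_entry(node_role, local_leaf_asns, other_dc_bgw_asns):
    """Build one expected Type-5 entry; leaves get an empty remote set."""
    # Leaves do not see BGW ASNs in the AS path, so their remote set is empty,
    # which turns the has_remote_class_path check into n/a.
    remote = set(other_dc_bgw_asns) if node_role == "bgw" else set()
    return {
        "local_leaf_asns": set(local_leaf_asns),
        "remote_bgw_asns": remote,
        "has_remote_class_path": "yes" if remote else "n/a",
    }


# DC3 as seen from its single leaf and from its single BGW.
other_dc_bgws = {65102, 65103, 65104, 65105}
leaf_entry = build_expected_entry("leaf", {65206}, other_dc_bgws)
bgw_entry = build_expected_entry("bgw", {65206}, other_dc_bgws)
```

Both node types carry the same keys, so a single verification function can consume either entry without branching on the role.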

Configuration Context Used:

Per l3vni_config_diff.txt and vxlan_dci_input_file.yaml:

  • VRF-VNI bindings: All BGWs use uniform cross-DC L3VNI (Vrf101→10101, Vrf102→10102)
  • Leaf route-target imports per DC (e.g., DC1 leafs import 65102:10101, 65103:10101)
  • BGW ASN assignments: DC1 BGW1=65102, DC1 BGW2=65103, DC2 BGW1=65104, DC2 BGW2=65105, DC3 BGW1=65106
  • RT-REWRITE route-maps on BGWs rewrite Type-5 routes with IPv4 (WAN VIP) or IPv6 (DC VIP) next-hops
  • L3VNI traffic flows: 23 specific flows per l3vni_dci_traffic_flows.txt
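
As a rough illustration of the leaf route-target imports listed above, the sketch below derives the ASN:VNI strings from the BGW ASN assignments and VRF-VNI bindings. leaf_rt_imports is a hypothetical helper, and the Vrf102 values are an assumption extrapolated from the Vrf101 example (65102:10101, 65103:10101); only the Vrf101 case is stated in the PR.

```python
# BGW ASNs per DC and VRF-VNI bindings as listed in the configuration context.
BGW_ASNS = {"dc1": [65102, 65103], "dc2": [65104, 65105], "dc3": [65106]}
VRF_VNI = {"Vrf101": 10101, "Vrf102": 10102}


def leaf_rt_imports(dc):
    """Return {vrf: [route-target, ...]} for leaves in the given DC."""
    return {
        vrf: ["{}:{}".format(asn, vni) for asn in BGW_ASNS[dc]]
        for vrf, vni in VRF_VNI.items()
    }


dc1_imports = leaf_rt_imports("dc1")
```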

Reviewer start: Begin with the unified Type-5 verification model in vxlan_helper.py (_verify_type5_unified, verify_evpn_type5_comprehensive, get_expected_type5_routes), then review the removal of rib_fib check from test_vxlan_dci.py (verify_base_setup_bgw), then review the continuous traffic integration in TestVxlanInterfaceTriggers (parametrized methods + helper methods), then review the dci_flap_continuous stream creation in tgen_preconfig, then review the modular verification infrastructure in verify_base_setup_bgw(), then review the twenty-five L3VNI test methods, then review the other four trigger test classes, then examine the L3VNI config application, then review the DCI traffic endpoint generation, and finally review the remaining helper functions.

Updates since last revision

Leaf Type-5 Self-Originated Route Fix (commit d4f6f08) — per reviewer comment #4151592506:

Fixed has_local_class_path verification failure on single-leaf DCs (e.g., DC3 with only leaf0_dc3):

Problem: On leaf0_dc3 (ASN 65206, only leaf in DC3), ALL 20 Type-5 prefixes failed has_local_class_path check. All routes are self-originated with weight=32768 and empty AS path (?). The local_class check was looking for ASN 65206 in the AS path, but self-originated routes don't include the originator's own ASN in the AS path.

Fix: In _verify_type5_unified(), self-originated paths (weight=32768 + empty AS path) now count as local-class:

elif not as_path.strip() and path.get('weight') == '32768':
    # Self-originated route: empty AS path + weight 32768
    has_local_class = True

Rationale: The originating leaf IS a local-class source even though its own ASN doesn't appear in the AS path. This fix handles single-leaf DCs where ALL routes are self-originated, while multi-leaf DCs (DC1 with 4 leafs) continue to match on peer leaf ASNs in AS path as before.

Docstring update: has_local_class_path description now reads: "path from local-DC source (leaf ASN in AS path, or self-originated: weight 32768 + empty AS path)"
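
For reviewers who want to exercise the classification rule in isolation, here is a self-contained sketch. It assumes each parsed path is a dict with string-valued as_path and weight fields, which may not match the real output of _parse_type5_routes_detailed(); has_local_class_path here is an illustrative function, and 65201 is a made-up peer leaf ASN.

```python
def has_local_class_path(paths, local_leaf_asns):
    """True if any path comes from a local-DC source."""
    for path in paths:
        as_path = path.get("as_path", "")
        asns = {token for token in as_path.split() if token.isdigit()}
        if asns & local_leaf_asns:
            # A peer leaf ASN from the same DC appears in the AS path.
            return True
        elif not as_path.strip() and path.get("weight") == "32768":
            # Self-originated route: empty AS path + weight 32768.
            return True
    return False


self_originated = [{"as_path": "", "weight": "32768"}]
via_peer_leaf = [{"as_path": "65201", "weight": "0"}]      # hypothetical ASN
odd_case = [{"as_path": "65104", "weight": "32768"}]       # weight set, AS path not empty
```

The third sample is the edge case raised in the review checklist: a path with weight 32768 and a non-empty AS path is not treated as local-class unless a local leaf ASN is present.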

Leaf Type-5 Verification Fix (commit 3607c39) — per reviewer comment #4151292985:

Fixed two leaf Type-5 verification failures identified in lab testing (type5_not_working.txt):

1. has_remote_class_path now n/a on leaf nodes:

  • Root cause: Leaves don't see BGW ASNs in EVPN Type-5 AS paths. BGW nodes re-originate cross-DC routes, but leaf nodes receive them via spine route reflectors with only same-DC leaf ASNs visible in the AS path.
  • Fix: Changed get_expected_type5_routes() to set remote_bgw_asns = set() for leaf nodes, making has_remote_class_path check n/a instead of required yes.

2. installed_in_fib now accepts connected routes:

  • Root cause: Locally owned /24 subnets appear as connected routes (C>* 80.11.0.0/24 is directly connected, Vlan11) in RIB, not BGP routes (B>*). Old FIB regex r'B[*>]+.*prefix' only matched BGP routes.
  • Fix: Changed FIB regex to protocol-agnostic r'[A-Z]\S*>\S*\s+prefix' to accept any protocol code with > (FIB-selected) flag. Now correctly matches both:
    • C>* 80.11.0.0/24 — locally owned subnet (connected)
    • B>* 80.12.0.40/32 — remote learned route (BGP)
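
The difference between the two patterns can be checked with a few sample RIB lines. This is a sketch only: the connected-route line is taken from the description above, while the next-hop details on the BGP lines are invented, and the use of re.escape() on the prefix is an addition that may not be in the PR's code.

```python
import re


def old_fib_match(rib_text, prefix):
    """Old pattern: only BGP routes (code B) are accepted."""
    return re.search(r"B[*>]+.*" + re.escape(prefix), rib_text) is not None


def new_fib_match(rib_text, prefix):
    """New pattern: any protocol code carrying the > (FIB-selected) flag."""
    return re.search(r"[A-Z]\S*>\S*\s+" + re.escape(prefix), rib_text) is not None


CONNECTED = "C>* 80.11.0.0/24 is directly connected, Vlan11"
BGP = "B>* 80.12.0.40/32 [20/0] via 10.0.0.1, Vlan101"        # next hop is made up
NOT_SELECTED = "B   80.13.0.0/24 [20/0] via 10.0.0.1, Vlan101"  # hypothetical, no > flag
```

The connected line is the case the old pattern missed; the last line shows that a route without the > flag is still rejected by the new pattern.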

Leaf verification logic after fix:

  • present = yes
  • has_best = yes
  • has_rt = yes
  • has_et = yes
  • has_rmac = yes
  • has_ipv6_nh = yes
  • has_local_class_path = yes (matches ASN in path OR self-originated with weight 32768)
  • has_remote_class_path = n/a (not required on leaf)
  • installed_in_rib = yes (IPv4) / n/a (IPv6)
  • installed_in_fib = yes (accepts C>* connected or B>* BGP) / n/a (IPv6)

BGW verification logic unchanged: has_remote_class_path is still required (other-DC BGW ASNs are visible in paths), and the FIB regex is also protocol-agnostic for safety.

Pre-Trigger Traffic Verification Removal (commit c026de7) — per reviewer comment #4151220809:

Removed all pre-trigger verify_traffic() calls from the three trigger test classes per vallabh78's feedback: "lets remove verify_traffic before the trigger... we need to verify the traffic only after the trigger."

Changes across 7 test methods:

TestVxlanRestartTriggers:

  • test_leaf_restart_process — removed Step 2 pre-trigger traffic verification (lines 2825-2835), renumbered subsequent steps (Step 3→2, Step 4→3, etc.)
  • test_dci_restart_process — removed Step 2 pre-trigger cross-DC traffic verification (lines 2956-2965), renumbered subsequent steps

TestVxlanReloadTriggers:

  • test_config_reload — removed Step 2 pre-trigger traffic verification (lines 3119-3129), updated docstring from 9 steps to 8 steps
  • test_reboot — removed Step 2 pre-trigger traffic verification, updated docstring from 11 steps to 10 steps, renumbered (Step 3→2, Step 4→3, ..., Step 8→7)
  • test_power_cycle — removed Step 3 pre-trigger traffic verification, updated docstring from 12 steps to 11 steps, renumbered (Step 4→3, Step 5→4, ..., Step 12→11)

TestVxlanBGPTriggers:

  • test_bgp_hard_reset — removed Step 1b pre-trigger traffic verification block with try/except (lines 3646-3661), updated docstring from 7 steps to 6 steps
  • test_bgp_soft_reset — removed Step 1b pre-trigger traffic verification block, updated docstring from 7 steps to 6 steps, renumbered subsequent steps

Rationale: Pre-trigger traffic verification is redundant with base setup verification. Post-trigger traffic verification is sufficient to confirm system recovery after the trigger action.

Net change: −207 lines, +107 lines

Unified Type-5 Verification Model (commit 3af1bee) — per reviewer comment #4150763753:

Replaced separate _verify_type5_leaf() and _verify_type5_bgw() with unified _verify_type5_unified() using the same 10 boolean checks for every prefix on both leaf and BGW nodes:

| Boolean | Description |
|---|---|
| present | Prefix exists in Type-5 output |
| has_best | At least one best path selected |
| has_rt | At least one path has Route-Target |
| has_et | At least one path has Encap-Type |
| has_rmac | At least one path has Router-MAC |
| has_ipv6_nh | At least one path has IPv6 next-hop |
| installed_in_rib | Prefix found in show ip route vrf (IPv4 only, n/a for IPv6) |
| installed_in_fib | Prefix has > FIB-selected flag in RIB (any protocol: C>*, B>*, etc.; IPv4 only) |
| has_local_class_path | Path from local-DC source (leaf ASN in AS path, or self-originated: weight 32768 + empty AS path) |
| has_remote_class_path | Path from remote source (BGW: other-DC BGW ASN; Leaf: n/a, not required) |

Pass = all required booleans are yes (or n/a if not applicable).
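
The pass rule can be sketched as a comparison that skips any check whose expectation is n/a. failed_checks is a hypothetical helper and the dict layout is an assumption; how compare_exp_actual_data() actually treats 'n/a' still needs to be confirmed against the real code.

```python
CHECKS = [
    "present", "has_best", "has_rt", "has_et", "has_rmac", "has_ipv6_nh",
    "installed_in_rib", "installed_in_fib",
    "has_local_class_path", "has_remote_class_path",
]


def failed_checks(expected, actual):
    """Return the names of required booleans that did not match."""
    failed = []
    for check in CHECKS:
        want = expected.get(check, "yes")
        if want == "n/a":
            continue  # not applicable for this node type or address family
        if actual.get(check) != want:
            failed.append(check)
    return failed


# Leaf expectation: everything required except has_remote_class_path.
leaf_expected = dict.fromkeys(CHECKS, "yes")
leaf_expected["has_remote_class_path"] = "n/a"

leaf_actual = dict.fromkeys(CHECKS, "yes")
leaf_actual["has_remote_class_path"] = "no"  # ignored because it is n/a on a leaf
```

A prefix passes when the returned list is empty; returning the failed names rather than a bare boolean keeps the failure log specific.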

Removed strict checks (all five):

  • best_nh_is_local_vtep
  • best_weight_32768
  • best_path_is_local_leaf
  • exact path_count == N
  • exact RD matching

Additional changes:

  1. RIB/FIB integration: verify_evpn_type5_rib_fib() removed as standalone — RIB/FIB is now part of evpn_type5_comprehensive (fetches show ip route vrf per-VRF and checks inline)
  2. Unified entry structure: get_expected_type5_routes() simplified — both leaf and BGW entries carry local_leaf_asns and remote_bgw_asns (leaf remote = empty set; BGW remote = other-DC BGW ASNs)
  3. Boolean-only validation: verify_type5_route_presence_dci() updated to use boolean checks only (no path count range)
  4. Check list cleanup: rib_fib removed from ALL_CHECKS / CHECK_SETS in verify_base_setup_bgw() since it's now part of evpn_type5_comprehensive
  5. Net code reduction: −574 lines, +186 lines across both files

Type-5 CLI Dump Consolidation (commit 71fdb18) — per reviewer comment #4145122013:

Eliminated duplicate Type-5 route output in logs when both rt_rewrite and evpn_type5_comprehensive checks run on the same BGW node:

  • verify_base_setup_bgw() now fetches show bgp l2vpn evpn route type prefix once per node when either check is needed
  • Passes pre-fetched output via cli_output= kwarg to both verify_rt_rewrite_dci(dut, cli_output=...) and verify_evpn_type5_comprehensive(dut, exp_routes, cli_output=...)
  • Both helper functions accept optional cli_output parameter — if provided they skip CLI fetch, if not they fetch it themselves (backward compatible for standalone calls)
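
The fetch-once pattern behind the cli_output kwarg looks roughly like the sketch below. All names here (fetch_type5, check_rt_rewrite, check_type5, verify_node) are placeholders for illustration; the real helpers take more arguments and run actual CLI commands.

```python
fetch_calls = []


def fetch_type5(dut):
    """Stand-in for the real CLI fetch; records each call for illustration."""
    fetch_calls.append(dut)
    return "placeholder Type-5 output for {}".format(dut)


def check_rt_rewrite(dut, cli_output=None):
    """Use the pre-fetched output if given, otherwise fetch it."""
    if cli_output is None:
        cli_output = fetch_type5(dut)
    return bool(cli_output)


def check_type5(dut, cli_output=None):
    """Same optional-kwarg contract as check_rt_rewrite."""
    if cli_output is None:
        cli_output = fetch_type5(dut)
    return bool(cli_output)


def verify_node(dut):
    """Fetch once and share the output between both checks."""
    output = fetch_type5(dut)
    return check_rt_rewrite(dut, cli_output=output) and check_type5(dut, cli_output=output)
```

Defaulting cli_output to None is what keeps standalone calls working: a caller that passes nothing still gets a fresh fetch.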

Continuous Traffic Integration (commit 1d54ee3) — per reviewer comment #4145673910:

Integrated vallabh78's continuous traffic functionality into TestVxlanInterfaceTriggers class:

  1. dci_flap_continuous stream creation in tgen_preconfig (lines 1410-1492): Creates continuous cross-DC L2 IPv4/IPv6 streams (SH + MH per VLAN) with transmit_mode='continuous'; used by parametrized test_dci_link_trigger method via tgen_handles.get('dci_flap_continuous')
  2. Updated _flatten_stream_ids and stream summary loop to handle dci_flap_continuous in traffic types list
  3. Replaced TestVxlanInterfaceTriggers class with parametrized version (reduced from 740 lines to ~520 lines):
    • New test_leaf_interface_shut_noshut parametrized by leaf_port_kind (orphan/portchannel) for Solution_dci:23/24
    • Consolidated 5 separate DCI link test methods into single test_dci_link_trigger parametrized by action (flap/shut) and scope (single/all_one_bgw/all_bgws) for Solution_dci:26-30 / L3VNI_dci:39-43 with continuous traffic verification

BGW Type-5 Path Count Fix (commit 9c6bcf0) — validated against reviewer's lab output file:

Fixed BGW Type-5 route path count calculation in get_expected_type5_routes():

Root cause: Each remote BGW re-originates routes from ALL leaves in its DC under separate RDs. The old formula local_leaf_count + remote_bgw_count only counted the number of remote BGW nodes, not the total paths each BGW carries.

Fix: Changed to per-DC multiplication model:

local_leaf_count + SUM per remote DC of (bgw_count_in_dc × leaf_count_in_dc)

Updated expected path counts:

| BGW Location | Old (wrong) | New (correct) | Formula |
|---|---|---|---|
| DC1 BGWs | 7 | 9 | 4 + (DC2: 2×2) + (DC3: 1×1) |
| DC2 BGWs | 5 | 11 | 2 + (DC1: 2×4) + (DC3: 1×1) |
| DC3 BGW | 5 | 13 | 1 + (DC1: 2×4) + (DC2: 2×2) |

Validation: Tested against real lab output (type5_verification_fix.txt, 1783 lines) showing all 20 prefixes on spine3_dc1_bgw2 have exactly 9 paths. Parser correctly handles *= (ECMP multipath) markers, RT/ET/RMAC attributes, IPv4/IPv6 next-hops, and best path detection.

BGW Path Count Range Check (commit cba8e10) — validated against two new reviewer lab files:

Refined BGW path count from exact match to range-based [min..max] comparison after analyzing two new lab output files:

Issue discovered: Different test sessions showed different path counts for the same topology:

  • type5_verification_fix.txt: spine3_dc1_bgw2 shows 9 paths per prefix (all-leaf re-origination)
  • evpn_type_5_output.txt: spine2_dc1_bgw1 shows 7 paths per prefix (best-only re-origination)

Both are valid EVPN behaviors depending on BGW re-origination configuration. Exact match actual != expected would fail one scenario or the other.

Solution: Range-based path count model:

  • Min = local_leaf_count + remote_bgw_count (each remote BGW sends best-only path per prefix)
  • Max = local_leaf_count + SUM(bgws × leaves per remote DC) (each remote BGW re-originates all leaf paths)

Updated path count ranges:

| BGW Location | Min | Max | Formula |
|---|---|---|---|
| DC1 BGWs | 7 | 9 | 4 + [3..5] where 5 = (DC2: 2×2) + (DC3: 1×1) |
| DC2 BGWs | 5 | 11 | 2 + [3..9] where 9 = (DC1: 2×4) + (DC3: 1×1) |
| DC3 BGW | 5 | 13 | 1 + [4..12] where 12 = (DC1: 2×4) + (DC2: 2×2) |
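
The Min and Max formulas can be reproduced from the leaf and BGW counts of the 3-DC topology. This is a sketch for checking the arithmetic only; path_count_range and the TOPOLOGY dict are illustrative names, not the code in get_expected_type5_routes().

```python
# Leaf and BGW counts per DC, as described in the testbed topology.
TOPOLOGY = {
    "dc1": {"leaves": 4, "bgws": 2},
    "dc2": {"leaves": 2, "bgws": 2},
    "dc3": {"leaves": 1, "bgws": 1},
}


def path_count_range(local_dc):
    """Return (min, max) expected Type-5 paths per prefix on a BGW in local_dc."""
    local_leaves = TOPOLOGY[local_dc]["leaves"]
    remote = [counts for dc, counts in TOPOLOGY.items() if dc != local_dc]
    # Min: every remote BGW sends only its best path per prefix.
    pc_min = local_leaves + sum(r["bgws"] for r in remote)
    # Max: every remote BGW re-originates a path for each leaf in its DC.
    pc_max = local_leaves + sum(r["bgws"] * r["leaves"] for r in remote)
    return pc_min, pc_max


def in_range(local_dc, actual):
    """Range check of the form pc_min <= actual <= pc_max."""
    pc_min, pc_max = path_count_range(local_dc)
    return pc_min <= actual <= pc_max
```

With these counts, both lab observations for DC1 BGWs (7 and 9 paths) fall inside the computed range.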

Changes in three functions:

  1. get_expected_type5_routes(): emits path_count_min/path_count_max instead of single path_count
  2. _verify_type5_bgw(): range-based comparison — displays [min..max] string in comparison table when actual count is in range, raw number when out of range
  3. verify_type5_route_presence_dci(): range-based comparison pc_min <= actual <= pc_max

Validation: Tested parser against both lab files:

  • add_route_type5_v2.txt: 21 prefixes parsed across 3 nodes (leaf1_dc2, spine2_dc1_bgw1, leaf0_dc1), all attributes extracted correctly
  • evpn_type_5_output.txt: 20 prefixes parsed with log-prefix stripping, all showing 7 paths (within expected range [7..9])
  • Enhanced BGW verification checks (best_path_exists, usable_paths, RT/ET/RMAC) ALL PASS on both files
  • Enhanced leaf verification checks ALL PASS on both files

NOTE: This range-based path count model has been replaced by the unified boolean model (commit 3af1bee) which removes exact path count checks entirely.

Type of change

  • New Test case
    • Skipped for non-supported platforms
  • Test case improvement

Back port request

  • 202205
  • 202305
  • 202311
  • 202405
  • 202411
  • 202505
  • 202511

Approach

What is the motivation for this PR?

Add comprehensive L3VNI cross-DC test coverage to the VXLAN DCI test suite, including control plane verification (Type-5 routes with unified boolean-based validation differentiated for leaf vs BGW nodes, VRF-VNI bindings, RT-REWRITE route-maps, eBGP multihop sessions, integrated RIB/FIB install verification) and data plane verification (L3 inter-VLAN routing across DCs with both SH and MH sources). Implement trigger test classes for service restart, config reload, reboot, BGP session reset, DCI link operations, and VLAN/PortChannel operations with L3VNI traffic recovery verification. Optimize test execution by removing duplicate verification steps and consolidating Type-5 CLI dumps. Integrate continuous traffic functionality for DCI link flap/shut tests to verify traffic resilience during interface changes. Streamline trigger tests by removing redundant pre-trigger traffic verification. Fix leaf Type-5 verification to handle connected routes, remove invalid has_remote_class_path requirement, and treat self-originated routes as local-class paths for single-leaf DCs.

How did you do it?

  1. Added 25 new L3VNI test methods aligned to testplan (L3VNI_dci:1-2, 6-30)
  2. Added 7 new trigger test methods in 2 classes (L3VNI_dci:39-43, 91, 101)
  3. Enhanced verify_base_setup_bgw() with 6 new L3VNI-specific checks and Type-5 CLI consolidation
  4. Implemented unified Type-5 verification with 10 boolean checks (same logic on leaf and BGW): present, has_best, has_rt, has_et, has_rmac, has_ipv6_nh, installed_in_rib, installed_in_fib, has_local_class_path, has_remote_class_path
  5. Differentiated leaf vs BGW expectations: has_remote_class_path is n/a on leaf nodes (leaves don't see BGW ASNs in EVPN Type-5 AS paths); has_local_class_path counts self-originated routes (weight 32768 + empty AS path) for single-leaf DCs; FIB regex protocol-agnostic to accept connected routes (C>*) for locally owned subnets
  6. Removed strict checks that break after restart: best_nh_is_local_vtep, best_weight_32768, best_path_is_local_leaf, exact path_count, exact RD matching
  7. Integrated RIB/FIB into unified check (removed standalone verify_evpn_type5_rib_fib() and rib_fib check list item)
  8. Added L3VNI BGW configuration (SONiC CLI + FRR) in config_bgw_nodes() per l3vni_config_diff.txt
  9. Added leaf VRF route-target imports from local BGWs in config_l2l3vni()
  10. Created 23 specific L3 DCI cross-DC traffic flows per l3vni_dci_traffic_flows.txt
  11. Added five trigger test classes with L3VNI traffic verification
  12. Enhanced L3 traffic generation to include both SH (orphan P1) and MH (PortChannel) sources
  13. Added get_dci_link_interfaces() helper to retrieve DCI-facing interfaces from topology config
  14. Removed duplicate verification steps from 19 test methods
  15. Added simultaneous=True parameter to dual-stack tests
  16. Updated traffic rates from 0.001 to 0.1 in YAML config
  17. Separated SH and MH traffic in dual-stack tests using traffic_names filter
  18. Consolidated Type-5 CLI dumps in verify_base_setup_bgw() to avoid duplicate output
  19. Integrated continuous traffic functionality: Added dci_flap_continuous stream creation in tgen_preconfig, replaced TestVxlanInterfaceTriggers with parametrized version
  20. Removed pre-trigger traffic verification from all trigger test classes — traffic is now only verified after the trigger to avoid redundancy with base setup verification

How did you verify/test it?

Code review only. Changes implement L3VNI configuration and verification logic per reference configuration files (l3vni_config_diff.txt, vxlan_dci_input_file.yaml). Test execution requires a physical testbed with a 3-DC VXLAN DCI topology (14 nodes: 7 leafs + 2 spines + 5 BGWs) and an IXIA traffic generator.

Manual verification performed:

  • ASN assignment tracing via Python script confirms correct BGW ASN values (65102-65106) per node sort order
  • WAN VIP generation formula verified against reference config (l3vni_config_diff.txt lines 117, 246, 370, 490, 610)
  • Type-5 route parser tested against three sample FRR outputs (fix_type5_verification.txt, add_route_type5_v2.txt, evpn_type_5_output.txt)
  • BGW path count formula validated against real lab output: type5_verification_fix.txt (1783 lines, 9 paths), evpn_type_5_output.txt (656 lines, 7 paths), both within expected range [7..9]
  • Parser validation against two new reviewer lab files — all 21+20 prefixes parsed correctly, all attributes (RT/ET/RMAC) extracted, path markers (*, *=, *>) handled, IPv4/IPv6 next-hops detected
  • Unified model validation — syntax check passed, boolean extraction logic verified against lab data
  • Leaf Type-5 fix validated against reviewer lab output: type5_not_working.txt shows both failures (has_remote_class_path, installed_in_fib) correctly diagnosed and fixed
  • Self-originated route fix — logic review confirms weight 32768 + empty AS path correctly identifies self-originated routes on single-leaf DCs
  • MH flow generation logic verified against l3vni_dci_traffic_flows.txt (flows 9-10, 19-20)
  • Syntax check passed for both modified files (vxlan_helper.py, test_vxlan_dci.py)
  • Type-5 CLI consolidation verified — functions accept optional cli_output kwarg and fall back to CLI fetch if not provided
  • Continuous traffic integration verified — syntax check passed, parametrized test structure follows pytest conventions
  • Pre-trigger removal verified — Python syntax check passed after removing all pre-trigger traffic blocks from 7 test methods

Any platform specific information?

Requires SONiC with:

  • EVPN VXLAN support (L2VNI + L3VNI)
  • Dual VXLAN tunnels (vxlan-dc + vxlan-wan)
  • FRR BGP with RT-REWRITE route-map support
  • EVPN Multi-homing (MH) for trigger tests

Supported testbed topology if it's a new test case?

3-datacenter EVPN-VXLAN DCI topology:

  • DC1: 4 leafs + 2 regular spines + 2 BGW spines
  • DC2: 2 leafs + 2 BGW spines
  • DC3: 1 leaf + 1 BGW spine
  • IXIA traffic generator connected to all leaf nodes

Documentation

Test plan: DCI_Solution_Testplan.xlsx (L3VNI_Testcases sheet)
Reference config: l3vni_config_diff.txt (685 lines, 5 BGW sections)
Traffic flows: l3vni_dci_traffic_flows.txt (23 specific L3 DCI flows)

Human Review Checklist

CRITICAL — Self-Originated Route Detection:

  • Verify elif placement doesn't miss edge cases — the self-originated check is an elif after ASN match check. If a path has weight 32768 AND a non-empty AS path (unlikely but theoretically possible), it would NOT match. Verify this is the intended behavior.
  • Validate on DC3 leaf0_dc3 — the fix assumes DC3 has only 1 leaf with all self-originated routes. Confirm this topology matches the actual testbed.
  • Check for false positives — verify no non-self-originated routes can have weight 32768 + empty AS path combination

CRITICAL — Leaf Type-5 Fix:

  • Verify FIB regex [A-Z]\S*>\S*\s+ doesn't match false positives — should only match route lines with > (FIB-selected) flag; test against various FRR output formats
  • Verify leaf has_remote_class_path is correctly n/a — confirm leaves don't see BGW ASNs in EVPN Type-5 AS paths even after BGW re-origination
  • CRITICAL: BGW verification not validated against lab data — only leaf data provided in type5_not_working.txt and leaf0_dc3_type5.txt; BGW Type-5 verification logic unchanged but may have undiscovered issues

CRITICAL — Unified Type-5 Model:

  • Verify compare_exp_actual_data() handles 'n/a' values correctly — RIB/FIB checks use 'n/a' for IPv6 prefixes, has_remote_class_path uses 'n/a' for leaf nodes
  • Verify path classification logic (local vs remote class) extracts ASNs from AS path correctly — depends on AS path string format from _parse_type5_routes_detailed()
  • Confirm _parse_type5_routes_detailed() populates all required fields — must include rt, et, rmac, next_hop, as_path, weight for boolean checks to work
  • Verify get_expected_type5_routes() ASN collection logic: local_leaf_asns, local_bgw_asns, remote_bgw_asns must be correctly populated for path classification
  • Check that no other code calls deleted functions: _verify_type5_leaf(), _verify_type5_bgw(), verify_evpn_type5_rib_fib() should have zero references
  • Verify verify_type5_route_presence_dci() works on both leaf and BGW — uses same boolean checks as unified model
  • Confirm RIB fetch per-VRF doesn't time out or fail: verify_evpn_type5_comprehensive() fetches show ip route vrf X for each VRF in exp_routes

Pre-Trigger Traffic Removal:

  • Verify step renumbering is consistent — check that docstrings, st.banner() calls, and code comments all reflect the same step numbers after pre-trigger removal
  • Confirm all 7 test methods had pre-trigger blocks removed — TestVxlanRestartTriggers (2), TestVxlanReloadTriggers (3), TestVxlanBGPTriggers (2)
  • Verify no accidental removal of base setup verification — only traffic verification should be removed, not base setup checks
  • Confirm post-trigger traffic verification is still present — must verify traffic after the trigger action completes

Traffic and Configuration:

  • Verify continuous traffic logic in test_dci_link_trigger — when scope="all_bgws", traffic is expected to FAIL while shut (negative test), verify this is correctly implemented
  • Review f-string vs .format() consistency — parametrized test uses .format() consistently with rest of codebase (good)
  • Verify _flatten_stream_ids correctly handles dci_flap_continuous — dict type (like l3_v4/l3_v6) vs list type (like bum_SH/bum_MH)
  • Verify WAN VIP values (101.x, 102.x, 103.x) propagate correctly to RT-REWRITE route-maps, Loopback11, vxlan-wan tunnel source
  • Confirm ASN assignment (65102-65106) matches actual testbed node naming convention
  • Confirm L3VNI config (VLAN 101/102, VRF-VNI map, RT-REWRITE route-maps) matches l3vni_config_diff.txt reference
  • Verify 23 L3 DCI traffic flows match l3vni_dci_traffic_flows.txt specification
  • Verify MH (PortChannel) flow generation logic in find_l3_traffic_endpoints()
  • Confirm SH/MH split in tgen_preconfig() correctly filters endpoints
  • CRITICAL: Verify cross-DC L3 SH/MH split logic
  • CRITICAL: Verify st.show(dut, cmd, type='vtysh', skip_tmpl=True) returns raw text output
  • CRITICAL: Verify get_dci_link_interfaces() lookup logic
  • Verify all DCI link tests handle empty DCI interface list gracefully
  • Check that trigger tests properly use verify_base_setup_bgw() with correct node filters
  • Verify traffic_types list in all trigger tests includes both L2VNI and L3VNI types
  • CRITICAL (duplicate removal): Verify pytest execution order guarantees test_base_dci_bringup runs first
  • CRITICAL (simultaneous traffic): Verify stream_id values are unique across l3_v4 and l3_v6
  • Review simultaneous mode hardcodes mode='traffic_item'
  • Verify type5_route_withdrawal test pre-condition
  • CRITICAL (traffic rate): Verify 100x rate increase doesn't cause congestion
  • Review dual-stack SH/MH separation: verify traffic_names correctly filter streams
  • CRITICAL (Type-5 consolidation): Verify pre-fetched type5_cli_output remains valid
  • Verify cli_output kwarg handling in helper functions
  • CRITICAL (parametrized tests): Verify test_cfg['testcases'] dict contains entries for hardcoded tc_ids: test_leaf_interface_shut_noshut references "test_portchannel_shut_noshut" and "test_host_interface_shut_noshut_orphan" which must exist in test config
  • CRITICAL (helper methods): Verify test_cfg structure includes required keys: _get_leaf_portchannels_by_dc accesses test_cfg.get(dut, {}).get('port_channels'), _get_bgw_dci_interfaces accesses test_cfg['nodes'].get('dc1_bgw') etc.

- Add test_base_dci_l3vni_ipv4_across_dci: L3VNI IPv4 traffic across DCI
- Add test_base_dci_l3vni_ipv6_across_dci: L3VNI IPv6 traffic across DCI
- Add test_base_dci_l3vni_control_plane_across_dci: L3VNI control plane
  verification (VRF-VNI maps, EVPN VNI, BGP EVPN summary, Type-5 routes)
- Add verify_evpn_type5_routes_dci() helper in vxlan_helper.py
- Enable ENABLE_L3_ACROSS_DCI flag for cross-DC L3 stream generation
@devin-ai-integration

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

bpar9 added 2 commits March 10, 2026 06:59
- L3VNI_dci:1: Full base profile verification (VRF-VNI, VLAN-VNI,
  Type-5 routes on BGWs) + IPv4 traffic across DCI
- L3VNI_dci:2: Type-5 route detail verification (format, L3VNI in
  ext-community, IPv6 VTEP next-hop, RT values) + BGP EVPN summary
  + IPv6 traffic across DCI
- L3VNI_dci:6: eBGP multihop EVPN session verification between BGWs
  across DCs + VRF-VNI maps + EVPN VNI table + Type-5 route exchange

New helpers in vxlan_helper.py:
- verify_evpn_type5_route_detail_dci(): Checks Type-5 route format,
  L3VNI (10101/10102) in extended community, RT values, IPv6 next-hop
- verify_bgp_evpn_multihop_sessions_dci(): Verifies eBGP multihop
  EVPN sessions to remote DC BGWs via OVERLAY_WAN peer-group
…plan

- Rename test_base_dci_l3vni_ipv4_across_dci -> test_base_dci_l3vni_base_profile (L3VNI_dci:1)
- Rename test_base_dci_l3vni_ipv6_across_dci -> test_base_dci_l3vni_type5_route_ipv6_vtep (L3VNI_dci:2)
- Rename test_base_dci_l3vni_control_plane_across_dci -> test_base_dci_l3vni_ebgp_multihop_bgw (L3VNI_dci:6)
- Remove config steps 1-3 from L3VNI_dci:1 docstring (config done by hooks, not test code)
- Remove traffic step from L3VNI_dci:1 (not in testplan description)
- Remove BGP EVPN summary step and traffic step from L3VNI_dci:2 (not in testplan description)
- Update docstrings to match testplan titles exactly
@vallabh78
Collaborator

In the l3vni_config_diff.txt file we have configuration specific to L3VNI which is missing here. This needs to be incorporated before doing testcases verification.

@bpar9
Owner Author

bpar9 commented Mar 10, 2026

In the l3vni_config_diff.txt file we have configuration specific to L3VNI which is missing here. This needs to be incorporated before doing testcases verification.

…rification

Incorporates the L3VNI-specific configuration for all BGW nodes as part of
the config_bgw_nodes() fixture, ensuring it runs before L3VNI test verification.

SONiC CLI (l3vni_sonic_bgw_dci):
  - VLAN 101/102 creation, VRF add, VRF-VLAN bindings
  - VXLAN map on vxlan-dc and vxlan-wan (cross-DC L3VNI 10101/10102)
  - VRF-VNI map (Vrf101->10101, Vrf102->10102)

FRR config (l3vni_frr_bgw_dci):
  - VRF-VNI bindings (cross-DC L3VNI)
  - BGP extcommunity-lists (RT-WAN-* for leaf routes, RT-DC-* for remote BGW routes)
  - RT-REWRITE-WAN route-map (IPv4 WAN VIP next-hop)
  - RT-REWRITE-DC route-map (IPv6 DC VIP next-hop)
  - Apply route-maps to OVERLAY and OVERLAY_WAN neighbors
  - BGP VRF config with route-target import/export

Helper functions added to vxlan_helper.py:
  - _get_l3vni_bgw_params(): Computes per-BGW L3VNI parameters from topology
  - generate_l3vni_bgw_sonic_config(): SONiC CLI config generator
  - generate_l3vni_bgw_frr_config(): FRR config generator
  - delete_l3vni_bgw_frr_config(): FRR unconfig generator

Addresses PR comment about missing L3VNI configuration from l3vni_config_diff.txt.
@devin-ai-integration

Addressed: L3VNI configuration from l3vni_config_diff.txt incorporated (commit cba3b5f)

The L3VNI-specific configuration is now applied as part of config_bgw_nodes() before any L3VNI test verification runs.

Two new config features added:

  1. l3vni_sonic_bgw_dci — SONiC CLI config per BGW:

    • VLAN 101/102 creation, VRF add (Vrf101/Vrf102)
    • VRF-VLAN bindings (Vlan11-15→Vrf101, Vlan16-20→Vrf102)
    • VXLAN map on both vxlan-dc and vxlan-wan (cross-DC L3VNI 10101/10102)
    • VRF-VNI map (Vrf101→10101, Vrf102→10102)
  2. l3vni_frr_bgw_dci — FRR config per BGW:

    • VRF-VNI bindings (cross-DC L3VNI)
    • bgp extcommunity-list definitions (RT-WAN-* for leaf routes, RT-DC-* for remote BGW routes)
    • route-map RT-REWRITE-WAN (match leaf routes → set cross-DC VNI, rmac, RT, IPv4 WAN VIP next-hop)
    • route-map RT-REWRITE-DC (match remote BGW routes → set cross-DC VNI, rmac, RT, IPv6 DC VIP next-hop)
    • Apply route-maps: neighbor OVERLAY route-map RT-REWRITE-DC out, neighbor OVERLAY_WAN route-map RT-REWRITE-WAN out
    • BGP VRF config with route-target export (own ASN:cross-DC VNI) and import (remote BGW + local leaf RTs)

All parameters derived dynamically from topology data. Corresponding delete_l3vni_* unconfig added to unconfig_bgw_nodes().

@bpar9
Owner Author

bpar9 commented Mar 11, 2026

There are additional configurations in the l3vni_config_diff.txt file specific to L3VNI which are missing here. These need to be incorporated before doing testcases verification.

Per l3vni_config_diff.txt lines 1-39, each leaf node needs to import
route-targets from its local (same-DC) BGW cross-DC L3VNI so that
Type-5 prefix routes from remote DCs are accepted into the leaf's VRF.

Example for DC1 leafs (Vrf101, cross-DC VNI 10101):
  route-target import 65102:10101  (DC1 BGW1)
  route-target import 65103:10101  (DC1 BGW2)

Changes:
- vxlan_helper.py: Add generate_l3vni_leaf_rt_config() and
  delete_l3vni_leaf_rt_config() helpers; register l3vni_leaf_rt_dci
  and delete_l3vni_leaf_rt_dci features in config_feature_dci()
- test_vxlan_dci.py: Apply l3vni_leaf_rt_dci in config_l2l3vni()
  after bgp_l3vni_config_dci; remove in unconfig_l2l3vni() before
  delete_bgp_l3vni_config_dci
@devin-ai-integration

@bpar9 Addressed the missing L3VNI configurations from l3vni_config_diff.txt in commits cba3b5f and 75434ec:

BGW config (commit cba3b5f) — lines 42-685 of l3vni_config_diff.txt:

  • SONiC CLI: VLAN 101/102, VRF, VRF-VLAN bindings, VXLAN map (vxlan-dc/vxlan-wan) with cross-DC L3VNI 10101/10102, VRF-VNI map
  • FRR: VRF-VNI bindings, extcommunity-lists (RT-WAN-* / RT-DC-*), RT-REWRITE-WAN/RT-REWRITE-DC route-maps, BGP VRF with route-target import/export
  • Applied in config_bgw_nodes() before test verification

Leaf VRF route-target imports (commit 75434ec) — lines 1-39 of l3vni_config_diff.txt:

  • Each leaf imports cross-DC L3VNI RTs from its local (same-DC) BGWs:
    • DC1 leafs: route-target import 65102:10101, 65103:10101 (Vrf101); 65102:10102, 65103:10102 (Vrf102)
    • DC2 leafs: route-target import 65104:10101, 65105:10101; 65104:10102, 65105:10102
    • DC3 leaf: route-target import 65106:10101; 65106:10102
  • Applied in config_l2l3vni() after bgp_l3vni_config_dci

All parameters derived dynamically from topology data. Cross-DC VNI computed as 10000 + vrf_id.
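
The leaf-side imports listed above can be sketched as follows. This is a minimal illustration and not the code in vxlan_helper.py: the names `LOCAL_BGW_ASNS`, `cross_dc_vni` and `leaf_rt_imports` are hypothetical, and the ASN table simply restates the route-target values quoted in this comment.

```python
# Hypothetical topology table: ASNs of the local (same-DC) BGWs per DC,
# restated from the route-target values quoted in this thread.
LOCAL_BGW_ASNS = {
    "dc1": [65102, 65103],
    "dc2": [65104, 65105],
    "dc3": [65106],
}


def cross_dc_vni(vrf_id):
    """Cross-DC L3VNI as described in the thread: 10000 + vrf_id."""
    return 10000 + vrf_id


def leaf_rt_imports(dc, vrf_id):
    """Return the 'route-target import' lines a leaf in `dc` needs for one VRF."""
    vni = cross_dc_vni(vrf_id)
    return [
        "route-target import {}:{}".format(asn, vni)
        for asn in LOCAL_BGW_ASNS[dc]
    ]


if __name__ == "__main__":
    for vrf_id in (101, 102):
        for line in leaf_rt_imports("dc1", vrf_id):
            print("Vrf{}: {}".format(vrf_id, line))
```

Keeping the VNI derivation in one small function means the leaf and BGW generators cannot drift apart on the `10000 + vrf_id` rule.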

@vallabh78
Collaborator

For the verification, we have this verify_base_setup_bgw in the test_vxlan_dci.py file. Can we use that function to verify L3VNI base testcases?

…VNI checks

Per PR review comment: replace manual VRF-VNI and VLAN-VNI verification
loops in L3VNI test cases with calls to verify_base_setup_bgw().

- L3VNI_dci:1: Steps 1-2 (VRF-VNI + VLAN-VNI on all nodes) replaced with
  verify_base_setup_bgw(nodes, checks=['vrf_vni', 'vlan_vni'])
- L3VNI_dci:2: Step 2 (VRF-VNI on BGWs) replaced with
  verify_base_setup_bgw(bgw_nodes, checks=['vrf_vni'])
- L3VNI_dci:6: Step 2 (VRF-VNI on BGWs) replaced with
  verify_base_setup_bgw(bgw_nodes, checks=['vrf_vni'])
- Type-5 route verification remains custom (not available in verify_base_setup_bgw)
@devin-ai-integration

@vallabh78 Good suggestion. Refactored in commit 8349da3 to reuse verify_base_setup_bgw for the VRF-VNI and VLAN-VNI checks in all three L3VNI test cases:

  • L3VNI_dci:1: verify_base_setup_bgw(nodes, checks=['vrf_vni', 'vlan_vni']) replaces manual loops for Steps 1-2
  • L3VNI_dci:2: verify_base_setup_bgw(bgw_nodes, checks=['vrf_vni']) replaces manual VRF-VNI loop
  • L3VNI_dci:6: verify_base_setup_bgw(bgw_nodes, checks=['vrf_vni']) replaces manual VRF-VNI loop

The Type-5 route verification (Steps 2/3 in L3VNI_dci:1, Step 1 in L3VNI_dci:2) and eBGP multihop session verification (Step 1 in L3VNI_dci:6) remain as custom code since verify_base_setup_bgw doesn't have Type-5 or multihop session checks.

@vallabh78
Collaborator

can we move the evpn route type 5 verification to verify_base_setup_bgw and call it in the testcase? In that way we can use it in other testcases.

@vallabh78
Collaborator

We need to add this bgp cache: bgp_info = vxlan_obj.get_bgp_underlay_info_cached() and pass it into l3vni_sonic_bgw_dci and l3vni_frr_bgw_dci in the config_bgw_nodes function. Without this it caused the script to fail and script was unable to configure any BGP related config.

…config_bgw_nodes

Address PR review comments:
1. Move EVPN Type-5 route verification into verify_base_setup_bgw as a
   new 'evpn_type5' check (BGW-only, auto-skipped on leaf nodes).
   Added to ALL_CHECKS, control_plane CHECK_SET, and docstring.
2. Add bgp_info = vxlan_obj.get_bgp_underlay_info_cached() in
   config_bgw_nodes and pass it to l3vni_sonic_bgw_dci and
   l3vni_frr_bgw_dci calls to fix BGP config lookup failures.

Refactored L3VNI test cases to use new evpn_type5 check:
- L3VNI_dci:1: checks=['vrf_vni', 'vlan_vni', 'evpn_type5']
- L3VNI_dci:2: checks=['vrf_vni', 'evpn_type5'] (Step 2)
- L3VNI_dci:6: checks=['vrf_vni', 'evpn_type5'] (merged old Steps 2+4)
@devin-ai-integration

@vallabh78 Done in commit fbe972b. Added evpn_type5 as a new check in verify_base_setup_bgw:

  • New check in ALL_CHECKS and control_plane CHECK_SET
  • Auto-skips on non-BGW nodes (leaf switches)
  • Calls vxlan_obj.verify_evpn_type5_routes_dci(dut) on BGW nodes

All 3 L3VNI test cases now use it:

  • L3VNI_dci:1: checks=['vrf_vni', 'vlan_vni', 'evpn_type5']
  • L3VNI_dci:2: checks=['vrf_vni', 'evpn_type5'] (Step 2)
  • L3VNI_dci:6: checks=['vrf_vni', 'evpn_type5'] (merged old Steps 2+4)

The detailed Type-5 route verification (verify_evpn_type5_route_detail_dci) in L3VNI_dci:2 Step 1 remains custom since it checks additional attributes (IPv6 next-hop, L3VNI in ext-community, RT values) beyond basic presence.

@devin-ai-integration

@vallabh78 Fixed in commit fbe972b. Added bgp_info = vxlan_obj.get_bgp_underlay_info_cached() in config_bgw_nodes() and now passing it to both l3vni_sonic_bgw_dci and l3vni_frr_bgw_dci calls:

bgp_info = vxlan_obj.get_bgp_underlay_info_cached()
vxlan_obj.config_feature_parallel(bgw_nodes, 'l3vni_sonic_bgw_dci', dci_enabled=True, bgp_info=bgp_info)
vxlan_obj.config_feature_parallel(bgw_nodes, 'l3vni_frr_bgw_dci', dci_enabled=True, bgp_info=bgp_info)

@vallabh78
Collaborator

Lets follow the same pattern for all testcases and use verify_base_setup_bgw and call specific verification based on testcase requirement.

…t pattern

Add three new reusable checks to verify_base_setup_bgw:
- evpn_type5_detail: Detailed Type-5 route verification (format, L3VNI, RT, IPv6 next-hop)
- ebgp_multihop: eBGP multihop EVPN session verification between BGWs across DCs
- evpn_vni: EVPN VNI table verification (L3 VNIs on BGW nodes)

Refactor L3VNI_dci:2 and L3VNI_dci:6 to use verify_base_setup_bgw for all
verification instead of custom loops, following the same pattern as L3VNI_dci:1.

L3VNI_dci:2: checks=['evpn_type5_detail', 'vrf_vni', 'evpn_type5']
L3VNI_dci:6: checks=['ebgp_multihop', 'vrf_vni', 'evpn_type5', 'evpn_vni']
@devin-ai-integration

@vallabh78 Done in commit 9d286a7. All three L3VNI test cases now follow the same pattern — every verification goes through verify_base_setup_bgw with testcase-specific checks:

Three new reusable checks added to verify_base_setup_bgw:

  • evpn_type5_detail: Detailed Type-5 route verification (format, L3VNI in ext-community, RT values, IPv6 next-hop) — BGW only
  • ebgp_multihop: eBGP multihop EVPN session verification between BGWs across DCs — BGW only
  • evpn_vni: EVPN VNI table verification (L3 VNIs) — BGW only

Refactored test cases:

  • L3VNI_dci:1: verify_base_setup_bgw(nodes, checks=['vrf_vni', 'vlan_vni', 'evpn_type5']) (unchanged)
  • L3VNI_dci:2: verify_base_setup_bgw(bgw_nodes, checks=['evpn_type5_detail', 'vrf_vni', 'evpn_type5'])
  • L3VNI_dci:6: verify_base_setup_bgw(bgw_nodes, checks=['ebgp_multihop', 'vrf_vni', 'evpn_type5', 'evpn_vni'])

No more custom verification loops in any L3VNI test case.
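
The pattern described above (one entry point, a testcase-specific list of checks, BGW-only checks skipped on leaves) can be sketched roughly as below. This is an assumption-laden illustration, not the real verify_base_setup_bgw(): `run_checks`, `all_passed`, the `verifiers` mapping and the `is_bgw` callback are all hypothetical names, and the set of BGW-only checks reflects the state of the PR at this comment.

```python
# Checks described in this comment as BGW-only (hypothetical constant name).
BGW_ONLY_CHECKS = {"evpn_type5", "evpn_type5_detail", "ebgp_multihop", "evpn_vni"}


def run_checks(nodes, checks, verifiers, is_bgw):
    """Run each requested check on each node, skipping BGW-only checks on leaves.

    verifiers: dict mapping check name -> callable(node) -> bool
    is_bgw:    callable(node) -> bool
    Returns {(node, check): 'pass' | 'fail' | 'skipped'}.
    """
    results = {}
    for node in nodes:
        for check in checks:
            if check in BGW_ONLY_CHECKS and not is_bgw(node):
                results[(node, check)] = "skipped"
                continue
            results[(node, check)] = "pass" if verifiers[check](node) else "fail"
    return results


def all_passed(results):
    """A run passes when nothing failed; skipped checks do not count against it."""
    return all(status != "fail" for status in results.values())


if __name__ == "__main__":
    # Stub verifiers that always succeed, purely for demonstration.
    stub = {"vrf_vni": lambda node: True, "evpn_type5": lambda node: True}
    res = run_checks(
        ["leaf0_dc1", "spine2_dc1_bgw1"],
        ["vrf_vni", "evpn_type5"],
        stub,
        is_bgw=lambda node: "bgw" in node,
    )
    print(res, all_passed(res))
```

Recording `skipped` separately from `pass` keeps the report honest about which checks actually ran on which node.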

@vallabh78
Collaborator

can we add the L3VNI vlan vrf binding in the vxlan_dci_input_file.yaml for the BGW's? below is an example

spine3_dc1_bgw2:
    l3vni:
        - vrf_id: 101
          vlan_bindings: [11, 12, 13, 14, 15, 101]
          vxlan_id: 5101
        - vrf_id: 102
          vlan_bindings: [16, 17, 18, 19, 20, 102]
          vxlan_id: 5102

@devin-ai-integration

@vallabh78 Done in commit 401dc90. Added l3vni sections to all 5 BGW nodes in vxlan_dci_input_file.yaml with per-BGW VRF-VNI values from l3vni_config_diff.txt:

| BGW | Vrf101 vxlan_id | Vrf102 vxlan_id |
|---|---|---|
| spine2_dc1_bgw1 | 10101 | 10102 |
| spine3_dc1_bgw2 | 5101 | 5102 |
| spine0_dc2_bgw1 | 5101 | 7102 |
| spine1_dc2_bgw2 | 5101 | 5102 |
| spine0_dc3_bgw1 | 5101 | 9102 |

All BGWs have the same vlan_bindings: [11, 12, 13, 14, 15, 101] for Vrf101 and [16, 17, 18, 19, 20, 102] for Vrf102.

Also updated _get_l3vni_bgw_params() in vxlan_helper.py to read BGW l3vni data directly from the YAML instead of deriving from a reference leaf. The FRR VRF-VNI binding (vrf Vrf101 / vni <value>) now uses the per-BGW vxlan_id from YAML rather than always using the cross-DC VNI (10101/10102).

@vallabh78
Collaborator

So all BGWs use the following. Update it in the yaml file:

vrf Vrf101
 vni 10101
exit-vrf
vrf Vrf102
 vni 10102
exit-vrf

@vallabh78
Collaborator

    # Determine test case number and ID
    test_map = {
        ("flap", "single"): (26, "test_dci_link_flap_1_link"),
        ("shut", "single"): (27, "test_dci_link_shut_1_link"),
        ("flap", "all_one_bgw"): (28, "test_dci_link_flap_all_interfaces"),
        ("shut", "all_one_bgw"): (29, "test_dci_link_shut_all_interfaces_one_bgw"),
        ("shut", "all_bgws"): (30, "test_dci_link_shut_all_interfaces_all_bgw"),
    }
    test_num, tc_id = test_map[(action, scope)]
    test_cfg['tc_id'] = tc_id
>   tc_cfg = get_tc_params(tc_id)

E NameError: name 'get_tc_params' is not defined

cisco/tortuga/solution/test_vxlan_dci.py:4317: NameError
=============================== warnings summary ===============================
cisco/tortuga/solution/test_vxlan_dci.py: 32 warnings
/data/sonic-test/sonic-mgmt/spytest/utilities/parallel.py:139: DeprecationWarning: currentThread() is deprecated, use current_thread() instead
return threading.currentThread()

@devin-ai-integration

@vallabh78 Good catch — fixed in commit e2d57ab. Both get_tc_params(tc_id) calls in test_leaf_interface_shut_noshut and test_dci_link_trigger were missing the vxlan_obj. prefix. Now corrected to vxlan_obj.get_tc_params(tc_id) to match the pattern used everywhere else in the file.

@vallabh78
Collaborator

wanted to update the type5 verification.
Leaf
expected prefixes present
best path exists
attributes present
installed in RIB/FIB
at least one usable path per prefix
optionally at least one re-originated/BGW-style path if your topology should import them
BGW
expected prefixes present
best path exists
attributes present
installed in RIB/FIB
at least one local-side path exists
at least one remote-side path exists

…ists, RIB/FIB on all nodes

Leaf model enhancements (_verify_type5_leaf):
- Added RT/ET/RMAC attribute checks (at least one path per prefix)
- Added usable path check (at least one valid path with next-hop)
- Added optional BGW-style (re-originated) path detection
  (only checked when bgw_asns provided in exp_route)

BGW model enhancements (_verify_type5_bgw):
- Added explicit best_path_exists check column

RIB/FIB install check (verify_base_setup_bgw):
- Removed BGW-only restriction; now runs on both leaf and BGW nodes
- Both node types verify tenant prefixes in show ip route vrf output
@devin-ai-integration

@vallabh78 Enhanced Type-5 verification models per your feedback in commit 6c3dc88.

Leaf model enhancements (_verify_type5_leaf):

| # | Check | How |
|---|---|---|
| 1 | prefix present | prefix found in parsed output |
| 2 | best path exists | best path (>) detected |
| 3 | attributes present | at least one path has RT, ET, RMAC (3 separate columns) |
| 4 | usable path | at least one valid path (*) with a next-hop |
| 5 | best NH = local VTEP | best path next-hop matches leaf's IPv6 loopback |
| 6 | best weight = 32768 | locally originated best path weight |
| 7 | BGW-style path (optional) | at least one path with BGW ASN in AS path (only checked when bgw_asns provided in expected route data) |

Checks 3, 4, 7 are new additions.

BGW model enhancements (_verify_type5_bgw):

| # | Check | How |
|---|---|---|
| 1 | prefix exists | prefix found in parsed output |
| 2 | path count matches | actual count == DC-specific expected count |
| 3 | best path exists | best path (>) detected (new explicit check) |
| 4 | local-site leaf path | at least one path with local leaf ASN |
| 5 | remote BGW path | at least one path with remote BGW ASN |
| 6 | best path is local leaf | best path AS is local leaf, not remote BGW |
| 7 | RT, ET, RMAC | at least one path has each attribute |
| 8 | IPv6 next-hop | at least one path has IPv6 VTEP NH |

Check 3 is new (was previously implicit in check 6).

RIB/FIB install check:

  • Now runs on both leaf and BGW nodes (previously BGW-only)
  • Verifies show ip route vrf Vrf101/Vrf102 has all expected tenant subnet prefixes as BGP routes

@vallabh78
Collaborator

check this file and see if we can verify type5 routes correctly on leaf and BGW.
type5_verification_fix.txt

The BGW path count formula was: local_leaf_count + remote_bgw_count
This gave 7 for DC1 BGW (4 leaves + 3 remote BGWs), but actual lab
output shows 9 paths per prefix.

Root cause: each remote BGW re-originates routes from ALL leaves in
its DC under separate RDs. So the correct formula is:
  local_leaf_count + SUM(bgw_count_in_dc * leaf_count_in_dc)

For DC1 BGW: 4 + (DC2: 2*2) + (DC3: 1*1) = 4 + 4 + 1 = 9

Validated against real lab output from type5_verification_fix.txt:
- Parser correctly handles *= (ECMP multipath) markers
- All 20 prefixes on DC1 BGW2 show exactly 9 paths
- Leaf verification: RT/ET/RMAC, best-path weight=32768, local VTEP all pass
@devin-ai-integration

@vallabh78 Analyzed type5_verification_fix.txt (1783 lines of real lab output) and validated the Type-5 parser and verification logic. Here are the findings:

Parser validation (_parse_type5_routes_detailed) — ALL PASS:

  • leaf0_dc1: 20 prefixes parsed, all with RT/ET/RMAC attributes, best path with weight=32768, local VTEP next-hop (2000:1::1) ✓
  • leaf0_dc2: 20 prefixes, 4 paths each, best path correctly identified ✓
  • spine3_dc1_bgw2 (BGW): 20 prefixes, 9 paths each, local-leaf (IPv6 NH) + remote-BGW (IPv4 NH 102.102.102.102) paths correctly distinguished ✓
  • *= (ECMP multipath) marker: Correctly parsed as is_valid=True, is_best=False — 140 valid non-best paths found on BGW ✓
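
The marker handling described in the last bullet can be illustrated with a small sketch. It is not the actual _parse_type5_routes_detailed() implementation: `parse_status_flags` is a hypothetical helper, it assumes the status codes are the leading non-space characters of a route line, and the sample lines in the usage section are illustrative rather than copied from the lab output.

```python
import re

# Leading status codes only: '*' = valid, '>' = best, '=' = multipath.
_STATUS_RE = re.compile(r"^\s*([*>=]+)")


def parse_status_flags(line):
    """Return validity/best/multipath flags from a route line's status codes."""
    match = _STATUS_RE.match(line)
    codes = match.group(1) if match else ""
    return {
        "is_valid": "*" in codes,
        "is_best": ">" in codes,
        "is_multipath": "=" in codes,
    }


if __name__ == "__main__":
    # Illustrative sample lines (assumed layout, not real lab output).
    for sample in ("*> [5]:[0]:[24]:[80.11.0.0]", "*= [5]:[0]:[24]:[80.11.0.0]"):
        print(sample, parse_status_flags(sample))
```

Treating `*=` as valid but not best is what lets ECMP paths count toward path totals without being mistaken for the selected path.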

BGW path count issue — IDENTIFIED and FIXED (commit 9c6bcf0):

The verification output showed expected_path_count=7 but actual=9 for ALL 20 prefixes on spine3_dc1_bgw2.

Root cause: get_expected_type5_routes() used local_leaf_count + remote_bgw_count (4 + 3 = 7), but each remote BGW re-originates routes from ALL leaves in its DC under separate RDs. The correct formula is:

local_leaf_count + SUM per remote DC of (bgw_count × leaf_count)

For DC1 BGW2: 4 + (DC2: 2×2) + (DC3: 1×1) = 4 + 4 + 1 = 9

Updated expected path counts:

| BGW Location | Formula | Count |
|---|---|---|
| DC1 BGWs | 4 + (2×2) + (1×1) | 9 |
| DC2 BGWs | 2 + (2×4) + (1×1) | 11 |
| DC3 BGW | 1 + (2×4) + (2×2) | 13 |

Leaf verification — ALL PASS (no changes needed):

  • RT/ET/RMAC attribute detection working correctly
  • Usable path check (valid path with next-hop) working correctly
  • Best path NH = local VTEP with weight=32768 verified
  • All 20 locally-originated prefixes on leaf0_dc1 correctly identified

@vallabh78
Collaborator

can you check if the verification will work for this -
add_route_type5.txt
evpn_type_5_output.txt

Validation of reviewer's two lab output files revealed that the BGW path
count per prefix varies depending on BGW re-origination behavior:
- evpn_type_5_output.txt shows 7 paths (best-only re-origination)
- type5_verification_fix.txt showed 9 paths (all-leaf re-origination)

Changed path count model from exact match to range:
- Min: local_leaves + remote_bgw_count (one path per remote BGW)
- Max: local_leaves + SUM(bgws*leaves per remote DC)
- DC1 BGWs: min=7, max=9

Updated three functions:
- get_expected_type5_routes(): now emits path_count_min/path_count_max
- _verify_type5_bgw(): range-based comparison via matching display strings
- verify_evpn_type5_route_presence_dci(): range-based comparison
@devin-ai-integration

@vallabh78 Completed comprehensive validation of the Type-5 parser and verification logic against both lab output files (add_route_type5_v2.txt and evpn_type_5_output.txt). Found and fixed a critical path count issue — pushed in commit cba8e10.


Parser Validation Results

File 1: add_route_type5_v2.txt (raw CLI, 3 node sections)

| Node | Prefixes | Path Count Distribution | RT/ET/RMAC | Best Path |
|---|---|---|---|---|
| leaf1_dc2 | 21 | 20×4 paths + 1×1 path (75.75.75.0/24 newly added) | All present | All 21 locally originated (weight=32768) |
| leaf0_dc1 | 21 | 20×4 paths + 1×5 paths (75.75.75.0/24 cross-DC) | All present | 20 local + 1 remote (75.75.75.0 via BGW) |

  • 75.75.75.0/24 on leaf1_dc2: 1 path, best, NH=2000:1::6, RT=65205:5101, weight=32768
  • 75.75.75.0/24 on leaf0_dc1: 5 paths via DC VIP (4000:1::1), AS paths correctly parsed (e.g., 65102 65104 65205)

File 2: evpn_type_5_output.txt (log-prefixed, spine2_dc1_bgw1)

| Metric | Result |
|---|---|
| Prefixes | 20 (10 IPv4 + 10 IPv6) |
| Path count | All 20 prefixes: exactly 7 paths each |
| Log prefix stripping | Working correctly |
| RT/ET/RMAC extraction | All present on all paths |
| Best path detection | All 20 prefixes have best path |
| IPv4 WAN next-hops | 103.103.103.103, 104.104.104.104 (3 WAN paths) |
| IPv6 DC next-hops | 2000:1::1 through 2000:1::4 (4 local leaf paths) |

Enhanced BGW verification checks ALL PASS on all 20 prefixes:

  • best_path_exists: yes
  • has_usable_paths: yes
  • has_rt: yes
  • has_et: yes
  • has_rmac: yes

Enhanced leaf verification checks ALL PASS on all 21 prefixes from leaf1_dc2.


Critical Finding: Path Count Discrepancy (Fixed)

The previous type5_verification_fix.txt showed 9 paths per prefix on DC1 BGW, but evpn_type_5_output.txt shows 7 paths. Both are valid depending on BGW re-origination behavior:

  • 7 paths = 4 local DC1 leaves + 3 remote BGWs (each BGW sends best-only path per prefix)
  • 9 paths = 4 local DC1 leaves + (DC2: 2 BGWs × 2 leaves) + (DC3: 1 BGW × 1 leaf)

The exact match actual_path_count != expected_path_count would FAIL for one scenario or the other.

Fix (commit cba8e10): Changed BGW path count from exact match to range check [min..max]:

| BGW Location | Min | Max | Formula |
|---|---|---|---|
| DC1 BGWs | 7 | 9 | 4 + [3..5] |
| DC2 BGWs | 5 | 11 | 2 + [3..9] |
| DC3 BGW | 5 | 13 | 1 + [4..12] |

Updated in all three functions:

  • get_expected_type5_routes(): emits path_count_min/path_count_max
  • _verify_type5_bgw(): range-based comparison
  • verify_evpn_type5_route_presence_dci(): range-based comparison

This ensures verification passes whether BGWs send best-only or all-leaf paths.
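
The min/max arithmetic in the table above can be reproduced with a short sketch. It is an illustration of the formula at this point in the thread, not the code in get_expected_type5_routes(): `TOPOLOGY` and `path_count_range` are hypothetical names, and the leaf/BGW counts are the ones stated in these comments.

```python
# Hypothetical topology summary: leaf and BGW counts per DC, as stated in the thread.
TOPOLOGY = {
    "dc1": {"leaves": 4, "bgws": 2},
    "dc2": {"leaves": 2, "bgws": 2},
    "dc3": {"leaves": 1, "bgws": 1},
}


def path_count_range(local_dc, topology=TOPOLOGY):
    """Return (min, max) expected Type-5 paths per prefix on a BGW in `local_dc`.

    min: local leaves + one path per remote BGW (best-only re-origination)
    max: local leaves + bgws * leaves summed over remote DCs (all-leaf re-origination)
    """
    local_leaves = topology[local_dc]["leaves"]
    remote = [dc for name, dc in topology.items() if name != local_dc]
    min_remote = sum(dc["bgws"] for dc in remote)
    max_remote = sum(dc["bgws"] * dc["leaves"] for dc in remote)
    return local_leaves + min_remote, local_leaves + max_remote


def path_count_ok(actual, local_dc):
    """Range check used in place of an exact-match comparison."""
    low, high = path_count_range(local_dc)
    return low <= actual <= high


if __name__ == "__main__":
    for dc in TOPOLOGY:
        print(dc, path_count_range(dc))
```

Both observed lab values for a DC1 BGW (7 and 9 paths) fall inside the range this computes for `dc1`.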

@vallabh78
Collaborator

Can you confirm if this is what we have:
What to check on leaf and BGW with the same logic

For each prefix, derive booleans such as:

present
has_best
has_rt
has_et
has_rmac
has_ipv6_nh
installed_in_rib
installed_in_fib
has_local_class_path
has_remote_class_path

Then pass if all required booleans are true.

Remove these strict checks:

best_nh_is_local_vtep
best_weight_32768
best_path_is_local_leaf
exact path_count == N
exact RD matching

These are the ones that will break after restart even if behavior is acceptable.

…move strict checks

Per reviewer feedback, replace separate _verify_type5_leaf() and
_verify_type5_bgw() with unified _verify_type5_unified() using the
same 10 boolean columns for every prefix on both node types:

  present, has_best, has_rt, has_et, has_rmac, has_ipv6_nh,
  installed_in_rib, installed_in_fib,
  has_local_class_path, has_remote_class_path

Removed strict checks that break after restart:
  - best_nh_is_local_vtep
  - best_weight_32768
  - best_path_is_local_leaf
  - exact path_count == N
  - exact RD matching

RIB/FIB install check is now integrated into
verify_evpn_type5_comprehensive() (no longer a separate check).
verify_type5_route_presence_dci() also uses boolean checks only.
@devin-ai-integration

@vallabh78 — Confirmed, this is now exactly what we have. Commit 3af1bee unifies both leaf and BGW to use the same 10 boolean checks per prefix:

| Boolean | Description |
|---|---|
| present | Prefix exists in Type-5 output |
| has_best | At least one best path selected |
| has_rt | At least one path has Route-Target |
| has_et | At least one path has Encap-Type |
| has_rmac | At least one path has Router-MAC |
| has_ipv6_nh | At least one path has IPv6 next-hop |
| installed_in_rib | Prefix found in show ip route vrf (IPv4 only, n/a for IPv6) |
| installed_in_fib | Prefix has FIB-selected flag in RIB output (IPv4 only) |
| has_local_class_path | Path from local-DC source (leaf ASN in AS path) |
| has_remote_class_path | Path from remote source (BGW: other-DC BGW ASN; Leaf: same-DC BGW re-originated) |

Pass = all required booleans are yes (or n/a if not applicable).

Removed strict checks (all five):

  • best_nh_is_local_vtep
  • best_weight_32768
  • best_path_is_local_leaf
  • exact path_count == N
  • exact RD matching

Additional cleanup:

  • verify_evpn_type5_rib_fib() removed as standalone — RIB/FIB is now integrated into the unified check
  • verify_type5_route_presence_dci() also updated to use boolean checks only (no path count range)
  • rib_fib removed from ALL_CHECKS / CHECK_SETS in verify_base_setup_bgw() since it's now part of evpn_type5_comprehensive
  • get_expected_type5_routes() simplified — both leaf and BGW entries now carry local_leaf_asns and remote_bgw_asns (leaf remote = same-DC BGW ASNs; BGW remote = other-DC BGW ASNs)

Net: −574 lines, +186 lines across both files.
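
The pass rule stated above ("all required booleans are yes, or n/a if not applicable") can be sketched as follows. This is a simplified illustration rather than _verify_type5_unified() itself: `COLUMNS`, `failed_columns` and `prefix_passes` are hypothetical names, and each per-prefix row is assumed to map a column name to one of the strings 'yes', 'no' or 'n/a'.

```python
# The ten boolean columns listed in this comment.
COLUMNS = [
    "present", "has_best", "has_rt", "has_et", "has_rmac", "has_ipv6_nh",
    "installed_in_rib", "installed_in_fib",
    "has_local_class_path", "has_remote_class_path",
]

ACCEPTABLE = ("yes", "n/a")


def failed_columns(row):
    """Return the columns that are neither 'yes' nor 'n/a' (missing counts as 'no')."""
    return [col for col in COLUMNS if row.get(col, "no") not in ACCEPTABLE]


def prefix_passes(row):
    """A prefix passes when every column is 'yes' or 'n/a'."""
    return not failed_columns(row)


if __name__ == "__main__":
    # An IPv6 prefix: RIB/FIB columns are not applicable.
    ipv6_row = dict.fromkeys(COLUMNS, "yes")
    ipv6_row.update(installed_in_rib="n/a", installed_in_fib="n/a")
    print(prefix_passes(ipv6_row), failed_columns(ipv6_row))
```

Returning the failing column names, not just a boolean, makes the per-prefix report easier to read when a check regresses after a trigger.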

@vallabh78
Collaborator

lets remove verify_traffic before the trigger. Check TestVxlanRestartTriggers and TestVxlanReloadTriggers and TestVxlanBGPTriggers. we need to verify the traffic only after the trigger.

@devin-ai-integration

@vallabh78 Done — removed all pre-trigger verify_traffic() calls from the three trigger test classes per your feedback. Traffic is now only verified after the trigger.

Classes/methods modified (commit c026de7):

TestVxlanRestartTriggers:

  • test_leaf_restart_process — removed Step 2 pre-trigger traffic verification
  • test_dci_restart_process — removed Step 2 pre-trigger traffic verification

TestVxlanReloadTriggers:

  • test_config_reload — removed Step 2 pre-trigger traffic verification
  • test_reboot — removed Step 2 pre-trigger traffic verification
  • test_power_cycle — removed Step 2 pre-trigger traffic verification

TestVxlanBGPTriggers:

  • test_bgp_hard_reset — removed Step 1b pre-trigger traffic verification
  • test_bgp_soft_reset — removed Step 1b pre-trigger traffic verification

Step numbers renumbered accordingly in all methods. Post-trigger traffic verification and base setup verification retained.

@vallabh78
Collaborator

type5 verification failed again. Make sure you are doing correct verification for both leaf and BGW. Subnets are same but prefixes path may change.
type5_not_working.txt
Suggestion:
For leaf checks, use this logic:

For control-plane type-5 check

For each expected prefix:

present = yes
has_best = yes
has_rt = yes
has_et = yes
has_rmac = yes
has_ipv6_nh = yes
has_local_class_path = yes
Do not require on leaf
has_remote_class_path = yes
installed_in_fib = yes for locally owned /24 or /64 aggregates as BGP route
Instead use:
if prefix is locally owned subnet, accept:
connected in VRF route table
if prefix is remote subnet, then expect:
BGP installed in RIB/FIB

For BGW, don't use same but from the data we have make sure the verification will work

@vallabh78
Collaborator

how long will it take for type

type5 verification failed again. Make sure you are doing correct verification for both leaf and BGW. Subnets are same but prefixes path may change.
type5_not_working.txt
Suggestion:
For leaf checks, use this logic:

For control-plane type-5 check

For each expected prefix:

present = yes
has_best = yes
has_rt = yes
has_et = yes
has_rmac = yes
has_ipv6_nh = yes
has_local_class_path = yes
Do not require on leaf
has_remote_class_path = yes
installed_in_fib = yes for locally owned /24 or /64 aggregates as BGP route
Instead use:
if prefix is locally owned subnet, accept:
connected in VRF route table
if prefix is remote subnet, then expect:
BGP installed in RIB/FIB

For BGW, don't use same but from the data we have make sure the verification will work

where are we on this?

@devin-ai-integration

@vallabh78 Fixed both leaf Type-5 verification failures (commit 3607c39):

1. has_remote_class_path — now n/a on leaf nodes
Leaves don't see BGW ASNs in EVPN Type-5 AS paths (BGW re-originates but leaf receives via spine RR with only leaf ASNs visible). Changed get_expected_type5_routes() to set remote_bgw_asns = set() for leaf nodes, so this check becomes n/a instead of required yes.

2. installed_in_fib — now accepts connected routes
Locally owned /24 subnets show as C>* 80.11.0.0/24 is directly connected (connected route), not B>* (BGP). Changed FIB regex from r'B[*>]+.*prefix' (BGP-only) to r'[A-Z]\S*>\S*\s+prefix' (any protocol with > FIB flag). This correctly matches both:

  • C>* 80.11.0.0/24 — locally owned subnet (connected)
  • B>* 80.12.0.40/32 — remote learned route (BGP)

Leaf check logic now:

  • present = yes
  • has_best = yes
  • has_rt = yes
  • has_et = yes
  • has_rmac = yes
  • has_ipv6_nh = yes
  • has_local_class_path = yes
  • has_remote_class_path = n/a (not required on leaf)
  • installed_in_rib = yes (IPv4) / n/a (IPv6)
  • installed_in_fib = yes (accepts C>* connected or B>* BGP) / n/a (IPv6)

BGW check logic unchanged: has_remote_class_path is still required (other-DC BGW ASNs are visible in paths), and the FIB regex is now also protocol-agnostic for safety.
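
The effect of the FIB regex change can be checked with a small sketch. The two patterns are the ones quoted in this comment; the wrapper functions `fib_installed_bgp_only` and `fib_installed`, and the use of `re.escape()` on the prefix, are assumptions made for the illustration and may differ from the helper code.

```python
import re


def fib_installed_bgp_only(route_output, prefix):
    """Old behaviour: only a BGP route line counts as FIB-installed."""
    pattern = re.compile(r"B[*>]+.*" + re.escape(prefix))
    return bool(pattern.search(route_output))


def fib_installed(route_output, prefix):
    """New behaviour: any protocol code followed by the '>' selected flag counts."""
    pattern = re.compile(r"[A-Z]\S*>\S*\s+" + re.escape(prefix))
    return bool(pattern.search(route_output))


if __name__ == "__main__":
    connected = "C>* 80.11.0.0/24 is directly connected"
    learned = "B>* 80.12.0.40/32"
    print(fib_installed_bgp_only(connected, "80.11.0.0/24"))  # connected route missed
    print(fib_installed(connected, "80.11.0.0/24"))
    print(fib_installed(learned, "80.12.0.40/32"))
```

A route line without the `>` flag does not match the new pattern, so unselected routes are still reported as not installed.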

@vallabh78
Collaborator

type5 verification failing again. i see it fails for leaf0_dc3
leaf0_dc3_type5.txt

type5_meaning.txt

@devin-ai-integration

Fixed in commit d4f6f08. The root cause on leaf0_dc3:

Problem: has_local_class_path = no for ALL 20 prefixes. DC3 has only 1 leaf (leaf0_dc3, ASN 65206), so ALL Type-5 routes are self-originated with weight=32768 and empty AS path (?). The local_class check was looking for ASN 65206 in the AS path, but self-originated routes don't include the originator's own ASN.

Fix: In _verify_type5_unified(), self-originated paths (weight=32768 + empty AS path) now count as local-class:

elif not as_path.strip() and path.get('weight') == '32768':
    # Self-originated route: empty AS path + weight 32768
    has_local_class = True

This handles single-leaf DCs (DC3) where all routes are self-originated, while multi-leaf DCs (DC1 with 4 leafs) continue to match on peer leaf ASNs in AS path as before.
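
A self-contained sketch of the local-class decision described above follows. It is an approximation of the logic, not the body of _verify_type5_unified(): `has_local_class_path` as a standalone function is hypothetical, each path is assumed to be a dict with string `as_path` and `weight` fields, and treating non-numeric tokens such as the `?` origin code as "no ASN" is an assumption of this sketch.

```python
def has_local_class_path(paths, local_leaf_asns):
    """Return True if any path is local-class.

    A path is local-class when either:
      - its AS path contains a local leaf ASN, or
      - it is self-originated: no ASNs in the AS path and weight 32768.
    """
    for path in paths:
        # Keep only numeric tokens so an origin code like '?' is ignored.
        asns = {tok for tok in path.get("as_path", "").split() if tok.isdigit()}
        if asns & set(local_leaf_asns):
            return True
        if not asns and path.get("weight") == "32768":
            return True
    return False


if __name__ == "__main__":
    # Single-leaf DC3: every route is self-originated.
    dc3_paths = [{"as_path": "", "weight": "32768"}]
    print(has_local_class_path(dc3_paths, {"65206"}))
```

Requiring both conditions for the self-originated case avoids counting a path that merely has a stripped or unparsed AS path.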

@vallabh78
Collaborator

i see the ixia traffic rate is 0.8% and i want it to be 0.1%. can we fix that?

@devin-ai-integration

Fixed in commit 5f5b3f3. Changed the default fallback rate_percent from 0.8 to 0.1 across all 23 create_traffic_item() calls in tgen_preconfig() — both L2/L3 streams and BUM/continuous streams.

The YAML config already had rate_percent: 0.1, but the Python fallback default was 0.8, so if the YAML lookup missed for any reason, it would use 0.8% instead of 0.1%.

@vallabh78
Collaborator

can you confirm if its this
l2l3:
    pkts_per_burst: 100
    rate_percent: 0.1
bum:
    pkts_per_burst: 10
    rate_percent: 0.1

@devin-ai-integration

Yes, confirmed. The YAML (vxlan_dci_input_file.yaml lines 857-862) has exactly:

l2l3:
    pkts_per_burst: 100
    rate_percent: 0.1
bum:
    pkts_per_burst: 10
    rate_percent: 0.1

And the Python fallback defaults in tgen_preconfig() now also use 0.1 (commit 5f5b3f3), so even if the YAML lookup misses, it still uses 0.1%.
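
The fallback behaviour described above can be sketched like this. It is an illustration only: `DEFAULT_RATE_PERCENT` and `resolve_rate_percent` are hypothetical names, the traffic settings are assumed to be loaded into a plain nested dict, and the real tgen_preconfig() may look the value up differently.

```python
# Fallback used when the YAML lookup misses (0.1 after the fix, previously 0.8).
DEFAULT_RATE_PERCENT = 0.1


def resolve_rate_percent(traffic_cfg, stream_class):
    """Return the configured rate for a stream class, or the fallback default."""
    return traffic_cfg.get(stream_class, {}).get("rate_percent", DEFAULT_RATE_PERCENT)


if __name__ == "__main__":
    # Values as quoted from vxlan_dci_input_file.yaml in this thread.
    cfg = {
        "l2l3": {"pkts_per_burst": 100, "rate_percent": 0.1},
        "bum": {"pkts_per_burst": 10, "rate_percent": 0.1},
    }
    print(resolve_rate_percent(cfg, "l2l3"))
    print(resolve_rate_percent({}, "l2l3"))  # lookup miss falls back to the default
```

Keeping the default in one constant avoids the situation described here, where the YAML and the in-code fallback silently disagreed.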
