bgpd: ignore NEXT_HOP for MP_REACH_NLRI#11623
Conversation
ccd373b to
66e355e
Compare
|
This commit is currently making the assumption that an MP_REACH_NLRI attribute will be parsed after the NEXT_HOP attribute. I looked at RFC 4271 and 4760 but didn't find any explicit mentions of the ordering in which the attribute TLVs are packed in the Update. Is it safe to make this assumption about the order in which attributes are packed into an Update message? Or do I need to update the code to ensure that we still do the right thing even if NEXT_HOP is parsed after MP_REACH_NLRI? |
66e355e to
bd7b017
Compare
|
Never mind my previous questions - I decided to go with a different approach that doesn't involve tweaking flags and is independent of attribute parsing order. Before: After: |
Continuous Integration Result: SUCCESSFULContinuous Integration Result: SUCCESSFULCongratulations, this patch passed basic tests Tested-by: NetDEF / OpenSourceRouting.org CI System CI System Testrun URL: https://ci1.netdef.org/browse/FRR-PULLREQ2-6511/ This is a comment from an automated CI system. |
Continuous Integration Result: SUCCESSFULContinuous Integration Result: SUCCESSFULCongratulations, this patch passed basic tests Tested-by: NetDEF / OpenSourceRouting.org CI System CI System Testrun URL: https://ci1.netdef.org/browse/FRR-PULLREQ2-6510/ This is a comment from an automated CI system. |
Continuous Integration Result: FAILEDContinuous Integration Result: FAILEDSee below for issues. This is a comment from an automated CI system. Get source / Pull Request: SuccessfulBuilding Stage: SuccessfulBasic Tests: FailedTopotests Ubuntu 18.04 arm8 part 3: Failed (click for details)Topotests Ubuntu 18.04 arm8 part 3: No useful log foundTopotests debian 10 amd64 part 9: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO9DEB10AMD64-6512/test Topology Tests failed for Topotests debian 10 amd64 part 9 Topotests debian 10 amd64 part 4: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO4DEB10AMD64-6512/test Topology Tests failed for Topotests debian 10 amd64 part 4 Topotests Ubuntu 18.04 amd64 part 3: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TP3U1804AMD64-6512/test Topology Tests failed for Topotests Ubuntu 18.04 amd64 part 3 Topotests Ubuntu 18.04 arm8 part 4: Failed (click for details)Topotests Ubuntu 18.04 arm8 part 4: No useful log foundTopotests debian 10 amd64 part 3: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO3DEB10AMD64-6512/test Topology Tests failed for Topotests debian 10 amd64 part 3 Topotests Ubuntu 18.04 arm8 part 9: Failed (click for details)Topotests Ubuntu 18.04 arm8 part 9: No useful log foundTopotests Ubuntu 18.04 amd64 part 5: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO5U18AMD64-6512/test Topology Tests failed for Topotests Ubuntu 18.04 amd64 part 5 Topotests Ubuntu 18.04 i386 part 1: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO1U18I386-6512/test Topology Tests failed for Topotests Ubuntu 18.04 i386 part 1 Topotests Ubuntu 18.04 amd64 part 4: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TP4U1804AMD64-6512/test Topology Tests failed for Topotests Ubuntu 18.04 amd64 part 4 Topotests Ubuntu 18.04 i386 part 6: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO6U18I386-6512/test Topology Tests failed for Topotests Ubuntu 18.04 i386 part 6 Topotests Ubuntu 18.04 i386 part 8: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO8U18I386-6512/test Topology Tests failed for Topotests Ubuntu 18.04 i386 part 8 Topotests Ubuntu 18.04 amd64 part 9: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO9U18AMD64-6512/test Topology Tests failed for Topotests Ubuntu 18.04 amd64 part 9 Topotests Ubuntu 18.04 i386 part 3: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO3U18I386-6512/test Topology Tests failed for Topotests Ubuntu 18.04 i386 part 3 Topotests Ubuntu 18.04 i386 part 5: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO5U18I386-6512/test Topology Tests failed for Topotests Ubuntu 18.04 i386 part 5 Topotests debian 10 amd64 part 1: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO1DEB10AMD64-6512/test Topology Tests failed for Topotests debian 10 amd64 part 1 Topotests Ubuntu 18.04 arm8 part 5: Failed (click for details)Topotests Ubuntu 18.04 arm8 part 5: No useful log foundTopotests Ubuntu 18.04 i386 part 9: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO9U18I386-6512/test Topology Tests failed for Topotests Ubuntu 18.04 i386 part 9 Topotests Ubuntu 18.04 amd64 part 1: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TP1U1804AMD64-6512/test Topology Tests failed for Topotests Ubuntu 18.04 amd64 part 1 Topotests Ubuntu 18.04 i386 part 4: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO4U18I386-6512/test Topology Tests failed for Topotests Ubuntu 18.04 i386 part 4 Topotests debian 10 amd64 part 5: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO5DEB10AMD64-6512/test Topology Tests failed for Topotests debian 10 amd64 part 5 Topotests Ubuntu 18.04 arm8 part 6: Failed (click for details)Topotests Ubuntu 18.04 arm8 part 6: No useful log foundTopotests debian 10 amd64 part 6: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO6DEB10AMD64-6512/test Topology Tests failed for Topotests debian 10 amd64 part 6 Topotests Ubuntu 18.04 arm8 part 1: Failed (click for details)Topotests Ubuntu 18.04 arm8 part 1: No useful log foundTopotests Ubuntu 18.04 amd64 part 6: Failed (click for details)Topotests Ubuntu 18.04 amd64 part 6: Unknown Log URL: https://ci1.netdef.org/browse/FRR-PULLREQ2-6512/artifact/TOPO6U18AMD64/ErrorLog/ Topotests Ubuntu 18.04 amd64 part 6: No useful log foundSuccessful on other platforms/tests
|
|
What happens if both (BGP_ATTR_NEXT_HOP and BGP_ATTR_MP_REACH_NLRI) attributes exist? |
That's the exact situation this PR is covering. For prefixes described in the MP_REACH_NLRI we should use the next-hop within the MP_REACH attribute rather than the separate NEXT_HOP attribute. I think it's valid to have both flags set, since an attr set can have both attributes, we just need to make sure that scenario is properly handled. From what I can tell, IPv4 Unicast prefixes are generally encoded in MP_REACH_NLRI when MP-BGP is used rather than as native BGP4 prefixes, so even for v4 uni I think we should use the next-hop within the MP_REACH_NLRI attribute. FRR doesn't include the NEXT_HOP attribute when it sends v4 uni prefixes as MP_REACH_NLRI, so I'm only able to reproduce this with a third party sender (in this case ArcOS). |
|
Seems like a number of tests have failed, so I need to go through the failures and see where this change has caused breakage |
I'm curious if this is not the case when we apply route-map with |
I think if we have an inbound route map configured we should honor that -- the rfc "wants to be applied" just to attributes received from a peer, not to locally configured stuff ... |
riw777
left a comment
There was a problem hiding this comment.
code change looks good ... do we need any topo test changes or additional topo test here?
|
not certain how the ci results relate to this fix ... I'll rerun and see where we are |
|
ci:rerun |
Continuous Integration Result: FAILEDContinuous Integration Result: FAILEDSee below for issues. This is a comment from an automated CI system. Get source / Pull Request: SuccessfulBuilding Stage: SuccessfulBasic Tests: FailedTopotests Ubuntu 18.04 arm8 part 5: Failed (click for details)Topotests Ubuntu 18.04 arm8 part 5: No useful log foundTopotests Ubuntu 18.04 amd64 part 9: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO9U18AMD64-6563/test Topology Tests failed for Topotests Ubuntu 18.04 amd64 part 9 Topotests Ubuntu 18.04 i386 part 9: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO9U18I386-6563/test Topology Tests failed for Topotests Ubuntu 18.04 i386 part 9 Topotests debian 10 amd64 part 1: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO1DEB10AMD64-6563/test Topology Tests failed for Topotests debian 10 amd64 part 1 Topotests Ubuntu 18.04 amd64 part 1: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TP1U1804AMD64-6563/test Topology Tests failed for Topotests Ubuntu 18.04 amd64 part 1 Topotests Ubuntu 18.04 i386 part 5: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO5U18I386-6563/test Topology Tests failed for Topotests Ubuntu 18.04 i386 part 5 Topotests Ubuntu 18.04 amd64 part 6: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO6U18AMD64-6563/test Topology Tests failed for Topotests Ubuntu 18.04 amd64 part 6 Topotests debian 10 amd64 part 3: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO3DEB10AMD64-6563/test Topology Tests failed for Topotests debian 10 amd64 part 3 Topotests Ubuntu 18.04 arm8 part 6: Failed (click for details)Topotests Ubuntu 18.04 arm8 part 6: No useful log foundTopotests debian 10 amd64 part 6: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO6DEB10AMD64-6563/test Topology Tests failed for Topotests debian 10 amd64 part 6 Topotests Ubuntu 18.04 arm8 part 1: Failed (click for details)Topotests Ubuntu 18.04 arm8 part 1: No useful log foundTopotests debian 10 amd64 part 5: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO5DEB10AMD64-6563/test Topology Tests failed for Topotests debian 10 amd64 part 5 Topotests Ubuntu 18.04 arm8 part 3: Failed (click for details)Topotests Ubuntu 18.04 arm8 part 3: No useful log foundTopotests Ubuntu 18.04 i386 part 4: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO4U18I386-6563/test Topology Tests failed for Topotests Ubuntu 18.04 i386 part 4 Topotests debian 10 amd64 part 4: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO4DEB10AMD64-6563/test Topology Tests failed for Topotests debian 10 amd64 part 4 Topotests debian 10 amd64 part 9: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO9DEB10AMD64-6563/test Topology Tests failed for Topotests debian 10 amd64 part 9 Topotests Ubuntu 18.04 amd64 part 3: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TP3U1804AMD64-6563/test Topology Tests failed for Topotests Ubuntu 18.04 amd64 part 3 Topotests Ubuntu 18.04 i386 part 6: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO6U18I386-6563/test Topology Tests failed for Topotests Ubuntu 18.04 i386 part 6 Topotests Ubuntu 18.04 amd64 part 4: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TP4U1804AMD64-6563/test Topology Tests failed for Topotests Ubuntu 18.04 amd64 part 4 Topotests Ubuntu 18.04 i386 part 1: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO1U18I386-6563/test Topology Tests failed for Topotests Ubuntu 18.04 i386 part 1 Topotests Ubuntu 18.04 arm8 part 9: Failed (click for details)Topotests Ubuntu 18.04 arm8 part 9: No useful log foundTopotests Ubuntu 18.04 arm8 part 4: Failed (click for details)Topotests Ubuntu 18.04 arm8 part 4: No useful log foundTopotests Ubuntu 18.04 amd64 part 5: Failed (click for details)Topotests Ubuntu 18.04 amd64 part 5: Unknown Log URL: https://ci1.netdef.org/browse/FRR-PULLREQ2-6563/artifact/TOPO5U18AMD64/ErrorLog/ Topotests Ubuntu 18.04 amd64 part 5: No useful log foundTopotests Ubuntu 18.04 i386 part 3: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO3U18I386-6563/test Topology Tests failed for Topotests Ubuntu 18.04 i386 part 3 Successful on other platforms/tests
|
|
From what I can tell going over the failures, it would seem there are 2 main issues:
(1) seems related to my change and I need to go back and fix it. Right now we seem to be losing the next-hop of redistributed routes, and the display of I'm not sure how (2) would be caused by my change. One example is And the test itself has this topology: Where we can clearly see that 10.0.3.0/24 in VRF neno is not connected + should be reachable via a next-hop of 10.0.30.3, and earlier in the test we see that the Source VRF's (neno's) OSPF RIB reflects this: |
bd7b017 to
ccbc291
Compare
|
Output from latest build: config: |
Continuous Integration Result: FAILEDContinuous Integration Result: FAILEDSee below for issues. This is a comment from an automated CI system. Get source / Pull Request: SuccessfulBuilding Stage: FailedUbuntu 16.04 arm8 build: Failed (click for details)Ubuntu 16.04 arm8 build: No useful log foundSuccessful on other platforms/tests
|
|
Looks like an infra failure: |
|
ci:rerun |
ccbc291 to
371a604
Compare
Continuous Integration Result: FAILEDContinuous Integration Result: FAILEDSee below for issues. This is a comment from an automated CI system. Get source / Pull Request: SuccessfulBuilding Stage: SuccessfulBasic Tests: FailedTopotests Ubuntu 18.04 amd64 part 6: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO6U18AMD64-6666/test Topology Tests failed for Topotests Ubuntu 18.04 amd64 part 6 Topotests Ubuntu 18.04 i386 part 6: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO6U18I386-6666/test Topology Tests failed for Topotests Ubuntu 18.04 i386 part 6 Topotests Ubuntu 18.04 arm8 part 6: Failed (click for details)Topotests Ubuntu 18.04 arm8 part 6: No useful log foundTopotests debian 10 amd64 part 6: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO6DEB10AMD64-6666/test Topology Tests failed for Topotests debian 10 amd64 part 6 Topotests Ubuntu 18.04 arm8 part 8: Failed (click for details)Topotests Ubuntu 18.04 arm8 part 8: No useful log foundSuccessful on other platforms/tests
|
Continuous Integration Result: FAILEDContinuous Integration Result: FAILEDSee below for issues. This is a comment from an automated CI system. Get source / Pull Request: SuccessfulBuilding Stage: SuccessfulBasic Tests: FailedTopotests Ubuntu 18.04 amd64 part 6: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO6U18AMD64-6667/test Topology Tests failed for Topotests Ubuntu 18.04 amd64 part 6 Topotests Ubuntu 18.04 i386 part 6: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO6U18I386-6667/test Topology Tests failed for Topotests Ubuntu 18.04 i386 part 6 Topotests Ubuntu 18.04 arm8 part 6: Failed (click for details)Topotests Ubuntu 18.04 arm8 part 6: No useful log foundTopotests debian 10 amd64 part 6: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO6DEB10AMD64-6667/test Topology Tests failed for Topotests debian 10 amd64 part 6 Successful on other platforms/tests
|
371a604 to
062f7cd
Compare
|
Latest failures were in test_bgp_vrf_lite_best_path_topo2 and were caused by a missing This has been resolved + tests should all pass this time around (:crossed_fingers:). |
Continuous Integration Result: SUCCESSFULContinuous Integration Result: SUCCESSFULCongratulations, this patch passed basic tests Tested-by: NetDEF / OpenSourceRouting.org CI System CI System Testrun URL: https://ci1.netdef.org/browse/FRR-PULLREQ2-6668/ This is a comment from an automated CI system. |
|
Please hold off on merging - it looks like we have the same/similar issues with EVPN routes received with both mp_nexthop* and NEXT_HOP. Will address this issue for EVPN + force push it all in one commit. |
062f7cd to
8c3ea42
Compare
|
Output with the latest changes (fixes to EVPN) below. 3.3.3.3 is the next-hop sent by an FRR peer (MP_REACH nexthop only, no NEXT_HOP), and 172.16.10.10 is the MP_REACH nexthop sent by an ArcOS peer (MP_REACH nexthop + NEXT_HOP). Detailed bgp output (shows attr->mp_nexthop_global_in instead of attr->nexthop): |
Continuous Integration Result: SUCCESSFULContinuous Integration Result: SUCCESSFULCongratulations, this patch passed basic tests Tested-by: NetDEF / OpenSourceRouting.org CI System CI System Testrun URL: https://ci1.netdef.org/browse/FRR-PULLREQ2-6784/ This is a comment from an automated CI system. |
Move the logic to check the mp_nexthop_len against v6 lengths into its own macro so we can apply that logic elsewhere on its own without always checking for presence of BGP_ATTR_NEXT_HOP. Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
8c3ea42 to
082cf91
Compare
|
Quick rebase to master avoid any possible conflicts. Ready for review. |
ton31337
left a comment
There was a problem hiding this comment.
I'm thinking if we need an additional topotest to make sure we don't break anything in the future or not? (At least to guarantee the correct behavior)
Continuous Integration Result: SUCCESSFULCongratulations, this patch passed basic tests Tested-by: NetDEF / OpenSourceRouting.org CI System CI System Testrun URL: https://ci1.netdef.org/browse/FRR-PULLREQ2-6792/ This is a comment from an automated CI system. |
I agree it would be great to have a topotest for this, but FRR doesn't advertise routes with both NEXT_HOP and the MP_REACH_NLRI nexthop. The reporter of this issue setup an OpenVPN tunnel so I could peer with their ArcOS VM, because the third-party vendor required an NDA to get a copy of their VM... so I'm not sure how we would get a BGP speaker into a topotest that mirrors this behavior. |
RFC 4760 states we SHOULD ignore the NEXT_HOP attribute for BGP Update messages carrying only MP_REACH_NLRI attributes. Thus we should use the Network Address of Next Hop field of the MP_REACH_NLRI as the nexthop. Instead of always looking for BGP_ATTR_NEXT_HOP, this commit ensures: 1) we set mp_nexthop_len to BGP_ATTR_NHLEN_IPV4 for v4 bgp_static routes 2) we check mp_nexthop_len when choosing the nexthop to use for nht 3) we check mp_nexthop_len when choosing the nexthop to send to zebra 4) we check mp_nexthop_len when picking the nexthop to shown by vtysh Reported-by: Binon Gorbutt <binon@aervivo.com> Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
082cf91 to
7226bc4
Compare
We have ExaBGP in topotests :) |
|
I didn't think about ExaBGP... good call. If ExaBGP has the same behavior as ArcOS, then we should observe the same issue w/o these changes just by setting up a simple v4 uni session over v6 GUA next-hops w/ Extended Next-hop enabled. I will poke around and see what tests we have today that use ExaBGP + see if I can replicate similar sender-side behavior. If that doesn't yield anything useful (they don't/can't include both nexthops) then I'm open to further suggestions on how we might be able to test this in a topotest. |
Continuous Integration Result: FAILEDSee below for issues. This is a comment from an automated CI system. Get source / Pull Request: SuccessfulBuilding Stage: SuccessfulBasic Tests: FailedTopotests Ubuntu 18.04 arm8 part 9: Failed (click for details)Topotests Ubuntu 18.04 arm8 part 9: No useful log foundSuccessful on other platforms/tests
|
|
Looks like ExaBGP 4+ has extended-nexthop support, but does not encode NEXT_HOP along with the MP_REACH_NLRI nexthop like ArcOS does: If we want this in a topotest, we'll need to come up with another way to get the encoding we're looking for. |
|
ExaBGP config used in above testing: |
|
ci:rerun |
Continuous Integration Result: SUCCESSFULCongratulations, this patch passed basic tests Tested-by: NetDEF / OpenSourceRouting.org CI System CI System Testrun URL: https://ci1.netdef.org/browse/FRR-PULLREQ2-6796/ This is a comment from an automated CI system. |
RFC 4760 states we SHOULD ignore the NEXT_HOP attribute for BGP Update
messages carrying only MP_REACH_NLRI attributes. Thus we should use the
Network Address of Next Hop field of the MP_REACH_NLRI as the nexthop.
Instead of always looking for BGP_ATTR_NEXT_HOP, this commit ensures:
Reported-by: Binon Gorbutt binon@aervivo.com
Signed-off-by: Trey Aspelund taspelund@nvidia.com