
Kernel routes not removed when nexthop becomes unreachable #4535

@gromit1811


It seems #3152 reappeared in a slightly different form:

The Linux kernel removes routes from its routing table when their nexthop becomes unreachable. Unfortunately, it does not send an RTM_DELROUTE netlink event in this case.
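One way to check the missing RTM_DELROUTE is to record `ip monitor route` output while the nexthop goes away and then count deletion events for the prefix. The following is only a sketch: the helper name `count_deleted` and the log path are made up for illustration, and it relies on iproute2 prefixing deletion events with "Deleted".

```shell
# Hypothetical helper: count RTM_DELROUTE events for a prefix in a saved
# `ip monitor route` log. iproute2 prints deletions as "Deleted <route>".
count_deleted() {
    # $1 = log file, $2 = prefix (matched as a fixed string, not a regex)
    grep -cF -- "Deleted $2 " "$1" || true
}

# Usage on a live system (needs root; paths are illustrative):
# ip monitor route > /tmp/route-events.log &
# monitor_pid=$!
# ip addr flush dev frrdummy0; sleep 1
# kill "$monitor_pid"
# count_deleted /tmp/route-events.log 172.16.1.0/24
```

If the count is 0 while `ip route show` no longer lists the prefix, the kernel dropped the route without announcing it.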

Originally (pre-7.0), FRR noticed the unreachable nexthop and marked the route "inactive", but didn't remove it from its own RIB. The route then stayed stuck in the RIB, and even if zebra subsequently learned about another route with the same prefix, it wouldn't use it because of the stuck kernel route.

After merging pull request #3165 (closing #3152), FRR properly handled this situation and removed the kernel route from its RIB when the kernel removed it from its routing table, even without an explicit RTM_DELROUTE.

With current Git master (tried ce746ed), zebra neither removes the route from the RIB nor marks it inactive, so it's even worse than pre-7.0 FRR.

Script to reproduce:

# Create dummy interface for testing, assign address
ip link add frrdummy0 type dummy 2> /dev/null
ip addr flush dev frrdummy0
ip addr add 172.16.0.1/24 dev frrdummy0
ip link set frrdummy0 up

# Add test route using dummy interface
ip route add 172.16.1.0/24 via 172.16.0.2

# Check kernel routing table and zebra RIB
ip route show 172.16.1.0/24
vtysh -c "show ip route 172.16.1.0/24"

# Make nexthop unreachable
ip addr flush dev frrdummy0
sleep 1

# Check kernel routing table and zebra RIB
ip route show 172.16.1.0/24
vtysh -c "show ip route 172.16.1.0/24"

Expected output is this (from FRR 7.0):

[...]
+ ip route show 172.16.1.0/24
172.16.1.0/24 via 172.16.0.2 dev frrdummy0 
+ vtysh -c 'show ip route 172.16.1.0/24'
Routing entry for 172.16.1.0/24
  Known via "kernel", distance 0, metric 0, best
  Last update 00:00:00 ago
  * 172.16.0.2, via frrdummy0

+ ip addr flush dev frrdummy0
+ sleep 1
+ ip route show 172.16.1.0/24
+ vtysh -c 'show ip route 172.16.1.0/24'
% Network not in table

-> After we make the nexthop 172.16.0.2 unreachable, the route is gone from the kernel and the zebra RIB.

With Git ce746ed, we get this instead:

[...]
+ ip route show 172.16.1.0/24
172.16.1.0/24 via 172.16.0.2 dev frrdummy0 
+ vtysh -c 'show ip route 172.16.1.0/24'
Routing entry for 172.16.1.0/24
  Known via "kernel", distance 0, metric 0, best
  Last update 00:00:00 ago
  * 172.16.0.2, via frrdummy0

+ ip addr flush dev frrdummy0
+ sleep 1
+ ip route show 172.16.1.0/24
+ vtysh -c 'show ip route 172.16.1.0/24'
Routing entry for 172.16.1.0/24
  Known via "kernel", distance 0, metric 0, best
  Last update 00:00:01 ago
  * 172.16.0.2, via frrdummy0

-> Removed from kernel but not from zebra RIB (and not even marked "inactive")
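
The comparison made by hand above could be automated along these lines. This is a minimal sketch, not part of FRR: the function name `rib_is_stale` is hypothetical, and it assumes zebra prints "% Network not in table" for an absent prefix, as in the FRR 7.0 output shown earlier.

```shell
# Hypothetical check: succeeds (exit 0) when the kernel no longer has the
# route but zebra still reports a routing entry for it.
rib_is_stale() {
    kernel_out=$1   # output of: ip route show PREFIX
    zebra_out=$2    # output of: vtysh -c "show ip route PREFIX"
    # Kernel still has the route: nothing is stale.
    [ -z "$kernel_out" ] || return 1
    case $zebra_out in
        *"Network not in table"*) return 1 ;;  # zebra dropped it as well
        "") return 1 ;;                        # no vtysh output: unknown, not flagged
        *) return 0 ;;                         # zebra still lists the route
    esac
}

# Usage on a live system (needs root and a running zebra):
# prefix=172.16.1.0/24
# if rib_is_stale "$(ip route show "$prefix")" "$(vtysh -c "show ip route $prefix")"; then
#     echo "stale kernel route for $prefix in zebra RIB"
# fi
```

Run after the `sleep 1` step of the reproduction script, this would report a stale route with ce746ed and stay silent with FRR 7.0.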
