net/netns, net/interfaces: explicitly bind sockets to the default interface on all Darwin variants #6566
Definitely open to alternate ideas here. Possible ways to make this less conservative:
Moving back to draft. Using an exit node is broken, will need to investigate more.
28b3e91 to e7f9b92
e7f9b92 to 5555086
Will wait until the 1.34 branch is cut before merging this (CC @DentonGentry).
5555086 to 9672337
dsnet left a comment
I'm no expert in the Apple networking code, but it generally seems sensible.
It's mostly an FYI, though if you wanted, you could run a custom iOS build with this and https://github.com/tailscale/corp/pull/8201 to check if it fixes the problem for you.
9672337 to ad43ecb
As discussed on Slack, the default-route-when-an-exit-node-is-used mode did not handle the macOS network service order preference. I've redone this to allow an alternate default interface function to be provided (implemented in tailscale/corp#8201), since the iOS/macOS native code has that information. It ends up being quite a bit simpler, but PTAL since there are significant changes from the last version.
…erface on all Darwin variants

We were previously only doing this for tailscaled-on-Darwin, but it also appears to help on iOS. Otherwise, when we rebind magicsock UDP connections after a cellular -> WiFi interface change, they keep using the cellular one.

To do this correctly when using exit nodes, we need to exclude the Tailscale interface when getting the default route, otherwise packets cannot leave the tunnel. There are native macOS/iOS APIs that we can use to do this, so we allow those clients to override the implementation of DefaultRouteInterfaceIndex.

Updates #6565, may also help with #5156

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
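For readers unfamiliar with interface binding on Darwin, here is a minimal sketch of one way to pick the socket option for it. It assumes the binding is done with the `IP_BOUND_IF` / `IPV6_BOUND_IF` socket options, which this thread does not state explicitly. The helper name `boundIfSockopt` is hypothetical, and the constants are copied from the Darwin headers so that the sketch compiles on any platform.

```go
package main

import (
	"fmt"
	"strings"
)

// Darwin socket option numbers (from <netinet/in.h> and <netinet6/in6.h>),
// spelled out here so the sketch builds on non-Darwin systems too.
const (
	ipprotoIP   = 0   // IPPROTO_IP
	ipprotoIPv6 = 41  // IPPROTO_IPV6
	ipBoundIF   = 25  // IP_BOUND_IF
	ipv6BoundIF = 125 // IPV6_BOUND_IF
)

// boundIfSockopt (hypothetical helper) returns the setsockopt level and option
// that would bind a socket of the given network to an interface index.
// It expects family-specific names such as "udp4" or "tcp6", which is what a
// net.ListenConfig or net.Dialer Control function receives.
func boundIfSockopt(network string) (level, opt int, err error) {
	switch {
	case strings.HasSuffix(network, "4"):
		return ipprotoIP, ipBoundIF, nil
	case strings.HasSuffix(network, "6"):
		return ipprotoIPv6, ipv6BoundIF, nil
	}
	return 0, 0, fmt.Errorf("cannot infer address family from network %q", network)
}

func main() {
	for _, network := range []string{"udp4", "udp6", "udp"} {
		level, opt, err := boundIfSockopt(network)
		fmt.Println(network, level, opt, err)
	}
}
```

On Darwin, the returned pair would be passed to `setsockopt` together with the default interface index from inside a `Control` callback; that part is left out because it only builds on Darwin.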
ad43ecb to 42c3052
We would replace the existing real implementation of nettype.PacketConn with a blockForeverConn, but that violates the contract of atomic.Value (where the type cannot change). Fix by switching to a pointer value (atomic.Pointer[nettype.PacketConn]).

A longstanding issue, but became more prevalent when we started binding connections to interfaces on macOS and iOS (#6566), which could lead to the bind call failing if the interface was no longer available.

Fixes #6641

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
We would replace the existing real implementation of nettype.PacketConn with a blockForeverConn, but that violates the contract of atomic.Value (where the type cannot change). Fix by switching to a pointer value (atomic.Pointer[nettype.PacketConn]).

A longstanding issue, but became more prevalent when we started binding connections to interfaces on macOS and iOS (#6566), which could lead to the bind call failing if the interface was no longer available.

Fixes #6641

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
(cherry picked from commit bdc45b9)
…eerface getting and binding

With #6566 we started to more aggressively bind to the default interface on Darwin. We are seeing some reports of the wrong cellular interface being chosen on iOS.

To help with the investigation, this adds two knobs to control the behavior changes:

- CapabilityDebugDisableAlternateDefaultRouteInterface disables the alternate function that we use to get the default interface on macOS and iOS (implemented in tailscale/corp#8201). We still log what it would have returned so we can see if it gets things wrong.
- CapabilityDebugDisableBindConnToInterface is a bigger hammer that disables binding of connections to the default interface altogether.

Updates #7184
Updates #7188

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
…erface getting and binding

With #6566 we started to more aggressively bind to the default interface on Darwin. We are seeing some reports of the wrong cellular interface being chosen on iOS.

To help with the investigation, this adds two knobs to control the behavior changes:

- CapabilityDebugDisableAlternateDefaultRouteInterface disables the alternate function that we use to get the default interface on macOS and iOS (implemented in tailscale/corp#8201). We still log what it would have returned so we can see if it gets things wrong.
- CapabilityDebugDisableBindConnToInterface is a bigger hammer that disables binding of connections to the default interface altogether.

Updates #7184
Updates #7188

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
(cherry picked from commit 62f4df3)
With #6566 we added an external mechanism for getting the default interface, and used it on macOS and iOS (see tailscale/corp#8201). The goal was to be able to get the default physical interface even when using an exit node (in which case the routing table would say that the Tailscale utun* interface is the default).

However, the external mechanism turns out to be unreliable in some cases, e.g. when multiple cellular interfaces are present/toggled (I have occasionally gotten my phone into a state where it reports the pdp_ip1 interface as the default, even though it can't actually route traffic).

It was observed that `ifconfig -v` on macOS reports an "effective interface" for the Tailscale utun* interface, which seems promising. By examining the ifconfig source code, it turns out that this is done via a SIOCGIFDELEGATE ioctl syscall. Though this is a private API, it appears to have been around for a long time (e.g. it's in the 10.13 xnu release at https://opensource.apple.com/source/xnu/xnu-4570.41.2/bsd/net/if_types.h.auto.html) and thus is unlikely to go away.

We can thus use this ioctl if the routing table says that a utun* interface is the default, and go back to the simpler mechanism that we had before #6566.

Updates #7184
Updates #7188

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
With #6566 we added an external mechanism for getting the default interface, and used it on macOS and iOS (see tailscale/corp#8201). The goal was to be able to get the default physical interface even when using an exit node (in which case the routing table would say that the Tailscale utun* interface is the default).

However, the external mechanism turns out to be unreliable in some cases, e.g. when multiple cellular interfaces are present/toggled (I have occasionally gotten my phone into a state where it reports the pdp_ip1 interface as the default, even though it can't actually route traffic).

It was observed that `ifconfig -v` on macOS reports an "effective interface" for the Tailscale utun* interface, which seems promising. By examining the ifconfig source code, it turns out that this is done via a SIOCGIFDELEGATE ioctl syscall. Though this is a private API, it appears to have been around for a long time (e.g. it's in the 10.13 xnu release at https://opensource.apple.com/source/xnu/xnu-4570.41.2/bsd/net/if_types.h.auto.html) and thus is unlikely to go away.

We can thus use this ioctl if the routing table says that a utun* interface is the default, and go back to the simpler mechanism that we had before #6566.

Updates #7184
Updates #7188

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
(cherry picked from commit fa932fe)
We were previously only doing this for tailscaled-on-Darwin, but it also appears to help on iOS. Otherwise, when we rebind magicsock UDP connections after a cellular -> WiFi interface change, they keep using the cellular one.
To do this correctly when using exit nodes, we need to exclude the Tailscale interface when getting the default route, otherwise packets cannot leave the tunnel. There are native macOS/iOS APIs that we can use to do this, so we allow those clients to override the implementation of DefaultRouteInterfaceIndex.
Updates #6565, may also help with #5156