CPU pegged in bgp_recv for each peer #657

We've noticed an issue where a CPU gets pegged per-peer in `bgp_recv`.

This does not reproduce everywhere. Some systems in the lab do not exhibit this behavior.

Reading through the code, I noticed that we set up our TCP socket differently depending on whether we initiated the connection.

When we set up the BGP listener, we set up the socket as non-blocking:

https://github.com/oxidecomputer/maghemite/blob/0e2fea55db2208602885562b175f1aa28981f149/bgp/src/connection_tcp.rs#L103-L124

If we accept a connection on this socket, it will continue to be non-blocking for the lifetime of the session. In non-blocking mode a read returns `WouldBlock` immediately when no data is available, so the `set_read_timeout(IO_TIMEOUT)` that we do in `spawn_recv_loop` never takes effect.
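To make that interaction concrete, here is a minimal standalone sketch using only `std::net` on loopback. `probe` is a hypothetical helper, not something in maghemite. It sets the mode on the accepted stream explicitly, because whether an accepted socket inherits the listener's non-blocking flag varies by platform (Linux does not inherit it, BSD-derived stacks typically do), which may be related to why this does not reproduce everywhere.

```rust
use std::io::{self, ErrorKind, Read};
use std::net::{TcpListener, TcpStream};
use std::time::{Duration, Instant};

/// Hypothetical helper: accept a loopback connection from an idle peer,
/// apply a read timeout, set the blocking mode, then time a single read.
fn probe(nonblocking: bool, timeout: Duration) -> io::Result<(ErrorKind, Duration)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    // Keep the client alive so the read sees "no data yet" rather than EOF.
    let _client = TcpStream::connect(listener.local_addr()?)?;
    let (mut conn, _) = listener.accept()?;

    conn.set_read_timeout(Some(timeout))?;
    // Set the mode explicitly instead of relying on what accept() inherits.
    conn.set_nonblocking(nonblocking)?;

    let mut buf = [0u8; 16];
    let start = Instant::now();
    match conn.read(&mut buf) {
        Ok(n) => Err(io::Error::new(
            ErrorKind::Other,
            format!("unexpected read of {n} bytes from an idle peer"),
        )),
        Err(e) => Ok((e.kind(), start.elapsed())),
    }
}

fn main() -> io::Result<()> {
    let timeout = Duration::from_millis(200);
    for nonblocking in [true, false] {
        let (kind, elapsed) = probe(nonblocking, timeout)?;
        println!("nonblocking={nonblocking}: {kind:?} after {elapsed:?}");
    }
    Ok(())
}
```

On Unix-like systems both cases should report `WouldBlock`, but only the blocking case should wait for roughly the timeout first. Windows may report `TimedOut` for the blocking case instead.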

On the other hand, if we are the initiator of the connection, non-blocking is not set:

https://github.com/oxidecomputer/maghemite/blob/0e2fea55db2208602885562b175f1aa28981f149/bgp/src/connection_tcp.rs#L315-L329

In the `recv_header` function, the `WouldBlock` case is effectively a hot spin-loop when the socket is non-blocking:

https://github.com/oxidecomputer/maghemite/blob/0e2fea55db2208602885562b175f1aa28981f149/bgp/src/connection_tcp.rs#L732-L751
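A rough way to see the size of the spin, as a sketch rather than the actual maghemite code: count how many times a `recv_header`-style loop wakes up with nothing to read while the peer is idle. `count_idle_wakeups`, the 50 ms timeout, and the 200 ms window are all made up for illustration.

```rust
use std::io::{self, ErrorKind, Read};
use std::net::{TcpListener, TcpStream};
use std::time::{Duration, Instant};

/// Hypothetical helper: count how often a read loop wakes up with no data
/// during `window`. The peer never writes, so every wakeup is wasted work.
fn count_idle_wakeups(nonblocking: bool, window: Duration) -> io::Result<u64> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let _client = TcpStream::connect(listener.local_addr()?)?; // idle peer
    let (mut conn, _) = listener.accept()?;
    conn.set_read_timeout(Some(Duration::from_millis(50)))?;
    conn.set_nonblocking(nonblocking)?;

    let mut buf = [0u8; 19];
    let mut wakeups = 0u64;
    let deadline = Instant::now() + window;
    while Instant::now() < deadline {
        match conn.read(&mut buf) {
            Ok(_) => break, // not expected: the peer never writes
            // Unix reports a read timeout as WouldBlock; Windows may use TimedOut.
            Err(e) if matches!(e.kind(), ErrorKind::WouldBlock | ErrorKind::TimedOut) => {
                wakeups += 1;
            }
            Err(e) => return Err(e),
        }
    }
    Ok(wakeups)
}

fn main() -> io::Result<()> {
    let window = Duration::from_millis(200);
    println!("non-blocking wakeups: {}", count_idle_wakeups(true, window)?);
    println!("blocking wakeups:     {}", count_idle_wakeups(false, window)?);
    Ok(())
}
```

With a blocking socket the loop should wake about once per timeout interval. With a non-blocking socket it wakes as fast as the read call can return, which is what pegs a CPU per peer.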

We should be setting non-blocking to false in `spawn_recv_loop` so that `WouldBlock` only happens when the IO timeout we intend expires. Something like this:

```diff
diff --git a/bgp/src/connection_tcp.rs b/bgp/src/connection_tcp.rs
index 13edd95..99d7c78 100644
--- a/bgp/src/connection_tcp.rs
+++ b/bgp/src/connection_tcp.rs
@@ -599,6 +599,20 @@ impl BgpConnectionTcp {
             .spawn(move || {
                 let mut conn = conn;

+                if let Err(e) = conn.set_nonblocking(false) {
+                    connection_log_lite!(log,
+                        error,
+                        "failed to set connection to blocking for {peer} (conn_id: {}): {e}",
+                        conn_id.short();
+                        "direction" => direction,
+                        "connection" => format!("{conn:?}"),
+                        "connection_peer" => format!("{peer}"),
+                        "connection_id" => conn_id.short(),
+                        "error" => format!("{e}")
+                    );
+                    return;
+                }
+
                 if !timeout.is_zero()
                     && let Err(e) = conn.set_read_timeout(Some(timeout))
                 {
```
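For anyone who wants to try the idea outside the daemon, here is a self-contained sketch of the same pattern. `prepare_for_recv` and `recv_exact` are hypothetical names, and the loop is modeled loosely on `recv_header` rather than copied from it. The 19-byte buffer matches the size of a BGP message header.

```rust
use std::io::{self, ErrorKind, Read, Write};
use std::net::{TcpListener, TcpStream};
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;
use std::time::Duration;

/// Hypothetical helper: force blocking mode before applying the read
/// timeout, so the timeout is what paces the receive loop.
fn prepare_for_recv(stream: &TcpStream, timeout: Duration) -> io::Result<()> {
    stream.set_nonblocking(false)?;
    // A zero timeout is rejected by set_read_timeout, so skip it.
    if !timeout.is_zero() {
        stream.set_read_timeout(Some(timeout))?;
    }
    Ok(())
}

/// Fill `buf`, waking once per timeout to check the shutdown flag.
/// Returns Ok(false) if `dropped` was set before the buffer was full.
fn recv_exact(stream: &mut TcpStream, buf: &mut [u8], dropped: &AtomicBool) -> io::Result<bool> {
    let mut i = 0;
    while i < buf.len() {
        if dropped.load(Ordering::Relaxed) {
            return Ok(false);
        }
        match stream.read(&mut buf[i..]) {
            Ok(0) => {
                return Err(io::Error::new(ErrorKind::UnexpectedEof, "peer closed connection"));
            }
            Ok(n) => i += n,
            // Read timeout hit: loop around and re-check the shutdown flag.
            Err(e) if matches!(e.kind(), ErrorKind::WouldBlock | ErrorKind::TimedOut) => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(true)
}

fn main() -> io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let mut client = TcpStream::connect(listener.local_addr()?)?;
    let (mut conn, _) = listener.accept()?;
    conn.set_nonblocking(true)?; // simulate a stream that ended up non-blocking
    prepare_for_recv(&conn, Duration::from_millis(100))?;

    // The peer stays quiet for a while, then sends one header-sized message.
    let writer = thread::spawn(move || {
        thread::sleep(Duration::from_millis(250));
        client.write_all(&[0xff; 19])
    });

    let dropped = AtomicBool::new(false);
    let mut header = [0u8; 19];
    let complete = recv_exact(&mut conn, &mut header, &dropped)?;
    writer.join().expect("writer thread panicked")?;
    println!("complete={complete}");
    Ok(())
}
```

Resetting the mode inside the receive path, rather than at accept time, means both inbound and outbound connections go through the same code regardless of how the socket was created.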

	impl BgpListener<BgpConnectionTcp> for BgpListenerTcp {
	    fn bind<A: ToSocketAddrs>(
	        addr: A,
	        unnumbered_manager: Option<Arc<dyn UnnumberedManager>>,
	    ) -> Result<Self, Error>
	    where
	        Self: Sized,
	    {
	        let addr = addr
	            .to_socket_addrs()
	            .map_err(|e| Error::InvalidAddress(e.to_string()))?
	            .next()
	            .ok_or(Error::InvalidAddress(
	                "at least one address required".into(),
	            ))?;
	        let listener = TcpListener::bind(addr)?;
	        listener.set_nonblocking(true)?;
	        Ok(Self {
	            listener,
	            unnumbered_manager,
	        })
	    }

	// Establish the connection (THIS IS THE BLOCKING CALL)
	let sa: socket2::SockAddr = peer.into();
	let new_conn: TcpStream = match s.connect_timeout(&sa, timeout) {
	    Ok(()) => s.into(),
	    Err(e) => {
	        connection_log_lite!(log,
	            warn,
	            "connection attempt to {peer} failed: {e}";
	            "direction" => ConnectionDirection::Outbound,
	            "peer" => format!("{peer}"),
	            "error" => format!("{e}")
	        );
	        return;
	    }
	};

	loop {
	    if dropped.load(Ordering::Relaxed) {
	        return Err(RecvError::Shutdown);
	    }
	    match stream.read(&mut buf[i..]) {
	        Ok(0) => {
	            return Err(RecvError::Io(std::io::Error::new(
	                std::io::ErrorKind::UnexpectedEof,
	                "peer closed connection",
	            )));
	        }
	        Ok(n) => i += n,
	        Err(e) if e.kind() == std::io::ErrorKind::WouldBlock => {
	            // This condition happens due to the read timeout that
	            // is set on the TcpStream object on connect being hit.
	            // This is a normal condition and we just jump back to
	            // the beginning of the loop, check the shutdown flag
	            // and carry on reading if there is no shutdown.
	            continue;
	        }
