How to Prevent the SSH Broken Pipe Error

As a Linux system administrator, I rely on SSH every day to reach servers and infrastructure devices. One issue that occasionally disrupts those sessions is the "broken pipe" error, which abruptly terminates the connection. This guide explains why the error occurs and walks through the settings that help prevent it.

Understanding the SSH Architecture and Broken Pipes

To understand broken SSH pipes, it helps to first review how SSH manages sessions under the hood using sockets and TCP:

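Sockets

An SSH session runs over a single TCP connection, represented by one socket on the client and one on the server [[1]]. The SSH transport layer encrypts and integrity-protects everything sent over that connection, and shells, file transfers, and port forwards are multiplexed as channels inside it rather than carried on separate sockets.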

TCP Streams

The session data travels as a bidirectional TCP stream. Stateful devices on the path, such as firewalls and NAT gateways, track this stream and may silently drop it after a period of inactivity. Keepalive packets generate periodic traffic so the stream never looks idle to them.

Detecting Broken Pipes

If the TCP stream breaks without a proper SSH shutdown, the next write to the socket fails with the EPIPE error, accompanied by a SIGPIPE signal unless the process ignores it. This is reported as the "broken pipe" error [[2]], and it abruptly terminates the SSH session.

Why Pipes Break

Common root causes of broken SSH pipes include:

Temporary network outages
Excessive latency disrupting the TCP stream
Inactivity timeouts on stateful firewalls and NAT gateways
Server-initiated session termination without client notification

Adjusting the ClientAlive Settings

The primary defense against broken SSH pipes is configuring keepalives so intermediate network devices don't prematurely time out the session. On the server, this is controlled by two sshd_config parameters [[3]]:

ClientAliveInterval

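Sets the interval in seconds after which, if nothing has been received from the client, sshd sends a keepalive request through the encrypted channel. The default is 0, which disables these messages. Lower non-zero values result in more frequent keepalives.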

ClientAliveCountMax

Specifies how many consecutive keepalive requests may go unanswered before sshd terminates the session. The default is 3.

For example:

ClientAliveInterval 120
ClientAliveCountMax 12

This sends a keepalive after every 120 seconds of client silence and tolerates 12 unanswered keepalives, so an unresponsive client is disconnected after about 24 minutes (12 * 120 sec = 1440 sec).
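The same arithmetic applies to any pair of values. As a minimal sketch, the shell function below multiplies the two settings to estimate how long sshd waits before dropping a client that has stopped responding. The name client_alive_timeout is a hypothetical helper chosen for illustration, not part of OpenSSH:

```shell
# Hypothetical helper, not an OpenSSH command.
# $1 = ClientAliveInterval in seconds, $2 = ClientAliveCountMax
client_alive_timeout() {
  echo $(( $1 * $2 ))
}

client_alive_timeout 120 12   # prints 1440, i.e. 24 minutes
```

Running candidate values through a helper like this before editing sshd_config makes it easier to see whether a given combination is tighter or looser than intended.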

Here are some recommended ClientAlive settings based on use case:

Use Case	ClientAliveInterval	ClientAliveCountMax
Interactive shell use	180 sec	3
Background file transfers	300 sec	6
Persistent VPN tunnel	90 sec	12

Adjusting these thresholds allows customizing SSH resilience to network disruptions based on the application.

Client-Side Keepalive Configuration

In addition to tuning the SSH server, keepalive behavior can also be configured in the client:

1. Per-Host SSH Config

Specify ServerAliveInterval for particular hosts in ~/.ssh/config:

Host tunnel-host
  Hostname 1.2.3.4
  ServerAliveInterval 180

2. SSH Command Line

The -o flag can set options like ServerAliveInterval on a one-off basis:

ssh -o ServerAliveInterval=180 user@host

3. System-Wide SSH Config

Global defaults defined in /etc/ssh/ssh_config will apply to all SSH sessions from the client unless explicitly overridden.

Client-side keepalives are another layer of protection regardless of server configuration.
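With three places to set the same option, it can be unclear which value applies to a given host. OpenSSH 6.8 and later can print the fully resolved client configuration with ssh -G. The sketch below assumes that output format (lowercase option names, one per line). The function keepalive_options is a hypothetical helper that filters the output down to the keepalive-related lines:

```shell
# Hypothetical helper: keep only keepalive-related lines from
# `ssh -G <host>` output supplied on standard input.
keepalive_options() {
  grep -iE '^(serveraliveinterval|serveralivecountmax|tcpkeepalive) '
}

# Example (not executed here; requires an OpenSSH client):
#   ssh -G tunnel-host | keepalive_options
```

Reading from standard input keeps the filter independent of the ssh invocation, so it works equally well on saved output.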

Verifying Alive Packets with tcpdump

To check if keepalives are being properly exchanged during an active SSH session, use the tcpdump utility to inspect packets on the wire.

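The following captures traffic sent to or from port 22 (SSH) and prints the time elapsed since the previous packet:

# tcpdump -i any -nn -ttt 'tcp port 22'

SSH keepalives travel inside the encrypted channel, so their content is not readable on the wire. On an otherwise idle session they appear as small packets exchanged at the configured interval, each followed by a reply from the peer.

If an idle session shows no packets at regular intervals, the keepalive configuration is not taking effect.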

Inspecting SSH Session State

Besides verifying keepalives, we can also inspect detailed SSH connection state using ss or netstat:

# ss -neot '( dport = :ssh or sport = :ssh )'

State       Recv-Q Send-Q  Local Address:Port Peer Address:Port 
ESTAB       0      0       192.168.1.20:56818 100.64.0.2:ssh

If a broken pipe is encountered, expect to see TIME-WAIT, CLOSE-WAIT, FIN-WAIT-1, or related states indicating the session is winding down.

Under normal operation, state will continue displaying ESTABLISHED while the stream remains intact.

Enable TCP Keepalives for Persistence

While SSH handles session keepalives itself, enabling TCP keepalive probes provides additional resilience:

# sysctl -w net.ipv4.tcp_keepalive_time=120
# sysctl -w net.ipv4.tcp_keepalive_intvl=30
# sysctl -w net.ipv4.tcp_keepalive_probes=6

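This instructs the TCP stack to send the first keepalive probe after 120 seconds of idle time, then one probe every 30 seconds, and to declare the connection dead after 6 unanswered probes, which is 120 + 6 * 30 = 300 seconds after the last traffic [[4]]. These settings apply only to sockets that enable SO_KEEPALIVE, which OpenSSH does by default through its TCPKeepAlive option.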
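The time before the kernel gives up on a silent peer is the idle time plus the probe interval multiplied by the probe count. As a rough sketch, the hypothetical helper tcp_keepalive_deadline below computes that figure, first for the values set above and then, where the Linux /proc files are readable, for the values currently in effect:

```shell
# Hypothetical helper, not a system command.
# $1 = tcp_keepalive_time, $2 = tcp_keepalive_intvl, $3 = tcp_keepalive_probes
tcp_keepalive_deadline() {
  echo $(( $1 + $2 * $3 ))
}

tcp_keepalive_deadline 120 30 6   # prints 300

# On Linux, feed in the live values when they are available.
p=/proc/sys/net/ipv4
if [ -r "$p/tcp_keepalive_time" ] && [ -r "$p/tcp_keepalive_intvl" ] && [ -r "$p/tcp_keepalive_probes" ]; then
  tcp_keepalive_deadline "$(cat "$p/tcp_keepalive_time")" \
                         "$(cat "$p/tcp_keepalive_intvl")" \
                         "$(cat "$p/tcp_keepalive_probes")"
fi
```

Comparing the result with the idle timeout of the firewall or NAT gateway on the path shows whether the first probe is sent early enough to keep its state entry alive.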

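Because the kernel generates the probes, they keep flowing even when the SSH process itself is idle, which refreshes state on firewalls and NAT gateways. They cannot rescue a connection whose route or NAT mapping has actually changed. In that case they only help the failure get detected sooner.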

OS-Specific TCP Keepalive Configuration

Each major operating system exposes TCP keepalive controls in its own way:

Linux: /proc/sys/net/ipv4/tcp_keepalive_* [[5]]
Windows: Registry keys like KeepAliveInterval [[6]]
macOS: sysctl net.inet.tcp.* e.g. net.inet.tcp.always_keepalive [[7]]

Consult your OS documentation for further socket-level tuning options.

Renegotiating SSH Sessions with Rekeying

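RFC 4253 recommends renegotiating session keys after 1GB of data has been transferred or 1 hour has passed. OpenSSH's default rekeys after a cipher-dependent amount of data, between 1GB and 4GB, and applies no time limit unless one is given as a second RekeyLimit argument.

Rekeying does not reconnect a session whose TCP connection has already been torn down. It can only complete when the network outage was brief enough for the connection to survive. The rekey threshold can be lowered per connection: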

# ssh -oRekeyLimit=5M user@host

This lowers the rekey threshold to 5MB, allowing more frequent rekeys.

However, excessive rekeying increases overhead. Tune based on reliability needs.

The Risk of Undetectable Broken Pipes

While properly configured keepalives prevent most broken pipes, a network failure that lasts longer than ClientAliveInterval * ClientAliveCountMax still causes the server to terminate the session. The client sees no error at that moment, because the server's disconnect message cannot reach it.

The only indication is that the session hangs, or that the next command or data transfer fails with a broken pipe.

Additionally, intermittent connectivity loss can trick TCP's error detection, abruptly breaking the underlying socket without notification [[8]].

So always architect SSH usage expecting potential unannounced disconnections, despite applying all keepalive best practices.
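One way to plan for such disconnections in scripts is to wrap remote commands in a retry loop. The sketch below is only an illustration: retry and RETRY_DELAY are hypothetical names, and the approach assumes the wrapped command is safe to run more than once.

```shell
# Hypothetical helper: run a command until it succeeds, at most $1 times.
# RETRY_DELAY (an assumed variable name) is the pause in seconds between attempts.
retry() {
  max_attempts=$1
  shift
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1   # give up after the final failed attempt
    fi
    attempt=$(( attempt + 1 ))
    sleep "${RETRY_DELAY:-5}"
  done
  return 0
}

# Example (not executed here):
#   retry 5 ssh -o ServerAliveInterval=60 user@host 'uptime'
```

Interactive sessions cannot be retried this way, so the pattern fits unattended jobs such as scheduled transfers or health checks.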

The only guaranteed resilience comes from directly attaching servers to highly reliable networks.

Conclusion

Broken pipes represent a lurking reliability threat, poised to sabotage SSH connectivity. Protect mission-critical sessions by:

Tuning ClientAliveInterval and ClientAliveCountMax appropriately
Enabling multilayered TCP + SSH keepalives
Designing infrastructure and software for intermittent failures

With vigilance and regular monitoring, the sneaky broken pipe phenomenon can be prevented most of the time. But also plan for handling the unexpected disconnections that will inevitably slip through.

By leveraging SSH's keepalive capabilities complemented by TCP-layer persistence, you can maximize remote access resilience and productivity.
