Demystifying the Known Hosts File for SSH in Linux

The Secure Shell (SSH) protocol is ubiquitous for remote access and file transfer between Linux, Unix, and even Windows systems. SSH utilizes strong encryption to prevent eavesdropping and establishes secure channels between endpoints.

As a professional with over a decade of Linux experience, I often rely on SSH to administer remote servers and transfer files. The protocol's integrated security mechanisms provide vital protection for these communications. One important component that prevents man-in-the-middle attacks is the SSH known_hosts file.

In this comprehensive guide, I'll draw on my expertise to explain the inner workings of the SSH known hosts system. We'll delve into how it relates to security best practices and protocols. You'll gain a solid grasp of this key component and how to apply it effectively in practice.

An Introduction to SSH

Before diving specifically into known_hosts, a quick primer on SSH is helpful for context. Feel free to skip ahead if you're already familiar with the protocol!

SSH stands for Secure Shell, with "shell" meaning a command-line interface for administration and file operations. The client/server architecture allows remote login from user workstations to servers over an encrypted tunnel:

[Figure: SSH protocol diagram]

This tunnel protects against eavesdroppers on the network. SSH negotiates symmetric session keys (for ciphers such as AES or ChaCha20) and authenticates the endpoints. But securing this bi-directional tunnel requires verifying the server's identity to avoid man-in-the-middle (MITM) attacks.

The known_hosts file serves as the database of previously seen and trusted servers accessed by a user. The SSH client maintains this list, adding an entry the first time you connect to a new endpoint and accept its key.

Asymmetric Encryption & Public Key Hashes

SSH uses asymmetric (public-key) cryptography to authenticate the server during the initial key exchange. Without diving too deep into cryptography theory, here's a quick explanation:

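Each server has a public and private host key pair that it uses to prove its identity when a connection is set up.
During the key exchange, the server signs the exchange data with its private key, and the client verifies that signature with the server's public key.
These key pairs rely on mathematical problems that are easy to compute in one direction but impractical to reverse, so the private key cannot be derived from the public key.
Public keys can be shared freely, since knowing one lets anyone verify the server's signatures without exposing the private key.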

To give users a short, easy way to identify a server's key, SSH tools run the public key through a cryptographic hash function.

[Figure: visualization of SSH key hashing]

OpenSSH displays SHA256 fingerprints by default (since version 6.8), while older releases used MD5. Because the hash is a one-way function, its output serves as a compact, recognizable fingerprint for each public key.

These fingerprints serve as a proxy for referring to associated public keys from specific hosts. Their high entropy makes collisions exceedingly unlikely in practice.
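As a rough illustration of how such a fingerprint is derived, the sketch below hashes the decoded key blob from a public key line with SHA-256 and base64-encodes the digest without padding, which is the form OpenSSH prints by default. The key used here is a dummy all-zero Ed25519 blob built only for the example, and the function names are illustrative rather than part of any SSH tool.

```python
import base64
import hashlib
import struct


def sha256_fingerprint(public_key_line: str) -> str:
    """Return the OpenSSH-style SHA256 fingerprint of a 'type base64 [comment]' line."""
    fields = public_key_line.split()
    key_blob = base64.b64decode(fields[1])       # raw wire-format public key
    digest = hashlib.sha256(key_blob).digest()   # 32-byte hash of the blob
    # OpenSSH prints the digest in base64 without the trailing '=' padding.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


def dummy_ed25519_line() -> str:
    """Build a syntactically valid but meaningless key line (32 zero bytes)."""
    key_type = b"ssh-ed25519"
    blob = (struct.pack(">I", len(key_type)) + key_type
            + struct.pack(">I", 32) + bytes(32))
    return "ssh-ed25519 " + base64.b64encode(blob).decode("ascii") + " demo@example"


if __name__ == "__main__":
    print(sha256_fingerprint(dummy_ed25519_line()))
```

For a real key file, the result should correspond to what ssh-keygen -lf prints for the same file.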

Armed with some background, let's explore specifically how SSH applies this workflow to securely identify remote endpoints.

What is the SSH Known Hosts File?

The known_hosts file resides in the .ssh directory of each user's home directory (~/.ssh/known_hosts). This allows each user to maintain a separate set of trusted keys for the servers they access.

On first establishing an SSH session from a client to a remote server, the client shows you the fingerprint of the server's public key (a SHA256 hash by default) and, once you accept it, records the full public key in known_hosts. The stored key allows the client to identify that same server when connecting again in the future.

Here's an example known_hosts file with two entries:

|1|8k4Uu98Q4bVeuOVDp0qjPdwEOjk=|ix54XSRwv+fVaxqH7MfIkZfx2T8= ec2-54-173-55-117.compute-1.amazonaws.com  

|1|hQCXnxsLaC5pOFQgP6Ak8clamJo=|a2G1H6XaDoffMvG2M3bmv4owiZ4= github.com

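Breaking this down:

The leading |1| marks an entry whose hostname is stored in hashed form (as produced by the HashKnownHosts option); it does not indicate the key type.
The two base64 strings that follow are the random salt and the HMAC-SHA1 hash of the hostname computed with that salt. They are not fingerprints of the key.
In a real file, the hashed field is followed by the key type (for example ssh-ed25519 or ssh-rsa) and the full base64-encoded public key. The example above abbreviates this and shows the hostname in their place; a hashed entry never stores the hostname in clear text, and an unhashed entry simply begins with the hostname.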
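In OpenSSH, a host field that begins with |1| holds a salt and an HMAC-SHA1 of the hostname keyed with that salt, both base64-encoded. The following minimal sketch reproduces that scheme to show how a client can test whether such a field belongs to a given hostname. The function names are illustrative, and the salt is generated on the spot rather than taken from any real file.

```python
import base64
import hashlib
import hmac
import os


def hash_hostname(hostname: str, salt: bytes) -> str:
    """Build the '|1|salt|hash' field for a hostname (HMAC-SHA1 keyed with the salt)."""
    mac = hmac.new(salt, hostname.encode("utf-8"), hashlib.sha1).digest()
    return "|1|{}|{}".format(
        base64.b64encode(salt).decode("ascii"),
        base64.b64encode(mac).decode("ascii"),
    )


def hashed_field_matches(field: str, hostname: str) -> bool:
    """Check whether a '|1|salt|hash' field was produced from the given hostname."""
    parts = field.split("|")  # expected: ['', '1', salt_b64, hash_b64]
    if len(parts) != 4 or parts[1] != "1":
        return False
    salt = base64.b64decode(parts[2])
    # Recompute the field with the stored salt and compare in constant time.
    return hmac.compare_digest(hash_hostname(hostname, salt), field)


if __name__ == "__main__":
    field = hash_hostname("github.com", os.urandom(20))  # random 20-byte salt
    print(field)
    print(hashed_field_matches(field, "github.com"))
    print(hashed_field_matches(field, "example.org"))
```

Because the salt differs per entry, the same hostname produces a different field each time, which is why a hashed file cannot simply be searched with a text match on the hostname.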

How Known Hosts Protects Against MITM Attacks

On subsequent SSH connections, the client transparently checks that the public key the server presents matches the key stored for that host:

[Figure: SSH known_hosts verification]

This protects against man-in-the-middle (MITM) attacks by detecting when a server presents a key that differs from the one recorded for it. An attacker intercepting traffic cannot silently substitute their own key without being flagged by this identity check.

When mismatches occur, SSH presents the user an alert like:

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@  
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!

This prevents trojaned SSH endpoints from hijacking secure tunnels without raising red flags. The decentralized known_hosts file gives clients visibility and control to lock down trusted server keys.
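The decision the client makes on each connection can be sketched roughly as follows. This is a deliberately simplified model, not OpenSSH's actual code: it assumes one stored key per host and key type, and it uses short placeholder strings where real entries hold base64 key data.

```python
from enum import Enum


class Verdict(Enum):
    NEW_HOST = "new host: ask the user to confirm the fingerprint"
    MATCH = "key matches the stored entry: continue"
    MISMATCH = "key differs from the stored entry: refuse and warn"


def check_host_key(known_hosts: dict, host: str, key_type: str, key: str) -> Verdict:
    """Compare the key a server presents against what was recorded for it earlier."""
    stored = known_hosts.get((host, key_type))
    if stored is None:
        return Verdict.NEW_HOST          # trust-on-first-use prompt
    return Verdict.MATCH if stored == key else Verdict.MISMATCH


if __name__ == "__main__":
    # Placeholder strings stand in for real base64 key data.
    known = {("server.example.com", "ssh-ed25519"): "KEY-A"}
    for presented in ("KEY-A", "KEY-B"):
        verdict = check_host_key(known, "server.example.com", "ssh-ed25519", presented)
        print(verdict.name)
    print(check_host_key(known, "new.example.com", "ssh-ed25519", "KEY-C").name)
```

The important property is that a mismatch is treated differently from an unknown host: an unknown host leads to a prompt, while a changed key stops the connection.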

Inspecting and Managing Known Hosts

Now we'll explore common tasks around inspecting and managing known SSH host entries using the CLI.

Listing Current Hosts

View all stored hosts using cat:

cat ~/.ssh/known_hosts

You'll see the full set of public keys and associated hostnames stored by SSH.

Search for a specific host such as github.com using grep:

grep github ~/.ssh/known_hosts

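This prints only lines containing the search term. If hostnames are stored in hashed form, grep will not find them by name; ssh-keygen -F github.com looks up an entry whether or not it is hashed.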
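For scripted inspection, the file is simple enough to parse directly. The sketch below splits each line into an optional marker, the host field, the key type, and the key, skipping comments and blank lines. The sample text uses placeholder key strings and documentation addresses, and the Entry type is a name chosen for this example.

```python
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    marker: Optional[str]  # '@cert-authority', '@revoked', or None
    hosts: List[str]       # comma-separated host patterns, or one hashed field
    key_type: str
    key: str


def parse_known_hosts(text: str) -> List[Entry]:
    """Parse known_hosts content into structured entries."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # ignore blanks and comments
        fields = line.split()
        marker = fields.pop(0) if fields[0].startswith("@") else None
        if len(fields) < 3:
            continue                      # skip malformed lines
        entries.append(Entry(marker, fields[0].split(","), fields[1], fields[2]))
    return entries


# Placeholder data for illustration only; the keys are not real.
SAMPLE = """\
# comment lines and blank lines are ignored
server.example.com,192.0.2.10 ssh-ed25519 PLACEHOLDERKEY1
|1|c2FsdA==|aGFzaA== ssh-rsa PLACEHOLDERKEY2
@cert-authority *.example.com ssh-ed25519 PLACEHOLDERKEY3 internal CA
"""

if __name__ == "__main__":
    for entry in parse_known_hosts(SAMPLE):
        print(entry.marker, entry.hosts, entry.key_type)
```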

Removing Outdated Entries

Over time you may accumulate stale or unwanted hosts. Prune these using ssh-keygen:

ssh-keygen -R oldhost.example.com

The -R flag removes specified hosts by hostname or IP from your known_hosts file.

Clearing All Hosts

To purge the entire known hosts database:

truncate --size 0 ~/.ssh/known_hosts

After truncating the file, SSH will ask you to verify and re-add each host the next time you connect to it.

Prompt to Update Keys

If a server admin regenerates host SSH keys, clients will see a mismatched fingerprint on next connection.

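By default SSH refuses to connect in this situation rather than offering to update the key. Once you have confirmed the change is legitimate, remove the old entry with ssh-keygen -R, then reconnect and accept the new key to update your known_hosts file.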

Potential Issues: Troubleshooting Known_Hosts

Now we'll cover some common issues users face related to SSH's known_hosts verification. Understanding the root cause makes each one easier to troubleshoot.

"Warning: Permanently added…" On Every Connection

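If you see SSH adding new entries on every connection, even to previously known hosts, the key is being accepted automatically but not saved. Check whether StrictHostKeyChecking is set to no system-wide or for a specific host, and whether UserKnownHostsFile points to a non-persistent location such as /dev/null or your known_hosts file is not writable.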

Confirm /etc/ssh/ssh_config has:

Host * 
  StrictHostKeyChecking yes

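And add the option to individual hosts if needed in ~/.ssh/config. With StrictHostKeyChecking yes, SSH never adds host keys automatically and refuses to connect to hosts that are not already listed, so new hosts must be added to known_hosts by hand. The default setting, ask, prompts you before adding a new key.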

"Host Key Has Changed" Warnings

As mentioned earlier, warning messages about changed host keys indicate the server's public key no longer matches the key stored for it in known_hosts.

Reasons this may happen:

The server or its SSH daemon was reinstalled, which generated new host keys
The server administrator regenerated the SSH host keys
The hostname or IP address you are connecting to now points to a different machine
An attacker is intercepting your connection (the case this warning exists to catch, so rule it out before proceeding)

Always investigate carefully before accepting changed key alerts. Check with the server admin if you have doubts about the fingerprint mismatch reason.

After due diligence, update the known_hosts entry by removing the old key with ssh-keygen -R and reconnecting to add the new one.

Permission Denied Errors

Invalid file permissions on ~/.ssh or known_hosts itself can prevent SSH reading or updating host entries.

Common permissible modes:

700 ~/.ssh
600 known_hosts

If you see errors when connecting, double-check that the permissions give your own user read and write access to both the directory and the file.
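A quick way to audit these modes is to compare the permission bits on each path with the most permissive mode you want to allow. The sketch below does that for an arbitrary path; the helper name and the message format are made up for this example, and it assumes a POSIX system.

```python
import os
import stat
import tempfile


def mode_problems(path: str, allowed_mode: int) -> list:
    """Report permission bits set on path beyond those in allowed_mode."""
    mode = stat.S_IMODE(os.stat(path).st_mode)   # keep only the permission bits
    extra = mode & ~allowed_mode                 # bits that exceed the allowance
    if extra:
        return ["{}: mode {:o} grants extra bits {:o} beyond {:o}".format(
            path, mode, extra, allowed_mode)]
    return []


if __name__ == "__main__":
    # Demonstrate on a throwaway file instead of touching the real ~/.ssh.
    with tempfile.TemporaryDirectory() as tmp:
        demo_file = os.path.join(tmp, "known_hosts")
        open(demo_file, "w").close()
        os.chmod(demo_file, 0o644)
        print(mode_problems(demo_file, 0o600))   # one problem reported
        os.chmod(demo_file, 0o600)
        print(mode_problems(demo_file, 0o600))   # empty list
```

To check a real setup, call mode_problems with os.path.expanduser("~/.ssh") and 0o700, and with the known_hosts path and 0o600.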

While public key cryptography and hashing may seem complex at first glance, SSH streamlines these concepts for ease of use. The known_hosts system manages the key continuity automatically after initial connections.

Administrators mainly need to handle occasional warnings and prune outdated entries, which is fairly straightforward with the ssh-keygen tool. Now that we've covered the major areas, let's discuss some best practices around security and monitoring.

Securing Known Hosts Entries

The user known_hosts file contains a history of servers accessed from a workstation. It holds only public keys, so it is not a secret in the way a private key is, but it does reveal which hosts you connect to, so it's worth limiting its exposure.

[Figure: SSH encryption and access layers]

Since this file is controlled by user permissions, some tips for locking it down:

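Restrict permissions using chmod 600 so that only the owning user can read or modify the contents
Hash the stored hostnames by setting HashKnownHosts yes (ssh-keygen -H converts existing entries), so the file does not reveal which servers you use; encrypting the whole file with gpg/OpenSSL is also possible, but SSH cannot read it until it is decrypted again
Protect private keys with a passphrase and unlock them through ssh-agent, rather than storing unencrypted private keys that would grant access regardless of known_hosts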

The global /etc/ssh/ssh_known_hosts file contains system-wide host keys maintained by the administrator. It should be world-readable and writable only by root.

Ultimately the private keys used for authentication represent the "keys to the kingdom" granting full access to accounts. So while limiting known_hosts exposure is good practice, keeping your private keys secured is absolutely critical for preventing account compromises or breaches.

Monitoring Key Changes

Routinely monitoring public keys from critical servers allows detecting changes that could reflect infrastructure compromises or misconfigurations.

One method is setting up periodic cron jobs that:

Fetch the current host keys from each server with ssh-keyscan and save them to a file
Compare the latest entries versus the saved historical snapshot
Trigger alerts if differences emerge

For example:

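# Monthly host key monitoring (one crontab entry; diff exits non-zero when the snapshots differ)
0 0 1 * * ssh-keyscan critical-server 2>/dev/null | sort > /data/keys.curr; diff -q /data/keys.prev /data/keys.curr > /dev/null || echo "SSH host keys changed!" | mail admin@company.com; mv /data/keys.curr /data/keys.prev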

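The ssh-keyscan command connects to the server and collects its public host keys without logging in; it does not read known_hosts. Keys gathered this way are unverified, so treat the first scan as a baseline to confirm out of band. Comparing periodic scans helps keep tabs on environments for unauthorized alterations.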
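If a plain diff is too noisy, the comparison step can be sketched in a few lines that ignore ordering and comment lines and report exactly which keys appeared or disappeared. The input is assumed to be ssh-keyscan style text (host, key type, key per line); the function names and the placeholder key strings are illustrative only.

```python
def load_keys(scan_output: str) -> set:
    """Turn ssh-keyscan style output into a set of (host, key_type, key) tuples."""
    keys = set()
    for line in scan_output.splitlines():
        fields = line.split()
        if len(fields) >= 3 and not line.startswith("#"):
            keys.add((fields[0], fields[1], fields[2]))
    return keys


def diff_keys(previous: str, current: str):
    """Return (added, removed) key tuples between two snapshots."""
    old, new = load_keys(previous), load_keys(current)
    return sorted(new - old), sorted(old - new)


if __name__ == "__main__":
    # Placeholder snapshots; real ones would come from saved ssh-keyscan output.
    prev = "critical-server ssh-ed25519 KEY-A\ncritical-server ssh-rsa KEY-B\n"
    curr = "critical-server ssh-rsa KEY-B\ncritical-server ssh-ed25519 KEY-C\n"
    added, removed = diff_keys(prev, curr)
    print("added:", added)
    print("removed:", removed)
```

An alert would fire whenever either list is non-empty, which also catches a key type that has been dropped rather than replaced.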

Integrating with Centralized PKI

So far we've focused on the decentralized trust-on-first-use (TOFU) model that SSH employs. The user known_hosts file establishes the canonical keys for each accessed server.

In more security-sensitive environments, SSH can rely on a centrally managed Certificate Authority (CA) instead. Note that OpenSSH uses its own certificate format rather than the X.509 certificates of a conventional PKI:

[Figure: SSH integrating with a private PKI]

Rather than TOFU, host keys are signed by a trusted Certificate Authority vouching for their validity. SSH clients validate the host certificate against the CA's public key before accepting the connection.

This guides key continuity through centralized policy control instead of localized user files. Explore integrating SSH with an Internal PKI or third party SSH CA products to meet your compliance needs.

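The known_hosts file still plays a role in certificate-based setups: clients record the trusted CA's public key there on a line marked @cert-authority, and ordinary entries remain available as a fallback for hosts without certificates.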

Wrapping Up

I hope this article helped provide useful background and tips on effectively managing SSH known_hosts files. Now you know:

The known_hosts file acts as a local store of previously seen SSH server public keys
This allows transparently verifying host identity on subsequent secure connections
Clients alert users whenever altered host keys are detected, thwarting MITM attacks
Proper management keeps this trust store up-to-date and secured from prying eyes

Feel free to reach out if you have any other questions! Stay secure out there.
