As a Linux system administrator, being able to seamlessly copy and synchronize files between servers is an essential skill. Whether copying user directories, replicating databases, or maintaining remote backups, the venerable rsync tool simplifies the process.
Developed in 1996 for efficiently mirroring data, rsync remains a staple in every admin's toolkit for good reason: it is a versatile tool for taming file operations. Let's dive into rsync mastery!
Understanding the rsync Utility
At its core, rsync provides fast incremental file transfer by minimizing the data moved between source and target files/directories. It accomplishes this with a delta-transfer algorithm that uses rolling checksums to quickly compare existing files and transfer only the differences.
This makes rsync extremely efficient for copying new and changed files in very large directory structures. It is also resilient to interruptions and can be resumed rather than starting transfers from scratch.
Key capabilities include:
- Local file copying as well as remote transfers over ssh
- Preservation of symbolic links, permissions, ownership, timestamps, etc.
- Powerful include/exclude rules for precise file selection
- Optional deletion of extraneous files on the destination to mirror the source
- Daemon mode for running a persistent backup server
- Return codes for scripting into larger automated workflows
Learning rsync is a rite of passage for Linux admins. While it is characterized by a sea of complex-looking options, we will break down practical use cases so you can gain confidence applying rsync for common needs.
Rsync Command Syntax Overview
The syntax structure for rsync commands takes the basic form:
rsync [options] [source] [destination]
Where classic use cases involve:
- Local file copying from one directory to another
- Remote server file transfer and synchronization
- Remote incremental backups from source server to destination
Common scenario examples:
# Local file copy
rsync -azvh /usr/local /backup
# Remote server file copy
rsync -azvh /home user@host:/backup/home
# Remote incremental backup
rsync -azvh --delete /data user@host:/backups/data
We will work through more realistic examples in the sections below. But first, let's ensure rsync is installed and ready.
Installing rsync on Ubuntu
Current versions of Ubuntu and most other Linux distributions ship with rsync pre-installed. But if needed, use apt to install:
sudo apt update
sudo apt install rsync
Verify with:
rsync --version
# rsync version 3.1.3 (protocol 31)
With rsync installed, let's unpack some key options.
Understanding rsync Options
With 30+ command options available, rsync functionality is extremely flexible but the abundant options can seem overwhelming.
Let's demystify some of the commonly used ones:
Archive mode (-a):
- Recursively transfer files while preserving symbolic links, permissions, ownership, timestamps, etc.
- Essential for maintaining an exact mirror backup copy.
Verbose mode (-v):
- Increases verbosity to monitor the transfer progress.
- Use -vv or -vvv for even more detailed logs.
Compress (-z):
- In-transit file compression for faster transfer of remote data.
Delete (--delete):
- Removes files from the destination that no longer exist on the sender.
- Important for maintaining mirror copies and pruning outdated backups. Exercise caution when using this recursively.
Bandwidth limit (--bwlimit):
- Limits transfer speed in kilobytes per second. Useful for reducing impact when running over metered or shared network links.
Exclude (--exclude):
- Specifies a pattern of files/dirs to exclude from the transfer. Crucial for fine-tuning backups.
Stats log (--log-file=FILE):
- Writes verbose statistics to a log file for later analysis.
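Several of these options combine naturally. As a sketch (the paths, exclude patterns, and bandwidth cap here are illustrative, not from any real server), a safe preview run might look like:

```shell
# Preview a sync without changing anything (--dry-run), skipping
# temp files and a cache directory, and capping bandwidth at ~5 MB/s
rsync -azvh --dry-run \
  --exclude='*.tmp' \
  --exclude='cache/' \
  --bwlimit=5000 \
  --log-file=/tmp/rsync-preview.log \
  /var/www/ /backup/www/
```

Dropping --dry-run then performs the real transfer with identical selection rules.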
There are many more options worth reviewing via man rsync once you are comfortable with the basics. But we are now armed with enough to demonstrate practical examples!
Copying Local Files with rsync
A simple way to start harnessing rsync is copying local files from one directory to another on your filesystem.
For example, let's copy our Downloads folder to an external USB drive mounted at /media/backups.
rsync -azvh /home/user/Downloads/ /media/backups
Breaking down the key parts:
Source:
- /home/user/Downloads/: path the files are copied from
Destination:
- /media/backups: path the files are copied to
Options:
- -a: archive mode, preserves metadata
- -z: compress for faster transfer
- -v: verbose output for monitoring
- -h: human-readable file sizes
This will recursively copy all contents from the Downloads folder to our backup USB drive while showing verbose output like:
building file list ... done
./
file1
file2
file3
...
Total transferred file size: 15,238 bytes
...
sent 138 bytes received 122 bytes 142.00 bytes/sec
total size is 15,238 speedup is 58.61
Note a few bits from the output:
- Total transferred file size – total size of the changes copied over
- speedup – the ratio of the total data size to the bytes actually sent over the wire
Because rsync leverages rolling checksums to only transfer differences, we see the speedup demonstrating major gains!
Now let's explore more use cases.
Remote Server Backups with rsync
Where rsync truly excels is performing data backups and transfers between remote servers rather than just locally. Its ability to mirror directory structures makes rsync well-suited for maintaining offsite copies.
For demonstration, we will back up key data from our web server web1 onto a separate host backup1 located at IP 192.168.1.150.
There are two common methods for enabling remote server rsync:
1. Rsync over Remote Shell
The most ubiquitous method runs rsync over a remote shell transport – typically ssh. This allows securely contacting any remote server reachable over the network, with no extra services required on the destination.
General syntax for rsync file transfer over remote shell:
rsync [option...] /path/to/source remoteuser@remotehost:/remote/destination
As an example, let's do a full recursive mirror copy of our web1 codebase to backup1:
rsync -azh --delete /var/www/ webuser@192.168.1.150:/backups/web1/
Now everything under /var/www on web1 will be copied to the /backups/web1 path on backup1 via ssh.
Key points of note:
- The trailing / on the source path copies the directory's contents only, rather than the directory itself
- The --delete option cleans up stale files left over from previous backups
For scheduled backup scripts, it's wise to set up ssh public key authentication between hosts rather than relying on password logins.
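A minimal key setup, assuming the webuser account and backup host from the example above (the key filename is illustrative), might look like:

```shell
# Generate a dedicated key pair with no passphrase for unattended jobs
ssh-keygen -t ed25519 -f ~/.ssh/backup_key -N ''

# Install the public key on the backup host (prompts for the password once)
ssh-copy-id -i ~/.ssh/backup_key.pub webuser@192.168.1.150

# Point rsync at the key explicitly via the -e (remote shell) option
rsync -azh -e "ssh -i ~/.ssh/backup_key" /var/www/ webuser@192.168.1.150:/backups/web1/
```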
2. Rsync Daemon Method
An alternative approach is configuring the destination host to run a persistent rsync daemon process. This listens on a defined TCP port allowing source servers to directly contact it without ssh.
On the backup1 recipient server, edit rsyncd.conf with your modules and parameters. Then launch the daemon:
/etc/rsyncd.conf
[backups]
path = /backups
read only = yes
Start daemon:
rsync --daemon --config=/etc/rsyncd.conf
Once running, rsync clients can push data to the daemon server like:
web1 source server:
rsync [options] /local/path/ backup1::backups
The daemon method has some advantages such as avoiding remote ssh and credentials per transfer. The downside is needing to open firewall ports.
Automating Incremental Backups
One of rsync's major advantages over other mirroring tools is built-in support for incremental transfers. By calculating differences at the file level before transfer, rsync minimizes bandwidth needs for ongoing backup tasks.
It determines changes through its rolling-checksum algorithm, comparing files block by block. When a difference is discovered, only the changed blocks are transferred.
Let's look at an example backup script automating incremental copies:
/home/user/bin/backup.sh
#!/bin/bash
# Backup script via rsync
# Config
SRC=/home # Source dir to backup
USER=remoteuser
HOST=192.168.50.10
DEST=/backups/$HOST
# Create logfile
LOG="$(date +%Y-%m-%d)_backup.log"
# Rsync opts
RSYNC_OPTS="-azh --del --stats --log-file=$LOG"
echo "*** Daily backup from $(hostname) started ***" >> $LOG
# Initial full backup
if [[ ! -d $DEST ]]; then
echo "Running initial full backup..."
rsync $RSYNC_OPTS $SRC $USER@$HOST:$DEST >> $LOG
else
# Incremental backup
echo "Running daily incremental backup..."
rsync $RSYNC_OPTS --ignore-existing $SRC $USER@$HOST:$DEST >> $LOG
fi
echo "*** Backup completed at $(date) ***" >> $LOG
The key points that enable incrementals:
- First initial seed backup transfers full data
- Subsequently only changed files copied with
--ignore-existing - Log stats to analyze over time
We could then schedule this daily using cron. The same logic can extend to hourly backups of critical data silos as needed.
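As a sketch, a crontab entry (added via crontab -e) like the following would run the script daily at 2:30 AM; the script path matches the location assumed above:

```shell
# m  h  dom mon dow  command
30   2  *   *   *    /home/user/bin/backup.sh
```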
Bidirectional Sync with Rsync
A common scenario is maintaining an active-passive server pair with mirrored data between them – for example dual webheads or a hot standby database replica.
Rsync itself is one-way, but you can approximate bidirectional synchronization by running a mirrored transfer in each direction using the double-colon :: daemon addressing (this assumes a www module is defined in each host's rsyncd.conf).
For example on web1 as the primary:
rsync -azvv --delete /var/www/ standby1.ex::www/
And the reverse direction sync configured on standby1:
rsync -azvv --delete /var/www/ web1.ex::www/
With this, files changed on either server update the counterpart on the next run. Note that rsync has no conflict detection, so overlapping edits on both sides can overwrite each other; schedule the two directions carefully. Very useful for keeping high availability clusters in sync!
Common rsync Pitfalls & Issues
While extremely powerful, rsync does come with some best practices worth calling out:
1. Accidental large file deletions
Using --delete can lead to data loss if the sync directories are misconfigured. Always test without the delete option before trusting it live. Some tips:
- Start with --dry-run first to preview the impact
- Use --max-delete=10 to limit the number of files deleted
- Set up exclude rules to protect files from removal
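Combining the dry-run and deletion-cap tips, a safety check before a live mirror might look like this (remote host and paths illustrative):

```shell
# -n (--dry-run) with -v prints "deleting ..." lines without touching
# anything; --max-delete caps removals if the command is later run live
rsync -azvn --delete --max-delete=10 /var/www/ user@host:/backups/web1/
```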
2. Timestamp truncation and daylight saving time (DST)
Filesystems such as HFS+ on macOS that truncate timestamp precision can confuse rsync's change detection around DST clock changes. Work around this with --modify-window=1, and avoid scheduling runs across the local clock-change hour.
3. Resuming aborted transfers
If a file transfer aborts mid-stream, rerun the same command with the --partial flag; rsync will keep the partially transferred file and resume from it rather than starting the file over.
4. Remote shell locale confusion
If rsync throws strange character-encoding errors during remote transfers, force LANG=C before the rsync call. Locale differences between hosts can otherwise mangle filenames.
Conclusion & Next Steps
In closing, hopefully this guide has equipped you with both a broad conceptual grasp and readily applicable examples for harnessing rsync. Mastery of rsync will enable you to slash time previously lost to clumsy data copying operations.
Some recommended next steps:
- For protecting backups, combine rsync with client-side encryption tools like gocryptfs before sending data over the wire
- To prevent accidental data destruction, leverage filesystem snapshots on destination backup volumes
- Build redundancy by cascading backups across multiple remote servers
- Containerize rsync as a Docker image for easier portability and availability
- For insights into rsync runtime performance, analyze the output stats logs using tools like rsyncstats
What are your favorite use cases or optimizations for rsync? I welcome any feedback for improving this guide!


