For full stack developers and database administrators, implementing reliable backup and recovery procedures for MariaDB is a crucial responsibility. This comprehensive guide will empower DevOps engineers, SREs and admins to expertly utilize logical and physical backup techniques to protect critical MariaDB data.

We will cover utilizing powerful utilities like mysqldump and mysqlhotcopy for creating database dumps, automating backup workflows, verifying integrity, securely storing backups, and efficiently restoring MariaDB servers. Advanced features, real-world examples and best practices included!

An Overview of Backup and Recovery Methods

The key objectives for MariaDB backup and recovery are:

  1. Protecting against data loss from system crashes or human error.
  2. Recovering from data corruption.
  3. Enabling disaster recovery by restoring onto new servers after hardware failure.

The core capabilities required are:

  • Regularly backing up MariaDB data and structural components
  • Storing backups to facilitate rapid recovery
  • Validation of backups to ensure retrievability
  • Efficient restoration of backups in case of disasters

There are two well-established options available – logical SQL backups using mysqldump and physical raw file backups using mysqlhotcopy. Let's examine both approaches.

MariaDB Backup Methods

While both options achieve the same end result of safeguarding MariaDB databases, the internals differ vastly. Choosing the optimal technique depends on your recovery time objective (RTO), data size and complexity.

Now let's dive into using these tools effectively through practical examples. We will explore advanced options for optimization, followed by automation, validation and restore techniques.

Logical SQL Backups with mysqldump

The mysqldump client allows DBAs and DevOps engineers to generate logical SQL backups containing all database schemas and objects along with the data itself. The exported SQL file acts as a database snapshot that can recreate MariaDB database state as of backup time.

Some key advantages of this backup technique:

  • Portable – Backup files can be directly imported into any MariaDB or MySQL instance. Great for migrating across servers or cloud providers.
  • Backs up objects and data – Tables, views, stored procedures, functions, triggers, events etc. along with the data itself.
  • Smaller backup files – SQL text generally compresses better than raw data copies. Faster transfers.
  • Runs as a regular client – No need to stop the server, though the dump does add read load while it runs.

While this method works great for relatively small databases, large databases with TBs of data take substantial time for the import process. This impacts recovery time objectives. Now let's see how to efficiently take consistent logical backups.

Creating Consistent Backups

Since mysqldump reads from a live MariaDB server, our backups could contain inconsistent data if updates occur mid-backup. For transactional engines like InnoDB, this issue can be eliminated using the --single-transaction flag:

mysqldump -u root -p mydb --single-transaction > mydb_backup.sql

This dumps all data within a single transaction, fetching a consistent snapshot across all tables and objects in one go. Note that this only guarantees consistency for transactional engines such as InnoDB.

For non-transactional engines such as MyISAM, the alternative is:

mysqldump -u root -p mydb --lock-all-tables > mydb_backup.sql 

This acquires a global read lock for the duration of the dump, blocking writes and preventing inconsistencies. Long-running queries and application writes will block until the dump finishes, so measure the impact before using this in production.

Both methods produce backups with consistent data across all tables.

Optimizing Backup Performance

Since mysqldump has to retrieve entire table contents along with schema data, backup durations can grow substantially for large databases.

We can shave time off large dumps with a few mysqldump options:

mysqldump -u root -p mydb --quick --skip-lock-tables > mydb_backup.sql

  • --quick streams rows one at a time instead of buffering whole tables in memory. It is enabled by default via --opt, but worth knowing when tuning.
  • --skip-lock-tables avoids table lock operations, providing a no-frills logical backup. This may come at a consistency cost.
  • Omitting stored routines and events (i.e. not passing --routines and --events) shaves a little more time, at the cost of incomplete object definitions.

Note that mysqldump itself is single-threaded; there is no parallelism flag, so parallel dumps require running multiple dumps concurrently or using external tooling.

Refer to the mysqldump documentation for all available performance tuning options. While these achieve faster backups, they may result in inconsistent data or incomplete object definitions depending on which options you skip. Evaluate the impact before using these optimizations on critical production systems.
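One way to speed up large dumps is to back up tables in parallel background jobs, one output file per table. Here is a minimal sketch of the pattern; dump_table is a hypothetical stand-in so the script runs without a server – a real version would invoke mysqldump for each table:

```shell
#!/bin/sh
# Sketch: parallel per-table dumps via shell background jobs.
# dump_table is a hypothetical stand-in; a real version would run:
#   mysqldump mydb "$1" > "$OUTDIR/$1.sql"
set -eu
OUTDIR=$(mktemp -d)

dump_table() {
    # Simulated dump so the sketch runs without a database server
    echo "-- dump of table $1" > "$OUTDIR/$1.sql"
}

for t in users orders events; do
    dump_table "$t" &    # each table dumped in its own background job
done
wait                     # block until all background dumps finish

ls "$OUTDIR"
```

The fan-out/wait pattern keeps the script simple while letting the per-table work overlap; on a real server, limit the job count to avoid overwhelming IO.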

Now that we have covered taking efficient logical SQL dumps, let's go through some examples of automating these backups, which is essential for production-grade databases.

Automating Logical SQL Backups using Cron

While manually running mysqldump is great for one-off scenarios, mission-critical systems require automatically scheduling regular logical SQL backups to shared storage, cloud object stores or offsite disaster recovery locations using secure encrypted protocols.

The standard solution is utilizing cron – a robust job scheduler available by default on virtually all Linux and Unix platforms. Cron allows running mysqldump automatically based on a powerful time specification language.

The cron time format consists of five fields – minute, hour, day of month, month and day of week – followed by the command to run.

Here is an example cron entry utilizing this date/time format to schedule a full mysqldump backup nightly at 1 AM into a shared NFS mount. Credentials should come from an option file such as ~/.my.cnf, since cron cannot answer an interactive -p password prompt:

# Full backup at 1 AM daily (credentials read from an option file)
0 1 * * * mysqldump --defaults-extra-file=/root/.my.cnf mydb --single-transaction --routines | gzip > /mnt/backups/mydb-daily-$(date +\%Y\%m\%d-\%H\%M).sql.gz

The key aspects here are:

  • 0 1 * * * – Run at 1 AM daily
  • --routines – Backs up stored procedures and functions
  • gzip – Compresses output, reducing storage
  • $(date +\%Y\%m\%d-\%H\%M) – Dynamic timestamped filename (% must be escaped in a crontab)
  • /mnt/backups – Shared storage location
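Timestamped files accumulate quickly, so it pays to pair the backup job with a retention sweep. A minimal sketch follows – the 7-day window is an illustrative choice, and a temp directory stands in for /mnt/backups so the sketch runs anywhere:

```shell
#!/bin/sh
# Sketch: delete backup files older than a retention window.
# BACKUP_DIR and RETENTION_DAYS are illustrative values to adapt.
set -eu
BACKUP_DIR=$(mktemp -d)     # stands in for /mnt/backups
RETENTION_DAYS=7

# Create one stale and one fresh file to demonstrate the sweep
touch -d '10 days ago' "$BACKUP_DIR/mydb-daily-old.sql.gz"
touch "$BACKUP_DIR/mydb-daily-new.sql.gz"

# Remove dumps older than the retention window (GNU find)
find "$BACKUP_DIR" -name '*.sql.gz' -mtime "+$RETENTION_DAYS" -delete

ls "$BACKUP_DIR"
```

In production this would run from its own cron entry, shortly after the backup job.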

Let's look at another example that backs up only recently changed tables every 6 hours – a rough incremental scheme:

# Incremental: dump tables changed in the last 6 hours
0 */6 * * * mysqldump --single-transaction --quick mydb $(mysql -NBe "SELECT table_name FROM information_schema.tables WHERE table_schema='mydb' AND update_time > DATE_SUB(NOW(), INTERVAL 6 HOUR)") > /mnt/backups/mydb-incremental-$(date +\%Y\%m\%d-\%H\%M).sql

Here we:

  • query information_schema.tables to dynamically find recently updated tables
  • pass only those table names to mysqldump
  • save to the NFS share with a dynamic timestamped filename

One caveat: update_time is not reliably maintained for InnoDB tables (it is kept in memory and reset on server restart), so validate this scheme before depending on it.

Key Takeaways

  • Cron provides rich support for automating mysqldump schedules
  • Dynamic filenames incorporating timestamps are very useful
  • Gzip compression reduces storage and speeds up transfers
  • Tailor backup frequency and storage locations appropriately
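The compression and naming steps from the cron entry can be factored into a small helper that also verifies the gzip archive before declaring success. A sketch operating on an already-produced dump file (the paths and sample dump content are illustrative):

```shell
#!/bin/sh
# Sketch: compress a dump file under a timestamped name, then verify it.
set -eu
WORK=$(mktemp -d)
DUMP="$WORK/mydb.sql"                      # stands in for mysqldump output
echo "CREATE TABLE t (id INT);" > "$DUMP"

STAMP=$(date +%Y%m%d-%H%M)
ARCHIVE="$WORK/mydb-daily-$STAMP.sql.gz"

gzip -c "$DUMP" > "$ARCHIVE"

# gzip -t checks archive integrity without extracting it
if gzip -t "$ARCHIVE"; then
    echo "backup archive OK: $ARCHIVE"
else
    echo "backup archive CORRUPT: $ARCHIVE" >&2
    exit 1
fi
```

Failing fast on a corrupt archive means a broken backup is noticed the night it happens, not during a restore.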

This covers automatically scheduling mysqldump MariaDB backups using cron. Next let us discuss how to verify backup integrity which is an SRE best practice.

Verifying mysqldump Backup Integrity

While taking database dumps is important, we need to validate that these backup files are restorable and provide an escape route out of catastrophes.

Some standard verification checks include:

  • Checking SQL syntax correctness
  • Restoring to staging/UAT environments
  • Ensuring queryability of tables
  • Analyzing row counts match production
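Alongside these database-level checks, the backup files themselves can be tracked with a checksum manifest written at backup time and re-verified before any restore. A sketch using sha256sum, with a temp directory standing in for the backup share:

```shell
#!/bin/sh
# Sketch: record and later re-verify SHA-256 checksums for backup files.
set -eu
BACKUP_DIR=$(mktemp -d)                    # stands in for /mnt/backups
echo "-- dump contents --" > "$BACKUP_DIR/mydb-daily.sql.gz"

# Write a manifest at backup time
( cd "$BACKUP_DIR" && sha256sum *.sql.gz > MANIFEST.sha256 )

# Later, before restoring, verify the files against the manifest
( cd "$BACKUP_DIR" && sha256sum -c MANIFEST.sha256 )
echo "manifest verified"
```

Any bit rot or truncated transfer makes sha256sum -c exit non-zero, which is easy to alert on.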

Let us go through useful SQL queries and techniques for validating backup integrity.

Using CHECKSUM to Verify Backups

The CHECKSUM TABLE command can be utilized to compute a checksum across all rows in a table. This serves as a consistency marker to compare production vs restored data.

-- Generate checksum on production
CHECKSUM TABLE mytable1;

+----------------+-----------+
| Table          | Checksum  |
+----------------+-----------+
| mydb.mytable1  | 265978769 |  
+----------------+-----------+

We take a dump and restore it to a staging MariaDB instance:

mysqldump -u root -p mydb mytable1 > mytable1_backup.sql
mysql -u root -p mystagingdb < mytable1_backup.sql

Compare the CHECKSUM between production and staging:

-- checksum from staging after restore
CHECKSUM TABLE mystagingdb.mytable1;

+--------------------------+-----------+  
| Table                    | Checksum  |
+--------------------------+-----------+
| mystagingdb.mytable1     | 265978769 |
+--------------------------+-----------+

If the checksums match, our mysqldump backup contains consistent data that was accurately restored to staging. If they differ, investigate corruption or restore configuration issues. This method verifies data consistency across MariaDB environments.
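In an automated pipeline, the two CHECKSUM TABLE results can be captured non-interactively (for example with mysql -NBe) and compared in shell. A sketch with the server queries replaced by hard-coded sample values so it runs standalone:

```shell
#!/bin/sh
# Sketch: compare production vs staging checksums, fail on mismatch.
# In practice each value would come from something like:
#   mysql -NBe 'CHECKSUM TABLE mydb.mytable1' | awk '{print $2}'
set -eu
PROD_SUM=265978769        # sample value standing in for the live query
STAGING_SUM=265978769     # sample value standing in for the staging query

if [ "$PROD_SUM" = "$STAGING_SUM" ]; then
    RESULT="checksums match: backup verified"
else
    RESULT="CHECKSUM MISMATCH: investigate before trusting this backup"
fi
echo "$RESULT"
```

Wiring the mismatch branch to a non-zero exit code lets the scheduler surface failed verifications automatically.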

Analyzing Row Counts

Comparing table row counts is a faster SQL-based verification check when CHECKSUM TABLE is too expensive to run against very large tables.

-- Row count on production 
SELECT COUNT(*) FROM analytics.events;
+----------+  
| COUNT(*) |
+----------+
| 11938709 |  
+----------+

-- Compare rows count after restoring on staging
SELECT COUNT(*) FROM analytics_staging.events;
+----------+
| COUNT(*) |  
+----------+
| 11938709 |
+----------+ 

Matching row counts indicate backup and restore fidelity. A row count mismatch implies issues that should be investigated – like a failed bulk data load or an incorrect backup configuration.

This is more lightweight compared to checksum while still providing a solid logical integrity check.

Other Integrity Checks

Beyond the SQL-based checks discussed so far, other backup verification methods include:

  • Restore Testing – Perform test restores to staging environments mimicking production recovery use case. Helps uncover gaps.
  • Corruption Testing – Intermittently corrupt staging database files or manipulate table data between backups. Test recovery from different corruption and data loss scenarios.
  • Expert Review – Review backup configurations, schedules and retention by experienced DBAs to suggest improvements.
  • Monitoring – Track backup metrics like duration, compressed sizes, row counts over time to detect anomalies indicating potential issues.
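The monitoring idea can start as simply as appending one CSV row per backup run and flagging a sharp size drop against the previous run. A sketch with hypothetical numbers – the 50% threshold is an assumption to tune:

```shell
#!/bin/sh
# Sketch: log backup size per run and flag a sharp drop vs the last run.
set -eu
LOG=$(mktemp)                 # stands in for a persistent metrics CSV

record_backup() {             # args: date, size_bytes
    echo "$1,$2" >> "$LOG"
}

record_backup 2024-05-01 1048576
record_backup 2024-05-02 1050000
record_backup 2024-05-03 200000      # suspiciously small backup

PREV=$(tail -n 2 "$LOG" | head -n 1 | cut -d, -f2)
CURR=$(tail -n 1 "$LOG" | cut -d, -f2)

# Flag if the latest backup is less than half the previous one
if [ "$CURR" -lt $((PREV / 2)) ]; then
    ALERT="backup size anomaly: $CURR vs previous $PREV"
else
    ALERT="backup size within normal range"
fi
echo "$ALERT"
```

A shrinking dump often means a dropped table, a failed export, or a misconfigured schedule – all cheaper to catch here than during a restore.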

Conclusion on mysqldump Backups

We have so far extensively explored logical SQL backups using mysqldump – from efficiently creating portable database schema + data dumps to automatically scheduling them and techniques to validate the backup integrity.

While logical SQL backups are convenient and provide a lot of flexibility, large databases with TB-scale data can be better served by physical raw backups using mysqlhotcopy, which we will examine next.

Physical Backups using mysqlhotcopy

For databases over 1TB, where application recovery time objectives demand more performance, mysqldump logical backups may not suffice. Directly copying the underlying data files can provide several-fold faster restores through reduced IO and CPU overhead.

This is exactly what mysqlhotcopy facilitates – high-performance physical backups by copying actual database directories and files from the data folder. Note that mysqlhotcopy is a legacy tool, so check availability and engine support on your MariaDB release. Some salient aspects:

  • Copies the actual table files from the data directory – mysqlhotcopy supports only MyISAM and ARCHIVE tables; for InnoDB, the Mariabackup tool serves the same physical-backup role
  • Minimal processing overhead on the database server
  • Restores complete by simply copying files back into the original data directory
  • Requires a similar MariaDB version and file layout between the backed-up source and target servers

Due to these performance advantages, mysqlhotcopy is well-suited for huge production databases where small RTO differences translate to significant losses.

Performing Hotcopy Backups

The mysqlhotcopy tool takes fast raw snapshots of running MariaDB data files by briefly locking and flushing the tables, then copying the underlying files directly – bypassing the SQL layer entirely, which is what makes it quick.

General syntax:

mysqlhotcopy db_name /backup/path 

To hotcopy just myproddb database into /db_backups folder:

mysqlhotcopy myproddb /db_backups

To copy multiple databases in one run, list them together:

mysqlhotcopy myproddb reportingdb /backupstorage

Both databases are copied in a single invocation.

While mysqlhotcopy imposes minimal load on the database server thanks to direct storage engine access, the operation can temporarily stall write access on tables being backed up. Plan backups keeping peak application usage in mind.

Now that we have taken raw hotcopy backups, let's discuss restore and recovery procedures.

Restoring Physical Backups

Since physical backups are just copied data files, restoration revolves around:

  • Halting the database safely
  • Replacing current files with backup
  • Confirming data integrity
  • Restarting MariaDB

Here is a step-by-step example:

  1. Stop the mariadb service safely:

     systemctl stop mariadb
  2. Archive existing data directory:

     mv /var/lib/mysql /var/lib/mysql_old
  3. Overwrite with backup archive:

     mv /db_backups /var/lib/mysql
     chown -R mysql:mysql /var/lib/mysql 
  4. Start mariadb service

     systemctl start mariadb
  5. Verify tables and databases are accessible with original data:

    SHOW DATABASES;
    SELECT COUNT(*) FROM analytics.events; 

Following these steps, the MariaDB server should be fully recovered and reachable with restored data after crashing unexpectedly or suffering from storage failures.
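The directory-swap portion of the steps above can be sketched as follows – temp dirs stand in for /var/lib/mysql and /db_backups so the sketch runs without a server, and the service stop/start and chown steps are noted in comments rather than executed:

```shell
#!/bin/sh
# Sketch: swap a backup directory into place, keeping the old data dir.
# Temp dirs stand in for /var/lib/mysql and /db_backups. A real restore
# runs 'systemctl stop mariadb' first and 'chown -R mysql:mysql' after.
set -eu
ROOT=$(mktemp -d)
DATADIR="$ROOT/mysql"          # stands in for /var/lib/mysql
BACKUP="$ROOT/db_backups"      # stands in for the hotcopy backup
mkdir -p "$DATADIR" "$BACKUP"
echo "corrupt data" > "$DATADIR/ibdata1"
echo "backup data"  > "$BACKUP/ibdata1"

# Keep the old directory rather than deleting it outright
mv "$DATADIR" "${DATADIR}_old"
mv "$BACKUP" "$DATADIR"

cat "$DATADIR/ibdata1"
```

Renaming rather than deleting the old data directory preserves a fallback if the restored files turn out to be bad.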

Key Takeaway – Since physical backups are portable copies, restoration is incredibly fast by directly overwriting corrupt or lost data files from a verified backup archive. This speeds up recovery and minimizes application downtime even for very large databases.

Consolidated Backup Strategy

Based on the pros and cons summarized so far, a robust backup approach combines both backup formats:


  • Logical SQL Backups using mysqldump provide flexible, portable backup archives that can be easily duplicated across geographic regions, enabling disaster recovery from new infrastructure when required. Scheduling daily mysqldump backups to S3 or a similar object store makes sense.

  • Local Physical Hotcopy Snapshots using mysqlhotcopy minimizes recovery time when attempting restores to existing working infrastructure after isolated failures. Hotcopies to local resilient storage or attached SAN works great.

Together, they enable comprehensive data protection across a spectrum of failure scenarios. The exact backup frequencies, retention policies and storage targets can be tailored as per recovery objectives.

Adopting this hybrid backup strategy overcomes the limitations of any single tool and provides peace of mind.
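Put together, the hybrid strategy might look like this in a crontab. The paths, schedules, bucket name and the aws CLI upload are illustrative assumptions to adapt, and credentials are expected to come from an option file:

```shell
# Illustrative hybrid schedule (adapt paths, times and upload tooling):

# Nightly logical dump, compressed, then shipped offsite
0 1 * * * mysqldump --defaults-extra-file=/root/.my.cnf mydb --single-transaction --routines | gzip > /mnt/backups/mydb-$(date +\%Y\%m\%d).sql.gz
30 1 * * * aws s3 cp /mnt/backups/ s3://my-dr-bucket/mariadb/ --recursive --exclude '*' --include '*.sql.gz'

# Weekly physical hotcopy to local resilient storage
0 3 * * 0 mysqlhotcopy mydb /db_backups
```

The staggered upload gives the dump time to finish; tighten or loosen the gap based on observed backup durations.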

Closing Advice

I hope this extensive MariaDB backup and recovery guide serves as a definitive reference for developers and database administrators protecting critical MariaDB systems against data loss through a defense-in-depth approach.

The key takeaways are:

✅ Logical SQL dumps via mysqldump facilitate portable database archives

✅ Physical copies using mysqlhotcopy provide performance and faster RTO

✅ Automate backups with cron scheduling for production resiliency

✅ Validate backups periodically through CRC checks and test restores

✅ Combine both logical and physical techniques for comprehensive coverage

What storage targets are you utilizing for MariaDB backups? Are there any other tips you suggest based on experience? I would be happy to incorporate any best practices I may have missed! Feel free to reach out to me.
