Backup/restore performance issues #7802
Writing up a general/umbrella issue to enumerate the problem areas:
Intro
For backup/restore, the preferred solution in Vitess today is to use the xtrabackup integration. In general, the way this works is:
Backup
- vttablet launches xtrabackup against the local MySQL instance
- xtrabackup streams the backup output to vttablet (potentially multiple streams)
- vttablet compresses the stream(s)
- vttablet then directs the stream(s) to a data store (upload to S3, upload to GCS, store to local file, send to Ceph, etc.)
Restore
- vttablet reads the backup data from a data store
- vttablet decompresses the stream
- vttablet streams the decompressed data into xtrabackup, which restores the database in a temporary directory locally
- xtrabackup then performs a "prepare" on the backup (replaying the logged changes that accumulated during the backup and were stored along with the MySQL data files). Depending on how long the backup took, how large it is, and how many writes the database took while the backup was running, this can take a significant amount of time
- When the prepare is done, the files are moved into position, and MySQL can be started on top of it, completing the restore.
Note that for purposes of this discussion, I am going to ignore any point-in-time restore by applying binlogs after the above process is complete.
Performance bottlenecks
- General: Because much of the streaming/compression/decompression work happens in vttablet, which might also be serving queries, this can be a problem from both a CPU and a GC point of view. We have no limiter/throttler in the pipeline to limit the impact on the serving tablet (really only a problem for backups; restores obviously should go as fast as possible)
- Backup: compression phase. This has been problematic (from a correctness point of view) in the past; we have gone through a few parallel gzip/zlib libraries. There are also not many knobs for the user to tune to make tradeoffs (e.g. compression level). One specific issue here is that with parallel compression and multiple threads, it is easy to completely blow out the local host/container CPU allocation (maybe just something to document?)
- Restore: decompression phase. zlib/gzip decompression is single-threaded and not very fast (at least with the library we use), so we are limited to a maximum decompression speed of around 100-200 MB/sec (depending on the data). For small tablets this isn't a big deal, but we can do better. By definition, this is probably the most performance-critical stage.
- In general, it might be advisable to move to zstd compression: it compresses much faster (or with less CPU), supports multi-threaded compression out of the box, and its single-threaded decompression is easily 4x as fast as our current gzip/zlib library. As a bonus, depending on the compression level used, it can compress significantly better than gzip/zlib.
- In addition (or alternatively), we should look into an optimized zlib/gzip implementation for decompression to speed up restores. Something based on zlib-ng or even the DEFLATE implementation from https://nigeltao.github.io/blog/2021/fastest-safest-png-decoder.html might work here.
- Other areas / notes:
- We run the xtrabackup prepare phase without any InnoDB options that could speed it up significantly (e.g. decent-sized buffers via --use-memory, relaxing consistency, etc.). There are options to customize this, but I doubt most people know or bother. Maybe at least document this?
- S3 uploads have been problematic in the past, though more from a correctness than a performance point of view (cf. issues and PRs for this code)