Enhancement #69190

open

cephfs-mirror should be able to transfer multiple files in parallel

Added by Alexander Patrakov over 1 year ago. Updated about 1 year ago.

Status:
Fix Under Review
Priority:
Normal
Category:
Performance/Resource Usage
Target version:
% Done:

0%

Source:
Community (user)
Backport:
squid, reef
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
cephfs-mirror
Labels (FS):
Pull request ID:
Tags (freeform):
Merge Commit:
Fixed In:
Released In:
Upkeep Timestamp:

Description

In its current form, cephfs-mirror transfers only one file at a time for each configured directory. On HDD-based clusters, this caps throughput at roughly 100 files per second per configured directory, which is a serious bottleneck if the source directory contains many small files.

I have performed the following benchmarks of the source cluster. The cluster in question contains 1006 OSDs (NVMes for the metadata pool, HDDs with DB on NVMe for the data pools).

1. Finding files, as a case of a single-threaded metadata-related workload:

cd /mnt/cephfs/something/.snap/something
find . -type f | pv -l > /tmp/files

Result: 6 million files enumerated in just under 10 minutes, which gives us ~10000 files per second.

2. Running 192 md5sum processes in parallel, each invocation receiving a batch of 192 files from the list, as an example of a massively data-parallel workload:

cat /tmp/files | xargs -d "\\n" -P 192 -n 192 md5sum | pv -l > /tmp/md5sums

Result: this finished in 25 minutes, i.e. about 4000 files per second. In retrospect, I think I might have overloaded the cluster.
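For anyone who wants to try the same pattern on a smaller scale, here is a minimal sketch of benchmark 2. It assumes a system with `md5sum` and an `xargs` that supports `-0` and `-P` (GNU findutils does). The temporary tree, the 64 files, and the parallelism of 4 are illustrative values, not the ones used in the original run.

```shell
#!/bin/sh
# Scaled-down sketch of benchmark 2: checksum many small files in parallel.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir" "$workdir.sums"' EXIT

# Create 64 small files to stand in for the snapshot contents.
i=1
while [ "$i" -le 64 ]; do
    printf 'file %s\n' "$i" > "$workdir/f$i"
    i=$((i + 1))
done

# NUL-delimited names are safe for any filename.
# -P sets the number of concurrent md5sum processes,
# -n sets the number of files handed to each invocation.
find "$workdir" -type f -print0 \
    | xargs -0 -P 4 -n 8 md5sum > "$workdir.sums"

# One output line per checksummed file.
count=$(wc -l < "$workdir.sums")
echo "checksummed $count files"
```

Raising `-P` increases the number of files read concurrently, which is the effect the benchmark measures. On a real CephFS mount the achievable rate depends on the cluster, so the numbers from this sketch are not comparable to the results above.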

Thus, assuming that the destination cluster has similar write performance, parallelization could yield a speedup of at least 40x (~4000 files per second versus the ~100 files per second of one-at-a-time transfers).
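The arithmetic behind these figures can be checked with a short script. It uses only the numbers quoted in this report: "just under 10 minutes" is rounded to 600 seconds, and the ~100 files per second baseline is the estimate from the description, not a measured value.

```shell
#!/bin/sh
# Throughput arithmetic from the two benchmarks in this report.
files=6000000             # files in the snapshot
find_secs=$((10 * 60))    # "just under 10 minutes", rounded to 10
md5_secs=$((25 * 60))     # parallel md5sum run
serial_rate=100           # estimated one-at-a-time rate on HDDs (files/s)

find_rate=$((files / find_secs))      # single-threaded metadata walk
md5_rate=$((files / md5_secs))        # 192-way parallel data reads
speedup=$((md5_rate / serial_rate))   # parallel reads vs. serial transfer

echo "find:    ${find_rate} files/s"
echo "md5sum:  ${md5_rate} files/s"
echo "speedup: ${speedup}x"
```

This prints 10000 files/s, 4000 files/s, and 40x, matching the figures above.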

As far as I know, no filesystem data synchronization or backup tool supports massively parallel transfers. It would be a killer feature if cephfs-mirror could do such transfer parallelization automatically.

I know that @Md Mahamudur Rahaman Sajib is working on this already, see https://github.com/sajibreadd/ceph/pull/5, but I was asked to create an issue here for tracking purposes.

Actions #1

Updated by Md Mahamudur Rahaman Sajib over 1 year ago

  • Assignee set to Md Mahamudur Rahaman Sajib
Actions #2

Updated by Venky Shankar about 1 year ago

  • Status changed from New to Fix Under Review
  • Target version set to v20.0.0
  • Backport set to squid, reef
  • Pull request ID set to 61245