clone volume: cp doesn't support sparse file #279

@stoneshi-yunify

Hostpath: v1.6.2

Cloning a volume calls cp -a <src-vol> <dest_vol>; see:

func loadFromFilesystemVolume(hostPathVolume hostPathVolume, destPath string) error {

The cp shipped with Alpine (BusyBox) does not support sparse files; it copies a sparse file as a regular file, writing out the holes as zeros. Therefore, if the source volume contains a large sparse file, the copy is extremely slow.

QEMU/VM disk images are a common kind of sparse file; projects like KubeVirt use them a lot.
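To illustrate what "sparse" means here, the following is a minimal sketch that creates a file consisting of one large hole and compares its apparent size with the space actually allocated, the same gap that ls -l and du show for disk.img in the test below. The 64 MiB size and the temporary path are arbitrary choices for illustration, and it assumes a Linux filesystem that supports holes.

```shell
#!/bin/sh
# Create a 64 MiB file without writing any data, so it is one large hole.
workdir=$(mktemp -d)
truncate -s 64M "$workdir/disk.img"

# Apparent size in bytes (what ls -l reports).
apparent=$(stat -c %s "$workdir/disk.img")
# Number of blocks actually allocated (what du is based on; units are
# reported by stat's %B and are typically 512 bytes).
allocated=$(stat -c %b "$workdir/disk.img")

echo "apparent bytes:   $apparent"
echo "allocated blocks: $allocated"

rm -rf "$workdir"
```

A copy tool that is not sparse-aware has to read and write the full apparent size, even though almost none of it is backed by real data.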

The cp from coreutils supports sparse files by default (--sparse=auto) and greatly shortens the copy time, so the hostpath image could simply install coreutils.
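As a rough sketch of the difference, a wrapper could detect at runtime whether the installed cp advertises --sparse and only then pass it. copy_volume is a hypothetical helper name used for illustration, not something in the hostpath driver; the feature check relies on the --sparse option appearing in the coreutils help text and being absent from the BusyBox one, as in the test output in this issue.

```shell
#!/bin/sh
# copy_volume SRC DEST
# Hypothetical helper (not part of the hostpath driver): copy SRC to DEST
# with cp -a, asking for sparse handling when the installed cp supports it.
copy_volume() {
    if cp --help 2>&1 | grep -q -- '--sparse'; then
        # GNU coreutils cp: detect holes in SRC and keep DEST sparse.
        cp -a --sparse=auto "$1" "$2"
    else
        # BusyBox cp: no sparse option, so holes are written out as zeros.
        cp -a "$1" "$2"
    fi
}
```

For example, copy_volume /csi-data-dir/<src-vol> /csi-data-dir/<dest-vol>. Installing coreutils in the image (apk add coreutils, as in the test) makes the first branch the one that runs.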

The cp test:

root@kubevm:~# kubectl -n kube-system exec -it csi-hostpathplugin-0 -c hostpath -- sh
/csi-data-dir/24c4a772-a3f7-11eb-bea3-7e7563fe7b85 # ls -l
total 17056
-rw-rw---- 1 root root 20293720064 Apr 23 05:46 disk.img
/csi-data-dir/24c4a772-a3f7-11eb-bea3-7e7563fe7b85 # du -sh *
17M	disk.img
/csi-data-dir/24c4a772-a3f7-11eb-bea3-7e7563fe7b85 # time cp -a /csi-data-dir/24c4a772-a3f7-11eb-bea3-7e7563fe7b85 /csi-data-dir/old-cp
real	2m 35.27s
user	0m 0.02s
sys	1m 58.11s
/csi-data-dir/24c4a772-a3f7-11eb-bea3-7e7563fe7b85 # cp --help
BusyBox v1.32.1 () multi-call binary.

Usage: cp [OPTIONS] SOURCE... DEST

Copy SOURCE(s) to DEST

	-a	Same as -dpR
	-R,-r	Recurse
	-d,-P	Preserve symlinks (default if -R)
	-L	Follow all symlinks
	-H	Follow symlinks on command line
	-p	Preserve file attributes if possible
	-f	Overwrite
	-i	Prompt before overwrite
	-l,-s	Create (sym)links
	-T	Treat DEST as a normal file
	-u	Copy only newer files
/csi-data-dir/24c4a772-a3f7-11eb-bea3-7e7563fe7b85 # apk add coreutils
fetch https://mirrors.aliyun.com/alpine/v3.13/main/x86_64/APKINDEX.tar.gz
fetch https://mirrors.aliyun.com/alpine/v3.13/community/x86_64/APKINDEX.tar.gz
(1/6) Installing libacl (2.2.53-r0)
(2/6) Installing libattr (2.4.48-r0)
(3/6) Installing skalibs (2.10.0.0-r0)
(4/6) Installing s6-ipcserver (2.10.0.0-r0)
(5/6) Installing utmps (0.1.0.0-r0)
Executing utmps-0.1.0.0-r0.pre-install
(6/6) Installing coreutils (8.32-r2)
Executing busybox-1.32.1-r6.trigger
OK: 14 MiB in 39 packages
/csi-data-dir/24c4a772-a3f7-11eb-bea3-7e7563fe7b85 #
/csi-data-dir/24c4a772-a3f7-11eb-bea3-7e7563fe7b85 #
/csi-data-dir/24c4a772-a3f7-11eb-bea3-7e7563fe7b85 # cp --help
Usage: cp [OPTION]... [-T] SOURCE DEST
  or:  cp [OPTION]... SOURCE... DIRECTORY
  or:  cp [OPTION]... -t DIRECTORY SOURCE...
Copy SOURCE to DEST, or multiple SOURCE(s) to DIRECTORY.

Mandatory arguments to long options are mandatory for short options too.
  -a, --archive                same as -dR --preserve=all
      --attributes-only        don't copy the file data, just the attributes
      --backup[=CONTROL]       make a backup of each existing destination file
  -b                           like --backup but does not accept an argument
      --copy-contents          copy contents of special files when recursive
  -d                           same as --no-dereference --preserve=links
  -f, --force                  if an existing destination file cannot be
                                 opened, remove it and try again (this option
                                 is ignored when the -n option is also used)
  -i, --interactive            prompt before overwrite (overrides a previous -n
                                  option)
  -H                           follow command-line symbolic links in SOURCE
  -l, --link                   hard link files instead of copying
  -L, --dereference            always follow symbolic links in SOURCE
  -n, --no-clobber             do not overwrite an existing file (overrides
                                 a previous -i option)
  -P, --no-dereference         never follow symbolic links in SOURCE
  -p                           same as --preserve=mode,ownership,timestamps
      --preserve[=ATTR_LIST]   preserve the specified attributes (default:
                                 mode,ownership,timestamps), if possible
                                 additional attributes: context, links, xattr,
                                 all
      --no-preserve=ATTR_LIST  don't preserve the specified attributes
      --parents                use full source file name under DIRECTORY
  -R, -r, --recursive          copy directories recursively
      --reflink[=WHEN]         control clone/CoW copies. See below
      --remove-destination     remove each existing destination file before
                                 attempting to open it (contrast with --force)
      --sparse=WHEN            control creation of sparse files. See below
      --strip-trailing-slashes  remove any trailing slashes from each SOURCE
                                 argument
  -s, --symbolic-link          make symbolic links instead of copying
  -S, --suffix=SUFFIX          override the usual backup suffix
  -t, --target-directory=DIRECTORY  copy all SOURCE arguments into DIRECTORY
  -T, --no-target-directory    treat DEST as a normal file
  -u, --update                 copy only when the SOURCE file is newer
                                 than the destination file or when the
                                 destination file is missing
  -v, --verbose                explain what is being done
  -x, --one-file-system        stay on this file system
  -Z                           set SELinux security context of destination
                                 file to default type
      --context[=CTX]          like -Z, or if CTX is specified then set the
                                 SELinux or SMACK security context to CTX
      --help     display this help and exit
      --version  output version information and exit

By default, sparse SOURCE files are detected by a crude heuristic and the
corresponding DEST file is made sparse as well.  That is the behavior
selected by --sparse=auto.  Specify --sparse=always to create a sparse DEST
file whenever the SOURCE file contains a long enough sequence of zero bytes.
Use --sparse=never to inhibit creation of sparse files.

When --reflink[=always] is specified, perform a lightweight copy, where the
data blocks are copied only when modified.  If this is not possible the copy
fails, or if --reflink=auto is specified, fall back to a standard copy.
Use --reflink=never to ensure a standard copy is performed.

The backup suffix is '~', unless set with --suffix or SIMPLE_BACKUP_SUFFIX.
The version control method may be selected via the --backup option or through
the VERSION_CONTROL environment variable.  Here are the values:

  none, off       never make backups (even if --backup is given)
  numbered, t     make numbered backups
  existing, nil   numbered if numbered backups exist, simple otherwise
  simple, never   always make simple backups

As a special case, cp makes a backup of SOURCE when the force and backup
options are given and SOURCE and DEST are the same name for an existing,
regular file.

GNU coreutils online help: <https://www.gnu.org/software/coreutils/>
Report any translation bugs to <https://translationproject.org/team/>
Full documentation <https://www.gnu.org/software/coreutils/cp>
or available locally via: info '(coreutils) cp invocation'
/csi-data-dir/24c4a772-a3f7-11eb-bea3-7e7563fe7b85 # time cp -a /csi-data-dir/24c4a772-a3f7-11eb-bea3-7e7563fe7b85 /csi-data-dir/coreutils-cp
real	0m 0.08s
user	0m 0.00s
sys	0m 0.02s
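
To put the two timings in perspective, here is a small back-of-the-envelope calculation using only the numbers from the session above (the apparent size from ls -l and the two real times). It is a sketch of the arithmetic, not a benchmark.

```shell
#!/bin/sh
size_bytes=20293720064   # apparent size of disk.img from ls -l
busybox_secs=155.27      # real time of BusyBox cp (2m 35.27s)
coreutils_secs=0.08      # real time of coreutils cp

# Effective rate if BusyBox cp processed the full apparent size.
mib_per_sec=$(awk -v b="$size_bytes" -v t="$busybox_secs" \
    'BEGIN { printf "%.1f", b / t / 1048576 }')
# Ratio of the two wall-clock times.
speedup=$(awk -v a="$busybox_secs" -v c="$coreutils_secs" \
    'BEGIN { printf "%.0f", a / c }')

echo "BusyBox cp effective rate: ${mib_per_sec} MiB/s"
echo "coreutils cp speedup:      ${speedup}x"
```

The BusyBox figure is consistent with the whole 20 GB apparent size being read and written, while the coreutils copy only had to handle the roughly 17M of allocated data.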
