Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Use copy_file_range(2), if available, to implement File.Copy#39235

Closed
lpereira wants to merge 1 commit intodotnet:masterfrom
lpereira:use-copy-file-range
Closed

Use copy_file_range(2), if available, to implement File.Copy#39235
lpereira wants to merge 1 commit intodotnet:masterfrom
lpereira:use-copy-file-range

Conversation

@lpereira
Copy link

@lpereira lpereira commented Jul 5, 2019

From the man page:

  The  copy_file_range()  system  call performs an in-kernel
  copy between two file descriptors without the additional cost
  of transferring data from the kernel to user space and then
  back into the kernel.

This system call, available in some modern-ish Linux systems, will try
to copy between file descriptors using the most efficient way according
to the filesystem. If none of the filesystems involved are able to
perform a efficient copy, it'll fallback to using splice(2), which is
essentially how sendfile(2) is implemented.

The GNU C Library provides a syscall wrapper since 2.27. Since 2.29,
it also provides a userspace fallback mechanism (similar to
CopyFile_ReadWrite(), using read(2) and write(2)). The reason for not
using syscall(2) directly instead of relying on the syscall wrapper is
precisely to use the glibc userspace copy fallback instead.

Should fix #39231.

@lpereira lpereira requested review from janvorli and jkotas July 5, 2019 19:56
@jkotas
Copy link
Member

jkotas commented Jul 5, 2019

Build breaks...

@lpereira lpereira force-pushed the use-copy-file-range branch from 2470f49 to 4e32dc1 Compare July 5, 2019 20:17
@lpereira
Copy link
Author

lpereira commented Jul 5, 2019

Build breaks...

Should be fixed. That reminds me that we should evaluating building with -D_FILE_OFFSET_BITS=64 on 32-bit glibc builds.

@danmoseley
Copy link
Member

Post 3.0 at this point I think.

@danmoseley danmoseley added the * NO MERGE * The PR is not ready for merge yet (see discussion for detailed reasons) label Jul 7, 2019
@danmoseley
Copy link
Member

Marking no merge until master becomes 5.0.

@janvorli
Copy link
Member

janvorli commented Jul 8, 2019

We build our official binaries on CentOS 7, which AFAIK has glibc 2.17. That means that they will not be able to take advantage of this change. So it seems it would be better to use the syscall directly.

@lpereira
Copy link
Author

lpereira commented Jul 8, 2019

We build our official binaries on CentOS 7, which AFAIK has glibc 2.17. That means that they will not be able to take advantage of this change. So it seems it would be better to use the syscall directly.

Considering that we have a read(2) + write(2) fallback in place already, using the syscall directly might be an option, yes. (glibc provides a similar fallback if the kernel does not provide that syscall.)

@lpereira lpereira force-pushed the use-copy-file-range branch from 4e32dc1 to 02c5787 Compare July 15, 2019 17:06
@lpereira
Copy link
Author

Updated patch to use syscall() directly.

@stephentoub stephentoub removed * NO MERGE * The PR is not ready for merge yet (see discussion for detailed reasons) labels Jul 17, 2019
@danmoseley danmoseley requested a review from JeremyKuhne July 19, 2019 23:09
@lpereira lpereira force-pushed the use-copy-file-range branch from f7e8020 to b5c41c8 Compare July 22, 2019 23:43
@JeremyKuhne JeremyKuhne added this to the 5.0 milestone Jul 22, 2019
@JeremyKuhne JeremyKuhne requested a review from carlossanlop July 22, 2019 23:50
@eiriktsarpalis
Copy link
Member

@lpereira would you be able to take a look? Thanks

@carlossanlop carlossanlop assigned lpereira and unassigned carlossanlop Aug 8, 2019
@lpereira
Copy link
Author

@filipnavara is fine in rebasing #37583 on top of this one; could I please get some review here? (I leave on vacations this week, would like to get this merged before then. Build breaks seem spurious.)

@jkotas
Copy link
Member

jkotas commented Aug 13, 2019

Could you please verify that the optimization will kick in for official builds that are built on CentOS 7?

@lpereira lpereira force-pushed the use-copy-file-range branch 2 times, most recently from 6eb6a73 to 4db6819 Compare August 13, 2019 22:21
@lpereira
Copy link
Author

Could you please verify that the optimization will kick in for official builds that are built on CentOS 7?

I force-pushed a new version that hardcodes the syscall number on x86-64, i386, arm32, and arm64 (falling back to including syscall.h and trying to use __NR_copy_file_range). It should be enabled when built with CentOS 7.

@jkotas
Copy link
Member

jkotas commented Aug 14, 2019

The Linux test failures seems to be caused by this change.

@lpereira lpereira force-pushed the use-copy-file-range branch from 4db6819 to faa0054 Compare October 2, 2019 18:24
@lpereira
Copy link
Author

lpereira commented Oct 2, 2019

Rebased and repushed with some of the comments addressed.

From the man page:
      The  copy_file_range()  system  call performs an in-kernel
      copy between two file descriptors without the additional cost
      of transferring data from the kernel to user space and then
      back into the kernel.

This system call, available in some modern-ish Linux systems, will try
to copy between file descriptors using the most efficient way according
to the filesystem.  If none of the filesystems involved are able to
perform a efficient copy, it'll fallback to using splice(2), which is
essentially how sendfile(2) -- the current method to perform a file
copy -- is implemented.

Although the GNU C Library provides a syscall wrapper since 2.27, our
build servers are based off of an older version of CentOS, which do not
ship with that library.  So the system call is called directly with
syscall() instead.  Since glibc 2.29, there's a user mode fallback too,
but we don't need it as there are two other fallback mechanisms in place
already.
@danmoseley
Copy link
Member

@JeremyKuhne @carlossanlop could you please re-review this oldie?

@danmoseley
Copy link
Member

it also blocks #37583

}

toCopy -= (size_t)copied;
if (toCopy == 0)
Copy link

@carlossanlop carlossanlop Oct 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a chance that toCopy could end up being less than 0? Should this check be if (toCopy <= 0) or is there no point?

Copy link
Author

@lpereira lpereira Oct 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no point: copied will either be -1 (and that's handled above), or, at most, the value of toCopy (it'll return the number of bytes actually copied, which could be less than that). If for some reason the syscall returns more than the value of toCopy (which is passed as a parameter), then there's a bug in the kernel.

(Also, toCopy is size_t, which is unsigned.)

Copy link

@carlossanlop carlossanlop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a question. If it's not actionable, let me know, so I can approve, @lpereira.

return fcopyfile(inFd, outFd, NULL, COPYFILE_ALL) == 0 ? 0 : -1;
#else
// Get the stats on the source file.
// Now that the data from the file has been copied, copy over metadata
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment no longer makes sense now that it's been separated into its own function. Let's delete the "Now that the data from the file has been copied, " part.

{
// If copy_file_range is unavailable in this kernel, size
// will remain untouched, so it's fine for concurrent
// threads to write to sendFileUnavailable.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this comment... what does this check have to do with sendFileUnavailable? Is that a copy-and-paste error in the comment?

if (toCopy == 0)
{
break;
}
Copy link
Member

@stephentoub stephentoub Oct 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just make the condition on the loop while (toCopy > 0)? That would also avoid entering the loop if the amount of data to copy initially is 0.

if (ret != 0)

startOffset += (off_t)(sourceStat.st_size - (off_t)bytesToCopy);
// copy_file_range couldn't be used, either because we're in an older kernel or
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// copy_file_range couldn't be used, either because we're in an older kernel or
// copy_file_range couldn't be used to copy the whole file, either because we're in an older kernel or

// Then copy permissions.
while ((ret = fchmod(outFd, sourceStat.st_mode & (S_IRWXU | S_IRWXG | S_IRWXO))) < 0 && errno == EINTR);
if (ret != 0)
// sendfile couldn't be used; fall back to a manual copy below. This could happen
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// sendfile couldn't be used; fall back to a manual copy below. This could happen
// sendfile couldn't be used to copy the whole file; fall back to a manual copy below. This could happen

#endif // !HAVE_FCOPYFILE

int32_t SystemNative_CopyFile(intptr_t sourceFd, intptr_t destinationFd)
static int32_t CopyFileMetadata(int inFd, int outFd, struct stat_ *sourceStat)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does CopyFileMetadata need to take a pointer to a struct stat_ rather than just declaring one itself? None of the call sites use the passed in pointer after the call, and I don't think the benefit of avoiding the extra local outweighs the cognitive cost of passing around pointers unnecessarily.

#endif // HAVE_SENDFILE_4

#if defined(PAL_COPY_FILE_RANGE_SYSCALL)
static loff_t CopyFileRange(int inFd, loff_t *inOff, int outFd, loff_t *outOff, size_t len, unsigned int flags)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to name this copy_file_range and define it only if it's not already defined?

Copy link
Member

@stephentoub stephentoub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments to be addressed, otherwise LGTM.

Can you share some perf numbers using File.Copy to highlight the benefits of the change?

@stephentoub
Copy link
Member

In addition to all of the comments to be addressed, this now also has conflicts. @lpereira, do you plan to finalize this PR in the next week and a half? If not, we should close it and you can submit an updated version in the new dotnet/runtime repo when it's available.

@lpereira
Copy link
Author

lpereira commented Nov 4, 2019

@stephentoub In order to make things easier for the repo consolidation, I'll do this and come back with an updated patch once that happens.

@lpereira lpereira closed this Nov 4, 2019
@stephentoub
Copy link
Member

Ok. Sounds good, thanks.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SystemNative_CopyFile() should use copy_file_range() in Linux

9 participants