
Have VFS graphdriver use accelerated in-kernel copy #35537

Merged: yongtang merged 4 commits into moby:master from sargun:vfs-use-copy_file_range on Dec 5, 2017

Conversation

sargun (Contributor) commented Nov 17, 2017

- What I did
This change makes the VFS graphdriver use the kernel-accelerated copy mechanism, copy_file_range(2), which is able to leverage reflinks.

The copy package should provide all of the capabilities that CopyWithTar provided.

I learned a lot about file system performance on Linux -- at least about XFS on machines with reasonably fast I/O, many cores, and lots of memory. The primary barrier, as it turns out, is not disk I/O but lock contention on the dentry cache, the inodes, and the allocation groups.
- How I did it
I reused the accelerated copy code I had written for the overlay graphdriver.

- How to verify it
I used the existing tests to verify the behaviour.

- Description for the changelog
Use copy_file_range to copy files for VFS on Linux.

Member:

Can we just define dirCopy() as a function in both copy_linux and copy_unsupported instead of setting this var?

sargun (Contributor, author):

This var gets changed by layer/layer_test.go -- why it changes it, I'm not sure. I would feel comfortable changing it if I knew why it existed.

Member:

FYI, this is because the chrootarchive archiver performs a re-exec that does some shadow magic, and the unit test is not prepared for that.

Member:

No need for this one, the functions can call copyDir directly.

Member:

Oh, never mind -- I see the layer store test is using this. Something about nice things...

sargun (Contributor, author):

👍

sargun (Contributor, author):

@cpuguy83 Anything else then?

Member:

Is chrootarchive really necessary now? Without it there would be no need to expose this, especially since Linux no longer needs it...

sargun (Contributor, author) commented Nov 27, 2017:

What would we do with the non-windows bits? I can do it, but I'd not want to do it in this PR.

cpuguy83 (Member):

Looks like experimental got stuck in a syscall:

Details
goroutine 4277 [runnable]:
syscall.Syscall(0x4a, 0x13, 0x0, 0x0, 0x0, 0x0, 0x0)
	/usr/local/go/src/syscall/asm_linux_amd64.s:18 +0x5
syscall.Fsync(0x13, 0x10, 0x96cdb8)
	/usr/local/go/src/syscall/zsyscall_linux_amd64.go:492 +0x4a
os.(*File).Sync(0xc4216e04f0, 0x1e53018, 0xc4217ae148)
	/usr/local/go/src/os/file_posix.go:121 +0x5b
github.com/docker/docker/pkg/ioutils.(*atomicFileWriter).Close(0xc421bcef90, 0x0, 0x0)
	/go/src/github.com/docker/docker/pkg/ioutils/fswriters.go:68 +0x81
github.com/docker/docker/pkg/ioutils.AtomicWriteFile(0xc42134cb00, 0x71, 0xc421090000, 0xe57, 0xe57, 0xc400000180, 0xc420f24990, 0x2b899b0)
	/go/src/github.com/docker/docker/pkg/ioutils/fswriters.go:41 +0xcb
github.com/docker/docker/image.(*fs).Set(0xc420477e30, 0xc421090000, 0xe57, 0xe57, 0x0, 0x0, 0x0, 0x0)
	/go/src/github.com/docker/docker/image/fs.go:121 +0x122
github.com/docker/docker/image.(*store).Create(0xc42022f1a0, 0xc421090000, 0xe57, 0xe57, 0x0, 0x0, 0x0, 0x0)
	/go/src/github.com/docker/docker/image/store.go:146 +0x2c8
github.com/docker/docker/daemon.(*Daemon).Commit(0xc4203c3200, 0xc421c28580, 0x40, 0xc421b39d80, 0x0, 0x0, 0x0, 0x0)
	/go/src/github.com/docker/docker/daemon/commit.go:207 +0x696
github.com/docker/docker/builder/dockerfile.(*Builder).commitContainer(0xc4203eb4a0, 0xc4213c66c0, 0xc421c28580, 0x40, 0xc42123cc80, 0x2b9cba0, 0xc4217ddf60)
	/go/src/github.com/docker/docker/builder/dockerfile/internals.go:117 +0x117
github.com/docker/docker/builder/dockerfile.dispatchRun(0xc4213c66c0, 0xc420addd04, 0xc4203eb4a0, 0x2bafae0, 0xc421af1260, 0xc420feafc0, 0xc42048ff00, 0xc420feafc0, 0xc4217aed00)
	/go/src/github.com/docker/docker/builder/dockerfile/dispatchers.go:356 +0xa7e
github.com/docker/docker/builder/dockerfile.dispatch(0xc4213c66c0, 0xc420addd04, 0xc4203eb4a0, 0x2bafae0, 0xc421af1260, 0xc420feafc0, 0x7fa8b459a008, 0xc42048ff00, 0x0, 0x0)
	/go/src/github.com/docker/docker/builder/dockerfile/evaluator.go:81 +0x457
github.com/docker/docker/builder/dockerfile.(*Builder).dispatchDockerfileWithCancellation(0xc4203eb4a0, 0xc421a77d10, 0x1, 0x1, 0x0, 0x0, 0x0, 0x5c, 0x2bafae0, 0xc421af1260, ...)
	/go/src/github.com/docker/docker/builder/dockerfile/builder.go:322 +0x940
github.com/docker/docker/builder/dockerfile.(*Builder).build(0xc4203eb4a0, 0x2bafae0, 0xc421af1260, 0xc42048f240, 0x0, 0x0, 0x0)
	/go/src/github.com/docker/docker/builder/dockerfile/builder.go:240 +0x3bb
github.com/docker/docker/builder/dockerfile.(*BuildManager).Build(0xc4203b87c0, 0x7fa8b679bba0, 0xc42048f500, 0x2bacf20, 0xc420107280, 0x2b9bd60, 0xc4217ddf40, 0x2b9cba0, 0xc4217ddf60, 0x2b9cba0, ...)
	/go/src/github.com/docker/docker/builder/dockerfile/builder.go:125 +0x4aa
github.com/docker/docker/api/server/backend/build.(*Backend).Build(0xc42085eae0, 0x7fa8b679bba0, 0xc420107380, 0x2bacf20, 0xc420107280, 0x2b9bd60, 0xc4217ddf40, 0x2b9cba0, 0xc4217ddf60, 0x2b9cba0, ...)

Restarting it.

cpuguy83 (Member):

Same failure:

panic: DockerSuite.TestBuildAddAndCopyFileWithWhitespace test timed out after 5m0s [recovered]

cpuguy83 (Member):

It looks like there are some other problems as well.
Janky failed on the docker-py tests.

powerpc and z failed on TestCommitHardlink

Details
19:43:33 FAIL: docker_cli_commit_test.go:67: DockerSuite.TestCommitHardlink
19:43:33 
19:43:33 docker_cli_commit_test.go:84:
19:43:33     c.Assert(chunks[1], checker.Contains, chunks[0], check.Commentf("Failed to create hardlink in a container. Expected to find %q in %q", inode, chunks[1:]))
19:43:33 ... obtained string = "\x1b[0;0mfile1\x1b[0m  1466359 \x1b[0;0mfile2\x1b[0m"
19:43:33 ... substring string = "1466358 "
19:43:33 ... Failed to create hardlink in a container. Expected to find "1466358" in ["\x1b[0;0mfile1\x1b[0m  1466359 \x1b[0;0mfile2\x1b[0m"]

cpuguy83 (Member):

These errors are likely showing up now because CI runs on VFS.

@sargun sargun force-pushed the vfs-use-copy_file_range branch from 65222b7 to 45d7b26 on November 20, 2017 at 20:27

sargun commented Nov 20, 2017

@cpuguy83 It seems like the hardlink was creating the issue. I can break the hardlink fix into its own separate commit if you want, but otherwise, things look good to me.

cpuguy83 (Member):

Separate commit may be good.
Can we add a unit test for hardlinks as well?


sargun commented Nov 21, 2017

There is a test for hard links.

cpuguy83 (Member):

Looks good.
One more nit (outside of splitting the fix for the dir copy hardlinks) is our use of syscall instead of golang.org/x/sys/unix.

@sargun sargun closed this Nov 21, 2017
@sargun sargun force-pushed the vfs-use-copy_file_range branch from 45d7b26 to ab90bc2 on November 21, 2017 at 18:25

sargun commented Nov 21, 2017

@cpuguy83 I made the test use golang.org/x/sys/unix, but the old code in copy.go is still using syscall.

@sargun sargun reopened this Nov 21, 2017
@sargun sargun force-pushed the vfs-use-copy_file_range branch 2 times, most recently from 0c48441 to a5e7681, on November 21, 2017 at 18:49

sargun commented Nov 21, 2017

I don't see the Windows build being scheduled?

thaJeztah (Member):

Let me restart it; there was an issue with Jenkins earlier.


sargun commented Nov 21, 2017

Finally. @cpuguy83 Split the commit. Changed the tests. Should be good to go.


sargun commented Nov 22, 2017

Some benchmarks:

reflinks:
time docker create centos:latest echo
3dd564d5823994ed54f8e2d646210b4aebcc7622fb6e35d353483057f9d4dd23

real    0m1.154s
user    0m0.008s
sys     0m0.004s

copy_file_range, no reflinks:
time docker create centos:latest echo
737917a226317df9a2f4f7c46d9ed1e83e32a16f4f827eee5e5730f29e61e441

real    0m1.147s
user    0m0.016s
sys     0m0.004s

legacy:
time docker create centos:latest echo
bd925697d2067a6cde175d6a13b4c25f8c23e926422d32303f06e5084647b045

real    0m1.685s
user    0m0.012s
sys     0m0.008s

@sargun sargun force-pushed the vfs-use-copy_file_range branch 2 times, most recently from b40b11a to 5a082b0, on December 1, 2017 at 02:50

sargun commented Dec 1, 2017

It seems like users are no longer able to restart their own Jenkins builds. Is this intentional?

@sargun sargun force-pushed the vfs-use-copy_file_range branch from 5a082b0 to 77ab7c2 on December 1, 2017 at 03:06

sargun commented Dec 1, 2017

If it helps, we could merge in the non-lolspeed version, and then I can do another PR with the concurrency stuff.

cpuguy83 (Member) left a review:

Right now there's a lot of complexity that it seems like we could simplify.

There are currently 3 different concurrency primitives in use. If we can change this to use at most two (e.g. a chan plus a waitgroup to simplify the chan handling), that would be a good start.

Not sure what to do with the linked list... it's probably OK, but you don't often see one in Go code. Can we use a slice instead?


sargun commented Dec 1, 2017

The linked list seems like a better approach than a slice to me, because the slice will require constant resizes / reallocations. I don't see any benefit to using a slice -- if anything, there's a potential perf cost.

RE: concurrency primitives --
Really, it's the waitgroup + chans coordinating the divide and conquer. We could replace some of that with golang.org/x/sync/errgroup (https://godoc.org/golang.org/x/sync/errgroup), but I'm not sure how much better that is.

I don't see a better way to deal with the inode references.


sargun commented Dec 1, 2017

To me, errgroup would be yet another set of primitives.


cpuguy83 commented Dec 1, 2017

There's (multiple) mutex + (multiple) chan + waitgroup.


sargun commented Dec 1, 2017

I mean, I can also break the concurrency code out into a separate module, if that helps. I don't see a particularly good way to actually remove many of these unless we start using interface{} in places, or overloading.

@sargun sargun force-pushed the vfs-use-copy_file_range branch from 77ab7c2 to 9b16a95 on December 1, 2017 at 17:07
Previously, the code would set the mtime on the directories before
creating files in the directory itself. This was problematic
because it resulted in the mtimes on the directories being
incorrectly set. This change makes it so that the mtime is
set only _after_ all of the files have been created.

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
@sargun sargun force-pushed the vfs-use-copy_file_range branch from 9b16a95 to 77a2bc3 on December 1, 2017 at 17:12
@sargun sargun changed the title from "Have VFS graphdriver use accelerated in-kernel copy; Make accelerated in-kernel copy fast" to "Have VFS graphdriver use accelerated in-kernel copy" on Dec 1, 2017
cpuguy83 (Member) left a review:

LGTM


sargun commented Dec 4, 2017

DockerSwarmSuite.TestSwarmLockUnlockCluster failed. Can someone restart janky?


8 participants