Skip to content

Conversation

@TBBle
Copy link
Contributor

@TBBle TBBle commented Jul 17, 2020

This builds on #4395 to implement the Compare side of the differ and similarly to Apply, adds a redirect for WCOW to archive.WriteDiff.

With this change, (and a bunch of buildkit hacks), I can now export an image I built with buildkit to any of the types oci, docker, or image.

Fixes: #4394

@theopenlab-ci
Copy link

theopenlab-ci bot commented Jul 17, 2020

Build succeeded.

@TBBle TBBle force-pushed the wcow_compare_layers_to_tar branch 2 times, most recently from f455a3e to c5310ee Compare July 17, 2020 18:38
@TBBle
Copy link
Contributor Author

TBBle commented Jul 17, 2020

I've speculatively enabled the snapshot tests on Windows, in case I've fixed them as part of this. If that's not the case, I'll back that enablement out for later.

Edit: Yup, a mix of complaints from hcsshim and failures due to permissions issues, plus error "unknown" from the Basic test, means this needs more work to support the snapshot test suite. Hopefully the work is in the suite, not the code...

@theopenlab-ci
Copy link

theopenlab-ci bot commented Jul 17, 2020

Build succeeded.

@TBBle
Copy link
Contributor Author

TBBle commented Jul 17, 2020

There's a bunch of refactoring opportunities I saw from this work.

Particularly, it seems that the APIs for archive.WriteDiff and archive.Apply are somewhat nonsensical when applied to windows layers, as we use the Opts to smuggle the real parameters through them, and ignore the code in favour of complete Windows-specific implementations. I think these two methods might need more insight into the thing they're being called on, to actually do their job without so much hand-holding.

Apart from tests, WriteDiff only appears to be used from the walkingDiff, and windowsDiff implementations of Compare, and for taking task checkpoints. The latter is always diffing against "", and does not interact with Windows Container layers.

Apply is used more-widely, but again only windowsDiff knows about Windows Container layers.

I suspect it would be cleaner to not involve archive in the Windows Container layers, but to directly implement those functions in windowsDiff. Particularly if we lean more on hcsshim and go-winio for the actual implementations.

@TBBle TBBle force-pushed the wcow_compare_layers_to_tar branch from c5310ee to 57f840f Compare July 17, 2020 18:56
@theopenlab-ci
Copy link

theopenlab-ci bot commented Jul 17, 2020

Build succeeded.

@TBBle TBBle force-pushed the wcow_compare_layers_to_tar branch from 57f840f to caf6dd5 Compare July 17, 2020 19:18
@theopenlab-ci
Copy link

theopenlab-ci bot commented Jul 17, 2020

Build succeeded.

@TBBle TBBle force-pushed the wcow_compare_layers_to_tar branch from caf6dd5 to 69f5446 Compare July 17, 2020 19:32
@theopenlab-ci
Copy link

theopenlab-ci bot commented Jul 17, 2020

Build succeeded.

@jterry75
Copy link
Contributor

@kevpar / @jstarks - PTAL

@jstarks
Copy link
Contributor

jstarks commented Jul 17, 2020

@ambarve

@theopenlab-ci
Copy link

theopenlab-ci bot commented Jul 21, 2020

Build succeeded.

@jterry75
Copy link
Contributor

@ambarve - PTAL

@TBBle TBBle force-pushed the wcow_compare_layers_to_tar branch from b4d60e0 to d0b60fb Compare July 28, 2020 13:14
@theopenlab-ci
Copy link

theopenlab-ci bot commented Jul 28, 2020

Build succeeded.

@TBBle TBBle marked this pull request as draft September 9, 2020 02:33
@TBBle
Copy link
Contributor Author

TBBle commented Sep 9, 2020

Marked as draft pending rebase on updated #4395, which will remove the winio_tar aspect of this change.

Edit: Done

@TBBle TBBle force-pushed the wcow_compare_layers_to_tar branch from d0b60fb to e648165 Compare September 9, 2020 11:35
@TBBle TBBle marked this pull request as ready for review September 9, 2020 11:39
@theopenlab-ci
Copy link

theopenlab-ci bot commented Apr 5, 2021

Build succeeded.

@TBBle
Copy link
Contributor Author

TBBle commented Apr 5, 2021

Hmm. Rebasing showed a random bunch of HTTP-related failures. The Windows Integration run showed:

2021-04-05T04:02:48.2118915Z     convert_test.go:50: failed to copy: stream error: stream ID 9; HTTP_1_1_REQUIRED
2021-04-05T04:02:48.2120056Z --- FAIL: TestConvert (5.78s)

which is clearly unrelated to my changes here. The Linux suite are all failing on CRI tests, getting '403 forbidden' from GCR for some image or images. We're seeing the same thing (on Linux) on main-branch HEAD, so definitely unrelated.

We'll see if the Windows issue goes away when master is working and this is rebased again. Edit: The PRs that depend on this one (#4415, #4419) passed Windows CI, so I am pretty confident this failure isn't to do with this PR.

@dmcgowan dmcgowan added the platform/windows Windows label Apr 5, 2021
@jterry75
Copy link
Contributor

jterry75 commented Apr 5, 2021

@TBBle - Should we just re-run the tests are are you going ot rebase on some master change that is clean?

@TBBle
Copy link
Contributor Author

TBBle commented Apr 5, 2021

I can't retrigger the tests AFAIK, but if you can, that'd be fine. I had a poke around, and the failures looked more like infrastructure issues than code issues, so I am not specifically waiting on a fix to rebase on-to.

I only rebased this time because I needed to rebase #4419 to verify a change, and figured I'd do the whole stack to keep my git show-branches output rational.

@AkihiroSuda AkihiroSuda requested review from dmcgowan and fuweid April 6, 2021 06:24
Copy link
Member

@fuweid fuweid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm on green

@TBBle TBBle force-pushed the wcow_compare_layers_to_tar branch 2 times, most recently from 1587b58 to bea85f2 Compare April 7, 2021 09:44
@theopenlab-ci
Copy link

theopenlab-ci bot commented Apr 7, 2021

Build succeeded.

@TBBle
Copy link
Contributor Author

TBBle commented Apr 7, 2021

Gah. The rebase to current mainline branch passed: https://github.com/containerd/containerd/actions/runs/725556015
And then a comment-only fix failed in unrelated tests on Windows: https://github.com/containerd/containerd/actions/runs/725563341

2021-04-07T10:00:33.8681308Z === CONT  TestContainerKill
2021-04-07T10:00:33.8694170Z     container_test.go:620: failed to mount container storage: hcsshim::PrepareLayer - failed failed in Win32: The system cannot find the path specified. (0x3): unknown
2021-04-07T10:00:33.9335940Z --- FAIL: TestContainerKill (2.32s)
2021-04-07T10:01:33.7263019Z === CONT  TestContainerNoBinaryExists
2021-04-07T10:01:33.7264655Z     container_test.go:759: failed to create task failed to mount container storage: hcsshim::PrepareLayer - failed failed in Win32: The device is not ready. (0x15): unknown
2021-04-07T10:01:33.7552071Z --- FAIL: TestContainerNoBinaryExists (62.14s)

It's green-enough for me, since this PR doesn't actually change any of the on-disk layer data, and these look suspiciously like the HCS-was-busy-and-reported-success-too-early issue I speculated in #4419.

Notably, the two tests this PR enables, TestExport and TestExportAndImport, had passed before this point in the run

2021-04-07T09:59:53.0784565Z --- PASS: TestExport (5.86s)
2021-04-07T10:00:30.3192956Z --- PASS: TestExportAndImport (15.25s)

@jterry75
Copy link
Contributor

jterry75 commented Apr 7, 2021

@TBBle - This looks to be a longstanding bug with the containerd shim here: https://github.com/microsoft/hcsshim/blob/master/cmd/containerd-shim-runhcs-v1/delete.go#L29. The code properly handles the vm/container not found but I believe unintentionally re-uses the err defined when the span status is set here: https://github.com/microsoft/hcsshim/blob/master/cmd/containerd-shim-runhcs-v1/delete.go#L36. I believe this is truly unrelated to your change and will sign off.

Thanks!

@ambarve - Would you mind taking the bug here. The shim should not return an error on the span if that error would be handled. Easiest way would be to rename the span error used and only assign in the case we actually write an error to stderr via a defer statement. But you can come up with whatever logic you like

Copy link
Contributor

@jterry75 jterry75 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jterry75
Copy link
Contributor

jterry75 commented Apr 7, 2021

@fuweid - I looked at the Windows error and agree this is unrelated to this change. Are you ok taking it with the transient error while @ambarve can get a fix out?

@fuweid
Copy link
Member

fuweid commented Apr 9, 2021

@TBBle would you mind to rebase? Your change might be not related to the error. Please rebase and retry it again. Thanks

TBBle added 4 commits April 10, 2021 02:02
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
This parallels the implementation of windowsDiff.Apply, including
bouncing very briefly though archive.WriteDiff and then straight back
out into Windows-specific code.

It's mostly pulling existing mechanisms from non-Windows Compare or
Windows Apply, and highlights that there's probably a lot of scope for
refactoring on top of this.

Now the export-related integration tests pass CI on Windows.

Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
@TBBle TBBle force-pushed the wcow_compare_layers_to_tar branch from bea85f2 to c750498 Compare April 9, 2021 16:02
@theopenlab-ci
Copy link

theopenlab-ci bot commented Apr 9, 2021

Build succeeded.

@jterry75
Copy link
Contributor

jterry75 commented Apr 9, 2021

All green!

Copy link
Member

@fuweid fuweid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jterry75 jterry75 merged commit d4fbff1 into containerd:master Apr 12, 2021
@TBBle TBBle deleted the wcow_compare_layers_to_tar branch April 13, 2021 10:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Windows (WCOW) implementation of diff.Comparer

10 participants