Skip to content

Conversation

@apostasie
Copy link

@apostasie apostasie commented Mar 31, 2025

nerdctl has struggled with a large number of occurrences of a racy issue, documented in: containerd/nerdctl#3513

more specifically in the context of a (simple) call to Convert:
containerd/nerdctl#3509

where an image we just created would sometimes disappear immediately.

It looks to me like this here is the root cause?

cc @AkihiroSuda and @lingdie

@k8s-ci-robot
Copy link

Hi @apostasie. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

var res images.Image
if dstRef != srcRef {
_ = is.Delete(ctx, dstRef)
_ = is.Delete(ctx, dstRef, images.SynchronousDelete())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is dstRef getting deleted here at all?

If there is a race, then synchronous delete would not be the correct solution. GC can happen at any type and if the results of GC changes expected state, that means that references aren't being properly held. References should be kept using a lease or an image reference.

What are the resources getting removed? Why would ensuring those resources are deleted before taking a new reference help with a race condition?

Copy link
Author

@apostasie apostasie Mar 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is dstRef getting deleted here at all?

I am not the author of that code.
I assume Create will fail if the image already exist?

If there is a race

There is clearly a race.

References should be kept using a lease or an image reference.

Convert gets a lease then returns an image ref that we hold on, right?

What are the resources getting removed?

The destination image is getting removed.

Why would ensuring those resources are deleted before taking a new reference help with a race condition?

Are you suggesting that async Delete will not do anything at all?

Anyhow, if you believe the existing code is fine, that's ok, I will just close.
LMK.

Copy link
Author

@apostasie apostasie Apr 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a reproducer if you want to investigate this on your own

func main() {
	const dockerContainerdaddress = "/run/user/501/containerd/containerd.sock"
	ctx := context.Background()
	srcRef := "docker.io/library/busybox:latest"
	dstRef := "somethingelse"

	client, err := containerd.New(dockerContainerdaddress, containerd.WithDefaultNamespace("default"))
	if err != nil {
		fmt.Println("client", err)
		os.Exit(1)
	}

	platMC, _ := platformutil.NewMatchComparer(false, []string{runtime.GOOS + "/" + runtime.GOARCH})

	var result *images.Image
	result, err = converter.Convert(ctx, client, dstRef, srcRef, converter.WithPlatform(platMC))
	if err != nil {
		fmt.Println("convert", err)
		os.Exit(1)
	}

	_, _, _, missing, err := images.Check(ctx, client.ContentStore(), result.Target, platMC)
	if err != nil {
		fmt.Println("check", err)
		os.Exit(1)
	}

	fmt.Println("miss", missing)
}

Add an extra sleep in converter.Convert to simulate pressure:

		_ = is.Delete(ctx, dstRef)
+		time.Sleep(5 * time.Second)
		res, err = is.Create(ctx, dstImg)

Now:

nerdctl rmi -f $(nerdctl images -q)
nerdctl pull busybox
go run main.go
nerdctl images
go run main.go
nerdctl images

First go run:

miss []
REPOSITORY       TAG       IMAGE ID        CREATED                   PLATFORM       SIZE       BLOB SIZE
somethingelse    latest    901847821d9c    Less than a second ago    linux/arm64    4.174MB    1.846MB
busybox          latest    37f7b378a29c    5 seconds ago             linux/arm64    4.174MB    1.846MB

Second go run:

miss [{application/vnd.oci.image.index.v1+json sha256:901847821d9c85076405fcbd24c04c2a4860c542a8831a4b56ac66aea50a09d3 755 [] map[] [] <nil> }]
WARN[0000] content digest sha256:901847821d9c85076405fcbd24c04c2a4860c542a8831a4b56ac66aea50a09d3: not found
REPOSITORY    TAG       IMAGE ID        CREATED           PLATFORM       SIZE       BLOB SIZE
busybox       latest    37f7b378a29c    10 seconds ago    linux/arm64    4.174MB    1.846MB

My sync-delete patch actually is not helping for that specific reproducer, so, updated to just remove Delete entirely which seems to work.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whether or not there is an underlying issue somewhere related to Delete is anyone's guess I suppose?
Anyhow, not particularly interested in looking into it.
I just need Convert to work.

Signed-off-by: apostasie <spam_blackhole@farcloser.world>
@jirimoravcik
Copy link

Hello, is there any progress on this PR? Can I do anything to help? Thank you

@apostasie
Copy link
Author

Hello, is there any progress on this PR? Can I do anything to help? Thank you

Hard to tell.

Underlying issue is still there in containerd - though we did ensure nerdctl no longer relies on this codepath to avoid it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: Needs Triage

Development

Successfully merging this pull request may close these issues.

4 participants