Skip to content

Conversation

@henry118
Copy link
Member

@henry118 henry118 commented Sep 25, 2025

Implement #8881 based on the discussions in the thread.

New config

Add max_concurrent_unpacks config to transfer service. If this value is >1, parallel unpacking feature could be enabled if the snapshotter has "rebase" capability.

[plugins]
  [plugins.'io.containerd.transfer.v1.local']
    max_concurrent_downloads = 20
    concurrent_layer_fetch_buffer = 16777216
    max_concurrent_unpacks = 5   <--- NEW!

New snapshotter capability

Added rebase capability to snapshotters. A snapshotter can declare this capability to allow parallel unpacking. The overlayfs snapshotter is updated in this PR to announce this capability.

The "rebase" is performed in the Commit snapshotter API. The caller will pass in the WithParent snapshot option. The snapshotter can rebase the current snapshot onto the parent during the commit operation.

This rebase capability requires a update to the API protobuf to add a parent field into the CommitSnapshotRequest message.

Updated unpacker logic

The existing doUnpackFn is split to two halves.

  • The top half performs the regular snapshot preparing and layer fetching. Instead of "applying" the layer contents synchronously, the updated logic does it in a go routine.
  • The bottom half performs the snapshot commit depending on the result of top half. There are two modes:
    • Sequential: when the feature is off. It commits each snapshot before moving to the next layer.
    • Parallel: the new logic collects the top half's result and immediately moves to the next layer. At the end, commits are performed sequentially from the collected results.
* **Parallel Unpack**  ([#12332](https://github.com/containerd/containerd/pull/12332))

  Adds support for unpacking layers in parallel during pull operations. This feature is supported with overlayfs and EROFS snapshotters.

@dmcgowan
Copy link
Member

This would need to be handled in the proxied snapshotters as well, in core/snapshots/proxy and core/metadata/snapshots.go

@henry118 henry118 force-pushed the parallel-unpack branch 4 times, most recently from e885452 to 50b6dea Compare October 9, 2025 20:17
@dmcgowan dmcgowan moved this from Needs Triage to Needs Reviewers in Pull Request Review Oct 15, 2025
@dmcgowan dmcgowan added this to the 2.2 milestone Oct 15, 2025
Copy link
Member

@dmcgowan dmcgowan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One little change, I still want to take another look through the unpack code, it was dense before and probably more so now. The core unpack code has always been like that though, but the other parts look good.

@dmcgowan dmcgowan moved this from Needs Reviewers to Review In Progress in Pull Request Review Oct 17, 2025
@dmcgowan dmcgowan moved this from Review In Progress to Needs Update in Pull Request Review Oct 17, 2025
@henry118
Copy link
Member Author

One little change, I still want to take another look through the unpack code, it was dense before and probably more so now. The core unpack code has always been like that though, but the other parts look good.

Yep the unpack code isn't trivial to follow or make changes to. When I am working on this patch my primary focus was keeping the existing logic without too much refactoring.

@henry118 henry118 force-pushed the parallel-unpack branch 2 times, most recently from f1660de to 9da85bf Compare October 18, 2025 06:56
@dmcgowan dmcgowan moved this from Needs Update to Needs Reviewers in Pull Request Review Oct 20, 2025
errs = errors.Join(errs, err)
}
}
if errs != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this return errs and not err?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

}

if info.Parent != "" && base.Parent != "" && base.Parent != info.Parent {
return fmt.Errorf("unabled to change parent of snapshot %q from %q to %q", key, info.Parent, base.Parent)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo unable


// If the snapshot didn't have a parent when created, allow it
// to be rebased on a parent on commit.
if si.Parent == "" && si.Parent != base.Parent {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the logic would be better as first checking si.Parent != base.Parent. If si.Parent is empty, set it, otherwise, we should probably throw an invalid argument error.

statusCh, err := topHalf(i, desc, layerSpan, unpackLayerStart)
if err != nil {
if parallel {
break
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we close out the status channels here or will the context cancellation propagate?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The context cancellation will propagate and cause all goroutines created by top halves to return. On return the defer func will close the status channels.

@dmcgowan dmcgowan moved this from Needs Reviewers to Needs Update in Pull Request Review Oct 24, 2025
Signed-off-by: Henry Wang <henwang@amazon.com>
return err
}

if len(info.Parent) > 0 && len(base.Parent) > 0 && info.Parent != base.Parent {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need to make the change in this PR, but I think we could just remove this logic now and let the storage error handle this. This also means there is no other change needed in other snapshotters other than to set the capability.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. I can change it in a follow-up PR.

@github-project-automation github-project-automation bot moved this from Needs Update to Review In Progress in Pull Request Review Oct 24, 2025
@dmcgowan dmcgowan added this pull request to the merge queue Oct 24, 2025
@dmcgowan dmcgowan changed the title Implement parallel unpack Parallel unpack Oct 24, 2025
@dmcgowan dmcgowan moved this from Review In Progress to Merge on Green in Pull Request Review Oct 24, 2025
Merged via the queue into containerd:main with commit 2e747a0 Oct 24, 2025
94 of 96 checks passed
@github-project-automation github-project-automation bot moved this from Merge on Green to Done in Pull Request Review Oct 24, 2025
@dmcgowan dmcgowan added area/distribution Image Distribution and removed area/runtime Runtime labels Oct 24, 2025
@dmcgowan dmcgowan changed the title Parallel unpack Add parallel unpack support Oct 24, 2025
@henry118 henry118 deleted the parallel-unpack branch October 24, 2025 21:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

4 participants