shim: SandboxPlatform validation fails when containerd sets SandboxIsolation without SandboxPlatform #2619

@rzlink

Bug Report

Summary

The sandbox platform validation added in PR #2473 (createInternal() in service_internal.go) fails when containerd's default runtime configuration sets SandboxIsolation=1 (HYPERVISOR) without setting SandboxPlatform. This breaks all Hyper-V isolated Windows containers on containerd v2.2.1+.

Error

FailedCreatePodSandBox: failed to create shim task: invalid runtime sandbox platform: 
"" is an invalid OS component of "": OSAndVersion specifier component must match 
"^([A-Za-z0-9_-]+)(?:\\(([A-Za-z0-9_.-]*)\\))?$": invalid argument

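The pattern quoted in the error message can be checked in isolation to see why an empty string is rejected. The sketch below uses only Go's standard `regexp` package, not the actual parsing code in the shim; the helper name `validOSComponent` is hypothetical. The `+` quantifier on the first group requires at least one character, so `""` can never match.

```go
package main

import (
	"fmt"
	"regexp"
)

// osAndVersion is the pattern quoted in the error message, with the
// doubled backslashes from the quoted log output collapsed.
var osAndVersion = regexp.MustCompile(`^([A-Za-z0-9_-]+)(?:\(([A-Za-z0-9_.-]*)\))?$`)

// validOSComponent is a hypothetical helper that reports whether s
// matches the OS-and-version pattern.
func validOSComponent(s string) bool {
	return osAndVersion.MatchString(s)
}

func main() {
	for _, s := range []string{"", "windows", "windows(10.0.20348)"} {
		fmt.Printf("%q -> %v\n", s, validOSComponent(s))
	}
}
```

Only the empty string fails here, which is consistent with the validation error being triggered by an unset SandboxPlatform.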
Root Cause

Two components interact to cause this bug:

1. containerd config_windows.go — incomplete runtime handler defaults

containerd's built-in defaults for the runhcs-wcow-hypervisor runtime handler set SandboxIsolation: 1 but omit SandboxPlatform:

"runhcs-wcow-hypervisor": {
    Type: "io.containerd.runhcs.v1",
    Options: map[string]interface{}{
        "SandboxIsolation": 1,
        // SandboxPlatform is NOT set
    },
},

This makes the runtime options non-empty (proto.Equal(shimOpts, &runhcsopts.Options{}) returns false), since SandboxIsolation is set.
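A minimal sketch of why a single field is enough to flip the emptiness check. It uses a hypothetical stand-in struct rather than the real runhcsopts.Options protobuf message, and plain struct equality in place of proto.Equal; the names `Options` and `isEmpty` are assumptions for illustration only.

```go
package main

import "fmt"

// Options is a hypothetical stand-in for runhcsopts.Options, reduced to
// the two fields discussed in this issue.
type Options struct {
	SandboxIsolation int32
	SandboxPlatform  string
}

// isEmpty reports whether opts equals the zero value. It stands in for
// proto.Equal(shimOpts, &runhcsopts.Options{}).
func isEmpty(opts Options) bool {
	return opts == (Options{})
}

func main() {
	// Mirrors the handler defaults: isolation set, platform left unset.
	defaults := Options{SandboxIsolation: 1}

	fmt.Println(isEmpty(Options{})) // true
	fmt.Println(isEmpty(defaults))  // false
}
```

With the defaults shown, the options count as non-empty even though SandboxPlatform is still the empty string.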

2. hcsshim PR #2473 — strict validation assumes SandboxPlatform is always set when options are non-empty

The emptyShimOpts check:

emptyShimOpts := req.Options == nil || proto.Equal(shimOpts, &runhcsopts.Options{})

When emptyShimOpts == false, the code unconditionally validates SandboxPlatform:

if !emptyShimOpts {
    plat, err := platforms.Parse(shimOpts.GetSandboxPlatform()) // fails: "" is not a valid platform
    // ... existing validation continues (elided)
}

However, when emptyShimOpts == true, the shim correctly infers the platform from the OCI spec without needing SandboxPlatform. The inference logic already exists; it is just not reachable when the options are non-empty.

Affected Versions

| Component | Version | Status |
|---|---|---|
| hcsshim | v0.14.0-rc.1+ (includes PR #2473) | Affected |
| containerd | v2.2.1+ (bundles hcsshim v0.14.0-rc.1) | Affected |
| containerd | v2.1.x (bundles hcsshim v0.13.0) | Not affected (no validation) |

Reproduction

  1. Use stock containerd v2.2.1 with default config (no custom runtime handler options)
  2. Create a pod with runtimeClassName: runhcs-wcow-hypervisor
  3. Pod fails with the error above

Suggested Fix

When runtime options are non-empty but SandboxPlatform is empty, infer the platform from the OCI spec rather than failing:

if !emptyShimOpts {
    sandboxPlatform := shimOpts.GetSandboxPlatform()
    if sandboxPlatform == "" {
        if oci.IsLCOW(&spec) {
            sandboxPlatform = "linux/" + runtime.GOARCH
        } else if oci.IsWCOW(&spec) {
            sandboxPlatform = "windows/" + runtime.GOARCH
        } else {
            return nil, fmt.Errorf("cannot infer sandbox platform from OCI spec")
        }
        shimOpts.SandboxPlatform = sandboxPlatform
    }
    plat, err := platforms.Parse(sandboxPlatform)
    // ... existing validation continues
}

This mirrors the existing behavior when options are entirely empty (the shim already infers platform from the spec in that case).
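The fallback can also be sketched as a standalone function that is testable without hcsshim. This is an assumption-laden illustration, not the shim's code: `resolveSandboxPlatform` is a hypothetical helper, and the two booleans stand in for the results of oci.IsLCOW and oci.IsWCOW on the spec.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
)

// resolveSandboxPlatform is a hypothetical helper. It returns the
// configured platform when one is set, and otherwise infers it from the
// kind of sandbox (booleans stand in for oci.IsLCOW / oci.IsWCOW).
func resolveSandboxPlatform(configured string, isLCOW, isWCOW bool) (string, error) {
	if configured != "" {
		return configured, nil
	}
	switch {
	case isLCOW:
		return "linux/" + runtime.GOARCH, nil
	case isWCOW:
		return "windows/" + runtime.GOARCH, nil
	default:
		return "", errors.New("cannot infer sandbox platform from OCI spec")
	}
}

func main() {
	// SandboxIsolation is set but SandboxPlatform is empty, as with the
	// default runhcs-wcow-hypervisor handler.
	plat, err := resolveSandboxPlatform("", false, true)
	fmt.Println(plat, err)
}
```

Keeping the inference in one function would let the empty-options path and the non-empty-options path share the same behavior, though how the shim actually structures this is up to the maintainers.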

Environment

  • Windows Server 2022 (build 10.0.20348)
  • Kubernetes v1.33+
  • containerd v2.2.1
  • CAPZ (Cluster API Provider Azure) clusters

/cc @helsaawy
