Skip to content

Conversation

@vvoland
Copy link
Contributor

@vvoland vvoland commented Nov 12, 2024

- What I did

  • Changed the on-disk representation of the container to store the full OCI platform and not just operating system
  • Expose the ImageManifestDescriptor when the containerd image store is enabled

- How I did it

- How to verify it
TestInspectImageManifestPlatform

- Description for the changelog

containerd image store: `GET /containers/{name}/json` now returns an `ImageManifestDescriptor` field containing the OCI descriptor of the platform-specific image manifest of the image that was used to create the container.

- A picture of a cute animal (not mandatory but encouraged)

@vvoland vvoland added area/api API status/2-code-review kind/enhancement Enhancements are not bugs or new features but can improve usability or performance. impact/changelog containerd-integration Issues and PRs related to containerd integration labels Nov 12, 2024
@vvoland vvoland added this to the 28.0.0 milestone Nov 12, 2024
@vvoland vvoland self-assigned this Nov 12, 2024
@thaJeztah
Copy link
Member

Giving this PR a spin, and it's functioning as described;

docker run -d --name foo --platform=linux/amd64 nginx:alpine
curl -s --unix-socket /var/run/docker.sock http://foo/containers/foo/json | jq .

full output:

Details
{
  "Id": "db001744b9ee69722c328071d1fd0ccecff56ddf803880fa9fd1b2eaadb227f6",
  "Created": "2024-11-12T17:08:03.990974137Z",
  "Path": "/docker-entrypoint.sh",
  "Args": [
    "nginx",
    "-g",
    "daemon off;"
  ],
  "State": {
    "Status": "running",
    "Running": true,
    "Paused": false,
    "Restarting": false,
    "OOMKilled": false,
    "Dead": false,
    "Pid": 1547,
    "ExitCode": 0,
    "Error": "",
    "StartedAt": "2024-11-12T17:08:04.078103137Z",
    "FinishedAt": "0001-01-01T00:00:00Z"
  },
  "Image": "sha256:d44ffb126f474dfbe8b729d4a984d8d798110637560a8db41979461db6f761d2",
  "ResolvConfPath": "/var/lib/docker/containers/db001744b9ee69722c328071d1fd0ccecff56ddf803880fa9fd1b2eaadb227f6/resolv.conf",
  "HostnamePath": "/var/lib/docker/containers/db001744b9ee69722c328071d1fd0ccecff56ddf803880fa9fd1b2eaadb227f6/hostname",
  "HostsPath": "/var/lib/docker/containers/db001744b9ee69722c328071d1fd0ccecff56ddf803880fa9fd1b2eaadb227f6/hosts",
  "LogPath": "/var/lib/docker/containers/db001744b9ee69722c328071d1fd0ccecff56ddf803880fa9fd1b2eaadb227f6/db001744b9ee69722c328071d1fd0ccecff56ddf803880fa9fd1b2eaadb227f6-json.log",
  "Name": "/foo",
  "RestartCount": 0,
  "Driver": "overlayfs",
  "Platform": "linux",
  "MountLabel": "",
  "ProcessLabel": "",
  "AppArmorProfile": "",
  "ExecIDs": null,
  "HostConfig": {
    "Binds": null,
    "ContainerIDFile": "",
    "LogConfig": {
      "Type": "json-file",
      "Config": {}
    },
    "NetworkMode": "bridge",
    "PortBindings": {},
    "RestartPolicy": {
      "Name": "no",
      "MaximumRetryCount": 0
    },
    "AutoRemove": false,
    "VolumeDriver": "",
    "VolumesFrom": null,
    "ConsoleSize": [
      34,
      141
    ],
    "CapAdd": null,
    "CapDrop": null,
    "CgroupnsMode": "host",
    "Dns": [],
    "DnsOptions": [],
    "DnsSearch": [],
    "ExtraHosts": null,
    "GroupAdd": null,
    "IpcMode": "private",
    "Cgroup": "",
    "Links": null,
    "OomScoreAdj": 0,
    "PidMode": "",
    "Privileged": false,
    "PublishAllPorts": false,
    "ReadonlyRootfs": false,
    "SecurityOpt": null,
    "UTSMode": "",
    "UsernsMode": "",
    "ShmSize": 67108864,
    "Runtime": "runc",
    "Isolation": "",
    "CpuShares": 0,
    "Memory": 0,
    "NanoCpus": 0,
    "CgroupParent": "",
    "BlkioWeight": 0,
    "BlkioWeightDevice": [],
    "BlkioDeviceReadBps": [],
    "BlkioDeviceWriteBps": [],
    "BlkioDeviceReadIOps": [],
    "BlkioDeviceWriteIOps": [],
    "CpuPeriod": 0,
    "CpuQuota": 0,
    "CpuRealtimePeriod": 0,
    "CpuRealtimeRuntime": 0,
    "CpusetCpus": "",
    "CpusetMems": "",
    "Devices": [],
    "DeviceCgroupRules": null,
    "DeviceRequests": null,
    "MemoryReservation": 0,
    "MemorySwap": 0,
    "MemorySwappiness": null,
    "OomKillDisable": false,
    "PidsLimit": null,
    "Ulimits": [],
    "CpuCount": 0,
    "CpuPercent": 0,
    "IOMaximumIOps": 0,
    "IOMaximumBandwidth": 0,
    "MaskedPaths": [
      "/proc/asound",
      "/proc/acpi",
      "/proc/kcore",
      "/proc/keys",
      "/proc/latency_stats",
      "/proc/timer_list",
      "/proc/timer_stats",
      "/proc/sched_debug",
      "/proc/scsi",
      "/sys/firmware",
      "/sys/devices/virtual/powercap"
    ],
    "ReadonlyPaths": [
      "/proc/bus",
      "/proc/fs",
      "/proc/irq",
      "/proc/sys",
      "/proc/sysrq-trigger"
    ]
  },
  "GraphDriver": {
    "Data": null,
    "Name": "overlayfs"
  },
  "Mounts": [],
  "Config": {
    "Hostname": "db001744b9ee",
    "Domainname": "",
    "User": "",
    "AttachStdin": false,
    "AttachStdout": false,
    "AttachStderr": false,
    "ExposedPorts": {
      "80/tcp": {}
    },
    "Tty": false,
    "OpenStdin": false,
    "StdinOnce": false,
    "Env": [
      "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
      "NGINX_VERSION=1.27.2",
      "PKG_RELEASE=1",
      "DYNPKG_RELEASE=1",
      "NJS_VERSION=0.8.6",
      "NJS_RELEASE=1"
    ],
    "Cmd": [
      "nginx",
      "-g",
      "daemon off;"
    ],
    "Image": "nginx:alpine",
    "Volumes": null,
    "WorkingDir": "/",
    "Entrypoint": [
      "/docker-entrypoint.sh"
    ],
    "OnBuild": null,
    "Labels": {
      "maintainer": "NGINX Docker Maintainers <docker-maint@nginx.com>"
    },
    "StopSignal": "SIGQUIT"
  },
  "NetworkSettings": {
    "Bridge": "",
    "SandboxID": "ac1581194490fc5b0b00ca28c47b7341f037c0db7808783484787c1f353718f4",
    "SandboxKey": "/var/run/docker/netns/ac1581194490",
    "Ports": {
      "80/tcp": null
    },
    "HairpinMode": false,
    "LinkLocalIPv6Address": "",
    "LinkLocalIPv6PrefixLen": 0,
    "SecondaryIPAddresses": null,
    "SecondaryIPv6Addresses": null,
    "EndpointID": "5fe67c0d4bccf176f46dd665d8449ee07fc1de6af342334c5c95fa4edccd461e",
    "Gateway": "172.18.0.1",
    "GlobalIPv6Address": "",
    "GlobalIPv6PrefixLen": 0,
    "IPAddress": "172.18.0.2",
    "IPPrefixLen": 16,
    "IPv6Gateway": "",
    "MacAddress": "02:42:ac:12:00:02",
    "Networks": {
      "bridge": {
        "IPAMConfig": null,
        "Links": null,
        "Aliases": null,
        "MacAddress": "02:42:ac:12:00:02",
        "DriverOpts": null,
        "NetworkID": "edfedc75d01b0acd19ef9d03c7b96ec58728adf3f080a22213fca90d4a9abcd4",
        "EndpointID": "5fe67c0d4bccf176f46dd665d8449ee07fc1de6af342334c5c95fa4edccd461e",
        "Gateway": "172.18.0.1",
        "IPAddress": "172.18.0.2",
        "IPPrefixLen": 16,
        "IPv6Gateway": "",
        "GlobalIPv6Address": "",
        "GlobalIPv6PrefixLen": 0,
        "DNSNames": null
      }
    }
  },
  "ImageManifest": {
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "digest": "sha256:d213b2a02ef4e7ec85882e8955343cdd08ab49d6548995ad18623f47017c65ee",
    "size": 2498,
    "annotations": {
      "com.docker.official-images.bashbrew.arch": "amd64",
      "org.opencontainers.image.base.digest": "sha256:bbd86dd5c396c076dfb899e207ae75a095c7bd7ff906bca5247e07536b610034",
      "org.opencontainers.image.base.name": "nginx:1.27.2-alpine-slim",
      "org.opencontainers.image.created": "2024-11-12T03:05:02Z",
      "org.opencontainers.image.revision": "6a4c0cb4ac7e53bbbe473df71b61a5bf9f95252f",
      "org.opencontainers.image.source": "https://github.com/nginxinc/docker-nginx.git#6a4c0cb4ac7e53bbbe473df71b61a5bf9f95252f:mainline/alpine",
      "org.opencontainers.image.url": "https://hub.docker.com/_/nginx",
      "org.opencontainers.image.version": "1.27.2-alpine"
    },
    "platform": {
      "architecture": "amd64",
      "os": "linux"
    }
  }
}

Just the Manifest;

curl -s --unix-socket /var/run/docker.sock http://foo/containers/foo/json | jq .ImageManifest
{
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "digest": "sha256:d213b2a02ef4e7ec85882e8955343cdd08ab49d6548995ad18623f47017c65ee",
  "size": 2498,
  "annotations": {
    "com.docker.official-images.bashbrew.arch": "amd64",
    "org.opencontainers.image.base.digest": "sha256:bbd86dd5c396c076dfb899e207ae75a095c7bd7ff906bca5247e07536b610034",
    "org.opencontainers.image.base.name": "nginx:1.27.2-alpine-slim",
    "org.opencontainers.image.created": "2024-11-12T03:05:02Z",
    "org.opencontainers.image.revision": "6a4c0cb4ac7e53bbbe473df71b61a5bf9f95252f",
    "org.opencontainers.image.source": "https://github.com/nginxinc/docker-nginx.git#6a4c0cb4ac7e53bbbe473df71b61a5bf9f95252f:mainline/alpine",
    "org.opencontainers.image.url": "https://hub.docker.com/_/nginx",
    "org.opencontainers.image.version": "1.27.2-alpine"
  },
  "platform": {
    "architecture": "amd64",
    "os": "linux"
  }
}

One thing worth mentioning, and perhaps that's something we need to discuss; currently this will include the manifest of the variant that's used. That's expected, and that DOES make sense because, after all, that's the image in use, and also what would show up as "in-use" when using docker image ls --tree;

docker pull --quiet nginx:alpine

docker image ls --tree
IMAGE                   ID             DISK USAGE   CONTENT SIZE   USED
nginx:alpine            d44ffb126f47        156MB         44.3MB    ✔
├─ linux/amd64          d213b2a02ef4       79.7MB         22.8MB    ✔
├─ linux/arm/v6         ae1ee4b63c14           0B             0B
├─ linux/arm/v7         20ad73eca911           0B             0B
├─ linux/arm64/v8       d1f949a77b81       76.7MB         21.5MB
├─ linux/386            99e1c0714fa0           0B             0B
├─ linux/ppc64le        7fef8bcf8b6c           0B             0B
└─ linux/s390x          8c310bf29cfa           0B             0B

👉 ❓ Question is if there'd be a need to know both the variant used, and the index that was resolved. Because when viewing images in non tree view, that's the digest that would show;

docker image ls
REPOSITORY   TAG       IMAGE ID       CREATED       SIZE
nginx        alpine    d44ffb126f47   5 weeks ago   156MB

I could see use-cases where it may be relevant to know what "image" (the index) is something that's of interest, and it's not really possible to resolve that, because the same variant can be part of multiple indexes.

@thaJeztah
Copy link
Member

Let me also pull in @ctalledo @ctalledo @cdupuis - in case they have specific needs around this.

@vvoland
Copy link
Contributor Author

vvoland commented Nov 13, 2024

There's already Image field that references the full root image. Unfortunately it's only a digest and not a full descriptor.

We could:

  1. Add additional ResolvedImage field that would expose a descriptor of the image
  • and possibly deprecate Image? 🤔
  1. Rely on a separate docker image inspect call to get the root image descriptor

@thaJeztah
Copy link
Member

There's already Image field that references the full root image. Unfortunately it's only a digest and not a full descriptor.

Ah, you're right! I had in the back of my mind that there was some field we updated when removing the image (and update image name -> digest), but I guess that's not this field, but probably the one in .Config.Image;

docker container inspect --format '{{.Config.Image}}' test
nginx:alpine

docker container inspect --format '{{.Image}}' test
sha256:2140dad235c130ac861018a4e13a6bc8aea3a35f3a40e20c1b060d51a7efd250

So, yeah, maybe we have all we need already. I was mostly looking "is there anything we may be missing, because as we're making changes, now may be the time to make them".

Current changes LGTM, so I'd be happy to bring it in, mostly looking for input to verify if we're all good.

@vvoland vvoland force-pushed the c8d-inspect-imagemanifest branch 5 times, most recently from efc0b12 to c5ddeeb Compare November 15, 2024 09:27
@vvoland vvoland changed the title c8d/container/inspect: Return ImageManifest c8d/container/inspect: Return ImageManifestDescriptor Nov 15, 2024
Copy link
Member

@laurazard laurazard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a suggestion/nit for a comment on the platform deduction logic.

@vvoland vvoland force-pushed the c8d-inspect-imagemanifest branch 2 times, most recently from 9d71e8f to a12d176 Compare November 15, 2024 14:00
Comment on lines 89 to 91
// Deprecated: use [ImagePlatform.OS] instead
OS string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, this is a silly-question, because we have too many Container structs in our code, so I lost a bit which one's used for what (note to self: we should really improve GoDoc for these types to not only describe "what's in them", but also what their purpose is).

I think this Container type is purely for storing state on-disk, correct? So preserving this field for backward-compatibility is backward-compatibility with existing state on-disk (up/downgrade dockerd), and this type is not returned over the API (so no compatibility issue there).

This PR adds a migration step during restore, so state should be migrated, which means that we could probably draw a roadmap when it's time to remove (1 .. 2 versions?). And it may be good to leave a comment in places where that's relevant.

I've ran into "this field is kept for backward compatibility" comments on various locations where it's no longer clear "... erm.. backward compatible with what? is this still relevant?", and such code never being removed because nobody knows 😂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this Container type is purely for storing state on-disk, correct? So preserving this field for backward-compatibility is backward-compatibility with existing state on-disk (up/downgrade dockerd), and this type is not returned over the API (so no compatibility issue there).

Correct.

This PR adds a migration step during restore, so state should be migrated, which means that we could probably draw a roadmap when it's time to remove (1 .. 2 versions?). And it may be good to leave a comment in places where that's relevant.

Good point, I'll open a tracking ticket for that.

Not sure if we can remove it that quickly though. If we remove it in eg. v30, the migration won't work for a pre-v28 daemons that skip the intermediate versions and update to v30+ right away.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened #48892 and mentioned it in a TODO comment.

Comment on lines +460 to 462
if cntr.ImagePlatform.OS != runtime.GOOS {
return []string{"/bin/sh", "-c"}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reminds me I still need to dig into some of these, I think there's still some LCOW leftovers dangling around, and there may be some broken logic in some places (#42170)

Comment on lines 40 to 42
// deduceContainerPlatform tries to deduce `ctr`'s platform.
// If both `ctr.OS` and `ctr.ImageManifest` are empty, it's not possible to deduce
// the container platform and an error is returned.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if we still need to bother with that (probably-not), but ISTR that because pre-historic images didn't have platform (there was only Linux, and there was only x86), some codepaths treated "if nothing present, it must be linux/amd64).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pre-historic image case is already handled by container.FromDisk directly where it just defaults to host platform (previous behavior was to default to host OS).

In practice it shouldn't really matter as this branch of deduceContainerPlatform shouldn't be reached at all.
I'll adjust the comment though as it's wrong 😅 In this case it explicitly returns a default platform to match the pre-historic image behavior.

var errs []error

isValidPlatform := func(p ocispec.Platform) bool {
return p.OS != "" && p.Architecture != ""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silly question; should we consider only an OS be "valid"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it should be valid - the architecture field is required by the OCI spec:

  • architecture string, REQUIRED

The CPU architecture which the binaries in this image are built to run on.
Configurations SHOULD use, and implementations SHOULD understand, values listed in the Go Language document for [GOARCH][go-environment].
(https://github.com/opencontainers/image-spec/blob/main/config.md#properties)

if err != nil {
errs = append(errs, err)
} else {
if isValidPlatform(p) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just occurred to me that ReadPlatformFromConfigByImageManifest above should already return an error if either OS or Architecture is empty;

	if cfg.Platform.OS == "" || cfg.Platform.Architecture == "" {
		return ocispec.Platform{}, errors.New("malformed image config")
	}

So, is this check perhaps redundant, and we could just if err != nil ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, not really, the test implementation for example doesn't do this, and we also need the same for the ReadPlatformFromImage, so I'll just remove the extra check from ReadPlatformFromConfigByImageManifest.

The function from this interface shouldn't have any extra logic anyway, as they only abstract the "read platform from image" part.

@vvoland vvoland force-pushed the c8d-inspect-imagemanifest branch 3 times, most recently from bc7cd93 to 3900aa2 Compare November 18, 2024 16:36
Add platform specific variants of the hello-world image

Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Change the persistent container metadata to store the whole platform
(as defined by OCI) instead of only the operating system.

Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
`ImageManifestDescriptor` will contain an OCI descriptor of
platform-specific manifest of the image that was picked when creating
the container.

Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
@vvoland vvoland force-pushed the c8d-inspect-imagemanifest branch from 3900aa2 to 44ed306 Compare November 19, 2024 12:56
Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@laurazard laurazard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@thaJeztah
Copy link
Member

flaky test (on windows); tracked in #48881

=== FAIL: github.com/docker/docker/integration/networking TestPortMappedHairpinWindows (12.58s)
    nat_windows_test.go:108: assertion failed: error is not nil: Post "http://%2F%2F.%2Fpipe%2Fdocker_engine/v1.48/containers/89679925e929b336aec707f6168fec5787742fbf5e6dc3d4ea25ee023158fc79/start": context deadline exceeded
    panic.go:629: assertion failed: error is not nil: Error response from daemon: error while removing network: network clientnet id 80ecb727ff249520add2815b7823db56cb665b95b80750fef22bbf31c9270b3c has active endpoints

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/api API containerd-integration Issues and PRs related to containerd integration impact/api impact/changelog kind/enhancement Enhancements are not bugs or new features but can improve usability or performance. status/4-merge

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants