Skip to content

Conversation

@slonopotamus
Copy link
Contributor

Not sure I chose the best place for ReadSpec function, feel free to suggest a better one.

@k8s-ci-robot
Copy link

Hi @slonopotamus. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@slonopotamus slonopotamus force-pushed the uncopypaste-read-spec branch 2 times, most recently from fa47d23 to bbfa522 Compare July 6, 2023 12:31
@djdongjin
Copy link
Member

/ok-to-test

Copy link
Member

@djdongjin djdongjin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious if we should remove those parsing (readSpec or loadSpec). I might be wrong but it seems these customized parsing is to avoid parsing the whole specs.Spec and only parse the necessary field (e.g., Root.Path, Annotations)

@slonopotamus slonopotamus force-pushed the uncopypaste-read-spec branch 3 times, most recently from 26ddd89 to 07d8ba9 Compare July 7, 2023 08:25
Copy link
Member

@dcantah dcantah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the change is great (and where you put ReadSpec I think makes perfect sense), but could you benchmark unmarshalling just the annotations/rootfs path vs. the entire spec? If it's negligible I think this is fine as is, and if it's not maybe we just leave in the two trimmed down versions.

@slonopotamus
Copy link
Contributor Author

slonopotamus commented Jul 11, 2023

I think the change is great (and where you put ReadSpec I think makes perfect sense), but could you benchmark unmarshalling just the annotations/rootfs path vs. the entire spec? If it's negligible I think this is fine as is, and if it's not maybe we just leave in the two trimmed down versions.

See https://github.com/slonopotamus/parse-spec-bench

UPD: added allocs

$ go test -v -bench=.
goos: linux
goarch: amd64
pkg: parse_spec_bench
cpu: AMD Ryzen 7 3700X 8-Core Processor             
BenchmarkParseFullSpec
BenchmarkParseFullSpec-16                   6352            307946 ns/op           79002 B/op        743 allocs/op
BenchmarkParseAnnotations
BenchmarkParseAnnotations-16                7495            150448 ns/op           31408 B/op         18 allocs/op
BenchmarkParseHooks
BenchmarkParseHooks-16                      7221            143729 ns/op           31576 B/op         22 allocs/op
PASS
ok      parse_spec_bench        4.183s

@thaJeztah
Copy link
Member

Ah, lol, was also writing a minimal benchmark, but using a const;

Details
package bla

import (
	"encoding/json"
	"strings"
	"testing"

	"github.com/opencontainers/runtime-spec/specs-go"
)

const containerOCI = `{
  "ociVersion": "1.1.0-rc.2",
  "process": {
    "user": {
      "uid": 0,
      "gid": 0,
      "additionalGids": [
        0,
        0,
        1,
        2,
        3,
        4,
        6,
        10,
        11,
        20,
        26,
        27
      ]
    },
    "args": [
      "/docker-entrypoint.sh",
      "nginx",
      "-g",
      "daemon off;"
    ],
    "env": [
      "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
      "HOSTNAME=3a428b39440b",
      "NGINX_VERSION=1.25.1",
      "PKG_RELEASE=1",
      "NJS_VERSION=0.7.12"
    ],
    "cwd": "/",
    "capabilities": {
      "bounding": [
        "CAP_CHOWN",
        "CAP_DAC_OVERRIDE",
        "CAP_FSETID",
        "CAP_FOWNER",
        "CAP_MKNOD",
        "CAP_NET_RAW",
        "CAP_SETGID",
        "CAP_SETUID",
        "CAP_SETFCAP",
        "CAP_SETPCAP",
        "CAP_NET_BIND_SERVICE",
        "CAP_SYS_CHROOT",
        "CAP_KILL",
        "CAP_AUDIT_WRITE"
      ],
      "effective": [
        "CAP_CHOWN",
        "CAP_DAC_OVERRIDE",
        "CAP_FSETID",
        "CAP_FOWNER",
        "CAP_MKNOD",
        "CAP_NET_RAW",
        "CAP_SETGID",
        "CAP_SETUID",
        "CAP_SETFCAP",
        "CAP_SETPCAP",
        "CAP_NET_BIND_SERVICE",
        "CAP_SYS_CHROOT",
        "CAP_KILL",
        "CAP_AUDIT_WRITE"
      ],
      "permitted": [
        "CAP_CHOWN",
        "CAP_DAC_OVERRIDE",
        "CAP_FSETID",
        "CAP_FOWNER",
        "CAP_MKNOD",
        "CAP_NET_RAW",
        "CAP_SETGID",
        "CAP_SETUID",
        "CAP_SETFCAP",
        "CAP_SETPCAP",
        "CAP_NET_BIND_SERVICE",
        "CAP_SYS_CHROOT",
        "CAP_KILL",
        "CAP_AUDIT_WRITE"
      ]
    },
    "oomScoreAdj": 0
  },
  "root": {
    "path": "/var/lib/docker/rootfs/overlayfs/3a428b39440b9c4d322b2c2ade136f24c745e9349e8d93845d437396183249c9"
  },
  "hostname": "3a428b39440b",
  "mounts": [
    {
      "destination": "/proc",
      "type": "proc",
      "source": "proc",
      "options": [
        "nosuid",
        "noexec",
        "nodev"
      ]
    },
    {
      "destination": "/dev",
      "type": "tmpfs",
      "source": "tmpfs",
      "options": [
        "nosuid",
        "strictatime",
        "mode=755",
        "size=65536k"
      ]
    },
    {
      "destination": "/dev/pts",
      "type": "devpts",
      "source": "devpts",
      "options": [
        "nosuid",
        "noexec",
        "newinstance",
        "ptmxmode=0666",
        "mode=0620",
        "gid=5"
      ]
    },
    {
      "destination": "/sys",
      "type": "sysfs",
      "source": "sysfs",
      "options": [
        "nosuid",
        "noexec",
        "nodev",
        "ro"
      ]
    },
    {
      "destination": "/sys/fs/cgroup",
      "type": "cgroup",
      "source": "cgroup",
      "options": [
        "ro",
        "nosuid",
        "noexec",
        "nodev"
      ]
    },
    {
      "destination": "/dev/mqueue",
      "type": "mqueue",
      "source": "mqueue",
      "options": [
        "nosuid",
        "noexec",
        "nodev"
      ]
    },
    {
      "destination": "/dev/shm",
      "type": "tmpfs",
      "source": "shm",
      "options": [
        "nosuid",
        "noexec",
        "nodev",
        "mode=1777",
        "size=67108864"
      ]
    },
    {
      "destination": "/etc/resolv.conf",
      "type": "bind",
      "source": "/var/lib/docker/containers/3a428b39440b9c4d322b2c2ade136f24c745e9349e8d93845d437396183249c9/resolv.conf",
      "options": [
        "rbind",
        "rprivate"
      ]
    },
    {
      "destination": "/etc/hostname",
      "type": "bind",
      "source": "/var/lib/docker/containers/3a428b39440b9c4d322b2c2ade136f24c745e9349e8d93845d437396183249c9/hostname",
      "options": [
        "rbind",
        "rprivate"
      ]
    },
    {
      "destination": "/etc/hosts",
      "type": "bind",
      "source": "/var/lib/docker/containers/3a428b39440b9c4d322b2c2ade136f24c745e9349e8d93845d437396183249c9/hosts",
      "options": [
        "rbind",
        "rprivate"
      ]
    }
  ],
  "hooks": {
    "prestart": [
      {
        "path": "/proc/598/exe",
        "args": [
          "libnetwork-setkey",
          "-exec-root=/var/run/docker",
          "3a428b39440b9c4d322b2c2ade136f24c745e9349e8d93845d437396183249c9",
          "c6e2fee681bb"
        ]
      }
    ]
  },
  "linux": {
    "sysctl": {
      "net.ipv4.ip_unprivileged_port_start": "0",
      "net.ipv4.ping_group_range": "0 2147483647"
    },
    "resources": {
      "devices": [
        {
          "allow": false,
          "access": "rwm"
        },
        {
          "allow": true,
          "type": "c",
          "major": 1,
          "minor": 5,
          "access": "rwm"
        },
        {
          "allow": true,
          "type": "c",
          "major": 1,
          "minor": 3,
          "access": "rwm"
        },
        {
          "allow": true,
          "type": "c",
          "major": 1,
          "minor": 9,
          "access": "rwm"
        },
        {
          "allow": true,
          "type": "c",
          "major": 1,
          "minor": 8,
          "access": "rwm"
        },
        {
          "allow": true,
          "type": "c",
          "major": 5,
          "minor": 0,
          "access": "rwm"
        },
        {
          "allow": true,
          "type": "c",
          "major": 5,
          "minor": 1,
          "access": "rwm"
        },
        {
          "allow": false,
          "type": "c",
          "major": 10,
          "minor": 229,
          "access": "rwm"
        },
        {
          "allow": false,
          "access": "rwm"
        },
        {
          "allow": true,
          "type": "c",
          "major": 1,
          "minor": 5,
          "access": "rwm"
        },
        {
          "allow": true,
          "type": "c",
          "major": 1,
          "minor": 3,
          "access": "rwm"
        },
        {
          "allow": true,
          "type": "c",
          "major": 1,
          "minor": 9,
          "access": "rwm"
        },
        {
          "allow": true,
          "type": "c",
          "major": 1,
          "minor": 8,
          "access": "rwm"
        },
        {
          "allow": true,
          "type": "c",
          "major": 5,
          "minor": 0,
          "access": "rwm"
        },
        {
          "allow": true,
          "type": "c",
          "major": 5,
          "minor": 1,
          "access": "rwm"
        },
        {
          "allow": false,
          "type": "c",
          "major": 10,
          "minor": 229,
          "access": "rwm"
        }
      ],
      "blockIO": {}
    },
    "cgroupsPath": "/docker/3a428b39440b9c4d322b2c2ade136f24c745e9349e8d93845d437396183249c9",
    "namespaces": [
      {
        "type": "mount"
      },
      {
        "type": "network"
      },
      {
        "type": "uts"
      },
      {
        "type": "pid"
      },
      {
        "type": "ipc"
      },
      {
        "type": "cgroup"
      }
    ],
    "seccomp": {
      "defaultAction": "SCMP_ACT_ERRNO",
      "defaultErrnoRet": 1,
      "architectures": [
        "SCMP_ARCH_AARCH64",
        "SCMP_ARCH_ARM"
      ],
      "syscalls": [
        {
          "names": [
            "accept",
            "accept4",
            "access",
            "adjtimex",
            "alarm",
            "bind",
            "brk",
            "capget",
            "capset",
            "chdir",
            "chmod",
            "chown",
            "chown32",
            "clock_adjtime",
            "clock_adjtime64",
            "clock_getres",
            "clock_getres_time64",
            "clock_gettime",
            "clock_gettime64",
            "clock_nanosleep",
            "clock_nanosleep_time64",
            "close",
            "close_range",
            "connect",
            "copy_file_range",
            "creat",
            "dup",
            "dup2",
            "dup3",
            "epoll_create",
            "epoll_create1",
            "epoll_ctl",
            "epoll_ctl_old",
            "epoll_pwait",
            "epoll_pwait2",
            "epoll_wait",
            "epoll_wait_old",
            "eventfd",
            "eventfd2",
            "execve",
            "execveat",
            "exit",
            "exit_group",
            "faccessat",
            "faccessat2",
            "fadvise64",
            "fadvise64_64",
            "fallocate",
            "fanotify_mark",
            "fchdir",
            "fchmod",
            "fchmodat",
            "fchown",
            "fchown32",
            "fchownat",
            "fcntl",
            "fcntl64",
            "fdatasync",
            "fgetxattr",
            "flistxattr",
            "flock",
            "fork",
            "fremovexattr",
            "fsetxattr",
            "fstat",
            "fstat64",
            "fstatat64",
            "fstatfs",
            "fstatfs64",
            "fsync",
            "ftruncate",
            "ftruncate64",
            "futex",
            "futex_time64",
            "futex_waitv",
            "futimesat",
            "getcpu",
            "getcwd",
            "getdents",
            "getdents64",
            "getegid",
            "getegid32",
            "geteuid",
            "geteuid32",
            "getgid",
            "getgid32",
            "getgroups",
            "getgroups32",
            "getitimer",
            "getpeername",
            "getpgid",
            "getpgrp",
            "getpid",
            "getppid",
            "getpriority",
            "getrandom",
            "getresgid",
            "getresgid32",
            "getresuid",
            "getresuid32",
            "getrlimit",
            "get_robust_list",
            "getrusage",
            "getsid",
            "getsockname",
            "getsockopt",
            "get_thread_area",
            "gettid",
            "gettimeofday",
            "getuid",
            "getuid32",
            "getxattr",
            "inotify_add_watch",
            "inotify_init",
            "inotify_init1",
            "inotify_rm_watch",
            "io_cancel",
            "ioctl",
            "io_destroy",
            "io_getevents",
            "io_pgetevents",
            "io_pgetevents_time64",
            "ioprio_get",
            "ioprio_set",
            "io_setup",
            "io_submit",
            "io_uring_enter",
            "io_uring_register",
            "io_uring_setup",
            "ipc",
            "kill",
            "landlock_add_rule",
            "landlock_create_ruleset",
            "landlock_restrict_self",
            "lchown",
            "lchown32",
            "lgetxattr",
            "link",
            "linkat",
            "listen",
            "listxattr",
            "llistxattr",
            "_llseek",
            "lremovexattr",
            "lseek",
            "lsetxattr",
            "lstat",
            "lstat64",
            "madvise",
            "membarrier",
            "memfd_create",
            "memfd_secret",
            "mincore",
            "mkdir",
            "mkdirat",
            "mknod",
            "mknodat",
            "mlock",
            "mlock2",
            "mlockall",
            "mmap",
            "mmap2",
            "mprotect",
            "mq_getsetattr",
            "mq_notify",
            "mq_open",
            "mq_timedreceive",
            "mq_timedreceive_time64",
            "mq_timedsend",
            "mq_timedsend_time64",
            "mq_unlink",
            "mremap",
            "msgctl",
            "msgget",
            "msgrcv",
            "msgsnd",
            "msync",
            "munlock",
            "munlockall",
            "munmap",
            "name_to_handle_at",
            "nanosleep",
            "newfstatat",
            "_newselect",
            "open",
            "openat",
            "openat2",
            "pause",
            "pidfd_open",
            "pidfd_send_signal",
            "pipe",
            "pipe2",
            "pkey_alloc",
            "pkey_free",
            "pkey_mprotect",
            "poll",
            "ppoll",
            "ppoll_time64",
            "prctl",
            "pread64",
            "preadv",
            "preadv2",
            "prlimit64",
            "process_mrelease",
            "pselect6",
            "pselect6_time64",
            "pwrite64",
            "pwritev",
            "pwritev2",
            "read",
            "readahead",
            "readlink",
            "readlinkat",
            "readv",
            "recv",
            "recvfrom",
            "recvmmsg",
            "recvmmsg_time64",
            "recvmsg",
            "remap_file_pages",
            "removexattr",
            "rename",
            "renameat",
            "renameat2",
            "restart_syscall",
            "rmdir",
            "rseq",
            "rt_sigaction",
            "rt_sigpending",
            "rt_sigprocmask",
            "rt_sigqueueinfo",
            "rt_sigreturn",
            "rt_sigsuspend",
            "rt_sigtimedwait",
            "rt_sigtimedwait_time64",
            "rt_tgsigqueueinfo",
            "sched_getaffinity",
            "sched_getattr",
            "sched_getparam",
            "sched_get_priority_max",
            "sched_get_priority_min",
            "sched_getscheduler",
            "sched_rr_get_interval",
            "sched_rr_get_interval_time64",
            "sched_setaffinity",
            "sched_setattr",
            "sched_setparam",
            "sched_setscheduler",
            "sched_yield",
            "seccomp",
            "select",
            "semctl",
            "semget",
            "semop",
            "semtimedop",
            "semtimedop_time64",
            "send",
            "sendfile",
            "sendfile64",
            "sendmmsg",
            "sendmsg",
            "sendto",
            "setfsgid",
            "setfsgid32",
            "setfsuid",
            "setfsuid32",
            "setgid",
            "setgid32",
            "setgroups",
            "setgroups32",
            "setitimer",
            "setpgid",
            "setpriority",
            "setregid",
            "setregid32",
            "setresgid",
            "setresgid32",
            "setresuid",
            "setresuid32",
            "setreuid",
            "setreuid32",
            "setrlimit",
            "set_robust_list",
            "setsid",
            "setsockopt",
            "set_thread_area",
            "set_tid_address",
            "setuid",
            "setuid32",
            "setxattr",
            "shmat",
            "shmctl",
            "shmdt",
            "shmget",
            "shutdown",
            "sigaltstack",
            "signalfd",
            "signalfd4",
            "sigprocmask",
            "sigreturn",
            "socketcall",
            "socketpair",
            "splice",
            "stat",
            "stat64",
            "statfs",
            "statfs64",
            "statx",
            "symlink",
            "symlinkat",
            "sync",
            "sync_file_range",
            "syncfs",
            "sysinfo",
            "tee",
            "tgkill",
            "time",
            "timer_create",
            "timer_delete",
            "timer_getoverrun",
            "timer_gettime",
            "timer_gettime64",
            "timer_settime",
            "timer_settime64",
            "timerfd_create",
            "timerfd_gettime",
            "timerfd_gettime64",
            "timerfd_settime",
            "timerfd_settime64",
            "times",
            "tkill",
            "truncate",
            "truncate64",
            "ugetrlimit",
            "umask",
            "uname",
            "unlink",
            "unlinkat",
            "utime",
            "utimensat",
            "utimensat_time64",
            "utimes",
            "vfork",
            "vmsplice",
            "wait4",
            "waitid",
            "waitpid",
            "write",
            "writev"
          ],
          "action": "SCMP_ACT_ALLOW"
        },
        {
          "names": [
            "process_vm_readv",
            "process_vm_writev",
            "ptrace"
          ],
          "action": "SCMP_ACT_ALLOW"
        },
        {
          "names": [
            "socket"
          ],
          "action": "SCMP_ACT_ALLOW",
          "args": [
            {
              "index": 0,
              "value": 40,
              "op": "SCMP_CMP_NE"
            }
          ]
        },
        {
          "names": [
            "personality"
          ],
          "action": "SCMP_ACT_ALLOW",
          "args": [
            {
              "index": 0,
              "value": 0,
              "op": "SCMP_CMP_EQ"
            }
          ]
        },
        {
          "names": [
            "personality"
          ],
          "action": "SCMP_ACT_ALLOW",
          "args": [
            {
              "index": 0,
              "value": 8,
              "op": "SCMP_CMP_EQ"
            }
          ]
        },
        {
          "names": [
            "personality"
          ],
          "action": "SCMP_ACT_ALLOW",
          "args": [
            {
              "index": 0,
              "value": 131072,
              "op": "SCMP_CMP_EQ"
            }
          ]
        },
        {
          "names": [
            "personality"
          ],
          "action": "SCMP_ACT_ALLOW",
          "args": [
            {
              "index": 0,
              "value": 131080,
              "op": "SCMP_CMP_EQ"
            }
          ]
        },
        {
          "names": [
            "personality"
          ],
          "action": "SCMP_ACT_ALLOW",
          "args": [
            {
              "index": 0,
              "value": 4294967295,
              "op": "SCMP_CMP_EQ"
            }
          ]
        },
        {
          "names": [
            "arm_fadvise64_64",
            "arm_sync_file_range",
            "sync_file_range2",
            "breakpoint",
            "cacheflush",
            "set_tls"
          ],
          "action": "SCMP_ACT_ALLOW"
        },
        {
          "names": [
            "clone"
          ],
          "action": "SCMP_ACT_ALLOW",
          "args": [
            {
              "index": 0,
              "value": 2114060288,
              "op": "SCMP_CMP_MASKED_EQ"
            }
          ]
        },
        {
          "names": [
            "clone3"
          ],
          "action": "SCMP_ACT_ERRNO",
          "errnoRet": 38
        },
        {
          "names": [
            "chroot"
          ],
          "action": "SCMP_ACT_ALLOW"
        }
      ]
    },
    "maskedPaths": [
      "/proc/asound",
      "/proc/acpi",
      "/proc/kcore",
      "/proc/keys",
      "/proc/latency_stats",
      "/proc/timer_list",
      "/proc/timer_stats",
      "/proc/sched_debug",
      "/proc/scsi",
      "/sys/firmware"
    ],
    "readonlyPaths": [
      "/proc/bus",
      "/proc/fs",
      "/proc/irq",
      "/proc/sys",
      "/proc/sysrq-trigger"
    ]
  }
}
`

type hookSpec struct {
	Root struct {
		Path string `json:"path"`
	} `json:"root"`
}

func BenchmarkUnmarshalShallow(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		f := strings.NewReader(containerOCI)
		var s hookSpec
		_ = json.NewDecoder(f).Decode(&s)
	}
}

func BenchmarkUnmarshalOCISpec(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		f := strings.NewReader(containerOCI)
		var s specs.Spec
		_ = json.NewDecoder(f).Decode(&s)
	}
}
go test -v -bench .
goos: darwin
goarch: arm64
pkg: github.com/containerd/containerd/bla
BenchmarkUnmarshalShallow
BenchmarkUnmarshalShallow-10    	    9264	    119462 ns/op	   64304 B/op	      21 allocs/op
BenchmarkUnmarshalOCISpec
BenchmarkUnmarshalOCISpec-10    	    6484	    179632 ns/op	  116481 B/op	     817 allocs/op
PASS
ok  	github.com/containerd/containerd/bla	2.588s

@slonopotamus
Copy link
Contributor Author

slonopotamus commented Jul 11, 2023

So, we have some numbers. I'm not sure what conclusion to make based upon them. :D

Parsing the full spec is around 2x slower than parsing one/two fields out of it.

@thaJeztah
Copy link
Member

thaJeztah commented Jul 11, 2023

Yup, and 39x the allocations, so I'm tempted to say "use a shallow struct" (but leave a comment why we're doing so).

If we do, I'd also opt for using the OCI spec's Root type to make it more transparent what we're doing, and to make sure we match the spec;

// hookSpec is a shallow version of [specs.Spec] containing only the
// fields we need for the hook. We use a shallow struct to reduce
// the overhead of unmarshaling.
type hookSpec struct {
	// Root configures the container's root filesystem.
	Root *specs.Root `json:"root,omitempty"`
}

@slonopotamus slonopotamus force-pushed the uncopypaste-read-spec branch from 07d8ba9 to 71c6baf Compare July 11, 2023 10:44
@slonopotamus
Copy link
Contributor Author

slonopotamus commented Jul 11, 2023

`Root *specs.Root `json:"root,omitempty"``

That's what I am talking about. Current hookSpec already diverged, it doesn't have omitempty. But as you wish. Updated PR, restoring both shallow spec structs.

@slonopotamus slonopotamus force-pushed the uncopypaste-read-spec branch from 71c6baf to e439f2b Compare July 11, 2023 11:23
Signed-off-by: Marat Radchenko <marat@slonopotamus.org>
@slonopotamus slonopotamus force-pushed the uncopypaste-read-spec branch from e439f2b to 9e34b8b Compare July 11, 2023 11:41
Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@estesp estesp merged commit 34b1653 into containerd:main Jul 11, 2023
@slonopotamus slonopotamus deleted the uncopypaste-read-spec branch July 11, 2023 13:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants