Skip to content

runc: git bisect start v1.2.0-rc.2 v1.1.0 (manually)#48366

Closed
AkihiroSuda wants to merge 11 commits intomoby:masterfrom
AkihiroSuda:runc-bisect
Closed

runc: git bisect start v1.2.0-rc.2 v1.1.0 (manually)#48366
AkihiroSuda wants to merge 11 commits intomoby:masterfrom
AkihiroSuda:runc-bisect

Conversation

@AkihiroSuda
Copy link
Member

@AkihiroSuda AkihiroSuda commented Aug 23, 2024

Trying to find out the regression commit that is causing the compatibility issue of runc v1.2 RC.

Related:

Full(er) list of test failures:

  • TestBuildUserNamespaceValidateCapabilitiesAreV2
  • TestNetworkNat
  • TestNetworkLocalhostTCPNat
  • TestBuildSquashParent
  • TestDiff
  • TestUsernsCommit
  • TestReadPluginNoRead/explicitly_enabled_caching

Most failures are related to not getting output from a container run. As I can't repro this locally I'm not sure how to bisect it.

Originally posted by @kolyshkin in #48160 (comment)


  • 🔴 v1.2.0-rc.2
  • 🔴 v1.1.0-974-g1950892f (The (merge) commit to disable dmz by default, although it still compiles dmz by default)
  • 🔴 v1.1.0-941-g7c004d8e w/ runc_nodmz
  • 🔴 v1.1.0-931-gee73091a w/ runc_nodmz
  • 🔴 v1.1.0-929-gf2f16213 w/ runc_nodmz 🔥 causing stdio regression
  • ⚪ v1.1.0-928-g8e1cd2f5 w/ runc_nodmz (failures are similar to v1.1.0-753-g321aa20c)
  • ⚪ v1.1.0-924-g4baaf18c w/ runc_nodmz (failures are similar to v1.1.0-753-g321aa20c)
  • ⚪ v1.1.0-909-gab146f23 w/ runc_nodmz (failures are similar to v1.1.0-753-g321aa20c)
  • ⚪ v1.1.0-843-g823636c3 w/ runc_nodmz (failures are similar to v1.1.0-753-g321aa20c)
  • ⚪ v1.1.0-753-g321aa20c (the last commit before the introduction of dmz; failures are different from OP runc: git bisect start v1.2.0-rc.2 v1.1.0 (manually) #48366 (comment))
  • 🟢 v1.1.0-562-gf8ad20f5
  • 🟢 v1.1.0

(🟢=Good, 🟡=Pending, 🔴=Bad for the same reason as @kolyshkin reported, ⚪=CI fails due to another reason)

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
@AkihiroSuda AkihiroSuda changed the title git bisect start v1.2.0-rc.2 v1.1.0 (manually) runc: git bisect start v1.2.0-rc.2 v1.1.0 (manually) Aug 23, 2024
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
@AkihiroSuda
Copy link
Member Author

Failure in v1.1.0-753-g321aa20c is different from what was reported in #48160 (comment)

=== RUN   TestNoNewPrivileges/CapabilityRequested=false
    capabilities_linux_test.go:104: test produced invalid error: "writing sync \"procError\": write pipe: file already closed\nexec /bin/cat: operation not permitted", expected "exec /bin/cat: operation not permitted". Stdout:""
--- FAIL: TestNoNewPrivileges (9.93s)
    --- PASS: TestNoNewPrivileges/CapabilityRequested=true (0.33s)
    --- FAIL: TestNoNewPrivileges/CapabilityRequested=false (0.31s)
FAIL

https://github.com/moby/moby/actions/runs/10527465798/job/29171779480?pr=48366

=== FAIL: amd64.integration-cli TestDockerCLIRunSuite/TestRunOOMExitCode (0.21s)
    docker_cli_run_unix_test.go:605: assertion failed: error is not nil: wrong exit code for OOM container: expected 137, got 125 (output: "/usr/local/cli-integration/docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error setting cgroup config for procHooks process: failed to write \"16777216\": write /sys/fs/cgroup/memory/docker/7dbb71c83095125594eb6892e13cafd6fbac60e0e45c85c2893fbcb7c6071c55/memory.memsw.limit_in_bytes: device or resource busy: unknown.\ntime=\"2024-08-23T15:11:18Z\" level=error msg=\"error waiting for container: context canceled\" \n")
    --- FAIL: TestDockerCLIRunSuite/TestRunOOMExitCode (0.21s)

https://github.com/moby/moby/actions/runs/10527465798/job/29171799694?pr=48366

Co-authored-by: Kir Kolyshkin <kolyshkin@gmail.com>
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
@rata
Copy link
Contributor

rata commented Aug 23, 2024

I think you can work-around failures in runc_dmz using EXTRA_BUILDTAGS="runc_nodmz". This will never compile it (it is disabled by default, so you can just avoid compiling it too).

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
@AkihiroSuda
Copy link
Member Author

v1.1.0-843-g823636c3 w/ runc_nodmz fails with the same reasons as v1.1.0-753-g321aa20c, which is different from the OP

=== RUN   TestNoNewPrivileges/CapabilityRequested=false
    capabilities_linux_test.go:104: test produced invalid error: "writing sync procError: write sync: file already closed\nexec /bin/cat: operation not permitted", expected "exec /bin/cat: operation not permitted". Stdout:""
--- FAIL: TestNoNewPrivileges (8.89s)
    --- PASS: TestNoNewPrivileges/CapabilityRequested=true (0.30s)
    --- FAIL: TestNoNewPrivileges/CapabilityRequested=false (0.29s)

https://github.com/moby/moby/actions/runs/10528984279/job/29175849263?pr=48366

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
@AkihiroSuda
Copy link
Member Author

v1.1.0-929-gf2f16213 (opencontainers/runc@f2f1621 init: close internal fds before execve) seems the first commit that is causing the issue reported by @kolyshkin ("Most failures are related to not getting output from a container run.").

=== Failed
=== FAIL: amd64.integration.container TestCreateWithCDIDevices (0.87s)
    cdi_test.go:68: assertion failed: [] does not contain FOO=injected

=== FAIL: amd64.integration.container TestDiff (0.34s)
    diff_test.go:28: assertion failed: 
        --- expected
        +++ items
          []container.FilesystemChange(
        - 	{{Kind: s"A", Path: "/foo"}, {Kind: s"A", Path: "/foo/bar"}},
        + 	nil,
          )
        

=== FAIL: amd64.integration.container TestNetworkNat (0.36s)
    nat_test.go:33: assertion failed: error is not nil: dial tcp 172.17.0.3:8080: connect: connection refused
=== FAIL: amd64.integration.plugin.logging TestReadPluginNoRead/explicitly_enabled_caching (1.00s)
    read_test.go:90: assertion failed: strings.TrimSpace(buf.String()) is not "hello world": []
    --- FAIL: TestReadPluginNoRead/explicitly_enabled_caching (1.00s)

=== FAIL: amd64.integration.plugin.logging TestReadPluginNoRead/default (1.03s)
    read_test.go:90: assertion failed: strings.TrimSpace(buf.String()) is not "hello world": []
    --- FAIL: TestReadPluginNoRead/default (1.03s)

=== FAIL: amd64.integration.plugin.logging TestReadPluginNoRead (5.20s)
    read_test.go:93: [d8b21c48ed581] daemon is not started

...

Looks like stdout is closed too early on some condition.

cc @cyphar @kolyshkin @rata @thaJeztah


Its parent commit v1.1.0-928-g8e1cd2f5 (opencontainers/runc@8e1cd2f, init: verify after chdir that cwd is inside the container) still fails due to other issues.

=== FAIL: amd64.integration.capabilities TestNoNewPrivileges/CapabilityRequested=false (0.32s)
    capabilities_linux_test.go:104: test produced invalid error: "writing sync procError: write sync: file already closed\nexec /bin/cat: operation not permitted", expected "exec /bin/cat: operation not permitted". Stdout:""
    --- FAIL: TestNoNewPrivileges/CapabilityRequested=false (0.32s)
=== FAIL: amd64.integration.container TestUpdateRestartPolicy (0.35s)
    update_test.go:31: assertion failed: error is not nil: Error response from daemon: Cannot update container a33223f3221da201c20e58724db3bc9b04586d80b305096181f8b46b4cd95fd3: runc did not terminate successfully: exit status 1: cannot set cpu limit: container could not join or create cgroup
        : unknown

...

@thaJeztah
Copy link
Member

Interesting that commit was part of a CVE fix that also went to the v1.1 branch (reading from my phone, so didn't check if the patch was the same there)

@AkihiroSuda
Copy link
Member Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants