@klihub klihub commented Sep 13, 2024

Fix a race where an asynchronous server.Serve() invoked in a goroutine races with an almost immediate server.Shutdown().

If Shutdown() finishes its locked closing of listeners before Serve() gets around to adding the new one, Serve() will sit stuck forever in l.Accept(), unless the caller closes the listener in addition to calling Shutdown().

This is probably almost impossible to trigger in real life, but unit tests which run the server and client in the same process can trigger it. If a test then tries to verify, after Shutdown(), that Serve() returns a final ErrServerClosed error, it gets stuck forever.

@klihub klihub requested review from dmcgowan and fuweid September 13, 2024 13:10
@klihub klihub force-pushed the fixes/serve-listen-shutdown-race branch 2 times, most recently from f4a5a58 to 4ca1d79 Compare September 13, 2024 13:24
Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Fix a race where an asynchronous server.Serve() invoked in a
goroutine races with an almost immediate server.Shutdown().
If Shutdown() finishes its locked closing of listeners before
Serve() gets around to adding the new one, Serve() will sit
stuck forever in l.Accept(), unless the caller closes the
listener in addition to calling Shutdown().

This is probably almost impossible to trigger in real life,
but some of the unit tests, which run the server and client
in the same process, occasionally do trigger this. Then, if
the test tries to verify a final ErrServerClosed error from
Serve() after Shutdown() it gets stuck forever.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
@klihub klihub force-pushed the fixes/serve-listen-shutdown-race branch from 4ca1d79 to c4d96d5 Compare September 16, 2024 06:53
@AkihiroSuda AkihiroSuda merged commit b71d9de into containerd:main Oct 29, 2024
@klihub klihub deleted the fixes/serve-listen-shutdown-race branch October 29, 2024 06:59

alam0rt commented Jan 15, 2025

> This is probably almost impossible to trigger in real life

I think we've definitely come across this: containerd/containerd#8981 (comment)

Going to test out 2.0.2 and will let you know if we still see these issues.

Mengkzhaoyun pushed a commit to open-beagle/containerd that referenced this pull request Feb 7, 2025
containerd 2.0.2

Welcome to the v2.0.2 release of containerd!

The second patch release for containerd 2.0 includes a number of bug fixes and improvements.

* Remove confusing warning in cri runtime config migration ([#11256](containerd/containerd#11256))
* Fix runtime platform loading in cri image plugin init ([#11248](containerd/containerd#11248))

* Update runc binary to v1.2.4 ([#11239](containerd/containerd#11239))

Please try out the release binaries and report any issues at
https://github.com/containerd/containerd/issues.

* Jin Dong
* Derek McGowan
* Akihiro Suda
* Kazuyoshi Kato
* Henry Wang
* Krisztian Litkey
* Phil Estes
* Samuel Karp
* Sebastiaan van Stijn
* Akhil Mohan
* Brian Goff
* Chongyi Zheng
* Maksym Pavlenko
* Mike Brown
* Pierre Gimalac
* Wei Fu
<details><summary>23 commits</summary>
<p>

* Prepare release notes for v2.0.2 ([#11245](containerd/containerd#11245))
  * [`cdaf4dfb4`](containerd/containerd@cdaf4df) Prepare release notes for v2.0.2
* Update platforms to latest rc ([#11259](containerd/containerd#11259))
  * [`eb125e1dd`](containerd/containerd@eb125e1) Update platforms to latest rc
* Remove confusing warning in cri runtime config migration ([#11256](containerd/containerd#11256))
  * [`468079c5c`](containerd/containerd@468079c) Remove confusing warning in cri runtime config migration
* Fix runtime platform loading in cri image plugin init ([#11248](containerd/containerd#11248))
  * [`a2d9d4fd5`](containerd/containerd@a2d9d4f) Fix runtime platform loading in cri image plugin init
* make sure console master tty is closed on task exit ([#11246](containerd/containerd#11246))
  * [`184ffad01`](containerd/containerd@184ffad) Add integ test to check tty leak
  * [`17181ed33`](containerd/containerd@17181ed) fix master tty leak due to leaking init container object
* Bump up otelttrpc to 0.1.0 ([#11242](containerd/containerd#11242))
  * [`8666e7422`](containerd/containerd@8666e74) Bump up otelttrpc to 0.1.0
* ctr: `ctr images import --all-platforms`: fix unpack ([#11236](containerd/containerd#11236))
  * [`c4270430d`](containerd/containerd@c427043) ctr: `ctr images import --all-platforms`: fix unpack
* Update runc binary to v1.2.4 ([#11239](containerd/containerd#11239))
  * [`7373ddd70`](containerd/containerd@7373ddd) update runc binary to v1.2.4
* downgrade go-difflib and go-spew to tagged releases ([#11222](containerd/containerd#11222))
  * [`f34147772`](containerd/containerd@f341477) downgrade go-difflib and go-spew to tagged releases
* Add a build tag to disable std `plugin` import ([#11213](containerd/containerd#11213))
  * [`dca769485`](containerd/containerd@dca7694) chore: add a build tag to disable containerd plugin import
* Update golangci to 1.60.3 ([#11187](containerd/containerd#11187))
  * [`5942b3fcb`](containerd/containerd@5942b3f) Update golangci to 1.60.3
</p>
</details>
<details><summary>6 commits</summary>
<p>

* Add dependabot and upgrade golang and dependency versions ([containerd/otelttrpc#3](containerd/otelttrpc#3))
  * [`2d46141`](containerd/otelttrpc@2d46141) upgrade golang, deps, CI versions
  * [`64922e7`](containerd/otelttrpc@64922e7) Add dependabot CI
* Fix concurrent map panic on metadata ([containerd/otelttrpc#2](containerd/otelttrpc#2))
  * [`2ba3be1`](containerd/otelttrpc@2ba3be1) Fix concurrent map panic on inject metadata
  * [`f50a922`](containerd/otelttrpc@f50a922) UT for concurrent inject/extract metadata
</p>
</details>
<details><summary>6 commits</summary>
<p>

* Move windows matcher logic so all platforms can use ([containerd/platforms#22](containerd/platforms#22))
  * [`7c58292`](containerd/platforms@7c58292) Move windows matcher logic so all platforms can use
* replace testify with stdlib in tests ([containerd/platforms#21](containerd/platforms#21))
  * [`86a86b7`](containerd/platforms@86a86b7) replace testify with stdlib in tests
* Replace arm64 minor variant logic with lookup table ([containerd/platforms#18](containerd/platforms#18))
  * [`364665a`](containerd/platforms@364665a) Replace arm64 minor variant logic with lookup table
</p>
</details>
<details><summary>5 commits</summary>
<p>

* Add MD.Clone function ([containerd/ttrpc#177](containerd/ttrpc#177))
  * [`430f734`](containerd/ttrpc@430f734) Add MD.Clone
* server: fix a Serve() vs. (immediate) Shutdown() race ([containerd/ttrpc#175](containerd/ttrpc#175))
  * [`c4d96d5`](containerd/ttrpc@c4d96d5) server: fix Serve() vs. immediate Shutdown() race.
  * [`ed6c3ba`](containerd/ttrpc@ed6c3ba) server_test: add Serve()/Shutdown() race test.
</p>
</details>

* **github.com/containerd/otelttrpc**  ea5083fda723 -> v0.1.0
* **github.com/containerd/platforms**  v1.0.0-rc.0 -> v1.0.0-rc.1
* **github.com/containerd/ttrpc**      v1.2.6 -> v1.2.7
* **github.com/davecgh/go-spew**       d8f796af33cc -> v1.1.1
* **github.com/pmezard/go-difflib**    5d4384ee4fb2 -> v1.0.0
* **github.com/stretchr/testify**      v1.9.0 -> v1.10.0

Previous release can be found at [v2.0.1](https://github.com/containerd/containerd/releases/tag/v2.0.1)
* `containerd-<VERSION>-<OS>-<ARCH>.tar.gz`:         ✅Recommended. Dynamically linked with glibc 2.31 (Ubuntu 20.04).
* `containerd-static-<VERSION>-<OS>-<ARCH>.tar.gz`:  Statically linked. Expected to be used on non-glibc Linux distributions. Not position-independent.

In addition to containerd, typically you will have to install [runc](https://github.com/opencontainers/runc/releases)
and [CNI plugins](https://github.com/containernetworking/plugins/releases) from their official sites too.

See also the [Getting Started](https://github.com/containerd/containerd/blob/main/docs/getting-started.md) documentation.
@dmcgowan dmcgowan changed the title server: fix a Serve() vs. (immediate) Shutdown() race Fix race between serve and immediate shutdown on the server Feb 24, 2025
@dmcgowan dmcgowan added the area/runtime Runtime label Apr 15, 2025
mansikulkarni96 added a commit to mansikulkarni96/containerd that referenced this pull request Dec 4, 2025
containerd 2.1.0

Welcome to the v2.1.0 release of containerd!

The first minor release of containerd 2.x focuses on continued stability alongside
new features and improvements. This is the first time-based release for containerd.
Most of the feature set and core functionality has long been stable and hardened in production
environments, so now we transition to a balance of timely delivery of new functionality
with the same high confidence in stability and performance.

* Add no_sync option to boost boltDB performance on ephemeral environments ([containerd#10745](containerd#10745))
* Add content create event ([containerd#11006](containerd#11006))
* Erofs snapshotter and differ ([containerd#10705](containerd#10705))

* Update CRI to use transfer service for image pull by default ([containerd#8515](containerd#8515))
* Support multiple cni plugin bin dirs ([containerd#11311](containerd#11311))
* Support container restore through CRI/Kubernetes ([containerd#10365](containerd#10365))
* Add OCI/Image Volume Source support ([containerd#10579](containerd#10579))
* Enable Writable cgroups for unprivileged containers ([containerd#11131](containerd#11131))
* Fix recursive RLock() mutex acquisition ([containerd/go-cni#126](containerd/go-cni#126))
* Support CNI STATUS Verb ([containerd/go-cni#123](containerd/go-cni#123))

* Retry last registry host on 50x responses ([containerd#11484](containerd#11484))
* Multipart layer fetch ([containerd#10177](containerd#10177))
* Enable HTTP debug and trace for transfer based puller ([containerd#10762](containerd#10762))
* Add support for unpacking custom media types  ([containerd#11744](containerd#11744))
* Add dial timeout field to hosts toml configuration ([containerd#11106](containerd#11106))

* Expose Pod assigned IPs to NRI plugins ([containerd#10921](containerd#10921))

* Support multiple uid/gid mappings ([containerd#10722](containerd#10722))
* Fix race between serve and immediate shutdown on the server ([containerd/ttrpc#175](containerd/ttrpc#175))

* Update FreeBSD defaults and re-organize platform defaults ([containerd#11017](containerd#11017))

* Postpone cri config deprecations to v2.2 ([containerd#11684](containerd#11684))
* Remove deprecated dynamic library plugins ([containerd#11683](containerd#11683))
* Remove the support for Schema 1 images ([containerd#11681](containerd#11681))

Please try out the release binaries and report any issues at
https://github.com/containerd/containerd/issues.

* Derek McGowan
* Phil Estes
* Akihiro Suda
* Maksym Pavlenko
* Jin Dong
* Wei Fu
* Sebastiaan van Stijn
* Samuel Karp
* Mike Brown
* Adrien Delorme
* Austin Vazquez
* Akhil Mohan
* Kazuyoshi Kato
* Henry Wang
* Gao Xiang
* ningmingxiao
* Krisztian Litkey
* Yang Yang
* Archit Kulkarni
* Chris Henzie
* Iceber Gu
* Alexey Lunev
* Antonio Ojea
* Davanum Srinivas
* Marat Radchenko
* Michael Zappa
* Paweł Gronowski
* Rodrigo Campos
* Alberto Garcia Hierro
* Amit Barve
* Andrey Smirnov
* Divya
* Etienne Champetier
* Kirtana Ashok
* Philip Laine
* QiPing Wan
* fengwei0328
* zounengren
* Adrian Reber
* Alfred Wingate
* Amal Thundiyil
* Athos Ribeiro
* Brian Goff
* Cesar Talledo
* ChengyuZhu6
* Chongyi Zheng
* Craig Ingram
* Danny Canter
* David Son
* Fupan Li
* HirazawaUi
* Jing Xu
* Jonathan A. Sternberg
* Jose Fernandez
* Kaita Nakamura
* Kohei Tokunaga
* Lei Liu
* Marco Visin
* Mike Baynton
* Qiyuan Liang
* Sameer
* Shiming Zhang
* Swagat Bora
* Teresaliu
* Tony Fang
* Tõnis Tiigi
* Vered Rosen
* Vinayak Goyal
* bo.jiang
* chriskery
* luchenhan
* mahmut
* zhaixiaojuan

* **github.com/Microsoft/hcsshim**                                                 v0.12.9 -> v0.13.0-rc.3
* **github.com/cilium/ebpf**                                                       v0.11.0 -> v0.16.0
* **github.com/containerd/cgroups/v3**                                             v3.0.3 -> v3.0.5
* **github.com/containerd/containerd/api**                                         v1.8.0 -> v1.9.0
* **github.com/containerd/continuity**                                             v0.4.4 -> v0.4.5
* **github.com/containerd/go-cni**                                                 v1.1.10 -> v1.1.12
* **github.com/containerd/imgcrypt/v2**                                            v2.0.0-rc.1 -> v2.0.1
* **github.com/containerd/otelttrpc**                                              ea5083fda723 -> v0.1.0
* **github.com/containerd/platforms**                                              v1.0.0-rc.0 -> v1.0.0-rc.1
* **github.com/containerd/ttrpc**                                                  v1.2.6 -> v1.2.7
* **github.com/containerd/typeurl/v2**                                             v2.2.2 -> v2.2.3
* **github.com/containernetworking/cni**                                           v1.2.3 -> v1.3.0
* **github.com/containernetworking/plugins**                                       v1.5.1 -> v1.7.1
* **github.com/containers/ocicrypt**                                               v1.2.0 -> v1.2.1
* **github.com/davecgh/go-spew**                                                   d8f796af33cc -> v1.1.1
* **github.com/fsnotify/fsnotify**                                                 v1.7.0 -> v1.9.0
* **github.com/go-jose/go-jose/v4**                                                v4.0.4 -> v4.0.5
* **github.com/google/go-cmp**                                                     v0.6.0 -> v0.7.0
* **github.com/grpc-ecosystem/grpc-gateway/v2**                                    v2.22.0 -> v2.26.1
* **github.com/klauspost/compress**                                                v1.17.11 -> v1.18.0
* **github.com/mdlayher/socket**                                                   v0.4.1 -> v0.5.1
* **github.com/moby/spdystream**                                                   v0.4.0 -> v0.5.0
* **github.com/moby/sys/user**                                                     v0.3.0 -> v0.4.0
* **github.com/opencontainers/image-spec**                                         v1.1.0 -> v1.1.1
* **github.com/opencontainers/runtime-spec**                                       v1.2.0 -> v1.2.1
* **github.com/opencontainers/selinux**                                            v1.11.1 -> v1.12.0
* **github.com/pelletier/go-toml/v2**                                              v2.2.3 -> v2.2.4
* **github.com/petermattis/goid**                                                  4fcff4a6cae7 **_new_**
* **github.com/pmezard/go-difflib**                                                5d4384ee4fb2 -> v1.0.0
* **github.com/prometheus/client_golang**                                          v1.20.5 -> v1.22.0
* **github.com/prometheus/common**                                                 v0.55.0 -> v0.62.0
* **github.com/sasha-s/go-deadlock**                                               v0.3.5 **_new_**
* **github.com/smallstep/pkcs7**                                                   v0.1.1 **_new_**
* **github.com/stretchr/testify**                                                  v1.9.0 -> v1.10.0
* **github.com/tchap/go-patricia/v2**                                              v2.3.1 -> v2.3.2
* **github.com/urfave/cli/v2**                                                     v2.27.5 -> v2.27.6
* **github.com/vishvananda/netlink**                                               v1.3.0 -> 0e7078ed04c8
* **github.com/vishvananda/netns**                                                 v0.0.4 -> v0.0.5
* **go.etcd.io/bbolt**                                                             v1.3.11 -> v1.4.0
* **go.opentelemetry.io/auto/sdk**                                                 v1.1.0 **_new_**
* **go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc**  v0.56.0 -> v0.60.0
* **go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp**                v0.56.0 -> v0.60.0
* **go.opentelemetry.io/otel**                                                     v1.31.0 -> v1.35.0
* **go.opentelemetry.io/otel/exporters/otlp/otlptrace**                            v1.31.0 -> v1.35.0
* **go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc**              v1.31.0 -> v1.35.0
* **go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp**              v1.31.0 -> v1.35.0
* **go.opentelemetry.io/otel/metric**                                              v1.31.0 -> v1.35.0
* **go.opentelemetry.io/otel/sdk**                                                 v1.31.0 -> v1.35.0
* **go.opentelemetry.io/otel/trace**                                               v1.31.0 -> v1.35.0
* **go.opentelemetry.io/proto/otlp**                                               v1.3.1 -> v1.5.0
* **golang.org/x/crypto**                                                          v0.28.0 -> v0.36.0
* **golang.org/x/exp**                                                             aacd6d4b4611 -> 2d47ceb2692f
* **golang.org/x/mod**                                                             v0.21.0 -> v0.24.0
* **golang.org/x/net**                                                             v0.30.0 -> v0.38.0
* **golang.org/x/oauth2**                                                          v0.22.0 -> v0.27.0
* **golang.org/x/sync**                                                            v0.8.0 -> v0.14.0
* **golang.org/x/sys**                                                             v0.26.0 -> v0.33.0
* **golang.org/x/term**                                                            v0.25.0 -> v0.30.0
* **golang.org/x/text**                                                            v0.19.0 -> v0.23.0
* **golang.org/x/time**                                                            v0.3.0 -> v0.7.0
* **google.golang.org/genproto/googleapis/api**                                    5fefd90f89a9 -> 56aae31c358a
* **google.golang.org/genproto/googleapis/rpc**                                    324edc3d5d38 -> 56aae31c358a
* **google.golang.org/grpc**                                                       v1.67.1 -> v1.72.0
* **google.golang.org/protobuf**                                                   v1.35.1 -> v1.36.6
* **k8s.io/api**                                                                   v0.31.2 -> v0.32.3
* **k8s.io/apimachinery**                                                          v0.31.2 -> v0.32.3
* **k8s.io/apiserver**                                                             v0.31.2 -> v0.32.3
* **k8s.io/client-go**                                                             v0.31.2 -> v0.32.3
* **k8s.io/cri-api**                                                               v0.31.2 -> v0.32.3
* **k8s.io/kubelet**                                                               v0.31.2 -> v0.32.3
* **k8s.io/utils**                                                                 18e509b52bc8 -> 3ea5e8cea738
* **sigs.k8s.io/json**                                                             bc3834ca7abd -> 9aa6b5e7a4b3
* **sigs.k8s.io/structured-merge-diff/v4**                                         v4.4.1 -> v4.4.2
* **tags.cncf.io/container-device-interface**                                      v0.8.0 -> v1.0.1
* **tags.cncf.io/container-device-interface/specs-go**                             v0.8.0 -> v1.0.0

Previous release can be found at [v2.0.0](https://github.com/containerd/containerd/releases/tag/v2.0.0)
* `containerd-<VERSION>-<OS>-<ARCH>.tar.gz`:         ✅Recommended. Dynamically linked with glibc 2.35 (Ubuntu 22.04).
* `containerd-static-<VERSION>-<OS>-<ARCH>.tar.gz`:  Statically linked. Expected to be used on Linux distributions that do not use glibc >= 2.35. Not position-independent.

In addition to containerd, typically you will have to install [runc](https://github.com/opencontainers/runc/releases)
and [CNI plugins](https://github.com/containernetworking/plugins/releases) from their official sites too.

See also the [Getting Started](https://github.com/containerd/containerd/blob/main/docs/getting-started.md) documentation.