Dangling file in /var/lib/cni/networks/podman prevents container starting podman 3.0 #9465

@skitoxe

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description
When running podman stop <container> && sleep 5 && podman start <container>, or just a normal stop and start, I occasionally get an orphaned file in /var/lib/cni/networks/podman/.
This happens in exactly the same manner as described in issue #3759, which is now closed. However, further comments there indicate that this is still an issue, and I can see that it is still a problem in 3.0. I have to manually delete the file to get the container to start again. The error message received is:

ERRO[0000] Error adding network: failed to allocate for range 0: 10.89.0.64 has been allocated to 5e1c23b05eee19f0833d7ff116fa50886ead824f944652239640c8aefdd1e5fe, duplicate allocation is not allowed
ERRO[0000] Error while adding pod to CNI network "podman_default": failed to allocate for range 0: 10.89.0.64 has been allocated to 5e1c23b05eee19f0833d7ff116fa50886ead824f944652239640c8aefdd1e5fe, duplicate allocation is not allowed
Error: unable to start container "5e1c23b05eee19f0833d7ff116fa50886ead824f944652239640c8aefdd1e5fe": error configuring network namespace for container 5e1c23b05eee19f0833d7ff116fa50886ead824f944652239640c8aefdd1e5fe: failed to allocate for range 0: 10.89.0.64 has been allocated to 5e1c23b05eee19f0833d7ff116fa50886ead824f944652239640c8aefdd1e5fe, duplicate allocation is not allowed
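
To make the manual cleanup less error-prone, the orphaned file can be located by container ID before anything is deleted. The following is only a sketch: `find_reservations` is a hypothetical helper, and it assumes the host-local IPAM layout in which each file in the network directory is named after an IP address and its first line holds the ID of the container that owns the address. The directory name may differ per network.

```shell
#!/bin/sh
# Sketch: list IP reservation files that name a given container ID.
# Assumption: each file in the network directory is named after an IP
# address and its first line is the ID of the container holding it.

# find_reservations DIR CONTAINER_ID
# Prints the path of every regular file in DIR whose first line
# equals CONTAINER_ID. Nothing is modified or deleted.
find_reservations() {
    dir=$1
    cid=$2
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # Compare only the first line; drop a carriage return if present.
        first=$(head -n 1 "$f" | tr -d '\r')
        if [ "$first" = "$cid" ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

For the failure above, `find_reservations /var/lib/cni/networks/podman 5e1c23b05eee19f0833d7ff116fa50886ead824f944652239640c8aefdd1e5fe` would be expected to print the file for 10.89.0.64. The helper only lists candidates, so the decision to remove a file stays with the operator.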

Steps to reproduce the issue:

  1. podman stop <container> && sleep 5 && podman start <container>
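
Because the failure is intermittent, repeating the cycle in a loop may make it easier to hit. This is a minimal sketch, not a verified reproducer: `repro_cycle` is a hypothetical helper, and the cycle count and delay are illustrative parameters (the delay defaults to the 5 seconds used above).

```shell
#!/bin/sh
# Sketch: repeat the stop/sleep/start cycle until a stop or start fails.
# repro_cycle is a hypothetical helper, not part of podman.

# repro_cycle CONTAINER COUNT [DELAY_SECONDS]
# Returns 0 if every cycle succeeded, 1 on the first failed stop or start.
repro_cycle() {
    ctr=$1
    count=$2
    delay=${3:-5}   # default matches the 5-second sleep in the report
    i=1
    while [ "$i" -le "$count" ]; do
        if ! podman stop "$ctr" >/dev/null; then
            echo "cycle $i: stop failed" >&2
            return 1
        fi
        sleep "$delay"
        if ! podman start "$ctr" >/dev/null; then
            echo "cycle $i: start failed" >&2
            return 1
        fi
        i=$((i + 1))
    done
    return 0
}
```

Running `repro_cycle <container> 50` stops at the first cycle where the container fails to come back, leaving the network directory in the failed state for inspection.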

Describe the results you received:
The container does not start, and the following error appears when trying to start it:

ERRO[0000] Error adding network: failed to allocate for range 0: 10.89.0.64 has been allocated to 5e1c23b05eee19f0833d7ff116fa50886ead824f944652239640c8aefdd1e5fe, duplicate allocation is not allowed
ERRO[0000] Error while adding pod to CNI network "podman_default": failed to allocate for range 0: 10.89.0.64 has been allocated to 5e1c23b05eee19f0833d7ff116fa50886ead824f944652239640c8aefdd1e5fe, duplicate allocation is not allowed
Error: unable to start container "5e1c23b05eee19f0833d7ff116fa50886ead824f944652239640c8aefdd1e5fe": error configuring network namespace for container 5e1c23b05eee19f0833d7ff116fa50886ead824f944652239640c8aefdd1e5fe: failed to allocate for range 0: 10.89.0.64 has been allocated to 5e1c23b05eee19f0833d7ff116fa50886ead824f944652239640c8aefdd1e5fe, duplicate allocation is not allowed

Describe the results you expected:
Container starts

Additional information you deem important (e.g. issue happens only occasionally):
It seems to be more common when run from a crontab, but that might be my imagination.
Output of podman version:

Version:      3.0.0
API Version:  3.0.0
Go Version:   go1.15.7
Built:        Fri Feb 12 00:12:57 2021
OS/Arch:      linux/amd64

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.19.2
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.0.26-1.fc33.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.26, commit: 777074ecdb5e883b9bec233f3630c5e7fa37d521'
  cpus: 40
  distribution:
    distribution: fedora
    version: "33"
  eventLogger: journald
  hostname: xxxx
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 5.10.15-200.fc33.x86_64
  linkmode: dynamic
  memFree: 190546157568
  memTotal: 201951862784
  ociRuntime:
    name: crun
    package: crun-0.17-1.fc33.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 0.17
      commit: 0e9229ae34caaebcb86f1fde18de3acaf18c6d9a
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    selinuxEnabled: true
  slirp4netns:
    executable: ""
    package: ""
    version: ""
  swapFree: 4294963200
  swapTotal: 4294963200
  uptime: 92h 34m 59.04s (Approximately 3.83 days)
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - registry.centos.org
  - docker.io
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 3
    paused: 0
    running: 3
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev,metacopy=on
  graphRoot: /var/lib/containers/storage
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "true"
  imageStore:
    number: 3
  runRoot: /run/containers/storage
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 3.0.0
  Built: 1613085177
  BuiltTime: Fri Feb 12 00:12:57 2021
  GitCommit: ""
  GoVersion: go1.15.7
  OsArch: linux/amd64
  Version: 3.0.0

Package info (e.g. output of rpm -q podman or apt list podman):

podman-3.0.0-1.fc33.x86_64

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?

Yes

Additional environment details (AWS, VirtualBox, physical, etc.):
Bare-metal server with dual Xeon CPUs and 192 GB RAM.

I also have this in my /usr/lib/tmpfiles.d/podman.conf:

# /tmp/podman-run-* directory can contain content for Podman containers that have run
# for many days. This following line prevents systemd from removing this content.
x /tmp/podman-run-*
D! /run/podman 0700 root root
D! /var/lib/cni/networks

Labels: kind/bug, locked - please file new issue/PR, stale-issue