Change of behavior of routing packets since docker v25 #47145

@pseusys

Description

In my project, for testing purposes, I have the following setup:
There are two containers, client and server. Both are connected to an internal bridge network; server is also connected to a regular bridge network.
When client tries to connect to an external IP (e.g. 8.8.8.8), it cannot reach that address directly, since the container is connected to an internal network only. In that case the packet is marked by iptables (-j MARK), then directed to a dedicated routing table (using ip rule fwmark lookup), and via this table it is routed to the server container and from there to the internet.

This setup worked with Docker 24, but since Docker 25 packets are no longer forwarded and the connection fails with the error message [Errno 101] Network is unreachable. Unfortunately, I am not able to try the same setup with two physical computers or virtual machines, so could you please help me find out which behavior is expected for two real machines?
Would my setup work for two physical computers instead of two containers (as in Docker 24), or would it fail (as in Docker 25)?
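
For locally generated packets, the kernel performs its first route lookup before the packet reaches the mangle OUTPUT chain, so the mark can only trigger a re-route if that first lookup finds some route. A minimal sketch for checking this inside the client container follows; `describe_main_table` is a hypothetical helper, not part of the original scripts, and whether a missing default route is what actually changed between Docker 24 and 25 is an assumption, not something confirmed here.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original scripts): reads routing-table
# text on stdin, in the format printed by `ip route show`, and reports whether
# it contains a default route.
describe_main_table() {
    if grep -q '^default'; then
        echo "default route present"
    else
        echo "no default route"
    fi
}

# Intended use inside the client container:
#   ip route show table main | describe_main_table
# With the full iproute2 package (an assumption; the Dockerfile only installs
# iptables), the lookup for a marked packet can also be simulated:
#   ip route get 8.8.8.8 mark 0x41
```

If the main table has no default route and no other route matching 8.8.8.8, the first lookup fails with "Network is unreachable" and the MARK rule is never evaluated.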

Reproduce

Here's a slightly simplified version of my setup:

compose.yml:

version: "3.3"

services:
  client:
    build:
      dockerfile: Dockerfile
    privileged: true
    entrypoint: [ "/bin/sh", "client-startup.sh" ]
    networks:
      internal_network:
        ipv4_address: 10.65.0.65
    depends_on:
      - server

  server:
    build:
      dockerfile: Dockerfile
    privileged: true
    entrypoint: [ "/bin/sh", "server-startup.sh" ]
    networks:
      internal_network:
        ipv4_address: 10.65.0.87
      external_network:
        ipv4_address: 10.87.0.87


networks:
  internal_network:
    internal: true
    ipam:
      config:
        - subnet: 10.65.0.0/24
          gateway: 10.65.0.1
  external_network:
    ipam:
      config:
        - subnet: 10.87.0.0/24
          gateway: 10.87.0.1

Dockerfile:

FROM alpine:3.18

RUN apk add --no-cache iptables

COPY ./*.sh ./

client-startup.sh:

#! /bin/sh

echo "setup iptables..."
iptables -t mangle -A OUTPUT --src "10.65.0.65" -o eth0 --dst "10.65.0.87" -j ACCEPT
iptables -t mangle -A OUTPUT -o eth0 ! --dst "10.65.0.0/24" -j MARK --set-mark 0x41
iptables -t mangle -A OUTPUT -o eth0 ! --dst "10.65.0.0/24" -j ACCEPT

echo "setup ip rule and route..."
ip rule add fwmark 0x41 lookup 65
ip route replace default via "10.65.0.87" table 65

echo "check connection..."
wget -T 5 -O - "8.8.8.8"

echo "check config..."
ip rule list
ip route show table 65
iptables -t mangle -vnL
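
The "check config" step above prints the rules for manual inspection. As a sketch, the same check can be automated by searching the `ip rule list` output for the expected fwmark rule; `has_fwmark_rule` is a hypothetical helper name, and it only handles rules printed without a mask (e.g. `fwmark 0x41`, not `fwmark 0x41/0xff`).

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the `ip rule list` text on stdin contains a
# rule that sends packets with fwmark $1 to routing table $2.
# The table name must be followed by whitespace or end of line, so that
# table 65 does not match a rule for table 650.
has_fwmark_rule() {
    grep -Eq "fwmark $1 lookup $2([[:space:]]|\$)"
}

# Intended use inside the client container:
#   ip rule list | has_fwmark_rule 0x41 65 && echo "rule installed"
```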

server-startup.sh:

#! /bin/sh

echo 1 > /proc/sys/net/ipv4/ip_forward

iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
iptables -t nat -A POSTROUTING -o eth1 -j MASQUERADE
iptables -A FORWARD -i eth0 -j ACCEPT
iptables -A FORWARD -i eth1 -j ACCEPT

sleep infinity
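
To rule out the server side, one can confirm that forwarding is enabled and that the MASQUERADE rules were installed. The sketch below uses two hypothetical helpers (`forwarding_enabled`, `has_masquerade_rule`) and assumes the rule text has the form printed by `iptables -t nat -S POSTROUTING`.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the file named by $1 (default: the kernel's
# IPv4 forwarding switch) contains 1.
forwarding_enabled() {
    [ "$(cat "${1:-/proc/sys/net/ipv4/ip_forward}")" = "1" ]
}

# Hypothetical helper: succeeds if the rule listing on stdin contains a
# MASQUERADE rule for output interface $1.
has_masquerade_rule() {
    grep -q -e "-o $1 -j MASQUERADE"
}

# Intended use inside the server container:
#   forwarding_enabled && echo "forwarding on"
#   iptables -t nat -S POSTROUTING | has_masquerade_rule eth1 && echo "NAT on eth1"
```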

Expected behavior

After running docker compose up --build client, I expect to see a connection message, but instead I see something like wget: can't connect to remote host (8.8.8.8): Network unreachable.

docker version

Client: Docker Engine - Community
 Version:           25.0.0
 API version:       1.44
 Go version:        go1.21.6
 Git commit:        e758fe5
 Built:             Thu Jan 18 17:09:59 2024
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          25.0.0
  API version:      1.44 (minimum version 1.24)
  Go version:       go1.21.6
  Git commit:       615dfdf
  Built:            Thu Jan 18 17:09:59 2024
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.27
  GitCommit:        a1496014c916f9e62104b33d1bb5bd03b0858e59
 runc:
  Version:          1.1.11
  GitCommit:        v1.1.11-0-g4bccb38
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

docker info

Client: Docker Engine - Community
 Version:    25.0.0
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.12.1
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.24.1
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 4
  Running: 4
  Paused: 0
  Stopped: 0
 Images: 928
 Server Version: 25.0.0
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: a1496014c916f9e62104b33d1bb5bd03b0858e59
 runc version: v1.1.11-0-g4bccb38
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 5.15.0-91-generic
 Operating System: Linux Mint 21.3
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 15.49GiB
 Name: ascension
 ID: ICV5:5ZVX:MTEJ:XVHM:C72A:IQ37:IU3B:RSAV:X625:WWPZ:PNGO:KS7W
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Additional Info

Again, I'm not claiming that either the current or the previous behavior is correct; I've just noticed a (potentially undocumented?) change and would like to point it out. Also, if I'm doing something wrong, it would be great if you could spare a minute and advise me on what the problem with my approach is.

Metadata

Labels

area/networking, kind/bug, status/0-triage, version/25.0
