
[Tracking] Calico Pod-to-pod connectivity interrupted on kernel 5.4 with Mellanox ConnectX4 Lx (MT27710 Family) #183

@iaguis

Flatcar Status: A workaround is in place for releases 2605.4.0 and above. This issue is kept for tracking the bug with newer kernel releases.

Description

On Kubernetes, connections between pods running on different nodes don't work on the current Flatcar Beta (2605.3.0) and Alpha (2605.1.0) releases. Stable (2512.4.0) works fine for me. My clusters run Calico.

Impact

My Kubernetes clusters don't work on Flatcar Beta and Alpha.

Environment and steps to reproduce

  1. Set-up: Flatcar running on Packet.
  2. Task: N/A
  3. Action(s): (see the "Additional information" section for more details)
    a. Started pod A with a web server
    b. Started pod B on a different node of the cluster
    c. Tried to connect from pod B to pod A
  4. Error: Connection timeout. However, ping works between the containers.
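The steps above can be sketched with kubectl; the pod names, node names, and images here are illustrative, not the exact commands used:

```shell
# Pod A: a web server, pinned to one worker node (names are examples)
kubectl run pod-a --image=nginx \
  --overrides='{"spec":{"nodeName":"worker-1"}}'

# Pod B: a client pod on a different node
kubectl run pod-b --image=busybox \
  --overrides='{"spec":{"nodeName":"worker-2"}}' -- sleep 3600

# Try to connect from pod B to pod A's pod IP
POD_A_IP=$(kubectl get pod pod-a -o jsonpath='{.status.podIP}')
kubectl exec pod-b -- wget -O- --timeout=5 "http://$POD_A_IP"  # times out on affected releases
kubectl exec pod-b -- ping -c 3 "$POD_A_IP"                    # ping still works
```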

Expected behavior

An established connection.

Additional information

  • I used the following lokocfg and Lokomotive v0.4.0:

    cluster "packet" {
      asset_dir = "./assets"
    
      cluster_name = "broken"
    
      controller_count = 1
      controller_type = "t1.small.x86"
    
      enable_tls_bootstrap = false
    
      os_channel = "alpha"
    
      dns {
        provider = "route53"
        zone = "example.net"
      }
    
      facility = "sjc1"
      project_id = "..."
    
      ssh_pubkeys = [
        "ssh-rsa AAAAB3...",
      ]
    
      management_cidrs = ["0.0.0.0/0"]
      node_private_cidr = "10.xxx.xxx.xxx/25"
    
      enable_aggregation = true
    
      oidc {}
    
      worker_pool "pool-1" {
        count = 2
        node_type = "c2.medium.x86"
    
        os_channel = "alpha"
      }
    }
    
  • I followed the instructions in the Kubernetes "Debug Services" guide to reproduce the issue.

  • I ran tcpdump and saw the packets arriving at the receiving host through the Calico tunl0 interface, but they never reach the receiving container's veth.
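A minimal sketch of that capture, assuming the pod serves on port 80 and a Calico veth name looked up beforehand (the veth name below is a placeholder):

```shell
# On the receiving node: packets arrive on the Calico IPIP tunnel interface
sudo tcpdump -ni tunl0 'tcp port 80'

# ...but nothing ever shows up on the destination pod's veth
# (find the real veth name via `ip route get <pod-ip>` or `ip link`)
sudo tcpdump -ni caliXXXXXXXXXXX 'tcp port 80'
```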

  • I checked the iptables counters on the receiving host and saw that this rule is triggered when I try to connect from pod B to pod A:

    core@test-broken-pool-1-worker-1 ~ $ sudo iptables-save -c | grep DROP | grep -v "\[0:0\]"
    ...
    [9:540] -A cali-fh-tunl0 -m comment --comment "cali:Su0l1tIx53hedKuv" -m conntrack --ctstate INVALID -j DROP
    ...
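Packets dropped as conntrack INVALID on a tunnel interface often point at checksum or offload problems. These are standard conntrack/ethtool/sysctl diagnostics, not a confirmed fix for this bug; interface names are assumptions:

```shell
# Watch the conntrack "invalid" counter climb while reproducing
sudo conntrack -S | grep -o 'invalid=[0-9]*'

# Inspect checksum offload settings on the NIC and the tunnel
sudo ethtool -k eth0  | grep checksum
sudo ethtool -k tunl0 | grep checksum

# If conntrack is rejecting packets over bad checksums, this sysctl
# makes it skip checksum validation (for diagnosis only)
sudo sysctl -w net.netfilter.nf_conntrack_checksum=0
```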
    
  • I tried fixes similar to the one mentioned in #181 ("systemd 245 (channel 2605) breaks cilium pod to out-of-node traffic"), but even setting rp_filter=0 on all interfaces doesn't fix the issue.
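For reference, "setting rp_filter=0 on all interfaces" amounts to something like the following (it did not help here):

```shell
# Disable reverse-path filtering globally and for every existing interface
sudo sysctl -w net.ipv4.conf.all.rp_filter=0
sudo sysctl -w net.ipv4.conf.default.rp_filter=0
for f in /proc/sys/net/ipv4/conf/*/rp_filter; do
  echo 0 | sudo tee "$f" >/dev/null
done
```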

Labels

  • channel/alpha: Issue concerns the Alpha channel.
  • channel/beta: Issue concerns the Beta channel.
  • kind/bug: Something isn't working.