Pod can be stuck in ContainerCreating state for ~15m with identity-allocation-mode=kvstore #12945

@christarazi

Bug report

A Pod can remain stuck in the ContainerCreating state for up to ~15m when identity-allocation-mode=kvstore is set. After ~15m, the Pod transitions to Running and everything proceeds normally. The delay (presumably in identity allocation) occurs when at least one etcd node is down. As long as the etcd cluster still has quorum (e.g. 2/3 nodes up), this delay should not happen.
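To make the quorum expectation concrete, here is a minimal sketch of the majority rule that etcd (via Raft) relies on. The function names are illustrative only and are not part of etcd or Cilium.

```python
def quorum_size(members: int) -> int:
    """Smallest majority of a cluster with `members` voting members."""
    return members // 2 + 1


def has_quorum(members: int, healthy: int) -> bool:
    """True if the healthy members still form a majority."""
    return healthy >= quorum_size(members)


# A 3-node cluster with one node blocked still has quorum (2 of 3),
# so reads and writes against the remaining nodes should succeed.
print(has_quorum(3, 2))  # True
print(has_quorum(3, 1))  # False
```

With 2 of 3 nodes reachable the cluster is still writable, which is why a ~15m stall on the Cilium side is unexpected.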

General Information

How to reproduce the issue

  1. Deploy a K8s cluster with etcd nodes.
  2. Deploy Cilium with the following config:
    • kvstore: etcd
    • identity-allocation-mode: kvstore
  3. Block one etcd node using the iptables rules in https://gist.github.com/christarazi/11aadf01d353112eb10ed82373569155. Note that this must be done before deploying the pods in the next step.
  4. Deploy nginx (ensure that at least one replica lands on the node whose Cilium instance has the above etcd iptables rules applied):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
    cool: stuff
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
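
One way to observe the delay after applying the manifest is to check how long each pod has been waiting in ContainerCreating. The following is a rough sketch, not part of the original report: the `stuck_pods` helper and the sample pod names are hypothetical, and it assumes the input has the shape of `kubectl get pods -l app=nginx -o json` output.

```python
from datetime import datetime, timezone


def stuck_pods(pod_list: dict, now: datetime) -> dict:
    """Map pod name -> seconds since creation for pods in ContainerCreating."""
    result = {}
    for pod in pod_list.get("items", []):
        statuses = pod.get("status", {}).get("containerStatuses", [])
        creating = any(
            (s.get("state", {}).get("waiting") or {}).get("reason")
            == "ContainerCreating"
            for s in statuses
        )
        if creating:
            # creationTimestamp is RFC 3339 in UTC, e.g. 2020-08-25T10:00:00Z
            created = datetime.strptime(
                pod["metadata"]["creationTimestamp"], "%Y-%m-%dT%H:%M:%SZ"
            ).replace(tzinfo=timezone.utc)
            result[pod["metadata"]["name"]] = (now - created).total_seconds()
    return result


# Hypothetical sample data: one pod still creating, one already running.
sample = {
    "items": [
        {
            "metadata": {"name": "nginx-a", "creationTimestamp": "2020-08-25T10:00:00Z"},
            "status": {"containerStatuses": [
                {"state": {"waiting": {"reason": "ContainerCreating"}}}
            ]},
        },
        {
            "metadata": {"name": "nginx-b", "creationTimestamp": "2020-08-25T10:00:00Z"},
            "status": {"containerStatuses": [
                {"state": {"running": {"startedAt": "2020-08-25T10:00:05Z"}}}
            ]},
        },
    ]
}
now = datetime(2020, 8, 25, 10, 12, 30, tzinfo=timezone.utc)
print(stuck_pods(sample, now))  # {'nginx-a': 750.0}
```

To run this against a real cluster, the saved `kubectl` JSON output could be loaded with `json.load` and passed in with `datetime.now(timezone.utc)` as `now`. A pod on the affected node would be expected to show a steadily growing value until it transitions to Running.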

Labels

  • kind/bug: This is a bug in the Cilium logic.
  • priority/high: This is considered vital to an upcoming release.
