Spire fails to start if on recent version #40533
Open
Labels
- area/agent: Cilium agent related.
- area/servicemesh: GH issues or PRs regarding servicemesh
- feature/authentication
- help-wanted: You can help! Post a detailed plan on the issue or create a PR to solve this issue.
- kind/bug: This is a bug in the Cilium logic.
- kind/community-report: This was reported by a user in the Cilium community, e.g. via Slack.
- pinned: These issues are not marked stale by our issue bot.
Description
Is there an existing issue for this?
- I have searched the existing issues
Version
equal or higher than v1.17.5 and lower than v1.18.0
What happened?
Bringing up an EKS cluster with Cilium from scratch, using Spire v1.12.4 instead of the default v1.9.6, fails with:
time="2025-07-15T12:07:21Z" level=info msg="DataStore closed" subsystem_name=catalog
time="2025-07-15T12:07:21Z" level=error msg="Fatal run error" error="listen unix /tmp/spire-server/private/api.sock: bind: permission denied"
time="2025-07-15T12:07:21Z" level=error msg="Server crashed" error="listen unix /tmp/spire-server/private/api.sock: bind: permission denied"
How can we reproduce the issue?
- Install Cilium with Spire enabled
- Override Spire version with 1.12.4
- Start
Cilium Version
1.17.5
Kernel Version
$ uname -a
Linux ip-10-0-30-117.eu-central-2.compute.internal 6.1.141-155.222.amzn2023.aarch64 #1 SMP Tue Jun 17 10:29:19 UTC 2025 aarch64 aarch64 aarch64 GNU/Linux
Kubernetes Version
Client Version: v1.31.2
Kustomize Version: v5.4.2
Server Version: v1.32.5-eks-5d4a308
Regression
Yes
Sysdump
No response
Relevant log output
Anything else?
To save the trouble of finding the SHAs and versions:
SERVER versions
1.10.4: sha256:4e77ca7017e5279af36cd2bc2fa9ba06a164b5bc352404a03bc4812881498ee8
1.11.3: sha256:fb2829b45b9c2b1ee47c3264de76375f03e9c0a0ae37188404b103aaeccadb46
1.12.4: sha256:34147f27066ab2be5cc10ca1d4bfd361144196467155d46c45f3519f41596e49
AGENT versions
1.10.4: sha256:ecb3e00fa38d38e0166a678e4c4c1e1b3517b63a37e2a57154e7f5d5d0ee1f98
1.11.3: sha256:38b39a91441d646aba5fa518fffda30a14918f24af117b483529a271a96e5001
1.12.4: sha256:163970884fba18860cac93655dc32b6af85a5dcf2ebb7e3e119a10888eff8fcd
Deployed via OpenTofu with:
resource "helm_release" "cilium" {
  repository = "https://helm.cilium.io/"
  chart      = "cilium"
  version    = var.cilium_version
  name       = "cilium"
  namespace  = "kube-system"

  # allow ample time at cluster bootstrap mainly due to ASG nodes coming up
  timeout = "900"

  set {
    name  = "cni.exclusive"
    value = "true"
  }

  set {
    name  = "enableIPv4Masquerade"
    value = "false"
  }

  set {
    name  = "routingMode"
    value = "native"
  }

  set {
    name  = "ipam.mode"
    value = "eni"
  }

  set {
    name  = "eni.enabled"
    value = "true"
  }

  set {
    name  = "eni.awsEnablePrefixDelegation"
    value = "true"
  }

  set {
    name  = "egressMasqueradeInterfaces"
    value = "ens+"
  }

  set {
    name  = "endpointRoutes.enabled"
    value = "true"
  }

  set {
    name  = "hubble.relay.enabled"
    value = "true"
  }

  set {
    name  = "hubble.ui.enabled"
    value = "true"
  }

  set {
    name  = "hubble.metrics.enabled"
    value = "{dns,drop,tcp,flow,port-distribution,icmp,httpV2:exemplars=true;labelsContext=source_ip\\,source_namespace\\,source_workload\\,destination_ip\\,destination_namespace\\,destination_workload\\,traffic_direction}"
  }

  # enables cilium agent metrics
  set {
    name  = "prometheus.enabled"
    value = "true"
  }

  # enables cilium operator metrics
  set {
    name  = "operator.prometheus.enabled"
    value = "true"
  }

  # given this, do not install kube-proxy or expect to see it
  set {
    name  = "kubeProxyReplacement"
    value = "true"
  }

  set {
    name  = "k8sServiceHost"
    value = replace(module.eks.cluster_endpoint, "https://", "")
  }

  set {
    name  = "k8sServicePort"
    value = "443"
  }

  # e2e encryption, node <-> pod, pod <-> pod
  set {
    name  = "encryption.enabled"
    value = "true"
  }

  set {
    name  = "encryption.type"
    value = "wireguard"
  }

  set {
    name  = "encryption.nodeEncryption"
    value = "true"
  }

  # mTLS (through SPIRE) - using values block for complex objects
  values = [
    yamlencode({
      authentication = {
        mutual = {
          spire = {
            install = {
              agent = {
                image = {
                  digest     = var.cilium_spire_version.agent.sha256
                  repository = "ghcr.io/spiffe/spire-agent"
                  tag        = var.cilium_spire_version.agent.version
                  pullPolicy = "IfNotPresent"
                  useDigest  = true
                }
              }
              server = {
                image = {
                  digest     = var.cilium_spire_version.server.sha256
                  repository = "ghcr.io/spiffe/spire-server"
                  tag        = var.cilium_spire_version.server.version
                  pullPolicy = "IfNotPresent"
                  useDigest  = true
                }
                # changed from the default helm value v1.9.6 to 1.10.x
                # If upgrading, you need to fiddle with EBS permissions or start over
                # See PR #3722
                podSecurityContext = {
                  runAsUser  = 1000
                  runAsGroup = 1000
                  fsGroup    = 1000
                }
              }
            }
          }
        }
      }
    })
  ]

  set {
    name  = "authentication.mutual.spire.enabled"
    value = "true"
  }

  set {
    name  = "authentication.mutual.spire.install.enabled"
    value = "true"
  }

  # node firewall
  set {
    name  = "hostFirewall.enabled"
    value = "true"
  }

  set {
    name  = "envoy.idleTimeoutDurationSeconds"
    value = "180"
  }

  set {
    name  = "operator.replicas"
    value = "3"
  }

  set {
    name  = "dnsProxy.dnsRejectResponseCode"
    value = "nameError"
  }
}
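The configuration above reads `var.cilium_spire_version`, but the report does not show how that variable is declared. For anyone trying to reproduce, a minimal sketch could look like the following. The object shape is inferred from the attribute paths used above (`agent.version`, `agent.sha256`, `server.version`, `server.sha256`); the description text, the use of a `default`, the tag format, and passing the digest with its `sha256:` prefix are assumptions. The values are the 1.12.4 entries listed earlier in this report.

```hcl
# Hypothetical declaration: the original report does not include it.
# Shape inferred from var.cilium_spire_version.{agent,server}.{version,sha256}.
variable "cilium_spire_version" {
  description = "SPIRE agent and server image tags with their digests"

  type = object({
    agent = object({
      version = string
      sha256  = string
    })
    server = object({
      version = string
      sha256  = string
    })
  })

  # Values for 1.12.4, copied from the list in this report.
  # Assumption: the digest is passed with its "sha256:" prefix, as listed.
  default = {
    agent = {
      version = "1.12.4"
      sha256  = "sha256:163970884fba18860cac93655dc32b6af85a5dcf2ebb7e3e119a10888eff8fcd"
    }
    server = {
      version = "1.12.4"
      sha256  = "sha256:34147f27066ab2be5cc10ca1d4bfd361144196467155d46c45f3519f41596e49"
    }
  }
}
```

Swapping in the 1.10.4 or 1.11.3 pairs from the same list should make it possible to check which Spire versions hit the `bind: permission denied` error.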
Cilium Users Document
- Are you a user of Cilium? Please add yourself to the Users doc
Code of Conduct
- I agree to follow this project's Code of Conduct