oem-gce.service is broken after upgrading to flatcar 3139.2.0 #714
Closed
Labels
kind/bug (Something isn't working), platform/GCP (Related to Google Cloud Platform)
Description
We run a Kubernetes cluster on GCP nodes using Flatcar Container Linux. After updating the nodes from Flatcar 3033.2.4 to 3139.2.0 (the latest release), we noticed that oem-gce.service fails to start with the following logs:
core@master-k8s-exp-1-839c ~ $ journalctl -u oem-gce.service
Apr 12 09:29:25 localhost systemd[1]: Starting GCE Linux Agent...
Apr 12 09:29:26 localhost mkfs.ext4[981]: mke2fs 1.45.5 (07-Jan-2020)
Apr 12 09:29:26 localhost mkfs.ext4[981]: [82B blob data]
Apr 12 09:29:26 localhost mkfs.ext4[981]: Creating filesystem with 262144 4k blocks and 65536 inodes
Apr 12 09:29:26 localhost mkfs.ext4[981]: Filesystem UUID: 9376da71-5201-4916-b603-141a89463da7
Apr 12 09:29:26 localhost mkfs.ext4[981]: Superblock backups stored on blocks:
Apr 12 09:29:26 localhost mkfs.ext4[981]: 32768, 98304, 163840, 229376
Apr 12 09:29:26 localhost mkfs.ext4[981]: [41B blob data]
Apr 12 09:29:26 localhost mkfs.ext4[981]: [38B blob data]
Apr 12 09:29:26 localhost mkfs.ext4[981]: Creating journal (8192 blocks): done
Apr 12 09:29:27 localhost mkfs.ext4[981]: [75B blob data]
Apr 12 09:29:27 localhost umount[1033]: umount: /var/lib/flatcar-oem-gce.img: not mounted.
Apr 12 09:29:33 master-k8s-exp-1-839c.c.uw-dev.internal systemd-nspawn[1714]: Spawning container oem-gce on /var/lib/flatcar-oem-gce.img.
Apr 12 09:29:33 master-k8s-exp-1-839c.c.uw-dev.internal systemd-nspawn[1714]: Press ^] three times within 1s to kill container.
Apr 12 09:29:33 master-k8s-exp-1-839c.c.uw-dev.internal systemd[1]: Started GCE Linux Agent.
Apr 12 09:29:33 master-k8s-exp-1-839c.c.uw-dev.internal systemd-nspawn[1714]: + '[' -e /etc/default/instance_configs.cfg.template ']'
Apr 12 09:29:33 master-k8s-exp-1-839c.c.uw-dev.internal systemd-nspawn[1714]: + echo -e '[InstanceSetup]\nset_host_keys = false'
Apr 12 09:29:33 master-k8s-exp-1-839c.c.uw-dev.internal systemd-nspawn[1714]: + /usr/bin/google_instance_setup
Apr 12 09:29:33 master-k8s-exp-1-839c.c.uw-dev.internal systemd-nspawn[1714]: /init.sh: /usr/bin/google_instance_setup: /usr/lib/python-exec/python3.9/python3: bad interpreter: No such file or directory
Apr 12 09:29:33 master-k8s-exp-1-839c.c.uw-dev.internal systemd-nspawn[1714]: Container oem-gce failed with error code 126.
Apr 12 09:29:33 master-k8s-exp-1-839c.c.uw-dev.internal systemd[1]: oem-gce.service: Main process exited, code=exited, status=126/n/a
Apr 12 09:29:33 master-k8s-exp-1-839c.c.uw-dev.internal systemd[1]: oem-gce.service: Failed with result 'exit-code'
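The "bad interpreter" line above suggests that the shebang of /usr/bin/google_instance_setup points at an interpreter (/usr/lib/python-exec/python3.9/python3) that is absent from the container image. A minimal sketch of a check for this condition follows. The function name check_shebang and its optional ROOT argument are hypothetical and not part of Flatcar or the GCE agent; it only assumes a POSIX shell with head and sed.

```shell
#!/bin/sh
# Hypothetical helper, not part of Flatcar or the GCE agent.
# Usage: check_shebang SCRIPT [ROOT]
# Prints whether the interpreter named on SCRIPT's first line exists and is
# executable. ROOT is an optional prefix (for example a mounted image root)
# under which the interpreter path is resolved; it defaults to empty.
check_shebang() {
    script=$1
    root=${2:-}
    # Extract the first word after "#!" on line 1; empty if there is no shebang.
    interp=$(head -n 1 "$script" | sed -n 's/^#![[:space:]]*\([^[:space:]]*\).*/\1/p')
    if [ -z "$interp" ]; then
        echo "no shebang line: $script"
        return 2
    fi
    if [ -x "$root$interp" ]; then
        echo "ok: $interp"
    else
        echo "missing interpreter: $interp"
        return 1
    fi
}
```

For instance, with the OEM image mounted at a hypothetical mount point /mnt, `check_shebang /mnt/usr/bin/google_instance_setup /mnt` would report whether the interpreter exists inside the image rather than on the host.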
Impact
Since this service is critical for several GCP features, including setting up host IP routes, a lot of things broke. In particular, external load balancing to the cluster failed.
Environment and steps to reproduce
Running the latest Flatcar release on GCP nodes should be enough to observe this behaviour.
Additional information
Flatcar Container Linux by Kinvolk 3139.2.0 (Oklo)
Kernel: 5.15.32-flatcar
Kubernetes: 1.22.5
Container-Runtime: docker://20.10.12