Equinix Metal tests are failing on all channels because of an emergency shell #1536

@tormath1

Description

In the latest builds (on all channels), we noticed this:

     --- PASS: kubeadm.v1.29.2.flannel.base/node_readiness (10.52s)
     --- PASS: kubeadm.v1.29.2.flannel.base/nginx_deployment (5.47s)
     --- PASS: kubeadm.v1.29.2.flannel.base/NFS_deployment (17.22s)
         harness.go:598: Found emergency shell on machine 704a8550-4d86-4a00-9fe4-70c5a6e00cc8 console
         harness.go:598: Found emergency shell on machine 35dedd13-6dd9-4795-9d74-3e09126a3045 console
         harness.go:598: Found emergency shell on machine 66848575-9be0-4680-923b-8bd52b4f65ed console

The tests themselves pass, but an emergency shell is found in the console logs, so each test is reported as a failure.
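As a rough illustration of why a passing test can still be reported as failed, here is a minimal sketch of a console check of this kind. It is not the actual code at `harness.go:598`: the function name `findEmergencyShell` and the marker string `emergencyMarker` are assumptions made for the example, and the real harness may match different text.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// emergencyMarker is an assumed marker string for this sketch;
// the real check in harness.go may look for different text.
const emergencyMarker = "Entering emergency mode"

// findEmergencyShell (hypothetical helper) returns the 1-based numbers
// of the console lines that contain the marker.
func findEmergencyShell(console string) []int {
	var hits []int
	scanner := bufio.NewScanner(strings.NewReader(console))
	line := 0
	for scanner.Scan() {
		line++
		if strings.Contains(scanner.Text(), emergencyMarker) {
			hits = append(hits, line)
		}
	}
	return hits
}

func main() {
	// Made-up console output, only to exercise the helper.
	console := "Starting Ignition (mount)...\n" +
		"Entering emergency mode. Exit the shell to continue.\n"
	if hits := findEmergencyShell(console); len(hits) > 0 {
		fmt.Printf("Found emergency shell on console (lines %v)\n", hits)
	}
}
```

A check like this runs on the console output independently of the test body, which would explain how subtests such as `node_readiness` can pass while the overall run is still marked as failed.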

Impact

Tests are failing on Equinix Metal, and CI runs are sometimes still going after 14 hours because the CI keeps rerunning the failed tests, on top of the provisioning time.

Environment and steps to reproduce

  1. Run the Equinix Metal tests in the CI, or reproduce locally with an Equinix Metal account

Expected behavior

Tests pass and no emergency shell appears on the console.

Additional information

systemd-networkd is not running when ignition-mount starts, but Ignition tries to query the Equinix Metal metadata service at every stage, so the request fails. This does not look like a systemd issue: LTS is affected too, and on LTS the breakage appeared between the last release (which was working) and the current nightlies: flatcar/scripts@lts-3510.3.5...lts-3510.3.5-nightly-20240830-2100. Could it be a kernel issue?
