Swap fails to mount due to a race condition #1729

@alehed

Bug

On its first run, Ignition usually fails because the swap device unit times out. If I then power-cycle the machine, Ignition goes through without errors on the second boot.

The problem seems to be a race condition: the boot disk preparation doesn't complete before the swap unit expects its device to be there. The bug doesn't happen on every first boot, but it does on most.

Operating System Version

I am running Ignition on the latest Fedora CoreOS version. The Ignition file was compiled from a Butane file on Linux with Butane v0.18.0.

Ignition Version

2.16.2

Environment

I am running Ignition on bare metal on a Raspberry Pi 3 (it also happens on the Raspberry Pi 4) in Fedora CoreOS (38.20230918.3.2).

Expected Behavior

Mounting swap succeeds on first boot.

Actual Behavior

The swap device fails to mount, causing the first boot to fail and drop into a recovery shell.

Reproduction Steps

  1. Run coreos-installer on an instance with two disks, giving it the Ignition file below.
  2. Reboot

Other Information

The following Ignition file can be used to reproduce this:

{
  "ignition": {
    "version": "3.4.0"
  },
  "passwd": {
    "users": [
      {
        "name": "core",
        "sshAuthorizedKeys": [
          "ssh-ed25519 ...."
        ]
      }
    ]
  },
  "storage": {
    "disks": [
      {
        "device": "/dev/disk/by-id/coreos-boot-disk",
        "partitions": [
          {
            "label": "root",
            "number": 4,
            "resize": true,
            "sizeMiB": 8192
          },
          {
            "label": "swap",
            "number": 5,
            "sizeMiB": 4096
          }
        ],
        "wipeTable": false
      },
      {
        "device": "/dev/disk/by-id/usb-...",
        "partitions": [
          {
            "label": "var",
            "number": 1
          }
        ],
        "wipeTable": false
      }
    ],
    "filesystems": [
      {
        "device": "/dev/disk/by-partlabel/swap",
        "format": "swap"
      },
      {
        "device": "/dev/disk/by-partlabel/var",
        "format": "ext4",
        "path": "/var"
      }
    ]
  },
  "systemd": {
    "units": [
      {
        "contents": "# Generated by Butane\n[Swap]\nWhat=/dev/disk/by-partlabel/swap\n\n[Install]\nRequiredBy=swap.target",
        "enabled": true,
        "name": "dev-disk-by\\x2dpartlabel-swap.swap"
      },
      {
        "contents": "# Generated by Butane\n[Unit]\nRequires=systemd-fsck@dev-disk-by\\x2dpartlabel-var.service\nAfter=systemd-fsck@dev-disk-by\\x2dpartlabel-var.service\n\n[Mount]\nWhere=/var\nWhat=/dev/disk/by-partlabel/var\nType=ext4\n\n[Install]\nRequiredBy=local-fs.target",
        "enabled": true,
        "name": "var.mount"
      }
    ]
  }
}
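
For readers wondering where the unit name `dev-disk-by\x2dpartlabel-swap.swap` comes from (it appears as `\\x2d` above only because of JSON string escaping): systemd requires a swap or mount unit to be named after the escaped form of its path. The Python below is a minimal sketch of that escaping, roughly what `systemd-escape --path` does. The function name is made up for illustration, and it skips some normalisation that systemd performs (for example, `..` components), so treat it as an approximation.

```python
import string

# Characters that systemd leaves unescaped in a unit name component.
_SAFE = set(string.ascii_letters + string.digits + ":_.")


def systemd_escape_path(path: str) -> str:
    """Approximate systemd's path escaping (hypothetical helper, simplified)."""
    # Drop empty and "." components, i.e. collapse repeated slashes and
    # strip the leading and trailing slash.
    parts = [p for p in path.split("/") if p and p != "."]
    if not parts:
        return "-"  # the root directory is escaped as a single dash
    trimmed = "/".join(parts)

    out = []
    for i, byte in enumerate(trimmed.encode("utf-8")):
        ch = chr(byte)
        if ch == "/":
            out.append("-")  # path separators become dashes
        elif ch in _SAFE and not (i == 0 and ch == "."):
            out.append(ch)
        else:
            # Everything else, including a literal "-", becomes \xNN.
            out.append(f"\\x{byte:02x}")
    return "".join(out)


if __name__ == "__main__":
    print(systemd_escape_path("/dev/disk/by-partlabel/swap") + ".swap")
    print(systemd_escape_path("/var") + ".mount")
```

The two printed names should match the `name` fields of the units in the config above.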

I didn't manage to reproduce this with a single disk, but since this is a race condition that doesn't always trigger, I cannot say for certain that it doesn't also happen without the separate var device.
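
To make the two-disk layout easier to see when comparing configs, here is a small Python sketch that maps each `by-partlabel` filesystem device in an Ignition config to the disk whose partition table defines that label. The function name and the trimmed sample config are illustrative only; the sketch just reads the JSON structure and says nothing about how Ignition orders its work internally.

```python
import json

PREFIX = "/dev/disk/by-partlabel/"


def partlabel_sources(config: dict) -> dict:
    """Map by-partlabel filesystem devices to the disk defining that label."""
    storage = config.get("storage", {})

    # Collect label -> disk device from the "disks" section.
    label_to_disk = {}
    for disk in storage.get("disks", []):
        for part in disk.get("partitions", []):
            if "label" in part:
                label_to_disk[part["label"]] = disk["device"]

    # Resolve each filesystem that is addressed by partition label.
    result = {}
    for fs in storage.get("filesystems", []):
        dev = fs["device"]
        if dev.startswith(PREFIX):
            result[dev] = label_to_disk.get(dev[len(PREFIX):])
    return result


if __name__ == "__main__":
    # Trimmed-down version of the config from this report.
    sample = json.loads("""
    {"storage": {
      "disks": [
        {"device": "/dev/disk/by-id/coreos-boot-disk",
         "partitions": [{"label": "root", "number": 4},
                        {"label": "swap", "number": 5}]},
        {"device": "/dev/disk/by-id/usb-...",
         "partitions": [{"label": "var", "number": 1}]}
      ],
      "filesystems": [
        {"device": "/dev/disk/by-partlabel/swap", "format": "swap"},
        {"device": "/dev/disk/by-partlabel/var", "format": "ext4"}
      ]}}
    """)
    for device, disk in partlabel_sources(sample).items():
        print(f"{device} <- {disk}")
```

For the config in this report, the swap partition is on the boot disk and the var partition is on the USB disk.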

The rdsosreport.txt looks like this.
