Make one-shot units more robust #311

@margamanterola

Current situation
We have some one-shot units, such as coreos-metadata, that are not retried if they fail on their first run. They simply remain in the failed state.

Impact
For coreos-metadata this means that if the metadata service is unavailable when the machine boots, but later becomes available, the machine never recovers.

Ideal future situation
To make these units more robust, we should add Restart=on-failure together with a delay, such as RestartSec=10 or perhaps 1m. Unfortunately, there's no exponential backoff.

Additionally, we should consider adding RemainAfterExit=yes, so that these units don't get executed more than once if they get pulled in again as a wanted/required dependency. Otherwise, a later run while the server is unavailable could cause an existing file to be lost.
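
As a rough sketch, the settings proposed above could be tried out with a drop-in like the one below. The drop-in path and file name are hypothetical, the unit name coreos-metadata.service is assumed from the issue text, and the 10 second delay is just one of the two values floated above. It also assumes a systemd version that accepts Restart=on-failure for Type=oneshot services (I believe this needs 244 or later, but that should be checked against the target release).

```ini
# /etc/systemd/system/coreos-metadata.service.d/10-retry.conf
# Hypothetical drop-in path; the unit name is an assumption based on the issue.
[Service]
# Retry when the process exits non-zero, is killed by a signal, or times out.
Restart=on-failure
# Fixed delay between attempts (no backoff); 1m would be the other candidate.
RestartSec=10
# Keep the unit in the "active" state after a successful run, so that being
# pulled in again via Wants=/Requires= does not start it a second time.
RemainAfterExit=yes
```

After `systemctl daemon-reload`, running `systemctl show coreos-metadata.service -p Restart -p RestartUSec -p RemainAfterExit` should confirm whether the values were picked up.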
