roachprod: validate host before running a command #89437

@renatolabs

Describe the problem

When running a roachprod command that doesn't synchronize the local cache, the underlying cluster may have expired and the IPs associated with its VMs may have been reused by a new cluster. In such cases, roachprod will happily run the commands on those hosts, thinking they belong to the old cluster.

This has happened in #88445, and debugging the aftermath of this type of error can be very puzzling and time-consuming.

Desired behavior

roachprod could validate the host before running a command. This could be done by checking hostnames, creation timestamps, or something else.

Jira issue: CRDB-20253

Metadata

Assignees

No one assigned

    Labels

    A-testing: Testing tools and infrastructure
    C-bug: Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
    T-testeng: TestEng Team
