privval: duplicate SignerListener: Listening for new connection module=privval during chain upgrades #3828

@tony-iqlusion

Bug Report

Setup

  • CometBFT version: v0.38.9
  • Have you tried the latest version: no
  • ABCI app: gaiad
  • Environment:
    • OS: Debian 12
    • Install tools:
    • Others:

Config

priv_validator_laddr = "X.X.X.X:Y"

What happened?

During a chain upgrade, CometBFT appears to try to accept a second privval connection. TMKMS opens only one such connection, and that connection works and remains stable. Even so, the node keeps printing the lines shown under Logs in a loop. This is confusing to users during a chain upgrade, and it spams the logs at exactly the time people are watching them to figure out what happened.

What did you expect to happen?

These lines should not appear in the logs. They also seem to be a symptom of a deeper problem.

How to reproduce it

This is the tough part: we have only seen this during chain upgrades. It has affected several recent ones, including today's gaia v19 upgrade and recent Neutron upgrades.

It seems to happen with every recent chain upgrade when TMKMS is used as a remote signer.

The problem persists after the chain has started, but restarting the node once the upgrade has completed makes it go away.

Logs

2:36PM INF SignerListener: Listening for new connection module=privval
2:36PM ERR SignerListener: Error accepting connection err="accept tcp [::]:<port>: i/o timeout" module=privval
2:36PM INF SignerListener: Listening for new connection module=privval
... [etc] ...

The TMKMS side shows a stable connection without problems:

INFO tmkms::session: [cosmoshub-4@tcp://x.x.x.x:y] connected to validator successfully
WARN tmkms::session: [cosmoshub-4@tcp://x.x.x.x:y]: unverified validator peer ID! (ac9e6866a2b448a4444514b808f564bffe9d2b5d)
INFO tmkms::session: [cosmoshub-4@tcp://x.x.x.x:y]: signed Prevote:<nil> at h/r/s 21835201/0/1 (101 ms)

Anything else we need to know

This doesn't actually break anything, but it is very confusing during chain upgrades and introduces a red herring when people are trying to debug TMKMS issues. For example, people interpreted this error as a possible problem with enabling vote extension signing, even though everything was working.

Metadata

Assignees

No one assigned

Labels

  • bug: Something isn't working
  • needs-triage: This issue/PR has not yet been triaged by the team.