storage: node does not become live after restart until many hours later #28179

**BUG REPORT**

Please describe the issue you observed, and any steps we can take to reproduce it:

- Which version of CockroachDB are you using?
2.0.4

- What did you do?
Rebooted a node

- What did you expect to see?
Node to come up

- What did you see instead?
CockroachDB starts, but the node never becomes live.

- What was the impact?
Minor, rest of the cluster is live

I'm seeing lots of messages like this:
```
W180801 21:55:27.492922 343 storage/node_liveness.go:501  [n17,hb] slow heartbeat took 4.5s
W180801 21:55:27.492930 343 storage/node_liveness.go:438  [n17,hb] failed node liveness heartbeat: context deadline exceeded
W180801 21:55:28.035125 344 sql/jobs/registry.go:300  canceling all jobs due to liveness failure
```
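To gauge how often these warnings occur, a small script along these lines could tally them by source location. It is only a sketch: the regex assumes the line layout seen in the excerpt above (severity letter, date, time, goroutine id, `file:line`, message), and `tally_warnings` is a hypothetical helper name, not part of CockroachDB.

```python
import re
from collections import Counter

# Assumed layout, inferred from the excerpt above:
# <severity><yymmdd> <hh:mm:ss.micros> <goroutine id> <file>:<line>  <message>
LINE_RE = re.compile(
    r"^(?P<sev>[IWEF])(?P<date>\d{6}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6}) "
    r"(?P<goid>\d+) (?P<src>[^\s:]+:\d+)\s+(?P<msg>.*)$"
)


def tally_warnings(lines):
    """Count warning-severity lines per source location (file:line)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m.group("sev") == "W":
            counts[m.group("src")] += 1
    return counts


# The three lines quoted in this report, used as sample input.
sample = [
    "W180801 21:55:27.492922 343 storage/node_liveness.go:501  [n17,hb] slow heartbeat took 4.5s",
    "W180801 21:55:27.492930 343 storage/node_liveness.go:438  [n17,hb] failed node liveness heartbeat: context deadline exceeded",
    "W180801 21:55:28.035125 344 sql/jobs/registry.go:300  canceling all jobs due to liveness failure",
]

for src, n in tally_warnings(sample).most_common():
    print(f"{n:6d}  {src}")
```

To run it against the attached file instead of the sample, pass an open file handle, e.g. `tally_warnings(open("cockroach.log"))`.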

NTP sync and network connectivity seem good.

Log file for this node:
[cockroach.log](https://github.com/cockroachdb/cockroach/files/2251285/cockroach.log)

