[Zen2] Introduce gossip-like discovery of master nodes#32246
[Zen2] Introduce gossip-like discovery of master nodes#32246DaveCTurner merged 77 commits intoelastic:zen2from
Conversation
This commit introduces the `PeerFinder` which can be used to collect the identities of the master-eligible nodes in a masterless cluster, based on the `UnicastHostsProvider`, the nodes in the `ClusterState`, and nodes that other nodes have discovered.
|
Pinging @elastic/es-distributed |
ywelsch
left a comment
There was a problem hiding this comment.
I've done an initial round. I've left mostly smaller comments, the general flow and the testing looks good.
|
|
||
| public abstract class PeerFinder extends AbstractLifecycleComponent { | ||
|
|
||
| public static final String REQUEST_PEERS_ACTION_NAME = "internal:discovery/zen2/requestpeers"; |
There was a problem hiding this comment.
should we remove the zen2 part here and just have this under discovery? Also maybe request_peers?
|
|
||
| // the time between attempts to find all peers | ||
| public static final Setting<TimeValue> CONSENSUS_FIND_PEERS_DELAY_SETTING = | ||
| Setting.timeSetting("discovery.zen2.find_peers_delay", |
| public static final String REQUEST_PEERS_ACTION_NAME = "internal:discovery/zen2/requestpeers"; | ||
|
|
||
| // the time between attempts to find all peers | ||
| public static final Setting<TimeValue> CONSENSUS_FIND_PEERS_DELAY_SETTING = |
There was a problem hiding this comment.
consensus? 🗡
just FIND_PEERS_DELAY_SETTING
I also wonder if we should call this FIND_PEERS_INTERVAL_SETTING
There was a problem hiding this comment.
Also, register the setting please
| return getFoundPeersSet(); | ||
| } | ||
|
|
||
| public boolean foundQuorumFrom(VotingConfiguration votingConfiguration) { |
There was a problem hiding this comment.
This method is not used anywhere (also not in tests?)
| this.executorServiceFactory = executorServiceFactory; | ||
| } | ||
|
|
||
| public Iterable<DiscoveryNode> getFoundPeers() { |
There was a problem hiding this comment.
this method is not used anywhere (also not in tests?).
|
|
||
| private class ActivePeerFinder { | ||
| private boolean running; | ||
| private final Map<DiscoveryNode, FoundPeer> foundPeers; |
There was a problem hiding this comment.
can you make this package-visible? It's accessed by PeerFinder.
There was a problem hiding this comment.
Yep, fixed. Is there a tool to tell you this? Also running.
| } | ||
|
|
||
| logger.trace("startProbe({}) found disconnected {}, probing again", transportAddress, cachedNode); | ||
| connectedNodes.remove(transportAddress, cachedNode); |
There was a problem hiding this comment.
should we also clean-up foundPeers?
There was a problem hiding this comment.
I think not, but added a TODO for further discussion.
There was a problem hiding this comment.
foundPeers is no longer a thing - we just use connectedNodes throughout.
| import java.util.Optional; | ||
|
|
||
| public class PeersResponse extends TransportResponse { | ||
| private final Optional<DiscoveryNode> masterNode; |
There was a problem hiding this comment.
I wonder if it would be nicer to have this as a boolean (isActiveMaster) stating whether the current node providing the peer response is an active master node, and have the list of discoveryNodes just called masterNodes.
There was a problem hiding this comment.
As it is now, if you contact a follower the next step is just to contact the follower's leader. With a flag you'd go ahead and contact all the follower's peers before discovering that one of them is the leader.
In a properly-configured cluster it doesn't seem to make much difference, but if badly configured (e.g. the leader is not in all the unicast hosts lists, and there are lots of master-eligible nodes) then there's less traffic this way.
There was a problem hiding this comment.
not sure I follow (no pun intended). A follower would return isActiveMaster = false + the active master as singleton in the list of masterEligibleNodes, which would lead to exact same behavior as what's currently in the PR if I understand this correctly.
There was a problem hiding this comment.
We discussed on Zoom. There's no strong argument either way. I slightly prefer that as it is now there's a difference in the responses between a follower and a candidate that knows of a single node, although we don't use this difference anywhere. I also see that moving to a single list (a singleton from followers) or a boolean would remove a couple of lines of code that deal with the master node as a special case. We'll leave it as is.
|
|
||
| public class PeersRequest extends TransportRequest { | ||
| private final DiscoveryNode sourceNode; | ||
| private final List<DiscoveryNode> discoveryNodes; |
There was a problem hiding this comment.
should we call this masterNodes (or masterEligibleNodes)?
There was a problem hiding this comment.
Renamed to candidateNodes to reflect that they're only there if there isn't a known leader.
| /** | ||
| * Label a <code>Runnable</code>, overriding its <code>toString()</code> method. | ||
| */ | ||
| public static Runnable labelRunnable(final Runnable runnable, final String label) { |
There was a problem hiding this comment.
this could be dangerous if an abstract runnable is passed in
add an assertion that it's not an AbstractRunnable?
There was a problem hiding this comment.
Ah yes, this is bad. I just removed it.
…st pass it in at activation
| * under the License. | ||
| */ | ||
|
|
||
| package org.elasticsearch.cluster.coordination; |
|
|
||
| if (active == false) { | ||
| logger.trace("ActivePeerFinder#handleWakeUp(): not running"); | ||
| logger.trace("PeerFinder: not running"); |
There was a problem hiding this comment.
do we even need to say that this is PeerFinder? We have a logger for PeerFinder that already spits out the class name (same thing for other logging occurrences in this class).
| active = false; | ||
| handleWakeUp(); | ||
| } | ||
| active = false; |
There was a problem hiding this comment.
The tests deactivate the PeerFinder at the end, regardless of whether it's active or not, and there's no particular reason to avoid double-deactivation or to make sure that each test ends with an active PeerFinder.
Also although I am pretty sure that we can't change the leader without being active, this is not something we guarantee, nor do I think we need to.
The `PeerFinder`, introduced in elastic#32246, obtains the collection of seed addresses configured by the user from a `ConfiguredHostsResolver`. In reality this collection comes from the `UnicastHostsProvider` via a slightly complicated threading model that performs the resolution of hostnames to addresses using a dedicated `ExecutorService`. This commit introduces an adapter to allow the `PeerFinder` to obtain its seed addresses in this manner.
The `PeerFinder`, introduced in elastic#32246, needs to be able to identify, and connect to, a remote master node using only its `TransportAddress`. This can be done by opening a single-channel connection to the address, performing a handshake, and only then forming a full-blown connection to the node. This change implements this logic.
The `PeerFinder`, introduced in #32246, needs to be able to identify, and connect to, a remote master node using only its `TransportAddress`. This can be done by opening a single-channel connection to the address, performing a handshake, and only then forming a full-blown connection to the node. This change implements this logic.
The `PeerFinder`, introduced in #32246, obtains the collection of seed addresses configured by the user from a `ConfiguredHostsResolver`. In reality this collection comes from the `UnicastHostsProvider` via a slightly complicated threading model that performs the resolution of hostnames to addresses using a dedicated `ExecutorService`. This commit introduces an adapter to allow the `PeerFinder` to obtain its seed addresses in this manner.
Today, CapturingTransport#createCapturingTransportService creates a transport service with a connection manager with reasonable default behaviours, but overriding this behaviour in a consumer is a litle tricky. Additionally, the default behaviour for opening a connection duplicates the content of the CapturingTransport#openConnection() method. This change removes this duplication by delegating to openConnection() and introduces overridable nodeConnected() and onSendRequest() methods so that consumers can alter this behaviour more easily. Relates elastic#32246 in which we test the mechanisms for opening connections to unknown (and possibly unreachable) nodes.
Today, CapturingTransport#createCapturingTransportService creates a transport service with a connection manager with reasonable default behaviours, but overriding this behaviour in a consumer is a litle tricky. Additionally, the default behaviour for opening a connection duplicates the content of the CapturingTransport#openConnection() method. This change removes this duplication by delegating to openConnection() and introduces overridable nodeConnected() and onSendRequest() methods so that consumers can alter this behaviour more easily. Relates #32246 in which we test the mechanisms for opening connections to unknown (and possibly unreachable) nodes.
Today, CapturingTransport#createCapturingTransportService creates a transport service with a connection manager with reasonable default behaviours, but overriding this behaviour in a consumer is a litle tricky. Additionally, the default behaviour for opening a connection duplicates the content of the CapturingTransport#openConnection() method. This change removes this duplication by delegating to openConnection() and introduces overridable nodeConnected() and onSendRequest() methods so that consumers can alter this behaviour more easily. Relates #32246 in which we test the mechanisms for opening connections to unknown (and possibly unreachable) nodes.
This commit introduces the
PeerFinderwhich can be used to collect theidentities of the master-eligible nodes in a masterless cluster, based on the
UnicastHostsProvider, the nodes in theClusterState, and nodes that othernodes have discovered.