`orchestrator` runs as a highly available service. This document lists the various ways of achieving HA for `orchestrator`, as well as setups that are less highly available or not highly available at all.
HA is achieved by choosing either:
- `orchestrator/raft` setup, where `orchestrator` nodes communicate via `raft` consensus. Each `orchestrator` node has a private database backend, either `MySQL` or `sqlite`. See also the `orchestrator/raft` documentation.
- Shared backend setup. Multiple `orchestrator` nodes all talk to the same backend, which may be a Galera/XtraDB Cluster/InnoDB Cluster/NDB Cluster. Synchronization is done at the database level.
See also: `orchestrator/raft` vs. synchronous replication setup
You may choose different availability types, based on your requirements.
- No high availability: the easiest, simplest setup; good for testing or dev setups. Can use `MySQL` or `sqlite`.
- Semi HA: the backend is based on normal MySQL replication. `orchestrator` does not eat its own dog food and cannot fail over its own backend.
- HA: as depicted above; supports no single point of failure. Different solutions have different tradeoffs in terms of resource utilization, supported software, and type of client access.
Discussed below are all options.
The no-high-availability setup is good for CI testing, for local dev machines, or for other experiments. It is a single `orchestrator` node with a single DB backend.
The DB backend may be a MySQL server, or it may be a sqlite DB bundled with `orchestrator` (no dependencies, no additional software required).
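For illustration, a minimal `orchestrator.conf.json` backend section for this setup could look like the sketch below. `BackendDB` and `SQLite3DataFile` are the relevant configuration keys; the listen address and the data file path are placeholders you would adapt:

```json
{
  "ListenAddress": ":3000",
  "BackendDB": "sqlite",
  "SQLite3DataFile": "/var/lib/orchestrator/orchestrator.db"
}
```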
The semi HA setup provides partial high availability for `orchestrator`. Two variations are available:

- Multiple `orchestrator` nodes talk to the same backend database. HA of the `orchestrator` services is achieved. However, HA of the backend database is not. The backend database may be a `master` with replicas, but `orchestrator` is unable to eat its own dog food and fail over its very own backend DB. If the backend `master` dies, it takes someone or something else to fail over the `orchestrator` service onto a promoted replica.
- Multiple `orchestrator` services all talk to a proxy server, which load balances an active-active `MySQL` master-master setup with `STATEMENT` based replication (see the HAProxy sketch following this list).
  - The proxy always directs to the same server (e.g. the `first` algorithm for `HAProxy`) unless that server is dead.
  - Death of the active master causes `orchestrator` to talk to the other master, which may be somewhat behind. `orchestrator` will typically self-reapply the missing changes by nature of its continuous discovery.
  - `orchestrator` queries guarantee that `STATEMENT` based replication will not cause duplicate-key errors, and the master-master setup will always reach consistency.
  - `orchestrator` is able to recover from the death of a backend master even in the middle of running a recovery (the recovery will re-initiate on the alternate master).
  - Split brain is possible. Depending on your setup, physical locations and type of proxy, different `orchestrator` service nodes may end up speaking to different backend `MySQL` servers. This can lead to two `orchestrator` services considering themselves "active", both running failovers independently, which would lead to topology corruption.
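For illustration only, a proxy in front of such an active-active backend could use HAProxy's `first` balancing algorithm, so that every `orchestrator` node keeps writing to the same master for as long as it is alive. This is a sketch, not an official configuration; host names, ports and timeouts are placeholders:

```
# Sketch: route all orchestrator backend traffic to the first live MySQL master.
listen orchestrator-mysql-backend
  bind 127.0.0.1:3306
  mode tcp
  timeout connect 5s
  timeout client  30m
  timeout server  30m
  balance first
  option tcp-check
  server mysql-m1 mysql-m1.example.com:3306 check
  server mysql-m2 mysql-m2.example.com:3306 check
```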
To access your orchestrator service you may speak to any healthy node.
Both these setups are well known to run in production for very large environments.
In the shared backend setup, HA is achieved by a highly available shared backend. Existing solutions are:
- Galera
- XtraDB Cluster
- InnoDB Cluster
- NDB Cluster
In all of the above the MySQL nodes run synchronous replication (using the common terminology).
Two variations exist:
- Your Galera/XtraDB Cluster/InnoDB Cluster runs with a single writer node. Multiple `orchestrator` nodes speak to the single writer DB, probably via a proxy. If the writer DB fails, the backend cluster promotes a different DB as writer; it is up to your proxy to identify that and direct `orchestrator`'s traffic to the promoted server (a backend configuration sketch follows this list).
- Your Galera/XtraDB Cluster/InnoDB Cluster runs in multiple-writers mode. A nice setup would couple each `orchestrator` node with a DB server (possibly on the very same box). Since replication is synchronous there is no split brain. Only one `orchestrator` node can ever be the leader, and that leader will only speak with a consensus of the DB nodes.
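As a sketch of the single-writer variant, each `orchestrator` node's backend settings would point at the proxy (or the cluster's writer endpoint) rather than at an individual DB node. The keys below are orchestrator's standard MySQL backend settings; the host name and credentials are placeholders:

```json
{
  "MySQLOrchestratorHost": "orchestrator-db-writer.example.com",
  "MySQLOrchestratorPort": 3306,
  "MySQLOrchestratorDatabase": "orchestrator",
  "MySQLOrchestratorUser": "orchestrator_srv",
  "MySQLOrchestratorPassword": "***"
}
```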
In this setup there could be a substantial amount of traffic between the MySQL nodes. In cross-DC setups this may imply larger commit latencies (each commit may need to travel cross DC).
To access your `orchestrator` service you may speak to any healthy node. It is advisable that you speak only to the leader, via a proxy (use `/api/leader-check` as the HTTP health check for your proxy).
The latter setup is known to run in production in a very large environment, on a 3-node or 5-node setup.
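For illustration, an HTTP proxy in front of the `orchestrator` nodes can use `/api/leader-check` as its health check, so that only the current leader receives traffic. The sketch below assumes `orchestrator` listens on its default port `3000`; host names are placeholders:

```
# Sketch: only the orchestrator leader passes the /api/leader-check health check.
listen orchestrator-http
  bind 0.0.0.0:80
  mode http
  timeout connect 5s
  timeout client  60s
  timeout server  60s
  option httpchk GET /api/leader-check
  balance first
  server orch1 orch1.example.com:3000 check
  server orch2 orch2.example.com:3000 check
  server orch3 orch3.example.com:3000 check
```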
In the `orchestrator/raft` setup, `orchestrator` nodes communicate directly via the `raft` consensus algorithm. Each `orchestrator` node has its own private backend database, which can be `MySQL` or `sqlite`.
Only one `orchestrator` node assumes leadership and is always part of a consensus. However, all other nodes are independently active and poll your topologies.
In this setup there is:
- No communication between the DB nodes.
- Minimal communication between the `orchestrator` nodes.
- `*n` communication to `MySQL` topology nodes: a `3` node setup means each topology `MySQL` server is probed by `3` different `orchestrator` nodes, independently.
It is recommended to run a 3-node or a 5-node setup.
`sqlite` is embedded within `orchestrator` and requires no external dependency. `MySQL` outperforms `sqlite` on busy setups.
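A minimal sketch of the raft-related portion of `orchestrator.conf.json` for a 3-node setup follows. `RaftEnabled`, `RaftDataDir`, `RaftBind` and `RaftNodes` are the relevant configuration keys; host names, paths and the choice of a `sqlite` backend here are placeholders to adapt (the snippet would differ per node, at least in `RaftBind`):

```json
{
  "RaftEnabled": true,
  "RaftDataDir": "/var/lib/orchestrator",
  "RaftBind": "orch-node-1.example.com",
  "RaftNodes": [
    "orch-node-1.example.com",
    "orch-node-2.example.com",
    "orch-node-3.example.com"
  ],
  "BackendDB": "sqlite",
  "SQLite3DataFile": "/var/lib/orchestrator/orchestrator.db"
}
```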
To access your orchestrator service you may only speak to the leader node.
- Use `/api/leader-check` as the HTTP health check for your proxy.
- Or use `orchestrator-client` with multiple `orchestrator` backends; `orchestrator-client` will figure out the identity of the leader and will send requests to the leader.
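For illustration, `orchestrator-client` can be given all nodes via the `ORCHESTRATOR_API` environment variable (a space-delimited list of API endpoints); it then detects which endpoint is the leader and sends the request there. Host names below are placeholders:

```shell
# Point orchestrator-client at every raft node; it will pick the leader.
export ORCHESTRATOR_API="http://orch-node-1.example.com:3000/api http://orch-node-2.example.com:3000/api http://orch-node-3.example.com:3000/api"

# Any command now goes to the leader, e.g. listing known clusters:
orchestrator-client -c clusters
```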
`orchestrator/raft` is a newer development, and is being tested in production at this time. Please read the `orchestrator/raft` documentation for all implications.




