Design
Membership
- Worker nodes periodically report the following info to the master:
  - worker ID: a string unique within the cluster, e.g. `workerhost01-containerd-overlay`
  - connection info for connecting to the worker from the master: implementation-specific, e.g. `tcp://workerhost01:12345` or `unix://run/buildkit/instance01.sock`
    - Support for UNIX sockets should be useful for testing purposes
    - not necessarily unique; a connection can be shared among multiple workers (the master puts the worker ID into all request messages)
  - performance stats: loadavg, disk quota usage, and so on
  - annotations
e.g.

```json
{
  "worker_id": "workerhost01-containerd-overlay",
  "connections": [
    {
      "type": "grpc.v0",
      "socket": "tcp://workerhost01:12345"
    }
  ],
  "stats": [
    {
      "type": "cpu.v0",
      "loadavg": [0.01, 0.02, 0.01]
    }
  ],
  "annotations": {
    "os": "linux",
    "arch": "amd64",
    "executor": "containerd",
    "snapshotter": "overlay",
    "com.example.userspecific": "blahblahblah"
  }
}
```

Cache query
- With the connection info above, the master can ask a worker whether it has the cache for a given CacheKey.
  - The answer does not need to be 100% accurate.
  - How to transfer the cache data is another topic: Cache transfer #224
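Since the answer does not need to be 100% accurate, a worker could answer cache-existence queries from a compact probabilistic structure. Below is a minimal Bloom-filter sketch (a hypothetical illustration, not buildkit code): it may report false positives, but never false negatives, which fits this requirement.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bloom is a tiny Bloom filter over CacheKeys. MayHave can return
// true for a key that was never added (false positive), but never
// returns false for a key that was added.
type bloom struct {
	bits []uint64
	k    int // number of hash functions
}

func newBloom(nbits, k int) *bloom {
	return &bloom{bits: make([]uint64, (nbits+63)/64), k: k}
}

// idx derives the i-th bit position for key using seeded FNV hashing.
func (b *bloom) idx(key string, i int) uint {
	h := fnv.New64a()
	fmt.Fprintf(h, "%d:%s", i, key)
	return uint(h.Sum64()) % uint(len(b.bits)*64)
}

// Add records a CacheKey that this worker holds.
func (b *bloom) Add(cacheKey string) {
	for i := 0; i < b.k; i++ {
		p := b.idx(cacheKey, i)
		b.bits[p/64] |= 1 << (p % 64)
	}
}

// MayHave answers a cache-existence query; true means "probably yes".
func (b *bloom) MayHave(cacheKey string) bool {
	for i := 0; i < b.k; i++ {
		p := b.idx(cacheKey, i)
		if b.bits[p/64]&(1<<(p%64)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	f := newBloom(1024, 3)
	f.Add("sha256:abc")
	fmt.Println(f.MayHave("sha256:abc")) // true (guaranteed: no false negatives)
	fmt.Println(f.MayHave("sha256:def")) // very likely false
}
```

A structure like this could even be shipped inside the periodic report, so the master can pre-filter which workers to query at all.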
Initial naive implementation
- Stateless master
  - When the master dies, the orchestrator (k8s/swarm) restarts the master (and the membership info will be lost)
  - Multiple masters could be started, but there would be no connection between them
- Worker connects to the master using gRPC
  - the master address(es) can be specified via a daemon CLI flag: `--join tcp://master:12345`
- Master connects to all workers using gRPC for querying cache existence
  - does not scale to dozens of nodes, but is probably acceptable for the initial work
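The "master connects to all workers" step could be sketched as a naive fan-out: the master queries every registered worker in parallel and collects the ones that report a hit. This costs one RPC per worker per query, which is why it does not scale, but it is simple. The `CacheQuerier` interface below is an assumption standing in for a real gRPC client:

```go
package main

import (
	"fmt"
	"sync"
)

// CacheQuerier stands in for a gRPC client to one worker
// (hypothetical interface, not a real buildkit API).
type CacheQuerier interface {
	HasCache(cacheKey string) bool
}

// memWorker is a fake in-memory worker used for illustration.
type memWorker struct {
	keys map[string]bool
}

func (w *memWorker) HasCache(k string) bool { return w.keys[k] }

// findWorkersWithCache fans the query out to every worker in
// parallel and returns the IDs of workers that report a hit.
// One RPC per worker per query: does not scale to large clusters.
func findWorkersWithCache(workers map[string]CacheQuerier, cacheKey string) []string {
	var (
		mu   sync.Mutex
		hits []string
		wg   sync.WaitGroup
	)
	for id, w := range workers {
		wg.Add(1)
		go func(id string, w CacheQuerier) {
			defer wg.Done()
			if w.HasCache(cacheKey) {
				mu.Lock()
				hits = append(hits, id)
				mu.Unlock()
			}
		}(id, w)
	}
	wg.Wait()
	return hits
}

func main() {
	workers := map[string]CacheQuerier{
		"workerhost01-containerd-overlay": &memWorker{keys: map[string]bool{"sha256:abc": true}},
		"workerhost02-containerd-overlay": &memWorker{keys: map[string]bool{}},
	}
	fmt.Println(findWorkersWithCache(workers, "sha256:abc"))
}
```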
Future possible implementation
- Use IPFS (or just the libp2p DHT library) for querying cache existence (and also for transfer)?
  - Membership state could be saved to IPFS as well?
  - or Infinit? (is it still active?)