WIP: cluster:swarm runner #112

Closed
jimpick wants to merge 3 commits into master from runner/cluster-swarm

Conversation

@jimpick jimpick commented Nov 1, 2019

I have an automated Ansible setup for Docker swarm in the aws-ansible branch (needs just a bit more cleanup). I first tested it with a shell script, but then I realized that it wouldn't be too much work to modify the local:docker runner to talk to the swarm API.

Here's a demo running the smlbench2 2-container bitswap test across 2 different EC2 vms in the Docker swarm cluster:

[Image: tg-docker-swarm-runner]

https://github.com/ipfs/testground/blob/aws-ansible/pkg/runner/cluster_swarm.go

It's not quite done yet ... I've hard-coded the image (pulling from ECR), the REDIS_HOST environment variable, and the network. We'll want to upload the image to ECR after building it before creating the service.


Not complete yet ... some hard-coded values. See demo of it running in #110 (along with notes about what is still missing)

I just hacked this together to test the ansible setup I've been prototyping in the aws-ansible branch:

https://github.com/ipfs/testground/tree/aws-ansible

(tested with the smlbench2 test, which isn't on master yet ... that's going to be the basis of #94)

@daviddias daviddias requested a review from raulk November 4, 2019 12:38
var AllRunners = []runner.Runner{
&runner.LocalDockerRunner{},
&runner.LocalExecutableRunner{},
&runner.ClusterSwarmRunner{},

Suggested change
- &runner.ClusterSwarmRunner{},
+ &runner.ClusterDockerSwarmRunner{},

I'm not entirely sure that naming yet another Runner just for Docker Swarm is the best path. It's probably worth just making Docker Swarm the default Docker Runner.

Or perhaps I'm just getting confused by seeing Docker Swarm called a Runner, given that in the end it is the same thing running in a Docker container, only managed by Docker Swarm.

I think calling this ClusterSwarmRunner is fine. I was using the convention scope:technology, e.g. local:exec, local:docker, etc. to name runners. We can also rename "runners" to "schedulers" if we feel that's more accurate.

// docker service create --replicas 1 --name helloworld alpine ping docker.com

replicas := uint64(input.Instances)
spec := swarm.ServiceSpec{

Could you clarify why the service is specified this way rather than with a docker-compose.yml? (https://docs.docker.com/compose/)

Now that we've decided to create a temporary overlay network per run, we could use Docker Compose to create both the network and the service in one go. However, IIRC Docker Compose is really a CLI tool written in Python that transforms a YAML into docker calls. We cannot use it from Golang :-(
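
For comparison, here is a rough sketch of what the service definition amounts to when written as `docker service create` arguments, following the `docker service create --replicas 1 --name helloworld ...` comment in the diff. It is not the Docker Go client code this PR uses; `serviceSpec` and `serviceCreateArgs` are hypothetical names for illustration, and the values in `main` are placeholders.

```go
package main

import (
	"fmt"
	"strconv"
)

// serviceSpec is a hypothetical, trimmed-down stand-in for the handful of
// fields the runner sets on the real service spec.
type serviceSpec struct {
	Name     string
	Image    string
	Network  string
	Replicas uint64
	Env      []string
}

// serviceCreateArgs returns the arguments for `docker service create`.
// The restart condition is fixed to "none" so finished test instances are
// not restarted.
func serviceCreateArgs(s serviceSpec) []string {
	args := []string{
		"service", "create",
		"--name", s.Name,
		"--replicas", strconv.FormatUint(s.Replicas, 10),
		"--network", s.Network,
		"--restart-condition", "none",
	}
	for _, kv := range s.Env {
		args = append(args, "--env", kv)
	}
	// The image reference is the final positional argument.
	return append(args, s.Image)
}

func main() {
	// Placeholder values, not taken from the PR.
	args := serviceCreateArgs(serviceSpec{
		Name:     "tg-run-1",
		Image:    "example/testground:latest",
		Network:  "tg-run-1-net",
		Replicas: 2,
		Env:      []string{"REDIS_HOST=redis"},
	})
	fmt.Println(args)
}
```

The resulting slice could be handed to `exec.Command("docker", args...)`, although going through the Docker API from Go, as the PR does, avoids depending on the CLI binary.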

@daviddias daviddias changed the title from "[WIP] cluster:swarm runner" to "WIP: cluster:swarm runner" Nov 4, 2019
}

// Temp Redis fix
env = append(env, "REDIS_HOST=172.31.14.166")

Just flagging things to discuss.
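
One possible direction for the temporary Redis fix, sketched under the assumption that the Redis host arrives through some runner configuration value. `withRedisHost` and its `host` parameter are hypothetical and do not appear in the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// withRedisHost appends REDIS_HOST to env using a configured host instead of
// a literal IP address. It fails loudly when no host is configured rather
// than falling back to a hard-coded value.
func withRedisHost(env []string, host string) ([]string, error) {
	if host == "" {
		return nil, errors.New("redis host is not configured")
	}
	return append(env, "REDIS_HOST="+host), nil
}

func main() {
	// "redis.internal" is a placeholder hostname for illustration.
	env, err := withRedisHost([]string{"TEST_CASE=example"}, "redis.internal")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(env)
}
```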

},
TaskTemplate: swarm.TaskSpec{
ContainerSpec: &swarm.ContainerSpec{
Image: "909427826938.dkr.ecr.us-west-2.amazonaws.com/testground:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855@sha256:e175f10c2fc0545ede1de08458dffbea5b3efb3c023963028eac9129f4fd5920",

Based on discussion today, the docker:go builder should push the image to ECR if we receive a docker_registry param.
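
A minimal sketch of that behaviour, assuming the builder already knows the locally built image name and receives the registry as a plain string. `pushPlan` is a hypothetical helper; it only computes the `docker tag` and `docker push` arguments and leaves out registry authentication, which ECR requires.

```go
package main

import (
	"fmt"
	"strings"
)

// pushPlan returns the docker CLI argument lists needed to publish a locally
// built image to registry, or nil when no registry was given (in which case
// the image stays local).
func pushPlan(localImage, registry string) [][]string {
	if registry == "" {
		return nil
	}
	// Tolerate a trailing slash on the registry parameter.
	remote := strings.TrimSuffix(registry, "/") + "/" + localImage
	return [][]string{
		{"tag", localImage, remote},
		{"push", remote},
	}
}

func main() {
	// Placeholder image name and registry host, not taken from the PR.
	for _, args := range pushPlan("testground:abc123", "registry.example.com") {
		fmt.Println("docker", strings.Join(args, " "))
	}
}
```

The remote reference produced here is also what the runner would need as the service image, so that every swarm node can pull it.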

},
Networks: []swarm.NetworkAttachmentConfig{
{
Target: "hw6dcms11qhf3iv3rr2j8a2vb",

We need to create a new overlay network per deployment.

i.e. the data plane would be specific to the test case.
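
As a rough illustration of a per-deployment network, the sketch below derives a network name from a run ID and builds the matching `docker network create` and `docker network rm` arguments. The `tg-` prefix, the run ID format, and the helper names are assumptions made for this example, not part of the PR.

```go
package main

import "fmt"

// networkName derives the overlay network name for a single run, so that
// each deployment gets its own data plane.
func networkName(runID string) string {
	return "tg-" + runID
}

// networkCreateArgs returns the arguments for creating the run's overlay
// network before the service is created.
func networkCreateArgs(runID string) []string {
	return []string{"network", "create", "--driver", "overlay", networkName(runID)}
}

// networkRemoveArgs returns the arguments for tearing the network down once
// the run has finished and its service has been removed.
func networkRemoveArgs(runID string) []string {
	return []string{"network", "rm", networkName(runID)}
}

func main() {
	runID := "run-42" // placeholder run identifier
	fmt.Println(networkCreateArgs(runID))
	fmt.Println(networkRemoveArgs(runID))
}
```

The generated name would then replace the hard-coded network ID in the `Target` field above.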

Env: env,
},
RestartPolicy: &swarm.RestartPolicy{
Condition: swarm.RestartPolicyConditionNone,

👍

@raulk raulk self-assigned this Nov 4, 2019

raulk commented Nov 4, 2019

Taking this to land it on top of @jimpick's work on the infra side of things.

jimpick and others added 3 commits November 6, 2019 00:09
Needs to connect to the right network and fetch the right
image.
Just need to make those a bit more dynamic and we will have a working
runner!

raulk commented Nov 6, 2019

Closing this PR in favour of #126. Thanks for the head start here, @jimpick! Let's land it!

@raulk raulk closed this Nov 6, 2019
@daviddias daviddias deleted the runner/cluster-swarm branch December 11, 2019 09:23