
Don't delete tasks until they're actually removed by the agent#2446

Closed
nishanttotla wants to merge 3 commits into moby:master from nishanttotla:task-management-improvements

Conversation

@nishanttotla
Contributor

@nishanttotla nishanttotla commented Nov 15, 2017

This PR is a proposed way to fix #2407. We may choose to solve the same problem differently, but here's what this PR does:

  1. Add a new task state called REMOVE.
  2. On service removal or scale-down, tasks due to be removed get their desired state set to REMOVE, rather than being deleted outright as in the existing behavior. That behavior causes networking issues, detailed in [Proposal] Resolving the IP address contention issue #2407, as well as reporting issues such as service ps not showing all instances after scaling down moby#34967.
  3. The agent detects desired state REMOVE and initiates task shutdown.
  4. The task reaper deletes tasks that have desired state REMOVE and actual state >= SHUTDOWN.
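The four steps above can be condensed into a small sketch of the intended lifecycle. All names and numeric values here are illustrative stand-ins, not swarmkit's actual API (the real states live in api/types.proto):

```go
package main

import "fmt"

// TaskState is an ordered enum, so comparisons like >= are meaningful.
// Values are illustrative; swarmkit spaces its real values similarly.
type TaskState int

const (
	TaskStateRunning  TaskState = 540
	TaskStateShutdown TaskState = 640
	TaskStateRemove   TaskState = 800
)

type Task struct {
	DesiredState TaskState // set by the orchestrator
	ActualState  TaskState // reported by the agent
}

// markForRemoval is step 2: on service removal or scale-down, flag the
// task with desired state REMOVE instead of deleting its record.
func markForRemoval(t *Task) { t.DesiredState = TaskStateRemove }

// reaperShouldDelete is step 4: the task reaper only deletes tasks the
// agent has actually driven to SHUTDOWN (or beyond).
func reaperShouldDelete(t *Task) bool {
	return t.DesiredState == TaskStateRemove && t.ActualState >= TaskStateShutdown
}

func main() {
	t := &Task{DesiredState: TaskStateRunning, ActualState: TaskStateRunning}
	markForRemoval(t)
	fmt.Println(reaperShouldDelete(t)) // false: agent hasn't shut it down yet
	t.ActualState = TaskStateShutdown  // step 3: agent completes shutdown
	fmt.Println(reaperShouldDelete(t)) // true: now safe to delete and free the IP
}
```

The point of the extra state is exactly this ordering: the record (and its IP) outlives the scale-down event until the agent confirms shutdown.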

Things this PR does not deal with:

  • Changing API output (do we want to be able to do docker service ps and see tasks in REMOVE state until they're deleted? What about scale downs?)
  • Updating the design docs

@codecov

codecov bot commented Nov 15, 2017

Codecov Report

Merging #2446 into master will decrease coverage by 3.2%.
The diff coverage is 71.42%.

@@            Coverage Diff             @@
##           master    #2446      +/-   ##
==========================================
- Coverage   63.74%   60.53%   -3.21%     
==========================================
  Files          64      128      +64     
  Lines       11793    26426   +14633     
==========================================
+ Hits         7517    15996    +8479     
- Misses       3662     9017    +5355     
- Partials      614     1413     +799

@nishanttotla nishanttotla force-pushed the task-management-improvements branch 9 times, most recently from 51a93be to c70768d Compare November 16, 2017 21:27
@nishanttotla nishanttotla changed the title from [WIP] Don't delete tasks until they're actually removed by the agent to Don't delete tasks until they're actually removed by the agent Nov 16, 2017
@stevvooe
Contributor

@nishanttotla As I stated in the other issue, adding a new state for removal doesn't make a whole lot of sense. There is already the functionality of removing the task from the assignment set, which activates the removal workflow. We can tell that this state is unnecessary because it is just a noop on the agent: no actual removal is happening. Desired states are meant to set a target, which the agent then achieves.

The main problem that is addressed here is that the orchestrator needs to confirm shutdown (some sort of ack) before releasing the ip address to another container. If a task is reported as COMPLETED or SHUTDOWN, the ip address should be released. However, it seems we remove tasks before confirming their final state and that is likely the root cause of the bug.

What I am not understanding with this PR is why you think you need a removal state. From the changes to the agent, it is clear that the shutdown (or completed, if it is already down) state is sufficient.

The correct solution here is to remove these tasks from the assignment set, so they are not sent down to the agent. The agent will proceed with the remove. You can see part of this logic here. If you want acknowledgement of cleanup, you don't need to do this. Simply set the target state to shutdown, and the task will get shutdown and release resources (like ip address). You may also be able to use the existing ORPHANED or DEAD desired state to signal the task reaper to actually remove the task.

Before proceeding with a PR, it would be good to fully complete the design discussion so that we can find a proper solution.

Collaborator

@dperny dperny left a comment

without yet addressing stephen's comments, here's a rough overview of the PR. haven't reviewed tests, and not solid on the overall design.

t.DesiredState = state

if err := store.UpdateTask(tx, t); err != nil {
log.G(ctx).WithError(err).Errorf("failed to update task desired state to %s", state.String())
Collaborator

why not return an error here, and abort the transaction?

Collaborator

there might be a good reason not to, but logging an error is functionally the same as ignoring it altogether, except that you have some context when it breaks.

Contributor Author

It seems like we have this pattern elsewhere too, for instance in deleteTask just below this. Same for your comment below.

Contributor

I think it's OK to revisit existing patterns. I agree with @dperny that we should fail the txn.
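To illustrate the trade-off being debated, here is a minimal sketch, with a hypothetical runTx helper standing in for the store transaction: returning the error aborts the transaction, while log-and-continue commits it regardless.

```go
package main

import (
	"errors"
	"fmt"
)

// runTx is a stand-in for a store transaction: if fn returns an error,
// the transaction is aborted (rolled back) instead of committed.
func runTx(fn func() error) (committed bool) {
	if err := fn(); err != nil {
		return false // abort
	}
	return true // commit
}

var errUpdateFailed = errors.New("update failed")

func main() {
	// Log-and-continue (the pattern under review): the failure is
	// invisible to the transaction, which commits anyway.
	committed := runTx(func() error {
		if err := errUpdateFailed; err != nil {
			fmt.Println("failed to update task desired state:", err) // logged, then dropped
		}
		return nil
	})
	fmt.Println("log-only commits:", committed) // true

	// Return-the-error (the suggested fix): the transaction aborts.
	committed = runTx(func() error {
		return errUpdateFailed
	})
	fmt.Println("return-error commits:", committed) // false
}
```

As @dperny notes, logging alone leaves context for debugging but is otherwise equivalent to ignoring the failure.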

t.DesiredState = api.TaskStateRemove

if err := store.UpdateTask(tx, t); err != nil {
log.G(ctx).WithError(err).Errorf("failed to update task desired state to REMOVE")
Collaborator

again here, why not return the error and abort the transaction?

)
s.View(func(tx store.ReadTx) {
tasks, err = store.FindTasks(tx, store.ByServiceID(service.ID))
})
Collaborator

there's a possible race here between the view and the batch below, i think. fix might be to move the view into the batch but before the loop below. not sure if we use this pattern elsewhere.

Contributor Author

@dperny same pattern as in DeleteServiceTasks above this code. I see your point, but does this mean we've missed this race condition in the past?
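A toy illustration of the race in question, using a hypothetical toyStore rather than swarmkit's real store: a snapshot taken in a View before the Batch can miss tasks created in the gap, whereas reading inside the batch cannot.

```go
package main

import (
	"fmt"
	"sync"
)

// toyStore mimics the view-then-batch shape: View hands out a consistent
// read snapshot, Batch applies writes under the same lock.
type toyStore struct {
	mu    sync.Mutex
	tasks map[string]bool
}

func (s *toyStore) View(fn func(map[string]bool)) {
	s.mu.Lock()
	defer s.mu.Unlock()
	snap := make(map[string]bool, len(s.tasks))
	for k, v := range s.tasks {
		snap[k] = v
	}
	fn(snap)
}

func (s *toyStore) Batch(fn func(map[string]bool)) {
	s.mu.Lock()
	defer s.mu.Unlock()
	fn(s.tasks)
}

func main() {
	s := &toyStore{tasks: map[string]bool{"t1": true}}

	// Racy shape: the snapshot is taken outside the batch, so a task
	// created in the gap ("t2") is missed by the delete pass below.
	var snapshot map[string]bool
	s.View(func(m map[string]bool) { snapshot = m })
	s.Batch(func(m map[string]bool) { m["t2"] = true }) // simulated concurrent writer
	s.Batch(func(m map[string]bool) {
		for id := range snapshot { // iterates only t1
			delete(m, id)
		}
	})
	fmt.Println(len(s.tasks)) // 1: t2 survived the "delete all" pass

	// Suggested shape: read and write inside the same batch.
	s.Batch(func(m map[string]bool) {
		for id := range m {
			delete(m, id)
		}
	})
	fmt.Println(len(s.tasks)) // 0
}
```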

return
}
orchestrator.DeleteServiceTasks(ctx, r.store, v.Service)
orchestrator.SetServiceTasksRemove(ctx, r.store, v.Service)
Collaborator

You need to make the same change in github.com/swarmkit/manager/orchestrator/global/global.go L150.

Then, you can remove the DeleteServiceTasks method, because it has no other uses in the codebase.

Contributor Author

Done.

r.updater.Update(ctx, r.cluster, service, sortedSlots[:specifiedSlots])
err = r.store.Batch(func(batch *store.Batch) error {
r.deleteTasksMap(ctx, batch, deadSlots)
r.deleteTasks(ctx, batch, sortedSlots[specifiedSlots:])
Collaborator

you can remove the (*replicated.Orchestrator).deleteTasks method, because it's no longer used.

Contributor Author

Done.

}()

if task.DesiredState == api.TaskStateShutdown {
if task.DesiredState == api.TaskStateShutdown || task.DesiredState == api.TaskStateRemove {
Collaborator

why not if task.DesiredState >= api.TaskStateShutdown, to encompass possible future desired states?

Contributor Author

I don't have a solid answer for this yet; I believe I used an exact comparison since the previous condition was an exact comparison itself.

Contributor

I think you can go with >= api.TaskStateShutdown here. This branch bounds the largest state achievable in the agent as SHUTDOWN, which is the behavior we want.
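A sketch of the bound being described, with illustrative values: because TaskState is an ordered enum, >= SHUTDOWN covers REMOVE and any future state beyond it, and the agent never drives a task past SHUTDOWN.

```go
package main

import "fmt"

type TaskState int

const (
	TaskStateRunning  TaskState = 540
	TaskStateShutdown TaskState = 640
	TaskStateRemove   TaskState = 800 // the state added by this PR
)

// agentTargetState bounds the largest state the agent will drive a task
// to at SHUTDOWN, so any desired state at or beyond SHUTDOWN (including
// a future one added after REMOVE) triggers the same shutdown path.
func agentTargetState(desired TaskState) TaskState {
	if desired >= TaskStateShutdown {
		return TaskStateShutdown
	}
	return desired
}

func main() {
	fmt.Println(agentTargetState(TaskStateRemove))  // 640, i.e. SHUTDOWN
	fmt.Println(agentTargetState(TaskStateRunning)) // 540, unchanged
}
```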

Contributor

@anshulpundir anshulpundir left a comment

Initial comments. Will review further and add more.

}()

if task.DesiredState == api.TaskStateShutdown {
if task.DesiredState == api.TaskStateShutdown || task.DesiredState == api.TaskStateRemove {
Contributor

Please add a comment.

api/types.pb.go Outdated
// The main purpose of this state is to free up resources associated with service tasks on
// unresponsive nodes without having to delete those tasks. This state is directly assigned
// to the task by the orchestrator.
TaskStateRemove TaskState = 800
Contributor

Please move this above the comment for TaskStateOrphaned and add a comment for this state.

@nishanttotla nishanttotla force-pushed the task-management-improvements branch from 7975498 to 108aaef Compare November 20, 2017 23:37
@dperny
Collaborator

dperny commented Nov 21, 2017

@stevvooe @nishanttotla why do we need to remove from the assignment set? why is it not sufficient to simply set the task state to "shutdown", then let the task reaper clean up the task when it gets reported as shutdown?

Here is a quick diff of what I'm talking about. No tests, not sure it even builds, just trying to convey the idea:
master...dperny:sequential-task-removal

@dperny
Collaborator

dperny commented Nov 21, 2017

Ah, I see, Slot doesn't work like that.

@dperny
Collaborator

dperny commented Nov 21, 2017

Updated the diff of my counterproposal. Check if all tasks in a slot are dead, and, if so, delete all the tasks in that slot.
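The counterproposal's slot check might look roughly like this; Slot and the state values here are simplified stand-ins for swarmkit's real types, not the actual diff:

```go
package main

import "fmt"

type TaskState int

const TaskStateShutdown TaskState = 640

type Task struct {
	ID    string
	State TaskState
}

// A Slot groups the tasks occupying one service slot; the real type
// lives in swarmkit's manager/orchestrator package.
type Slot []Task

// slotDead reports whether every task in the slot has reached at least
// SHUTDOWN, i.e. the whole slot can be deleted together.
func slotDead(s Slot) bool {
	for _, t := range s {
		if t.State < TaskStateShutdown {
			return false
		}
	}
	return true
}

func main() {
	mixed := Slot{{ID: "a", State: TaskStateShutdown}, {ID: "b", State: 540}}
	dead := Slot{{ID: "a", State: TaskStateShutdown}, {ID: "b", State: 800}}
	fmt.Println(slotDead(mixed), slotDead(dead)) // false true
}
```

Deleting per slot rather than per task avoids releasing a slot's resources while one of its tasks is still winding down.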

@stevvooe
Contributor

why do we need to remove from the assignment set?

I am not sure that it is necessary, either. I was trying to communicate that "removal" is proxied through removal from the assignment set. Looking at this closer, having shutdown confirmation seems like the first step (which I hope is already working).

It seems like the missing component here is acknowledgement that the allocator has returned the task's resources to the pool before removing the task.

@nishanttotla
Contributor Author

Based on our offline discussion, I'm going to hold off on updating the state name until we all can agree on it. In the meantime, I'll make sure this PR works right.

I'll also make sure I update design docs as an immediate follow up.

})
}

func TestRemove(t *testing.T) {
Contributor

TestDesiredStateRemove

Include a comment explaining that we are testing that the agent maintains SHUTDOWN as the maximum state in the agent.

api/types.proto Outdated
// The main purpose of the REMOVE state is to correctly handle service deletions
// and scale downs. This allows us to keep track of tasks that have been marked
// for deletion, but can't yet be removed because the agent is in the process of
// shutting them down.
Contributor

We should also mention that tasks marked with this state are removed from the system once they have been shut down by the agent.

@@ -50,7 +50,7 @@ func (r *Orchestrator) handleServiceEvent(ctx context.Context, event events.Even
if !orchestrator.IsReplicatedService(v.Service) {
Contributor

Unrelated, but not totally: please add a comment for this case.

err = r.store.Batch(func(batch *store.Batch) error {
r.deleteTasksMap(ctx, batch, deadSlots)
r.deleteTasks(ctx, batch, sortedSlots[specifiedSlots:])
r.setTasksDesiredState(ctx, batch, sortedSlots[specifiedSlots:], api.TaskStateRemove)
Contributor

Please add a comment.
For bonus points, you can also add comments for the reconcile() function :)


func (r *Orchestrator) deleteTasks(ctx context.Context, batch *store.Batch, slots []orchestrator.Slot) {
// setTasksDesiredState sets the desired state of all tasks to what
// is requested.
Contributor

sets the desired state of all tasks to what is requested => sets the desired state for all tasks for the given slots to the given state.

for _, slot := range slots {
for _, t := range slot {
r.deleteTask(ctx, batch, t)
err := batch.Update(func(tx store.Tx) error {
Contributor

Why not do all the updates in the same txn?

Contributor Author

@anshulpundir how do you mean?

Contributor

Nevermind. I realized that updates to the store are batched using store.Batch.

assert.Len(t, foundTasks, 4)
}

func TestTaskStateRemoveOnScaledown(t *testing.T) {
Contributor

Please add a comment specifying what the test is about and the steps involved.

assert.Len(t, foundTasks, 1)
}

func TestTaskStateRemoveOnServiceRemoval(t *testing.T) {
Contributor

Same as above

tr.orphaned = append(tr.orphaned, t.ID)
}
// add tasks that have been shutdown and marked "remove"
if t.DesiredState == api.TaskStateRemove && t.Status.State >= api.TaskStateShutdown {
Contributor

nit: Please update comments for Run() above for this change.

log.G(ctx).WithError(err).Errorf("failed to update task desired state to REMOVE")
}
return nil
})
Contributor

The log statement here https://github.com/docker/swarmkit/pull/2446/files#diff-93afa918180cbe7ddd6db3b78a89fb8aR64 doesn't seem accurate. It's the failure of the task update txn.

if t.Status.State >= api.TaskStateOrphaned && t.ServiceID == "" {
tr.orphaned = append(tr.orphaned, t.ID)
}
// add tasks that have been shutdown and marked "remove"
Contributor

Please also comment on why we remove these tasks.

@nishanttotla nishanttotla force-pushed the task-management-improvements branch 4 times, most recently from 6e95d25 to 6e125f6 Compare November 22, 2017 21:59
@nishanttotla nishanttotla force-pushed the task-management-improvements branch 8 times, most recently from 1c5356f to 9a9ec6c Compare November 23, 2017 00:59
Signed-off-by: Nishant Totla <nishanttotla@gmail.com>
Signed-off-by: Nishant Totla <nishanttotla@gmail.com>
Signed-off-by: Nishant Totla <nishanttotla@gmail.com>
@nishanttotla nishanttotla force-pushed the task-management-improvements branch from 9a9ec6c to 618281c Compare November 23, 2017 01:00
@nishanttotla
Contributor Author

I've addressed most of the above comments.

// for deletion, but can't yet be removed because the agent is in the process of
// shutting them down. Once the agent has shut down tasks with desired state
// REMOVE, the task reaper is responsible for removing them.
REMOVE = 800 [(gogoproto.enumvalue_customname)="TaskStateRemove"];
Collaborator

832 for consistency with the others?

Collaborator

i think you missed the ORPHANED = 832 below, which makes this correct at halfway between REJECTED and ORPHANED.

Contributor

// Remove ...


err := store.UpdateTask(tx, t)
if err != nil {
log.G(ctx).WithError(err).Errorf("failed to update task %s desired state to %s", t.ID, state.String())
Collaborator

This log statement seems redundant with the one below.

r.deleteTask(ctx, batch, t)
err := batch.Update(func(tx store.Tx) error {
// update desired state
t.DesiredState = state
Collaborator

It's good practice to make sure that the state is being advanced, and never moves backwards. Currently there isn't a state beyond "remove" but we could potentially add one later.
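The suggested guard could be sketched like this (illustrative types, not the actual swarmkit code): the desired state only ever advances, so a stale code path can never move a task backwards out of REMOVE.

```go
package main

import "fmt"

type TaskState int

const (
	TaskStateShutdown TaskState = 640
	TaskStateRemove   TaskState = 800
)

type Task struct{ DesiredState TaskState }

// advanceDesiredState only moves the desired state forward. If a state
// beyond REMOVE is ever added, an out-of-date caller still cannot
// overwrite it with an earlier one.
func advanceDesiredState(t *Task, state TaskState) bool {
	if state <= t.DesiredState {
		return false // would move backwards (or be a no-op): refuse
	}
	t.DesiredState = state
	return true
}

func main() {
	t := &Task{DesiredState: TaskStateRemove}
	fmt.Println(advanceDesiredState(t, TaskStateShutdown)) // false: would regress
	fmt.Println(t.DesiredState == TaskStateRemove)         // true: unchanged
}
```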

if err := store.DeleteTask(tx, t.ID); err != nil {
log.G(ctx).WithError(err).Errorf("failed to delete task")
// update desired state to REMOVE
t.DesiredState = api.TaskStateRemove
Collaborator

It's good practice to make sure that the state is being advanced, and never moves backwards. Currently there isn't a state beyond "remove" but we could potentially add one later.

@dperny
Collaborator

dperny commented Nov 27, 2017

i'm gonna carry this PR to completion for @nishanttotla because he's not available right now.

Successfully merging this pull request may close these issues.

[Proposal] Resolving the IP address contention issue