
scheduler: Version-aware failure tracking#2253

Merged
aaronlehmann merged 4 commits into moby:master from aaronlehmann:scheduler-recent-failure-versioned
Jun 29, 2017

Conversation

@aaronlehmann
Collaborator

This addresses a longstanding TODO which I forgot about. Currently, the scheduler tracks potential failure loops by service ID. However, if the service is updated, it won't distinguish the new version of the service from the old one. This means even if the operator fixes the underlying problem, the new tasks from the updated service will still avoid the node where the problem was detected.

Change the failure tracking to be sensitive to SpecVersion as well. If SpecVersion changes, we'll treat the task as a different kind of task for failure tracking purposes.

Also, add a periodic cleanup pass to make sure recentFailures doesn't grow without bound, and clean up addOrUpdateNode so it doesn't try to fix up uninitialized nodeInfo structs (which is fragile).

cc @aluzzardi

Aaron Lehmann added 3 commits June 13, 2017 17:17
This addresses a longstanding TODO which I forgot about. Currently, the
scheduler tracks potential failure loops by service ID. However, if the
service is updated, it won't distinguish the new version of the service
from the old one. This means even if the operator fixes the underlying
problem, the new tasks from the updated service will still avoid the
node where the problem was detected.

Change the failure tracking to be sensitive to SpecVersion as well. If
SpecVersion changes, we'll treat the task as a different kind of task
for failure tracking purposes.

Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
This map could technically grow without bound. When adding a new failure
record, if the cleanup hasn't happened recently, loop over all the
recentFailures entries and remove any that only consist of expired
records.

Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
This function previously could take an uninitialized NodeInfo structure
and fill in whatever was missing. This is very error-prone, so remove
this logic and change the only caller that relies on it to always pass
in a properly initialized NodeInfo.

Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
@codecov

codecov bot commented Jun 14, 2017

Codecov Report

Merging #2253 into master will decrease coverage by 6.25%.
The diff coverage is 57.14%.

@@            Coverage Diff             @@
##           master    #2253      +/-   ##
==========================================
- Coverage   66.63%   60.38%   -6.26%     
==========================================
  Files          92      124      +32     
  Lines       13480    20278    +6798     
==========================================
+ Hits         8983    12245    +3262     
- Misses       3550     6648    +3098     
- Partials      947     1385     +438

Contributor

@cyli cyli left a comment


LGTM with the caveat I'm not super familiar with the scheduler code.

Nonblocking - I'm not sure how feasible these are, especially given that we probably don't inject a clock into the scheduler, but would it be possible to add tests that check that:

  1. When a node starts up, failed tasks aren't immediately cleaned up
  2. Tasks are not cleared before the monitoring period is up?

@aaronlehmann
Collaborator Author

When a node starts up, failed tasks aren't immediately cleaned up

I don't understand this one. Does this mean a node becoming READY? The scheduler itself starting up?

Tasks are not cleared before the monitoring period is up?

As you mentioned, we don't inject a clock. I could augment the existing tests to check that recentFailures[x] is monotonic during the test run. Not sure how useful this is. I'd be curious to hear what tests you would write for something like this.

@cyli
Contributor

cyli commented Jun 15, 2017

I don't understand this one. Does this mean a node becoming READY? The scheduler itself starting up?

Er sorry, again, not super familiar with the scheduler life cycle. Whenever nodeInfo is first created for a node, I guess.

Tasks are not cleared before the monitoring period is up?

Oops, I realized I meant recentFailures[x] is not cleaned up before the monitoring period is up. This one is only really feasible with an injected clock I guess, otherwise it will only probabilistically fail. But similar to your current test of simulating task failures - simulate that they failed at particular times, so that even if the number is more than the requisite to be marked unhealthy (I guess 5?), if they're spread out across/past the monitoring boundary the node is not considered unhealthy, and the oldest failure is only cleaned up past the monitoring boundary.

But none of these may be possible without the clock injection, hence non-blocking. :)

Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
@aaronlehmann
Collaborator Author

Added a commit that makes sure the values of recentFailures progress as expected, and aren't cleaned up during the tests (which are expected to take less than 5 minutes).

@cyli
Contributor

cyli commented Jun 16, 2017

LGTM, thanks for adding the assertions

@aaronlehmann aaronlehmann merged commit 99d5d7b into moby:master Jun 29, 2017
@aaronlehmann aaronlehmann deleted the scheduler-recent-failure-versioned branch June 29, 2017 21:21
andrewhsu pushed a commit to docker-archive/docker-ce that referenced this pull request Jul 14, 2017
- moby/swarmkit#2266 (support for templating Node.Hostname in docker executor)
- moby/swarmkit#2281 (change restore action on objects to be update, not delete/create)
- moby/swarmkit#2285 (extend watch queue with timeout and size limit)
- moby/swarmkit#2253 (version-aware failure tracking in the scheduler)
- moby/swarmkit#2275 (update containerd and port executor to container client library)
- moby/swarmkit#2292 (rename some generic resources)
- moby/swarmkit#2300 (limit the size of the external CA response)
- moby/swarmkit#2301 (delete global tasks when the node running them is deleted)

Minor cleanups, dependency bumps, and vendoring:
- moby/swarmkit#2271
- moby/swarmkit#2279
- moby/swarmkit#2283
- moby/swarmkit#2282
- moby/swarmkit#2274
- moby/swarmkit#2296 (dependency bump of etcd, go-winio)

Signed-off-by: Ying Li <ying.li@docker.com>
Upstream-commit: 4509a00
Component: engine
silvin-lubecki pushed a commit to silvin-lubecki/engine-extract that referenced this pull request Mar 16, 2020