Skip to content

Add default timeout to pkg/plugins/client#25794

Merged
anusha-ragunathan merged 1 commit intomoby:masterfrom
chenchun:timeout
Sep 12, 2016
Merged

Add default timeout to pkg/plugins/client#25794
anusha-ragunathan merged 1 commit intomoby:masterfrom
chenchun:timeout

Conversation

@chenchun
Copy link
Copy Markdown
Contributor

Happened to find some docker create command hangs for a long time due to remote volume plugin. The goroutine stack is as follows. I believe timeout is necessary.

d1afb70c-533e-49c6-a0e6-a8dee4968dcc

@GordonTheTurtle GordonTheTurtle added status/0-triage dco/no Automatically set by a bot when one of the commits lacks proper signature labels Aug 17, 2016
@GordonTheTurtle GordonTheTurtle added the dco/no Automatically set by a bot when one of the commits lacks proper signature label Aug 17, 2016
@chenchun
Copy link
Copy Markdown
Contributor Author

Signed my patch.

@thaJeztah thaJeztah removed the dco/no Automatically set by a bot when one of the commits lacks proper signature label Aug 17, 2016
@thaJeztah
Copy link
Copy Markdown
Member

ping @anusha-ragunathan PTAL

@anusha-ragunathan
Copy link
Copy Markdown
Contributor

gofmt errors:
--> Making bundle: validate-gofmt (in bundles/1.13.0-dev/validate-gofmt)
These files are not properly gofmt'd:

  • pkg/plugins/client.go

@tophj-ibm
Copy link
Copy Markdown
Contributor

@chenchun also

---> Making bundle: validate-lint (in bundles/1.13.0-dev/validate-lint)
Errors from golint:
pkg/plugins/client.go:20:2: const defaultHttpTimeOut should be defaultHTTPTimeOut

Please fix the above errors. You can test via "golint" and commit the result.

@chenchun
Copy link
Copy Markdown
Contributor Author

@anusha-ragunathan @tophj-ibm Fixed the format errors.

@anusha-ragunathan
Copy link
Copy Markdown
Contributor

@chenchun : Plugin calls already have a timeout mechanism. It is not at the HTTP Client request level. It is application specific (plugin specific in this case). There's exponential backoff (2, 4, 8, 16 seconds...), with retries, with a maximum timeout of 30 seconds. You were probably waiting in a longer backoff time period (eg 16 seconds) when you experienced the issue. You should see daemon log entries such as https://github.com/docker/docker/blob/master/pkg/plugins/client.go#L135

Given that, I think this PR can be closed.

@chenchun
Copy link
Copy Markdown
Contributor Author

chenchun commented Sep 2, 2016

I don't keep the log. But I believe it is not the backoff retry that was happening. HTTP request timeout is a basic need if the server takes a long time to reply or even not given a reply like the test I write. In this situation the application specific retry won't work. So I don't agree with you.

@anusha-ragunathan
Copy link
Copy Markdown
Contributor

Both cases you mention should be handled by the current plugin timeout mechanism.

@chenchun
Copy link
Copy Markdown
Contributor Author

chenchun commented Sep 8, 2016

@anusha-ragunathan It happens again and it hangs on callWithRetry for 182 minutes. It is much more than 1+2+4+..+16 seconds.
And there isn't any "Unable to connect to plugin" warnings in the log. I hope you reconsider this.

07165439-f278-4e66-aea8-f58067c3a247

If the http server never reply, the c.http.Do(req) statement will continue to block forever and it won't return an error.

@anusha-ragunathan
Copy link
Copy Markdown
Contributor

anusha-ragunathan commented Sep 8, 2016

The full stack-trace helps. You are correct. The application specific backoff/retry logic is for errors that are returned from the http request (eg, connection refused, no socket available to connect, etc). If the call is blocked, then we will not timeout.

Reopening, since this PR fixes a valid issue.

Copy link
Copy Markdown
Contributor

@anusha-ragunathan anusha-ragunathan Sep 8, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3s is low, especially given that we are more lenient with non-blocking http errors (max of 16s).

go-connections uses 32s for default socket timeouts. https://github.com/docker/docker/blob/3d13fddd2bc4d679f0eaa68b0be877e5a816ad53/vendor/src/github.com/docker/go-connections/sockets/sockets.go#L11 Lets do the same.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I'm surprised that the 32s net.Dial timeout didnt kick in.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's because the http client is successfully connected to the server. It's the server that didn't response to the http request.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed timeout to 32s

@anusha-ragunathan
Copy link
Copy Markdown
Contributor

LGTM

@tiborvass
Copy link
Copy Markdown
Contributor

@chenchun I know it was there before your PR, but if we could add to the timeout constants a * time.Second that'd be appreciated. It will need some type conversions modifications in the code I believe. If it's too complex then forget it. Otherwise PR looks good.

Signed-off-by: Chun Chen <ramichen@tencent.com>
@anusha-ragunathan anusha-ragunathan merged commit a4d1365 into moby:master Sep 12, 2016
@chenchun chenchun deleted the timeout branch September 13, 2016 01:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants