Skip to content

R4R: Add timeouts to http servers#2780

Merged
ebuchman merged 8 commits intotendermint:developfrom
zmanian:zaki/HttpServerWithTimeouts
Nov 17, 2018
Merged

R4R: Add timeouts to http servers#2780
ebuchman merged 8 commits intotendermint:developfrom
zmanian:zaki/HttpServerWithTimeouts

Conversation

@zmanian
Copy link
Contributor

@zmanian zmanian commented Nov 7, 2018

The current http server implementation does not enforce anytime outs on connections. This facilitates file descriptor exhaustion.

This introduces timeouts into the http server implementation.

… with ones with timeouts to prevent file descriptor exhaustion
@zmanian zmanian changed the base branch from master to develop November 7, 2018 21:03
@zmanian zmanian changed the title Zaki/http server with timeouts R4R: Zaki/http server with timeouts Nov 7, 2018
@zmanian zmanian changed the title R4R: Zaki/http server with timeouts R4R: Add timeouts to http servers Nov 7, 2018
logger.Info("RPC HTTP server stopped", "err", err)
s := &http.Server{
Handler: RecoverAndLogHandler(maxBytesHandler{h: handler, n: maxBodyBytes}, logger),
ReadTimeout: 10 * time.Second,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is 10 seconds too long?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd lower both to 3s

Suggested change
ReadTimeout: 10 * time.Second,
ReadTimeout: 3 * time.Second,

@melekes
Copy link
Contributor

melekes commented Nov 8, 2018

==================
WARNING: DATA RACE
Write at 0x00c420316270 by goroutine 31:
  github.com/tendermint/tendermint/rpc/lib/server.StartHTTPAndTLSServer.func1()
      /go/src/github.com/tendermint/tendermint/rpc/lib/server/http_server.go:112 +0x259

Previous write at 0x00c420316270 by goroutine 32:
  github.com/tendermint/tendermint/rpc/lib/server.StartHTTPAndTLSServer()
      /go/src/github.com/tendermint/tendermint/rpc/lib/server/http_server.go:119 +0xd7f
  github.com/tendermint/tendermint/rpc/lib/server.TestStartHTTPAndTLSServer()
      /go/src/github.com/tendermint/tendermint/rpc/lib/server/http_server_test.go:74 +0xe0
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:777 +0x16d

Goroutine 31 (running) created at:
  github.com/tendermint/tendermint/rpc/lib/server.StartHTTPAndTLSServer()
      /go/src/github.com/tendermint/tendermint/rpc/lib/server/http_server.go:105 +0xaa8
  github.com/tendermint/tendermint/rpc/lib/server.TestStartHTTPAndTLSServer()
      /go/src/github.com/tendermint/tendermint/rpc/lib/server/http_server_test.go:74 +0xe0
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:777 +0x16d

Goroutine 32 (running) created at:
  testing.(*T).Run()
      /usr/local/go/src/testing/testing.go:824 +0x564
  testing.runTests.func1()
      /usr/local/go/src/testing/testing.go:1063 +0xa4
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:777 +0x16d
  testing.runTests()
      /usr/local/go/src/testing/testing.go:1061 +0x4e1
  testing.(*M).Run()
      /usr/local/go/src/testing/testing.go:978 +0x2cd
  main.main()
      _testmain.go:112 +0x324
==================
--- FAIL: TestStartHTTPAndTLSServer (0.00s)
	Error Trace:	http_server_test.go:75
	Error:      	Expected nil, but got: &netutil.limitListener{Listener:(*net.TCPListener)(0xc4202c0040), sem:(chan struct {})(0xc4202c44e0), closeOnce:sync.Once{m:sync.Mutex{state:0, sema:0x0}, done:0x0}, done:(chan struct {})(0xc4202c4540)}
	Test:       	TestStartHTTPAndTLSServer
	testing.go:730: race detected during execution of test
FAIL

Copy link
Contributor

@melekes melekes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome 💠 Thanks @zmanian

logger.Info("RPC HTTP server stopped", "err", err)
s := &http.Server{
Handler: RecoverAndLogHandler(maxBytesHandler{h: handler, n: maxBodyBytes}, logger),
ReadTimeout: 10 * time.Second,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd lower both to 3s

Suggested change
ReadTimeout: 10 * time.Second,
ReadTimeout: 3 * time.Second,

@jackzampolin jackzampolin mentioned this pull request Nov 9, 2018
6 tasks
@jaekwon
Copy link
Contributor

jaekwon commented Nov 10, 2018

This has magic constants in several places, and I'm not sure how it relates to other timeout parameters that we already have for peer connections. Needs a bit of review to parameterize & make consistent w/ the rest of the codebase.

Copy link
Contributor

@jaekwon jaekwon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See main thread

@ebuchman
Copy link
Contributor

not sure how it relates to other timeout parameters that we already have for peer connections.

What does this have to do with peer connections? Are you saying we should make the http and p2p connection timeouts somehow related?

See main thread

What is this referring to ?

@jaekwon
Copy link
Contributor

jaekwon commented Nov 15, 2018

What does this have to do with peer connections? Are you saying we should make the http and p2p connection timeouts somehow related?

Ah no, we shouldn't do that. Lets just use constants instead of 10, and otherwise good to go.

@ebuchman
Copy link
Contributor

ebuchman commented Nov 15, 2018

Oh - this breaks /broadcast_tx_commit. If committing the tx takes longer than the timeout here, then once the commit finally happens we get curl: (52) Empty reply from server.

Not sure how to reconcile ...

Does anyone know if/what the issue was here in the first place? Is there any endpoint that can hang for a while that we'd need this for?

@zmanian @jackzampolin

@melekes
Copy link
Contributor

melekes commented Nov 16, 2018

Oh - this breaks /broadcast_tx_commit. If committing the tx takes longer than the timeout here, then once the commit finally happens we get curl: (52) Empty reply from server.

Right. /broadcast_tx_commit timeout is 10 sec. now

var deliverTxTimeout = 10 * time.Second // TODO: configurable?
. We can make it smaller or increase read/write timeouts (to 10 sec.).

maxBodyBytes = int64(1000000) // 1MB

// same as the net/http default
maxHeaderBytes = 1 << 20
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Go will use a default in the absence of such. Is there a reason to duplicate on our side?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be explicit.

@ebuchman
Copy link
Contributor

We can make it smaller or increase read/write timeouts (to 10 sec.)

We need to make one a function of the other so this doesn't become some kind of implicit dependency between constants.

I also still want to understand why we need these timeouts in the first place - under what circumstance are they actually important to us ?

@ebuchman
Copy link
Contributor

Ok, I read https://blog.cloudflare.com/exposing-go-on-the-internet/ and https://blog.cloudflare.com/the-complete-guide-to-golang-net-http-timeouts/

It's clear we need to do this to handle hostile clients, but I still wonder if the original issue that prompted this might lie deeper than this fix, assuming our clients were functioning properly. I suspect there are other issues with file descriptors in Tendermint, but anyways.

@zmanian @melekes PTAL.

The `broadcast_tx_commit` endpoint has it's own timeout.
If this is longer than the http server's WriteTimeout, the
user will receive an error. Here, we export the WriteTimeout
and set the broadcast_tx_commit timeout to be less than it.

In the future, we should use a config struct for the timeouts
to avoid the need to export. The broadcast_tx_commit timeout
may also become configurable, but we must check that it's less
than the server's WriteTimeout.
@codecov-io
Copy link

Codecov Report

Merging #2780 into develop will increase coverage by 0.03%.
The diff coverage is 0%.

@@             Coverage Diff             @@
##           develop    #2780      +/-   ##
===========================================
+ Coverage    62.09%   62.13%   +0.03%     
===========================================
  Files          212      212              
  Lines        17279    17283       +4     
===========================================
+ Hits         10730    10738       +8     
+ Misses        5642     5639       -3     
+ Partials       907      906       -1
Impacted Files Coverage Δ
rpc/core/pipe.go 30.23% <ø> (ø) ⬆️
rpc/core/mempool.go 0% <0%> (ø) ⬆️
privval/tcp_server.go 78.57% <0%> (-2.86%) ⬇️
evidence/reactor.go 63.44% <0%> (+0.81%) ⬆️
libs/flowrate/flowrate.go 82.35% <0%> (+1.61%) ⬆️
privval/ipc_server.go 69.81% <0%> (+5.66%) ⬆️

@zmanian
Copy link
Contributor Author

zmanian commented Nov 17, 2018

I think the other file descriptor issues might a result of leveldb settings

Copy link
Contributor

@melekes melekes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💠

@ebuchman ebuchman merged commit e6fc10f into tendermint:develop Nov 17, 2018
cboh4 pushed a commit to scrtlabs/tendermint that referenced this pull request Apr 7, 2025
…the JSON (backport tendermint#2774) (tendermint#2780)

This change fixes a bug in which BitArray.UnmarshalJSON hadn't accounted
for the fact that invoking NewBitArray(<=0) returns nil and hence when
dereferenced would crash with a runtime nil pointer dereference. This
bug was found by my security analysis and fuzzing too.

Author: @odeke-em 

Fixes cometbft/cometbft#2658

---

#### PR checklist

- [x] Tests written/updated
- [x] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [ ] ~~Updated relevant documentation (`docs/` or `spec/`) and code
comments~~
- [x] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec
<hr>This is an automatic backport of pull request tendermint#2774 done by
[Mergify](https://mergify.com).

---------

Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>
Co-authored-by: Andy Nogueira <me@andynogueira.dev>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants