test: Increase timeout to mitigate non-deterministic test failure by liamsi · Pull Request #3580 · tendermint/tendermint

liamsi · 2019-04-18T11:02:37Z

This should fix #3576 (ran it many times locally but only time will tell). The test only checked the operation (`Op`) of the error. Going by the name of the test, we actually want to check that we see a timeout after a pre-defined time.

Updated all relevant documentation in docs
Updated all code comments where relevant
Wrote tests
Updated CHANGELOG_PENDING.md

increase readWrite timeout as it is also used in the `Accept` of the tcp listener:
- before this caused the readWriteTimeout to kick in (rarely) while Accept
- as a side-effect: remove obsolete time.Sleep: in both listener cases the Accept will only return after successfully accepting and the timeout that is supposed to be tested here will be triggered because there is a read without a write
- see if we actually run into a timeout error (the whole purpose of this test AFAIU)

Signed-off-by: Ismail Khoffi <Ismail.Khoffi@gmail.com>

codecov-io · 2019-04-18T11:08:14Z

Codecov Report

Merging #3580 into develop will decrease coverage by 0.17%.
The diff coverage is n/a.

@@             Coverage Diff             @@
##           develop    #3580      +/-   ##
===========================================
- Coverage    64.33%   64.15%   -0.18%     
===========================================
  Files          213      213              
  Lines        17410    17345      -65     
===========================================
- Hits         11201    11128      -73     
+ Misses        5292     5291       -1     
- Partials       917      926       +9

Impacted Files	Coverage Δ
p2p/pex/errors.go	`11.11% <0%> (-11.12%)`	⬇️
p2p/pex/pex_reactor.go	`79.44% <0%> (-3.55%)`	⬇️
privval/signer_remote.go	`80% <0%> (-2%)`	⬇️
blockchain/reactor.go	`70.56% <0%> (-1.87%)`	⬇️
consensus/reactor.go	`71.54% <0%> (-1.66%)`	⬇️
rpc/client/httpclient.go	`66.51% <0%> (-1.13%)`	⬇️
p2p/pex/addrbook.go	`67% <0%> (-1%)`	⬇️
consensus/state.go	`78.82% <0%> (-0.59%)`	⬇️
proxy/multi_app_conn.go	`0% <0%> (ø)`	⬆️
blockchain/pool.go	`80.92% <0%> (+0.65%)`	⬆️
... and 4 more

melekes · 2019-04-18T11:52:33Z

privval/socket_listeners_test.go


 func TestListenerTimeoutReadWrite(t *testing.T) {
-	for _, tc := range listenerTestCases(t, time.Second, time.Millisecond) {
+	var (


Suggested change

-	var (
+	const (

melekes · 2019-04-18T11:54:26Z

privval/socket_listeners_test.go

 		}
+
+		if have, want := opErr.Timeout(), true; have != want {
+			t.Errorf("for %s listener, got unexpected error: have %v, want %v", tc.description, have, want)


Suggested change

-			t.Errorf("for %s listener, got unexpected error: have %v, want %v", tc.description, have, want)
+			t.Errorf("for %s listener, got unexpected error: have %v, want timeout error", tc.description, opErr)

melekes · 2019-04-18T11:54:55Z

privval/socket_listeners_test.go

 			t.Errorf("for %s listener, have %v, want %v", tc.description, have, want)
 		}
+
+		if have, want := opErr.Timeout(), true; have != want {


Suggested change

-		if have, want := opErr.Timeout(), true; have != want {
+		if !opErr.Timeout() {

I like to have this consistent with the cases above (would you also change to `if opErr.Op != "read"`?)

Yes, I would

The reason for this consistent form is that when you reformulate the expectations, you can't forget to change the error reported. Less error-prone and less editing.

`for %s listener, got unexpected error: have false, want true`
in its current form does not make sense to a person running the tests

@xla agree if you're talking about above case where we compare Op == read, but not here

Agreed, it doesn't always produce the most readable output. A balance must be struck.

Signed-off-by: Ismail Khoffi <Ismail.Khoffi@gmail.com>

xla

Confusing how an accept could take longer than that, but assuming a noisy environment full of little docker whales will be slower than what 50 years of silicon are capable of. The only thing I'd be wary of is that we mask structural issues with the code by just bumping the timeout. If we are sensitive to that, it warrants investigation, but again this might only be true in the environment our CI runs in.

👍

@xla

…endermint#3580)

This should fix tendermint#3576 (ran it many times locally but only time will tell). The test actually only checked for the opcode of the error. From the name of the test we actually want to test if we see a timeout after a pre-defined time.

## Commits:

* increase readWrite timeout as it is also used in the `Accept` of the tcp listener:
  - before this caused the readWriteTimeout to kick in (rarely) while Accept
  - as a side-effect: remove obsolete time.Sleep: in both listener cases the Accept will only return after successfully accepting and the timeout that is supposed to be tested here will be triggered because there is a read without a write
  - see if we actually run into a timeout error (the whole purpose of this test AFAIU)

  Signed-off-by: Ismail Khoffi <Ismail.Khoffi@gmail.com>

* make local test-vars `const`

  Signed-off-by: Ismail Khoffi <Ismail.Khoffi@gmail.com>

## Additional comments:

@xla: Confusing how an accept could take longer than that, but assuming a noisy environment full of little docker whales will be slower than what 50 years of silicon are capable of. The only thing I'd be wary of is that we mask structural issues with the code by just bumping the timeout. If we are sensitive to that, it warrants investigation, but again this might only be true in the environment our CI runs in.

liamsi requested review from ebuchman, melekes and xla as code owners April 18, 2019 11:02

melekes reviewed Apr 18, 2019

make local test-vars const

e52854b

Signed-off-by: Ismail Khoffi <Ismail.Khoffi@gmail.com>

xla changed the title ~~Increase timeout to mitigate non-deterministic test failure~~ test: Increase timeout to mitigate non-deterministic test failure Apr 18, 2019

xla approved these changes Apr 18, 2019

melekes merged commit 8db7e74 into develop Apr 22, 2019

melekes deleted the ismail/issue3576-non-deterministic-test branch April 22, 2019 08:04

melekes mentioned this pull request Apr 22, 2019

test: TestListenerTimeoutReadWrite non-deterministic test failure #3576

Closed

melekes mentioned this pull request May 7, 2019

v0.31.6 release preview #3637

Closed

36 tasks

melekes mentioned this pull request May 30, 2019

temporary #3684

Closed

44 tasks

melekes merged 2 commits into develop from ismail/issue3576-non-deterministic-test

liamsi Apr 18, 2019

melekes Apr 19, 2019

xla Apr 19, 2019

melekes Apr 19, 2019 • edited

melekes Apr 19, 2019 • edited

xla Apr 20, 2019 • edited

4 participants
