Skip to content

Conversation

@sipa
Copy link
Member

@sipa sipa commented May 13, 2016

Looks like it fixes #8045.

I cannot explain the failure (it happens 100% reproducibly for me, with a clean cache and no other bitcoind running, even when running outside of rpc-tests.py):

$ ./walletbackup.py 
INFO: Initializing test directory /tmp/testnfwna8uq
INFO: Generating initial blockchain
INFO: Creating transactions
INFO: Backing up
INFO: More transactions
INFO: Restoring using wallet.dat
INFO: Re-starting nodes
INFO: Restoring using dumped wallet
Unexpected exception caught during testing: BrokenPipeError(32, 'Broken pipe')
  File "/home/pw/git/bitcoin/qa/rpc-tests/test_framework/test_framework.py", line 138, in main
    self.run_test()
  File "./walletbackup.py", line 193, in run_test
    sync_blocks(self.nodes)
  File "/home/pw/git/bitcoin/qa/rpc-tests/test_framework/util.py", line 127, in sync_blocks
    counts = [ x.getblockcount() for x in rpc_connections ]
  File "/home/pw/git/bitcoin/qa/rpc-tests/test_framework/util.py", line 127, in <listcomp>
    counts = [ x.getblockcount() for x in rpc_connections ]
  File "/home/pw/git/bitcoin/qa/rpc-tests/test_framework/coverage.py", line 49, in __call__
    return_val = self.auth_service_proxy_instance.__call__(*args, **kwargs)
  File "/home/pw/git/bitcoin/qa/rpc-tests/test_framework/authproxy.py", line 137, in __call__
    response = self._request('POST', self.__url.path, postdata)
  File "/home/pw/git/bitcoin/qa/rpc-tests/test_framework/authproxy.py", line 118, in _request
    self.__conn.request(method, path, postdata, headers)
  File "/usr/lib/python3.5/http/client.py", line 1106, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python3.5/http/client.py", line 1151, in _send_request
    self.endheaders(body)
  File "/usr/lib/python3.5/http/client.py", line 1102, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python3.5/http/client.py", line 936, in _send_output
    self.send(message_body)
  File "/usr/lib/python3.5/http/client.py", line 908, in send
    self.sock.sendall(data)
Stopping nodes
WARN: Unable to stop node: CannotSendRequest('Request-sent',)

I also can't explain why Travis doesn't have this problem. I certainly can't explain why this patch fixes it or even remotely has anything to do with the observed problem, but it fixes it 100% reproducibly.

@sipa sipa changed the title Fix walletbackup.py failure Seemingly fix walletbackup.py failure May 13, 2016
@sipa
Copy link
Member Author

sipa commented May 13, 2016

Sense, this makes none: replacing the sync_blocks call with a sleep does not fix it (even when the sleep is longer than how long the sync takes).

@maflcko
Copy link
Member

maflcko commented May 13, 2016

Note that #8045 fails due to Error: Unable to bind to 0.0.0.0:11280 on this computer. Bitcoin Core is probably already running. in the network setup phase. Your exception seems unrelated to #8045 .

kazcw added a commit to kazcw/bitcoin that referenced this pull request May 13, 2016
30s is not enough. An importwallet test case can take up to about 40s my test
machine.

Fixes bitcoin#8051.
@pstratem
Copy link
Contributor

@MarcoFalke I think the unable to bind is just because i failed to kill the bitcoind from a previous test

@sipa sipa force-pushed the fixwalletbackup branch from 3aa7a2b to e7a7f8f Compare May 15, 2016 01:58
@sipa
Copy link
Member Author

sipa commented May 15, 2016

Switched to @sdaftuar's solution

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-rpcservertimeout=90 is now set twice

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@sipa sipa force-pushed the fixwalletbackup branch from e7a7f8f to 8fd0c66 Compare May 17, 2016 03:19
@laanwj
Copy link
Member

laanwj commented May 18, 2016

I don't understand why a server timeout appears to solve this.

What the server timeout does (or is supposed to do, at least) is time out client sessions that are not making a request. They are in the state either before sending a request (or between request or keep-alive), or pausing too long during sending a request (slowloris).

The server should never time out while it is processing a request, no matter how long it takes. If that does happen, that is really really bad. This logic is unfortunately implemented in libevent so not easily accessible to us either.

What is your version of libevent?

@laanwj
Copy link
Member

laanwj commented May 18, 2016

What might happen is that the client hasn't sent a request for a while to the server and thus the server closes the keepalive connection. Python's http doesn't detect this properly and raises an error. I vaguely remember issues with this before.
E.g. see #6695, espeically ddf98d1.

Python's httplib does not graciously handle disconnections from the http server, resulting in BadStatusLine errors.
See https://bugs.python.org/issue3566 "httplib persistent connections violate MUST in RFC2616 sec 8.1.4."

However according to this is supposed to be fixed in Python 3.5. But it looks like you're using that.
I'm not sure how well we cope with Python 3.5's solution though.
Reverting commit ddf98d1 then testing with Python 3.5 should be interesting.

@laanwj
Copy link
Member

laanwj commented May 18, 2016

Ahh so now we get a BrokenPipeError(32, 'Broken pipe') (or CannotSendRequest ? ). I think that's the new Python 3.5 behavior.

I think we should catch that and repeat the request with a new connection if we do, analogous to the except httplib.BadStatusLine as e: handling in ddf98d1.

@arowser
Copy link
Contributor

arowser commented May 25, 2016

Can one of the admins verify this patch?

@sipa sipa force-pushed the fixwalletbackup branch from 8fd0c66 to 4cad179 Compare June 2, 2016 16:35
@sipa
Copy link
Member Author

sipa commented Jun 2, 2016

@laanwj Thanks, that fixes it.

Rebased and updated to retry on BrokenPipeError as well as BadStatusLine.

@maflcko
Copy link
Member

maflcko commented Jun 2, 2016

Also, would be interesting to know why on some systems importwallet takes such a long time.

@maflcko
Copy link
Member

maflcko commented Jun 2, 2016

To be more clear: If there is an issue with importwallet taking a long time, it may be worth looking into that as well. (importwallet on a small regcoin-wallet should take less than a second)

If the goal of this pull is to instead fix a bug in authproxy.py, the commit subject line and pull subject line should be updated accordingly. E.g. "Also retry on BrokenPipeError" or "repeat request with new connection on BrokenPipeError".

@sipa
Copy link
Member Author

sipa commented Jun 2, 2016

Reopening as a new PR in that case.

@sipa sipa closed this Jun 2, 2016
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

WalletBackupTest Fails on Master but Travis Shows as Passing

6 participants