-
Notifications
You must be signed in to change notification settings - Fork 38.7k
net: Send txs without delay on regtest #15881
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I don't think we want to do this generically, as it will prevent testing the normal relay logic? |
|
This preserves the normal relay logic except that the delay is always zero (for inbound and outbound peers). Testing that there is a difference in the delay between inbound and outbound peers was already impossible, I believe, since we can not create outbound connections. |
It's not fundamentally impossible though, is it? I think remember using part of some stuck PR (Luke's?) to test both directions. Anyway, could you motivate this change a bit? I see some description in #15858, but it is closed now :) |
|
Added a commit to make it a regtest command line flag |
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
|
I really don't like special casing the tests like this, the more we do this the more tests become useless just for show things-- and, of course, the special casing code itself can contribute to bugs too. Wouldn't it be better to run some of these tests in parallel so that their latency is less relevant? |
|
@gmaxwell but if there are no tests for delay, what is the point of keeping delay in tests at all? If it breaks, no one will notice it anyway. Also on:
How about starting 3 nodes like this: |
|
By the way, @MarcoFalke, if your intent is to only speedup several tests in which p2p is not test subject - you can just whitelist peers in such tests. |
|
I'd rather not whitelist peers in tests. Not only is it tedious to pass in the command line flag, but also it affects DoS scores and some other things, I believe. |
|
Concept ACK adding this as default in the functional tests.
I can't think of any cases in the functional tests where we do anything except submit a transaction and wait for it to propagate. Running with Notably, I don't think there are any tests in the functional test suite that explicitly test the poisson distribution for tx relay. Really, that should be tested at the unit test level. We could add a single functional test with
In general I agree, but the change here is so minor that I think it's ok. I'm much more concerned about other shortcomings in our functional test suite, such as not having peers outbound peers or peers on non-local network addresses.
That's already the default |
|
My instinct here is that our design goal should be to make the p2p connections in regtest be more like mainnet, and not less. We already have a problem that in regtest, we don't exercise our networking code enough (as mentioned in #14210 -- an issue I opened after I got the impression that many people were unaware of our code-coverage deficiences with respect to p2p behavior). So adding more special cases seems like it's going in the wrong direction. On the other hand it would be great for the tests to run faster. My suggestion would be to adopt an approach more like #15858 was trying to do, where we leave bitcoind unchanged but choose to make certain tests opt-out of testing the p2p behavior. I think that also makes it easier to reason about what a test is testing, if you can see clearly in the test what parts of the code are being exercised (rather than implicitly relying on some non-standard behaviors we've opted bitcoind into in the test framework). I only briefly looked at #15858 so far but I think it seemed to be going in the right direction. Having individual tests be modified one-by-one to opt into a faster behavior seems fine to me; reviewers can determine whether the code coverage change is logical or not based on what the test is doing and what the rest of the test suite already covers. The only caveat I'd add is that I think we should make it more explicit whether a call to As an aside, I think someone should just add a test that tests poisson delays -- that is currently possible in our test framework (inbounds can be tested via mininode, outbounds can be tested with multiple bitcoind's because there's no difference in our p2p logic between addnode peers and regular outbound peers, I believe). So I'm a weak concept NACK on adding functionality to bitcoind that allows changing the relay delays; I'm a strong concept NACK on defaulting all our functional tests into special-cased behavior (if we were to go with this approach, I think we should only opt-in tests that we choose to exercise non-standard behavior on, rather than take a blanket approach). Regarding parallelization, I think the point is that perhaps developers can immediately start saving their own time (and achieve the same benefits of this PR) by just increasing the number of simultaneous jobs they run with? I often run |
|
The response to "the tests run slow" shouldn't be "just run more of them in parallel". We might have users or developers that would like to run the tests on weak hardware (e.g. dual core cpu or hdd). Those constraints set a limit on how many tests can be run in parallel. Some of the tests might have random sleeps in them, but you can't rely on those because they are -- after all -- random. It already takes about an hour to compile from scratch and run all tests on a dual core cpu with hdd. I don't see how compilation could be sped up any time soon (it would probably require getting rid of the boost hpp includes and the serialize.h includes, neither trivially possible). So having the tests run faster with a low risk setting like this seems like a nice way to kick the can down the road. Because the flag can be set on the command line, it shouldn't hinder anyone from increasing the test coverage of our tx relay code. I do see the argument that regtest should be as close to mainnet as possible, but there is also a reason to have some regtest values differ from mainnet, when they aid testing. |
|
🙁 I much prefer this approach to #15858 and don't see why adding 30 lines of gnarly topo-sort code in the test framework is in any way preferable to adding the one-line change: bool fSendTrickle = pto->fWhitelisted;
bool fSendTrickle = connman->m_tx_relay_force_flush || pto->fWhitelisted;in The only change in this PR is one of timing. The other PR completely changes how transactions are relayed between nodes on the test network, so in fact this PR is results in less difference in behaviour from a 'real' network. I can easily come up with scenarios where a regression in tx relay would be caught with this change, but hidden by the change in #15858 which basically removes p2p tx relay between nodes under test. So my preference would be to use this PR, but default to off to satisfy the concerns of @sipa @gmaxwell and @sdaftuar. It can be explicitly enabled in the tests that @MarcoFalke picked out in 15881. |
Why not? At least to the extent that slowness is due to sleeps in the protocol design then running them in parallel is a perfectly reasonable response to it.
Because they are part of the system. The tests (for the most part) are system tests. Even if there is not a specific test for the delay behaviour in most possible ways that it could break, if it broke, it would break the test. Consider, if a rare crasher bug, say a divide by zero, were introduced in PoissonNextSend then currently the tests could eventually catch it even though there is no direct test-- but if that code was never run it could never be caught by the tests. Also, having additional conditional behaviour in the codebase complicates the code -- opportunities for confusion, time wasted fixing bugs (or testing code) which is specific to the test environment. What happens when some bool sign sense gets flipped and the delay skipping gets turned on in mainnet? Or when there is a race in getdata handling that doesn't work without delays? Additional conditional test only code also frustrates efforts to improve the code by hitting high multiple-condition-decision branch coverage (because of dead production code in tests, and dead test code in coverage). Three times faster is no joke. But the further we make the tested system less like the actual system the less useful the tests become. We could make them infinity faster by removing them completely... @jnewbery 's one line version is a lot more attractive. And while I wouldn't seen any problem if some particularly tx-relay-serailization-delay sensitive tests used it, we really shouldn't be doing the bulk of our system testing that way simply because its different from production. We should be trying to minimize the difference to production... What we could do is identify tests where the p2p behaviour is irrelevant to the test itself and the source of a large delay and bypass it for just those tests. However, we should take care that doing so doesn't leave some part of the production system unused in the tests (e.g. we shouldn't do this to all tests that relay transactions...). I feel like maybe we've lost the distinction between system and unit testing, and we're asking tests which are structurally system tests to work and perform like unit tests. In general I'm not the biggest fan of unit testing and prefer to achieve as much tests via system tests as possible, but when performance considerations are causing us to test increasingly more modified versions of the system it might make sense to split things up and stop trying to use the whole system when what we're trying to do is a unit test. |
|
It looks that the main concern is that this PR introduces a new case condition and if we misapply it, some part of the code (concretely What if we instead of enabling/disabling the delay, we pass a parameter to the function (actually |
|
@kostyantyn yes I think if this kind of approach is to be taken at all, it should be parameterized and shortened by default. |
|
@kostyantyn The "unit test" aspect is only part of it. In complex systems many interesting bugs emerge out of interactions of multiple parts. If all the tests significantly change the delays then conditions that are only reachable in the presence of delays won't get hit. Testing both with and without delays is probably better, but if only one is done it should be kept closer to actual use. |
44441db to
4fd2061
Compare
|
I tend to agree with @gmaxwell, @sdaftuar, @sipa.
IMO the last important property a test should have is beeing fast (not saying its completely irrelevant). |
|
Devs are less likely to run the tests (or a specific test) if it takes too long to run. We might as well remove slow tests if only one or two devs run them on high-end hardware. Long tests are excluded on travis and I doubt that a bunch of developers on different architectures pass in Moreover, our unit or functional tests are horribly unsuitable to find actual issues with the p2p code in a scenario such as mainnet. The transaction workload is trivial and highly synthetic, not at all comparable to mainet. I do see the argument of dead code paths when there is no delay by default, so I'll rework this pull to default to the mainnet setting unless a test opts in to the 0-delay. Testing both settings a bit might actually lead to more test coverage, as race conditions might be easier to trigger with a 0-delay. |
4fd2061 to
60d70a5
Compare
60d70a5 to
fa96670
Compare
fa96670 to
fadb718
Compare
Totally agree with this. I doubt anyone runs the dbcache test for example. When rebasing a PR with many commits, I would like to run the entire test suite over every commit to make sure that no intermediate commits break anything. Having very long running tests precludes this.
Again, very much agree with this. My view is that the functional tests test functionality. If we want to catch p2p issues we should build a stress test and a soak test environment, and also run a bunch of nodes on mainnet with heavy diags/logging/analytics/etc. |
IMO fast tests are very good. It encourages running the tests more often, makes Travis faster, and usually indicates the tests don't spend a lot of time on waiting and polling loops, or doing unnecessary stuff that doesn't increase test coverage and causes potential random errors (and ever higher parallelism increases this latter risk by affecting delays randomly). So in this case: yes, normal transaction propagation should be tested, but also, this is only necessary in part of the tests that deal with testing the P2P network. In the rest it's allowable to make some shortcuts, like disabling or speeding up randomized behavior. What I don't like (in this particular approach) is that bitcoind accumulates more test-specific behavior and functions, which make it harder to maintain and which (as @gmaxwell says) blurs the line between unit and functional tests, so I personally prefer #15858's appraoch. That's not a hard NACK on this though. I was just missing the context when seeing that PR closed. |
That other approach is going to bypass p2p tx relay completely. So is guaranteed to decrease test coverage. This pull request is a one-line patch to net_processing (off by default) so that both scenarios are tested. This is increasing test coverage, as I explained above. |
0e7c90e test: speed up wallet_avoidreuse.py (Jon Atack) 6d50b26 test: add logging to wallet_avoidreuse.py (Jon Atack) Pull request description: Inspired by PRs #17340 and #15881. - add logging - pass -whitelist in `set_test_params` to speed up transaction relay `wallet_avoidreuse.py` is not intended to test P2P transaction relay/timing, so it should be fine to do this here. This reduces test run time variability and speeds up the test by 2-3 times on average. Test run times in seconds: - before: 20, 24, 22, 17, 27, 40, 30 - after: 10, 10, 8, 9, 10, 7, 8 ACKs for top commit: MarcoFalke: ACK 0e7c90e 🐊 fanquake: ACK 0e7c90e Tree-SHA512: 6d954a0aaf402c9594201626b59d29263479059e68fa5155bb44ed973cd0c3347729dd78b78b4d5a2275e45da365dc1afb4cc7e3293dea33fcc2e3e83a39faf5
0e7c90e test: speed up wallet_avoidreuse.py (Jon Atack) 6d50b26 test: add logging to wallet_avoidreuse.py (Jon Atack) Pull request description: Inspired by PRs bitcoin#17340 and bitcoin#15881. - add logging - pass -whitelist in `set_test_params` to speed up transaction relay `wallet_avoidreuse.py` is not intended to test P2P transaction relay/timing, so it should be fine to do this here. This reduces test run time variability and speeds up the test by 2-3 times on average. Test run times in seconds: - before: 20, 24, 22, 17, 27, 40, 30 - after: 10, 10, 8, 9, 10, 7, 8 ACKs for top commit: MarcoFalke: ACK 0e7c90e 🐊 fanquake: ACK 0e7c90e Tree-SHA512: 6d954a0aaf402c9594201626b59d29263479059e68fa5155bb44ed973cd0c3347729dd78b78b4d5a2275e45da365dc1afb4cc7e3293dea33fcc2e3e83a39faf5
Send txs immediately to all peers, just like blocks are solved immediately on regtest.
Overall, the tests run three times as fast for me now, because they no longer wait for the poisson tx send to timeout.
This is only a regtest change and it is possible to disable it with
-tx_relay_force_flush=0should specific testing needs require it.Alternative to: