docs/rfc: add testnet RFC by mark-rushakoff · Pull Request #9124 · tendermint/tendermint

mark-rushakoff · 2022-07-28T20:56:27Z

Following several discussions internal to the Tendermint engineering
team, I am posting an RFC discussing the high-level details of the
Tendermint team owning and operating a long-lived testnet in order to
build experience running Tendermint, and to demonstrate that Tendermint
is stable under production workloads.

The outcome of this RFC will be a new track of work to begin building
and maintaining a testnet associated with the main branch of tendermint.
See the "Testnet MVP" section specifically for some of the first
milestones.

Note: I added the RFC at the path where it will live once #9115 is
merged, which restores the RFC layout from the v0.36.x branch.
docs/rfc/README.md will need to be updated to include this RFC once
#9115 is merged.

This RFC is related to #9078.

thanethomson

Sounds like a great idea to me. What're the next steps from here?

We should probably establish when the best time would be to get the MVP up and running. Is this something that'd add value during our Q3 work?

cason

Great document.

cason · 2022-07-29T07:36:16Z

docs/rfc/rfc-022-semi-permanent-testnet.md

+  but rather to demonstrate that Tendermint blockchains as a whole can be stable
+  under a production load.
+  Of course we will inject faults periodically, but the intent is to observe and prove that
+  the testnet is resilient to those faults.


From this I gather that we can inject malicious or faulty behavior, but that this behavior should not, in principle, halt the chain or produce unrecoverable errors in it?

I've thought about this for a few minutes and I'm not sure how to add it to the document.

If you knew that running a particular command did something strange on a 3-node testnet, and you were curious what effect it would have on a 100-node testnet, I think it would be okay to run that against the main testnet, even with an uncertain risk that it would halt the testnet.

But if you were certain that you could halt the main testnet with a single command, then I would recommend that you not run it -- unless perhaps you were going to confirm that the next software update was capable of recovering from said halt.

This is a kind of fuzzy area that I think is fine to omit from the document for now.
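
The rule of thumb in this reply could be written down as a small decision function. This is only an illustrative sketch in Go: the names `HaltRisk`, `RiskUncertain`, `RiskCertainHalt`, and `ShouldRunOnMainTestnet` are hypothetical and appear neither in the RFC nor in the Tendermint codebase, and the two risk levels are a simplification of what the comment calls a fuzzy area.

```go
package main

import "fmt"

// HaltRisk is a hypothetical label for how confident the operator is
// that an experiment will halt the main testnet.
type HaltRisk int

const (
	// RiskUncertain: the command did something strange on a small
	// testnet, and its effect on a large one is unknown.
	RiskUncertain HaltRisk = iota
	// RiskCertainHalt: the command is known to halt the testnet.
	RiskCertainHalt
)

// ShouldRunOnMainTestnet encodes the rule of thumb from the reply above.
// recoveryTestPlanned reports whether the goal is to confirm that the
// next software update can recover from the halt.
func ShouldRunOnMainTestnet(risk HaltRisk, recoveryTestPlanned bool) bool {
	switch risk {
	case RiskUncertain:
		// An uncertain risk of halting is acceptable.
		return true
	case RiskCertainHalt:
		// A certain halt is acceptable only as part of a recovery test.
		return recoveryTestPlanned
	default:
		// Unknown labels are treated conservatively.
		return false
	}
}

func main() {
	fmt.Println(ShouldRunOnMainTestnet(RiskUncertain, false))
	fmt.Println(ShouldRunOnMainTestnet(RiskCertainHalt, false))
	fmt.Println(ShouldRunOnMainTestnet(RiskCertainHalt, true))
}
```

Anything between "uncertain" and "certain" is deliberately left out, which matches the suggestion to omit this area from the document for now.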

williambanfield

Thanks for putting this together! All the comments I've made are intended as points of discussion, and I'm very happy with where the doc is so far.

williambanfield · 2022-07-29T15:29:07Z