roachtest: Index overload automation of TPC-E#90005
roachtest: Index overload automation of TPC-E#90005andrewbaptist wants to merge 2 commits intocockroachdb:masterfrom
Conversation
b14f2d0 to
f4c557c
Compare
73f08d3 to
74543ca
Compare
irfansharif
left a comment
There was a problem hiding this comment.
I know that @dt's working on adding roachtest libraries to snapshot GCE images and use them for quicker turn around times. He's also actually doing it for TPC-E restores I think. This PR should wait until that gets done -- this is far too long running and $$$ to be useful otherwise.
|
What if we merged it with the skip test flag - since the ability to run skipped tests is about to merge |
There was a problem hiding this comment.
I'm ok merging this test as skipped until the GCE snapshotting work but we ought to improve the TPC-E harness first here instead of duplicating a bunch of code (even if this was the easiest thing to do at the start). Left some comments below but the TL;DR is lets make this as good as the TPC-C harness; I think it's doable since it's not a ton of code for either today. Feel free to do this work in whatever PRs/commits make life easiest for you.
| "github.com/cockroachdb/errors" | ||
| ) | ||
|
|
||
| // TODO(baptist): This shares a lot of code with tpce.go. Consider ways to refactor. |
pkg/cmd/roachtest/tests/admission_control_index_overload_tpce.go
Outdated
Show resolved
Hide resolved
|
|
||
| // Wait for the load to run for a while before starting the index build. | ||
| select { | ||
| case <-time.After(opts.delayBeforeIndex): |
There was a problem hiding this comment.
Look at how the TPC-C harness does things. It's very flexible, and better in many regards. There we have that During callback where test authors can inject whatever sleep they want during a test run, and this sort of thing would apply. So I would suggest improving the TPC-E test harness first as part of this PR and then adding a good use of it. It'll also encourage more TPC-E roachtests from other teams.
| t.Status("creating index (~30min)") | ||
|
|
||
| db := c.Conn(ctx, t.L(), 1) | ||
| _, err := db.ExecContext(ctx, "CREATE INDEX ON tpce.cash_transaction (ct_dts)") |
There was a problem hiding this comment.
The same suggestion as the other PR -- use an explicit, unique name, suffixed perhaps by the timestamp this was run at so that repeated runs are made easy.
There was a problem hiding this comment.
I realized when iterating with that one that not specifying a name gives the flexibility to re-run the identical create index (the system does this automatically) - so its not necessary to pick an explicit name unless you want to drop it in the end. Since I wasn't dropping I didn't pick a name. Let me know if you think I should change this.
| }) | ||
|
|
||
| // This test takes about 1 hours to run. | ||
| r.Add(registry.TestSpec{ |
There was a problem hiding this comment.
Meh, let's just have the one test above and make sure it runs <1h. And also, within reason, support either --local and/or --skip-init if you use an already set up cluster to keep re-running things without the bulk import. This TPC-E harness also needs to integrate with prometheus/grafana (I see you've done so manually above), let's do that in this PR -- steal as many ideas from the TPC-C harness when doing so.
There was a problem hiding this comment.
I'm not sure I understand this request. I don't think it is possible to create a test that runs <1hr, uses default settings and sees this overload issue (since it needs 1000 L0 files as part of index creation). I wasn't sure how best to approach this, so definitely open to ideas on how to do this (combination of skipped tests, short test that doesn't show the issue or other).
There was a problem hiding this comment.
Removed the small test below.
74543ca to
3ee13b4
Compare
3ee13b4 to
0d53ff0
Compare
This patch adds the ability to run tests against n2 standard from within roachprod / roachtest. Epic: none Release note: None
13c6837 to
0a24449
Compare
This introduces a test which runs an index creation. During the index creation the latency of foreground traffic spikes even though it is not actually using the index. The cause of this is contention for resources between the foreground workload and the index creation at multiple levels (CPU, IO, Pebble sublevels). Release note: None Epic: none
0a24449 to
d8e50bd
Compare
|
This is subsumed by #103816. |
This patch adds the ability to run tests against n2 standard from within roachprod / roachtest.
Release note: None
Epic: None