workload: move sampledataccl abstraction and add query workloads #21097

Merged
danhhz merged 4 commits into cockroachdb:master from danhhz:workload
Jan 2, 2018

Conversation

@danhhz
Contributor

@danhhz danhhz commented Dec 28, 2017

The sampledataccl package was started to centralize logic for testdata generation in BACKUP/RESTORE, as well as provide comparable benchmarks between operations running at the sql and kv layers. It has proven useful and is ready to graduate out of ccl.

Before sampledataccl, many ccl/... tests rolled their own test data in one of various forms: sql INSERTS, csv, []string, []engine.MVCCKeyValue, ssts, RocksDB batch reprs. Insidiously the kv testdata generation was different from test to test and recreated just enough of our sql key encoding to get that test to pass. The sampledataccl.Bank abstraction was built to provide exactly the same schema in all of these forms, converting between them using actual production code, so nothing rotted.

This had the wonderful side benefit of making the throughput measurement of benchmarks at the sql and kv layers comparable, which helped considerably during BACKUP/RESTORE performance tuning work.

This problem is not exclusive to ccl; many places in the codebase do this recreation of just enough of our key encoding to get a particular test to pass, which is often subtly incorrect and requires boilerplate that detracts from what the test is doing.

This moves the abstraction out of ccl so it can start to be used more widely. This opportunity is also taken to add support for more than one table as well as the ability to run query workloads (hence the new workload package name).

Upcoming commits will add tooling to run these workloads against clusters (think of the loadgen/kv tool), with significantly less effort than is currently required to add a new loadgen.

We'll also be able to have a single tool for making large test fixtures and to keep this tool tested so it doesn't rot. This has traditionally been done ad-hoc and was a huge obstruction in BACKUP/RESTORE production testing.

Still TODO is teasing the final ccl deps out of the remaining sampledataccl features.

Also port https://github.com/cockroachdb/loadgen/tree/master/kv to be a Generator and add the driver program as a workload cli command.
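The Generator abstraction the PR describes can be sketched roughly as follows. This is a minimal illustration, not the real `pkg/testutils/workload` interface: the names `Table`, `InitialRowCount`, and `InitialRowFn` come from the review discussion below (which notes that `InitialRowFn` returns pre-escaped SQL datums), but the exact signatures here are assumptions.

```go
package main

import (
	"fmt"
	"strconv"
)

// Table describes one table a generator can produce, in every form
// (schema, initial rows) from the same source of truth.
type Table struct {
	Name            string
	Schema          string
	InitialRowCount int
	// InitialRowFn returns pre-escaped SQL datums for one row.
	InitialRowFn func(rowIdx int) []string
}

// Generator is a named workload that declares its tables.
type Generator interface {
	Name() string
	Tables() []Table
}

// bankGen is a toy generator mirroring the shape of the bank schema.
type bankGen struct{}

func (bankGen) Name() string { return "bank" }

func (bankGen) Tables() []Table {
	return []Table{{
		Name:            "bank",
		Schema:          "(id INT PRIMARY KEY, balance INT, payload STRING)",
		InitialRowCount: 3,
		InitialRowFn: func(rowIdx int) []string {
			return []string{
				strconv.Itoa(rowIdx),                // id
				"0",                                 // balance
				fmt.Sprintf("'initial-%d'", rowIdx), // payload
			}
		},
	}}
}

func main() {
	var g Generator = bankGen{}
	t := g.Tables()[0]
	fmt.Println(g.Name(), t.InitialRowCount, t.InitialRowFn(1))
}
```

Because every consumer (SQL inserts, csv, kv pairs, ssts) derives from the same `Table` definition via production encoding code, the forms can't drift apart the way hand-rolled test data did.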

@danhhz danhhz requested review from a team, benesch and tbg December 28, 2017 19:23
@cockroach-teamcity
Member

This change is Reviewable

@tbg
Member

tbg commented Dec 28, 2017

The code LGTM, but could you expand a bit on the vision? My worry right now is that we're introducing something that will compete with the old-style kv (which also runs against cassandra and mongo, or rather did at some point when that was desired). That said, I like that you're removing friction for adding new types of load generators, I'm just not sure how it fits into the overall picture.

I'm also particularly interested in making large test fixtures. Is the plan here to just run the load generator for x ops and then use the resulting data dump? That's fine, but you had at some point talked about making a DistSQL processor that just pulls data out of thin air and which can then be backed up, which sounded geeky but also generally useful, especially to get some low-level control. It's fine if that's out of scope here (having a good SQL-level loadgen driver is great), but a lot of workloads I have in mind need low-level control, and I'm curious if you had anything in mind in this string of commits.

Also, don't you think we should make workload a first-class cmd? After all, it already feels a lot like a better jMeter and I suppose that's also one way in which we ought to be using it. (The bulk of the libs could remain testutils).


Reviewed 13 of 13 files at r1, 2 of 2 files at r2, 6 of 6 files at r3.
Review status: all files reviewed at latest revision, 13 unresolved discussions, some commit checks failed.


pkg/ccl/sqlccl/backup_test.go, line 107 at r1 (raw file):

	}
	// This occasionally flakes, so ignore errors.
	_ = bank.Split(sqlDB.DB, bankData)

Wanna at least log them though? Or perhaps just retry? Nevermind if it really doesn't matter for this test


pkg/testutils/workload/opts.go, line 41 at r3 (raw file):

		}
	}
	return int(x)

Nit: overflow check (but perhaps we can just use int64 for everything). Feel free to ignore, this is obviously very minor.


pkg/testutils/workload/workload.go, line 93 at r1 (raw file):

		panic(errors.Wrapf(err, "could not register: %T", fn))
	}
	registered[g.Name()] = fn
name := g.Name()
if _, ok := registered[name]; ok {
  panic(name + " is already registered")
}
registered[name] = fn

pkg/testutils/workload/workload.go, line 127 at r1 (raw file):

			fmt.Fprintf(&insertStmtBuf, `INSERT INTO %s VALUES `, table.Name)
			rowIdx := 0
			for ; rowIdx < table.InitialRowCount; rowIdx++ {

Is something wrong here? I thought you'd use batchSize.


pkg/testutils/workload/workload.go, line 133 at r1 (raw file):

				insertStmtBuf.WriteString(`(`)
				for i, datum := range table.InitialRowFn(rowIdx) {
					size += int64(len(datum))

This measures the size of the encoded SQL values, so inserting a one-char string would count as three ('x') or more if it's, say, hex-encoded? Gets more intransparent when the generator spits out sin(5.0). I think it's worth calling out in the method what's measured.


pkg/testutils/workload/bank/bank.go, line 99 at r1 (raw file):

				strconv.Itoa(rowIdx), // id
				`0`,                  // balance
				fmt.Sprintf(`'initial-%x'`, randutil.RandBytes(rng, b.payloadBytes)), // payload

This is generally a lot longer than payloadBytes. Isn't that a problem?


pkg/testutils/workload/bank/bank.go, line 112 at r1 (raw file):

}

// Split creates the configured number of ranges in an already created created

remove one created.


pkg/testutils/workload/bank/bank.go, line 115 at r1 (raw file):

// version of the table represented by `g`.
func Split(db *gosql.DB, g workload.Generator) error {
	// TODO(dan): Make this general and move it into the workload package.

This method is now kinda in between: looks very general, is very specialized. I assume it won't stay like that long, so fine by me.


pkg/testutils/workload/bank/bank_test.go, line 28 at r1 (raw file):

func TestSplit(t *testing.T) {
	// defer leaktest.AfterTest(t)()

?


pkg/testutils/workload/bank/bank_test.go, line 35 at r1 (raw file):

		expectedRanges int
	}{
		{10, 0, 1}, // We always have at least one range.

I feel silly even nitting this, but inline comments start lowercase and no trailing dot.


pkg/testutils/workload/cmd/workload/main.go, line 209 at r3 (raw file):

			log.Fatalf(ctx, `invalid opt: %s`, opt)
		}
		opts[optParts[0]] = opts[optParts[1]]

Is this right? I would've expected opts[optParts[0]] = optParts[1].
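The corrected parsing the reviewer expects — assigning the split-out value rather than indexing the map with it — could be sketched like so. The function name and error handling are assumptions for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// parseOpts parses a comma-separated list of key=value pairs into a map,
// with the fix the reviewer pointed out: the value comes from the split
// parts, not from the map itself.
func parseOpts(raw string) (map[string]string, error) {
	opts := make(map[string]string)
	if raw == "" {
		return opts, nil
	}
	for _, opt := range strings.Split(raw, ",") {
		parts := strings.SplitN(opt, "=", 2)
		if len(parts) != 2 {
			return nil, fmt.Errorf("invalid opt: %s", opt)
		}
		opts[parts[0]] = parts[1] // not opts[parts[1]], which is always ""
	}
	return opts, nil
}

func main() {
	opts, err := parseOpts("payload-bytes=100,rows=1000")
	fmt.Println(opts, err)
}
```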


pkg/testutils/workload/cmd/workload/main.go, line 300 at r3 (raw file):

			fmt.Sprintf("duration=%s", *duration),
		}, "/")
		for _, opt := range strings.Split(*optsRaw, `,`) {

Nit: may want to output those in sorted order, at which point you're better off grabbing them from the map.


pkg/testutils/workload/kv/kv.go, line 89 at r3 (raw file):

}

// Opsa implements the Generator interface.

Opsa.


Comments from Reviewable

The sampledataccl package was started to centralize logic for testdata
generation in BACKUP/RESTORE, as well as provide comparable benchmarks
between operations running at the sql and kv layers. It has proven
useful and is ready to graduate out of ccl.

Before sampledataccl, many ccl/... tests rolled their own test data in
one of various forms: sql INSERTS, csv, []string, []engine.MVCCKeyValue,
ssts, RocksDB batch reprs. Insidiously the kv testdata generation was
different from test to test and recreated just enough of our sql key
encoding to get that test to pass. The sampledataccl.Bank abstraction
was built to provide exactly the same schema in all of these forms,
converting between them using actual production code, so nothing rotted.

This had the wonderful side benefit of making the throughput measurement
of benchmarks at the sql and kv layers comparable, which helped
considerably during BACKUP/RESTORE performance tuning work.

This problem is not exclusive to ccl; many places in the codebase do
this recreation of just enough of our key encoding to get a particular
test to pass, which is often subtly incorrect and requires boilerplate
that detracts from what the test is doing.

This commit moves the abstraction out of ccl so it can start to be used
more widely. This opportunity is also taken to add support for more than
one table as well as the ability to run query workloads (hence the new
workload package name).

Upcoming commits will add tooling to run these workloads against
clusters (think of the loadgen/kv tool), with significantly less effort
than is currently required to add a new loadgen.

We'll also be able to have a single tool for making large test fixtures
and to keep this tool tested so it doesn't rot. This has traditionally
been done ad-hoc and was a huge obstruction in BACKUP/RESTORE production
testing.

Still TODO is teasing the final ccl deps out of the remaining
sampledataccl features.

Release note: None
Copied exactly. All changes will be made in the next commit.

Release note: None
Attempt to change as little as possible. There's definitely future work
moving the concurrent workload runner into generator.

    workload -generator kv -drop -duration 2s

Still TODO is creating initial splits and cleaning up the UX of how
Generator options are passed from the commandline.

Release note: None
@danhhz
Contributor Author

danhhz commented Dec 29, 2017

Thanks for the review!

The mongo/cassandra stuff is a good question, ycsb does that too. I doubt it makes sense to add the deps on the mongo/cassandra drivers here. The kv code I took is mostly unchanged, there's no reason it couldn't be exported and used by the crdb vs mongo vs cassandra tool, so the impl only lives in one place. I don't love the idea of multiple very similar tools, but I think on the whole this lowered friction (and upcoming ecosystem) is worth it.

As for what's next, I wasn't really sure where to put this information. I wanted to get the general idea signed off on and merged before making github issues for it. For now, I'll stick it here in response to your comment, but soon I should find somewhere more discoverable.

Large test fixtures and individual kvs are very soon on my mental roadmap. The distsql processor stuff works in a branch, but it's not ready to merge (that code is very tightly coupled right now to what it does, so my branch duplicates far too much). I think there's some reasonable interim solution that writes out CSVs from the generator and then literally calls IMPORT. This has the side effect of making the backup files we need for test fixtures.

Thoughts I've had:

  • Large test fixtures
    • Distributed generation
    • Tested so it doesn't rot
  • engine.MVCCKeyValue generation
  • splits
  • A "kitchen sink" generator that uses as many sql features as possible (interleaved, column families, column types, indexes, splits, fks, partitioning, schema changes, node death, SCRUB)
  • port ycsb/tpcc/tpch
  • play well with roachprod
  • go Benchmark that iterates every registered generator, so we can track them over time
  • post-data ingest, pre-operation arbitrary setup code

Can you describe some of the low level workloads you're talking about? I made a note of #18661 but I assume there are others.

Also, don't you think we should make workload a first-class cmd?

If you don't mind, I'd like to wait on this until I get the splits stuff sorted out so this can be a full replacement for kv (modulo mongo/cassandra).


Review status: 8 of 18 files reviewed at latest revision, 13 unresolved discussions, all commit checks successful.


pkg/ccl/sqlccl/backup_test.go, line 107 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Wanna at least log them though? Or perhaps just retry? Nevermind if it really doesn't matter for this test

Done.


pkg/testutils/workload/opts.go, line 41 at r3 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Nit: overflow check (but perhaps we can just use int64 for everything). Feel free to ignore, this is obviously very minor.

Done.


pkg/testutils/workload/workload.go, line 93 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…
name := g.Name()
if _, ok := registered[name]; ok {
  panic(name + " is already registered")
}
registered[name] = fn

Oh nice, good call.


pkg/testutils/workload/workload.go, line 127 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Is something wrong here? I thought you'd use batchSize.

eek, this was all off


pkg/testutils/workload/workload.go, line 133 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

This measures the size of the encoded SQL values, so inserting a one-char string would count as three ('x') or more if it's, say, hex-encoded? Gets more intransparent when the generator spits out sin(5.0). I think it's worth calling out in the method what's measured.

Done. Also definitely accepting suggestions for better things to do here


pkg/testutils/workload/bank/bank.go, line 99 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

This is generally a lot longer than payloadBytes. Isn't that a problem?

It wasn't for the backup/restore tests but this is a good time to fix it. Done.


pkg/testutils/workload/bank/bank.go, line 112 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

remove one created.

Done.


pkg/testutils/workload/bank/bank.go, line 115 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

This method is now kinda in between: looks very general, is very specialized. I assume it won't stay like that long, so fine by me.

Yeah, ycsb and kv both also do splits. I haven't thought through what the generalization is yet


pkg/testutils/workload/bank/bank_test.go, line 28 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

?

I was testing the go:generate line that enforces these. (TIL this isn't even sufficient because it matches the literal string, which is still here. when i added something else it failed in the way i was expecting)


pkg/testutils/workload/bank/bank_test.go, line 35 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

I feel silly even nitting this, but inline comments start lowercase and no trailing dot.

Done.


pkg/testutils/workload/cmd/workload/main.go, line 209 at r3 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Is this right? I would've expected opts[optParts[0]] = optParts[1].

this file definitely wants tests, but they'll be easier once it's moved out of one big main (which I'd like to save for another PR)


pkg/testutils/workload/cmd/workload/main.go, line 300 at r3 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Nit: may want to output those in sorted order, at which point you're better off grabbing them from the map.

I don't feel strongly at all, but I'd probably expect them to come out in the order I specify, so that they can have some meaningful order. But maybe a stable order makes more sense. I'm going to leave this as a TODO if you don't mind


pkg/testutils/workload/kv/kv.go, line 89 at r3 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Opsa.

Done.


Comments from Reviewable

@petermattis
Collaborator

Nice to see an attempt to remove some of the common boilerplate from load generators.

Is there a reason you structured this as a single workload binary vs making workload a library?


Review status: 8 of 18 files reviewed at latest revision, 14 unresolved discussions, all commit checks successful.


pkg/testutils/workload/cmd/workload/main.go, line 48 at r6 (raw file):

var generator = flag.String("generator", "", "")
var optsRaw = flag.String("opts", "", "")

This seems like it could get awkward. Is there a problem with letting workloads define their own flags? If you're worried about name space pollution, I think the solution to that is to use a more advanced command line interface such as spf13/cobra. Then each generator could be a top-level command and specify its own set of flags. Something like:

~ workload kv --min-block-bytes=1
~ workload bank --balance=50

spf13/cobra is used by both pkg/cli as well as roachperf and roachprod.
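The PR ultimately adopts spf13/cobra for this, but the idea — each generator owning its own flag set, so `workload kv --min-block-bytes=1` and `workload bank --balance=50` don't share a namespace — can be sketched with the stdlib `flag.FlagSet`. The dispatch function and return values here are assumptions for illustration; the flag names are taken from the comment above.

```go
package main

import (
	"flag"
	"fmt"
)

// runSubcommand dispatches to a per-generator FlagSet, so each generator
// defines only the flags it cares about.
func runSubcommand(args []string) (string, error) {
	if len(args) == 0 {
		return "", fmt.Errorf("usage: workload <generator> [flags]")
	}
	switch cmd := args[0]; cmd {
	case "kv":
		fs := flag.NewFlagSet("kv", flag.ContinueOnError)
		minBlockBytes := fs.Int("min-block-bytes", 1, "minimum block size")
		if err := fs.Parse(args[1:]); err != nil {
			return "", err
		}
		return fmt.Sprintf("kv min-block-bytes=%d", *minBlockBytes), nil
	case "bank":
		fs := flag.NewFlagSet("bank", flag.ContinueOnError)
		balance := fs.Int("balance", 0, "initial account balance")
		if err := fs.Parse(args[1:]); err != nil {
			return "", err
		}
		return fmt.Sprintf("bank balance=%d", *balance), nil
	default:
		return "", fmt.Errorf("unknown generator: %s", cmd)
	}
}

func main() {
	out, _ := runSubcommand([]string{"bank", "--balance=50"})
	fmt.Println(out)
}
```

cobra adds help text, shared persistent flags, and nicer usage output on top of this same per-subcommand structure.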


Comments from Reviewable

@danhhz
Contributor Author

danhhz commented Dec 29, 2017

I left it as a binary to minimize the change from the existing kv code.

Ditto for cobra, and I agree we should be using that instead (I've already started to reinvent the wheel more than I'd like in opts.go). The options were originally a map[string]string for easy serialization when parts of the generation are distributed, but maybe I can make cobra parse a []string or something and that would be the better interface.

@tbg
Member

tbg commented Dec 29, 2017

:lgtm:


Reviewed 10 of 10 files at r4, 2 of 2 files at r5, 6 of 6 files at r6.
Review status: all files reviewed at latest revision, 5 unresolved discussions, all commit checks successful.


pkg/testutils/workload/workload.go, line 133 at r1 (raw file):

Previously, danhhz (Daniel Harrison) wrote…

Done. Also definitely accepting suggestions for better things to do here

The main problem seems to be that InitialRowFn returns pre-escaped data. If that weren't the case you could do somewhat better, but not clear to me whether it's worth it.


pkg/testutils/workload/workload.go, line 119 at r4 (raw file):

// batchSize will only be used when non-zero (but inserts are batched either way).


pkg/testutils/workload/bank/bank.go, line 104 at r4 (raw file):

				strconv.Itoa(rowIdx), // id
				`0`,                  // balance
				fmt.Sprintf(`'%s%s'`, initialPrefix, bytes), // payload

initialPrefix + bytes


pkg/testutils/workload/cmd/workload/main.go, line 300 at r3 (raw file):

Previously, danhhz (Daniel Harrison) wrote…

I don't feel strongly at all, but I'd probably expect them to come out in the order I specify, so that they can have some meaningful order. But maybe a stable order makes more sense. I'm going to leave this as a TODO if you don't mind

No, I don't mind. My argument for ordering is that it decides the test name, and that should be stable w.r.t permutation of the opts.


Comments from Reviewable

@danhhz
Contributor Author

danhhz commented Dec 29, 2017

Yup, using cobra and switching to a []string for configuration worked like a charm. The cli interface is now a drop-in replacement for the existing kv tool (modulo the TODO for splits).


Review status: 13 of 17 files reviewed at latest revision, 5 unresolved discussions.


pkg/testutils/workload/workload.go, line 133 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

The main problem seems to be that InitialRowFn returns pre-escaped data. If that weren't the case you could do somewhat better, but not clear to me whether it's worth it.

I had []interface{} at one point but this ended up being easier to reason about. I might switch back to it at some point though so there's not so much string serialization and parsing


pkg/testutils/workload/workload.go, line 119 at r4 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

// batchSize will only be used when non-zero (but inserts are batched either way).

Done.


pkg/testutils/workload/bank/bank.go, line 104 at r4 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

initialPrefix + bytes

Done.


pkg/testutils/workload/cmd/workload/main.go, line 300 at r3 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

No, I don't mind. My argument for ordering is that it decides the test name, and that should be stable w.r.t permutation of the opts.

It's now printed in stable order via Visit (happens to be alphabetical)


pkg/testutils/workload/cmd/workload/main.go, line 48 at r6 (raw file):

Previously, petermattis (Peter Mattis) wrote…

This seems like it could get awkward. Is there a problem with letting workloads define their own flags? If you're worried about name space pollution, I think the solution to that is to use a more advanced command line interface such as spf13/cobra. Then each generator could be a top-level command and specify its own set of flags. Something like:

~ workload kv --min-block-bytes=1
~ workload bank --balance=50

spf13/cobra is used by both pkg/cli as well as roachperf and roachprod.

I was planning on doing this later, but I'm glad you prodded me into doing it now. The cli interface is MUCH nicer


Comments from Reviewable

@tbg
Member

tbg commented Dec 30, 2017

The flag change is super nice. Still :lgtm:.


Reviewed 5 of 5 files at r7.
Review status: all files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.


pkg/testutils/workload/cmd/workload/main.go, line 303 at r7 (raw file):

			fmt.Sprintf("duration=%s", *duration),
		}, "/")
		// NB This visits in a deterministic order

// NB: This visits in a deterministic order.

but don't eat CI for this.


Comments from Reviewable

@petermattis
Collaborator

:lgtm:


Review status: all files reviewed at latest revision, 4 unresolved discussions, all commit checks successful.


pkg/testutils/workload/cmd/workload/main.go, line 53 at r7 (raw file):

// concurrency = number of concurrent insertion processes.
var concurrency = rootCmd.PersistentFlags().Int("concurrency", 2*runtime.NumCPU(), "Number of concurrent writers inserting blocks")

Many of these lines should be wrapped at 100 columns.


pkg/testutils/workload/cmd/workload/main.go, line 56 at r7 (raw file):

// batch = number of blocks to insert in a single SQL statement.
var batch = rootCmd.PersistentFlags().Int("batch", 1, "Number of blocks to insert in a single SQL statement")

I would have thought this a workload specific flag. Update: perhaps this is detritus.


pkg/testutils/workload/cmd/workload/main.go, line 122 at r7 (raw file):

}

type blocker struct {

s/blocker/worker/g

The blocker name is from the old block_writer load generator. It doesn't make sense in this context.


Comments from Reviewable

This allows for a vastly superior cli interface, help text for each
configuration option, and removes the options parsing library that was
a wheel starting to be re-invented.

    workload kv --drop --batch 10 --max-ops 5 --concurrency 5 --max-block-bytes 4

Also address some code review feedback that I didn't feel like carefully
rebasing back into the relevant commits.

Release note: None
@danhhz
Contributor Author

danhhz commented Jan 2, 2018

Thanks for the reviews!


Review status: 16 of 17 files reviewed at latest revision, 4 unresolved discussions, all commit checks successful.


pkg/testutils/workload/cmd/workload/main.go, line 53 at r7 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Many of these lines should be wrapped at 100 columns.

I don't find that more readable in this case, but I'll defer to my reviewer. Done here and elsewhere.


pkg/testutils/workload/cmd/workload/main.go, line 56 at r7 (raw file):

Previously, petermattis (Peter Mattis) wrote…

I would have thought this a workload specific flag. Update: perhaps this is detritus.

Yeah, this is awkward because it's used for max-ops. I've been waiting to port the TPCC query loads over before cleaning this up too much because I think they'll be instructive in which generalizations I'll need


pkg/testutils/workload/cmd/workload/main.go, line 122 at r7 (raw file):

Previously, petermattis (Peter Mattis) wrote…

s/blocker/worker/g

The blocker name is from the old block_writer load generator. It doesn't make sense in this context.

Done.


pkg/testutils/workload/cmd/workload/main.go, line 303 at r7 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

// NB: This visits in a deterministic order.

but don't eat CI for this.

Done.


Comments from Reviewable

@danhhz danhhz merged commit 61c5c08 into cockroachdb:master Jan 2, 2018
@danhhz danhhz deleted the workload branch January 2, 2018 16:10
@petermattis
Collaborator

Review status: 16 of 17 files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.


pkg/testutils/workload/cmd/workload/main.go, line 53 at r7 (raw file):

Previously, danhhz (Daniel Harrison) wrote…

I don't find that more readable in this case, but I'll defer to my reviewer. Done here and elsewhere.

Reviewable wraps into unreadability at 100 columns.


Comments from Reviewable
