workload: move sampledataccl abstraction and add query workloads #21097

Merged
danhhz merged 4 commits into cockroachdb:master from danhhz:workload
Jan 2, 2018

Conversation

@danhhz
Contributor

@danhhz danhhz commented Dec 28, 2017

The sampledataccl package was started to centralize logic for testdata generation in BACKUP/RESTORE, as well as provide comparable benchmarks between operations running at the sql and kv layers. It has proven useful and is ready to graduate out of ccl.

Before sampledataccl, many ccl/... tests rolled their own test data in one of various forms: sql INSERTS, csv, []string, []engine.MVCCKeyValue, ssts, RocksDB batch reprs. Insidiously the kv testdata generation was different from test to test and recreated just enough of our sql key encoding to get that test to pass. The sampledataccl.Bank abstraction was built to provide exactly the same schema in all of these forms, converting between them using actual production code, so nothing rotted.

This had the wonderful side benefit of making the throughput measurement of benchmarks at the sql and kv layers comparable, which helped considerably during BACKUP/RESTORE performance tuning work.

This problem is not exclusive to ccl; many places in the codebase do this recreation of just enough of our key encoding to get a particular test to pass, which is often subtly incorrect and requires boilerplate that detracts from what the test is doing.

This moves the abstraction out of ccl so it can start to be used more widely. This opportunity is also taken to add support for more than one table as well as the ability to run query workloads (hence the new workload package name).

Upcoming commits will add tooling to run these workloads against clusters (think of the loadgen/kv tool), with significantly less effort than is currently required to add a new loadgen.

We'll also be able to have a single tool for making large test fixtures and to keep this tool tested so it doesn't rot. This has traditionally been done ad-hoc and was a huge obstruction in BACKUP/RESTORE production testing.

Still TODO is teasing the final ccl deps out of the remaining sampledataccl features.

Also port https://github.com/cockroachdb/loadgen/tree/master/kv to be a Generator and add the driver program as a workload cli command.
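The Generator abstraction the PR describes can be sketched roughly as follows. This is a minimal illustration, not the real `pkg/testutils/workload` interface: the names `Table`, `InitialRowCount`, and `InitialRowFn` come from the review discussion below (which notes that `InitialRowFn` returns pre-escaped SQL datums), but the exact signatures here are assumptions.

```go
package main

import (
	"fmt"
	"strconv"
)

// Table describes one table a generator can produce, in every form
// (schema, initial rows) from the same source of truth.
type Table struct {
	Name            string
	Schema          string
	InitialRowCount int
	// InitialRowFn returns pre-escaped SQL datums for one row.
	InitialRowFn func(rowIdx int) []string
}

// Generator is a named workload that declares its tables.
type Generator interface {
	Name() string
	Tables() []Table
}

// bankGen is a toy generator mirroring the shape of the bank schema.
type bankGen struct{}

func (bankGen) Name() string { return "bank" }

func (bankGen) Tables() []Table {
	return []Table{{
		Name:            "bank",
		Schema:          "(id INT PRIMARY KEY, balance INT, payload STRING)",
		InitialRowCount: 3,
		InitialRowFn: func(rowIdx int) []string {
			return []string{
				strconv.Itoa(rowIdx),                // id
				"0",                                 // balance
				fmt.Sprintf("'initial-%d'", rowIdx), // payload
			}
		},
	}}
}

func main() {
	var g Generator = bankGen{}
	t := g.Tables()[0]
	fmt.Println(g.Name(), t.InitialRowCount, t.InitialRowFn(1))
}
```

Because every consumer (SQL inserts, csv, kv pairs, ssts) derives from the same `Table` definition via production encoding code, the forms can't drift apart the way hand-rolled test data did.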

@danhhz danhhz requested review from a team, benesch and tbg December 28, 2017 19:23
@cockroach-teamcity
Member

This change is Reviewable

@tbg
Member

tbg commented Dec 28, 2017

The code LGTM, but could you expand a bit on the vision? My worry right now is that we're introducing something that will compete with the old-style kv (which also runs against cassandra and mongo, or rather did at some point when that was desired). That said, I like that you're removing friction for adding new types of load generators, I'm just not sure how it fits into the overall picture.

I'm also particularly interested in making large test fixtures. Is the plan here to just run the load generator for x ops and then use the resulting data dump? That's fine, but you had at some point talked about making a DistSQL processor that just pulls data out of thin air and which can then be backed up, which sounded geeky but also generally useful, especially to get some low-level control. It's fine if that's out of scope here (having a good SQL-level loadgen driver is great), but a lot of workloads I have in mind need low-level control, and I'm curious if you had anything in mind in this string of commits.

Also, don't you think we should make workload a first-class cmd? After all, it already feels a lot like a better jMeter and I suppose that's also one way in which we ought to be using it. (The bulk of the libs could remain testutils).


Reviewed 13 of 13 files at r1, 2 of 2 files at r2, 6 of 6 files at r3.
Review status: all files reviewed at latest revision, 13 unresolved discussions, some commit checks failed.


pkg/ccl/sqlccl/backup_test.go, line 107 at r1 (raw file):

	}
	// This occasionally flakes, so ignore errors.
	_ = bank.Split(sqlDB.DB, bankData)

Wanna at least log them though? Or perhaps just retry? Nevermind if it really doesn't matter for this test


pkg/testutils/workload/opts.go, line 41 at r3 (raw file):

		}
	}
	return int(x)

Nit: overflow check (but perhaps we can just use int64 for everything). Feel free to ignore, this is obviously very minor.


pkg/testutils/workload/workload.go, line 93 at r1 (raw file):

		panic(errors.Wrapf(err, "could not register: %T", fn))
	}
	registered[g.Name()] = fn
name := g.Name()
if _, ok := registered[name]; ok {
  panic(name + " is already registered")
}
registered[name] = fn

pkg/testutils/workload/workload.go, line 127 at r1 (raw file):

			fmt.Fprintf(&insertStmtBuf, `INSERT INTO %s VALUES `, table.Name)
			rowIdx := 0
			for ; rowIdx < table.InitialRowCount; rowIdx++ {

Is something wrong here? I thought you'd use batchSize.


pkg/testutils/workload/workload.go, line 133 at r1 (raw file):

				insertStmtBuf.WriteString(`(`)
				for i, datum := range table.InitialRowFn(rowIdx) {
					size += int64(len(datum))

This measures the size of the encoded SQL values, so inserting a one-char string would count as three ('x') or more if it's, say, hex-encoded? Gets more intransparent when the generator spits out sin(5.0). I think it's worth calling out in the method what's measured.


pkg/testutils/workload/bank/bank.go, line 99 at r1 (raw file):

				strconv.Itoa(rowIdx), // id
				`0`,                  // balance
				fmt.Sprintf(`'initial-%x'`, randutil.RandBytes(rng, b.payloadBytes)), // payload

This is generally a lot longer than payloadBytes. Isn't that a problem?


pkg/testutils/workload/bank/bank.go, line 112 at r1 (raw file):

}

// Split creates the configured number of ranges in an already created created

remove one created.


pkg/testutils/workload/bank/bank.go, line 115 at r1 (raw file):

// version of the table represented by `g`.
func Split(db *gosql.DB, g workload.Generator) error {
	// TODO(dan): Make this general and move it into the workload package.

This method is now kinda in between: looks very general, is very specialized. I assume it won't stay like that long, so fine by me.


pkg/testutils/workload/bank/bank_test.go, line 28 at r1 (raw file):

func TestSplit(t *testing.T) {
	// defer leaktest.AfterTest(t)()

?


pkg/testutils/workload/bank/bank_test.go, line 35 at r1 (raw file):

		expectedRanges int
	}{
		{10, 0, 1}, // We always have at least one range.

I feel silly even nitting this, but inline comments start lowercase and no trailing dot.


pkg/testutils/workload/cmd/workload/main.go, line 209 at r3 (raw file):

			log.Fatalf(ctx, `invalid opt: %s`, opt)
		}
		opts[optParts[0]] = opts[optParts[1]]

Is this right? I would've expected opts[optParts[0]] = optParts[1].
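The corrected parsing the reviewer expects — assigning the split-out value rather than indexing the map with it — could be sketched like so. The function name and error handling are assumptions for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// parseOpts parses a comma-separated list of key=value pairs into a map,
// with the fix the reviewer pointed out: the value comes from the split
// parts, not from the map itself.
func parseOpts(raw string) (map[string]string, error) {
	opts := make(map[string]string)
	if raw == "" {
		return opts, nil
	}
	for _, opt := range strings.Split(raw, ",") {
		parts := strings.SplitN(opt, "=", 2)
		if len(parts) != 2 {
			return nil, fmt.Errorf("invalid opt: %s", opt)
		}
		opts[parts[0]] = parts[1] // not opts[parts[1]], which is always ""
	}
	return opts, nil
}

func main() {
	opts, err := parseOpts("payload-bytes=100,rows=1000")
	fmt.Println(opts, err)
}
```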


pkg/testutils/workload/cmd/workload/main.go, line 300 at r3 (raw file):

			fmt.Sprintf("duration=%s", *duration),
		}, "/")
		for _, opt := range strings.Split(*optsRaw, `,`) {

Nit: may want to output those in sorted order, at which point you're better off grabbing them from the map.


pkg/testutils/workload/kv/kv.go, line 89 at r3 (raw file):

}

// Opsa implements the Generator interface.

Opsa.


Comments from Reviewable

The sampledataccl package was started to centralize logic for testdata
generation in BACKUP/RESTORE, as well as provide comparable benchmarks
between operations running at the sql and kv layers. It has proven
useful and is ready to graduate out of ccl.

Before sampledataccl, many ccl/... tests rolled their own test data in
one of various forms: sql INSERTS, csv, []string, []engine.MVCCKeyValue,
ssts, RocksDB batch reprs. Insidiously the kv testdata generation was
different from test to test and recreated just enough of our sql key
encoding to get that test to pass. The sampledataccl.Bank abstraction
was built to provide exactly the same schema in all of these forms,
converting between them using actual production code, so nothing rotted.

This had the wonderful side benefit of making the throughput measurement
of benchmarks at the sql and kv layers comparable, which helped
considerably during BACKUP/RESTORE performance tuning work.

This problem is not exclusive to ccl; many places in the codebase do
this recreation of just enough of our key encoding to get a particular
test to pass, which is often subtly incorrect and requires boilerplate
that detracts from what the test is doing.

This commit moves the abstraction out of ccl so it can start to be used
more widely. This opportunity is also taken to add support for more than
one table as well as the ability to run query workloads (hence the new
workload package name).

Upcoming commits will add tooling to run these workloads against
clusters (think of the loadgen/kv tool), with significantly less effort
than is currently required to add a new loadgen.

We'll also be able to have a single tool for making large test fixtures
and to keep this tool tested so it doesn't rot. This has traditionally
been done ad-hoc and was a huge obstruction in BACKUP/RESTORE production
testing.

Still TODO is teasing the final ccl deps out of the remaining
sampledataccl features.

Release note: None
Copied exactly. All changes will be made in the next commit.

Release note: None
Attempt to change as little as possible. There's definitely future work
moving the concurrent workload runner into generator.

    workload -generator kv -drop -duration 2s

Still TODO is creating initial splits and cleaning up the UX of how
Generator options are passed from the commandline.

Release note: None
@danhhz
Contributor Author

danhhz commented Dec 29, 2017

Thanks for the review!

The mongo/cassandra stuff is a good question, ycsb does that too. I doubt it makes sense to add the deps on the mongo/cassandra drivers here. The kv code I took is mostly unchanged, there's no reason it couldn't be exported and used by the crdb vs mongo vs cassandra tool, so the impl only lives in one place. I don't love the idea of multiple very similar tools, but I think on the whole this lowered friction (and upcoming ecosystem) is worth it.

As for what's next, I wasn't really sure where to put this information. I wanted to get the general idea signed off on and merged before making github issues for it. For now, I'll stick it here in response to your comment, but soon I should find somewhere more discoverable.

Large test fixtures and individual kvs are very soon on my mental roadmap. The distsql processor stuff works in a branch, but it's not ready to merge (that code is very tightly coupled right now to what it does, so my branch duplicates far too much). I think there's some reasonable interim solution that writes out CSVs from the generator and then literally calls IMPORT. This has the side effect of making the backup files we need for test fixtures.

Thoughts I've had:

  • Large test fixtures
    • Distributed generation
    • Tested so it doesn't rot
  • engine.MVCCKeyValue generation
  • splits
  • A "kitchen sink" generator that uses as many sql features as possible (interleaved, column families, column types, indexes, splits, fks, partitioning, schema changes, node death, SCRUB)
  • port ycsb/tpcc/tpch
  • play well with roachprod
  • go Benchmark that iterates every registered generator, so we can track them over time
  • post-data ingest, pre-operation arbitrary setup code

Can you describe some of the low level workloads you're talking about? I made a note of #18661 but I assume there are others.

Also, don't you think we should make workload a first-class cmd?

If you don't mind, I'd like to wait on this until I get the splits stuff sorted out so this can be a full replacement for kv (modulo mongo/cassandra).


Review status: 8 of 18 files reviewed at latest revision, 13 unresolved discussions, all commit checks successful.


pkg/ccl/sqlccl/backup_test.go, line 107 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Wanna at least log them though? Or perhaps just retry? Nevermind if it really doesn't matter for this test

Done.


pkg/testutils/workload/opts.go, line 41 at r3 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Nit: overflow check (but perhaps we can just use int64 for everything). Feel free to ignore, this is obviously very minor.

Done.


pkg/testutils/workload/workload.go, line 93 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…
name := g.Name()
if _, ok := registered[name]; ok {
  panic(name + " is already registered")
}
registered[name] = fn

Oh nice, good call.


pkg/testutils/workload/workload.go, line 127 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Is something wrong here? I thought you'd use batchSize.

eek, this was all off


pkg/testutils/workload/workload.go, line 133 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

This measures the size of the encoded SQL values, so inserting a one-char string would count as three ('x') or more if it's, say, hex-encoded? Gets more intransparent when the generator spits out sin(5.0). I think it's worth calling out in the method what's measured.

Done. Also definitely accepting suggestions for better things to do here


pkg/testutils/workload/bank/bank.go, line 99 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

This is generally a lot longer than payloadBytes. Isn't that a problem?

It wasn't for the backup/restore tests but this is a good time to fix it. Done.


pkg/testutils/workload/bank/bank.go, line 112 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

remove one created.

Done.


pkg/testutils/workload/bank/bank.go, line 115 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

This method is now kinda in between: looks very general, is very specialized. I assume it won't stay like that long, so fine by me.

Yeah, ycsb and kv both also do splits. I haven't thought through what the generalization is yet


pkg/testutils/workload/bank/bank_test.go, line 28 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

?

I was testing the go:generate line that enforces these. (TIL this isn't even sufficient because it matches the literal string, which is still here. when i added something else it failed in the way i was expecting)


pkg/testutils/workload/bank/bank_test.go, line 35 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

I feel silly even nitting this, but inline comments start lowercase and no trailing dot.

Done.


pkg/testutils/workload/cmd/workload/main.go, line 209 at r3 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Is this right? I would've expected opts[optParts[0]] = optParts[1].

this file definitely wants tests, but they'll be easier once it's moved out of one big main (which I'd like to save for another PR)


pkg/testutils/workload/cmd/workload/main.go, line 300 at r3 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Nit: may want to output those in sorted order, at which point you're better off grabbing them from the map.

I don't feel strongly at all, but I'd probably expect them to come out in the order I specify, so that they can have some meaningful order. But maybe a stable order makes more sense. I'm going to leave this as a TODO if you don't mind


pkg/testutils/workload/kv/kv.go, line 89 at r3 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Opsa.

Done.


Comments from Reviewable

@petermattis
Collaborator

Nice to see an attempt to remove some of the common boilerplate from load generators.

Is there a reason you structured this as a single workload binary vs making workload a library?


Review status: 8 of 18 files reviewed at latest revision, 14 unresolved discussions, all commit checks successful.


pkg/testutils/workload/cmd/workload/main.go, line 48 at r6 (raw file):

var generator = flag.String("generator", "", "")
var optsRaw = flag.String("opts", "", "")

This seems like it could get awkward. Is there a problem with letting workloads define their own flags? If you're worried about name space pollution, I think the solution to that is to use a more advanced command line interface such as spf13/cobra. Then each generator could be a top-level command and specify its own set of flags. Something like:

~ workload kv --min-block-bytes=1
~ workload bank --balance=50

spf13/cobra is used by both pkg/cli as well as roachperf and roachprod.
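The PR ultimately adopts spf13/cobra for this, but the idea — each generator owning its own flag set, so `workload kv --min-block-bytes=1` and `workload bank --balance=50` don't share a namespace — can be sketched with the stdlib `flag.FlagSet`. The dispatch function and return values here are assumptions for illustration; the flag names are taken from the comment above.

```go
package main

import (
	"flag"
	"fmt"
)

// runSubcommand dispatches to a per-generator FlagSet, so each generator
// defines only the flags it cares about.
func runSubcommand(args []string) (string, error) {
	if len(args) == 0 {
		return "", fmt.Errorf("usage: workload <generator> [flags]")
	}
	switch cmd := args[0]; cmd {
	case "kv":
		fs := flag.NewFlagSet("kv", flag.ContinueOnError)
		minBlockBytes := fs.Int("min-block-bytes", 1, "minimum block size")
		if err := fs.Parse(args[1:]); err != nil {
			return "", err
		}
		return fmt.Sprintf("kv min-block-bytes=%d", *minBlockBytes), nil
	case "bank":
		fs := flag.NewFlagSet("bank", flag.ContinueOnError)
		balance := fs.Int("balance", 0, "initial account balance")
		if err := fs.Parse(args[1:]); err != nil {
			return "", err
		}
		return fmt.Sprintf("bank balance=%d", *balance), nil
	default:
		return "", fmt.Errorf("unknown generator: %s", cmd)
	}
}

func main() {
	out, _ := runSubcommand([]string{"bank", "--balance=50"})
	fmt.Println(out)
}
```

cobra adds help text, shared persistent flags, and nicer usage output on top of this same per-subcommand structure.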


Comments from Reviewable

@danhhz
Contributor Author

danhhz commented Dec 29, 2017

I left it as a binary to minimize the change from the existing kv code.

Ditto for cobra, and I agree we should be using that instead (I've already started to reinvent the wheel more than I'd like in opts.go). The options were originally a map[string]string for easy serialization when parts of the generation are distributed, but maybe I can make cobra parse a []string or something and that would be the better interface.

@tbg
Member

tbg commented Dec 29, 2017

:lgtm:


Reviewed 10 of 10 files at r4, 2 of 2 files at r5, 6 of 6 files at r6.
Review status: all files reviewed at latest revision, 5 unresolved discussions, all commit checks successful.


pkg/testutils/workload/workload.go, line 133 at r1 (raw file):

Previously, danhhz (Daniel Harrison) wrote…

Done. Also definitely accepting suggestions for better things to do here

The main problem seems to be that InitialRowFn returns pre-escaped data. If that weren't the case you could do somewhat better, but not clear to me whether it's worth it.


pkg/testutils/workload/workload.go, line 119 at r4 (raw file):

// batchSize will only be used when non-zero (but inserts are batched either way).


pkg/testutils/workload/bank/bank.go, line 104 at r4 (raw file):

				strconv.Itoa(rowIdx), // id
				`0`,                  // balance
				fmt.Sprintf(`'%s%s'`, initialPrefix, bytes), // payload

initialPrefix + bytes


pkg/testutils/workload/cmd/workload/main.go, line 300 at r3 (raw file):

Previously, danhhz (Daniel Harrison) wrote…

I don't feel strongly at all, but I'd probably expect them to come out in the order I specify, so that they can have some meaningful order. But maybe a stable order makes more sense. I'm going to leave this as a TODO if you don't mind

No, I don't mind. My argument for ordering is that it decides the test name, and that should be stable w.r.t permutation of the opts.


Comments from Reviewable

@danhhz
Contributor Author

danhhz commented Dec 29, 2017

Yup, using cobra and switching to a []string for configuration worked like a charm. The cli interface is now a drop-in replacement for the existing kv tool (modulo the TODO for splits).


Review status: 13 of 17 files reviewed at latest revision, 5 unresolved discussions.


pkg/testutils/workload/workload.go, line 133 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

The main problem seems to be that InitialRowFn returns pre-escaped data. If that weren't the case you could do somewhat better, but not clear to me whether it's worth it.

I had []interface{} at one point but this ended up being easier to reason about. I might switch back to it at some point though so there's not so much string serialization and parsing


pkg/testutils/workload/workload.go, line 119 at r4 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

// batchSize will only be used when non-zero (but inserts are batched either way).

Done.


pkg/testutils/workload/bank/bank.go, line 104 at r4 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

initialPrefix + bytes

Done.


pkg/testutils/workload/cmd/workload/main.go, line 300 at r3 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

No, I don't mind. My argument for ordering is that it decides the test name, and that should be stable w.r.t permutation of the opts.

It's now printed in stable order via Visit (happens to be alphabetical)


pkg/testutils/workload/cmd/workload/main.go, line 48 at r6 (raw file):

Previously, petermattis (Peter Mattis) wrote…

This seems like it could get awkward. Is there a problem with letting workloads define their own flags? If you're worried about name space pollution, I think the solution to that is to use a more advanced command line interface such as spf13/cobra. Then each generator could be a top-level command and specify its own set of flags. Something like:

~ workload kv --min-block-bytes=1
~ workload bank --balance=50

spf13/cobra is used by both pkg/cli as well as roachperf and roachprod.

I was planning on doing this later, but I'm glad you prodded me into doing it now. The cli interface is MUCH nicer


Comments from Reviewable

@tbg
Member

tbg commented Dec 30, 2017

The flag change is super nice. Still :lgtm:.


Reviewed 5 of 5 files at r7.
Review status: all files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.


pkg/testutils/workload/cmd/workload/main.go, line 303 at r7 (raw file):

			fmt.Sprintf("duration=%s", *duration),
		}, "/")
		// NB This visits in a deterministic order

// NB: This visits in a deterministic order.

but don't eat CI for this.


Comments from Reviewable

@petermattis
Collaborator

:lgtm:


Review status: all files reviewed at latest revision, 4 unresolved discussions, all commit checks successful.


pkg/testutils/workload/cmd/workload/main.go, line 53 at r7 (raw file):

// concurrency = number of concurrent insertion processes.
var concurrency = rootCmd.PersistentFlags().Int("concurrency", 2*runtime.NumCPU(), "Number of concurrent writers inserting blocks")

Many of these lines should be wrapped at 100 columns.


pkg/testutils/workload/cmd/workload/main.go, line 56 at r7 (raw file):

// batch = number of blocks to insert in a single SQL statement.
var batch = rootCmd.PersistentFlags().Int("batch", 1, "Number of blocks to insert in a single SQL statement")

I would have thought this a workload specific flag. Update: perhaps this is detritus.


pkg/testutils/workload/cmd/workload/main.go, line 122 at r7 (raw file):

}

type blocker struct {

s/blocker/worker/g

The blocker name is from the old block_writer load generator. It doesn't make sense in this context.


Comments from Reviewable

This allows for a vastly superior cli interface, help text for each
configuration option, and removes the options parsing library that was
a wheel starting to be re-invented.

    workload kv --drop --batch 10 --max-ops 5 --concurrency 5 --max-block-bytes 4

Also address some code review feedback that I didn't feel like carefully
rebasing back into the relevant commits.

Release note: None
@danhhz
Contributor Author

danhhz commented Jan 2, 2018

Thanks for the reviews!


Review status: 16 of 17 files reviewed at latest revision, 4 unresolved discussions, all commit checks successful.


pkg/testutils/workload/cmd/workload/main.go, line 53 at r7 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Many of these lines should be wrapped at 100 columns.

I don't find that more readable in this case, but I'll defer to my reviewer. Done here and elsewhere.


pkg/testutils/workload/cmd/workload/main.go, line 56 at r7 (raw file):

Previously, petermattis (Peter Mattis) wrote…

I would have thought this a workload specific flag. Update: perhaps this is detritus.

Yeah, this is awkward because it's used for max-ops. I've been waiting to port the TPCC query loads over before cleaning this up too much because I think they'll be instructive in which generalizations I'll need


pkg/testutils/workload/cmd/workload/main.go, line 122 at r7 (raw file):

Previously, petermattis (Peter Mattis) wrote…

s/blocker/worker/g

The blocker name is from the old block_writer load generator. It doesn't make sense in this context.

Done.


pkg/testutils/workload/cmd/workload/main.go, line 303 at r7 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

// NB: This visits in a deterministic order.

but don't eat CI for this.

Done.


Comments from Reviewable

@danhhz danhhz merged commit 61c5c08 into cockroachdb:master Jan 2, 2018
@danhhz danhhz deleted the workload branch January 2, 2018 16:10
@petermattis
Collaborator

Review status: 16 of 17 files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.


pkg/testutils/workload/cmd/workload/main.go, line 53 at r7 (raw file):

Previously, danhhz (Daniel Harrison) wrote…

I don't find that more readable in this case, but I'll defer to my reviewer. Done here and elsewhere.

Reviewable wraps into unreadability at 100 columns.


Comments from Reviewable
