workload/schemachange: improve errors; initial changes by jayshrivastava · Pull Request #55521 · cockroachdb/cockroach

jayshrivastava · 2020-10-13T18:13:41Z

This PR contains the groundwork for error handling and what we want to do with random complex schema generation. I think this is a good point to stop and access if we want to continue on this path.

Simpler commits:

workload/schemachange: improve abstraction for probabilistic error production
This adds a global knob for which we can use to control errors due to existing/non-existing resources. This replaces a bunch of int literals.
workload/schemachange: allow schema to be specified when fetching a random Table
When calling randTable it is sometimes useful to provide a schema name. This commit adds this functionality.
workload/schemachange: fix malformed column and index names
Fixes a small bug were we would have names such as "colpublic.table1_1" instead of "col1_1". The latter is correct.
workload/schemachange: sync column names with tables
Fixes a small bug. We want columns for "tableN" to start with "colN", but there was a bug where we would use N+1.

Refactoring:

workload/schemachange: refactor worker into op builder
Moves op generation logic from the schemaChangeWorker to a new struct

Error Handling:

workload/schemachange: add expected error handling
adds states to the op generator for storing errors - it also updates the error handling code in the worker
workload/schemachange: screen errors in addColumn op
workload/schemachange: update views and tables to allow multiple source tables and views
(Also screen for errors in these ops)
workload/schemachange: screen errors in addColumn op

cockroach-teamcity · 2020-10-13T18:13:49Z

This change is

jayshrivastava

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)

pkg/workload/schemachange/operation_generator.go, line 289 at r8 (raw file):

			errCode = &pgcode.UndefinedObject
		}
	}

When the NotNull constraint is present, the behaviour with returning either a NotNullViolation or DuplicateColumn is pretty unpredictable. I thought I had it correct here, but eventually this happened:

root@localhost:26257/schemachange> create table foo(t TIMESTAMPTZ);
CREATE TABLE
Time: 80ms total (execution 80ms / network 0ms)

root@localhost:26257/schemachange> insert into foo values ('2016-06-22 19:10:25-07');
INSERT 1
Time: 43ms total (execution 43ms / network 0ms)

root@localhost:26257/schemachange> ALTER TABLE foo ADD COLUMN t TIMESTAMPTZ NOT NULL;
ERROR: null value in column "t" violates not-null constraint
SQLSTATE: 23502

In a way this is good because it adds test coverage. The downsides would be having to figure out exactly what causes each error and having to put this level of detail into each op. In this case, it may even be nondeterministic. It might be a good idea to return set of errors instead of one and checking that the resultant error exists in the set.

ajwerner

I think I'm cool with this refactor though I'd love to get more insight into where you plan on taking it. Right now, it doesn't feel like you're getting a lot out of lifting the functionality onto this generator struct. What do you have in mind for its future?

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @jayshrivastava)

pkg/workload/schemachange/operation_generator.go, line 127 at r8 (raw file):

// stochastic attempts and if verbosity is >= 2 the unsuccessful attempts are
// recorded in `log` to help with debugging of the workload.
func (og operationGenerator) randOp(tx *pgx.Tx) (string, string, *pgcode.Code, error) {

my sense is that this should be a method on the pointer, otherwise you can't modify og. I get that right now you aren't trying to but moving forward I suspect you will be trying to.

ajwerner

This is very close. Let's get it merged soon. Mostly just nits about the errorCodeSet.

Reviewed 1 of 3 files at r13.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jayshrivastava)

pkg/workload/schemachange/error_code_set.go, line 31 at r17 (raw file):

}

func (set *errorCodeSet) addIdempotent(code pgcode.Code) {

nit: I don't think Idempotent adds a lot to the name of this method.

pkg/workload/schemachange/error_code_set.go, line 31 at r17 (raw file):

}

func (set *errorCodeSet) addIdempotent(code pgcode.Code) {

These don't need to be pointers, maps are already reference data types (i.e. a map value is mutable).

pkg/workload/schemachange/error_code_set.go, line 35 at r17 (raw file):

}

func (set *errorCodeSet) addIdempotentWithConditions(codesWithConditions []codeWithCond) {

nit: lose the idempotent, I think set implies it. There's also something

pkg/workload/schemachange/error_code_set.go, line 44 at r17 (raw file):

func (set *errorCodeSet) reset() {
	*set = map[pgcode.Code]bool{}

nit: it's better to do the delete, the go compiler detects the idiom and does an efficient clear: golang/go#20138

for k := range *set {
    delete(*set, k)
}

pkg/workload/schemachange/error_screening.go, line 33 at r17 (raw file):

Quoted 11 lines of code…

	q := fmt.Sprintf(`SELECT EXISTS (
	SELECT table_name
    FROM information_schema.tables
   WHERE table_schema = '%s'
     AND table_name = '%s'
   )`, tableName.SchemaName, tableName.ObjectName.String())
	var exists bool
	if err := tx.QueryRow(q).Scan(&exists); err != nil {
		return false, err
	}
	return exists, nil

feels like you could extract a helper that takes a query sql (and maybe its args, you know you don't need to always string format the query, right?), and returns a bool and error. Might help some with the repetition her.

…oduction Before, we would use int literals in several places the code to specify how often we want a name conflict error to occur probabilisically. This change refactors this behavior to use a global knob for organization purposes. Release note: None

…andom table Before, you could not specify the schema when calling randTable(). This gave us less control over when errors occur. For example, when renaming a table, an error is produced if you specify a different schema than the original table's schema. Now, a schema can be specified to control when an error occurs in this case. Release note: None

Previously, the table name would sometimes get interpolated into column or index names. For example, a name such as "colpublic.table1_1" would be generated instead of "col1_1". This commit fixes this behavior. Release note: None

It is desired that the columns of a table have the table all have the table index number in the name. For example, "colN_M" should exist on "tableN". Previously, there was a double increment causing the numbers to misalign. This commit fixes this behavior. Release note: None

jayshrivastava

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @jayshrivastava)

pkg/workload/schemachange/error_code_set.go, line 35 at r17 (raw file):

Previously, ajwerner wrote…

nit: lose the idempotent, I think set implies it. There's also something

I don't think you finished the comment. I changed the name. I also think this fcn is useful for saving some space compared to having several if statements in a row with set.add()

if .... {
    set.add(code)
}

pkg/workload/schemachange/error_screening.go, line 33 at r17 (raw file):

Previously, ajwerner wrote…

	q := fmt.Sprintf(`SELECT EXISTS (
	SELECT table_name
    FROM information_schema.tables
   WHERE table_schema = '%s'
     AND table_name = '%s'
   )`, tableName.SchemaName, tableName.ObjectName.String())
	var exists bool
	if err := tx.QueryRow(q).Scan(&exists); err != nil {
		return false, err
	}
	return exists, nil
feels like you could extract a helper that takes a query sql (and maybe its args, you know you don't need to always string format the query, right?), and returns a bool and error. Might help some with the repetition her.

I'm not exactly sure what you mean here. I felt that storing the SQL string templates inside these functions and interpolating the args with fmt.Sprintf reduces duplication because we only store the string templates in one place.

ajwerner

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @jayshrivastava)

pkg/workload/schemachange/error_code_set.go, line 35 at r17 (raw file):

Previously, jayshrivastava (Jayant Shrivastava) wrote…

I don't think you finished the comment. I changed the name. I also think this fcn is useful for saving some space compared to having several if statements in a row with set.add()
if .... {
    set.add(code)
}

I agree with that. It's a handy construct. I was wondering whether I liked it as a method and I decided I couldn't really fault it. I was thinking maybe it's be better to invert it so that it's the slice with the method:

type codesWithConditions []struct {
    code         pgcode.Code
    condition bool
}

func (c codesWithConditions) add(s errorCodeSet) {
     for _, cc := range c {
         if cc.condition {
              s.add(cc.code)
         }
     }
}

   codesWithConditions{
       {code: ..., condition: ...},
   }.add(s)

pkg/workload/schemachange/error_code_set.go, line 52 at r26 (raw file):

Quoted 4 lines of code…

	if _, ok := set[code]; ok {
		return true
	}
	return false

total nit:

_, ok := set[code]
return ok

pkg/workload/schemachange/error_code_set.go, line 61 at r26 (raw file):

Quoted 6 lines of code…

	var codes []string
	for code := range set {
		codes = append(codes, code.String())
	}
	return strings.Join(codes, ",")
}

nits: probably you'll want this to be deterministic, so doing a sort.Strings beforehand might be nice

pkg/workload/schemachange/error_code_set.go, line 65 at r26 (raw file):

Quoted 4 lines of code…

	for range set {
		return false
	}
	return true

nit: return len(set) == 0

pkg/workload/schemachange/error_screening.go, line 33 at r17 (raw file):
I meant something like:

func scanBool(tx *pgx.Tx, query string, args ...interface) (b bool, err error) {
    err = tx.QueryRow(query, args...).Scan(&b)
    return b, err
}

then tableExists could be:

func tableExists(tx *pgx.Tx, tableName *tree.TableName) (bool, err) {
    return scanBool(`
SELECT EXISTS(
        SELECT table_name
          FROM information_schema.tables
         WHERE table_schema = $1 AND table_name = $2
       );
`, tableName.SchemaName, tableName.ObjectName.String())

}

I felt that storing the SQL string templates inside these functions and interpolating the args with fmt.Sprintf reduces duplication because we only store the string templates in one place.

I'm not sure I understand the value of the Sprintf vs placeholders.

Previously all operation generation logic was stateless and tied to the schemaChangeWorker. Creating an independent builder struct will make it easier to keep track of state when generating ops that relate to each other. Release note: None

Previously, there was no mechanism for checking for expected errors vs unexpected errors when operations fail due to intentionally produced errors. This commit adds logic for screening the types of errors that are generated and state variables to track which errors are expected. Release note: None

This change updates the addColumn op to screen for potential errors. The op now updates the state of the op generator so that expected errors can safely be ignored. Release note: None

…ce tables or views Previously, the ops createView and createTableAs would only create resources based on one table. This commit updates those ops so that they allow for multiple souce views and/or tables. It also adds error screening to the ops. Release note: None

Updates error screening so that expected errors are explicit and safely ignored in createTable op Release note: None

jayshrivastava

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)

pkg/workload/schemachange/error_code_set.go, line 31 at r17 (raw file):

Previously, ajwerner wrote…

These don't need to be pointers, maps are already reference data types (i.e. a map value is mutable).

Done.

pkg/workload/schemachange/error_code_set.go, line 52 at r26 (raw file):

Previously, ajwerner wrote…

	if _, ok := set[code]; ok {
		return true
	}
	return false

total nit:

_, ok := set[code]
return ok

Done.

pkg/workload/schemachange/error_code_set.go, line 61 at r26 (raw file):

Previously, ajwerner wrote…

	var codes []string
	for code := range set {
		codes = append(codes, code.String())
	}
	return strings.Join(codes, ",")
}
nits: probably you'll want this to be deterministic, so doing a sort.Strings beforehand might be nice

Done.

pkg/workload/schemachange/operation_generator.go, line 127 at r8 (raw file):

Previously, ajwerner wrote…

my sense is that this should be a method on the pointer, otherwise you can't modify og. I get that right now you aren't trying to but moving forward I suspect you will be trying to.

Done.

ajwerner

Thanks for iterating on this with me. I hope you're pleased with where it ended up.

Reviewed 1 of 3 files at r22, 1 of 4 files at r28, 1 of 3 files at r29, 2 of 3 files at r30, 1 of 1 files at r31.
Reviewable status: complete! 1 of 0 LGTMs obtained

jayshrivastava

Ofc! Thank you for reviewing it thoroughly.

Reviewable status: complete! 1 of 0 LGTMs obtained

jayshrivastava · 2020-10-28T13:08:04Z

bors r=ajwerner

craig · 2020-10-28T14:42:58Z

Build succeeded:

GitHub CI (Cockroach)

jayshrivastava force-pushed the improve-errors-schemachange-workload branch 2 times, most recently from 726d4bc to 312f96c Compare October 15, 2020 21:54

jayshrivastava changed the title ~~WIP Improve errors schemachange workload~~ improve errors schemachange workload; initial changes Oct 15, 2020

jayshrivastava force-pushed the improve-errors-schemachange-workload branch 7 times, most recently from d552602 to 9403c19 Compare October 16, 2020 18:36

jayshrivastava requested a review from ajwerner October 16, 2020 18:53

jayshrivastava commented Oct 16, 2020

View reviewed changes

ajwerner reviewed Oct 20, 2020

View reviewed changes

jayshrivastava force-pushed the improve-errors-schemachange-workload branch 6 times, most recently from 59bdc1d to 8ecfbeb Compare October 23, 2020 00:04

jayshrivastava changed the title ~~improve errors schemachange workload; initial changes~~ workload/schemachange: improve errors; initial changes Oct 23, 2020

jayshrivastava force-pushed the improve-errors-schemachange-workload branch 4 times, most recently from 5b62f8a to 6c9c5fa Compare October 26, 2020 20:42

ajwerner suggested changes Oct 27, 2020

View reviewed changes

jayshrivastava added 4 commits October 27, 2020 14:30

jayshrivastava force-pushed the improve-errors-schemachange-workload branch 2 times, most recently from 6ba1f70 to 54da2ea Compare October 27, 2020 18:56

jayshrivastava commented Oct 27, 2020

View reviewed changes

jayshrivastava requested a review from ajwerner October 27, 2020 19:11

ajwerner reviewed Oct 27, 2020

View reviewed changes

jayshrivastava added 5 commits October 27, 2020 18:47

workload/schemachange: screen errors in addColumn op

8ab699a

This change updates the addColumn op to screen for potential errors. The op now updates the state of the op generator so that expected errors can safely be ignored. Release note: None

workload/schemachange: screen for errors in createTable op

1a677e8

Updates error screening so that expected errors are explicit and safely ignored in createTable op Release note: None

jayshrivastava force-pushed the improve-errors-schemachange-workload branch from 54da2ea to 1a677e8 Compare October 27, 2020 22:49

jayshrivastava commented Oct 28, 2020

View reviewed changes

jayshrivastava requested a review from ajwerner October 28, 2020 13:00

ajwerner approved these changes Oct 28, 2020

View reviewed changes

jayshrivastava commented Oct 28, 2020

View reviewed changes

craig bot merged commit 3c74a82 into cockroachdb:master Oct 28, 2020

This was referenced Oct 29, 2020

workload/schemachange: improve error classification, random schema generation, and observability #56118

Closed

workload/schemachange: classify and handle errors in all operations #56119

Closed

jayshrivastava deleted the improve-errors-schemachange-workload branch October 31, 2020 18:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

workload/schemachange: improve errors; initial changes#55521

workload/schemachange: improve errors; initial changes#55521
craig[bot] merged 9 commits intocockroachdb:masterfrom
jayshrivastava:improve-errors-schemachange-workload

jayshrivastava commented Oct 13, 2020 •

edited

Loading

Uh oh!

cockroach-teamcity commented Oct 13, 2020

Uh oh!

jayshrivastava left a comment

Uh oh!

ajwerner left a comment

Uh oh!

ajwerner left a comment

Uh oh!

jayshrivastava left a comment

Uh oh!

ajwerner left a comment

Uh oh!

jayshrivastava left a comment

Uh oh!

ajwerner left a comment

Uh oh!

jayshrivastava left a comment

Uh oh!

jayshrivastava commented Oct 28, 2020

Uh oh!

craig bot commented Oct 28, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

jayshrivastava commented Oct 13, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cockroach-teamcity commented Oct 13, 2020

Uh oh!

jayshrivastava left a comment

Choose a reason for hiding this comment

Uh oh!

ajwerner left a comment

Choose a reason for hiding this comment

Uh oh!

ajwerner left a comment

Choose a reason for hiding this comment

Uh oh!

jayshrivastava left a comment

Choose a reason for hiding this comment

Uh oh!

ajwerner left a comment

Choose a reason for hiding this comment

Uh oh!

jayshrivastava left a comment

Choose a reason for hiding this comment

Uh oh!

ajwerner left a comment

Choose a reason for hiding this comment

Uh oh!

jayshrivastava left a comment

Choose a reason for hiding this comment

Uh oh!

jayshrivastava commented Oct 28, 2020

Uh oh!

craig bot commented Oct 28, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

jayshrivastava commented Oct 13, 2020 •

edited

Loading