workload/schemachange support user defined schemas in workload#54889
Conversation
93ed9b5 to
2ca209d
Compare
|
|
||
| stmt := rowenc.RandCreateTable(w.rng, "table", int(atomic.AddInt64(w.seqNum, 1))) | ||
| stmt.Table = tree.MakeUnqualifiedTableName(tree.Name(tableName)) | ||
| stmt.Table = tree.MakeTableNameWithSchema((tree.Name)(w.database), (tree.Name)(schemaName), (tree.Name)(tableName)) |
There was a problem hiding this comment.
I'm not sure if specifying the best db is the best thing to do here. It might be better to leave it blank until the workload supports multiple dbs.
| return "", err | ||
| } | ||
|
|
||
| stmt := rowenc.MakeSchemaName(w.rng.Intn(2) == 0, schemaName, "root") |
There was a problem hiding this comment.
It may be better to leave authorization plank until the workload supports multiple roles.
7edd1eb to
ff3a5dc
Compare
ajwerner
left a comment
There was a problem hiding this comment.
Reviewed 2 of 3 files at r1, 2 of 2 files at r2, 1 of 1 files at r3, 1 of 1 files at r4, 1 of 1 files at r5.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @jayshrivastava)
pkg/workload/schemachange/schemachange.go, line 1100 at r1 (raw file):
Previously, jayshrivastava (Jayant Shrivastava) wrote…
It may be better to leave authorization plank until the workload supports multiple roles.
I agree completely. We should just leave a TODO to support authorization. Also worth noting that it is valid to not provide a name when you provide an authorization as it implies the schema of the same name.
pkg/workload/schemachange/schemachange.go, line 1110 at r1 (raw file):
const q = ` SELECT * FROM [information_schema.schemata]
This query is suspect to me. Firstly, I don't think those where brackets make sense. Secondly, if you do SELECT * I think it's going to fail when you only try to scan a single column.
I think you want:
SELECT schema_name
FROM information_schema.schemata
WHERE schema_name LIKE 'schema%' OR schema_name = 'public'
ORDER BY random()
LIMIT 1;
I'm starting to think that we need to make it so that any errors coming from these schema introspection functions fail the workload completely. My read is that they currently do not.
pkg/workload/schemachange/schemachange.go, line 565 at r3 (raw file):
Previously, jayshrivastava (Jayant Shrivastava) wrote…
I'm not sure if specifying the best db is the best thing to do here. It might be better to leave it blank until the workload supports multiple dbs.
This seems fine. If you'd prefer to not use the database name you could do:
tree.MakeTableNameFromPrefix(tree.ObjectPrefix{
SchemaName: tree.Name(schemaName),
ExplicitSchema: true,
}, tree.Name(tableName))
Also nit: you don't need parents around the type name for the cast.
pkg/workload/schemachange/schemachange.go, line 584 at r3 (raw file):
return "", err } qualifiedTableName := fmt.Sprintf("%s.%s", schemaName, tableName)
On this whole commit I think it'd be better to have a helper here that can choose to omit public or something like that. I'd take it one step further and have randTable just return a string or a tree.TableName (which has a String() method)
The repetition in this commit isn't very valuable.
pkg/workload/schemachange/schemachange.go, line 1077 at r5 (raw file):
} func (w *schemaChangeWorker) randView(tx *pgx.Tx, pctExisting int) (string, string, error) {
same comment on the return types here, returning two strings makes the caller do a lot of work and makes the code harder to read as far as I can tell. what do you think?
ff3a5dc to
46f3e5f
Compare
46f3e5f to
d46a9dd
Compare
d46a9dd to
3bfe843
Compare
| func (w *schemaChangeWorker) randTable(tx *pgx.Tx, pctExisting int) (tree.TableName, error) { | ||
| if w.rng.Intn(100) >= pctExisting { | ||
| randSchema, err := w.randSchema(tx, pctExisting) | ||
| randSchema, err := w.randSchema(tx, 90) |
There was a problem hiding this comment.
pctExisting will be a low value when randTable is called with the intention of creating a table, so a non existing table is returned most of the time. Before I added schemas, this means that a create table operation would mostly succeed because most of the time a non existing table name is returned. However, when you pass pctExisting to randSchema, the schema will most likely not exist. This would cause the create table operation to fail most of the time. To restore the original behaviour of normally succeeding, I thought 90 would be a good value here.
The same thing applies to createEnum and createView.
jayshrivastava
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)
pkg/workload/schemachange/schemachange.go, line 584 at r3 (raw file):
Previously, ajwerner wrote…
On this whole commit I think it'd be better to have a helper here that can choose to omit public or something like that. I'd take it one step further and have
randTablejust return a string or atree.TableName(which has aString()method)The repetition in this commit isn't very valuable.
Ok. I Changed the return type.
pkg/workload/schemachange/schemachange.go, line 1077 at r5 (raw file):
Previously, ajwerner wrote…
same comment on the return types here, returning two strings makes the caller do a lot of work and makes the code harder to read as far as I can tell. what do you think?
Ok. It makes sense to me. I just put out these changes.
3bfe843 to
017c5a8
Compare
ajwerner
left a comment
There was a problem hiding this comment.
Seems like you need a rebase. This is getting close.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @jayshrivastava)
pkg/workload/schemachange/schemachange.go, line 1035 at r12 (raw file):
Previously, jayshrivastava (Jayant Shrivastava) wrote…
pctExistingwill be a low value whenrandTableis called with the intention of creating a table, so a non existing table is returned most of the time. Before I added schemas, this means that a create table operation would mostly succeed because most of the time a non existing table name is returned. However, when you passpctExistingtorandSchema, the schema will most likely not exist. This would cause the create table operation to fail most of the time. To restore the original behaviour of normally succeeding, I thought 90 would be a good value here.The same thing applies to createEnum and createView.
interesting, I feel like if we're choosing to not return an existing name we know that no matter what, what we return here will not exist. The question at hand is whether the schema also does not exist. I don't know how to be principled about what to do here. Picking a random constant feels worse than just propagating pctExisting, no?
pkg/workload/schemachange/schemachange.go, line 502 at r16 (raw file):
} def.Nullable.Nullability = tree.Nullability(rand.Intn(1 + int(tree.SilentNull))) return fmt.Sprintf(`ALTER TABLE "%s" ADD COLUMN %s`, tableName.String(), tree.Serialize(def)), nil
nit: the %s verb calls String()
pkg/workload/schemachange/schemachange.go, line 600 at r16 (raw file):
return fmt.Sprintf(`CREATE TABLE "%s" AS SELECT %s FROM "%s"`, tableName.String(), tree.Serialize(&names), destTableName.String()), nil
nit: the %s verb calls String()
pkg/workload/schemachange/schemachange.go, line 640 at r16 (raw file):
return "", err } return fmt.Sprintf(`ALTER TABLE "%s" DROP COLUMN "%s"`, tableName.String(), columnName), nil
nit: the %s verb calls String()
pkg/workload/schemachange/schemachange.go, line 652 at r16 (raw file):
return "", err } return fmt.Sprintf(`ALTER TABLE "%s" ALTER COLUMN "%s" DROP DEFAULT`, tableName.String(), columnName), nil
nit: the %s verb calls String()
pkg/workload/schemachange/schemachange.go, line 664 at r16 (raw file):
return "", err } return fmt.Sprintf(`ALTER TABLE "%s" ALTER COLUMN "%s" DROP NOT NULL`, tableName.String(), columnName), nil
nit: the %s verb calls String()
pkg/workload/schemachange/schemachange.go, line 747 at r16 (raw file):
return fmt.Sprintf(`ALTER TABLE "%s" RENAME COLUMN "%s" TO "%s"`, tableName.String(), srcColumnName, destColumnName), nil
nit: the %s verb calls String()
pkg/workload/schemachange/schemachange.go, line 767 at r16 (raw file):
return fmt.Sprintf(`ALTER TABLE "%s" RENAME CONSTRAINT "%s" TO "%s"`, tableName.String(), srcIndexName, destIndexName), nil
nit: the %s verb calls String()
pkg/workload/schemachange/schemachange.go, line 795 at r16 (raw file):
} return fmt.Sprintf(`ALTER TABLE "%s" RENAME TO "%s"`, srcTableName.String(), destTableName.String()), nil
nit: the %s verb calls String()
pkg/workload/schemachange/schemachange.go, line 809 at r16 (raw file):
} return fmt.Sprintf(`ALTER VIEW "%s" RENAME TO "%s"`, srcViewName.String(), destViewName.String()), nil
nit: the %s verb calls String()
pkg/workload/schemachange/schemachange.go, line 827 at r16 (raw file):
return "", err } return fmt.Sprintf(`ALTER TABLE "%s" ALTER COLUMN "%s" SET NOT NULL`, tableName.String(), columnName), nil
nit: the %s verb calls String()
pkg/workload/schemachange/schemachange.go, line 844 at r16 (raw file):
} return fmt.Sprintf(`ALTER TABLE "%s" ALTER COLUMN "%s" SET DATA TYPE %s`, tableName.String(), columnName, typ), nil
nit: the %s verb calls String()
pkg/workload/schemachange/schemachange.go, line 872 at r16 (raw file):
return fmt.Sprintf( `INSERT INTO "%s" (%s) VALUES %s`, tableName.String(),
nit: the %s verb calls String()
4b25259 to
7924846
Compare
jayshrivastava
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @jayshrivastava)
pkg/workload/schemachange/schemachange.go, line 1035 at r12 (raw file):
Previously, ajwerner wrote…
interesting, I feel like if we're choosing to not return an existing name we know that no matter what, what we return here will not exist. The question at hand is whether the schema also does not exist. I don't know how to be principled about what to do here. Picking a random constant feels worse than just propagating
pctExisting, no?
It is worth noting that we do use int literals in a few places, but that I plan to clean up all of them with my work related to error classification and control.
I think 100 - pctExisting might be the best. We mainly use pctExisting as a way to control errors. For example, we call w.randTable(tx, 10) in createTable. I feel that the intention is to have it fail roughly 10% of the time due to an existing table being given, otherwise succeed 90% of the time. My problem with propagatingpctExisting is that we will actually fail most of the time in that 90% case by calling w.randSchema(tx, 10)). The problem is we do not distinguish between if the error case is when a name does exist (ie. the table name) and when it doesn't (the schema name). There are a lot of examples of this in the workload (ex. create a column on a table - in the non error case, the column name does not exist, but the table name does). It makes sense to use two different pctExistings here to achieve some overall error rate.
Overall, I think the idea of pctExisting needs to be refactored into something called an errorRate. We should also configure it as a property in schemaChangeWorker rather than int literals. Then, we need to provide an abstraction that generates the pctExisting to use based on if the error case is having a name exist or not. I plan to accomplish this during my next project.
pkg/workload/schemachange/schemachange.go, line 600 at r16 (raw file):
Previously, ajwerner wrote…
nit: the
%sverb callsString()
It seems like this happens: CREATE TABLE {{table1647 { public %!s(bool=false) %!s(bool=true)}}} AS SELECT col676_6, col676_3, col676_1 FROM {{table675 { public %!s(bool=false) %!s(bool=true)}}} it also causes CI to fail due to linting.
ajwerner
left a comment
There was a problem hiding this comment.
still needs a rebase against master, no?
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @jayshrivastava)
pkg/workload/schemachange/schemachange.go, line 1035 at r12 (raw file):
Previously, jayshrivastava (Jayant Shrivastava) wrote…
It is worth noting that we do use int literals in a few places, but that I plan to clean up all of them with my work related to error classification and control.
I think
100 - pctExistingmight be the best. We mainly usepctExistingas a way to control errors. For example, we callw.randTable(tx, 10)increateTable. I feel that the intention is to have it fail roughly 10% of the time due to an existing table being given, otherwise succeed 90% of the time. My problem with propagatingpctExistingis that we will actually fail most of the time in that 90% case by callingw.randSchema(tx, 10)). The problem is we do not distinguish between if the error case is when a name does exist (ie. the table name) and when it doesn't (the schema name). There are a lot of examples of this in the workload (ex. create a column on a table - in the non error case, the column name does not exist, but the table name does). It makes sense to use two differentpctExistings here to achieve some overall error rate.Overall, I think the idea of
pctExistingneeds to be refactored into something called anerrorRate. We should also configure it as a property inschemaChangeWorkerrather than int literals. Then, we need to provide an abstraction that generates the pctExisting to use based on if the error case is having a name exist or not. I plan to accomplish this during my next project.
sgtm.
pkg/workload/schemachange/schemachange.go, line 600 at r16 (raw file):
Previously, jayshrivastava (Jayant Shrivastava) wrote…
It seems like this happens:
CREATE TABLE {{table1647 { public %!s(bool=false) %!s(bool=true)}}} AS SELECT col676_6, col676_3, col676_1 FROM {{table675 { public %!s(bool=false) %!s(bool=true)}}}it also causes CI to fail due to linting.
okay, what's going on here is that the string method is on the pointer. What if we changed the return type to be the pointer rather than the struct?
jayshrivastava
left a comment
There was a problem hiding this comment.
Yup. I'll rebase momentarily.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @jayshrivastava)
pkg/workload/schemachange/schemachange.go, line 600 at r16 (raw file):
Previously, ajwerner wrote…
okay, what's going on here is that the string method is on the pointer. What if we changed the return type to be the pointer rather than the struct?
Because you cannot take the address of a fcn in Go, we would have to change all the return statements in randTable and randView to look like this.
tableName := tree.MakeTableNameFromPrefix(tree.ObjectNamePrefix{
SchemaName: tree.Name(randSchema),
ExplicitSchema: true,
}, tree.Name(fmt.Sprintf("table%d", atomic.AddInt64(w.seqNum, 1)))), nil
return tableName
of alternatively, we can update the Sprintfs to take the address.
fmt.Sprintf(CREATE TABLE "%s" AS SELECT %s FROM %s, &destTableName, tree.Serialize(&names), &tableName)
Release note: None
Release note: None
jayshrivastava
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @jayshrivastava)
pkg/workload/schemachange/schemachange.go, line 600 at r16 (raw file):
Previously, jayshrivastava (Jayant Shrivastava) wrote…
Because you cannot take the address of a fcn in Go, we would have to change all the return statements in
randTableandrandViewto look like this.tableName := tree.MakeTableNameFromPrefix(tree.ObjectNamePrefix{ SchemaName: tree.Name(randSchema), ExplicitSchema: true, }, tree.Name(fmt.Sprintf("table%d", atomic.AddInt64(w.seqNum, 1)))), nil return tableNameof alternatively, we can update the
Sprintfs to take the address.
fmt.Sprintf(CREATE TABLE "%s" AS SELECT %s FROM %s, &destTableName, tree.Serialize(&names), &tableName)
I'll go with the first option. It'll keep the changes in 1 spot for each rand fcn.
Release note: None
Release note: None
Resolves: cockroachdb#54408 Release note: None
Updated the 'Example usage' comment at the top of the workload to use the correct flag names and types Release note: None
…ements This fixes a syntax error. Specifying `"public.table_name"` instead of `public.table_name` will not recognize `public` as the schema name. Release note: None
7924846 to
a8bcdd6
Compare
ajwerner
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @ajwerner and @jayshrivastava)
|
bors r+ |
|
Build succeeded: |
Adds support for user-defined schemas
Resolves: #54408
Release note: None