sql: add support for create index inside new schema changer#67266
sql: add support for create index inside new schema changer#67266craig[bot] merged 3 commits intocockroachdb:masterfrom
Conversation
9590bae to
0bfb2d5
Compare
|
LGTM from Bulk perspective |
|
@adityamaru brought to my attention that Bulk also owns some parts of |
c363764 to
f1d4499
Compare
6caad0e to
85cd376
Compare
ajwerner
left a comment
There was a problem hiding this comment.
I'm anxious about the ways that this is running into the pain of descriptors vs. elements.
Reviewed 19 of 33 files at r1.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @fqazi)
pkg/sql/schemachanger/scbuild/index.go, line 67 at r5 (raw file):
// mod(fnv32(colNames[0]::STRING)+fnv32(colNames[1])+...,buckets) // func makeHashShardComputeExpr(colNames []string, buckets int) *string {
I wonder if we can share this code?
pkg/sql/schemachanger/scbuild/index.go, line 191 at r5 (raw file):
func (b *buildContext) createIndex(ctx context.Context, n *tree.CreateIndex) { // Look up the table first
nit: period
pkg/sql/schemachanger/scbuild/index.go, line 193 at r5 (raw file):
_, table, err := resolver.ResolveExistingTableObject(ctx, b.Res, &n.Table, tree.ObjectLookupFlagsWithRequired())
we ought to be able to plumb through a mutable flag here to not need to resolve it again. I'd like to know if that does not work.
pkg/sql/schemachanger/scbuild/index.go, line 197 at r5 (raw file):
panic(err) } // Detect if the index already exists
nit: period
pkg/sql/schemachanger/scbuild/index.go, line 202 at r5 (raw file):
if foundIndex.Dropped() { panic(pgerror.Newf(pgcode.ObjectNotInPrerequisiteState, "index %q being dropped, try again later", string(n.Name)))
nit: use the format/string method on the AST node for an index name.
pkg/sql/schemachanger/scbuild/index.go, line 211 at r5 (raw file):
if table.IsView() && !table.MaterializedView() { panic(pgerror.Newf(pgcode.WrongObjectType, "%q is not a table or materialized view", table.GetName()))
&n.Table?
pkg/sql/schemachanger/scbuild/index.go, line 244 at r5 (raw file):
} colNames := make([]string, 0, len(n.Columns)) // Setup the column ID's
nit: period and no '.
pkg/sql/schemachanger/scbuild/index.go, line 259 at r5 (raw file):
Quoted 10 lines of code…
_, typ, _, err := schemaexpr.DequalifyAndValidateExpr( ctx, table, columnNode.Expr, types.Any, "index expression", b.SemaCtx, tree.VolatilityImmutable, &n.Table, )
This is not going to work if the column was added in this transaction. At least we should detect that case by searching the elements and then returning a not supported error.
pkg/sql/schemachanger/scbuild/index.go, line 264 at r5 (raw file):
Quoted 4 lines of code…
mutableDesc, err := b.Descs.GetMutableTableByID(ctx, b.EvalCtx.Txn, table.GetID(), tree.ObjectLookupFlagsWithRequired()) if err != nil { panic(err) }
let's avoid this lookup.
pkg/sql/schemachanger/scbuild/index.go, line 269 at r5 (raw file):
// Otherwise, create a new virtual column and add it to the table // descriptor.
Otherwise relative to what?
pkg/sql/schemachanger/scbuild/index.go, line 271 at r5 (raw file):
// descriptor. colName := tabledesc.GenerateUniqueName("crdb_idx_expr", func(name string) bool { _, err := mutableDesc.FindColumnWithName(tree.Name(name))
Doesn't this need to think about newly added columns? Imagine creating more than one index or creating an index with more than one expression.
pkg/sql/schemachanger/scbuild/table.go, line 253 at r1 (raw file):
FamilyName: familyName, }) /*newPrimaryIdxID := */ b.addOrUpdatePrimaryIndexTargetsForAddColumn(table, colID, col.Name)
What is this about?
pkg/sql/lex/BUILD.bazel, line 7 at r1 (raw file):
srcs = [ "encode.go", "keywords.go",
I don't think you meant to add these.
fqazi
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)
pkg/sql/schemachanger/scbuild/index.go, line 193 at r5 (raw file):
Previously, ajwerner wrote…
_, table, err := resolver.ResolveExistingTableObject(ctx, b.Res, &n.Table, tree.ObjectLookupFlagsWithRequired())we ought to be able to plumb through a mutable flag here to not need to resolve it again. I'd like to know if that does not work.
Done.
pkg/sql/schemachanger/scbuild/index.go, line 211 at r5 (raw file):
Previously, ajwerner wrote…
&n.Table?
Done.
pkg/sql/schemachanger/scbuild/index.go, line 264 at r5 (raw file):
Previously, ajwerner wrote…
mutableDesc, err := b.Descs.GetMutableTableByID(ctx, b.EvalCtx.Txn, table.GetID(), tree.ObjectLookupFlagsWithRequired()) if err != nil { panic(err) }let's avoid this lookup.
Done.
pkg/sql/schemachanger/scbuild/index.go, line 269 at r5 (raw file):
create
Your right no otherwise, this will always be done
pkg/sql/schemachanger/scbuild/index.go, line 271 at r5 (raw file):
Previously, ajwerner wrote…
Doesn't this need to think about newly added columns? Imagine creating more than one index or creating an index with more than one expression.
Yes, we need to sort out those scenarios, as a stop-gap, I'm returning unsupported errors. The query stuff I think will allow us better interfaces for this stuff too.
pkg/sql/schemachanger/scbuild/table.go, line 253 at r1 (raw file):
Previously, ajwerner wrote…
What is this about?
We were incorrectly adding a secondaryIndex with the same ID in the unique case if I remember correctly. So, the newPrimaryIdxId parameter has no real use.
pkg/sql/lex/BUILD.bazel, line 7 at r1 (raw file):
Previously, ajwerner wrote…
I don't think you meant to add these.
Done.
pkg/sql/schemachanger/scplan/plan.go
Outdated
| // SortOps sorts the operations into order based on | ||
| // graph dependencies | ||
| func SortOps(graph *scgraph.Graph, ops []scop.Op) { |
There was a problem hiding this comment.
I find myself surprised by this approach. I would have thought that it'd be easier to sort the edges than the ops. It seems to me like any time you need to sort, you already have the edges in scope. The edges then have the nodes and will generally avoid any need to go back and find nodes from ops.
There was a problem hiding this comment.
I believe that we can create a full topological sort of nodes at the outset of planning with the below code. We can then use that to sort op edges on their To node. I think that'll do it. Am I missing something? Consider the below code to generate a topological sort order:
// rankNodes performs a topological sort of the nodes in the graph.
func rankNodes(g *Graph) map[*scpb.Node]int {
backCycleExists := func(n *scpb.Node, de *DepEdge) bool {
var foundBack bool
_ = g.ForEachDepEdgeFrom(de.To(), func(maybeBack *DepEdge) error {
foundBack = foundBack || maybeBack.To() == n
return nil
})
return foundBack
}
l := list.New()
marks := make(map[*scpb.Node]bool)
var visit func(n *scpb.Node)
visit = func(n *scpb.Node) {
permanent, marked := marks[n]
fmt.Println(marked, permanent, screl.NodeString(n))
if marked && !permanent {
panic("not a dag")
}
if marked && permanent {
return
}
marks[n] = false
g.ForEachDepEdgeFrom(n, func(de *DepEdge) error {
// We want to eliminate cycles caused by swaps. In that
// case, we want to pretend that there is no edge from the
// add to the drop, and, in that way, the drop is ordered first.
if n.Direction != scpb.Target_ADD || !backCycleExists(n, de) {
visit(de.To())
}
return nil
})
if oe, ok := g.GetOpEdgeFrom(n); ok {
visit(oe.To())
}
marks[n] = true
l.PushFront(n)
}
_ = g.ForEachNode(func(n *scpb.Node) error {
visit(n)
return nil
})
rank := make(map[*scpb.Node]int, l.Len())
for i, cur := 0, l.Front(); i < l.Len(); i++ {
rank[cur.Value.(*scpb.Node)] = i
cur = cur.Next()
}
return rank
}
fqazi
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)
pkg/sql/schemachanger/scplan/plan.go, line 238 at r19 (raw file):
Previously, ajwerner wrote…
I believe that we can create a full topological sort of nodes at the outset of planning with the below code. We can then use that to sort op edges on their
Tonode. I think that'll do it. Am I missing something? Consider the below code to generate a topological sort order:// rankNodes performs a topological sort of the nodes in the graph. func rankNodes(g *Graph) map[*scpb.Node]int { backCycleExists := func(n *scpb.Node, de *DepEdge) bool { var foundBack bool _ = g.ForEachDepEdgeFrom(de.To(), func(maybeBack *DepEdge) error { foundBack = foundBack || maybeBack.To() == n return nil }) return foundBack } l := list.New() marks := make(map[*scpb.Node]bool) var visit func(n *scpb.Node) visit = func(n *scpb.Node) { permanent, marked := marks[n] fmt.Println(marked, permanent, screl.NodeString(n)) if marked && !permanent { panic("not a dag") } if marked && permanent { return } marks[n] = false g.ForEachDepEdgeFrom(n, func(de *DepEdge) error { // We want to eliminate cycles caused by swaps. In that // case, we want to pretend that there is no edge from the // add to the drop, and, in that way, the drop is ordered first. if n.Direction != scpb.Target_ADD || !backCycleExists(n, de) { visit(de.To()) } return nil }) if oe, ok := g.GetOpEdgeFrom(n); ok { visit(oe.To()) } marks[n] = true l.PushFront(n) } _ = g.ForEachNode(func(n *scpb.Node) error { visit(n) return nil }) rank := make(map[*scpb.Node]int, l.Len()) for i, cur := 0, l.Front(); i < l.Len(); i++ { rank[cur.Value.(*scpb.Node)] = i cur = cur.Next() } return rank }
Think I was way too focused on sorting the ops, so my approach was a bit heavy-handed. This algorithm looks good, and is much simpler, I'll replace the logic inside the graph.
ajwerner
left a comment
There was a problem hiding this comment.
My comments are mostly superficial and point towards tightening up the commentary.
| table, familyName, d.Family.Create, d.Family.IfNotExists, | ||
| ) | ||
| } else { | ||
| } else if !d.IsVirtual() { // FIXME: Compute columns should not have families? |
There was a problem hiding this comment.
Stored ones definitely need families. Virtual ones I would think can live without them but I don't know whether or not they do.
| allowedNewColumnNames []tree.Name, | ||
| allowImplicitPartitioning bool, | ||
| ) error { | ||
| return nil |
There was a problem hiding this comment.
ideally we'd track this and verify it in the tests
| // Name returns the name of the rule which generated this edge. | ||
| func (de *DepEdge) Name() string { return de.rule } | ||
|
|
||
| // GetNodeRanks fetches ranks of nodes in topological order |
pkg/sql/schemachanger/scjob/job.go
Outdated
| execCfg.Codec, | ||
| execCfg.Settings, | ||
| sql.MakeIndexValidator(execCfg.DB, execCfg.Codec, execCfg.InternalExecutor), | ||
| func(ctx context.Context, tableDesc *tabledesc.Mutable, indexDesc descpb.IndexDescriptor, partBy *tree.PartitionBy, allowedNewColumnNames []tree.Name, allowImplicitPartitioning bool) (newImplicitCols []catalog.Column, newPartitioning descpb.PartitioningDescriptor, err error) { |
There was a problem hiding this comment.
wrap this, or, perhaps better, write a function that has this signature.
How would you feel about creating an interface as opposed to this function type? IDEs make it easy to implement the interface.
| TypeID descpb.ID | ||
| } | ||
|
|
||
| // FIXME: Check create index first |
| RangePartitions []*scpb.RangePartitions | ||
| } | ||
|
|
||
| // NoOpInfo a no-op mutation operation. |
There was a problem hiding this comment.
Can you explain what this is for?
| repeated uint32 dependentObjects = 2 [(gogoproto.customname) = "DependentObjects", (gogoproto.casttype) = "github.com/cockroachdb/cockroach/pkg/sql/catalog/descpb.ID"]; | ||
| } | ||
|
|
||
| // FIXME: Dead code |
pkg/sql/schemachanger/scplan/plan.go
Outdated
| isStageRevertible = revertible == 1 | ||
| for i, ts := range cur { | ||
| for _, e := range edges { | ||
| // FIXME: Flip the loop |
There was a problem hiding this comment.
can you say more or do this?
pkg/sql/schemachanger/scplan/plan.go
Outdated
| } | ||
| // Sort ops based on graph dependencies. | ||
| sortOps(g, s.Ops.Slice()) | ||
| //SortOps(g, s.Ops.Slice()) |
| return stages.String() | ||
| } | ||
|
|
||
| // TestPlanGraphSort sanity checks sorting of the graph. |
There was a problem hiding this comment.
it seems like this test can go in scgraph. I wonder if we can rework it to be easier to read and extend.
fqazi
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @fqazi)
pkg/sql/schemachanger/scjob/job.go, line 74 at r24 (raw file):
How would you feel about creating an interface as opposed to this function type? IDEs make it easy to implement the interface.
pkg/sql/schemachanger/scop/mutation.go, line 145 at r24 (raw file):
Previously, ajwerner wrote…
Can you add more to this?
Stale comment left over from when I started off.
pkg/sql/schemachanger/scop/mutation.go, line 267 at r24 (raw file):
Previously, ajwerner wrote…
Can you explain what this is for?
// NoOpInfo a no-op mutation operation, which is allows
// an emit function to conditionally generate operations.
pkg/sql/schemachanger/scpb/scpb.proto, line 222 at r24 (raw file):
Previously, ajwerner wrote…
?
Stale comment from development, I should have checked my diffs for these.
Done.
pkg/sql/schemachanger/scplan/plan.go, line 170 at r24 (raw file):
Previously, ajwerner wrote…
can you say more or do this?
The edge and cur loops were flipped, which was a reminder for me.
Done.
pkg/sql/schemachanger/scplan/plan.go, line 238 at r24 (raw file):
Previously, ajwerner wrote…
detritus
Done
pkg/sql/schemachanger/scplan/plan_test.go, line 197 at r24 (raw file):
Previously, ajwerner wrote…
it seems like this test can go in
scgraph. I wonder if we can rework it to be easier to read and extend.
Reworked this to have more syntatic sugar and more readability.
Done.
ajwerner
left a comment
There was a problem hiding this comment.
Reviewed 1 of 58 files at r25, 1 of 33 files at r26, 4 of 46 files at r27.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @ajwerner and @fqazi)
pkg/sql/schema_changer_ccl.go, line 50 at r27 (raw file):
allowImplicitPartitioning) } func MakeCCLCallbacks(
linter is going to want a comment here
pkg/sql/schemachanger/scexec/dependencies.go, line 99 at r27 (raw file):
// CCLCallbacks provides an interface that implements CCL exclusive // callbacks. type CCLCallbacks interface {
nit: call this Partitioner? The fact that it's CCL isn't that important or meaningful.
pkg/sql/schemachanger/scplan/plan.go, line 236 at r27 (raw file):
break } // Sort ops based on graph dependencies.
This comment is now stale.
|
@ajwerner TFTR, thanks for the lightning faster review :) |
12a69d0 to
56966da
Compare
|
bors r+ |
|
Build failed (retrying...): |
|
Merge skew with the interleaved removal. bors r- |
|
Canceled. |
Previously, the new schema changer attempted to use a bubble sort which compared the nodes linked to ops to generate an approximate order. Unfortunately, this approach had its limitations when it came to elements that had no edge linking them, which could lead to incorrect order relative to dependency edges. To address this, this patch moves the sorting to a simpler topological sort which guarantees dependency order. Release note: None
Fixes: cockroachdb#67264 Previously, create index was not supported inside the new shcema changer. This was inadequate because, to help support transaction schema changes in the future we need all schema changes inside the new schema changer. To address this, this patch adds CREATE INDEX inside the new schema changer. Release note: None
Previously, the unique index validation inside the new schema changer was stubbed out. This was inadequate because during index addition operations we were not properly confirming the unique constraints were properly implemented. To address this, this patch adds support in the new schema changer for the valdiation logic. Release note: None
|
bors r+ |
|
Build succeeded: |
Fixes: #67264
Previously, create index was not supported inside the
new schema changer. This was inadequate because to
help support transaction schema changes in the future
we need all schema changes inside the new schema changer.
To address this, this patch adds CREATE INDEX inside
the new schema changer.
Release note: None