Skip to content

sql: add support for create index inside new schema changer#67266

Merged
craig[bot] merged 3 commits intocockroachdb:masterfrom
fqazi:createTableNew
Nov 9, 2021
Merged

sql: add support for create index inside new schema changer#67266
craig[bot] merged 3 commits intocockroachdb:masterfrom
fqazi:createTableNew

Conversation

@fqazi
Copy link
Copy Markdown
Collaborator

@fqazi fqazi commented Jul 6, 2021

Fixes: #67264

Previously, create index was not supported inside the
new schema changer. This was inadequate because to
help support transaction schema changes in the future
we need all schema changes inside the new schema changer.
To address this, this patch adds CREATE INDEX inside
the new schema changer.

Release note: None

@fqazi fqazi requested review from a team and nihalpednekar and removed request for a team July 6, 2021 13:50
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@fqazi fqazi force-pushed the createTableNew branch 3 times, most recently from 9590bae to 0bfb2d5 Compare July 6, 2021 14:41
@nihalpednekar
Copy link
Copy Markdown

LGTM from Bulk perspective

@fqazi fqazi force-pushed the createTableNew branch from 0bfb2d5 to 525f96b Compare July 6, 2021 17:37
@nihalpednekar
Copy link
Copy Markdown

@adityamaru brought to my attention that Bulk also owns some parts of */backfill.go so please wait until someone else from Bulk takes a look at this as well.

@nihalpednekar nihalpednekar requested a review from a team July 6, 2021 18:02
@fqazi fqazi force-pushed the createTableNew branch 6 times, most recently from c363764 to f1d4499 Compare July 29, 2021 23:51
@fqazi fqazi requested a review from a team as a code owner July 29, 2021 23:51
@fqazi fqazi force-pushed the createTableNew branch 2 times, most recently from 6caad0e to 85cd376 Compare July 30, 2021 13:51
Copy link
Copy Markdown
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm anxious about the ways that this is running into the pain of descriptors vs. elements.

Reviewed 19 of 33 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @fqazi)


pkg/sql/schemachanger/scbuild/index.go, line 67 at r5 (raw file):

//    mod(fnv32(colNames[0]::STRING)+fnv32(colNames[1])+...,buckets)
//
func makeHashShardComputeExpr(colNames []string, buckets int) *string {

I wonder if we can share this code?


pkg/sql/schemachanger/scbuild/index.go, line 191 at r5 (raw file):

func (b *buildContext) createIndex(ctx context.Context, n *tree.CreateIndex) {
	// Look up the table first

nit: period


pkg/sql/schemachanger/scbuild/index.go, line 193 at r5 (raw file):

	_, table, err := resolver.ResolveExistingTableObject(ctx, b.Res, &n.Table,
		tree.ObjectLookupFlagsWithRequired())

we ought to be able to plumb through a mutable flag here to not need to resolve it again. I'd like to know if that does not work.


pkg/sql/schemachanger/scbuild/index.go, line 197 at r5 (raw file):

		panic(err)
	}
	// Detect if the index already exists

nit: period


pkg/sql/schemachanger/scbuild/index.go, line 202 at r5 (raw file):

		if foundIndex.Dropped() {
			panic(pgerror.Newf(pgcode.ObjectNotInPrerequisiteState,
				"index %q being dropped, try again later", string(n.Name)))

nit: use the format/string method on the AST node for an index name.


pkg/sql/schemachanger/scbuild/index.go, line 211 at r5 (raw file):

	if table.IsView() && !table.MaterializedView() {
		panic(pgerror.Newf(pgcode.WrongObjectType, "%q is not a table or materialized view", table.GetName()))

&n.Table?


pkg/sql/schemachanger/scbuild/index.go, line 244 at r5 (raw file):

	}
	colNames := make([]string, 0, len(n.Columns))
	// Setup the column ID's

nit: period and no '.


pkg/sql/schemachanger/scbuild/index.go, line 259 at r5 (raw file):

Quoted 10 lines of code…
			_, typ, _, err := schemaexpr.DequalifyAndValidateExpr(
				ctx,
				table,
				columnNode.Expr,
				types.Any,
				"index expression",
				b.SemaCtx,
				tree.VolatilityImmutable,
				&n.Table,
			)

This is not going to work if the column was added in this transaction. At least we should detect that case by searching the elements and then returning a not supported error.


pkg/sql/schemachanger/scbuild/index.go, line 264 at r5 (raw file):

Quoted 4 lines of code…
			mutableDesc, err := b.Descs.GetMutableTableByID(ctx, b.EvalCtx.Txn, table.GetID(), tree.ObjectLookupFlagsWithRequired())
			if err != nil {
				panic(err)
			}

let's avoid this lookup.


pkg/sql/schemachanger/scbuild/index.go, line 269 at r5 (raw file):

			// Otherwise, create a new virtual column and add it to the table
			// descriptor.

Otherwise relative to what?


pkg/sql/schemachanger/scbuild/index.go, line 271 at r5 (raw file):

			// descriptor.
			colName := tabledesc.GenerateUniqueName("crdb_idx_expr", func(name string) bool {
				_, err := mutableDesc.FindColumnWithName(tree.Name(name))

Doesn't this need to think about newly added columns? Imagine creating more than one index or creating an index with more than one expression.


pkg/sql/schemachanger/scbuild/table.go, line 253 at r1 (raw file):

		FamilyName: familyName,
	})
	/*newPrimaryIdxID := */ b.addOrUpdatePrimaryIndexTargetsForAddColumn(table, colID, col.Name)

What is this about?


pkg/sql/lex/BUILD.bazel, line 7 at r1 (raw file):

    srcs = [
        "encode.go",
        "keywords.go",

I don't think you meant to add these.

@fqazi fqazi requested a review from a team as a code owner August 3, 2021 17:30
Copy link
Copy Markdown
Collaborator Author

@fqazi fqazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)


pkg/sql/schemachanger/scbuild/index.go, line 193 at r5 (raw file):

Previously, ajwerner wrote…
	_, table, err := resolver.ResolveExistingTableObject(ctx, b.Res, &n.Table,
		tree.ObjectLookupFlagsWithRequired())

we ought to be able to plumb through a mutable flag here to not need to resolve it again. I'd like to know if that does not work.

Done.


pkg/sql/schemachanger/scbuild/index.go, line 211 at r5 (raw file):

Previously, ajwerner wrote…

&n.Table?

Done.


pkg/sql/schemachanger/scbuild/index.go, line 264 at r5 (raw file):

Previously, ajwerner wrote…
			mutableDesc, err := b.Descs.GetMutableTableByID(ctx, b.EvalCtx.Txn, table.GetID(), tree.ObjectLookupFlagsWithRequired())
			if err != nil {
				panic(err)
			}

let's avoid this lookup.

Done.


pkg/sql/schemachanger/scbuild/index.go, line 269 at r5 (raw file):

create
Your right no otherwise, this will always be done


pkg/sql/schemachanger/scbuild/index.go, line 271 at r5 (raw file):

Previously, ajwerner wrote…

Doesn't this need to think about newly added columns? Imagine creating more than one index or creating an index with more than one expression.

Yes, we need to sort out those scenarios, as a stop-gap, I'm returning unsupported errors. The query stuff I think will allow us better interfaces for this stuff too.


pkg/sql/schemachanger/scbuild/table.go, line 253 at r1 (raw file):

Previously, ajwerner wrote…

What is this about?

We were incorrectly adding a secondaryIndex with the same ID in the unique case if I remember correctly. So, the newPrimaryIdxId parameter has no real use.


pkg/sql/lex/BUILD.bazel, line 7 at r1 (raw file):

Previously, ajwerner wrote…

I don't think you meant to add these.

Done.

@fqazi fqazi requested a review from ajwerner August 3, 2021 19:22
Comment on lines +236 to +238
// SortOps sorts the operations into order based on
// graph dependencies
func SortOps(graph *scgraph.Graph, ops []scop.Op) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find myself surprised by this approach. I would have thought that it'd be easier to sort the edges than the ops. It seems to me like any time you need to sort, you already have the edges in scope. The edges then have the nodes and will generally avoid any need to go back and find nodes from ops.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that we can create a full topological sort of nodes at the outset of planning with the below code. We can then use that to sort op edges on their To node. I think that'll do it. Am I missing something? Consider the below code to generate a topological sort order:

// rankNodes performs a topological sort of the nodes in the graph.
func rankNodes(g *Graph) map[*scpb.Node]int {
	backCycleExists := func(n *scpb.Node, de *DepEdge) bool {
		var foundBack bool
		_ = g.ForEachDepEdgeFrom(de.To(), func(maybeBack *DepEdge) error {
			foundBack = foundBack || maybeBack.To() == n
			return nil
		})
		return foundBack
	}
	l := list.New()
	marks := make(map[*scpb.Node]bool)
	var visit func(n *scpb.Node)
	visit = func(n *scpb.Node) {
		permanent, marked := marks[n]
		fmt.Println(marked, permanent, screl.NodeString(n))
		if marked && !permanent {
			panic("not a dag")
		}
		if marked && permanent {
			return
		}
		marks[n] = false
		g.ForEachDepEdgeFrom(n, func(de *DepEdge) error {
			// We want to eliminate cycles caused by swaps. In that
			// case, we want to pretend that there is no edge from the
			// add to the drop, and, in that way, the drop is ordered first.
			if n.Direction != scpb.Target_ADD || !backCycleExists(n, de) {
				visit(de.To())
			}
			return nil
		})
		if oe, ok := g.GetOpEdgeFrom(n); ok {
			visit(oe.To())
		}
		marks[n] = true
		l.PushFront(n)
	}

	_ = g.ForEachNode(func(n *scpb.Node) error {
		visit(n)
		return nil
	})
	rank := make(map[*scpb.Node]int, l.Len())
	for i, cur := 0, l.Front(); i < l.Len(); i++ {
		rank[cur.Value.(*scpb.Node)] = i
		cur = cur.Next()
	}
	return rank
}

Copy link
Copy Markdown
Collaborator Author

@fqazi fqazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)


pkg/sql/schemachanger/scplan/plan.go, line 238 at r19 (raw file):

Previously, ajwerner wrote…

I believe that we can create a full topological sort of nodes at the outset of planning with the below code. We can then use that to sort op edges on their To node. I think that'll do it. Am I missing something? Consider the below code to generate a topological sort order:

// rankNodes performs a topological sort of the nodes in the graph.
func rankNodes(g *Graph) map[*scpb.Node]int {
	backCycleExists := func(n *scpb.Node, de *DepEdge) bool {
		var foundBack bool
		_ = g.ForEachDepEdgeFrom(de.To(), func(maybeBack *DepEdge) error {
			foundBack = foundBack || maybeBack.To() == n
			return nil
		})
		return foundBack
	}
	l := list.New()
	marks := make(map[*scpb.Node]bool)
	var visit func(n *scpb.Node)
	visit = func(n *scpb.Node) {
		permanent, marked := marks[n]
		fmt.Println(marked, permanent, screl.NodeString(n))
		if marked && !permanent {
			panic("not a dag")
		}
		if marked && permanent {
			return
		}
		marks[n] = false
		g.ForEachDepEdgeFrom(n, func(de *DepEdge) error {
			// We want to eliminate cycles caused by swaps. In that
			// case, we want to pretend that there is no edge from the
			// add to the drop, and, in that way, the drop is ordered first.
			if n.Direction != scpb.Target_ADD || !backCycleExists(n, de) {
				visit(de.To())
			}
			return nil
		})
		if oe, ok := g.GetOpEdgeFrom(n); ok {
			visit(oe.To())
		}
		marks[n] = true
		l.PushFront(n)
	}

	_ = g.ForEachNode(func(n *scpb.Node) error {
		visit(n)
		return nil
	})
	rank := make(map[*scpb.Node]int, l.Len())
	for i, cur := 0, l.Front(); i < l.Len(); i++ {
		rank[cur.Value.(*scpb.Node)] = i
		cur = cur.Next()
	}
	return rank
}

Think I was way too focused on sorting the ops, so my approach was a bit heavy-handed. This algorithm looks good, and is much simpler, I'll replace the logic inside the graph.

@fqazi fqazi requested a review from ajwerner November 4, 2021 16:56
Copy link
Copy Markdown
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comments are mostly superficial and point towards tightening up the commentary.

table, familyName, d.Family.Create, d.Family.IfNotExists,
)
} else {
} else if !d.IsVirtual() { // FIXME: Compute columns should not have families?
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stored ones definitely need families. Virtual ones I would think can live without them but I don't know whether or not they do.

allowedNewColumnNames []tree.Name,
allowImplicitPartitioning bool,
) error {
return nil
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ideally we'd track this and verify it in the tests

// Name returns the name of the rule which generated this edge.
func (de *DepEdge) Name() string { return de.rule }

// GetNodeRanks fetches ranks of nodes in topological order
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: period

execCfg.Codec,
execCfg.Settings,
sql.MakeIndexValidator(execCfg.DB, execCfg.Codec, execCfg.InternalExecutor),
func(ctx context.Context, tableDesc *tabledesc.Mutable, indexDesc descpb.IndexDescriptor, partBy *tree.PartitionBy, allowedNewColumnNames []tree.Name, allowImplicitPartitioning bool) (newImplicitCols []catalog.Column, newPartitioning descpb.PartitioningDescriptor, err error) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrap this, or, perhaps better, write a function that has this signature.

How would you feel about creating an interface as opposed to this function type? IDEs make it easy to implement the interface.

TypeID descpb.ID
}

// FIXME: Check create index first
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add more to this?

RangePartitions []*scpb.RangePartitions
}

// NoOpInfo a no-op mutation operation.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain what this is for?

repeated uint32 dependentObjects = 2 [(gogoproto.customname) = "DependentObjects", (gogoproto.casttype) = "github.com/cockroachdb/cockroach/pkg/sql/catalog/descpb.ID"];
}

// FIXME: Dead code
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

isStageRevertible = revertible == 1
for i, ts := range cur {
for _, e := range edges {
// FIXME: Flip the loop
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you say more or do this?

}
// Sort ops based on graph dependencies.
sortOps(g, s.Ops.Slice())
//SortOps(g, s.Ops.Slice())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

detritus

return stages.String()
}

// TestPlanGraphSort sanity checks sorting of the graph.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems like this test can go in scgraph. I wonder if we can rework it to be easier to read and extend.

Copy link
Copy Markdown
Collaborator Author

@fqazi fqazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @fqazi)


pkg/sql/schemachanger/scjob/job.go, line 74 at r24 (raw file):

How would you feel about creating an interface as opposed to this function type? IDEs make it easy to implement the interface.


pkg/sql/schemachanger/scop/mutation.go, line 145 at r24 (raw file):

Previously, ajwerner wrote…

Can you add more to this?

Stale comment left over from when I started off.


pkg/sql/schemachanger/scop/mutation.go, line 267 at r24 (raw file):

Previously, ajwerner wrote…

Can you explain what this is for?

// NoOpInfo a no-op mutation operation, which is allows
// an emit function to conditionally generate operations.


pkg/sql/schemachanger/scpb/scpb.proto, line 222 at r24 (raw file):

Previously, ajwerner wrote…

?

Stale comment from development, I should have checked my diffs for these.

Done.


pkg/sql/schemachanger/scplan/plan.go, line 170 at r24 (raw file):

Previously, ajwerner wrote…

can you say more or do this?

The edge and cur loops were flipped, which was a reminder for me.

Done.


pkg/sql/schemachanger/scplan/plan.go, line 238 at r24 (raw file):

Previously, ajwerner wrote…

detritus

Done


pkg/sql/schemachanger/scplan/plan_test.go, line 197 at r24 (raw file):

Previously, ajwerner wrote…

it seems like this test can go in scgraph. I wonder if we can rework it to be easier to read and extend.

Reworked this to have more syntatic sugar and more readability.

Done.

Copy link
Copy Markdown
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: when it builds

Reviewed 1 of 58 files at r25, 1 of 33 files at r26, 4 of 46 files at r27.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner and @fqazi)


pkg/sql/schema_changer_ccl.go, line 50 at r27 (raw file):

		allowImplicitPartitioning)
}
func MakeCCLCallbacks(

linter is going to want a comment here


pkg/sql/schemachanger/scexec/dependencies.go, line 99 at r27 (raw file):

// CCLCallbacks provides an interface that implements CCL exclusive
// callbacks.
type CCLCallbacks interface {

nit: call this Partitioner? The fact that it's CCL isn't that important or meaningful.


pkg/sql/schemachanger/scplan/plan.go, line 236 at r27 (raw file):

			break
		}
		// Sort ops based on graph dependencies.

This comment is now stale.

@fqazi
Copy link
Copy Markdown
Collaborator Author

fqazi commented Nov 8, 2021

@ajwerner TFTR, thanks for the lightning faster review :)

@fqazi fqazi force-pushed the createTableNew branch 7 times, most recently from 12a69d0 to 56966da Compare November 8, 2021 21:50
@fqazi
Copy link
Copy Markdown
Collaborator Author

fqazi commented Nov 9, 2021

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Nov 9, 2021

Build failed (retrying...):

@ajwerner
Copy link
Copy Markdown
Contributor

ajwerner commented Nov 9, 2021

Merge skew with the interleaved removal.

bors r-

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Nov 9, 2021

Canceled.

fqazi added 3 commits November 9, 2021 00:36
Previously, the new schema changer attempted to use
a bubble sort which compared the nodes linked to ops
to generate an approximate order. Unfortunately, this
approach had its limitations when it came to elements
that had no edge linking them, which could lead to
incorrect order relative to dependency edges. To
address this, this patch moves the sorting to a simpler
topological sort which guarantees dependency order.

Release note: None
Fixes: cockroachdb#67264

Previously, create index was not supported inside the
new shcema changer. This was inadequate because, to
help support transaction schema changes in the future
we need all schema changes inside the new schema changer.
To address this, this patch adds CREATE INDEX inside
the new schema changer.

Release note: None
Previously, the unique index validation inside the
new schema changer was stubbed out. This was inadequate
because during index addition operations we were not
properly confirming the unique constraints were properly
implemented. To address this, this patch adds support in
the new schema changer for the valdiation logic.

Release note: None
@fqazi
Copy link
Copy Markdown
Collaborator Author

fqazi commented Nov 9, 2021

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Nov 9, 2021

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

sql: implement create index inside the new schema changer

5 participants