Skip to content

importccl: enable import into tables with user defined types#50002

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
rohany:importer-test
Jun 10, 2020
Merged

importccl: enable import into tables with user defined types#50002
craig[bot] merged 1 commit intocockroachdb:masterfrom
rohany:importer-test

Conversation

@rohany
Copy link
Copy Markdown
Contributor

@rohany rohany commented Jun 9, 2020

Fixes #49971.

This PR adds the necessary type metadata hydration to the import
statement and import workers in order to import into tables that contain
user defined types.

Release note: None

@rohany rohany requested review from a team, dt and miretskiy June 9, 2020 04:15
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

Copy link
Copy Markdown
Contributor

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 15 of 15 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt and @rohany)


pkg/ccl/importccl/import_processor.go, line 127 at r1 (raw file):

				// Create a new transaction to hydrate the types in.
				if err := evalCtx.DB.Txn(ctx, func(ctx context.Context, txn *kv.Txn) error {
					evalCtx.Txn = txn

Could evalCtx already have a Txn set? What happens when we override it?

just not sure how this code supposed to work (nor, why for example HydrateTypeSlice does not
take txn argument explicitly, though, looking at that method, I don't see a reason why we shouldn't pass txn directly


pkg/ccl/importccl/import_stmt_test.go, line 1100 at r1 (raw file):

		require.NoError(t, err)
		// Run the import statement.
		if test.into {

why not just add another loop
for into := range {true,false}?
You could cut your test cases in 1/2

Copy link
Copy Markdown
Contributor Author

@rohany rohany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt and @miretskiy)


pkg/ccl/importccl/import_processor.go, line 127 at r1 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

Could evalCtx already have a Txn set? What happens when we override it?

just not sure how this code supposed to work (nor, why for example HydrateTypeSlice does not
take txn argument explicitly, though, looking at that method, I don't see a reason why we shouldn't pass txn directly

I don't think it can in this case, but it's reasonable to check and reuse it if so. We use an EvalContext in that function because it will eventually be updated to look into the LeaseManager for leased TypeDescriptors, and fall back to the txn if a descriptor isn't leased.


pkg/ccl/importccl/import_stmt_test.go, line 1100 at r1 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

why not just add another loop
for into := range {true,false}?
You could cut your test cases in 1/2

True -- i was originally thinking that something more would need to be done for the import into statements, but that doesn't seem to be the case. I'll update it.

verifyQuery: "SELECT * FROM t ORDER BY a",
expected: [][]string{{"hello", "hello"}, {"hi", "hi"}},
},
// TODO (rohany): What's the best way to test out some AVRO imports?
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any suggestions here?

@rohany
Copy link
Copy Markdown
Contributor Author

rohany commented Jun 9, 2020

I was looking at some backup stuff today, and I see that some of this descriptor ID rewriting logic is also used in import -- do you think that needs to be touched here?

Copy link
Copy Markdown
Contributor

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I honestly don't know. @pbardea ?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt, @miretskiy, and @rohany)


pkg/ccl/importccl/import_stmt_test.go, line 1100 at r1 (raw file):

Previously, rohany (Rohan Yadav) wrote…

True -- i was originally thinking that something more would need to be done for the import into statements, but that doesn't seem to be the case. I'll update it.

Good question. I think we almost certainly need to add an issue to add support for Avro.

We might have to update our avro parser (read_import_avro), but I'm not sure on that..
The change you're making is focused making changes to makeInputConverter -- the same code we use in avro.

I think you can try doing few things:

  1. If you want to generate test data on the fly,like you do in your test,
    you might want to take a look at read_import_avro_test. There is a generator code in there that
    can generate and test avro data. I suggest genOcfData which will probably be the easiest thing for you to use (you might want to move this test code to testutils_test.go)
  2. You can generate external (testdata) avro data. testdata/avro directory has readme file
    describing how to generate avro data.

@miretskiy miretskiy requested review from miretskiy June 9, 2020 19:11
Copy link
Copy Markdown
Contributor

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt, @miretskiy, and @rohany)


pkg/ccl/importccl/import_stmt_test.go, line 1090 at r1 (raw file):

Good question. I think we almost certainly need to add an issue to add support for Avro.
We might have to update our avro parser (read_import_avro), but I'm not sure on that..
The change you're making is focused making changes to makeInputConverter -- the same code we use in avro.
I think you can try doing few things:

If you want to generate test data on the fly,like you do in your test,
you might want to take a look at read_import_avro_test. There is a generator code in there that
can generate and test avro data. I suggest genOcfData which will probably be the easiest thing for you to use (you might want to move this test code to testutils_test.go)
You can generate external (testdata) avro data. testdata/avro directory has readme file
describing how to generate avro data.

Copy link
Copy Markdown
Contributor Author

@rohany rohany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt, @miretskiy, and @rohany)


pkg/ccl/importccl/import_processor.go, line 127 at r1 (raw file):

Previously, rohany (Rohan Yadav) wrote…

I don't think it can in this case, but it's reasonable to check and reuse it if so. We use an EvalContext in that function because it will eventually be updated to look into the LeaseManager for leased TypeDescriptors, and fall back to the txn if a descriptor isn't leased.

Done


pkg/ccl/importccl/import_stmt_test.go, line 1100 at r1 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

Good question. I think we almost certainly need to add an issue to add support for Avro.

We might have to update our avro parser (read_import_avro), but I'm not sure on that..
The change you're making is focused making changes to makeInputConverter -- the same code we use in avro.

I think you can try doing few things:

  1. If you want to generate test data on the fly,like you do in your test,
    you might want to take a look at read_import_avro_test. There is a generator code in there that
    can generate and test avro data. I suggest genOcfData which will probably be the easiest thing for you to use (you might want to move this test code to testutils_test.go)
  2. You can generate external (testdata) avro data. testdata/avro directory has readme file
    describing how to generate avro data.

Removed the extra tests. Thanks for the avro tip, I'll look into that now.

Copy link
Copy Markdown
Contributor Author

@rohany rohany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt, @miretskiy, and @rohany)


pkg/ccl/importccl/import_stmt_test.go, line 1040 at r2 (raw file):

		// Test CSV imports.
		{
			create:      "a greeting, b greeting",

TODO (me): Test import with some default column values that are enums as well.

Copy link
Copy Markdown
Contributor

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 2 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt, @miretskiy, and @rohany)

@miretskiy miretskiy self-requested a review June 9, 2020 20:44
Copy link
Copy Markdown
Contributor Author

@rohany rohany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt, @miretskiy, and @rohany)


pkg/ccl/importccl/import_stmt_test.go, line 1040 at r2 (raw file):

Previously, rohany (Rohan Yadav) wrote…

TODO (me): Test import with some default column values that are enums as well.

Never mind, I don't think you can do this.

Copy link
Copy Markdown
Contributor Author

@rohany rohany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt and @miretskiy)


pkg/ccl/importccl/import_stmt_test.go, line 1090 at r1 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

Good question. I think we almost certainly need to add an issue to add support for Avro.
We might have to update our avro parser (read_import_avro), but I'm not sure on that..
The change you're making is focused making changes to makeInputConverter -- the same code we use in avro.
I think you can try doing few things:

If you want to generate test data on the fly,like you do in your test,
you might want to take a look at read_import_avro_test. There is a generator code in there that
can generate and test avro data. I suggest genOcfData which will probably be the easiest thing for you to use (you might want to move this test code to testutils_test.go)
You can generate external (testdata) avro data. testdata/avro directory has readme file
describing how to generate avro data.

Done, added some avro data to the test.

Copy link
Copy Markdown
Contributor

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt and @rohany)


pkg/ccl/importccl/import_stmt_test.go, line 1057 at r3 (raw file):

		require.NoError(t, err)
		// Create an AVRO writer from the schema.
		ocr, err := goavro.NewOCFWriter(goavro.OCFConfig{

s/ocr/ocf/?

Fixes cockroachdb#49971.

This PR adds the necessary type metadata hydration to the import
statement and import workers in order to import into tables that contain
user defined types.

Release note: None
@rohany
Copy link
Copy Markdown
Contributor Author

rohany commented Jun 10, 2020

TFTR!

I was looking at some backup stuff today, and I see that some of this descriptor ID rewriting logic is also used in import -- do you think that needs to be touched here?

I looked into this more, and I don't think that it is necessary. I'll take another pass while working on backup.

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jun 10, 2020

Build succeeded

@craig craig bot merged commit 7d252ca into cockroachdb:master Jun 10, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

sql: investigate interaction between user defined types and different IMPORT/EXPORT varieties

3 participants