Skip to content

importccl: Added row limit option for importing bundle formats#56587

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
mokaixu:project10
Dec 1, 2020
Merged

importccl: Added row limit option for importing bundle formats#56587
craig[bot] merged 1 commit intocockroachdb:masterfrom
mokaixu:project10

Conversation

@mokaixu
Copy link
Copy Markdown
Contributor

@mokaixu mokaixu commented Nov 11, 2020

A row limit option allows users to specify a fixed number of rows
that they want to import. This functionality is only available for
CSV/DELIMITED/AVRO formats.

With this code change, users can limit the number of rows they want
to import for PGDUMP and MYSQLDUMP bundle data formats. That is, they can
specify: WITH row_limit="{$num}" to only import $num rows.

Release note (sql change): Added WITH row_limit="{$num}" option for importing
bundle formats to allow users to do a quick test run on an import of $num rows.
Ex. IMPORT ... WITH row_limit="3";

Resolves: #26493

@mokaixu mokaixu requested review from a team and adityamaru November 11, 2020 19:48
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@mokaixu mokaixu force-pushed the project10 branch 8 times, most recently from b3f16b4 to 34b690a Compare November 18, 2020 19:31
@mokaixu mokaixu force-pushed the project10 branch 3 times, most recently from e1f4e91 to ffa5da5 Compare November 19, 2020 21:26
@mokaixu mokaixu changed the title [WIP] importccl: Added row limit option for importing bundle formats importccl: Added row limit option for importing bundle formats Nov 19, 2020
Copy link
Copy Markdown
Contributor

@adityamaru adityamaru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 8 files at r1, 1 of 4 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru and @mokaixu)


pkg/ccl/importccl/import_stmt.go, line 538 at r1 (raw file):

			}
			format.Format = roachpb.IOFileFormat_Mysqldump

nit: can we undo this change


pkg/ccl/importccl/import_stmt.go, line 549 at r1 (raw file):

				format.MysqlDump.RowLimit = int64(rowLimit)
			}

nit: ditto


pkg/ccl/importccl/import_stmt_test.go, line 1598 at r2 (raw file):

Quoted 4 lines of code…
		res := sqlDB.QueryStr(t, "SELECT * FROM second")
		if actualRowCount := len(res); expectedRowLimit != actualRowCount {
			t.Fatalf("expected %d, got %d", expectedRowLimit, actualRowCount)
		}

I think you can do something like:

var numRows int
sqlDB.QueryRow("SELECT count(*) FROM second").Scan(&numRows)
require.Equal(t, expectedRowLimit, numRows)

Here and elsewhere in the tests.


pkg/roachpb/io-formats.proto, line 110 at r1 (raw file):

CSV file

I think you need to update the comment.


pkg/roachpb/io-formats.proto, line 117 at r1 (raw file):

  // Indicates the number of rows to import per CSV file.
  // Must be a non-zero positive number.

ditto as above

Copy link
Copy Markdown
Contributor Author

@mokaixu mokaixu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru and @mokaixu)


pkg/ccl/importccl/import_stmt.go, line 538 at r1 (raw file):

Previously, adityamaru (Aditya Maru) wrote…

nit: can we undo this change

like separate the mysqldump logic? If I undo this change, all the mysqldump imports with rowlimits fail


pkg/ccl/importccl/import_stmt_test.go, line 1598 at r2 (raw file):

Previously, adityamaru (Aditya Maru) wrote…
		res := sqlDB.QueryStr(t, "SELECT * FROM second")
		if actualRowCount := len(res); expectedRowLimit != actualRowCount {
			t.Fatalf("expected %d, got %d", expectedRowLimit, actualRowCount)
		}

I think you can do something like:

var numRows int
sqlDB.QueryRow("SELECT count(*) FROM second").Scan(&numRows)
require.Equal(t, expectedRowLimit, numRows)

Here and elsewhere in the tests.

done


pkg/roachpb/io-formats.proto, line 110 at r1 (raw file):

Previously, adityamaru (Aditya Maru) wrote…
CSV file

I think you need to update the comment.

oops! done changed to per table


pkg/roachpb/io-formats.proto, line 117 at r1 (raw file):

Previously, adityamaru (Aditya Maru) wrote…
  // Indicates the number of rows to import per CSV file.
  // Must be a non-zero positive number.

ditto as above

done, changed to per table

@mokaixu mokaixu force-pushed the project10 branch 3 times, most recently from 85b53a1 to e941479 Compare November 27, 2020 19:13
@adityamaru
Copy link
Copy Markdown
Contributor

This LGTM! Feel free to merge over the weekend once CI is green 🥳

@mokaixu mokaixu force-pushed the project10 branch 3 times, most recently from 77a3071 to 73db851 Compare November 29, 2020 23:22
Copy link
Copy Markdown
Contributor

@adityamaru adityamaru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 8 files at r1, 1 of 4 files at r2, 2 of 4 files at r3, 4 of 4 files at r4.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru and @mokaixu)

@mokaixu
Copy link
Copy Markdown
Contributor Author

mokaixu commented Nov 30, 2020

bors r=adityamaru

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Nov 30, 2020

Build failed:

A row limit option allows users to specify a fixed number of rows
that they want to import. This functionality is only available for
CSV/DELIMITED/AVRO formats.

With this code change, users can limit the number of rows they want
to import for PGDUMP and MYSQLDUMP bundle data formats. That is, they can
specify: `WITH row_limit="{$num}"` to only import $num rows.

Release note (sql change): Added WITH row_limit="{$num}" option for importing
bundle formats to allow users to do a quick test run on an import of $num rows.
Ex. IMPORT ... WITH row_limit="3";
@mokaixu
Copy link
Copy Markdown
Contributor Author

mokaixu commented Dec 1, 2020

bors r=adityamaru

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Dec 1, 2020

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

importccl: add dry-run mode to show what would be imported

3 participants