Skip to content

cli: support more user friendly userfile upload semantics#51353

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
adityamaru:add-default-uribase
Jul 22, 2020
Merged

cli: support more user friendly userfile upload semantics#51353
craig[bot] merged 1 commit intocockroachdb:masterfrom
adityamaru:add-default-uribase

Conversation

@adityamaru
Copy link
Copy Markdown
Contributor

@adityamaru adityamaru commented Jul 12, 2020

Previously, the user would have to specify both the source and
detination args when using userfile upload. This change tweaks the
semantics to be as follows:

  • Users must specify a source path.

  • If the user does not specify a destination URI/path, we use the
    default URI scheme and host, and the basename from the source arg as
    the path.

  • If the destination is a well-formed userfile URI of the form
    userfile://db.schema.tablename_prefix/path/to/file, then we use that
    as the final URI.

  • If destination is not a well-formed userfile URI, we use the default
    userfile URI schema and host, and the destination as the path.

File paths are never cleaned during upload or interaction with the
UserFileStorage. Thus, the filename is the exact string represented by
the path of the constructed userfile URI. This is in keeping with the
semantics of s3 and gcsutil. To prevent user surprises we still reject
constructs where the URI post normalization != URI pre normalization.

egs: test/./test.csv or test/../../test.csv

Informs: #51222

Release note (cli change): Improves the user semantics for userfile upload by supporting different patterns of specifying the source and
destination CLI args. The source arg is required, while the destination
arg is now optional.

@adityamaru adityamaru requested review from dt and miretskiy July 12, 2020 19:23
@adityamaru adityamaru requested a review from a team as a code owner July 12, 2020 19:23
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@adityamaru
Copy link
Copy Markdown
Contributor Author

Commit depends on #50981, I will rebase once merged.

@adityamaru adityamaru force-pushed the add-default-uribase branch from de81a03 to 925ff78 Compare July 13, 2020 13:45
Copy link
Copy Markdown
Contributor

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @adityamaru, @dt, and @miretskiy)


pkg/cli/userfiletable_test.go, line 41 at r3 (raw file):

	// User passes no destination so we use the basename of the source as the
	// filename.
	c.Run(fmt.Sprintf("userfile upload %s", file))

any way to check if the upload path is what you expect it to be?

@miretskiy miretskiy self-requested a review July 13, 2020 13:48
pkg/cli/flags.go Outdated
clientCmds = append(clientCmds, nodeLocalCmds...)
clientCmds = append(clientCmds, userFileCmds...)
clientCmds = append(clientCmds, stmtDiagCmds...)
clientCmds = append(clientCmds, userFileCmds...)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think you somehow doubled this line in a rebase since you also have an extra commit

@adityamaru adityamaru force-pushed the add-default-uribase branch from 925ff78 to 0b262cb Compare July 14, 2020 02:38
Copy link
Copy Markdown
Contributor Author

@adityamaru adityamaru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @dt and @miretskiy)


pkg/cli/flags.go, line 544 at r3 (raw file):

Previously, dt (David Taylor) wrote…

i think you somehow doubled this line in a rebase since you also have an extra commit

Done.


pkg/cli/userfiletable_test.go, line 41 at r3 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

any way to check if the upload path is what you expect it to be?

I'm not sure I understand, the comments at the end of this test are parsed as the expected outputs.

Copy link
Copy Markdown
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aaron-crl I'd like to put this in your radar as well. Ideally, you'd come up with some guidelines / evaluation criteria to help Aditya assert on his own whether the path handling is sane.

Reviewed 3 of 3 files at r3, 1 of 1 files at r4.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @adityamaru, @dt, and @miretskiy)


pkg/cli/userfile.go, line 91 at r4 (raw file):

		userFileURL := url.URL{
			Scheme: defaultUserfileScheme,
			Host:   defaultQualifiedNamePrefix + user,

see my comment about the username in the other PR.

@adityamaru adityamaru force-pushed the add-default-uribase branch 2 times, most recently from 35c3d9d to 1990865 Compare July 16, 2020 17:57
Copy link
Copy Markdown
Contributor Author

@adityamaru adityamaru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @dt, @knz, and @miretskiy)


pkg/cli/userfile.go, line 91 at r4 (raw file):

Previously, knz (kena) wrote…

see my comment about the username in the other PR.

addressed in the previous PR by unescaping the URI before we use it. This allows for arbitrary Unicode characters.

@adityamaru
Copy link
Copy Markdown
Contributor Author

RFAL, please

Copy link
Copy Markdown
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 8 of 8 files at r5.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @adityamaru, @dt, @knz, and @miretskiy)


pkg/storage/cloudimpl/file_table_storage.go, line 118 at r5 (raw file):

	if path.Clean(basename) != basename {
		return "", errors.Newf("basename %s changes to %s on normalization. "+

I'm OK with this test but I'd like a unit test that demonstrates the error: under which circumstances would client code encounter this?

@adityamaru adityamaru force-pushed the add-default-uribase branch from 1990865 to def15e3 Compare July 17, 2020 13:46
Copy link
Copy Markdown
Contributor Author

@adityamaru adityamaru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @dt, @knz, and @miretskiy)


pkg/storage/cloudimpl/file_table_storage.go, line 118 at r5 (raw file):

Previously, knz (kena) wrote…

I'm OK with this test but I'd like a unit test that demonstrates the error: under which circumstances would client code encounter this?

CLI related code should not encounter this because all the FileTableStorage methods are passed an empty basename, as a result of which all operations occur on the prefix.
BACKUP uses the basename when writing data, manifest files, and encryption information. I have added a specialized test to file_table_storage_test.go to ensure we're rejecting what we think we are. external_storage_test.go has tests in testExportStore which test valid basename constructs.

@adityamaru adityamaru force-pushed the add-default-uribase branch 2 times, most recently from 64cd0bf to 0e90fe8 Compare July 17, 2020 15:12
@adityamaru adityamaru requested a review from knz July 20, 2020 18:05
Copy link
Copy Markdown
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r6.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @adityamaru, @dt, @knz, and @miretskiy)


pkg/cli/userfiletable_test.go, line 159 at r6 (raw file):

	}

	require.NoError(t, os.RemoveAll(dir))

Can you double-check what the cleanFn returned by TempDir does? It seems to me that you do not need this RemoveAll here.


pkg/storage/cloudimpl/file_table_storage.go, line 118 at r5 (raw file):

Previously, adityamaru (Aditya Maru) wrote…

Client code should not encounter this because all the FileTableStorage methods are passed an empty basename, as a result of which all operations occur on the prefix.
BACKUP uses the basename when writing data, manifest files, and encryption information. I have added a specialized test to file_table_storage_test.go to ensure we're rejecting what we think we are. external_storage_test.go has tests in testExportStore which test valid basename constructs.

Yes that works thanks

@adityamaru adityamaru force-pushed the add-default-uribase branch from 0e90fe8 to ea15683 Compare July 21, 2020 12:59
Copy link
Copy Markdown
Contributor Author

@adityamaru adityamaru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @dt, @knz, and @miretskiy)


pkg/cli/userfiletable_test.go, line 159 at r6 (raw file):

Previously, knz (kena) wrote…

Can you double-check what the cleanFn returned by TempDir does? It seems to me that you do not need this RemoveAll here.

Done.

@adityamaru adityamaru requested a review from knz July 21, 2020 13:00
Copy link
Copy Markdown
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. You'll want your final PR approval from someone in your team.

Reviewed 1 of 1 files at r7.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @dt, @knz, and @miretskiy)

@miretskiy miretskiy requested review from dt and knz July 22, 2020 13:10
Copy link
Copy Markdown
Contributor

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 8 files at r5.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @adityamaru, @dt, and @knz)


pkg/cli/userfiletable_test.go, line 131 at r7 (raw file):

		if err != nil {
			t.Fatal(err)
		}

require.NoError(t, err) here and throughout?


pkg/storage/cloudimpl/cloudimpltests/file_table_storage_test.go, line 64 at r7 (raw file):

		require.True(t, testutils.IsError(store.WriteFile(ctx, testfile, bytes.NewReader([]byte{0})),
			"basename listing-test/../basepath changes to basepath on normalization. "+
				"userfile does not permit such constructs."))

you don't have to specify full error message. Just substring (does not permit such).
Also, I would pull out write file call and pass in err to require.True so that if the test fails, you'll see what kind of error you got (as opposed to "expected true")
err := store.WriteFile()
require.True(t, testutils.IsError(err, "changes to basepath", err))

Previously, the user would have to specify both the source and
detination args when using `userfile upload`.  This change tweaks the
semantics to be as follows:

- Users must specify a source path.

- If the user does not specify a destination URI/path, we use the
  default URI scheme and host, and the basename from the source arg as
the path.

- If the destination is a well-formed userfile URI of the form
  userfile://db.schema.tablename_prefix/path/to/file, then we use that
as the final URI.

- If destination is not a well-formed userfile URI, we use the default
  userfile URI schema and host, and the destination as the path.

File paths are never cleaned during upload or interaction with the
UserFileStorage. Thus, the filename is the exact string represented by
the path of the constructed userfile URI. This is in keeping with the
semantics of s3 and gcsutil. To prevent user surprises we still reject
constructs where the URI post normalization != URI pre normalization.

egs: test/./test.csv or test/../../test.csv

Release note (cli change): Improves the user semantics for `userfile
upload` by supporting different patterns of specifying the source and
destination CLI args. The source arg is required, while the destination
arg is now optional.
@adityamaru adityamaru force-pushed the add-default-uribase branch from ea15683 to aeab5d3 Compare July 22, 2020 14:42
@adityamaru
Copy link
Copy Markdown
Contributor Author

TFTRs!

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jul 22, 2020

Build succeeded

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants