Skip to content

Improve reference validation#18586

Merged
tiborvass merged 5 commits into
moby:masterfrom
tonistiigi:tag-reference-squashed
Dec 16, 2015
Merged

Improve reference validation#18586
tiborvass merged 5 commits into
moby:masterfrom
tonistiigi:tag-reference-squashed

Conversation

@tonistiigi

Copy link
Copy Markdown
Member

#17924 added a reference package from docker/distribution that improved validation of tags and repository names a lot. There were still some cases left that needed docker specific handling for the references.

This PR adds a wrapper around distribution/reference that adds the docker specific functionality. This includes forbidding 64-byte hex as repo name, specific rules for detecting hostname, detection of optional default hostnames, handling references with both tag and digest and support for conversion between normalized and canonical reference forms. There is no need to remember(and forget in many places) calling validation methods for these anymore, as they all built in to parse function.

I've also cleaned up the type switches for different reference types that become easier now and made error messages more verbose(fixes #18093). After this, the need to use RepositoryInfo struct should be much lower as you can always turn any reference to any other form.

I recommend reviewing by commit. Most changes are in the first commit, but they are all simple replacements from moving package and easy to follow.

@aaronlehmann

Comment thread distribution/push_v2.go

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like how Association is in the same package as the other reference types now.

@tonistiigi tonistiigi force-pushed the tag-reference-squashed branch from f730401 to 1ec7ea2 Compare December 10, 2015 22:55
Comment thread reference/store.go Outdated

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the comment call it a reference store?

@tonistiigi tonistiigi force-pushed the tag-reference-squashed branch from 1ec7ea2 to 7563385 Compare December 10, 2015 23:16
Comment thread reference/reference.go Outdated

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment doesn't seem to be true. If there's no hostname, it returns the default index name.

This seems to be the main difference from distreference.SplitHostname, besides the extra validation, and it looks like the reason ParseNamed uses distreference.SplitHostname instead of this version.

I feel this function should be renamed to something else so it doesn't get confused with the other version, and to make it clearer it's doing more than just splitting on a hostname boundary.

Perhaps IndexAndRemoteName?

@aaronlehmann

Copy link
Copy Markdown

Thanks for doing all this cleanup. It's a big improvement.

@tonistiigi

Copy link
Copy Markdown
Member Author

I'll attempt a version with a custom definition for Named that already has normalization/remotename built in. Will reopen when ready.

@tonistiigi tonistiigi closed this Dec 11, 2015
@tonistiigi tonistiigi reopened this Dec 12, 2015
@tonistiigi tonistiigi force-pushed the tag-reference-squashed branch from 7563385 to 500e9a7 Compare December 12, 2015 00:39
@tonistiigi

Copy link
Copy Markdown
Member Author

Updated with a new interface for reference.Named so it can't get mixed up with distribution.Named. This is also much safer for the pull code is registry/ that before accepted Named but actually required the input to be in a specific form(like remotename or fullname).

PTAL

Comment thread daemon/image_delete.go Outdated

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use reference.WithDefaultTag instead, and remove the switch?

@aaronlehmann

Copy link
Copy Markdown

I like these changes. Noted a few small things, but I think this is getting very close.

One thing I was wondering about is whether it makes sense to store the string that was originally parsed, and provide a method to access this. This might be useful for error messages where we want to echo back the exact reference the user supplied, not a normalized form. But if it's okay for error messages to use the normalized version, I guess there's no reason to do this.

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Fixes moby#18093

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
@tonistiigi tonistiigi force-pushed the tag-reference-squashed branch from 0ecd28a to eeb2d4c Compare December 16, 2015 19:59
@calavera

Copy link
Copy Markdown
Contributor

LGTM

1 similar comment
@tiborvass

Copy link
Copy Markdown
Contributor

LGTM

tiborvass added a commit that referenced this pull request Dec 16, 2015
@tiborvass tiborvass merged commit dc81c25 into moby:master Dec 16, 2015
@tiborvass

Copy link
Copy Markdown
Contributor

Thank you @tonistiigi @aaronlehmann @dmcgowan !

@stevvooe

Copy link
Copy Markdown
Contributor

The whole point of the distribution/reference was to have this done in one place. This basically forks the package and undoes all of the work to consolidate. Why aren't these changes upstreamed into the distribution package?

Comment thread reference/reference.go

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This explanation seems incorrect and diverges from reference package.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would this be anything but the actual name? I don't understand why we would have FullName and this. The existence of the two doesn't make sense.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There used to be a ParseRepositoryInfo function that would generate the different forms of the name that were needed for different purposes (i.e. ubuntu, library/ubuntu, index.docker.io/library/ubuntu). This PR makes it so that a reference can expose any of these forms using methods. The old code made it very easy to accidentally pass the wrong kind of reference to a function expecting a certain form - for example passing ubuntu to something that wants library/ubuntu. IMHO this change makes things a lot cleaner by preventing that kind of mistake.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, we can normalize the reference then return this with a RemoteName package function. Anywhere where it might expect library/ubuntu, just call RemoteName to do get the expected prefix. (arguably, RemoteName should be Basename).

@stevvooe

Copy link
Copy Markdown
Contributor

It seems like we needed to have a references package that has defaults and definitions unique to docker's internal needs, such as references.Store. Several of these methods should have also been flowed upstream to distribution/reference or have informed changes to that package.

@aaronlehmann

Copy link
Copy Markdown

@stevvooe: With a few exceptions that could move upstream, the code here is specific to the daemon. For example, IMHO we want to avoid pushing special cases for docker.io and library/ prefixes into the distribution/reference package.

I don't see this as a fork of distribution/reference. It creates types that extend those interfaces for daemon-specific purposes.

@stevvooe

Copy link
Copy Markdown
Contributor

@aaronlehmann This is a fork. We are no longer using the same types, parsing functions and combinators. We are no longer using the reference package throughout the docker code base. For instance, Named.Name() doesn't mean the same thing anymore. The distribution/reference package shouldn't do all of this but a lot of this can be upstreamed and the rest can be utility methods.

Comment thread reference/reference.go

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, so we spent all that time creating a grammar for image names. This needs to be flowed upstream. The distribution/reference package is supposed to be a one stop shop for correct image naming.

stevvooe added a commit to stevvooe/distribution that referenced this pull request Dec 17, 2015
Rather than fork, we have no upstreamed some of the additions required in
moby/moby#18586. Of upmost importance is the
addition of a useful WithName function. We also restrict Reference
implementations to being from the reference package, making Reference a proper
algebraic data type. This makes it easier to assert that a reference is not
something.

We'll still need to remove the forking from the PR above to get this fully
there. There may also be some missing additions.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
stevvooe added a commit to stevvooe/distribution that referenced this pull request Dec 17, 2015
Rather than fork, we have no upstreamed some of the additions required in
moby/moby#18586. Of upmost importance is the
addition of a useful WithName function. We also restrict Reference
implementations to being from the reference package, making Reference a proper
algebraic data type. This makes it easier to assert that a reference is not
something.

We'll still need to remove the forking from the PR above to get this fully
there. There may also be some missing additions.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
stevvooe added a commit to stevvooe/distribution that referenced this pull request Dec 18, 2015
Rather than fork, we have no upstreamed some of the additions required in
moby/moby#18586. Of upmost importance is the
addition of a useful WithName function. We also restrict Reference
implementations to being from the reference package, making Reference a proper
algebraic data type. This makes it easier to assert that a reference is not
something.

We'll still need to remove the forking from the PR above to get this fully
there. There may also be some missing additions.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
stevvooe added a commit to stevvooe/distribution that referenced this pull request Jan 13, 2016
Rather than fork, we have now upstreamed some of the additions required in
moby/moby#18586. Of upmost importance is the
addition of a useful WithName function.

While did attempt to restrict Reference implementations to being from the
reference package, making Reference a proper algebraic data type, we ran into
issues with the usage in the main docker project. We will re-examine this
approach in the future.

This change will be followed up with some other additions from the daemon.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
stevvooe added a commit to stevvooe/distribution that referenced this pull request Jan 13, 2016
Rather than fork, we have now upstreamed some of the additions required in
moby/moby#18586. Of upmost importance is the
addition of a useful WithName function.

While did attempt to restrict Reference implementations to being from the
reference package, making Reference a proper algebraic data type, we ran into
issues with the usage in the main docker project. We will re-examine this
approach in the future.

This change will be followed up with some other additions from the daemon.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
stevvooe added a commit to stevvooe/distribution that referenced this pull request Jan 13, 2016
Rather than fork, we have now upstreamed some of the additions required in
moby/moby#18586. Of upmost importance is the
addition of a useful WithName function.

While did attempt to restrict Reference implementations to being from the
reference package, making Reference a proper algebraic data type, we ran into
issues with the usage in the main docker project. We will re-examine this
approach in the future.

This change will be followed up with some other additions from the daemon.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
stevvooe added a commit to stevvooe/distribution that referenced this pull request Jan 13, 2016
Rather than fork, we have now upstreamed some of the additions required in
moby/moby#18586. Of upmost importance is the
addition of a useful WithName function.

While did attempt to restrict Reference implementations to being from the
reference package, making Reference a proper algebraic data type, we ran into
issues with the usage in the main docker project. We will re-examine this
approach in the future.

This change will be followed up with some other additions from the daemon.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
stevvooe added a commit to stevvooe/distribution that referenced this pull request Jan 13, 2016
Rather than fork, we have now upstreamed some of the additions required in
moby/moby#18586. Of upmost importance is the
addition of a useful WithName function.

While did attempt to restrict Reference implementations to being from the
reference package, making Reference a proper algebraic data type, we ran into
issues with the usage in the main docker project. We will re-examine this
approach in the future.

This change will be followed up with some other additions from the daemon.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Error message "invalid reference format"

7 participants