Skip to content

[WIP] allow lookup by prefix on ID and Name for jobs#199

Closed
vieux wants to merge 6 commits intomoby:masterfrom
vieux:prefix
Closed

[WIP] allow lookup by prefix on ID and Name for jobs#199
vieux wants to merge 6 commits intomoby:masterfrom
vieux:prefix

Conversation

@vieux
Copy link
Copy Markdown
Contributor

@vieux vieux commented Mar 22, 2016

depends on hashicorp/go-memdb#17

result from bench:

BenchmarkCreateNode-4                  50000         49162 ns/op       11590 B/op        154 allocs/op
BenchmarkUpdateNode-4                  20000         64126 ns/op       15051 B/op        208 allocs/op
BenchmarkUpdateNodeTransaction-4       20000         74439 ns/op       21684 B/op        310 allocs/op
BenchmarkDeleteNodeTransaction-4       50000         35235 ns/op       10911 B/op        155 allocs/op
BenchmarkGetNode-4                   1000000          1900 ns/op         272 B/op          9 allocs/op
BenchmarkFindAllNodes-4                  100      11709034 ns/op     2340975 B/op      30588 allocs/op
BenchmarkFindNodeByName-4            1000000          1415 ns/op         185 B/op         11 allocs/op
BenchmarkNodeConcurrency-4             10000        328975 ns/op       96820 B/op       1450 allocs/op

@vieux
Copy link
Copy Markdown
Contributor Author

vieux commented Mar 22, 2016

@stevvooe please look at fc83994

Does this match what you were thinking about ?

@docker-codecov-bot
Copy link
Copy Markdown

Current coverage is 57.05%

Merging #199 into master will decrease coverage by -0.19% as of 594dda3

@@            master    #199   diff @@
======================================
  Files           43      43       
  Stmts         5220    5172    -48
  Branches       745     735    -10
  Methods          0       0       
======================================
- Hit           2988    2951    -37
+ Partial        406     405     -1
+ Missed        1826    1816    -10

Review entire Coverage Diff as of 594dda3


Uncovered Suggestions

  1. +0.45% via agent/session.go#182...204
  2. +0.35% via agent/agent.go#525...542
  3. +0.35% via .../test/deepcopy.pb.go#1492...1509
  4. See 7 more...

Powered by Codecov. Updated on successful CI builds.

Indexes: map[string]*memdb.IndexSchema{
indexPrefix: {
Name: indexPrefix,
Unique: true,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This index cannot be unique. The independent name and id index must be unique, but we need this to be non-unique.

This allows you to do two things:

  1. Names and identifiers can collide. This is a bad thing but we must allow it.
  2. Ambiguous names/identifiers can be detected and resolved.

Effectively, if we have a query against this index that this non-unique, we can identify and address the ambiguous items.

For example, let's say we create an object with name a and generate and identifier with the name a. Independently, these should be allowed. There is no requirement for these spaces to be disjoint. If we make the index unique, it is ambiguous as to which entry wins. If both are there, we know they are ambiguous and can have a user take action to resolve the problem, either by changing the name, id or just providing a more specific query (ie id=a) when referencing an object.

@stevvooe
Copy link
Copy Markdown
Contributor

@vieux Yes, this looks good.

As a rule, we shouldn't support "fuzzy" prefix lookup for Get methods. Get methods should only be by id. All of the user facing query functionality should be support through ListXXX.

@vieux
Copy link
Copy Markdown
Contributor Author

vieux commented Mar 23, 2016

@stevvooe what about using Find instead of Get (like in the commit I just added)


id := common.LookupID(common.Context(cmd), c, api.Job{}, args[0])
r, err := c.GetJob(common.Context(cmd), &api.GetJobRequest{JobID: id})
r, err := c.GetJob(common.Context(cmd), &api.GetJobRequest{JobID: args[0]})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is part of the issue. We should only using the id from a ListXXXRequest. There is no reason to make the JobID field fancy.

@stevvooe
Copy link
Copy Markdown
Contributor

@vieux Find is likely the correct approach. However, we need to not let this "nice" user features pollute fields that are meant strictly for identifiers. This is a problem in the docker API, since an id may actually be a name or id. Convenient but problematic. Specifically, it means we can't lock these down as ID fields.

I think the standard lookup for names/ids should be to use ListXXX, the use the id to lookup the object. It means we pay for extra requests but we get consistency. We can always address performance with a cache or have a protobuf that clients can get on startup that sends an index.

@vieux
Copy link
Copy Markdown
Contributor Author

vieux commented Mar 23, 2016

@stevvooe what you saying is going a round trip in the cli ?

CLI ----- ListXXX (or Find) with Prefix ----> manager
CLI <--------- one or mutilple jobs ----------- manager
if 1 job :
  CLI ----- edit/get/delete with ID ----> manager
if 0 job or > 1 job: 
  CLI error

@stevvooe
Copy link
Copy Markdown
Contributor

@vieux Yep. This keeps the API consistent.

The other option is to define query options in the GetXXXRequest objects. Really, I want to avoid overloading the ID field.

@vieux
Copy link
Copy Markdown
Contributor Author

vieux commented Mar 24, 2016

I went for option 1 for now.
Besides the fact that hashicorp/go-memdb#17 is not yet merged, the current PR looks good in my opinion.

@stevvooe
Copy link
Copy Markdown
Contributor

LGTM for now.

We need to spend some time on the selector design. This is going to be important for selecting any object by Meta field.

@vieux
Copy link
Copy Markdown
Contributor Author

vieux commented Mar 24, 2016

ping @aluzzardi @aaronlehmann

vieux added 6 commits March 24, 2016 00:56
Signed-off-by: Victor Vieux <vieux@docker.com>
Signed-off-by: Victor Vieux <vieux@docker.com>
Signed-off-by: Victor Vieux <vieux@docker.com>
Signed-off-by: Victor Vieux <vieux@docker.com>
Signed-off-by: Victor Vieux <vieux@docker.com>
Signed-off-by: Victor Vieux <vieux@docker.com>
@vieux
Copy link
Copy Markdown
Contributor Author

vieux commented Mar 24, 2016

@stevvooe how about we use the prefix capabilities of go-memdb (thanks @aluzzardi) , it avoids creating a huge index

}

message ListJobsRequest {
string prefix = 1;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps make a ListOptions?

@aaronlehmann
Copy link
Copy Markdown
Collaborator

I'm okay with this approach, but I have a few concerns before we merge this:

  • We can't vendor an unmerged PR. We need to either pressure upstream to merge it, or create our own fork.
  • I'm wondering about the performance overhead of adding additional indices. I believe the current store benchmarks only use node objects, but it would be interesting if you could modify them to use jobs (or add this indexing to nodes), and check the impact. The alternative to adding a new indices would be to modify the existing indices to do prefix matches, and do lookups in both whenever we do a ByPrefix lookup. While it would be a little slower to lookup items in two indices instead of a unified one, this should only happen for CLI requests, so it's not performance-critical at all compared to the general case of adding and updating items in the store.

}
if j, ok := obj.(*api.Job); ok {
jobs = append(jobs, j.Copy())
if _, exists := ids[j.ID]; !exists {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if a job the same ID and Name, it will be returned twice without this.

@vieux
Copy link
Copy Markdown
Contributor Author

vieux commented Mar 24, 2016

@aaronlehmann I'm going to compare bench

@vieux
Copy link
Copy Markdown
Contributor Author

vieux commented Mar 25, 2016

I posted a version without the new index in #210 alongside with bench comparison.

#210 is mush faster, I think we should close this PR

/cc @aaronlehmann

@vieux vieux closed this Mar 25, 2016
@vieux vieux deleted the prefix branch March 29, 2016 21:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants