Skip to content

Key value store#1694

Merged
sinbad merged 12 commits into
locking-masterfrom
key-value-store
Nov 28, 2016
Merged

Key value store#1694
sinbad merged 12 commits into
locking-masterfrom
key-value-store

Conversation

@sinbad

@sinbad sinbad commented Nov 21, 2016

Copy link
Copy Markdown
Contributor

This PR adds an in-memory, optimistically locked key-value store. It's basically a replacement for Boltdb for caching active locks locally as requested in the lock PR.

Submitted separately because it's generic and hopefully easy to review. Rewrite of lock PR is still ongoing but will use this.

@sinbad

sinbad commented Nov 21, 2016

Copy link
Copy Markdown
Contributor Author

Hold off a minute, I think there may be an issue with writing a DB that's truncated smaller, need to add a test for that.

@sinbad

sinbad commented Nov 21, 2016

Copy link
Copy Markdown
Contributor Author

OK, handled that case now; actually still worked (gob protocol is pretty robust) but shouldn't leave old data lying around.

@sinbad sinbad added the review label Nov 21, 2016

@ttaylorr ttaylorr left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Digging this. Left a few comments for you to 💭 about.

Comment thread tools/keyvaluestore.go
}

// firstly peek at version; open read/write to keep lock between check & write
f, err := os.OpenFile(k.filename, os.O_RDWR|os.O_CREATE, 0664)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking that we could do this eagerly and maintain a handle on the *os.File from the instance, instead of this method.

Since I assume that we Save() more than once, only having to do this somewhat expensive operation once would be a win for me.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this is completely deliberate, and key to the optimistic locking technique. If you hold on to the handle, it will mean that parallel instances of git-lfs will block each other, which is limiting. The whole point of the optimistic locking system is to allow multiple instances of git-lfs to operate on shared key value stores and not end up locking each other out, while also resolving any parallel update issues.

Opening a file isn't that expensive and not unexpected on a Save() call. I doubt we'll be calling it very often; optimally you only call it once just before exit because it's an in-memory store. Conflicts are resolved at save time to avoid having to maintain stronger locks (this is a common DB trick).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, thanks for explaining.

Comment thread tools/keyvaluestore.go

// Short-circuit if we have no changes
if len(k.log) == 0 {
return nil

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does k.mu lock k.log?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implicitly yes; it's a requirement that mu is always locked in the body of functions, either read-only or a full lock on mutating functions.

Comment thread tools/keyvaluestore.go

k.version++

enc := gob.NewEncoder(f)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, we may want to just hold onto this instance instead of the file. Perhaps both, since we need to f.Seek and f.Truncate.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, see comments above. Being able to let go of the file handle is a deliberate feature :)

Comment thread tools/keyvaluestore.go Outdated

enc := gob.NewEncoder(f)
err = enc.Encode(k.version)
if err != nil {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: inline this:

if err = enc.Encode(k.Version); err != nil {
        // ...
}

Comment thread tools/keyvaluestore.go Outdated
return fmt.Errorf("Error while writing version data to %v: %v", k.filename, err)
}
err = enc.Encode(k.db)
if err != nil {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above.

Comment thread tools/keyvaluestore.go Outdated
db map[string]interface{}
}

type keyValueOperation int

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible to use a byte, since it's smaller than an int?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, although because of data alignment, in practice you don't actually save anything unless you have an array of them :)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, damn!

Comment thread tools/keyvaluestore.go Outdated
@@ -0,0 +1,201 @@
package tools

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would be appropriate to put in a sub-package of tools: github.com/git-lfs/git-lfs/tools/kv, which would allow us to get rid of a lot of KeyValue* prefixes.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, can I call it 'keyvalue' for clarity? I even considered the option of making it a separate top-level package since it's not even specific to git-lfs...

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's stick with tools/kv since kv is a common abbreviation.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm 👍 on kv because it's a common abbreviation, and go packages tend to prefer that (such as io). I'm totally fine with leaving it if you have strong feelings about it though.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, done.

Comment thread tools/keyvaluestore.go Outdated
func NewKeyValueStore(filepath string) (*KeyValueStore, error) {
kv := &KeyValueStore{filename: filepath, db: make(map[string]interface{})}
err := kv.loadAndMergeIfNeeded()
return kv, err

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: since you can return non-nil values for both, this could be simplified to:

return kv, kv.loadAndMergeIfNeeded()

Comment thread tools/keyvaluestore.go Outdated
defer k.mu.Unlock()

delete(k.db, key)
k.log = append(k.log, keyValueChange{keyValueRemoveOperation, key, nil})

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like these transactional methods are pretty similar. What do you think about teaching the keyValueChange type how to perform these operations?

Comment thread tools/keyvaluestore.go Outdated
if err == nil {
defer f.Close()
return k.loadAndMergeReaderIfNeeded(f)
} else if !os.IsNotExist(err) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this possible given the above?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not any more 👍

@technoweenie technoweenie left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made a few code style improvements, but nothing crucial. Thanks for doing this!

Comment thread tools/kv/keyvaluestore.go Outdated
if err != nil {
return err
}
stat, _ := os.Stat(k.filename)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small nit, but you can call f.Stat() here.

Comment thread tools/kv/keyvaluestore.go
case setOperation:
baseDb[change.key] = change.value
case removeOperation:
delete(baseDb, change.key)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a bit of duplication here with Set() and Remove(). Though since this is just a wrapper around a map, it's probably fine for now. One way to solve it is to turn change into an interface, and calling Apply() on *setKey or *removeKey implementations in reapplyChanges() and logChange().

It looks like a mutex is needed here, but I think it's fine this code path is only called on Save() (which has a mutex) and NewStore() (which shouldn't need one).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's only one line of duplication in practice since at this point we're replaying the change and don't want to re-record it in the log; I think a function or interface for 1 line is probably overkill.

The pattern I've used for the mutex is that externally visible functions lock it and internal functions rely on it already being locked. This distinction is required because Go's mutexes aren't recursive, so locks have to be at the edge of the interface and not re-locked inside that.

@ttaylorr

Copy link
Copy Markdown
Contributor

Hmm.. I think we have a new intermittent failure:

test: clone in current directory ...                               FAILED

I've seen this one pop up a few times here and there, and only on Windows. Haven't done any investigation, so I'm not sure what's going on 😭

I'm going to re-run the tests for now, since your PR didn't introduce the transient failure.

@sinbad

sinbad commented Nov 24, 2016

Copy link
Copy Markdown
Contributor Author

Gah, yeah I saw this a couple of weeks ago in an unrelated PR. Only seems to affect AppVeyor and I can't reproduce it on Windows :/

@sinbad sinbad changed the base branch from master to locking-master November 28, 2016 15:05
@sinbad sinbad merged commit ab48ceb into locking-master Nov 28, 2016
@sinbad sinbad deleted the key-value-store branch November 28, 2016 15:42
@sinbad

sinbad commented Nov 28, 2016

Copy link
Copy Markdown
Contributor Author

Re-targetted this at locking-master which will be my integration point for locking stuff from now on, so I can submit incrementally without the overhead of needing APIs to be final at every point. Appveyor problems went away again after I rebased.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants