Locking part 1: locking package #1625

Closed

sinbad wants to merge 2 commits into master from locking-package

Conversation

@sinbad
Contributor

@sinbad sinbad commented Nov 7, 2016

Part 1 resubmission of #1590

This refactors logic into a dedicated package called locking, so that secondary functionality can be added in a re-usable place. Makes calls to a lock cache which will be implemented in a later PR (a replacement for BoltDB which was used in #1590).

This also involved moving ChannelWrapper into tools, since it is a useful abstraction for working with parallel result and error channels while avoiding mistakes; it is used for the refactored SearchLocks so streamed results are available if needed.

@sinbad sinbad added the review label Nov 7, 2016
@technoweenie
Contributor

Since this is adding a new locking package, I want it to have a well designed API that external users can depend on. Can we design something with a locking.Client object that has per-command properties like CurrentRemote, and some lock caching implementation?

I realize at this point, it's a pipe dream, since both the api and auth packages use the *config.Configuration.

@sinbad
Contributor Author

sinbad commented Nov 7, 2016

I’d planned that for a later PR when I introduce the caching (started the bolt replacement already), trying to chop it up into smaller incremental changes.

On 7 Nov 2016, at 19:30, risk danger olson notifications@github.com wrote:

Since this is adding a new locking package, I want it to have a well designed API that external users can depend on. Can we design something with a locking.Client object that has per-command properties like CurrentRemote, and some lock caching implementation?

I realize at this point, it's a pipe dream, since both the api and auth packages use the *config.Configuration.



Contributor

@ttaylorr ttaylorr left a comment

Looking good. I have some concerns re: the usage of the <T>ChannelWrapper pattern in public APIs and some notes about usage of the *config.Configuration global instance.

Other than that, I'm very excited for this to get into master. ✨

var (
// API is a package-local instance of the API client for use within
// various command implementations.
API = api.NewClient(nil)
Contributor

Is there a reason that this is public?

API = api.NewClient(nil)
// errNoMatchingLocks is an error returned when no matching locks were
// able to be resolved
errNoMatchingLocks = errors.New("lfs: no matching locks found")
Contributor

I'd be 👍 on increasing the surface-area of this package's API and exporting these as sentinel error values.

// Returns the lock id if successful, or an error
func Lock(path, remote string) (id string, e error) {
// TODO: API currently relies on config.Config but should really pass to client in future
savedRemote := config.Config.CurrentRemote
Contributor

Could you pull out a temporary variable:

cfg := config.Config

to signal that references to cfg should be replaced with a parameter, or pass the *config.Configuration instance explicitly? This is a pattern that we use throughout other parts of LFS to signal that the usage of that config should be replaced with a more specific type that holds only the data that it needs, or that the usage of that configuration instance is not necessarily bound to the global one.

I believe that both of those situations apply here.

// Force causes the file to be unlocked from other users as well
func Unlock(path, remote string, force bool) error {
// TODO: API currently relies on config.Config but should really pass to client in future
savedRemote := config.Config.CurrentRemote
Contributor

Same note about the config.Configuration use here.

func Unlock(path, remote string, force bool) error {
// TODO: API currently relies on config.Config but should really pass to client in future
savedRemote := config.Config.CurrentRemote
config.Config.CurrentRemote = remote
Contributor

This is not safe to use across multiple goroutines. Ideally we should fix this and either:

  • duplicate the configuration instance
  • extract a type that holds this information and has only one local instance
  • have some sort of locking/coordination mechanism (a sync.RWMutex or similar) guarding read/write access to this instance

If none of those seem immediately feasible or you'd rather focus on other stuff for the time being, let's at least document this behavior in the method's godoc.

Contributor Author

Again, this was taken from the previous implementation. I really need to change the API to properly decouple this, as mentioned in the TODO. Fair point, though: since this is supposedly a re-usable package now, maybe this first step needs to include more changes after all.


// ChannelWrapper for lock search to more easily return async error data via Wait()
// See NewPointerChannelWrapper for construction / use
type LockChannelWrapper struct {
Contributor

I am still not totally sold on this <T>ChannelWrapper pattern. I think introducing this new locking package is a great call, but I'd love if right off the bat we could make better promises about the stability of it.

My biggest concern is that public consumers of this package will expect results from functions returning one of these channel wrappers in a different way than is provided. In other words, I'm not sure if they're flexible enough.

From what I can gather, this type provides two APIs:

  • Results <-chan api.Lock
  • Wait() error

What do you think about returning the <-chan api.Lock wholesale (allowing callers to interact with it asynchronously as before), and exporting a function in tools a-la:

package tools

func CollectErrsFromChan(errs <-chan error) error

that provides the same behavior as Wait() error?

I think this solution would allow callers/consumers of this method to handle errors in a synchronous or asynchronous manner, which would provide a sufficient amount of flexibility.

Contributor Author

OK - my experience is that a lot of people aren't immediately comfortable with the 'dual asynchronous channel' pattern and that mistakes get made quite easily; the channel wrapper is intended to make consuming from it idiomatically simple and less prone to error.

Maybe I can think of a way to make default use simple and error-free while still allowing manual access to the underlying channels if desired. Although realistically I don't see the benefit; you can already process primary results in parallel, and processing errors in parallel is rarely very useful.

Contributor

Sounds good, I'll keep my 👀 peeled. My motivation (one I should have mentioned earlier) was that I'd like to provide the flexibility for people to handle these errors on a case-by-case basis, halting the goroutine/worker/etc. if an error is deemed "fatal", and ignoring it and moving on otherwise.

// If limit > 0 then search stops at that number of locks
func SearchLocks(remote string, filter map[string]string, limit int) *LockChannelWrapper {
// TODO: API currently relies on config.Config but should really pass to client in future
savedRemote := config.Config.CurrentRemote
Contributor

Same note here about the config.Config usage.

errChan := make(chan error, 5) // can be multiple errors below
lockChan := make(chan api.Lock, 10)
c := NewLockChannelWrapper(lockChan, errChan)
go func() {
Contributor

What do you think about extracting this goroutine out to a separate (private) function to add some clarity and cut down on the indentation? I'm thinking:

func scanLocks(offsetId string, locks chan<- api.Lock, errs chan<- error) { ... }

This function could also recurse and handle the offset in a really nice way.

savedRemote := config.Config.CurrentRemote
config.Config.CurrentRemote = remote
errChan := make(chan error, 5) // can be multiple errors below
lockChan := make(chan api.Lock, 10)
Contributor

Why 5 and 10?

Contributor Author

Arbitrary; > 1 because the work is parallel, and errors are less likely than results. What would you have used?

Contributor

I'd remove the buffer, since it is not obvious to the caller when this function will block and when it won't.

@sinbad
Contributor Author

sinbad commented Nov 8, 2016

My take away summary: step 1 needs to be more than a simple move refactor. I'll have to come back to this later.

@ttaylorr
Contributor

ttaylorr commented Nov 8, 2016

My take away summary: step 1 needs to be more than a simple move refactor. I'll have to come back to this later.

Thanks, and sorry for some of these heavier 💣s. The only reason I want to address these now rather than later is that this PR is targeted against master.

@sinbad
Contributor Author

sinbad commented Nov 28, 2016

The only reason I want to address these now rather than later is that this PR is targeted against master.

OK, so if I built a major feature branch then submitted individual stage PRs against that, with the final PR to master just being one big merge-of-merges, you'd be OK with that? I think that will be more manageable as I'm stuck between trying to make PRs that are simultaneously small enough / incremental enough to review, and acceptable at a higher level design wise.

@sinbad sinbad closed this Nov 28, 2016
@technoweenie
Contributor

IMO that approach did not work well at all with the big filter process PR (#1382). At one point I was making a review comment in PR 3 that was already fixed in PR 5.

I'd prefer a branch that merges master multiple times until it's ready to merge. This leads to a messier git history, but it preserves our true public collaboration history.

That said, I imagine this is pretty frustrating for you. I think part of the problem is that this PR is doing 2 or 3 things at the same time.

  1. Introduce locking package
  2. Use locking package in commands
  3. Extract channel wrappers to tools.

Items 1 and 2 could probably stay in 1 PR, but item 3 should definitely have been separate to minimize conflicts. We could've fast tracked a PR to extract the channel wrappers, while debating the finer points of the locking package API.

I know your LFS time is limited so feel free to toss us some tasks like item 3 (which I tackled separately). Also, I do plan for @ttaylorr or me to help out on the locking front some time after we finish the scanner and transfer queue changes. Those are core problems for all LFS users that I've wanted to fix for a long time.

@technoweenie technoweenie deleted the locking-package branch November 28, 2016 15:58
@technoweenie technoweenie restored the locking-package branch November 28, 2016 15:58
@sinbad
Contributor Author

sinbad commented Nov 28, 2016

IMO that approach did not work well at all with the big filter process PR (#1382). At one point I was making a review comment in PR 3 that was already fixed in PR 5.

If it makes a difference, I don't intend to run PRs in parallel like that (I won't have the bandwidth). I expect to only have one in review at any one time.

I've ditched the channel wrappers for my next iteration anyway, since we're moving away from channels as an API-exposed construct (on reflection, a good shout), so they're not really needed any more. That actually addresses most of the feedback on this PR in one go, barring the configuration stuff. It turns out the methods in this case didn't really need callbacks anyway, given the way they were working. So I think I'll nail 1 & 2 in a mo and 3 goes away. I'll iterate on config later, but it's a 'no change' for this PR. If you think what's there can go into master, feel free to merge it.

@sinbad sinbad deleted the locking-package branch November 28, 2016 17:03
@technoweenie
Contributor

Thanks for the new PR. The locking-master PR target is interesting. It lets us approve and merge outside of master, so technically any Go APIs aren't shipped yet. Unlike my git scanner stuff, which IS technically shipped :/

I've also enabled branch protections on locking-master, so it verifies code reviews, ci, cla, etc :)
