Skip to content

Gitscanner callbacks#1725

Merged
technoweenie merged 10 commits intofast-walk-no-channelsfrom
gitscanner-callbacks
Dec 6, 2016
Merged

Gitscanner callbacks#1725
technoweenie merged 10 commits intofast-walk-no-channelsfrom
gitscanner-callbacks

Conversation

@technoweenie
Copy link
Contributor

@technoweenie technoweenie commented Nov 29, 2016

This changes the gitscanner API to use callbacks, instead of returning a PointerChannelWrapper. See the fsck command for an ideal usage of it:

gitscanner := lfs.NewGitScanner(func(p *lfs.WrappedPointer, err error) {
  // handle pointers or scan errors
})

if err := gitscanner.ScanRefWithDeleted(ref.Sha, nil); err != nil {
  // handle cmd error
}

gitscanner.Close()

You can now pass a callback to NewGitScanner() for all the .Scan*() funcs, or pass a callback to individual functions if you want those pointers to be handled differently.

Unfortunately, this introduces some ugly code in the core push/pull/checkout related commands. A lot of code still expects to receive a []*lfs.WrappedPointer after the scan has completed. This cleanup will have to happen in future PRs. The main blockers are the progress meter and transfer queues need to accept a stream of pointers.

My hunch is that the transfer queue is all ready setup for this, but our use of it prefers receiving a []*lfs.WrappedPointer. However, the progress meter definitely expects to receive the total number and size of the pointers up front. Since this process is becoming async, updating the progress meter stuff is my next target.

Copy link
Contributor

@ttaylorr ttaylorr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, I just left a few comments that I'd love to 👂 your 💭 's on.

// use new gitscanner so mapping has all the scanned pointers before continuing
mapping := make(map[string][]*lfs.WrappedPointer)
chgitscanner := lfs.NewGitScanner(nil)
err = chgitscanner.ScanTree(ref.Sha, func(p *lfs.WrappedPointer, err error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think this callback should be moved into the line above since (I think) we're leaning towards getting rid of the per-operation callbacks?

tempgitscanner := lfs.NewGitScanner(func(p *lfs.WrappedPointer, err error) {
if err != nil {
if multiErr != nil {
multiErr = fmt.Errorf("%v\n%v", multiErr, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about extracting this into a function in the errors package? I'd prefer to use errors.Wrap(), but I think it defaults to "Error" if err is nil.

package errors

func CombineErrors(e1, e2 error) error {
        if e1 == nil {
                return e2
        }

        return fmt.Errorf("%v\n%v", e1, e2)
}

There's probably a pretty cool opportunity to do something like:

func CombineErrors(car error, cdr ...error) error { ... }

but I don't think we need it.

Copy link
Contributor

@ttaylorr ttaylorr Nov 30, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coming back to this, if CombineErrors were to go in the errors package, I think it'd be better to just call it errors.Combine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we could probably expose the inner errors somehow too. I'll experiment with this in another PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added errors.Combine in my original locking PR so 👍 (although it only concatenated with a line break as you're doing here, nothing smart like storing inner errors)

var pointers []*lfs.WrappedPointer

tempgitscanner := lfs.NewGitScanner(nil)
err := tempgitscanner.ScanPreviousVersions(ref, since, func(p *lfs.WrappedPointer, err error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment here about inlining the callback to the above line (and then shortening this if statement to not leak the err variable).

return false, err
for _, oid := range corruptOids {
badFile := filepath.Join(badDir, oid)
if err := os.Rename(lfs.LocalMediaPathReadOnly(oid), badFile); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, I didn't know you could rename like this!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How else do you rename?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha, I definitely meant to type "move" in my original comment. I thought there was a os.MoveFile function, but I was mistaken 😅


if ok {
Print("Git LFS fsck OK")
recalculatedOid := hex.EncodeToString(oidHash.Sum(nil))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh man, this is so much better than

fmt.Sprintf("%x", oidHash.Sum(nil))

pointers = append(pointers, p)
}

if err := g.ScanLeftToRemote(ref, cb); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we OK to do this if multiErr != nil?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. If this returns an error, the command that ScanLeftToRemote() attempts to spin up errored, so the callback potentially sets multiErr is never called.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks for the explanation 👍

return cb, nil
}

return nil, errors.New("No callback given")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We talked about this in 🍖 -space, but this would be a good candidate to extract to a sentinel value if you want to keep the firstGitScannerCallback business 👍


// CallbackMissing returns a boolean indicating whether the error is reporting
// that a GitScanner is missing a required GitScannerCallback.
func CallbackMissing(err error) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dig IsCallbackMissing instead of just CallbackMissing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was a 50/50 toss-up for me, but your recommendation made it a 51/49 split :)

@sinbad
Copy link
Contributor

sinbad commented Dec 5, 2016

It feels a little odd to me to supply the callback in NewGitScanner instead of per Scan... function. Are you really likely to want to use the same callback for more than one invocation of Scan...? Feels to me like you'd mostly want different callbacks each time, and the callback implementation flowing after the Scan.. call feels more natural and readable to having to look back at the construction.

@technoweenie
Copy link
Contributor Author

Part of the point in the git scanner redesign is spawning the git cat-file processes a single time for large pushes, reducing how many git commands are spawned overall. Most uses of the git scanner can be changed to pass the pointers asynchronously to the transfer queue, but a lot of cruft prevents us from doing it in this PR. I plan to tackle the progress meter next, while @ttaylorr works on the transfer queue.

That said, I do think passing the callback on the constructor is pretty strange from an API perspective. There's still time before the API is finalized a git/gitscanner package, so I expect the callback situation will change.

@technoweenie technoweenie merged this pull request into fast-walk-no-channels Dec 6, 2016
@technoweenie technoweenie deleted the gitscanner-callbacks branch December 6, 2016 16:25
@technoweenie technoweenie restored the gitscanner-callbacks branch December 6, 2016 22:37
@technoweenie technoweenie deleted the gitscanner-callbacks branch December 6, 2016 22:39
@technoweenie technoweenie restored the gitscanner-callbacks branch December 6, 2016 22:40
@ttaylorr ttaylorr deleted the gitscanner-callbacks branch December 16, 2016 22:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants