
Implementation of pure SSH API#350

Closed
sinbad wants to merge 58 commits into git-lfs:master from sinbad:feature/full_ssh_multitransfer

Conversation

@sinbad

@sinbad sinbad commented May 29, 2015

This PR makes git-lfs compatible with a pure SSH server implementation, rather than only authenticating over SSH and then doing everything else over HTTP. This is particularly useful for people self-hosting who don't want the overhead of setting up and maintaining a web server.

Corresponding reference server implementation is at https://github.com/sinbad/git-lfs-ssh-serve

The PR in its current state focuses on a pure SSH route (including upload/download). Future extensions might include a hybrid setup where again no web server is required for any API calls (UploadCheck/DownloadCheck), but Upload/Download can still be redirected to hypermedia links (most likely S3).

The PR also adds support for a few more SSH URL forms, including custom ports. The existing git-lfs-authenticate route is not affected.
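As a rough illustration of what handling those extra URL forms involves (the type and function names below are my own sketch, not the actual git-lfs parser), a full ssh:// URL can carry a custom port while the scp-style form cannot:

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// sshEndpoint holds the parts needed to invoke ssh for the LFS API.
// Illustrative only; not git-lfs's actual representation.
type sshEndpoint struct {
	User, Host, Port, Path string
}

// parseSSHRemote handles both full ssh:// URLs (which may carry a
// custom port) and the scp-style "user@host:path" form git accepts.
func parseSSHRemote(raw string) (sshEndpoint, error) {
	if strings.HasPrefix(raw, "ssh://") {
		u, err := url.Parse(raw)
		if err != nil {
			return sshEndpoint{}, err
		}
		ep := sshEndpoint{
			Host: u.Hostname(),
			Port: u.Port(), // empty string if no custom port given
			Path: strings.TrimPrefix(u.Path, "/"),
		}
		if u.User != nil {
			ep.User = u.User.Username()
		}
		return ep, nil
	}
	// scp-like syntax: [user@]host:path (no custom port possible here)
	hostPart, repoPath, found := strings.Cut(raw, ":")
	if !found {
		return sshEndpoint{}, fmt.Errorf("not an SSH remote: %s", raw)
	}
	ep := sshEndpoint{Host: hostPart, Path: repoPath}
	if user, host, ok := strings.Cut(hostPart, "@"); ok {
		ep.User, ep.Host = user, host
	}
	return ep, nil
}

func main() {
	ep, _ := parseSSHRemote("ssh://git@example.com:2222/org/repo.git")
	fmt.Println(ep.User, ep.Host, ep.Port, ep.Path)
	ep, _ = parseSSHRemote("git@example.com:org/repo.git")
	fmt.Println(ep.User, ep.Host, ep.Port, ep.Path)
}
```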

sinbad added 30 commits May 19, 2015 14:59
For now, HTTP code hasn't been moved more than necessary, left in existing location to aid merging
… full ssh:// URLs

Also corrected a duplicated test which was, I assume, intended to test non-bare SSH URLs
lfs/ssh_test.go (outdated)

Contributor:

Change this to "github.com/github/git-lfs/vendor/_nuts/github.com/technoweenie/assert" to get the test to pass.

sinbad (Contributor Author):

Oops, didn't spot that when I merged in the nut changes, cheers.

Contributor:

This interface isn't really ready to be extracted. We don't intend to maintain support for the current upload/download API once the batch API is finalized. I'm thinking an eventual interface would look something like this.

```go
type ApiContext interface {
  Endpoint() Endpoint
  Close() error
  DownloadAll(objects []*ObjectResource)
  UploadAll(objects []*ObjectResource)
}
```

I think an ApiContext should be responsible for a single Git LFS operation: a smudge filter downloading a single object, or a push command uploading 100 objects from a range of commits. I think that would remove the need to access cached ApiContext objects through GetApiContext(). I can't think of a reason you'd need an ApiContext for multiple endpoints, or multiple contexts for the same endpoint, in a single Git LFS command.
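To illustrate the single-operation idea (the fakeContext type below is hypothetical, purely for demonstration; only the ApiContext interface comes from the comment above):

```go
package main

import "fmt"

// Minimal stand-ins for the real git-lfs types, for illustration.
type ObjectResource struct{ Oid string }
type Endpoint struct{ URL string }

// ApiContext as sketched in the review comment above.
type ApiContext interface {
	Endpoint() Endpoint
	Close() error
	DownloadAll(objects []*ObjectResource)
	UploadAll(objects []*ObjectResource)
}

// fakeContext is a hypothetical implementation used only to show the
// one-context-per-operation usage pattern.
type fakeContext struct {
	ep       Endpoint
	uploaded int
}

func (c *fakeContext) Endpoint() Endpoint                 { return c.ep }
func (c *fakeContext) Close() error                       { return nil }
func (c *fakeContext) DownloadAll(objs []*ObjectResource) {}
func (c *fakeContext) UploadAll(objs []*ObjectResource)   { c.uploaded += len(objs) }

func main() {
	// One context per Git LFS operation: a push uploads its whole
	// object set through a single context, then tears it down.
	var ctx ApiContext = &fakeContext{ep: Endpoint{URL: "ssh://git@example.com/repo.git"}}
	defer ctx.Close()
	ctx.UploadAll([]*ObjectResource{{Oid: "a"}, {Oid: "b"}})
	fmt.Println("uploaded via", ctx.Endpoint().URL)
}
```

Under this shape there is no global registry to consult; the command that owns the operation owns the context's lifetime.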

sinbad (Contributor Author):

OK - I was just working with what was there; happy for it to be 100% batch oriented too. In theory, yes, that would eliminate the need for cached contexts within the command-line usage model, although it does no real harm for them to be there. The SSH transport will always need a discrete setup and tear-down step anyway, so whether it spans multiple operations or just one makes little difference in added complexity.

sinbad (Contributor Author):

FWIW, I was developing this in parallel with the interface morphing (Upload disappeared and Check/Object/Batch variants appeared), and it was really no big deal to stay in step. I don't think you necessarily need to wait until your interface is 'final' before allowing it to be abstracted like this - although it depends on how many third-party implementations you were expecting (I wasn't really expecting any).

Contributor:

Honestly, I'd love to reject all other implementations and figure out some way to implement custom ones through commands. Until then, I think we can merge this and move forward with the API design incrementally. We're likely not accepting other implementations any time soon; each additional client makes the API that much harder to evolve and maintain.

I'd really like to get rid of GetApiContext if we don't need it. That can wait until everything is using the *All() batch methods though.

sinbad (Contributor Author):

Yeah, the concept of a context surviving multiple API calls is necessary until you can guarantee that only one API call will be made per command, unless you want a separate SSH connection fired up for every file. Even after the *All() batch methods are implemented, though, I think it's nice to keep that flexibility; replacing it would just be equivalent to a global context per API, and I'm not sure that actually simplifies much - it just bakes in an assumption.

@technoweenie

Before merging, I'd like to get some assurance that you'll be around to support it. We're not planning on using it in production anywhere. I don't see this being a problem, but I just want to be clear :) I also still need to set this up locally and see the flow working, and make sure our existing servers all still work.

After merging, here are some things I'd like to see:

  • Git config to immediately fall back to git-lfs-authenticate.
  • Implement the *All() batch methods, and get rid of ApiContext.
  • Integration tests would be awesome. I wonder if we can spin up a fake ssh server...
  • Eventually merge the SSH commands. git-lfs-ssh-authenticate (command name doesn't matter) could verify the auth, and then send a response that tells the client either to send further commands over the connection, or to use the given token to talk to the API.
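The fake-SSH-server idea could start as an in-process pipe harness before involving a real ssh daemon. This sketch mimics the stdin/stdout framing an ssh session provides; the request/response protocol and all names here are invented for illustration, not the git-lfs wire format:

```go
package main

import (
	"bufio"
	"fmt"
	"io"
)

// fakeServe is a minimal stand-in for an SSH-served LFS endpoint: it
// reads request lines from in and writes one response line each to
// out, just as a real server would over the ssh session's pipes.
func fakeServe(in io.Reader, out io.WriteCloser) {
	defer out.Close()
	scanner := bufio.NewScanner(in)
	for scanner.Scan() {
		fmt.Fprintf(out, "ok %s\n", scanner.Text())
	}
}

// roundTrip wires a "client" to the fake server with two in-memory
// pipes (one per direction) and performs a single request/response.
func roundTrip(req string) string {
	clientIn, serverOut := io.Pipe() // server -> client
	serverIn, clientOut := io.Pipe() // client -> server
	go fakeServe(serverIn, serverOut)

	fmt.Fprintln(clientOut, req)
	clientOut.Close() // signal EOF so the server loop can finish
	resp, _ := bufio.NewReader(clientIn).ReadString('\n')
	return resp
}

func main() {
	fmt.Print(roundTrip("download abc123"))
}
```

This still wouldn't exercise real ssh auth or channel setup, but it would let integration tests drive the full request/response cycle automatically.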

Regarding a git config fallback: if I understand this, it falls back to the hybrid HTTPS/SSH authentication method when connect() discards the error and returns a nil context. Any thoughts on writing a key like remote.{name}.lfs**** to the repo's git config when this happens, so that it doesn't have to do all this every time a user accesses their remote? We're experimenting with this idea in #358.

@sinbad

sinbad commented Jun 5, 2015

Yeah, I'd love to figure out how to fully test over real SSH instead of having to use a pipe; that last little piece of non-automated testing bothers me. I couldn't figure out a way to fake it without still having a gap - a fake SSH server in Go is possible, but it still wouldn't be 100% real. Then again, it would still be better than nothing, so maybe I should have a crack at it. Before you merge I'd like to do a few more tests myself anyway, since I had to finish this while travelling and didn't have my office servers available to test with. Also, I don't have a live git-lfs-ssh-authenticate implementation to test with - I assume GitHub's implementation is not available as source. I've signed up for the GitHub LFS early access queue, which I guess is the best way to test this; maybe you can get that enabled for me?

Config fallback to remember the preferred route sounds sensible; the only issue is dealing with transitory error conditions, e.g. someone trying to push while offline - you wouldn't want to bake in that failure as a preference. I guess that means only setting the option after a successful use of that path.

About maintenance: I'm assigned to this area by Atlassian for at least the next few months, and I have a personal interest in it anyway, so I'd probably be tinkering with it even if I weren't. I can't make any firm promises for the future though, real life etc. :) I think it's useful even if marked 'experimental', with GitHub's line being simply that you don't support it - the refactoring is useful anyway.

Alternatively, if you're worried it will slow you down while you're adapting the API, you could merge just the stuff that's immediately useful (SSH URLs, plink/Tortoise support) now that I know you're OK with the principles, and I can maintain my SSH fork in parallel and see what the community thinks. The main issue with that is that the refactoring required moving some code around, which becomes harder to maintain over time. That said, it was only about 10 days' work, so if I had to redo it manually I could do it more quickly; it just might not always be in sync. I'm happy with whichever approach you want to take.

@sinbad

sinbad commented Jun 5, 2015

Found a couple of auth prompting issues now that I can test in real environments; hold off merging either way until I've nailed them down.
[Edit] False alarm actually - it was caused by a bad default remote that had nothing to do with the SSH setup.

@sinbad

sinbad commented Jun 5, 2015

I can sometimes stall it with large numbers of files when using git lfs fetch, though (but not git checkout - maybe an issue with the batch implementation). Working on that.

@sinbad

sinbad commented Jun 9, 2015

OK, finally figured out why - my misunderstanding of how your goroutines worked. It needs a bit of a design change to accommodate and serialise transfers to the SSH pipe(s); I'll resubmit once I've sorted it out.
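One common shape for that kind of design change (purely my sketch, not the actual fix that went into the fork) is to funnel the concurrent transfer goroutines into a single writer goroutine that owns the pipe, so their frames can never interleave on the wire:

```go
package main

import (
	"fmt"
	"sync"
)

// serialise fans concurrent transfer goroutines into one writer
// goroutine that owns the (stand-in) SSH pipe. Each worker sends a
// request and blocks until its request has been written, so requests
// never interleave on the wire. Illustrative sketch only.
func serialise(oids []string) []string {
	type request struct {
		oid  string
		done chan struct{}
	}
	reqs := make(chan request)
	var pipeLog []string // stands in for the real SSH stdin pipe

	// Sole owner of the pipe: all writes happen in this goroutine.
	go func() {
		for r := range reqs {
			pipeLog = append(pipeLog, r.oid)
			close(r.done) // tell the worker its request is on the wire
		}
	}()

	var wg sync.WaitGroup
	for _, oid := range oids {
		wg.Add(1)
		go func(oid string) {
			defer wg.Done()
			done := make(chan struct{})
			reqs <- request{oid: oid, done: done}
			<-done // block until our turn on the pipe has finished
		}(oid)
	}
	wg.Wait()
	close(reqs)
	return pipeLog
}

func main() {
	fmt.Println(len(serialise([]string{"a", "b", "c"})), "requests serialised")
}
```

The per-request done channel also gives each worker a natural place to wait for its response before releasing the pipe, if the protocol is strictly request/response.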

@sinbad sinbad closed this Jun 9, 2015
@gully gully mentioned this pull request Feb 25, 2016