Description
Describe the bug
While attempting to migrate a large LFS project to GitHub (~60GB total, probably around 50GB of which is in the 12K LFS objects), I've run into a problem where git push --mirror sits around for close to an hour doing seemingly nothing (low network traffic, low disk IO, low CPU usage, no TTY log output) before eventually failing with:
ref "2.6_leaderboard_anim_fix":: fork/exec /Applications/Xcode.app/Contents/Developer/usr/libexec/git-core/git: resource temporarily unavailable
error: failed to push some refs to 'github.com:REDACTED.git'
After several hours of debugging I tracked it down to what looks like an SSH process leak: during the hour where it appears to be doing nothing, it's actually launching a new SSH process every couple of seconds, until eventually the system's process limit is hit and the OS says "no".
For some reason these processes weren't showing up in the process count shown by Activity Monitor or by top. I only spotted them when I started logging the ps -aef output every few minutes and saw all of the SSH processes that had been spawned by the pre-push hook:
501 20172 20171 0 3:51pm ttys000 0:04.03 /opt/homebrew/bin/git-lfs pre-push git@github.com:REDACTED.git git@github.com:REDACTED.git
501 20183 20172 0 3:51pm ttys000 0:00.00 (ssh)
501 20195 20172 0 3:51pm ttys000 0:00.00 (ssh)
501 20203 20172 0 3:51pm ttys000 0:00.00 (ssh)
501 20210 20172 0 3:51pm ttys000 0:00.00 (ssh)
501 20218 20172 0 3:51pm ttys000 0:00.00 (ssh)
501 20223 20172 0 3:51pm ttys000 0:00.00 (ssh)
501 20236 20172 0 3:51pm ttys000 0:00.00 (ssh)
(+2000 more)
501 35138 20172 0 4:37pm ttys000 0:00.00 (ssh)
501 35143 20172 0 4:37pm ttys000 0:00.00 (ssh)
501 35155 20172 0 4:37pm ttys000 0:00.00 (ssh)
501 35169 20172 0 4:37pm ttys000 0:00.00 (ssh)
501 35176 20172 0 4:37pm ttys000 0:00.00 (ssh)
501 35181 20172 0 4:37pm ttys000 0:00.00 (ssh)
501 35186 20172 0 4:37pm ttys000 0:00.01 (ssh)
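The check described above can be sketched as a small shell helper. This is only an illustration, not part of git-lfs: the function names `count_children` and `log_children` are hypothetical, and it assumes the `ps -aef` column layout shown in the listing (UID, PID, PPID, ...), where the parent PID is the third column.

```shell
# Count processes whose parent PID (3rd column of `ps -aef`) equals $1.
# Reads ps output on stdin, so it also works on saved logs.
count_children() {
    awk -v ppid="$1" '$3 == ppid { n++ } END { print n + 0 }'
}

# Print a timestamped child count for PID $1 every $2 seconds (default 60)
# until that process exits.
log_children() {
    while kill -0 "$1" 2>/dev/null; do
        printf '%s %s\n' "$(date '+%H:%M:%S')" "$(ps -aef | count_children "$1")"
        sleep "${2:-60}"
    done
}

# Example (20172 is the git-lfs pre-push PID from the listing above):
#   log_children 20172 60
```

A steadily growing count for the `git-lfs pre-push` PID would match the leak shown in the listing.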
Running the push with GIT_TRACE=1 shows the following log output being repeated hundreds of times. Checking the ps output at the same time suggests that this is also when the process leak occurs.
16:50:48.718919 trace git-lfs: exec: git '-c' 'filter.lfs.smudge=' '-c' 'filter.lfs.clean=' '-c' 'filter.lfs.process=' '-c' 'filter.lfs.required=false' 'remote'
16:50:48.730284 trace git-lfs: attempting pure SSH protocol connection
16:50:48.730638 trace git-lfs: run_command: ssh git@github.com git-lfs-transfer REDACTED.git upload
16:50:48.730797 trace git-lfs: exec: ssh 'git@github.com' 'git-lfs-transfer REDACTED.git upload'
16:50:49.708958 trace git-lfs: pure SSH protocol connection failed: Unable to negotiate version with remote side (unable to read capabilities): EOF
16:50:49.709737 trace git-lfs: exec: git '-c' 'filter.lfs.smudge=' '-c' 'filter.lfs.clean=' '-c' 'filter.lfs.process=' '-c' 'filter.lfs.required=false' 'remote'
16:50:49.720135 trace git-lfs: ssh cache: git@github.com git-lfs-authenticate REDACTED.git upload
16:50:49.720362 trace git-lfs: exec: git '-c' 'filter.lfs.smudge=' '-c' 'filter.lfs.clean=' '-c' 'filter.lfs.process=' '-c' 'filter.lfs.required=false' 'remote'
16:50:49.727622 trace git-lfs: exec: git '-c' 'filter.lfs.smudge=' '-c' 'filter.lfs.clean=' '-c' 'filter.lfs.process=' '-c' 'filter.lfs.required=false' 'remote'
16:50:49.733616 trace git-lfs: HTTP: POST https://lfs.github.com/REDACTED/locks/verify
16:50:49.903063 trace git-lfs: HTTP: 200
16:50:49.903217 trace git-lfs: HTTP: {"ours":[],"theirs":[],"next_cursor":""}
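To put a number on "repeated hundreds of times", the trace can be saved to a file and the failed attempts counted. The sketch below assumes the trace was captured from stderr (where `GIT_TRACE=1` writes it) into a file named `trace.log`; the file name and the function name `count_failed_ssh_attempts` are hypothetical.

```shell
# Count failed pure-SSH negotiation attempts in a GIT_TRACE log read from stdin.
# The pattern is taken verbatim from the trace output above.
count_failed_ssh_attempts() {
    grep -c 'pure SSH protocol connection failed'
}

# Example:
#   GIT_TRACE=1 git push --mirror 2>trace.log
#   count_failed_ssh_attempts < trace.log
```

Comparing this count with the number of leaked `(ssh)` entries in the ps output would show whether each failed attempt leaves one process behind.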
I've been following Git's docs for mirroring repos, which suggest doing the git push before the lfs push. However, while investigating this issue I found #4350 (comment), which recommends doing the LFS push first. And I can report that git lfs push does behave better: after spending a minute or two silently leaking 200 SSH processes, it has now moved on to the "Uploading LFS objects" stage, and that stage appears to be running correctly (no additional processes leaked after 1 hour, and data is being uploaded at a good rate). But I've still got a few hours to go before I'll know whether it's a complete success or not.
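The LFS-first ordering can be sketched as follows. This is an assumption-laden illustration: the report does not say which flags were passed to `git lfs push`, so `--all` is a guess, and the helper names `run` and `push_mirror_lfs_first` as well as the `DRY_RUN` variable are hypothetical.

```shell
# Run a command, or only print it when DRY_RUN is set (hypothetical helper).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Upload the LFS objects first, then mirror the refs to remote $1.
# `--all` is an assumption; the report does not state the flags used.
push_mirror_lfs_first() {
    run git lfs push --all "$1" && run git push --mirror "$1"
}

# Example:
#   DRY_RUN=1 push_mirror_lfs_first git@github.com:REDACTED.git
```

This only changes the order of the two pushes. As noted above, the LFS push still leaked around 200 processes before uploading, so it is a workaround at best.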
To Reproduce
1. Find a large repo which uses LFS and has some pointer files which are broken (no corresponding object in LFS)
2. Create a local mirror via `git clone --bare` & `git lfs fetch --all`
3. Create a new GitHub repo to mirror it to
4. With LFS 3.0.2 installed, attempt `git push --mirror`, and have it fail pushing the commits due to some non-LFS files being over GitHub's size limits
5. Upgrade to LFS 3.2.0 to avoid the problem described in "git lfs migrate import performance issues with version 3.0.2" (#4750)
6. Use `git lfs migrate` to move the big files into LFS
7. Attempt `git push --mirror`, and have it fail (in the pre-push hook?) halfway through uploading the LFS objects due to hitting one of the broken pointer files
8. Locate the bad pointer files and use `git filter-repo --strip-blobs-with-ids` to nuke them
9. Attempt `git push --mirror` again, have it fail due to the process leak
Maybe there's a simpler set of steps to follow, but that's how I ended up in the situation. Step 4 didn't upload any LFS objects, but step 7 did (difference in behaviour between 3.0.2 and 3.2.0? Missing pre-push hook? I'm not sure), and at step 9 the repo should still be empty in terms of regular git data.
Expected behavior
- `lfs pre-push` should give some progress indicator for what it's doing during that first hour before it fails (or it should be fixed to not take an hour)
- lfs shouldn't leak processes
System environment
- macOS 12.4 on an M1 mac mini
- git 2.32.1 (Apple Git-133)
- git-lfs 3.2.0 from homebrew (and initially 3.0.2 from homebrew)
credential.helper=osxkeychain
filter.lfs.clean=git-lfs clean -- %f
filter.lfs.smudge=git-lfs smudge -- %f
filter.lfs.process=git-lfs filter-process
filter.lfs.required=true
user.name=Jeffrey Lee
credential.helper=osxkeychain
core.editor=nano
core.excludesfile=/Users/jeffrey/.gitignore_global
pull.rebase=false
Output of git lfs env
git-lfs/3.2.0 (GitHub; darwin arm64; go 1.18.2)
git version 2.32.1 (Apple Git-133)
LocalWorkingDir=
LocalGitDir=
LocalGitStorageDir=
LocalMediaDir=lfs/objects
LocalReferenceDirs=
TempDir=lfs/tmp
ConcurrentTransfers=8
TusTransfers=false
BasicTransfersOnly=false
SkipDownloadErrors=false
FetchRecentAlways=false
FetchRecentRefsDays=7
FetchRecentCommitsDays=0
FetchRecentRefsIncludeRemotes=true
PruneOffsetDays=3
PruneVerifyRemoteAlways=false
PruneRemoteName=origin
LfsStorageDir=lfs
AccessDownload=none
AccessUpload=none
DownloadTransfers=basic,lfs-standalone-file,ssh
UploadTransfers=basic,lfs-standalone-file,ssh
GIT_EXEC_PATH=/Applications/Xcode.app/Contents/Developer/usr/libexec/git-core
git config filter.lfs.process = "git-lfs filter-process"
git config filter.lfs.smudge = "git-lfs smudge -- %f"
git config filter.lfs.clean = "git-lfs clean -- %f"