Commit 53614cf

nix-prefetch-git: fix determinism with leaveDotGit
Add more files to the delete list:

  * .git/FETCH_HEAD
  * .git/ORIG_HEAD
  * .git/refs/remotes/origin/HEAD
  * .git/config

Further, remove all remote branches, remove tags not reachable from the given 'rev', do a full repack and then garbage collect unreferenced objects.

According to my testing, the result is fully deterministic. As in "any change done to the upstream repo, ahead of 'rev', will not affect the hash of the resulting 'clone'". Even changing the clone URL will not change the output hash, because .git/config is removed. A new version of git can of course change store format, but that's unavoidable.

For big repositories, the repack operation may be a bit heavy. But as far as I can see there is no cheaper way to determinism.
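
One way to check the determinism claim by hand is to digest two checkouts and compare the results. The sketch below is only an illustration: `tree_digest` is a hypothetical helper, not part of nix-prefetch-git, and the digest it prints is not the NAR hash that Nix computes. It assumes GNU coreutils (`sha256sum`) and only covers regular files (paths and contents), not permissions or symlinks.

```bash
# Hypothetical helper: print one SHA-256 digest summarizing every regular
# file under "$1" (relative path plus content). This is a stand-in for
# illustration, not the NAR hash used by Nix.
tree_digest() {
    local dir="$1"
    (
        cd "$dir" || exit 1
        # Sort paths in the C locale so the order does not depend on the
        # filesystem or on the user's locale settings.
        find . -type f | LC_ALL=C sort | while IFS= read -r path; do
            sha256sum "$path"
        done | sha256sum | cut -d' ' -f1
    )
}
```

With two checkouts of the same 'rev' in the hypothetical directories `clone-a` and `clone-b`, `[ "$(tree_digest clone-a)" = "$(tree_digest clone-b)" ]` would succeed only if both trees, including `.git`, contain the same files with the same contents.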
1 parent 415f41b commit 53614cf

1 file changed

Lines changed: 40 additions & 3 deletions

pkgs/build-support/fetchgit/nix-prefetch-git

@@ -199,6 +199,43 @@ clone(){
     cd $top
 }
 
+# Remove all remote branches, remove tags not reachable from HEAD, do a full
+# repack and then garbage collect unreferenced objects.
+make_deterministic_repo(){
+    local repo="$1"
+
+    # run in sub-shell to not touch current working directory
+    (
+    cd "$repo"
+    # Remove files that contain timestamps or otherwise have non-deterministic
+    # properties.
+    rm -rf .git/logs/ .git/hooks/ .git/index .git/FETCH_HEAD .git/ORIG_HEAD \
+        .git/refs/remotes/origin/HEAD .git/config
+
+    # Remove all remote branches.
+    git branch -r | while read branch; do
+        git branch -rD "$branch" >&2
+    done
+
+    # Remove tags not reachable from HEAD. If we're exactly on a tag, don't
+    # delete it.
+    maybe_tag=$(git tag --points-at HEAD)
+    git tag --contains HEAD | while read tag; do
+        if [ "$tag" != "$maybe_tag" ]; then
+            git tag -d "$tag" >&2
+        fi
+    done
+
+    # Do a full repack, for determinism.
+    # Repack does not add unreferenced objects to a pack file.
+    git repack -A -d -f
+
+    # Garbage collect unreferenced objects.
+    git gc --prune=all
+    )
+}
+
+
 clone_user_rev() {
     local dir="$1"
     local url="$2"
@@ -227,9 +264,9 @@ clone_user_rev() {
         echo "removing \`.git'..." >&2
         find $dir -name .git\* | xargs rm -rf
     else
-        # The logs and index contain timestamps, and the hooks contain
-        # the nix path of git's bash
-        find $dir -name .git | xargs -I {} rm -rf {}/logs {}/index {}/hooks
+        find $dir -name .git | while read gitdir; do
+            make_deterministic_repo "$(readlink -f "$gitdir/..")"
+        done
     fi
 }
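
A note on the tag-pruning loop in make_deterministic_repo: it compares each tag against `$maybe_tag` as one string. `git tag --points-at HEAD` prints one tag per line, so if several tags point at HEAD the variable holds several lines and, as far as I can tell, no single tag would equal it. A line-by-line filter avoids that. The sketch below is not part of this commit; `tags_to_delete` is a hypothetical helper, kept free of git calls so it can be tried on plain text.

```bash
# Hypothetical helper: read candidate tag names on stdin, one per line, and
# print those that do not appear as a whole line in the keep list "$1"
# (a newline-separated string).
tags_to_delete() {
    local keep="$1"
    local tag
    while IFS= read -r tag; do
        # -F: fixed string, -x: match the whole line, -q: no output.
        if ! printf '%s\n' "$keep" | grep -Fxq -- "$tag"; then
            printf '%s\n' "$tag"
        fi
    done
}

# Possible use inside the repository (an assumption, not run here):
#   git tag --contains HEAD | tags_to_delete "$(git tag --points-at HEAD)" |
#       while IFS= read -r tag; do git tag -d "$tag" >&2; done
```

Whether the extra robustness is needed depends on how often fetched repositories carry more than one tag on the requested revision.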
