Skip to content

Add LFS fsck feature#280

Closed
christophershirk wants to merge 13 commits intogit-lfs:masterfrom
christophershirk:master
Closed

Add LFS fsck feature#280
christophershirk wants to merge 13 commits intogit-lfs:masterfrom
christophershirk:master

Conversation

@christophershirk
Copy link

This is a first-cut of the LFS fsck command to address issue #262.

I wanted to put this out there for review before adding stuff like on-the-fly status completion updates (which would be useful given the possibility of LFS fsck hashing gigs of media content). I'm curious whether this is a good approach to the fsck problem.

After a fresh git clone and running script/bootstrap, I encountered a ton of errors in the unit tests for the commands package, so the changes you see in the non fsck unit tests were to fix those issues on my machine.

Example of the unit test errors I encountered:

--- FAIL: TestUpdateWithOldPrePushHook_2 (0.04s)
    update_test.go:174: Error writing pre-push in Before(): open /private/var/folders/0t/n69_kx414lg481jhdyf8n6xh0000gn/T/git-lfs-tests/empty/.git/hooks/pre-push: no such file or directory

@christophershirk
Copy link
Author

Not sure why TestFsck() can't os.Stat() the LFS object file on the CI machine.
note: the next commit e934a82 is an unrelated typo fix.

@technoweenie technoweenie mentioned this pull request May 4, 2015
38 tasks
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're not exposing any of this, why not have something like:

func fsckError(name, oid string) error {
  return fmt.Errorf("Object %s (%s) is corrupt", name, oid)
}

@technoweenie
Copy link
Contributor

do we want to look for LFS stuff in past commits?

I think there are at least 4 modes that we could support.

  1. Scan HEAD + Index (default)
  2. Scan a commit by name. git lfs fsck --commit abcdef
  3. Scan .git/lfs/objects git lfs fsck --all
  4. Scan EVERYTHING.

I think we can get away with 1-3. Scanning the entire git history isn't interesting unless you have everything downloaded already.

@christophershirk
Copy link
Author

Thanks for the comments, @technoweenie . I'll make the changes. Before I add support for different modes, I'm going to push some temporary debug code to diagnose the os.Stat() error.

@technoweenie
Copy link
Contributor

I'm going to push some temporary debug code to diagnose the os.Stat() error.

Use the testing package's Logf(). They only display on test runs that have the error, so there's no harm leaving them in after you fix the bug.

@christophershirk
Copy link
Author

Alright, I'm stuck. I'm not sure why this works on my machine and not on the CI box, and the test is simple, so perhaps someone else could take a quick look.

TestFsck() is failing to find the LFS object corresponding to a LFS file after a commit. I have test code that recursively prints out the repo dir, and on the CI machine, there are simply no LFS objects under .git/lfs/objects. On my local machine, of course, everything works as expected.

If someone can help figure this out, that'd be awesome.

Additionally, I'd like to refactor TestFsck() a bit, but first I'd need to get clarification on how to run a bunch of commands on a TestRepository sequentially without the repo state getting cleared. Currently -- and several other unit tests like TestCleanPointer depend on this behavior -- func (r *Repository) test(path string)() invokes repo.clone() before each testCommand Run(), so it isn't possible.

@technoweenie
Copy link
Contributor

Ugh, sorry about the sad state of integration testing here. I have some ideas on making it better.

The reason it's failing is because Git LFS isn't installed. You can replicate it locally by removing the clean filter:

$ git config --global --unset filter.lfs.clean

I'm going to update the test to manually write a bad file to .git/lfs/objects.

@technoweenie
Copy link
Contributor

I have my own branch that uses your commits up until you started all those debug commits. I also changed the test so that it corrupts the file in the before callback. Once the test passes, I'd add a check in the after callback that ensures the file is gone.

I want to change the fsck behavior so that it just prints the errors to STDOUT without logging an error. We can probably just delete the corrupt object, and expect the client to download it on demand from the server.

$ git lfs fsck
Could not fsck Git LFS files:

Object a.dat (916f0027a575074ce72a331777c3478d6513f786a591bd892da1a577bf2335f9) is corrupt

@technoweenie technoweenie mentioned this pull request May 8, 2015
@technoweenie
Copy link
Contributor

I did some more work and opened a PR: #290. Still can't get the tests to pass though.

@technoweenie
Copy link
Contributor

Also, I'm going to make #291 a priority of mine. Your testing experience in this PR is unacceptable.

@christophershirk
Copy link
Author

Thanks for picking this up and taking on the testing issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants