Skip to content

Reference operations with log#1920

Merged
vmg merged 12 commits intodevelopmentfrom
cmn/ref-with-log
Dec 18, 2013
Merged

Reference operations with log#1920
vmg merged 12 commits intodevelopmentfrom
cmn/ref-with-log

Conversation

@carlosmn
Copy link
Member

This is basically @nulltoken 's #1577 but on top of the new reflog API which @vmg wanted to see. We don't need to load the reflog ourselves before writing it back out (though it still happens in the background)

@vmg
Copy link
Member

vmg commented Oct 30, 2013

You committed a pretty big Makefile under examples. :)

@carlosmn
Copy link
Member Author

Damn, CMake keeps overwriting that every time I need to update the file list. Think of the 20 times I didn't commit it, instead of the one I did :)

@vmg
Copy link
Member

vmg commented Oct 30, 2013

The implementation looks correct, but it races. The locking behavior of Git takes into account both the loose ref file and the reflog, but I don't think this can be done atomically unless we unify reflog and refdb access. I guess there's an interdependence between those two PRS... Hm.

@carlosmn
Copy link
Member Author

It does race (and so does reading and appending). The only way out of that would be to make the backend know about the combined operations (we could probably simply add a message param to write and rename and update the reflog if it's non-NULL)

@carlosmn
Copy link
Member Author

For that matter, we should probably also add append to the backend reflog operations so we don't race with the read.

@carlosmn carlosmn mentioned this pull request Nov 1, 2013
@carlosmn
Copy link
Member Author

This is getting there. Writing a ref now forces you to write a reflog entry so we match git. That appending happens as a proper append, but that still needs to be exposed to the outside.

Stash is still racy, as it uses git reflog delete --updateref --rewrite which is some seriously messed up way of removing stuff from the reflog.

@carlosmn
Copy link
Member Author

Turns out it's not getting that close. This would currently forbit HEAD from ever getting updated, which would be bad. However, it's the only symref that actually gets to do this...

It also turns out that not all refs get a reflog, even if you ask nicecly (update-ref -m)... Only HEAD, refs/heads/ and refs/remotes/. This is going to require some inventiveness.

@carlosmn
Copy link
Member Author

So, I think this is ready. Stash is still racy, but fixing that requires transactions and that's a whole different thing which will needs its own PR.

Something I still have doubts about is what we do when there is no default signature configured. git-core will do its magical guessing, but we don't do that (nor do we want to), so we go for "unknown ". This situation is bound to be rather rare, but maybe we want to figure out something else to do?

Whenever a reference is created or updated, we need to write to the
reflog regardless of whether the user gave us a message, so we shouldn't
leave that to the ref frontend, but integrate it into the backend.

This also eliminates the race between ref update and writing to the
reflog, as we protect the reflog with the ref lock.

As an additional benefit, this reflog append on the backend happens by
appending to the file instead of parsing and rewriting it.
This was a convenience method for the refs front-end to do the reflog
writing. This is now done in the backend and it has no more reason for
being.
@carlosmn
Copy link
Member Author

Is there anything left to fix in the scope of this PR? Once this is merge we can more easily work on conditional updates and transactions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@return? From the name I'm kind of expecting a true/false type, but the doc-string makes me think this has side effects, so it could be an error code?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then maybe "ensure" isn't a good name. What this is meant to do is to force the backend to write a log for the given reference. The return is the usual "0 or error code". I'll add that. Do you have a better name for this maybe?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, whoops. My eyes wandered up to git_reference_has_log, then I read this doc-comment, and I associated the two.

So the only nitpick I have is the missing @return. Everything else is fine. 😊

@ben
Copy link
Member

ben commented Dec 3, 2013

Super nitpicky things. I 😄 the new APIs, and I 😍 having test coverage.

git-core only writes to the reflogs of HEAD, refs/heads/ and,
refs/notes/ or if there is already a reflog in place. Adjust our code to
follow these semantics.
Sometimes (e.g. stash) we want to make sure that a log will be written,
even if it's not in one of the standard locations. Let's make that
easier.
The frontend used to look at the file directly, but that's obviously not
the right thing to do. Expose it on the backend and use that function
instead.
@carlosmn
Copy link
Member Author

carlosmn commented Dec 9, 2013

Nits un-picked (or something). The test coverage is obviously @nulltoken 's doing.

vmg pushed a commit that referenced this pull request Dec 18, 2013
@vmg vmg merged commit 4e1f517 into development Dec 18, 2013
@carlosmn carlosmn deleted the cmn/ref-with-log branch December 18, 2013 17:49
@jspahrsummers
Copy link
Contributor

From @nulltoken in #1577 (posting here to keep conversation in one place):

Actually @vmg was rather in favor of exposing a set of separate functions and keep git_reference_create() and git_reference_symbolic_create() as they are.

What's the rationale for that? The plain variants still write to the reflog—just without any useful information—so a couple of optional parameters would make more sense to me.

As a maintainer of a bindings library, breaking changes (like adding parameters) here would actually be really helpful, because I was having trouble figuring out why the reflog was getting populated with junk.

@carlosmn
Copy link
Member Author

At the git_reference_ level we're providing something similar to git update-ref and that allows you to update a reference without having to specify a reflog message.

Failing to compile would certainly provide a strong message that something has changed, but the basic call should look like the one we were already providing, and every reference update needs to have a corresponding reflog update which is performed under the reference lock, so this is fixing a bug you were working around in a racy manner.

We should figure out a way to broadcast these type of changes to users of the library.

@jspahrsummers
Copy link
Contributor

At the git_reference_ level we're providing something similar to git update-ref and that allows you to update a reference without having to specify a reflog message.

Unless something's changed since I looked at the code, it still creates a reflog entry, just without a message.

That's unnecessarily confusing to me—if an entry will always be written, I'd say there should always be a message parameter. It could be NULL if the caller doesn't have a specific message, but having such a parameter will help set expectations about the function's behavior.

@carlosmn
Copy link
Member Author

That's right, it will create a reflog entry with no message unless you provide one. This is how update-ref works.

A well-behaved application should provide a message, but Git doesn't require it. It does want to have an entry for every change.

@arrbee
Copy link
Member

arrbee commented Jan 14, 2014

I'm inclined to agree with @jspahrsummers here. Given that the only difference between these APIs is that one takes a couple of optional extra parameters, why do we even need separate APIs? It's easy enough to just pass NULLs for the optional values if you want to use the default / empty values and it would reduce the number of APIs.

@carlosmn
Copy link
Member Author

We can change it and make it more awkward in C not to pass the reflog data. It would diverge from what we've been doing, though it might not necessarily be a bad thing. I have cursed the amount of functions when doing other work on the refs.

So if there's some consensus that we should remove the {,_with_log} differentiation, I'll send a PR. It would make some of my other code easier.

@ethomson
Copy link
Member

if there's some consensus that we should remove the {,_with_log} differentiation

I'll +1 this.

@ben ben mentioned this pull request Feb 24, 2014
34 tasks
phatblat pushed a commit to phatblat/libgit2 that referenced this pull request Sep 13, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants