merge-base: Remove redundant merge bases #3492

Merged: vmg merged 5 commits into master from vmg/redundant on Nov 2, 2015

vmg commented Oct 30, 2015

Member

Following up on the Famous Case Of The Extra Merge Base. Comparing our runtime vs Git's, it turns out we never implemented a second optimization pass after finding the merge bases:

https://github.com/git/git/blob/80980a1d5c2678ab9031d7c60faf38b9631eb1ce/commit.c#L900-L954

Some commit in the array may be an ancestor of another commit. Move such commit to the end of the array, and return the number of commits that are independent from each other.

So, here's my implementation, ported mostly verbatim from Git. I'm aware tests are missing, but it's not immediately obvious to me how to reproduce the "minimal test case" here (I haven't put much thought into it to be fair). Any assistance on that would be welcome.
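
For readers who want the idea of that pass in isolation, here is a minimal Python sketch. It is not the code from this PR or from Git: Git's version paints the graph per commit, while this sketch simply computes full ancestor sets, and the names `ancestors`, `remove_redundant` (as a Python function), `history` and the commits `A`, `B`, `C` are all hypothetical. It assumes the candidate list contains no duplicates.

```python
from typing import Dict, List, Set


def ancestors(parents: Dict[str, List[str]], start: str) -> Set[str]:
    """Return every commit reachable from `start` via parent links, excluding `start`."""
    seen: Set[str] = set()
    stack = list(parents.get(start, []))
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents.get(commit, []))
    return seen


def remove_redundant(parents: Dict[str, List[str]], commits: List[str]) -> int:
    """Move commits that are ancestors of another candidate to the end of
    `commits` (in place) and return how many independent commits remain."""
    candidates = set(commits)
    redundant: Set[str] = set()
    for commit in commits:
        # Any candidate that is an ancestor of another candidate is redundant.
        redundant |= ancestors(parents, commit) & candidates
    independent = [c for c in commits if c not in redundant]
    commits[:] = independent + [c for c in commits if c in redundant]
    return len(independent)


# Hypothetical history: A is the parent of both B and C.
history = {"A": [], "B": ["A"], "C": ["A"]}
bases = ["A", "B", "C"]
count = remove_redundant(history, bases)
print(count, bases)  # 2 ['B', 'C', 'A']
```

The caller would then keep only the first `count` entries of the list as the reported merge bases.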

cc @carlosmn @ethomson @peff

Comment thread src/merge.c Outdated

Member

This makes it sound like there are redundant entries every time we have more than one merge base, which is not what we're doing here. If there are multiple merge bases, we may have some redundant ones.

@carlosmn

Member

Looks fine; as far as testing goes, were you able to identify what was special about the repository in which this happened? Maybe we can recreate part of the graph which made us have redundant commits.

vmg commented Oct 30, 2015

Member Author

Waiting for some feedback from @peff too.

vmg commented Oct 30, 2015

Member Author

> Looks fine; as far as testing goes, were you able to identify what was special about the repository in which this happened? Maybe we can recreate part of the graph which made us have redundant commits.

Nope, the history on that repo is quite dense. I could use the whole anonymized repo as the test case, but that'd be a tad too large I think.

@ethomson

Member

Yeah, even if we can't use the exact repo that you found, I would like to be able to have a test repo that shows what a redundant merge-base is...

Though I think to create such a beast, we'll have to understand WTF is actually going on here, which I certainly don't yet.

peff commented Oct 30, 2015

Member

I looked at the graph of the repo that triggered this. What happens to cause the redundant base is that you have a merge on one side that "straddles" a merge on the other. Here's a picture:

       *--*--S
      /     /
--*--F--*--R---M
         \    /
          *--*

F = fork point
R = re-merge point
S = side branch tip
M = master tip

Here we have a side branch S that has forked from master at fork point F. It then remerged later from master at R. Meanwhile, master itself had its own unrelated merges, one of which straddles the re-merge.

Imagine we want to find the merge base of the side branch and master. We walk backwards from M and S. Obviously we'll find R, but we still need to follow up S^1 and M^2 to see if they meet. They do, at F.

In a true criss-cross merge, F would not be an ancestor of R, and we would truly have two merge bases (which is why we need to follow up even after finding R). But in this case, one is an ancestor of the other, and can therefore be removed.

I drew the diagram above by hand. It does represent the situation in the repo we found, but I didn't actually recreate the simplified version and test it. But it should be pretty straightforward to make a libgit2 test case out of it.
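
The ancestry claims about the diagram can be checked with a small Python sketch. This only encodes one reading of the hand-drawn graph (the lower branch forking after the commit between F and R) and tests reachability; it says nothing about the traversal order inside libgit2 or Git. The names `root`, `a1`, `a2`, `m1`, `b1`, `b2` are hypothetical labels for the unlabeled `*` commits, and `reachable` is a helper made up for this example.

```python
# Parent map for the diagram; the first parent is listed first.
PARENTS = {
    "root": [],
    "F": ["root"],
    "a1": ["F"],
    "a2": ["a1"],
    "m1": ["F"],
    "R": ["m1"],
    "S": ["a2", "R"],   # S^1 = a2 (side branch), S^2 = R (re-merge from master)
    "b1": ["m1"],
    "b2": ["b1"],
    "M": ["R", "b2"],   # M^1 = R, M^2 = b2 (the merge that straddles R)
}


def reachable(start):
    """Return the set of commits reachable from `start`, including `start`."""
    seen = set()
    stack = [start]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(PARENTS[commit])
    return seen


# Every commit that both tips can reach is a common ancestor.
common = reachable("S") & reachable("M")

# A common ancestor is a "best" base only if no other common ancestor reaches it.
best = {
    c for c in common
    if not any(c != other and c in reachable(other) for other in common)
}

print(sorted(common))  # ['F', 'R', 'm1', 'root']
print(sorted(best))    # ['R']
```

Both F and R are common ancestors of S and M, but F is reachable from R, so under this reading only R survives as a merge base.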

peff commented Oct 30, 2015

Member

I don't see anything obviously wrong with the code itself, but I am not that familiar with the libgit2 code. I'm fairly sure that failing to remove the redundant bases is the source of the problem we found, though. So at least the intent of this PR is correct. :)

vmg commented Nov 2, 2015

Member Author

So, I couldn't make heads or tails of @peff's ASCII diagram, but running the original repository through the anonymizer and manually pruning some extra branches gave a pretty small reproduction case that seems to work really well. I've committed it to the resources folder and added the corresponding test.

I think this is ready to merge now.

vmg pushed a commit that referenced this pull request Nov 2, 2015
merge-base: Remove redundant merge bases
vmg merged commit 1318ec9 into master Nov 2, 2015

peff commented Nov 2, 2015

Member

@vmg Hmm. I did recreate my diagram as a git repository, but running it in git, I noticed that it didn't actually trigger remove_redundant. So it's entirely possible that my analysis of the root cause here was wrong. Or it may simply be that there are some vagaries in paint_down_to_common based on the exact number of commits between each event, and especially on the order we visit them based on commit timestamps (we always end up with the right answer, but traversal order sometimes matters for things like early-cutoff behavior).

I'd give it 50/50 odds on one explanation versus the other. I don't think it's worth spending a lot of time trying to recreate the minimal test case, though. That would give me a warm fuzzy feeling, but since you have a working anonymized case, that's probably enough.

ethomson deleted the vmg/redundant branch January 9, 2019 10:20