Skip to content
Permalink

Comparing changes

Choose two branches to see what’s changed or to start a new pull request. If you need to, you can also or learn more about diff comparisons.

Open a pull request

Create a new pull request by comparing changes across two branches. If you need to, you can also . Learn more about diff comparisons here.
base repository: git/git
Failed to load repositories. Confirm that selected base ref is valid, then try again.
Loading
base: b3b2aaf0fd
Choose a base ref
...
head repository: git/git
Failed to load repositories. Confirm that selected head ref is valid, then try again.
Loading
compare: 33286dcd6d
Choose a head ref
  • 12 commits
  • 11 files changed
  • 1 contributor

Commits on May 2, 2018

  1. ref-filter: fix outdated comment on in_commit_list

    The in_commit_list() method does not check the parents of
    the candidate for containment in the list. Fix the comment
    that incorrectly states that it does.
    
    Reported-by: Jakub Narebski <jnareb@gmail.com>
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    derrickstolee authored and gitster committed May 2, 2018
    Configuration menu
    Copy the full SHA
    8fb572a View commit details
    Browse the repository at this point in the history

Commits on May 22, 2018

  1. commit: add generation number to struct commit

    The generation number of a commit is defined recursively as follows:
    
    * If a commit A has no parents, then the generation number of A is one.
    * If a commit A has parents, then the generation number of A is one
      more than the maximum generation number among the parents of A.
    
    Add a uint32_t generation field to struct commit so we can pass this
    information to revision walks. We use three special values to signal
    the generation number is invalid:
    
    GENERATION_NUMBER_INFINITY 0xFFFFFFFF
    GENERATION_NUMBER_MAX 0x3FFFFFFF
    GENERATION_NUMBER_ZERO 0
    
    The first (_INFINITY) means the generation number has not been loaded or
    computed. The second (_MAX) means the generation number is too large to
    store in the commit-graph file. The third (_ZERO) means the generation
    number was loaded from a commit graph file that was written by a version
    of git that did not support generation numbers.
    
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    derrickstolee authored and gitster committed May 22, 2018
    Configuration menu
    Copy the full SHA
    83073cc View commit details
    Browse the repository at this point in the history
  2. commit-graph: compute generation numbers

    While preparing commits to be written into a commit-graph file, compute
    the generation numbers using a depth-first strategy.
    
    The only commits that are walked in this depth-first search are those
    without a precomputed generation number. Thus, computation time will be
    relative to the number of new commits to the commit-graph file.
    
    If a computed generation number would exceed GENERATION_NUMBER_MAX, then
    use GENERATION_NUMBER_MAX instead.
    
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    derrickstolee authored and gitster committed May 22, 2018
    Configuration menu
    Copy the full SHA
    3258c66 View commit details
    Browse the repository at this point in the history
  3. commit: use generations in paint_down_to_common()

    Define compare_commits_by_gen_then_commit_date(), which uses generation
    numbers as a primary comparison and commit date to break ties (or as a
    comparison when both commits do not have computed generation numbers).
    
    Since the commit-graph file is closed under reachability, we know that
    all commits in the file have generation at most GENERATION_NUMBER_MAX
    which is less than GENERATION_NUMBER_INFINITY.
    
    This change does not affect the number of commits that are walked during
    the execution of paint_down_to_common(), only the order that those
    commits are inspected. In the case that commit dates violate topological
    order (i.e. a parent is "newer" than a child), the previous code could
    walk a commit twice: if a commit is reached with the PARENT1 bit, but
    later is re-visited with the PARENT2 bit, then that PARENT2 bit must be
    propagated to its parents. Using generation numbers avoids this extra
    effort, even if it is somewhat rare.
    
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    derrickstolee authored and gitster committed May 22, 2018
    Configuration menu
    Copy the full SHA
    3afc679 View commit details
    Browse the repository at this point in the history
  4. commit-graph: always load commit-graph information

    Most code paths load commits using lookup_commit() and then
    parse_commit(). In some cases, including some branch lookups, the commit
    is parsed using parse_object_buffer() which side-steps parse_commit() in
    favor of parse_commit_buffer().
    
    With generation numbers in the commit-graph, we need to ensure that any
    commit that exists in the commit-graph file has its generation number
    loaded.
    
    Create new load_commit_graph_info() method to fill in the information
    for a commit that exists only in the commit-graph file. Call it from
    parse_commit_buffer() after loading the other commit information from
    the given buffer. Only fill this information when specified by the
    'check_graph' parameter.
    
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    derrickstolee authored and gitster committed May 22, 2018
    Configuration menu
    Copy the full SHA
    e2838d8 View commit details
    Browse the repository at this point in the history
  5. ref-filter: use generation number for --contains

    A commit A can reach a commit B only if the generation number of A
    is strictly larger than the generation number of B. This condition
    allows significantly short-circuiting commit-graph walks.
    
    Use generation number for '--contains' type queries.
    
    On a copy of the Linux repository where HEAD is contained in v4.13
    but no earlier tag, the command 'git tag --contains HEAD' had the
    following peformance improvement:
    
    Before: 0.81s
    After:  0.04s
    Rel %:  -95%
    
    Helped-by: Jeff King <peff@peff.net>
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    derrickstolee authored and gitster committed May 22, 2018
    Configuration menu
    Copy the full SHA
    819807b View commit details
    Browse the repository at this point in the history
  6. commit: use generation numbers for in_merge_bases()

    The containment algorithm for 'git branch --contains' is different
    from that for 'git tag --contains' in that it uses is_descendant_of()
    instead of contains_tag_algo(). The expensive portion of the branch
    algorithm is computing merge bases.
    
    When a commit-graph file exists with generation numbers computed,
    we can avoid this merge-base calculation when the target commit has
    a larger generation number than the initial commits.
    
    Performance tests were run on a copy of the Linux repository where
    HEAD is contained in v4.13 but no earlier tag. Also, all tags were
    copied to branches and 'git branch --contains' was tested:
    
    Before: 60.0s
    After:   0.4s
    Rel %: -99.3%
    
    Reported-by: Jeff King <peff@peff.net>
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    derrickstolee authored and gitster committed May 22, 2018
    Configuration menu
    Copy the full SHA
    f9b8908 View commit details
    Browse the repository at this point in the history
  7. commit: add short-circuit to paint_down_to_common()

    When running 'git branch --contains', the in_merge_bases_many()
    method calls paint_down_to_common() to discover if a specific
    commit is reachable from a set of branches. Commits with lower
    generation number are not needed to correctly answer the
    containment query of in_merge_bases_many().
    
    Add a new parameter, min_generation, to paint_down_to_common() that
    prevents walking commits with generation number strictly less than
    min_generation. If 0 is given, then there is no functional change.
    
    For in_merge_bases_many(), we can pass commit->generation as the
    cutoff, and this saves time during 'git branch --contains' queries
    that would otherwise walk "around" the commit we are inspecting.
    
    For a copy of the Linux repository, where HEAD is checked out at
    v4.13~100, we get the following performance improvement for
    'git branch --contains' over the previous commit:
    
    Before: 0.21s
    After:  0.13s
    Rel %: -38%
    
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    derrickstolee authored and gitster committed May 22, 2018
    Configuration menu
    Copy the full SHA
    d7c1ec3 View commit details
    Browse the repository at this point in the history
  8. commit: use generation number in remove_redundant()

    The static remove_redundant() method is used to filter a list
    of commits by removing those that are reachable from another
    commit in the list. This is used to remove all possible merge-
    bases except a maximal, mutually independent set.
    
    To determine these commits are independent, we use a number of
    paint_down_to_common() walks and use the PARENT1, PARENT2 flags
    to determine reachability. Since we only care about reachability
    and not the full set of merge-bases between 'one' and 'twos', we
    can use the 'min_generation' parameter to short-circuit the walk.
    
    When no commit-graph exists, there is no change in behavior.
    
    For a copy of the Linux repository, we measured the following
    performance improvements:
    
    git merge-base v3.3 v4.5
    
    Before: 234 ms
     After: 208 ms
     Rel %: -11%
    
    git merge-base v4.3 v4.5
    
    Before: 102 ms
     After:  83 ms
     Rel %: -19%
    
    The experiments above were chosen to demonstrate that we are
    improving the filtering of the merge-base set. In the first
    example, more time is spent walking the history to find the
    set of merge bases before the remove_redundant() call. The
    starting commits are closer together in the second example,
    therefore more time is spent in remove_redundant(). The relative
    change in performance differs as expected.
    
    Reported-by: Jakub Narebski <jnareb@gmail.com>
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    derrickstolee authored and gitster committed May 22, 2018
    Configuration menu
    Copy the full SHA
    04bc8d1 View commit details
    Browse the repository at this point in the history
  9. merge: check config before loading commits

    Now that we use generation numbers from the commit-graph, we must
    ensure that all commits that exist in the commit-graph are loaded
    from that file instead of from the object database. Since the
    commit-graph file is only checked if core.commitGraph is true, we
    must check the default config before we load any commits.
    
    In the merge builtin, the config was checked after loading the HEAD
    commit. This was due to the use of the global 'branch' when checking
    merge-specific config settings.
    
    Move the config load to be between the initialization of 'branch' and
    the commit lookup.
    
    Without this change, a fast-forward merge would hit a BUG("bad
    generation skip") statement in commit.c during paint_down_to_common().
    This is because the HEAD commit would be loaded with "infinite"
    generation but then reached by commits with "finite" generation
    numbers.
    
    Add a test to t5318-commit-graph.sh that exercises this code path to
    prevent a regression.
    
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    derrickstolee authored and gitster committed May 22, 2018
    Configuration menu
    Copy the full SHA
    7adf526 View commit details
    Browse the repository at this point in the history
  10. commit-graph.txt: update design document

    We now calculate generation numbers in the commit-graph file and use
    them in paint_down_to_common().
    
    Expand the section on generation numbers to discuss how the three
    special generation numbers GENERATION_NUMBER_INFINITY, _ZERO, and
    _MAX interact with other generation numbers.
    
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    derrickstolee authored and gitster committed May 22, 2018
    Configuration menu
    Copy the full SHA
    1472978 View commit details
    Browse the repository at this point in the history
  11. commit-graph: fix UX issue when .lock file exists

    We use the lockfile API to avoid multiple Git processes from writing to
    the commit-graph file in the .git/objects/info directory. In some cases,
    this directory may not exist, so we check for its existence.
    
    The existing code does the following when acquiring the lock:
    
    1. Try to acquire the lock.
    2. If it fails, try to create the .git/object/info directory.
    3. Try to acquire the lock, failing if necessary.
    
    The problem is that if the lockfile exists, then the mkdir fails, giving
    an error that doesn't help the user:
    
      "fatal: cannot mkdir .git/objects/info: File exists"
    
    While technically this honors the lockfile, it does not help the user.
    
    Instead, do the following:
    
    1. Check for existence of .git/objects/info; create if necessary.
    2. Try to acquire the lock, failing if necessary.
    
    The new output looks like:
    
      fatal: Unable to create
      '<dir>/.git/objects/info/commit-graph.lock': File exists.
    
      Another git process seems to be running in this repository, e.g.
      an editor opened by 'git commit'. Please make sure all processes
      are terminated then try again. If it still fails, a git process
      may have crashed in this repository earlier:
      remove the file manually to continue.
    
    Helped-by: Jeff King <peff@peff.net>
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    derrickstolee authored and gitster committed May 22, 2018
    Configuration menu
    Copy the full SHA
    33286dc View commit details
    Browse the repository at this point in the history
Loading