As a full-stack developer working on large projects, you make frequent commits to the git repository to save progress. However, mistakes happen – you may commit sensitive keys, large unwanted binaries, or experimental work that convolutes the commit history. Thankfully, git provides developers several ways to rewrite commit history by deleting unnecessary or incorrect commits.

In this comprehensive guide, we will cover different methods to remove commits from git history, with code examples and best practices based on my decade of experience managing developer teams.

Why Removing Commits is Required Sometimes

Here are the most common reasons from my experience for needing to remove Git commits:

  1. Accidentally committed sensitive data, such as API keys or passwords, that should never be in version history
  2. A commit introduces severe bugs that impact other developers, and it's best to remove it entirely
  3. Commit history filled with large, unnecessary binary files that make the repo huge
  4. Old experimental commits clutter up primary development history
  5. Force push issues when multiple developers work on the same branch

As a lead developer, I frequently need to coordinate history rewrites to handle such cases and train my teams on best practices to avoid such scenarios.

Removing problematic commits allows the team to rewrite project history for a cleaner, usable commit timeline. But as we will cover later, it requires force pushing branches which can disrupt other developers if not handled properly.

Removing Recent Commits

If you need to remove the most recent one or few commits, it can be easily done using git reset.

The key commands are:

# Remove last commit
git reset --hard HEAD~1  

# Remove latest 3 commits 
git reset --hard HEAD~3

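This removes the specified number of recent commits from the branch and discards the corresponding changes, along with any uncommitted changes, from your working tree. The dropped commits stay recoverable through git reflog for a limited time, but uncommitted work is lost for good.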
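To see what a hard reset does to a branch, here is a minimal sketch that runs entirely in a throwaway repository. The file names and commit messages are made up for illustration. It also shows that the old tip is still reachable through the reflog entry HEAD@{1} right after the reset.

```shell
set -e
# Throwaway repository so nothing real is touched (all names are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# Create three small commits.
for n in 1 2 3; do
  echo "change $n" > "file$n.txt"
  git add "file$n.txt"
  git commit -q -m "Commit $n"
done

before=$(git rev-parse HEAD)   # remember the tip before resetting

# Drop the most recent commit together with its working-tree changes.
git reset -q --hard HEAD~1
after_reset=$(git rev-list --count HEAD)

# HEAD@{1} is the reflog entry for where HEAD pointed before the reset.
git reset -q --hard 'HEAD@{1}'
after_restore=$(git rev-list --count HEAD)

echo "after reset: $after_reset commits, after restore: $after_restore commits"
```

Uncommitted changes have no reflog entry, so commit or stash them before running a hard reset.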

If those commits were already pushed, you then need to force push your branch to overwrite the remote history and remove them from the remote repository.

# Push forcefully to remote 
git push origin <branch> --force

However, in teams, a force push can cause disruption if the commits have already been pulled by other developers. I strictly advise my team members to avoid force pushing to shared branches without confirmation.

Here is an example workflow to safely delete the latest 2 commits locally and remotely:

git log --oneline

# Latest commits  
56fa35f Fixed login bug
934jkfd Added registration form  

git reset --hard HEAD~2

# Ask team if anyone has latest commits

git push origin my-branch --force

This reset followed by a force push deletes the commits from the remote branch's history; teammates who already pulled them still need to reset their local branches.
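A somewhat safer variant of the same workflow uses git push --force-with-lease, which refuses to overwrite the remote branch if someone else has pushed to it since your last fetch. The sketch below is only an illustration: it uses a local bare repository as a stand-in for the remote, and the paths and the branch name my-branch are placeholders.

```shell
set -e
# Everything lives in a temp directory; origin.git stands in for a real remote.
base=$(mktemp -d)
git init -q --bare "$base/origin.git"
git clone -q "$base/origin.git" "$base/work" 2>/dev/null
cd "$base/work"
git config user.name "Demo User"
git config user.email "demo@example.com"
git checkout -q -b my-branch

# Three commits, then publish the branch.
for msg in "Base work" "Added registration form" "Fixed login bug"; do
  echo "$msg" >> notes.txt
  git add notes.txt
  git commit -q -m "$msg"
done
git push -q origin my-branch

# Remove the latest two commits locally.
git reset -q --hard HEAD~2

# Overwrite the remote branch, but only if it has not moved since our last fetch.
git push -q --force-with-lease origin my-branch

local_tip=$(git rev-parse HEAD)
remote_tip=$(git --git-dir="$base/origin.git" rev-parse refs/heads/my-branch)
echo "local:  $local_tip"
echo "remote: $remote_tip"
```

If a teammate had pushed in the meantime, the second push would be rejected instead of silently discarding their work.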

Interactive Rebase to Remove Multiple Commits

While resetting erases recent commit history, surgically removing older commits requires an interactive git rebase.

Rebasing lets you manually alter commit history: git re-creates the commits you keep on top of the chosen base and leaves out the ones you omit.

Here is the standard workflow I follow to cleanly remove targeted commit(s):

  1. Use git log to determine exact commit IDs to remove
  2. Start interactive rebase from earlier safe commit point
  3. Mark commits as 'drop' to exclude from newly written history
  4. Handle merge conflicts if necessary
  5. Force push updated history to remote branch

Let's see this in action to remove two unwanted commits:

git log --oneline

# Commits to remove (git log lists the newest first)
934jkfd Added registration form
56fa35f Fixed login bug

# Earlier commits to keep
cdd234f Fixed search bug
67544fd Added subscriptions page
ad2234f Added user profile
f98asd3 Updated home page

Start an interactive rebase:

git rebase -i ad2234f

My editor opens with a file listing commits to apply in order:

pick 67544fd Added subscriptions page 
pick cdd234f Fixed search bug
pick 56fa35f Fixed login bug
pick 934jkfd Added registration form 

# Rebase ad2234f..934jkfd onto ad2234f

I replace the word pick with drop on the lines for commits 56fa35f and 934jkfd:

pick 67544fd Added subscriptions page
pick cdd234f Fixed search bug 
drop 56fa35f Fixed login bug
drop 934jkfd Added registration form   

After I save and close the editor, the rebase runs and omits those commits entirely from the project history.

Finally I force update remote:

git push -f origin my-branch  

My commit timeline no longer contains those unwanted commits!
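The same edit can be scripted, which is handy for trying the workflow out safely. The following is a minimal sketch in a throwaway repository, not the exact setup described above: it assumes one file per commit so nothing conflicts, and it uses the GIT_SEQUENCE_EDITOR environment variable so that sed edits the todo list in place of a human.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# One file per commit, so dropping a commit cannot conflict with the others.
i=0
for msg in "Added user profile" "Added subscriptions page" "Fixed search bug" \
           "Fixed login bug" "Added registration form"; do
  i=$((i + 1))
  echo "$msg" > "file$i.txt"
  git add "file$i.txt"
  git commit -q -m "$msg"
done

base=$(git rev-parse HEAD~4)   # "Added user profile", the safe starting point

# GIT_SEQUENCE_EDITOR replaces the interactive editor for the todo list;
# sed rewrites "pick" to "drop" on the two unwanted lines.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '/Fixed login bug/s/^pick/drop/' -e '/Added registration form/s/^pick/drop/'" \
  git rebase -q -i "$base"

remaining=$(git log --format=%s)
echo "$remaining"
```

The script matches commits by subject line rather than by hash, because the hashes differ on every run of the demo.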

Rebase Considers Entire Commit Contents

A key aspect of rebase is that it re-creates each remaining commit by replaying its changes on top of the new base. If a later commit depends on changes from a dropped one, this leads to merge conflicts that you have to resolve while rebasing.

I coach my team to run git diff ID1..ID2 before a rebase to preview the actual changes in the target commit range. Review those changes to decide how to reconstruct history without losing important work.
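As a rough illustration of that preview step, the sketch below builds a tiny throwaway repository and lists the commits and files in a range. The file names and messages are invented for the demo, and first and last play the role of ID1 and ID2.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial app"
echo "v2" > app.txt
git commit -q -am "Update app"
echo "secret" > keys.txt
git add keys.txt
git commit -q -m "Add keys by mistake"

first=$(git rev-parse HEAD~2)   # plays the role of ID1
last=$(git rev-parse HEAD)      # plays the role of ID2

# Commits in the range (ID1 itself is excluded).
commits_in_range=$(git log --oneline "$first..$last")
# Files that differ between the two endpoints.
files_in_range=$(git diff --name-only "$first..$last")

echo "$commits_in_range"
echo "$files_in_range"
```

Adding --stat to git diff gives a per-file summary of how many lines changed, which is often enough to judge whether dropping a commit is likely to conflict.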

Automating Commit Removals with BFG Repo Cleaner

While interactive rebase gives fine-grained control over history, BFG Repo-Cleaner is an automated tool I recommend for purging large unwanted files or sensitive data from every commit.

Some examples from my teams' use of BFG:

  • Completely removing sensitive files from all commits

      bfg --delete-files keys.txt
  • Removing all binaries over 50 MB from entire history to optimize repo

      bfg --strip-blobs-bigger-than 50M

Note that -D is shorthand for --delete-files and matches file names, not commit messages. BFG has no option for removing commits by author or message, so automated merge commits (for example from Jenkins) have to be dropped with an interactive rebase instead.

The key difference from git filter-branch is that BFG is purpose-built for removing unwanted data: it processes each version of a file or folder only once instead of re-examining the full tree at every commit. This makes it much faster for large repos. By default it also leaves the contents of the latest commit untouched.

On a 265 GB repo, BFG reduced space usage to just 11 MB by stripping unwanted CI-introduced content across branches. The space is only reclaimed once the reflog has been expired and git gc has run:

Before BFG:
du -sh .git
# 265 GB

After BFG:
du -sh .git
# 11 MB

The one major caveat is that BFG rewrites the IDs of every affected commit and all of its descendants, so you have to communicate that to your team before force pushing.
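Putting the pieces together, a full BFG run usually follows the pattern in the sketch below. It is only an outline under stated assumptions: bfg_delete_file is a hypothetical helper name, bfg is assumed to be a command on your PATH (for example a wrapper around java -jar bfg.jar), and the repository URL and file name are placeholders.

```shell
# Hypothetical helper: remove every file with the given name from all commits.
# Assumes a `bfg` command on PATH; repo_url and file_name are placeholders.
bfg_delete_file() {
  repo_url=$1
  file_name=$2
  mirror_dir=$(mktemp -d)/repo.git

  # BFG is meant to run on a bare mirror clone, not on a working checkout.
  git clone --mirror "$repo_url" "$mirror_dir" || return 1

  # Rewrite history, deleting the named file wherever it appears.
  bfg --delete-files "$file_name" "$mirror_dir" || return 1

  # The old objects stay on disk until the reflog is expired and gc runs.
  ( cd "$mirror_dir" &&
    git reflog expire --expire=now --all &&
    git gc --prune=now --aggressive ) || return 1

  echo "Cleaned mirror is in $mirror_dir"
}

# Example call (not run here):
#   bfg_delete_file git@example.com:team/app.git keys.txt
```

After reviewing the cleaned mirror, running git push from inside it updates the remote, which is the point at which the team needs to have been warned.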

Recovery After Forced Push

A key best practice I always coach developers on is confirming with teams before force pushing rebased or stripped histories to shared repositories.

But mistakes can happen, so it's also important to know how to reconcile other users' local branches after history-altering force pushes.

Common issues I've helped developers resolve:

  • Developers seeing ambiguous Git errors on pull
  • Git showing local commits that don't exist remotely
  • Pull insisting on a merge that is unnecessary

The standard recovery process:

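  1. Inform everyone to fetch repo updates with git fetch origin
  2. Check out the target branch with git checkout <branch>
  3. Save any local commits worth keeping on a backup branch, then reset to the updated origin with git reset --hard origin/<branch>
  4. Resume work and commit again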

Essentially, fetching and then hard-resetting the local branch to the remote history overwrites the stale local commits.
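The recovery steps can be rehearsed end to end with two local clones. The sketch below is an illustration with made-up names: dev-a rewrites and force pushes, dev-b recovers, and origin.git is a local bare repository standing in for the shared remote. The backup branch is an extra precaution.

```shell
set -e
base=$(mktemp -d)
git init -q --bare "$base/origin.git"

# Developer A creates the branch and pushes two commits.
git clone -q "$base/origin.git" "$base/dev-a" 2>/dev/null
cd "$base/dev-a"
git config user.name "Dev A"
git config user.email "a@example.com"
git checkout -q -b my-branch
echo one > a.txt
git add a.txt
git commit -q -m "Good commit"
echo two > b.txt
git add b.txt
git commit -q -m "Commit to be removed"
git push -q origin my-branch

# Developer B clones while the unwanted commit is still on the remote.
git clone -q -b my-branch "$base/origin.git" "$base/dev-b"

# Developer A rewrites history and force pushes.
git reset -q --hard HEAD~1
git push -q --force origin my-branch

# Developer B recovers.
cd "$base/dev-b"
git fetch -q origin
git branch backup/my-branch            # keep the old tip in case it is needed
git reset -q --hard origin/my-branch   # match the rewritten remote history

b_tip=$(git log -1 --format=%s)
backup_tip=$(git log -1 --format=%s backup/my-branch)
echo "my-branch now ends at: $b_tip"
echo "backup still ends at:  $backup_tip"
```

Once developer B has confirmed nothing on the backup branch is needed, it can be deleted with git branch -D backup/my-branch.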

Going forward, I also recommend saving uncommitted in-progress work with git stash before pulling upstream changes.

Overall, take care to minimize history rewriting that impacts other users. But in cases where needed, help your team gracefully reconcile through resets.

Key Principles to Avoid Issues

From my many years resolving commit history issues that impact teams, here are some key principles I enforce:

  • Never rewrite main repo history that developers rely on without confirmation
  • Delete experimental work by collaborators only after approval
  • If fixing sensitive data in history, check for backups and cached views that may still expose it
  • Clearly communicate forced updates and the new commit IDs
  • Enforce 2FA to prevent destructive history rewrites by malicious actors
  • Automate alerts on force push actions for auditing
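For the last point, one possible building block is a check that tells a history rewrite apart from a normal fast-forward push. The sketch below is an assumption about how such a check could look, not a complete hook: is_history_rewrite is a hypothetical function name, and a real server-side hook would also need to handle branch creation and deletion and send the actual alert.

```shell
set -e
# Hypothetical check for a server-side hook: a push rewrites history when the
# old tip is not an ancestor of the new tip.
is_history_rewrite() {
  old=$1
  new=$2
  if git merge-base --is-ancestor "$old" "$new"; then
    return 1   # fast-forward: nothing was removed
  fi
  return 0     # commits reachable from the old tip would be dropped
}

# Small throwaway repository to exercise the check.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 1 > f.txt
git add f.txt
git commit -q -m "First"
first=$(git rev-parse HEAD)
echo 2 > f.txt
git commit -q -am "Second"
second=$(git rev-parse HEAD)

# Moving the branch forward is a fast-forward; moving it back is a rewrite.
if is_history_rewrite "$first" "$second"; then ff_result=rewrite; else ff_result=fast-forward; fi
if is_history_rewrite "$second" "$first"; then rw_result=rewrite; else rw_result=fast-forward; fi

echo "first -> second: $ff_result"
echo "second -> first: $rw_result"
```

Many hosting platforms also offer branch protection settings that block force pushes outright, which is simpler than a custom hook when blocking is acceptable.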

While removing unwanted commits is necessary sometimes, extremely judicious usage is imperative especially on teams at scale.

Over time, I realized that nearly 80% of commit removals were done for temporary reasons that did not justify disrupting other engineers. Commit hygiene has to be balanced against developer productivity.

Conclusion: Proper Commit Management Matters

Cleaning up unwanted commits or correcting mistakes improves overall Git usability, but should be handled with care in collaborative environments.

Mastering both interactive rebasing and automated rewriting lets you surgically improve commit history without losing context or progress, provided both are used judiciously.

Learning such commit management helps developers craft clean, descriptive commit histories that make the code easier for the entire team to understand over time.
