As a developer working with Git repositories hosted on GitHub, you'll inevitably need to create copies of projects – either to contribute code or simply to experiment locally. GitHub provides two primary mechanisms for copying repositories: cloning and forking. At first glance they might seem interchangeable, but understanding the differences is key to using GitHub effectively.

What is Cloning a GitHub Repository?

Cloning produces a complete local copy of a GitHub repository on your machine. When you clone a Git repo using the "git clone" command, Git creates a new directory on your computer, downloads the full history and codebase, and registers the original as a remote named "origin" so you can pull and push commits.

Cloning gives you your own local environment to freely edit code, test builds, run the application, and commit changes as you would with any Git repository. The key thing to understand is that your clone is an independent working copy: commits you make stay on your machine until you push them (which requires write access to the source repository), and new commits in the source do not appear locally until you pull.
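The behavior described above can be tried without touching GitHub. The following is a minimal sketch, assuming a local bare repository (the hypothetical original.git) is an acceptable stand-in for the hosted project; the directory names and the Demo identity are illustrative only.

```shell
set -eu
demo=$(mktemp -d)   # scratch directory; nothing outside it is touched
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for the hosted repository: a bare repo with one commit on main.
git init -q "$demo/seed"
git -C "$demo/seed" symbolic-ref HEAD refs/heads/main
echo "hello" > "$demo/seed/README.md"
git -C "$demo/seed" add README.md
git -C "$demo/seed" commit -q -m "Initial commit"
git clone -q --bare "$demo/seed" "$demo/original.git"

# Clone it, as you would with a GitHub URL.
git clone -q "$demo/original.git" "$demo/clone"
cd "$demo/clone"
git remote -v   # "origin" points back at the repository you cloned

# Commit locally: the original is unaffected.
echo "local edit" >> README.md
git commit -q -am "Local change"
clone_head=$(git rev-parse HEAD)
original_before=$(git --git-dir="$demo/original.git" rev-parse main)

# Only an explicit push moves the commit to the original.
git push -q origin main
original_after=$(git --git-dir="$demo/original.git" rev-parse main)
echo "original before push: $original_before"
echo "original after push:  $original_after"
```

On GitHub the final push would succeed only if your account has write access to the repository, which is the gap that forking fills.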

Here is a diagram contrasting a cloned repository versus the original:

                                             Origin Repo
                                                |
                                                | Commits
                                                |
                                         -------------
                                        |             |
                                   Clone           Original  
                                        |
                                         -------------
                                        | Isolated    |  
                                        | Commits     |

As you can see, a clone diverges from the source repository after the initial copy. Your local clone allows isolated development without impacting the original project or other developers. This makes cloning ideal for reviewing code, testing builds, fixing bugs locally, and more, especially when you don't need to send changes back upstream.

What is Forking a GitHub Repository?

Whereas cloning generates a local copy, forking produces a copy of a repository within your GitHub account. When you click the "Fork" button on a GitHub project, GitHub creates a new repository under your account containing the code, commit history, README, and license of the original. By default only the default branch is copied (you can opt to include all branches), and issues and pull requests are not carried over.

Unlike a plain clone, the fork stays linked to the source repository on GitHub. It is an independent parallel version that you can freely edit, but it does not update itself: you bring in new upstream commits when you choose to sync. And because it lives on GitHub, you can easily submit changes back to the original upstream repository in the form of pull requests.
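To see why a fork needs explicit syncing, here is a rough local simulation. It assumes two bare repositories (the hypothetical upstream.git and fork.git) can stand in for the two GitHub-hosted copies; the GitHub-level link between fork and upstream is not modeled.

```shell
set -eu
demo=$(mktemp -d)
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for the upstream repository on GitHub.
git init -q "$demo/seed"
git -C "$demo/seed" symbolic-ref HEAD refs/heads/main
echo "hello" > "$demo/seed/README.md"
git -C "$demo/seed" add README.md
git -C "$demo/seed" commit -q -m "Initial commit"
git clone -q --bare "$demo/seed" "$demo/upstream.git"

# Stand-in for clicking "Fork": a second server-side copy.
git clone -q --bare "$demo/upstream.git" "$demo/fork.git"
fork_at_creation=$(git --git-dir="$demo/fork.git" rev-parse main)

# Upstream moves on after the fork was created.
echo "new upstream work" >> "$demo/seed/README.md"
git -C "$demo/seed" commit -q -am "Upstream moves on"
git -C "$demo/seed" push -q "$demo/upstream.git" main

upstream_now=$(git --git-dir="$demo/upstream.git" rev-parse main)
fork_now=$(git --git-dir="$demo/fork.git" rev-parse main)
echo "upstream: $upstream_now"
echo "fork:     $fork_now (unchanged until you sync)"
```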

Here is how a forked repository relates to the upstream source:

                                          Upstream Repo   
                                                |            
                                                | Commits
                                                |           
                                       -----------------------
                                      |                         |
                                   Fork                       Upstream
                                      |                         |
                                      | Fork's commits          |
                                      |                         |

With forking, you contribute changes back through pull requests. For example, say you fixed a bug in your fork: you could submit a PR asking to merge that fix into the upstream repository. This makes forks well suited to collaborative development, particularly when you lack write access to the original.

Key Differences Between Cloning and Forking

From the definitions above, we can summarize several key differences:

Location:

  • Clone: Local machine copy
  • Fork: GitHub copy

Relationship to Original:

  • Clone: Fully independent copy
  • Fork: Remains linked to upstream source

Submission of Changes:

  • Clone: Push directly to the original, which requires write access
  • Fork: Allows pull requests back to original project

Synchronization:

  • Clone: Updates only when you pull from the original
  • Fork: Updates only when you sync it with upstream

So in summary:

  • Cloning = Local independent copy for isolated development
  • Forking = GitHub copy for collaborative development

When Should You Fork vs Clone on GitHub?

Now that we've covered the key differences, when should you use each approach?

Clone When:

  • You want to work on the project locally
  • You need to test builds, run code, debug issues, etc.
  • You don't necessarily want to contribute changes back
  • You want to experiment freely without affecting the remote repo
  • You want to reference code examples or templates to learn from

Reasons you may want to clone a repository without contributing back include:

  • Reviewing open source libraries to study coding patterns
  • Copying starter app templates for new projects
  • Trying out tutorials and code samples locally

Fork When:

  • You want to contribute code through pull requests
  • You need a place on GitHub to push feature branches but lack write access to the original
  • You want to easily sync the latest upstream changes

Reasons for forking include:

  • Proposing bug fixes or new features via PR
  • Experimenting freely without affecting the original project
  • Maintaining a custom variant of a repository

Many open source projects encourage forking their repositories for developers to tweak as they please without interfering with the main codebase.

Putting Cloning and Forking Into Practice

To demonstrate cloning and forking workflows, let's walk through an example using the freeCodeCamp open source project.

We'll assume the goal is to fix a minor bug. We find the bug locally after cloning, push a fix to our fork, then submit a pull request.

Fork the Target Repository

First, navigate to https://github.com/freeCodeCamp/freeCodeCamp and click the "Fork" button.

This creates a parallel copy under your GitHub account.

Clone Your Fork Locally

Now clone your fork to get a local repository where you can edit files. On your fork's GitHub page, click the big green "Code" button and copy the HTTPS clone URL.

Then run:

git clone https://github.com/YOUR-USERNAME/freeCodeCamp.git

This creates a local clone whose "origin" remote is set up to push to and pull from your fork.
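You can confirm where a fresh clone points before making any changes. The sketch below assumes a local bare repository (the hypothetical fork.git) in place of your GitHub fork, so it runs offline; against a real fork only the URL printed would differ.

```shell
set -eu
demo=$(mktemp -d)
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for your fork on GitHub: a bare repo with one commit on main.
git init -q "$demo/seed"
git -C "$demo/seed" symbolic-ref HEAD refs/heads/main
echo "hello" > "$demo/seed/README.md"
git -C "$demo/seed" add README.md
git -C "$demo/seed" commit -q -m "Initial commit"
git clone -q --bare "$demo/seed" "$demo/fork.git"

git clone -q "$demo/fork.git" "$demo/freeCodeCamp"
cd "$demo/freeCodeCamp"

git remote -v                                    # fetch and push URLs for origin
origin_url=$(git config --get remote.origin.url)
tracking=$(git rev-parse --abbrev-ref 'main@{upstream}')
echo "origin is $origin_url"
echo "local main tracks $tracking"
```

Note that no remote for the original project exists yet; the clone only knows about the fork it was created from.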

Fix The Bug Locally

From there, open files within your clone, make changes to fix bugs or experiment freely without worrying about the upstream repository. For this example, let's edit some UI text.

Once fixed, commit the change locally:

git commit -am "Fix typo in UI text" 

Push Changes to Your Fork

Then push your committed fix from your local clone up to your fork on GitHub (the remote named origin):

git push origin main

Open Pull Request to Main Repository

Finally, open GitHub in your browser again and navigate back to your fork. GitHub will detect the new commit and offer to create a pull request from your fork into the upstream repository.

Click that prompt to create a PR asking the main repo's maintainers to pull in your bug fix, thus contributing back!

And that is the standard workflow leveraging cloned local repositories and remote GitHub forks. This allows decentralized development plus an avenue to contribute changes via PRs.
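The whole loop can be rehearsed locally before trying it on a real project. This is a sketch under stated assumptions: two local bare repositories (the hypothetical upstream.git and fork.git) replace the GitHub-hosted ones, and the fix is committed on a topic branch named fix-typo, a common convention rather than something the walkthrough above requires. The final log command lists the commits a pull request from the fork would propose.

```shell
set -eu
demo=$(mktemp -d)
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Upstream stand-in with one commit on main.
git init -q "$demo/seed"
git -C "$demo/seed" symbolic-ref HEAD refs/heads/main
echo "Welcom to the app" > "$demo/seed/README.md"
git -C "$demo/seed" add README.md
git -C "$demo/seed" commit -q -m "Initial commit"
git clone -q --bare "$demo/seed" "$demo/upstream.git"

# 1. Fork (simulated as a second bare copy), 2. clone the fork.
git clone -q --bare "$demo/upstream.git" "$demo/fork.git"
git clone -q "$demo/fork.git" "$demo/work"
cd "$demo/work"
git remote add upstream "$demo/upstream.git"
git fetch -q upstream

# 3. Fix the bug on a topic branch and commit.
git checkout -q -b fix-typo
echo "Welcome to the app" > README.md
git commit -q -am "Fix typo in UI text"

# 4. Push the branch to the fork, never to upstream.
git push -q origin fix-typo

# 5. What a pull request from fork:fix-typo into upstream:main would contain.
pr_commits=$(git log --format=%s upstream/main..origin/fix-typo)
echo "$pr_commits"
```

Working on a topic branch keeps your fork's main branch identical to upstream, which makes later syncing simpler.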

Keeping Forked Repositories in Sync

A fork does not update itself, but it is easy to bring up to date: you can pull the latest commits from the source repository into your local clone and then push them to your fork.

Configure the source repo as a remote:

git remote add upstream https://github.com/original-owner/repo

Then periodically fetch the latest commits and merge them into your clone's main branch:

git fetch upstream
git checkout main
git merge upstream/main 

This helps ensure you have the most up-to-date changes from upstream before pushing commits from your clone out to your fork.
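The sync steps can be verified end to end with local stand-ins. In this sketch the bare repositories upstream.git and fork.git are hypothetical substitutes for the GitHub-hosted copies, and the merge uses --ff-only, which is a deliberate choice here rather than part of the commands above: it refuses to proceed if your local main has diverged instead of creating a merge commit.

```shell
set -eu
demo=$(mktemp -d)
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Upstream and fork stand-ins, plus a local clone of the fork.
git init -q "$demo/seed"
git -C "$demo/seed" symbolic-ref HEAD refs/heads/main
echo "hello" > "$demo/seed/README.md"
git -C "$demo/seed" add README.md
git -C "$demo/seed" commit -q -m "Initial commit"
git clone -q --bare "$demo/seed" "$demo/upstream.git"
git clone -q --bare "$demo/upstream.git" "$demo/fork.git"
git clone -q "$demo/fork.git" "$demo/work"

# Upstream receives a new commit after the fork was made.
echo "new upstream work" >> "$demo/seed/README.md"
git -C "$demo/seed" commit -q -am "Upstream moves on"
git -C "$demo/seed" push -q "$demo/upstream.git" main

# Sync: fetch upstream, fast-forward local main, push to the fork.
cd "$demo/work"
git remote add upstream "$demo/upstream.git"
git fetch -q upstream
git checkout -q main
git merge -q --ff-only upstream/main
git push -q origin main

upstream_head=$(git --git-dir="$demo/upstream.git" rev-parse main)
fork_head=$(git --git-dir="$demo/fork.git" rev-parse main)
echo "upstream: $upstream_head"
echo "fork:     $fork_head"
```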

Comparing Cloning vs Forking Workflows

The following diagrams help contrast cloning versus forking Git workflows:

Cloning Workflow:

     GitHub Upstream Repo              Your Machine
            |                                |
            |       clone (git clone)        |
            |------------------------------->|
            |                                |
            |                  edit, commit local changes
            |                                |
            |    (manual sync of changes)    |
            |< - - - - - - - - - - - - - - - |

Forking Workflow:

       Upstream Repo          Your Forked Repo          Your Machine
             |                       |                       |
             |-------- fork -------->|                       |
             |                       |-------- clone ------->|
             |                       |                       |
             |                       |                 edit, commit
             |                       |                       |
             |                       |<-------- push --------|
             |                       |                       |
             |    (GitHub detects the push and offers a PR)  |
             |<---- pull request ----|                       |

The cloning workflow keeps all activity local then requires manual synchronization of changes back to the source.

Meanwhile, the forking workflow uses GitHub as a collaboration hub: you push commits to your fork, propose them to the upstream repo through pull requests, and pull upstream changes back down to stay current.

Git Subtree: An Alternative Approach

Git subtree is another way to integrate repositories. It lets you embed an external repository, or part of one, as a subdirectory inside your main project repo.

For example, say Project A wants to use some code from external Library B. Instead of forking or cloning Library B, the owner of Project A can add it as a subtree which looks like a sub-folder:

my-project
├── src 
├── tests
└── external
    └── library-b

This keeps the external code versioned directly within your main repository while retaining the ability to pull updates from, and push changes back to, the external project.
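A layout like the one above could be produced roughly as follows. This is a sketch, not a recommended setup: library-b and my-project are throwaway local repositories created for illustration, and --squash (which collapses the library's history into a single commit) is one option among several. Because git subtree ships as a contributed command that some distributions package separately, the script checks for it first.

```shell
set -eu
if ! [ -x "$(git --exec-path)/git-subtree" ]; then
  echo "git subtree is not installed here; skipping" >&2
  exit 0
fi
demo=$(mktemp -d)
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical external Library B with one commit on main.
git init -q "$demo/library-b"
git -C "$demo/library-b" symbolic-ref HEAD refs/heads/main
echo "library code" > "$demo/library-b/lib.txt"
git -C "$demo/library-b" add lib.txt
git -C "$demo/library-b" commit -q -m "Library B: initial commit"

# Hypothetical Project A; subtree add needs at least one existing commit.
git init -q "$demo/my-project"
cd "$demo/my-project"
git symbolic-ref HEAD refs/heads/main
echo "# my-project" > README.md
git add README.md
git commit -q -m "Project A: initial commit"

# Embed Library B under external/library-b.
git subtree add --prefix=external/library-b --squash "$demo/library-b" main

ls external/library-b   # the library's files are now part of my-project
```

Later updates would typically come in through git subtree pull with the same --prefix value.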

Public Repository Fork Statistics:

To gauge real-world usage, here are statistics on popular open source GitHub projects:

Repository      Stars    Forks
freeCodeCamp    320K+    22K+
TensorFlow      157K+    57K+
React           169K+    33K+

As you can see, each of these thriving OSS projects has been forked tens of thousands of times, even though forks are far fewer than stars. This indicates developers are indeed leveraging GitHub's forking workflow to copy then contribute back to repositories.

High profile thought leader Christophe Porteneuve, author of "Mastering Git", weighs in on why understanding open source contribution workflows is so critical:

"A grasp of forking, cloning, remote branches, and pull requests is vital for participating in open-source projects hosted on git collaboration services like GitHub and Bitbucket – the premiere hubs of open-source development today."

So whether you are cloning for isolated experimentation or forking to contribute as a collaborator, recognizing the differences helps you pick the right workflow.

Conclusion

While cloning and forking might seem interchangeable at first glance, understanding the core differences is critical for effectively managing repositories and contributing on GitHub:

  • Cloning = independent local copy for isolated development
  • Forking = GitHub copy for collaborative development

To recap key differentiators:

  • Clones reside locally and diverge independently from the original. Changes move only when you explicitly pull or push.
  • Forks live remotely and retain upstream links. Changes flow back via pull requests.

Ultimately, cloning facilitates private experimentation while forking enables public collaboration.

Knowing when to clone versus when to fork comes down to your purpose – are you working privately or looking to coordinate with teams? Following the standard fork + clone approach gives you the flexibility to do both.
