Merging Git Repositories and Preserving History

Recently I was faced with the need to merge two Git repositories and preserve the history behind the files in each.  Here is an overview of the situation:

  • A "Main" project repository with a remote in a public GitHub project.
  • A "Secondary" project repository with a remote in a private Visual Studio Online project.
  • Both repositories contained Visual Studio / .NET / C# solutions/projects.
  • I needed to move the Secondary (private) repository into the Main (public) repository.
  • I wanted to preserve the history of the files in the Secondary repository.

Much advice about merging two Git repositories and preserving history can be found online.  Here is a sample of what can be found:

http://saintgimp.org/2013/01/22/merging-two-git-repositories-into-one-repository-without-losing-file-history/
http://jasonkarns.com/blog/merge-two-git-repositories-into-one/
http://stackoverflow.com/questions/1425892/how-do-you-merge-two-git-repositories
http://julipedia.meroh.net/2014/02/how-to-merge-multiple-git-repositories.html
http://scottwb.com/blog/2012/07/14/merge-git-repositories-and-preseve-commit-history/

If you scan through the content at those links, you will see that there are multiple ways to approach this problem.

Following is the sequence of Git commands that worked for me.  I must stress that this worked for me, and may not work equally well for your situation.  Proceed carefully, and be prepared to handle unexpected situations.

1) Navigate to the master branch of the Main project repository.

2) Add a remote that references the Secondary project repository.  In my case, this was a reference to the Visual Studio Online remote repository.

git remote add secondaryrep <URL of secondary repository>

3) Create a new branch in the Main repository.

git branch mergebranch

4) Navigate to the new branch.

git checkout mergebranch

5) Fetch the files and metadata from the Secondary repository.

git fetch secondaryrep

6) Merge the master branch of the Secondary repository into the working branch of the Main repository.

git merge secondaryrep/master

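At this point I had to stop and resolve a handful of minor errors which were specific to my situation.  You may or may not encounter similar issues with your own repositories.  Note that Git 2.9 and later refuses by default to merge two histories that share no common commit; if the merge stops with "refusing to merge unrelated histories", rerun it with the --allow-unrelated-histories option.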

Specifically, there was an untracked file that initially prevented the merge operation.  In this case, it was safe to simply remove that file and retry the merge.

In addition, after merging, there were a few merge conflicts in configuration files related to NuGet packages (recall that these repositories contained Visual Studio / .NET / C# projects).  It was a simple matter to edit the files indicated by Git and resolve the conflicts.
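To see which files Git considers conflicted before editing them, a listing like the following can help.  This is only a sketch in a throwaway sandbox, not the repositories described above: the two repositories, the packages.config contents, and the identity values are all made up for illustration, and the --allow-unrelated-histories flag is an addition needed on Git 2.9 and later.

```shell
#!/bin/sh
# Sandbox sketch: two unrelated repositories that both contain packages.config.
# All names and contents here are hypothetical.
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

for name in main secondary; do
    git init -q "$work/$name"
    # Force the branch name so the sketch does not depend on init.defaultBranch.
    git -C "$work/$name" symbolic-ref HEAD refs/heads/master
    echo "packages for $name" > "$work/$name/packages.config"
    git -C "$work/$name" add packages.config
    git -C "$work/$name" commit -q -m "Add packages.config in $name"
done

cd "$work/main"
git remote add secondaryrep "$work/secondary"
git checkout -q -b mergebranch
git fetch -q secondaryrep

# The merge is expected to stop on the conflicting file, hence "|| true".
git merge --no-edit --allow-unrelated-histories secondaryrep/master >/dev/null 2>&1 || true

# List the paths that are still unmerged (they contain conflict markers).
conflicted=$(git diff --name-only --diff-filter=U)
echo "Conflicted files: $conflicted"
```

Once each listed file has been edited and the conflict markers removed, the steps continue as described below.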

7) Prepare the files for commit.

git add .

8) Open all projects/solutions in Visual Studio and confirm that they successfully build and pass tests.

9) Commit the files from the Secondary repository.

git commit -a -m "Added projects from secondary repository"

10) Return to the master branch of the Main repository.

git checkout master

11) Merge the branch we created for the files from the Secondary repository into the master branch of the Main repository.

git merge mergebranch

12) Push the updated master branch to the Main repository’s remote GitHub repository.

git push origin master

13) Remove the branch that had been created for the Secondary repository’s files.

git branch -d mergebranch

14) Remove the reference to the Secondary repository’s remote Visual Studio Online repository.

git remote remove secondaryrep
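If you want to confirm that the history really came along, one option is to count the commits that touch a file from the Secondary repository.  The sketch below replays the steps above against two small sandbox repositories; the file names (README.txt, lib.cs), commit messages, and identity values are hypothetical, and the --allow-unrelated-histories flag is an assumption for Git 2.9 and later.

```shell
#!/bin/sh
# Sandbox sketch: merge a two-commit "secondary" repository into "main"
# and check that both of its commits are still visible afterwards.
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

git init -q "$work/main"
git -C "$work/main" symbolic-ref HEAD refs/heads/master
echo "main project" > "$work/main/README.txt"
git -C "$work/main" add README.txt
git -C "$work/main" commit -q -m "Initial commit in main"

git init -q "$work/secondary"
git -C "$work/secondary" symbolic-ref HEAD refs/heads/master
echo "v1" > "$work/secondary/lib.cs"
git -C "$work/secondary" add lib.cs
git -C "$work/secondary" commit -q -m "Add lib.cs"
echo "v2" >> "$work/secondary/lib.cs"
git -C "$work/secondary" commit -q -a -m "Update lib.cs"

cd "$work/main"
git remote add secondaryrep "$work/secondary"
git checkout -q -b mergebranch
git fetch -q secondaryrep
git merge -q --allow-unrelated-histories \
    -m "Added projects from secondary repository" secondaryrep/master
git checkout -q master
git merge -q mergebranch
git branch -q -d mergebranch
git remote remove secondaryrep

# Commits that touched lib.cs, and total commits now reachable from master.
file_commits=$(git log --oneline -- lib.cs | wc -l | tr -d ' ')
total_commits=$(git rev-list --count HEAD)
echo "lib.cs commits: $file_commits, total commits: $total_commits"
```

In this sandbox the two original lib.cs commits remain reachable from master even after the remote has been removed.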


Using Git Rebase to Combine Commits (GitHub for Windows)

Let’s say you have been working in a branch of code for a week, and have checked in your changes several times.  You are now satisfied with the changes, and are ready to merge everything back into the main branch.  However, you do not want to maintain all of the interim check-ins that were done during the week.  You only want the cumulative changes to be merged back into the main branch.

Here is how to use Git’s rebase functionality to “squash” all of the commits on your branch into a single commit that can then be merged back into the main branch.  This assumes the use of the tools provided with the GitHub for Windows package, but it should work similarly on other operating systems.

1) Open the Git Shell included with the GitHub for Windows tools.

2) Navigate to your git repository.

3) Use “git status” to make sure you are working on the branch of code that includes the commits that you want to combine.  Then use “git log master..” to view the commits on the current branch that are not on “master”.  (This command assumes you are working on a branch separate from “master”.)  The goal is to determine how many commits are going to be squashed together.

4) Run “git rebase -i HEAD~N”, where N is the number of commits to include in the rebase/squash operation.
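Instead of counting the log entries by hand, the value of N can also be computed.  This is a sketch in a throwaway sandbox repository; the branch name “feature”, the file names, and the identity values are hypothetical.

```shell
#!/bin/sh
# Sandbox sketch: count the commits on the current branch that are not on master.
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
echo "base" > solution.sln
git add solution.sln
git commit -q -m "Base commit on master"

git checkout -q -b feature
for i in 1 2 3; do
    echo "change $i" >> program.cs
    git add program.cs
    git commit -q -m "Feature change $i"
done

# Same commit range as "git log master..", but printed as a number.
n=$(git rev-list --count master..HEAD)
echo "Commits to squash: $n"
# The interactive rebase would then be started with: git rebase -i "HEAD~$n"
```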

5) This will open Notepad with a list of the commits.  Example:

pick fd8adcc Initial check-in of new feature
pick ece5003 Updates to the new feature
pick 7d828d2 Finalized the new feature
# Rebase d96b13b..7d828d2 onto d96b13b
#
# Commands:
#  p, pick = use commit
#  r, reword = use commit, but edit the commit message
#  e, edit = use commit, but stop for amending
#  s, squash = use commit, but meld into previous commit
#  f, fixup = like “squash”, but discard this commit’s log message
#  x, exec = run command (the rest of the line) using shell
#
# These lines can be re-ordered; they are executed from top to bottom.
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
# However, if you remove everything, the rebase will be aborted.
#
# Note that empty commits are commented out

6) Assuming you want to collapse all of these commits into a single commit, change the “pick” command in front of every line AFTER THE FIRST LINE to “s” or “squash”.  Each line that has “s” in front of it will be squashed into the line above it.  Notice the comments at the bottom of the file… use them to guide your actions.  Save the file and exit when you are done.  Here is an example file after editing:

pick fd8adcc Initial check-in of new feature
s ece5003 Updates to the new feature
s 7d828d2 Finalized the new feature
# Rebase d96b13b..7d828d2 onto d96b13b
#
# Commands:
#  p, pick = use commit
#  r, reword = use commit, but edit the commit message
#  e, edit = use commit, but stop for amending
#  s, squash = use commit, but meld into previous commit
#  f, fixup = like “squash”, but discard this commit’s log message
#  x, exec = run command (the rest of the line) using shell
#
# These lines can be re-ordered; they are executed from top to bottom.
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
# However, if you remove everything, the rebase will be aborted.
#
# Note that empty commits are commented out

7) Git will proceed to implement the squashing of commits.  You’ll see it working with a prompt that looks like “Rebasing (2/3)”.  When done, a new file containing the combined list of files and comments will be opened in Notepad.  Here is an example:

# This is a combination of 3 commits.
# The first commit’s message is:
Initial check-in of new feature
# This is the 2nd commit message:
Updates to the new feature
# This is the 3rd commit message:
Finalized the new feature
# Please enter the commit message for your changes. Lines starting
# with ‘#’ will be ignored, and an empty message aborts the commit.
# rebase in progress; onto d96b13b
# You are currently editing a commit while rebasing branch ‘feature’ on ‘d96b13b’.
#
# Changes to be committed:
#     new file:   project.csproj
#     new file:   parms.cs
#     new file:   processor.cs
#     new file:   program.cs
#     modified:   solution.sln
#

8) Edit this file to contain a new comment for the combined commit (or if you prefer, keep all of the existing comments).  As before, the comments at the bottom of the file can be used as a guide.  Save and exit when you are done.  Here is an example file after editing:

Added the new feature to the solution.
# Please enter the commit message for your changes. Lines starting
# with ‘#’ will be ignored, and an empty message aborts the commit.
# rebase in progress; onto d96b13b
# You are currently editing a commit while rebasing branch ‘feature’ on ‘d96b13b’.
#
# Changes to be committed:
#     new file:   project.csproj
#     new file:   parms.cs
#     new file:   processor.cs
#     new file:   program.cs
#     modified:   solution.sln
#

9) Rerun “git log master..” to view the new list of commits on the current branch.  If you collapsed everything into a single commit, you should now only see that one commit.

NOTE:  Consider creating a separate “rebase” branch to try this process out the first time you do it.  If you do this, you can use “git diff <branch> <rebase-branch>” to compare the original set of commits to the newly squashed commit and verify that the two branches match.
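Here is a rough sketch of that trial run in a throwaway sandbox repository.  To keep it scriptable, the two Notepad sessions are replaced by non-interactive stand-ins set through the GIT_SEQUENCE_EDITOR and GIT_EDITOR environment variables; that substitution, the GNU sed syntax, the branch name “feature-backup”, and the identity values are all assumptions made for this example.

```shell
#!/bin/sh
# Sandbox sketch: squash three commits and compare against a backup branch.
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
echo "base" > solution.sln
git add solution.sln
git commit -q -m "Base commit on master"

git checkout -q -b feature
for msg in "Initial check-in of new feature" \
           "Updates to the new feature" \
           "Finalized the new feature"; do
    echo "$msg" >> program.cs
    git add program.cs
    git commit -q -m "$msg"
done

# Keep a pointer to the original, unsquashed commits.
git branch feature-backup

# The sequence editor rewrites "pick" to "squash" on every line after the first
# (GNU sed assumed); "true" accepts the combined commit message unchanged.
GIT_SEQUENCE_EDITOR="sed -i -e '2,\$s/^pick/squash/'" GIT_EDITOR=true \
    git rebase -q -i HEAD~3

remaining=$(git rev-list --count master..HEAD)
if git diff --quiet feature-backup feature; then same=yes; else same=no; fi
echo "Commits after squash: $remaining, contents identical to backup: $same"
```

If the diff reports no differences, the squashed branch holds the same files as the original commits, and the backup branch can be deleted.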

More information

https://help.github.com/articles/interactive-rebase
http://stackoverflow.com/questions/16974204/git-how-to-get-commit-history-for-just-one-branch
http://www.youtube.com/watch?v=msuJGG2iWjs
http://davidwalsh.name/squash-commits-git