This is part three of a set of blogs I'm writing as I learn about Git. In part one, I talked about Git's distributed architecture, its approach to version management, and its support for frequent branching and merging. In part two, I looked at some of the tools we use to work with Git at endjin: GitHub, Visual Studio's support for Git in Team Explorer, and a third party Git client, SmartGit. I described how to clone a repository, commit new versions of the source code locally, view the commit history, and undo local changes.
In part three, I'm going to describe the basic branching operations, and sharing changes between your local repository and a remote repository. I'll continue with the previous blog's format, comparing how each operation is carried out through the command line, through Visual Studio, and through SmartGit.
Branching with Git
Before I describe branching operations, I'll step back for a moment to talk about some key concepts around branching in Git.
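A Git branch is simply a movable label that points to a commit. I'll call this label the branch's 'tag' here, although it is not the same thing as the tags you create with git tag. Each branch's tag 'moves' with the commits you make, applying to each new commit in turn, until you swap to a different branch.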
The default branch in Git is known as the master branch. Unless you add a new branch and start working on it, the master branch points to the last commit you made. HEAD is Git's pointer to the current branch – it identifies which branch tag should be applied to new commits.
When you create a new branch from master, it initially points to the same commit as master – as there are no differences between the files in the branches at this stage. Two tags on one commit. Once you begin working on the new branch, other branches such as the master branch fall behind – their tags no longer point to the latest commit.
Conversely, when you return to the master branch after making commits on another branch, HEAD points to master, and so to master's last commit rather than to the most recent commit in the repository. This stays true until you commit on master again.
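To make the moving labels concrete, here's a minimal sketch you could run in a scratch directory. The repository name demo, the branch name feature and the file notes.txt are placeholders I've made up for illustration.

```shell
set -e
# Throwaway identity so commits work without any global Git config.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

cd "$(mktemp -d)"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called master

echo "one" > notes.txt
git add notes.txt
git commit -q -m "First commit"

git checkout -q -b feature                 # new label, same commit as master
master_before=$(git rev-parse master)
feature_before=$(git rev-parse feature)

echo "two" >> notes.txt
git commit -q -a -m "Second commit, made on feature"

master_after=$(git rev-parse master)       # master has not moved
feature_after=$(git rev-parse feature)     # feature now labels the new commit
current=$(git symbolic-ref --short HEAD)   # the branch that HEAD points to

echo "HEAD is on: $current"
git log --oneline --decorate --all
```

The log output at the end shows which labels sit on which commit, which is a handy way to check where you are.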
Because each commit contains a pointer to the commit or commits that it was created from, Git is able to automatically find the 'merge base' – the best common ancestor between two commits – when you merge branches. More on this when we get to the Merging branches section.
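You can ask Git for the merge base yourself with git merge-base. The sketch below assumes a scratch repository with one shared commit and one further commit on each of two branches; the names demo, feature and the file names are invented for the example.

```shell
set -e
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

cd "$(mktemp -d)"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master

echo "base" > file.txt
git add file.txt
git commit -q -m "Shared starting point"
base=$(git rev-parse HEAD)

git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Work on feature"

git checkout -q master
echo "master" > master.txt
git add master.txt
git commit -q -m "Work on master"

found=$(git merge-base master feature)     # best common ancestor of the two branches
parent=$(git rev-parse feature~1)          # the parent pointer stored in feature's commit

echo "merge base: $found"
```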
Howard has written a step-by-step guide to using GitFlow with TeamCity which goes into greater depth about branching operations and workflows in Git:
- Part 1 – Different Branching Models
- Part 2 – A Branching Model for a Release Cycle
- Part 3 – GitFlow Commands
- Part 4 – Feature branches in TeamCity
Making a new branch
Before you make a new branch, you should commit your changes on the current branch, or remove them from the working directory and staging area.
To create a new branch, and switch to working on that branch:
git checkout -b <name of new branch>
I'll use angle brackets in these command line examples when you need to change a value.
The parent of the branch will be whichever branch you are currently working on.
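As far as I can tell, git checkout -b is a shortcut for creating a branch and then switching to it. Here's a small sketch showing both forms, assuming a scratch repository; feature-a and feature-b are made-up branch names.

```shell
set -e
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

cd "$(mktemp -d)"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master

echo "one" > notes.txt
git add notes.txt
git commit -q -m "First commit"

git checkout -q -b feature-a     # create the branch and switch to it in one step

git branch feature-b             # create only: we are still on feature-a
git checkout -q feature-b        # then switch as a separate step

on_branch=$(git symbolic-ref --short HEAD)
master_rev=$(git rev-parse master)
a_rev=$(git rev-parse feature-a)
b_rev=$(git rev-parse feature-b)

git branch                       # lists all three branches, with * on the current one
```

Since no commits were made in between, all three branches start out pointing at the same commit.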
In Visual Studio, you can create new branches by selecting 'New Branch' in the Branches area, and choosing the parent branch from a drop-down list.
In SmartGit, new branches are added through the Branch menu.
Switching to a different branch
You can easily swap between branches in Git, by moving the HEAD pointer.
As when making a new branch, before you swap branches, you should commit your changes on the current branch, or remove them from the working directory and staging area.
git checkout <name of the branch you want to go to>
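The sketch below checks that there is nothing uncommitted before switching, then shows the working directory changing as you move between branches. It assumes a scratch repository, and the names feature and extra.txt are placeholders.

```shell
set -e
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

cd "$(mktemp -d)"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master

echo "one" > notes.txt
git add notes.txt
git commit -q -m "First commit"

git checkout -q -b feature
echo "feature only" > extra.txt
git add extra.txt
git commit -q -m "Add a file on feature"

pending=$(git status --porcelain)    # empty output means nothing is uncommitted

git checkout -q master               # extra.txt is not part of master
if [ -e extra.txt ]; then on_master="present"; else on_master="absent"; fi

git checkout -q feature              # and it comes back on feature
if [ -e extra.txt ]; then on_feature="present"; else on_feature="absent"; fi

echo "extra.txt on master: $on_master, on feature: $on_feature"
```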
To change to a different branch in Visual Studio, go to the Branches area and select the branch from the Branch drop-down, or double-click on the name of the branch.
Solution Explorer will refresh to show the working directory for the branch selected.
SmartGit's Branch menu lets you select a branch to check out from the same graphical list of commits that was provided by its Log tool.
Merging local branches
Eventually, most branches need to be merged back in with another branch.
In some cases, it's not even necessary to create a new commit to merge branches. When you are merging branch B back into a branch that it came from, branch A, and there have been no commits on branch A since branch B was created, a speedy 'fast forward merge' is possible. Git just picks up the tag 'branch A', and moves it to the latest commit for branch B. This one commit will now have two branch tags.
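Here's a rough sketch of a fast forward merge in a scratch repository. I've used the --ff-only option so that Git refuses to do anything other than move the label; the branch name feature is made up.

```shell
set -e
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

cd "$(mktemp -d)"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master

echo "one" > notes.txt
git add notes.txt
git commit -q -m "First commit"

git checkout -q -b feature
echo "two" >> notes.txt
git commit -q -a -m "Work on feature"

git checkout -q master               # no new commits on master since feature was created
git merge -q --ff-only feature       # so master can simply be moved forward

master_rev=$(git rev-parse master)
feature_rev=$(git rev-parse feature)
commits=$(git rev-list --count master)   # still two commits: no merge commit was added

git log --oneline --decorate
```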
Where a fast forward merge isn't possible, Git uses the 'parent commit' pointers of each branch's most recent commit to automatically identify the common ancestor, and tries to combine the changes made on each branch since that ancestor. In this case, the merge is recorded as a specialised form of commit. Unlike the commits we saw earlier, it will have more than one 'parent' commit.
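To see the two parents of a merge commit, you could try something like the sketch below. It assumes a scratch repository in which both branches have moved on, touching different files so that there is no conflict; all names are invented.

```shell
set -e
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

cd "$(mktemp -d)"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master

echo "base" > file.txt
git add file.txt
git commit -q -m "Shared starting point"

git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Work on feature"
feature_tip=$(git rev-parse feature)

git checkout -q master
echo "master" > master.txt
git add master.txt
git commit -q -m "Work on master"
master_tip=$(git rev-parse master)

git merge -q --no-edit feature       # both branches have new commits, so a merge commit is made

first_parent=$(git rev-parse HEAD^1)     # where master was before the merge
second_parent=$(git rev-parse HEAD^2)    # the tip of the branch that was merged in

git log --oneline --graph
```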
In some cases, Git won't be able to merge branches automatically because the same parts of the same files have been changed in both branches, and it doesn't know which changes you want to keep. I'll describe how to deal with merge conflicts in a later blog.
Whether a merge is achieved by moving a tag, or creating a new commit, the commands you enter are the same.
A merge preserves the full, separate commit histories of the branches which have been merged. Git offers an alternative way of combining branches, known as a 'rebase'. A rebase re-writes history, replaying all of the commits that happened on one branch onto another.
The most recent commit will contain the same snapshot of the repository after either a merge or a rebase, but merging lets you go back and see the history of separate commits for each branch, whereas re-basing makes it look like all the changes happened linearly on one branch. The effect of a rebase is much like a branch merge, followed by the deletion of one branch – which is often what happens.
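For comparison, here's a sketch of a rebase starting from the same kind of diverged history, in a scratch repository with made-up names. After the rebase, the feature commit sits directly on top of master and there are no merge commits.

```shell
set -e
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

cd "$(mktemp -d)"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master

echo "base" > file.txt
git add file.txt
git commit -q -m "Shared starting point"

git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Work on feature"
old_tip=$(git rev-parse feature)

git checkout -q master
echo "master" > master.txt
git add master.txt
git commit -q -m "Work on master"
master_tip=$(git rev-parse master)

git checkout -q feature
git rebase -q master                 # replay feature's commit on top of master

new_tip=$(git rev-parse feature)             # a new commit: history was rewritten
new_parent=$(git rev-parse feature~1)        # its parent is now master's latest commit
merges=$(git rev-list --count --merges feature)

git log --oneline --graph
```

The rewritten commit gets a new identifier, which is why rebasing commits that other people already have needs some care.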
To merge branches, switch to the branch you want to merge into:
git checkout <name of branch you want to merge into>
Then use the command below, to merge the two:
git merge <name of other branch>
In Visual Studio's Branches area, you select 'Merge', and then choose the source and target branches.
SmartGit's Merge tool lets you choose a branch to merge into the current working tree. It doesn't differentiate between merging branches and merging commits.
Sharing your commits with a remote repository
Once you are happy with a commit or series of commits, you will probably want to share them. The more often you do this, the fewer merge conflicts will occur when working on a project in a team. You need to commit changes locally before sharing them with a remote repository.
An important thing to understand about operations which share changes between repositories is that they are carried out on branches. Git uses the notion of branch tracking, where local branches can be associated with remote branches. A local branch which is associated with a remote branch is known as a tracking branch. Local branches that are created from a remote branch are automatically set up to be tracking branches of that remote branch. If you share changes for a local tracking branch, Git will know which remote branch to merge them with. By default, when you clone a repository, the local master branch is set up to track the remote master branch. Branches that you create anew in your repository are not shared with the remote repository unless you specifically request this.
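You can inspect which remote branch a local branch is tracking from the command line. The sketch below fakes a remote using a bare repository on the local disk; the names seed, remote.git, local, develop and feature are all assumptions made for the example.

```shell
set -e
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)
cd "$work"

# A seed repository with two branches, published as a bare "remote".
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
echo "hello" > readme.txt
git add readme.txt
git commit -q -m "First commit"
git branch develop
cd "$work"
git clone -q --bare seed remote.git

git clone -q remote.git local
cd local
master_upstream=$(git rev-parse --abbrev-ref "master@{upstream}")

# A local branch created from a remote branch tracks it automatically.
git checkout -q -b develop origin/develop
develop_upstream=$(git rev-parse --abbrev-ref "develop@{upstream}")

# A brand new local branch has no remote counterpart yet.
git branch feature master
if git rev-parse --abbrev-ref "feature@{upstream}" >/dev/null 2>&1; then
  feature_upstream="set"
else
  feature_upstream="none"
fi

git branch -vv                       # shows the tracked remote branch in square brackets
```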
To share the changes you've made locally on a branch, first you should get any updates from the remote repository. Git will reject your push if the remote branch contains commits that you don't yet have locally. A 'fetch' downloads the new commits from the remote (across the entire repository, not just the current branch) and updates your remote-tracking branches such as origin/master, but does not change your local branches or your working directory. A pull combines the operations of fetching the remote changes and merging the changes for a particular branch into your local branch, creating a new merge commit in the process unless a fast forward is possible. If you have uncommitted changes that the incoming changes would overwrite, Git will refuse to pull until you commit them or set them aside with git stash. This is why it's better to commit locally, immediately before pulling from a remote repository.
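Here's a sketch of the difference between fetching and merging, using a bare repository on disk as a stand-in for the remote and two clones standing in for two team members. The names alice, bob, seed and remote.git are invented. It shows that a fetch updates origin/master but leaves the local master where it was until you merge.

```shell
set -e
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)
cd "$work"
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
echo "hello" > readme.txt
git add readme.txt
git commit -q -m "First commit"
cd "$work"
git clone -q --bare seed remote.git
git clone -q remote.git alice
git clone -q remote.git bob

# Bob publishes a new commit.
cd "$work/bob"
echo "from bob" > bob.txt
git add bob.txt
git commit -q -m "Bob's change"
git push -q origin master
bob_tip=$(git rev-parse master)

# Alice fetches: her remote-tracking branch moves, her own master does not.
cd "$work/alice"
before=$(git rev-parse master)
git fetch -q origin
after_fetch=$(git rev-parse master)
remote_tracking=$(git rev-parse origin/master)

# Merging the remote-tracking branch brings her master up to date.
git merge -q --ff-only origin/master
after_merge=$(git rev-parse master)

git log --oneline --decorate
```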
If there are no merge conflicts for the pull, you can then send your changes to the repository, in an operation known as a push. After a push, the commits that you've made locally for a branch will be available on that branch in the remote repository.
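The pull-then-push sequence can be sketched like this, again with a bare repository on disk playing the remote and two made-up clones, alice and bob. Alice's first push is rejected because Bob got there first; after pulling, her push goes through.

```shell
set -e
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)
cd "$work"
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
echo "hello" > readme.txt
git add readme.txt
git commit -q -m "First commit"
cd "$work"
git clone -q --bare seed remote.git
git clone -q remote.git alice
git clone -q remote.git bob

# Bob commits and pushes first.
cd "$work/bob"
echo "from bob" > bob.txt
git add bob.txt
git commit -q -m "Bob's change"
git push -q origin master

# Alice commits locally without having Bob's commit.
cd "$work/alice"
echo "from alice" > alice.txt
git add alice.txt
git commit -q -m "Alice's change"

if git push -q origin master 2>/dev/null; then
  first_push="accepted"
else
  first_push="rejected"            # the remote has a commit Alice doesn't have yet
fi

git pull -q --no-rebase --no-edit origin master   # fetch, then merge Bob's work
git push -q origin master                         # now the push is accepted

local_tip=$(git rev-parse master)
remote_tip=$(git --git-dir="$work/remote.git" rev-parse master)
echo "first push was $first_push"
```

The files were chosen so that the pull merges cleanly; a real pull can of course hit conflicts.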
Between them, pull and push operations bring the local and remote versions of a branch to the same commit history.
To fetch changes from the remote repository, and merge in the changes for the remote branch that your current local branch is tracking:
git pull
This is a shortcut for two commands:
git fetch
git merge origin/<name of remote branch the checked out local branch is tracking>
To pull from a different branch, switch to that branch locally.
To push your changes to the remote repository:
git push origin <branch name>
If you're on a tracking branch, Git will be able to work out which remote server and branch to push to. For example, if you cloned the repository, and are working on the master branch, you won't need to specify the branch name. Otherwise you need to specify a branch name. If the remote repository doesn't already have a branch of that name, one is added automatically.
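For a branch that only exists locally, you can push it and set up tracking in one go with the -u option. This sketch assumes a bare repository on disk as the remote, and the branch name feature is made up.

```shell
set -e
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)
cd "$work"
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
echo "hello" > readme.txt
git add readme.txt
git commit -q -m "First commit"
cd "$work"
git clone -q --bare seed remote.git
git clone -q remote.git local
cd local

git checkout -q -b feature           # exists only in the local repository so far
echo "new work" > feature.txt
git add feature.txt
git commit -q -m "Work on feature"

git push -q -u origin feature        # create the remote branch and start tracking it

local_tip=$(git rev-parse feature)
remote_tip=$(git --git-dir="$work/remote.git" rev-parse feature)
upstream=$(git rev-parse --abbrev-ref "feature@{upstream}")
echo "feature now tracks $upstream"
```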
You will be asked to authenticate when you push changes to the remote repository. When you are using the command line, if you use SSH, you don't need to supply your username and password to push changes.
The Visual Studio extension supplies a 'Sync' button which carries out a fetch, a merge, then a push. You can access it through the Changes area, or the Commits area.
SmartGit also lets you combine pull and push operations with a 'Synchronize' tool, available through the Remote menu.
SmartGit's Sync option does things a different way round to the Visual Studio extension. Instead of pulling, then pushing, SmartGit's Sync command tries to push first, then pulls. The thinking behind this is to give you more control over what is merged into your local repository, and how the merge happens, so that you don't end up pushing untested changes.
Push me, pull you, pull request?
A side note on the terminology around sharing commits between Git repositories. If you are new to Git like me, you may get a bit confused with the way the terms 'push' and 'pull' are used.
I'd got it straight in my head that a push meant pushing changes to a remote repository, and a pull meant getting changes from a remote repository into yours. But then what's this talk of sharing changes by making a 'pull request' which pops up all around discussions of GitHub? I thought that you shared changes with a push?
It's to do with permissions. In open source projects, project owners want anyone to be able to copy the main project repository, but don't necessarily want everyone to be able to freely push their commits to the repository.
In these circumstances, people who want to make a change can 'fork' the main repository into their own account, work independently on a feature or fix, and then ask the owners of the main repository to pull their changes. Hence, 'pull request'.
If you're working on a .NET project using GitHub for source control, most likely you will have permission to push to the repository, and pull requests won't apply... when working on your own team's projects at least.*
* Update July 2014: as johnkors points out in the comments, this last sentence is not correct. Pull requests can be used as part of the development process by teams working on the same repository, allowing project leads to assess work carried out in feature or fix branches before they are merged with core branches.
Which tools are best?
I've found Visual Studio's support for Git very handy for cloning repositories, and viewing the commit history and differences between files. It's nice not to have to open up a separate application. However, I did begin to experience some odd behaviour where some changes I'd made would show up in SmartGit, but not the Visual Studio Team Explorer window. Whether this is down to something I did to the extension, or another factor, I haven't worked out yet. User error is likely at this stage! I see the extension as a simple, easy to use tool, for the basic Git operations. It doesn't support some of the more advanced operations like tagging, cherry picking, or working with multiple remote repositories.
As I get more familiar with Git, I'm turning to SmartGit more often to commit and push changes to the remote repository, and have used it to deal with all of the merge conflicts that have occurred so far. As much as anything, when you're dealing with something complex, it's easier to see what's going on with a GUI designed to fill a screen, rather than fit in a 2 inch column.
In summary, rather than an 'either or' situation, I'm using both clients – depending on the task I want to do. In future, I want to rely less on GUI clients and more on command line tools such as PowerShell, for anything except the basic operations. I only really understand what's happening with the clients if I relate GUI actions back to commands, so I might as well just start using them!
Writing this blog, I used these excellent resources:
- The official Git documentation – clear, in depth descriptions with great diagrams
- Git – the simple guide by Roger Dudler – a short, 'get you up and running', visual guide to the crucial Git commands.
The next blog/s in the series will dig into the process of resolving some crunchy merge conflicts, and describe SmartGit's newly added support for GitFlow.