Mark Pollmann

Advanced Git

4 minutes read

Introduction

This post is the first in a series in which I introduce some git beyond the basics. We'll take a look at some useful commands and try to understand the internals and more cryptic symbols in git-land.

Understanding HEAD

HEAD is the pointer to the currently checked-out commit, usually the most recent commit on your current branch. After a successful commit, HEAD moves to the new commit, which in turn points back to the previous one.
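To see this for yourself, here is a small sketch that builds a throwaway repository in a temporary directory and asks git where HEAD points. The author name, email and commit messages are placeholders I made up for the demo.

```shell
#!/bin/sh
# Throwaway repository; nothing here touches your real projects.
set -eu
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch explicitly
git config user.name "Demo"               # placeholder identity
git config user.email "demo@example.com"

git commit -q --allow-empty -m "first"
git commit -q --allow-empty -m "second"

# HEAD refers to the current branch...
head_ref=$(git symbolic-ref HEAD)
# ...and both HEAD and the branch resolve to the newest commit.
head_commit=$(git rev-parse HEAD)
branch_commit=$(git rev-parse master)

echo "HEAD -> $head_ref"
echo "HEAD commit:   $head_commit"
echo "master commit: $branch_commit"
```

The last two lines should print the same hash, because HEAD follows the branch tip.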

Detached HEAD

But check out a commit by its hash and you get the following message: "You are in 'detached HEAD' state. You can look around, make experimental changes and commit them, and you can discard any commits you make in this state without impacting any branches by performing another checkout."

What happened? A detached HEAD means HEAD points directly at a commit instead of at a branch. If you create a new commit now, no branch moves along with it, so git cannot add it to the line of commits that makes up your branch. If you want to keep working from this state, create a new branch to hold your work; it will diverge from the branch you came from.
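A minimal sketch of getting into and out of the detached state, again in a throwaway repository. The branch name experiment and the identity are arbitrary choices for the demo.

```shell
#!/bin/sh
set -eu
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"               # placeholder identity
git config user.email "demo@example.com"
git commit -q --allow-empty -m "first"
git commit -q --allow-empty -m "second"

# Check out a commit by its hash: HEAD is now detached.
first=$(git rev-parse HEAD~1)
git checkout -q "$first"

# symbolic-ref fails while HEAD does not point at a branch.
if git symbolic-ref -q HEAD >/dev/null; then
  state=attached
else
  state=detached
fi
echo "HEAD is $state"

# Keep working from here by creating a branch at this commit.
git checkout -q -b experiment
git commit -q --allow-empty -m "work started from an old commit"
current=$(git symbolic-ref --short HEAD)
echo "now on $current"
```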

Getting a hold of older commits: HEAD~

HEAD~ is not a command but a way to refer to commits further down the commit chain. For example, HEAD~1 points at the first commit below HEAD on the current branch (analogously, HEAD~n reaches the n-th commit down the history).
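As a quick illustration, the following sketch creates three empty commits in a throwaway repository and resolves HEAD~1 and HEAD~2 to their commit messages. The messages one, two and three are just demo values.

```shell
#!/bin/sh
set -eu
cd "$(mktemp -d)"
git init -q
git config user.name "Demo"               # placeholder identity
git config user.email "demo@example.com"

for msg in one two three; do
  git commit -q --allow-empty -m "$msg"
done

# Walk down the history from HEAD.
parent=$(git log -1 --format=%s HEAD~1)
grandparent=$(git log -1 --format=%s HEAD~2)
echo "HEAD~1: $parent"        # prints: HEAD~1: two
echo "HEAD~2: $grandparent"   # prints: HEAD~2: one

# A bare HEAD~ is shorthand for HEAD~1.
short=$(git rev-parse HEAD~)
long=$(git rev-parse HEAD~1)
```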

What is it good for? How about deleting the last 3 commits:
git reset --hard HEAD~3. This tells git to move HEAD (and with it the current branch) back three commits and discard the content of the commits that were jumped over, along with any uncommitted changes.
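Here is a sketch of that reset in a throwaway repository, so you can try it without risking real work. The file names and commit messages are made up for the demo.

```shell
#!/bin/sh
set -eu
cd "$(mktemp -d)"
git init -q
git config user.name "Demo"               # placeholder identity
git config user.email "demo@example.com"

# Five commits, each adding one file.
for i in 1 2 3 4 5; do
  echo "content $i" > "file$i.txt"
  git add "file$i.txt"
  git commit -q -m "commit $i"
done

git reset -q --hard HEAD~3    # drop the last three commits and their files

remaining=$(git rev-list --count HEAD)
tip=$(git log -1 --format=%s)
echo "$remaining commits left, tip is '$tip'"
ls    # only file1.txt and file2.txt are left in the working directory
```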

Or compressing the last 4 commits into one so your history looks cleaner?
git reset --soft HEAD~4. The --soft flag stages the changes from the commits instead of deleting them. Now you can commit all the changes with a nice new commit message. This is also called squashing commits.
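The squash can be sketched the same way. In this throwaway repository the last four of five commits are folded into a single new one; file names and messages are demo values.

```shell
#!/bin/sh
set -eu
cd "$(mktemp -d)"
git init -q
git config user.name "Demo"               # placeholder identity
git config user.email "demo@example.com"

for i in 1 2 3 4 5; do
  echo "content $i" > "file$i.txt"
  git add "file$i.txt"
  git commit -q -m "commit $i"
done

git reset --soft HEAD~4       # move back four commits, keep their changes staged

# The four files from the undone commits are now waiting in the index.
staged=$(git diff --cached --name-only | wc -l | tr -d ' ')
echo "$staged files staged"

git commit -q -m "commits 2-5 as one"
total=$(git rev-list --count HEAD)
echo "$total commits in history"
```

Nothing is lost along the way: all five files are still in the working directory, only the history is shorter.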

Your command history: git reflog

The reference log keeps track of when the tips of your branches or other refs change. git reflog is shorthand for git reflog show HEAD, so it shows the latest updates to HEAD. If you need more information there is git reflog show --all or, for a specific branch, git reflog show BRANCHNAME.
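To get a feeling for what ends up in the reflog, this sketch makes a few commits in a throwaway repository, resets one of them away and prints the log. Commit messages and identity are placeholders.

```shell
#!/bin/sh
set -eu
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"               # placeholder identity
git config user.email "demo@example.com"

for msg in one two three; do
  git commit -q --allow-empty -m "$msg"
done
git reset -q --hard HEAD~1    # this move of HEAD is recorded as well

# Every update of HEAD shows up, newest first.
log=$(git reflog)
echo "$log"

# The same information for a single branch.
git reflog show master
```

Note that the commit removed by the reset is still listed with its hash.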

Keeping a clean house: git rebase

This command is useful if you care about the tidiness of your commit history. Example scenario: You created a topic branch off of master some time ago and now the branches diverged because other commits were added to master after your branch-off. You want to stay up to date on these changes so you merge master into your topic branch every now and then, creating these ugly merge commits every time. Is there a better way?

There is: git rebase. Instead of git merge master you run git rebase master. This tells git to move back in the commit history to the point where the branches diverged, save all of your commits on the topic branch to a temporary area, apply all master commits on your branch and then re-apply your topic commits on top of those. Conflicts are solved the same way as in a git merge. When you push your changes to your remote repo you might have to force push, as you re-wrote your git history. Don't do this on a public branch on which other people work: you just changed the commit history on the server, and their local history no longer matches it, so their next pull will not go through cleanly.
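The scenario above can be sketched in a throwaway repository: a topic branch and master each get one commit after the branch-off, and the topic branch is then rebased. Branch, file and commit names are demo values.

```shell
#!/bin/sh
set -eu
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"               # placeholder identity
git config user.email "demo@example.com"
git commit -q --allow-empty -m "base"

# Work on a topic branch...
git checkout -q -b topic
echo "topic work" > topic.txt
git add topic.txt
git commit -q -m "topic work"

# ...while master moves on.
git checkout -q master
echo "master work" > master.txt
git add master.txt
git commit -q -m "master work"

# Replay the topic commit on top of the current master.
git checkout -q topic
git rebase -q master

git log --format=%s           # topic work, master work, base
merges=$(git rev-list --merges --count HEAD)
echo "merge commits: $merges"
```

The history stays linear and contains no merge commit, which is the whole point of the exercise.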

This command is used a lot in open source projects as a clean commit history is valued highly and information-less merge commits just clutter up the history.

Automate rebase pulls via git config

You can rebase during a pull with git pull --rebase. I like to make this the default by setting my config with: git config --global pull.rebase true.

There is also the even more powerful git rebase --interactive, which lets you rearrange, edit or remove single commits. I will write about it in a future blog post.

Temporary cleanup: git stash

Push your uncommitted changes to a temporary stack to clean up your working directory and apply them later. This comes in handy when you want to do a quick pull to get the latest changes from your team. With rebasing enabled, git pull will complain if the working directory is not clean and abort. If you want the stashing to happen automatically you can set up your config like this:

git config pull.rebase true
git config rebase.autoStash true

This stashes your uncommitted changes, fetches the remote changes, rebases your commits on top of them and then re-applies the stash.
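To watch the two settings work together, here is a sketch with a local bare repository standing in for the remote and two clones standing in for teammates. The names alice and bob, the file names and the identity are all invented for the demo.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
root=$(mktemp -d)
cd "$root"

# A bare repository plays the role of the shared remote.
git init -q --bare origin.git
git -C origin.git symbolic-ref HEAD refs/heads/master

# alice publishes the first commit.
git init -q alice
cd alice
git symbolic-ref HEAD refs/heads/master
git remote add origin "$root/origin.git"
echo "notes" > notes.txt
git add notes.txt
git commit -q -m "base"
git push -q origin master

# bob clones, then alice pushes another commit.
cd "$root"
git clone -q "$root/origin.git" bob
cd alice
echo "a" > alice.txt
git add alice.txt
git commit -q -m "alice work"
git push -q origin master

# bob has a local commit plus an uncommitted edit.
cd "$root/bob"
echo "b" > bob.txt
git add bob.txt
git commit -q -m "bob work"
echo "unfinished edit" >> notes.txt

git config pull.rebase true
git config rebase.autoStash true
git pull -q   # by hand: git stash, git pull --rebase, git stash pop

tip=$(git log -1 --format=%s)
below=$(git log -1 --format=%s HEAD~1)
dirty=$(git status --short)
echo "history: $tip on top of $below"
echo "still uncommitted: $dirty"
```

After the pull, bob's commit sits on top of alice's and the unfinished edit to notes.txt is back in the working directory.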

Retracing your steps: git blame and git bisect

See who last changed each line of a file with git blame FILE. This is quite useful if you need to investigate who introduced a change [1].
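A small sketch of blame in action: two made-up authors, Alice and Bob, edit the same file in a throwaway repository, and blame attributes each line to whoever touched it last. The -L option used here limits the output to a line range.

```shell
#!/bin/sh
set -eu
cd "$(mktemp -d)"
git init -q

# Alice writes the file (authors are hypothetical).
printf 'first line\nsecond line\n' > notes.txt
git add notes.txt
git -c user.name=Alice -c user.email=alice@example.com commit -q -m "add notes"

# Bob edits only the second line.
printf 'first line\nsecond line, edited\n' > notes.txt
git -c user.name=Bob -c user.email=bob@example.com commit -q -am "edit second line"

git blame notes.txt           # one annotated row per line of the file

# Restrict blame to a single line and pull out the author name.
line1_author=$(git blame -L 1,1 --porcelain notes.txt | sed -n 's/^author //p')
line2_author=$(git blame -L 2,2 --porcelain notes.txt | sed -n 's/^author //p')
echo "line 1: $line1_author"  # prints: line 1: Alice
echo "line 2: $line2_author"  # prints: line 2: Bob
```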

With git bisect you can search for the commit that introduced a bug. The idea is to do a binary search on your commit history: git checks out commits between a known good and a known bad one, and you mark each as good or bad until the commit where the status changes is found. See the bisect documentation for more information.
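The search can also be automated with git bisect run, which marks each commit based on the exit code of a command. In this sketch a throwaway repository gets six commits, the "bug" is simply that value.txt stops containing the word fine from the fourth commit on, and grep serves as the test. All names and contents are invented for the demo.

```shell
#!/bin/sh
set -eu
cd "$(mktemp -d)"
git init -q
git config user.name "Demo"               # placeholder identity
git config user.email "demo@example.com"

# Commits 1-3 are fine, commits 4-6 carry the "bug".
for i in 1 2 3 4 5 6; do
  if [ "$i" -lt 4 ]; then
    echo "fine $i" > value.txt
  else
    echo "buggy $i" > value.txt
  fi
  git add value.txt
  git commit -q -m "commit $i"
done
first=$(git rev-list --max-parents=0 HEAD)

# Bad commit first, then a known good one.
git bisect start HEAD "$first" >/dev/null
# Exit code 0 marks a commit as good, 1 marks it as bad.
git bisect run grep -q fine value.txt >/dev/null

culprit=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1
echo "first bad commit: $culprit"   # prints: first bad commit: commit 4
```

In a real project the grep would be replaced by your test suite or a small script that reproduces the bug.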

Conclusion

In the next part I'm going to talk about git hooks, the .git folder, undoing commits and recovering from mistakes like committing to the wrong branch or an ill-advised git reset --hard.

If you're still in search of enlightenment, take a look at the git koans.


[1] In IntelliJ IDEs this function is called annotate.