Git History Rewriting
Rewriting commit history is one of Git’s most important but often overlooked features. Many people either don’t know how to keep a clean commit history or don’t care to.
Why Bother?
One day while working I came across some code that didn’t make sense. Most of the time when I encounter something that doesn’t make sense or looks wrong, it can be explained by the person who wrote it.
I didn’t know who that person was, so I checked the commit history of the file instead. It sent me back to a large commit made months ago, with tons of changes and a terse commit message: “Adding X feature to Y System”. It was a dead end.
I sat around a bit until the person who wrote it came in. They explained the code was an old fix that could be removed.
In this case only an hour or two was lost, but I’ve encountered other situations where the person who wrote the code is no longer there. The only historical information about the code is the commit history. Bad decisions are made and countless hours are wasted when the history is lost. So much like commenting code and naming variables appropriately, having a clean commit history is an important part of developing maintainable software.
Git tools
Git provides a few commands to help make this process easier, but sometimes it’s not obvious which ones to use in different situations. (Note: These situations all assume you haven’t pushed your code to a shared repository.)
Let’s say there is a repository and you’ve added 3 commits to it (I’ll refer to them as A->B->C, where C is the latest):
The commits are:
a263de8 Temp commit to share with Bob
b65e8f5 Fixing issues with Y
c2d6c53 Adding feature X
Changing your latest commit: git commit --amend
Now let’s say someone is reviewing your code and notices a formatting error in commit C. What do you do?
Well, the easiest thing is to make the change in your working copy and stage it as normal. Then, instead of just running git commit -m "Fixing formatting in previous commit", you can use:
git commit --amend
This replaces your latest commit with a new one that includes the staged change. Now no one will ever know you made a formatting error!
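To see the amend in action without touching a real project, here is a minimal sketch that builds a throwaway repository. The temp directory, the file name feature.c, and the user identity are all made up for the demo; only the commit message comes from the example above.

```shell
set -e
# Throwaway repository so nothing real is touched (all names here are illustrative).
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

# The latest commit, containing a formatting error.
printf 'int x=1;\n' > feature.c
git add feature.c
git commit -q -m "Adding feature X"
before=$(git rev-parse HEAD)

# Fix the formatting, stage it, and fold it into the latest commit.
# --no-edit keeps the existing commit message without opening an editor.
printf 'int x = 1;\n' > feature.c
git add feature.c
git commit -q --amend --no-edit

after=$(git rev-parse HEAD)
count=$(git rev-list --count HEAD)
echo "commits: $count"
```

There is still only one commit afterwards, but its hash has changed, which is exactly why this should only be done to commits you haven’t pushed.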
Editing your commit history: git rebase -i
So, that’s a nice shortcut for making changes in C, but what if you made a formatting error in B?
In order to modify a specific commit we can use git rebase. This command basically replays your commit history on top of a given commit. In this case we’re going to replay our commit history on top of a previous commit, 3 before the one we are currently on. Git represents this as HEAD~3. The -i flag makes the rebase interactive, so we get to edit the list of commits before they are replayed.
So, our final command is:
git rebase -i HEAD~3
This will bring up a file in whatever editor you set git to use (usually vim by default):
pick a263de8 Temp commit to share with Bob
pick b65e8f5 Fixing issues with Y
pick c2d6c53 Adding feature X
It will also list some options. What we want to do is tell rebase, “As you replay, stop at commit B and let me make some changes”. The way to do this is to change the “pick” to “edit” as shown:
pick a263de8 Temp commit to share with Bob
edit b65e8f5 Fixing issues with Y
pick c2d6c53 Adding feature X
Once you save and quit, it will start the rebase and stop at commit B, telling you that you can now edit it. Fix the formatting, stage the change, and type git commit --amend to add it to commit B. Then type git rebase --continue and you’re done!
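The whole edit-a-middle-commit flow can be rehearsed in a scratch repository. This is only a sketch: the file names, the "Initial commit" base, and the identity are assumptions, and the sed call set through GIT_SEQUENCE_EDITOR is just a stand-in for changing “pick” to “edit” by hand in your editor.

```shell
set -e
# Scratch repository (names and contents are illustrative, not from a real project).
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"
export GIT_EDITOR=true   # make sure no interactive editor opens during the demo

git commit -q --allow-empty -m "Initial commit"
echo "temp" > a.txt      && git add a.txt && git commit -q -m "Temp commit to share with Bob"
echo "fix  y" > b.txt    && git add b.txt && git commit -q -m "Fixing issues with Y"
echo "feature x" > c.txt && git add c.txt && git commit -q -m "Adding feature X"

# Stand-in for the manual edit: change line 2 of the todo list from "pick" to "edit".
GIT_SEQUENCE_EDITOR="sed -i.bak '2s/^pick/edit/'" git rebase -i HEAD~3

# The rebase is now paused at commit B: fix the formatting and amend it.
echo "fix y" > b.txt
git add b.txt
git commit -q --amend --no-edit

# Replay the remaining commit (C) on top of the amended B.
git rebase --continue

fixed=$(git show HEAD~1:b.txt)   # contents of b.txt as recorded in commit B
git log --format=%s
```

Commit B now contains the corrected file, and C sits on top of it as before.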
More advanced rebase:
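Let’s say you realize that the temp commit (commit A) was really part of feature X (commit C). How do you fix that?
You can follow the git rebase steps above, but this time change the order of the commits in the editor and mark commit A as a fixup, which folds it into the commit listed just above it and discards its commit message: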
pick b65e8f5 Fixing issues with Y
pick c2d6c53 Adding feature X
fixup a263de8 Temp commit to share with Bob
This will make the history become B->(C+A).
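Here is a rough, scripted version of that reorder-and-fixup, again in a scratch repository. The file names and the "Initial commit" base are assumptions, and each commit touches its own file so the reordering cannot conflict. Instead of typing in the editor, the script writes the new todo list to a file and copies it into place through GIT_SEQUENCE_EDITOR.

```shell
set -e
# Scratch repository; each commit touches a different (made-up) file.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"
export GIT_EDITOR=true   # make sure no interactive editor opens during the demo

git commit -q --allow-empty -m "Initial commit"
echo "temp" > a.txt      && git add a.txt && git commit -q -m "Temp commit to share with Bob"
echo "fix y" > b.txt     && git add b.txt && git commit -q -m "Fixing issues with Y"
echo "feature x" > c.txt && git add c.txt && git commit -q -m "Adding feature X"

a=$(git rev-parse --short HEAD~2)
b=$(git rev-parse --short HEAD~1)
c=$(git rev-parse --short HEAD)

# The todo list we would otherwise type by hand: B, then C, then A folded into C.
todo="$repo/.git/new-todo"
printf 'pick %s\npick %s\nfixup %s\n' "$b" "$c" "$a" > "$todo"

# Git appends the path of its own todo file, so this runs: cp new-todo <git's todo file>
GIT_SEQUENCE_EDITOR="cp '$todo'" git rebase -i HEAD~3

git log --format=%s
```

After the rebase, the log shows only “Adding feature X”, “Fixing issues with Y” and the initial commit; a.txt now lives in the feature X commit.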
Rewriting your history: git reset
Let’s say you have tons of commits (20+), all of them temporary commits with random changes, and you just want to throw the history away and start over. You could copy the changes to a different directory, re-clone the repository and add the changes back, but Git provides a much easier and faster way to do this: git reset.
Say you’ve made all those commits on top of master and want to go back. You can use git reset master. By default it leaves all the changes from your commits in the working directory but moves your branch back to the commit master points to. In effect, it does the copying and re-cloning process above in one command, without destroying your entire repo.
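A small sketch of that reset, assuming the temp commits live on a branch (here called scratch, a made-up name) that started from master. Five commits stand in for the 20+ to keep it short, and the file name and messages are illustrative.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is named master
git config user.email "dev@example.com"
git config user.name "Dev"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit on master"

# A pile of throwaway commits on a hypothetical work branch.
git checkout -q -b scratch
for i in 1 2 3 4 5; do
  echo "change $i" >> base.txt
  git commit -q -am "temp $i"
done

# Default (--mixed) reset: the branch moves back to master,
# but the file contents stay exactly as they are on disk.
git reset -q master
ahead_after_reset=$(git rev-list --count master..HEAD)
status_after_reset=$(git status --porcelain)

# All the work is still there as one unstaged change, ready for a single clean commit.
git commit -q -am "Adding feature X"
ahead_after_commit=$(git rev-list --count master..HEAD)
echo "commits past master: $ahead_after_commit"
```

The five temp commits are gone from the branch, and the combined change ends up in one commit on top of master.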
Important Note
When doing any history rewriting, make sure you have a copy of your repository somewhere. With git rebase you can run git rebase --abort to go back to the pre-rebase state, but in case you accidentally make a bad change it’s important to have a backup.
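As one possible way to do both, the sketch below copies a scratch repository before rewriting and then backs out of a rebase midway. A plain directory copy is just one option for a backup, and the paths, file name, and identity are all assumptions for the demo.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"
export GIT_EDITOR=true   # make sure no interactive editor opens during the demo

git commit -q --allow-empty -m "Initial commit"
echo "one" > f.txt  && git add f.txt && git commit -q -m "Fixing issues with Y"
echo "two" >> f.txt && git commit -q -am "Adding feature X"

# Simplest possible backup: copy the whole directory, .git included.
backup=$(mktemp -d)
cp -R "$repo/." "$backup/"

before=$(git rev-parse HEAD)

# Start an interactive rebase that pauses at the first listed commit...
GIT_SEQUENCE_EDITOR="sed -i.bak '1s/^pick/edit/'" git rebase -i HEAD~2

# ...then change our mind and return to the pre-rebase state.
git rebase --abort

after=$(git rev-parse HEAD)
backup_head=$(git -C "$backup" rev-parse HEAD)
echo "before=$before after=$after backup=$backup_head"
```

All three hashes match: the abort restored the original history, and the copy was there in case it hadn’t.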
It can take some time in a world where every second is precious, but by carefully editing your history you can save orders of magnitude more time later.
Other Resources
https://help.github.com/articles/about-git-rebase
http://git-scm.com/book/en/Git-Branching-Rebasing