How to clean up your git branch for code review

What do I mean and why does it matter?

Entanglement of git branches. Photo by Adam Rhodes on Unsplash.

If you have been working for a long time in your branch and you made merges from the main branch or another feature branch to get the latest changes, your branch's history might look messy:

* (my-feature) F3
*   Merge branch 'other-feature' into my-feature
| * (other-feature) OF3
| * OF2
* | F2
* | Merge branch 'master' into my-feature
|\ \  
| * | (master) M3
| * | M2
* | | F1
| |/  
* | OF1
* M1

(From this point on I'll call the main branch "master" and your feature branch - "my-feature".)

If you also have "rebase" as the default pull mode in your git config, your git client might have rebased some merges and made copies of somebody else's commits into your branch (at least, VS Code's "Synchronize Changes" button is prone to that):

$ git log my-feature --graph --abbrev-commit --pretty=format:'%h -%d %s'
* 884ddd4 - F1
* ed708d2 - M1

$ git merge other-feature

$ git log my-feature --graph --abbrev-commit --pretty=format:'%h -%d %s'
*   bc5f314 - (HEAD -> my-feature) Merge branch 'other-feature' into my-feature
| * 38c6076 - (other-feature) OF2
| * 8eae6d9 - OF1
* | 884ddd4 - F1
* ed708d2 - M1

$ git pull

$ git log my-feature --graph --abbrev-commit --pretty=format:'%h -%d %s' 
* d08d208 - (HEAD -> my-feature) OF2
* caaddb0 - OF1
* 968fa85 - (origin/my-feature) Remote F
* 884ddd4 - F1
* ed708d2 - M1

$ git log other-feature --graph --abbrev-commit --pretty=format:'%h -%d %s' 
* 38c6076 - (other-feature) OF2
* 8eae6d9 - OF1
* ed708d2 - M1
"other-feature" was merged into "my-feature", then "my-feature" was pulled from "origin" in rebase mode. Because of that "other-feature" commits were copied (look at the hashes of "OF1", and "OF2"). This is not a desirable situation!

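If you're not sure whether rebase is your default pull mode, you can ask git directly. Here is a minimal sketch that runs in a throwaway repository (the temporary directory and the value set in it are only for illustration; in your own repository you would run just the `git config --get pull.rebase` line):

```shell
set -e

# Throwaway repository so the example does not touch real work.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Simulate a setup where pulls rebase by default.
git config pull.rebase true

# Query the setting. When it is not set at all, git prints nothing
# and exits with a non-zero status.
mode=$(git config --get pull.rebase)
echo "pull.rebase = $mode"
```

If the setting turns out to be on, running `git pull --no-rebase` should override it for a single pull.
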
With some git magic, you can rearrange the commits in your history to make it linear and easy to read:

* 6f4629f - (HEAD -> my-feature) F3
* 97fd5fa - F2
* aec96c6 - F1
*   106258f - (master) Merge branch 'other-feature'
| * 1c125de - (other-feature) OF3
| * 6730a54 - OF2
| * 5d598b0 - OF1
* | e6994fe - M3
* | 62abcd4 - M2
* fdb2996 - M1 (42 minutes ago) <Oleg Yamnikov>
All 3 commits of this branch (F1-3) are applied on top of each other and based upon the most recent master commit. This diagram assumes "other-feature" has already been merged into master; otherwise, it's too soon to send the "my-feature" pull request.
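
To check whether a branch already has this shape, you can ask git three questions: which commits are only on the branch, whether any of them is a merge, and whether master's tip is an ancestor of the branch. A rough sketch in a throwaway repository follows. It uses a local "master" because the demo has no remote (in a real repository you would probably compare against origin/master), and the `commit` helper is a hypothetical shortcut of mine, not a git command:

```shell
set -e

# Throwaway repository with a tiny linear history.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "Demo"

# Hypothetical helper: one commit that adds one file named after the message.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit M1
commit M2
git checkout -q -b my-feature
commit F1
commit F2
commit F3

# Subjects of the commits that are on my-feature but not on master.
only_mine=$(git log --pretty=format:%s master..my-feature)

# Number of merge commits in that range; a linear branch has none.
merges=$(git rev-list --merges --count master..my-feature)

# Exit status 0 means master's tip is an ancestor of my-feature.
if git merge-base --is-ancestor master my-feature; then based=yes; else based=no; fi

echo "$only_mine"
echo "merges: $merges, based on master: $based"
```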

Your code reviewer might thank you for that.

So in this post, I'll describe several ways to achieve the result above. Make sure you're on the branch you're going to clean up:

git checkout my-feature

But before we do anything with your branch...

How to back up your branch's state

Before you do anything else described in this post, you should save your current state to make sure you can return to it later without losing any of your work.

1. Make sure you have everything committed. Uncommitted changes will stand in the way or get lost.

Either stash these changes if you know what git stash is:

git stash -u

Or commit them in a "WIP" commit:

git add -A
git commit -m "WIP"

2. Look at the latest commit of your branch:

git show

This is the original HEAD of your branch, the pointer to its current state. Write down the hash of this commit so that you won't lose it.

How to restore from the backup

Now, if later on, you feel like things have started going south, run the following commands:

git rebase --abort  # Only needed if a rebase is still in progress; otherwise git reports an error you can ignore
git reset --hard originalheadhash

Where originalheadhash is the hash you wrote down when making the backup.

You're back to where you started. Now you can try another method to clean up your branch.
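
If you'd rather not rely on a written-down hash, one possible variation is to keep it in a shell variable and also in a backup branch. The sketch below plays the whole round trip in a throwaway repository. The branch name "my-feature-backup" and the `commit` helper are names I made up for the example, and the "cleanup gone wrong" is simulated with a plain reset:

```shell
set -e

# Throwaway repository with a small feature branch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "Demo"

# Hypothetical helper: one commit that adds one file named after the message.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit M1
git checkout -q -b my-feature
commit F1
commit F2

# Backup: remember the original HEAD in a variable and in a branch.
originalheadhash=$(git rev-parse HEAD)
git branch my-feature-backup

# Simulate a cleanup attempt that lost the feature commits.
git reset -q --hard master

# Restore the branch to the saved state.
git reset -q --hard "$originalheadhash"
restored=$(git rev-parse HEAD)
echo "restored to $restored"
```

The sketch skips git rebase --abort because no rebase is in progress there. The backup branch keeps pointing at the old commits until you delete it.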

Option 1: Interactive rebase

Moving commits around. Photo by Bernd Dittrich on Unsplash.

1. Make sure you have the latest version of the main branch:

git fetch

2. Run interactive rebase onto the main branch (the -i flag makes the rebasing interactive):

git rebase -i origin/master

3. This will open the editor that git is configured to use.

Hopefully, you're not stuck in vim at this point, but if you are, exit it, set your git editor as specified in the page linked below, and try step 2 again.

Git - Setup and Config
See "git config core.editor commands"

In the editor you're going to see something similar to the following:

pick dad6cfd F1
pick caaddb0 OF1
pick d08d208 OF2
pick 99bc880 F2
pick 935d834 F3

# Rebase 106258f..935d834 onto 106258f
# Commands:
# p, pick <commit> = use commit
# r, reword <commit> = use commit, but edit the commit message
# e, edit <commit> = use commit, but stop for amending
# s, squash <commit> = use commit, but meld into previous commit
# f, fixup <commit> = like "squash", but discard this commit's log message
# x, exec <command> = run command (the rest of the line) using shell
# b, break = stop here (continue rebase later with 'git rebase --continue')
# d, drop <commit> = remove commit
# l, label <label> = label current HEAD with a name
# t, reset <label> = reset HEAD to a label
# m, merge [-C <commit> | -c <commit>] <label> [# <oneline>]
# .       create a merge commit using the original merge commit's
# .       message (or the oneline, if no original merge commit was
# .       specified). Use -c <commit> to reword the commit message.
# These lines can be re-ordered; they are executed from top to bottom.
# If you remove a line here THAT COMMIT WILL BE LOST.
# However, if you remove everything, the rebase will be aborted.
# Note that empty commits are commented out

First comes the list of the commits in your branch, oldest at the top. Below it are short instructions on how to edit this file. Look closely through the commit list and delete every line whose commit does not belong to the feature of this branch (or replace "pick" with "d" at the start of the line).

You should end up with something like this:

pick dad6cfd F1
d caaddb0 OF1
d d08d208 OF2
pick 99bc880 F2
pick 935d834 F3
I marked "OF1" and "OF2" with "d" to remove them from history.

It's possible that you will only see relevant commits and there will be nothing to remove. That's good news!

4. Save the file and close the editor to let the rebasing start. You will likely encounter conflicts while git is reapplying your commits. Carefully resolve them each time, stage the resolved files with git add, and run git rebase --continue to move on to the next commit.

5. When git says it's finished, breathe out and relax: all that pain has paid off. It's still a good idea to look through all the changes in your pull request to make sure that everything is in order and nothing was forgotten.
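
If you want to rehearse these steps before touching real work, the whole thing can be replayed in a throwaway repository. In the sketch below the editing of the commit list is scripted through the GIT_SEQUENCE_EDITOR environment variable, which git uses to open that list. Everything else is an assumption made for the demo: the `sed` pattern that deletes the "OF" lines, the `commit` helper, the local "master" standing in for origin/master, and the cherry-pick that imitates somebody else's commits being copied into the branch.

```shell
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "Demo"

# Hypothetical helper: one commit that adds one file named after the message.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit M1
git checkout -q -b other-feature
commit OF1
commit OF2

# my-feature starts from master and picks up copies of OF1 and OF2.
git checkout -q -b my-feature master
commit F1
git cherry-pick other-feature~1 other-feature >/dev/null
commit F2
commit F3

# Meanwhile master moves on.
git checkout -q master
commit M2
git checkout -q my-feature

# Instead of editing by hand, delete the lines of the copied commits
# from the list; removing a line drops that commit.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '/ OF[0-9]/d'" git rebase -i master >/dev/null 2>&1

# Only the feature commits remain, now on top of M2.
result=$(git log --pretty=format:%s master..my-feature)
echo "$result"
```

In day-to-day use you would still edit the list by hand; scripting it is mostly useful for experiments like this one.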

Here is a good article about interactive rebase if you want to study it in depth:

Git - Rewriting History

Option 2: Cherry picking

Image by John Jeon from Pixabay.

This option can be easier if your branch contains only a few commits. Essentially it's the same concept: you reset your branch's state to "master" and then reapply all your commits on top in order. The difference is that git-rebase does that automatically, while cherry-picking is the manual counterpart.

1. Make sure you have the latest version of the main branch:

git fetch

2. Reset your branch to the target branch:

git reset --hard origin/master

3. Open the history of your original branch's head and scroll down to the first commit that you want to keep (that is, to the first commit belonging to this feature):

git log originalheadhash

4. Copy that commit's hash and run the following command to copy the entire commit into your new branch:

git cherry-pick copiedhash

Where copiedhash is the hash you've copied.

5. As with the rebase, the conflicts are not out of the picture. If git-cherry-pick notifies you of one, resolve the conflicting changes and run git cherry-pick --continue to finish the commit replaying.

6. Return to step 3, find the next commit from the bottom, and copy it. And so on, until your branch contains all the commits you intend to keep.
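
Here is how the steps above might look end to end in a throwaway repository. The sketch makes a few assumptions: it resets to a local "master" instead of origin/master because there is no remote, it records the hashes in variables at commit time instead of copying them from git log, and the `commit` helper is a made-up shortcut. It also passes both hashes to a single git cherry-pick call, which applies them in the order given.

```shell
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "Demo"

# Hypothetical helper: one commit that adds one file named after the message.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit M1
git checkout -q -b other-feature
commit OF1

# my-feature: F1, then a copy of somebody else's OF1, then F2.
git checkout -q -b my-feature master
commit F1
f1=$(git rev-parse HEAD)
git cherry-pick other-feature >/dev/null
commit F2
f2=$(git rev-parse HEAD)

git checkout -q master
commit M2
git checkout -q my-feature

# Backup, then reset the branch to the target branch.
originalheadhash=$(git rev-parse HEAD)
git reset -q --hard master

# Reapply only the commits that belong to the feature, oldest first.
git cherry-pick "$f1" "$f2" >/dev/null

result=$(git log --pretty=format:%s master..my-feature)
echo "$result"
```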

Which option to choose

Both of the options above are essentially the same.

The first one is more automated: it lets you plan the future history of your branch in full before any history rewriting starts. At the same time, the UI of interactive rebasing can be overwhelming even to intermediate git users.

The second option gives you more control over the process since you perform it step by step. Yet it leaves you prone to, for example, missing some commits or mixing up their order. It can be tedious as well.