Clean Git up!

How to clean up Git repos

Things accumulate over time, and version control systems can be subject to the same hoarding tendencies that haunt people in their everyday lives.

Developers aren’t immune to clutter, and the same can be said about their Git repositories.

There comes a point where you must clean up Git. Branches, commits and repositories require a good trim occasionally.

In this Git clean up tutorial, we’ll demonstrate how to take a moderately messy repository and reduce it to only a few commits.

Disambiguation note: If you’re looking for information on the git clean command, see our tutorial on how to git clean untracked files.

Steps to clean up Git

A developer can tackle a Git clean up from many angles. Follow these steps to tactically clean your Git repository:

  1. Rebase to avoid messy merge points.
  2. Delete stale Git branches.
  3. Squash lengthy commit histories down to a few commits.
  4. Perform aggressive Git garbage collection.

Clean up Git without a git clean?

Interestingly, one thing we won’t recommend when you clean up Git is using the git clean command.

The git clean command deletes untracked files and, when run with the -x option, ignored files as well. These are often the very files one needs to configure the local environment or move code into production.

The git clean command has the potential to delete important files and directories. Use it with extreme caution.
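
If git clean is ever needed, a dry run shows what it would delete before anything is touched. The following is a minimal sketch in a throwaway repository; the file names (local.env, build/app.bin) are made up for illustration.

```shell
set -e
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "cleanup@example.com"
git config user.name "Cleanup Demo"

echo "tracked" > README.txt
git add README.txt
git commit -q -m "base"

echo "secret=1" > local.env      # untracked local configuration
mkdir build
echo "binary" > build/app.bin    # untracked build output

# -n (--dry-run) lists what would be removed without deleting anything;
# -d extends the listing to untracked directories.
preview=$(git clean -n -d)
echo "$preview"
```

Nothing is deleted until -n is swapped for -f, so the preview is a cheap way to catch a file that should have been kept.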

Clean up Git commits

The first order of business with any Git cleanup operation is to squash lengthy commit histories. In this example repository, there’s only one unshared commit on the master branch, but several on the feature and develop branches. A developer can use the interactive Git rebase tool to squash all commits on a given branch down to one.

These commits and branches will be the target of the Git clean up.

Start with the develop branch. A developer must first check the branch out, and then invoke the git rebase command with a reference that reaches back over the commits to squash. Here, HEAD~5 covers the last five commits on the branch:

cleanup@git:~$ git clone https://gitlab.com/cameronmcnz/squash-commits-example.git git-clean-up
cleanup@git:~$ cd git-clean-up
cleanup@git:~$ git checkout develop 
cleanup@git:~$ git log --graph --branches --oneline
cleanup@git:~$ git rebase --interactive HEAD~5

In the interactive rebase tool, the commit named E dev is set to be the target of the squash by keeping the word pick in front of it. The letter s or the word squash is placed in front of all other commits to indicate that their changes will be folded into the picked commit rather than kept as separate commits.

The interactive rebase tool is a great way to help you clean up Git commits and branches.

When the rebase completes, the commits on the develop branch have all been squashed into one. The new Git commit created in the process is given the name E’.
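
The same kind of squash can be scripted, which makes it easy to rehearse on a scratch repository first. This is a sketch under assumptions, not the exact procedure used above: the commit names A through E are made up, and it assumes a sed that accepts -i.bak. GIT_SEQUENCE_EDITOR and GIT_EDITOR are standard Git environment variables that stand in for the interactive editors.

```shell
set -e
# Throwaway repository; commit and file names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "cleanup@example.com"
git config user.name "Cleanup Demo"

# One base commit plus five commits to squash.
for name in base A B C D E; do
  echo "$name" > "$name.txt"
  git add "$name.txt"
  git commit -q -m "$name"
done

# The todo list is ordered oldest first. Keep 'pick' on line 1 and turn
# every later 'pick' into 'squash'. GIT_EDITOR=true accepts the combined
# commit message without opening an editor.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2,\$s/^pick/squash/'" GIT_EDITOR=true \
  git rebase -q -i HEAD~5

git log --oneline
count=$(git rev-list --count HEAD)
echo "commits after squash: $count"
```

The five commits collapse into one on top of base, while the files they introduced remain in the working tree.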

“Merges are for losers.”
Cameron McKenzie, editor-in-chief of TheServerSide.

The same Git cleanup must also be performed on the three commits in the feature branch. Developers don’t need to use the HEAD~ syntax, either. It’s perfectly acceptable to simply reference the hash ID of the commit from which the branch diverged. In this case, that ID is 953f018.

Each branch has squashed commits, but the code must still be merged.

A developer can initiate the Git clean up on the feature branch with the following commands:

cleanup@git:~$ git checkout feature
cleanup@git:~$ git rebase --interactive 953f018
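
Rather than reading the divergence point off the log graph, git merge-base can compute it. The sketch below builds a small stand-in repository (the commit names F1 to F3 are invented, and the hash it prints will differ from 953f018) and looks up where feature forked from master.

```shell
set -e
# Throwaway repository; branch contents are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch master
git config user.email "cleanup@example.com"
git config user.name "Cleanup Demo"

echo "base" > base.txt
git add base.txt
git commit -q -m "base"
base_commit=$(git rev-parse HEAD)

git checkout -q -b feature
for name in F1 F2 F3; do
  echo "$name" > "$name.txt"
  git add "$name.txt"
  git commit -q -m "$name"
done

git checkout -q master
echo "master" > master.txt
git add master.txt
git commit -q -m "master work"

# The commit from which feature diverged, and how many commits sit on top of it.
fork=$(git merge-base master feature)
ahead=$(git rev-list --count "$fork"..feature)
echo "feature diverged at $fork and is $ahead commits ahead"
```

The value of fork can then be passed to git rebase --interactive in place of a hand-copied hash.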

At this point, the individual branches have been cleaned up, but none of the code has been synchronized through a merge. With three branches, four rebase operations will completely synchronize each branch and ensure that no code will be lost in any future Git cleanup operations.

The Git rebase merge

The four rebase commands needed to synchronize all three branches are as follows:

cleanup@git:~$ git rebase feature develop
cleanup@git:~$ git rebase develop master
cleanup@git:~$ git rebase master feature
cleanup@git:~$ git rebase feature develop

Repeated rebase commands help to clean up commits and flatten branch histories.
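
To see why four rebases leave every branch on the same commit, the sequence can be replayed on a tiny stand-in repository. In this sketch each branch carries one made-up commit on top of a shared base; git rebase <upstream> <branch> checks out <branch> and replays it onto <upstream>.

```shell
set -e
# Throwaway repository; file and commit names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch master
git config user.email "cleanup@example.com"
git config user.name "Cleanup Demo"

commit_file() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

commit_file base
git branch develop
git branch feature
commit_file master-work
git checkout -q develop
commit_file develop-work
git checkout -q feature
commit_file feature-work

# The four rebases from the article, in the same order.
git rebase -q feature develop   # develop now sits on top of feature
git rebase -q develop master    # master now sits on top of develop
git rebase -q master feature    # feature catches up to master
git rebase -q feature develop   # develop catches up as well

master_tip=$(git rev-parse master)
develop_tip=$(git rev-parse develop)
feature_tip=$(git rev-parse feature)
total=$(git rev-list --count master)
git log --graph --branches --oneline
```

The first two rebases stack the work into a single line of history; the last two simply move feature and develop forward to the new tip.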

Clean up Git branches

There’s really no need to have three branches that point to the same commit. An obvious Git branch clean up task is to delete the feature and develop branches with a hard D:

cleanup@git:~$ git branch -D feature
cleanup@git:~$ git branch -D develop

The deletion of unused branches is a smart Git clean up task.
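
Before reaching for the hard D, it can help to check which branches are already contained in master. The sketch below is one way to do that on a stand-in repository; the scratch branch is a made-up example of unmerged work. The lowercase -d switch refuses to delete a branch whose commits are not merged into the current branch, while -D deletes regardless.

```shell
set -e
# Throwaway repository; branch names besides master are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch master
git config user.email "cleanup@example.com"
git config user.name "Cleanup Demo"

echo "base" > base.txt
git add base.txt
git commit -q -m "base"

git branch feature            # points at the same commit as master
git branch develop            # points at the same commit as master
git checkout -q -b scratch
echo "wip" > wip.txt
git add wip.txt
git commit -q -m "unmerged work"
git checkout -q master

# Branches whose tips are reachable from master are safe to remove.
merged=$(git branch --merged master --format='%(refname:short)')
echo "$merged"

git branch -q -d feature develop

# The safe delete refuses to drop unmerged work.
if git branch -d scratch 2>/dev/null; then refused=no; else refused=yes; fi
echo "safe delete of scratch refused: $refused"
remaining=$(git branch --format='%(refname:short)')
```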

Git rebase vs merge

If anyone wonders why I performed four rebases rather than just merge the develop and feature branches into the master before I deleted them, I don’t have a good response other than to invoke Donald Trump and say, “Merges are for losers.”

Yes, merging would have been a simpler way to arrive at this point in the Git clean up. But I doubt you’ll ever see another Git tutorial that does four rebases in a row — which is a reason in itself to do it.

A developer can run the following commands instead of the rebase and delete:

merge@loser:~$ git checkout master
merge@loser:~$ git merge feature
merge@loser:~$ git merge develop
merge@loser:~$ git branch -D feature 
merge@loser:~$ git branch -D develop

One last Git squash rebase

Finally, with no other branches hanging around, you can squash the entire master branch down to two commits. It’s also more common to use the -i switch than the long-winded rebase --interactive. During the interactive rebase, name the new commit base’.

cleanup@git:~$ git rebase -i HEAD~4

After the rebase, the Git repository will be reduced to two lonely commits.

If a developer uses Git Flow, they might want to resurrect the develop and feature branch names at this time:

cleanup@git:~$ git branch develop
cleanup@git:~$ git branch feature

When these two operations are complete, you will have reduced the repository to two commits, with all three branches pointing at the same one.

Developers should also note that although commits and branches have been removed, no files have been lost. Every file present at the tips of the original three branches before this Git clean up began is in base’.
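
One way to confirm that a squash kept every file is to compare the tree hash of the branch tip before and after. In this sketch the squash is done with git reset --soft followed by a new commit, a non-interactive stand-in for the interactive rebase used above; the commit names are invented.

```shell
set -e
# Throwaway repository; commit and file names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "cleanup@example.com"
git config user.name "Cleanup Demo"

for name in root A B C D; do
  echo "$name" > "$name.txt"
  git add "$name.txt"
  git commit -q -m "$name"
done

# Snapshot of the full file tree at the tip, before any history rewriting.
tree_before=$(git rev-parse 'HEAD^{tree}')

# Stand-in for the interactive squash: move the branch back four commits
# while keeping all their changes staged, then commit them as one.
git reset -q --soft HEAD~4
git commit -q -m "base'"

tree_after=$(git rev-parse 'HEAD^{tree}')
count=$(git rev-list --count HEAD)
echo "commits: $count"
echo "tree unchanged: $([ "$tree_before" = "$tree_after" ] && echo yes || echo no)"
```

Identical tree hashes mean the content at the tip is byte-for-byte the same, even though the history behind it is shorter.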

Compare the repo before and after the Git clean up.

Git garbage collection

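With all the Git cleanup operations complete, it’s time to take out the trash. Developers can force a Git garbage collection routine to dispose of the deleted branches and commits. Note that by default Git keeps recently orphaned objects, and anything still referenced by the reflog, for a grace period, so disk space may not be reclaimed immediately: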

cleanup@git:~$ git gc --aggressive
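
For a repository where the discarded work is definitely not needed, the reflog can be expired first so that the collection removes orphaned commits right away. The sketch below demonstrates this on a stand-in repository with a made-up scratch branch. It is deliberately destructive, so treat it as an illustration rather than a routine to run on shared work.

```shell
set -e
# Throwaway repository; the scratch branch is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch master
git config user.email "cleanup@example.com"
git config user.name "Cleanup Demo"

echo "base" > base.txt
git add base.txt
git commit -q -m "base"

git checkout -q -b scratch
echo "junk" > junk.txt
git add junk.txt
git commit -q -m "scratch work"
old=$(git rev-parse HEAD)
git checkout -q master
git branch -q -D scratch

# The deleted branch's commit is still in the object store at this point.
if git cat-file -e "$old" 2>/dev/null; then before=present; else before=gone; fi

# Drop every reflog entry, then collect with no grace period.
git reflog expire --expire=now --all
git gc -q --aggressive --prune=now

if git cat-file -e "$old" 2>/dev/null; then after=present; else after=gone; fi
echo "before gc: $before, after gc: $after"
```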

And that’s how to clean up Git branches and commits.

You can find the source code for this example on GitLab.