Thuy's Blog

Using BFG Repo-Cleaner to delete sensitive files from git history

May 26, 2019

Problem

A few weeks ago, I was very proud to have resolved a security issue in an Android project I was working on. When the project was first created, the release keystore was somehow committed to git and pushed to GitHub. According to Google, this kind of keystore must be kept secure. So how did I eventually fix the issue?

Solution

The solution I chose was BFG Repo-Cleaner. I had already used this tool on a previous Android project, where it worked well and ran fast.

For demo purposes, I created a GitHub repo to experiment with BFG Repo-Cleaner.

The release keystore is the file that we want to delete

In this repo, the branch with the latest commit is dev. The remote branches are:

alice/bugfix0
alice/bugfix1
alice/feature0
bob/bugfixN
bob/feature1
dev
master

I also created 2 tags pointing at commits that still contain the keystore file. I did this intentionally, just for demo purposes. After BFG deletes the keystore file, those commits should no longer contain it.

tags

Be aware that we should rewrite the git history only when nobody else is pushing changes to the repo. This avoids potential conflicts. For example, after we completely delete the keystore file, somebody who does not know about it may force-push their old local history, accidentally exposing the keystore file again. It’s best to inform everyone that we are going to rewrite the git history of the repo. Also, don’t forget to tell everyone to push their latest work to the repo beforehand, because after we finish, we will ask them to re-clone the repo.

It’s also important to keep a local backup of the repo in case we screw something up. If that happens, we can go to the local backup and do a forced push to restore everything to its original state.
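One way to take that backup is a mirror clone, which copies every branch and tag rather than only the checked-out branch. The sketch below is only an illustration: the function name backup_repo, the placeholder URL, and the destination path are my own assumptions, not part of the demo repo.

```shell
#!/bin/bash

# Hypothetical helper: make a full local backup of a repository.
# $1 = URL or path of the repo to back up, $2 = destination directory.
backup_repo() {
  local src="$1" dest="$2"
  # --mirror copies all refs (branches and tags), not just one branch.
  git clone --quiet --mirror "$src" "$dest"
}

# Example (placeholder URL and path):
#   backup_repo git@github.com:your-user/your-repo.git ../your-repo-backup.git
```

Keeping the backup outside the working copy means the later rewrite cannot touch it.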

Delete the file from HEAD

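Now let’s start deleting the keystore file. First, delete the keystore file in your local working copy. Then commit, push, and open a new pull request to merge that deletion into dev. This step matters because, by default, BFG does not modify the contents of the latest commit on HEAD, so the file has to be gone from that commit before the rewrite.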

pr to delete keystore
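The deletion step can be sketched on the command line roughly as follows. The helper name remove_from_head and the branch name remove-keystore are hypothetical, chosen only for illustration; the pull request itself is still opened on GitHub.

```shell
#!/bin/bash

# Hypothetical helper: remove a tracked file on a new branch and commit it.
# $1 = path of the file to delete, $2 = name of the branch to create.
remove_from_head() {
  local file="$1" branch="$2"
  git checkout --quiet -b "$branch"   # separate branch for the pull request
  git rm --quiet "$file"              # delete the file and stage the deletion
  git commit --quiet -m "Remove $file from the repo"
}

# Example, run from the repo root while on dev:
#   remove_from_head release.keystore remove-keystore
#   git push -u origin remove-keystore   # then open the pull request into dev
```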

Check out all remote branches

Next, we need to create a local tracking branch for every remote branch. With big help from an answer on StackOverflow, we can write a bash script like the one below to automate that task:

#!/bin/bash

# Create a local tracking branch for every remote branch except HEAD and dev.
for branch in $(git branch -a | grep remotes | grep -v HEAD | grep -v 'remotes/origin/dev$'); do
  git branch --track "${branch#remotes/origin/}" "$branch"
done

The script iterates over all remote branches. For each remote branch remotes/origin/x, it creates a local branch x that tracks it. The grep -v 'remotes/origin/dev$' part skips the dev branch because we are already on dev; anchoring the pattern keeps it from also skipping other branches whose names merely contain dev. Running the script gives output like this:

clone all remote branches
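As an alternative to parsing the output of git branch -a, the same idea can be sketched with git for-each-ref, skipping any branch that already exists locally instead of excluding dev by name. The function name track_all_remote_branches is hypothetical, and the sketch assumes the remote is called origin.

```shell
#!/bin/bash

# Hypothetical helper: create a local tracking branch for every branch on
# origin that does not already exist locally.
track_all_remote_branches() {
  local ref name
  # %(refname) prints the full ref, e.g. refs/remotes/origin/alice/bugfix0
  git for-each-ref --format='%(refname)' refs/remotes/origin |
  while read -r ref; do
    name="${ref#refs/remotes/origin/}"
    # origin/HEAD points at the default branch; it is not a branch itself.
    [ "$name" = "HEAD" ] && continue
    # Skip branches that already exist locally (such as the current one).
    git show-ref --verify --quiet "refs/heads/$name" && continue
    git branch --quiet --track "$name" "origin/$name"
  done
}

# Example, run from the repo root:
#   track_all_remote_branches
```

Because existing local branches are skipped, the function can be run again without errors if new remote branches appear.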

Rewrite git history using the BFG tool

The next step is to run BFG to delete the keystore file completely from git history. Download the jar file from the BFG homepage and put it into the root folder of the repo. At the time of writing, the jar I downloaded was bfg-1.13.0.jar. Now, let’s run the following command to delete the keystore file:

$ java -jar bfg-1.13.0.jar --delete-files release.keystore

After that, BFG prints out a lot of information about how the git history has changed.

bfg output
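BFG only rewrites commits; the old objects stay in the local repo until git discards them. The BFG documentation suggests expiring the reflog and running a garbage collection afterwards, and it is also worth checking that no commit still touches the file. A minimal sketch, with hypothetical helper names:

```shell
#!/bin/bash

# Hypothetical helper: print how many commits reachable from any local
# branch or tag still add, change, or delete the given path.
commits_touching() {
  git rev-list --branches --tags --count -- "$1"
}

# Hypothetical helper: drop the old, now-unreferenced objects from the
# local repo, using the two commands the BFG documentation suggests.
purge_old_objects() {
  git reflog expire --expire=now --all
  git gc --quiet --prune=now --aggressive
}

# Example, run from the repo root after BFG finishes:
#   purge_old_objects
#   commits_touching release.keystore
```

If commits_touching still reports a non-zero count after the rewrite, some branch or tag still has the file in its history.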

Force-push to update the remote

Now, it’s time for some forced pushes. GitHub allows us to protect branches, which prevents them from being force-pushed. So, remember to disable that protection in your repo before following the next steps. Since we have a lot of modified local branches, force-pushing each one manually can be tedious and time-consuming (especially for a repo with many branches). We can automate that work with a new bash script named force-push-all.sh like this:

#!/bin/bash

# Check out each local branch in turn and force-push it to its upstream.
for branch in $(git branch --format='%(refname:short)'); do
  git checkout "$branch" || exit 1
  git push -f
done

Basically, this script iterates over all local branches, checking out each one and force-pushing it. Another big thank-you goes to this StackOverflow answer for the --format option, which ensures the asterisk marking the current branch (e.g. * dev) is not printed. This lets the loop walk through all local branches correctly. Let’s execute the script:

force push

So, we have finished overwriting all remote branches. For the remote tags, a single forced push is enough:

$ git push --tags -f

force push tags
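The branch loop and the tag push can also be sketched as two commands that need no checkouts, since git push accepts --all for every local branch and --tags for every tag. The function name force_push_everything is hypothetical, and the remote name defaults to origin as an assumption.

```shell
#!/bin/bash

# Hypothetical helper: force-push every local branch, then every tag,
# to the given remote (defaults to origin) without checking anything out.
force_push_everything() {
  local remote="${1:-origin}"
  git push --quiet --force --all "$remote"    # all refs under refs/heads/
  git push --quiet --force --tags "$remote"   # all refs under refs/tags/
}

# Example, run from the repo root:
#   force_push_everything origin
```

The two pushes stay separate because git does not allow --all and --tags in the same command.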

We are all done now. The release.keystore file is gone from the git history. It’s time to tell everybody to re-clone the repo.

done


Thuy Trinh

Written by Thuy Trinh who lives and works in Frankfurt, Germany building robust Android apps.