Git squash merge: non, no, nein, nee, na, nej, não, net

Today I will tell you why you should say no to git squash merges. Even better, I will show you. First, I will explain what led me to write this post. Then, I will give you steps to reproduce a problem that seems to be ignored, a problem that will eventually bite you if you keep using git squash merges. No spoilers until you get there. Hopefully, you will get my point. Since the aim of this post is to save you time, in the future or even now, this will also be the latest entry in my future-proof series.

Mystery around broken merges

Over the past few weeks, I have noticed an issue on a project I have been working on for some time now. Multiple teams develop multiple services in the same repository, and my team relies on a small part of each of the other teams’ services. One big trunk gets merged into my team’s main development branch so that we have the latest version of the parts each team owns and changes as it wants. So far the flow has been to merge the whole of that trunk, even though we only need a few files to be updated or added on our main branch.

Since I started working on that project I have observed two of these merges and did one myself. Funnily enough, if you are sadistic, we had approximately the same number of conflicts each time: between 600 and 700 files. Even funnier, we had only touched a limited set of common files, in a change that occurred only once, presumably months ago. However, the conflicts kept coming back again and again. The most confusing part is that these conflicts affected files we never touched. At least that’s what we thought…

Encounter of the squash type

The first thing that caught my eye is that when merging changes to a remote branch, the merge would be a squash. I had never used squash merges before. I knew a squash makes one big commit out of multiple ones, and that did not shock me. Then, thinking further, I realised it would create a whole new commit written on top of our branch. This is different behaviour from a regular merge, which would add the new commits on top of ours plus potentially a merge commit, which may contain conflict fixes.

For the longest time, I believed merges were based on file diff comparison. Never had I thought that the commit hash could have any impact. At most I saw it as a value that allows for cherry-picking or creating a branch from a specific commit without undoing work. (I should potentially look into how commit hashes are generated.)

Game Merge Theory

From there, my hypothesis is that applying successive squash merges makes git lose track of the history, because the merge is run using commit hashes for primary comparison instead of file diffs. So I decided to run a simple experiment in 27 simple steps. For the sake of brevity, you will only see the 27 steps with a squash merge applied.

Twenty-seven steps

Step 1: Create a new repository

Here we are going to assume that you are in an empty directory. From there you can run the following command to initialise a local repository.

git init

Step 2: Create your master branch

Now we are going to move a little faster and each step shall contain only the command to run. Exceptions will be made if neither the subtitle nor the command is clear enough.

git checkout -b master

Here we create a master branch and make it the active local branch.

Step 3: Create an empty file named nonsense.txt

touch nonsense.txt

Step 4: Commit nonsense.txt to master

git add nonsense.txt ; git commit -m "added nonsense from master"

Here two commands are run because I really wanted this to fit 27 steps for no reason at all.

Step 5: Create test branch from master

git branch test

Careful, here we are only creating the test branch but are still working on the master branch. All the changes committed to master up to this point are also on test.

Step 6: Update nonsense.txt on master

You can write the line “Hello World” to your nonsense.txt file and save. If you are a command line warrior though you can run the following command:

echo "Hello World" > nonsense.txt

Step 7: Commit nonsense.txt changes to master

git add nonsense.txt ; git commit -m "first change from master"

If you run git log your history should look like this:

[Screenshot: Early history, so far, so good]
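Hashes will differ on your machine, so here is a minimal sketch that replays steps 1 to 7 in a throwaway directory and prints only the commit subjects. The Demo name and email are placeholders I made up so that the commits go through, and git symbolic-ref stands in for step 2 so that the branch is called master whatever your default branch name is.

```shell
set -e
cd "$(mktemp -d)"                          # throwaway directory
git init -q
git symbolic-ref HEAD refs/heads/master    # stands in for step 2
git config user.name "Demo"                # placeholder identity
git config user.email "demo@example.com"   # placeholder identity

# Steps 3 to 7
touch nonsense.txt
git add nonsense.txt ; git commit -q -m "added nonsense from master"
git branch test
echo "Hello World" > nonsense.txt
git add nonsense.txt ; git commit -q -m "first change from master"

echo "master:"
git log --format=%s master   # two subjects, newest first
echo "test:"
git log --format=%s test     # only "added nonsense from master"
```

At this point master holds two commits while test is still sitting on the first one.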

Step 8: Switch to the test branch

git checkout test

Back to the history, you should only have the first commit “added nonsense from master”.

[Screenshot: Test history]

Step 9: Git squash merge number one

Here we will merge master into test using a squash merge.

git merge master --squash

[Screenshot: After your first squash merge]

Fun fact: after I finished writing the post, while reproducing the steps and adding screenshots, I noticed that the history did not change after that first squash merge. I would have assumed that it would be visible. Then again, the previous screenshot does say that HEAD is not updated, so it makes sense in a way. Look:

[Screenshot: Weird stuff]

If we had used a classic merge or even a rebase, we would see that commit in the history. Moving on to the next step!
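To see for yourself what the squash merge left behind, here is a small sketch under the same assumptions as the steps so far (throwaway directory, placeholder Demo identity, git symbolic-ref standing in for step 2). It checks that HEAD did not move, lists what got staged, and looks for a MERGE_HEAD, which is the marker git keeps while a regular merge is in progress.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master    # stands in for step 2
git config user.name "Demo"                # placeholder identity
git config user.email "demo@example.com"   # placeholder identity

# Steps 3 to 8
touch nonsense.txt
git add nonsense.txt ; git commit -q -m "added nonsense from master"
git branch test
echo "Hello World" > nonsense.txt
git add nonsense.txt ; git commit -q -m "first change from master"
git checkout -q test

# Step 9, with HEAD recorded before and after
before="$(git rev-parse HEAD)"
git merge master --squash
after="$(git rev-parse HEAD)"

[ "$before" = "$after" ] && echo "HEAD did not move"
echo "staged files:"
git diff --cached --name-only
if git rev-parse -q --verify MERGE_HEAD > /dev/null; then
  echo "a merge is in progress"
else
  echo "no merge in progress, only staged changes"
fi
```

In other words, the squash merge only puts the changes from master in the index. Whatever you commit next will carry them.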

Step 10: Let’s add a new file test.txt to test

Similarly to step 6, we will create a new test.txt file with the text “this is a test”.

echo "this is a test" > test.txt

Step 11: Commit test.txt changes to test

git add test.txt ; git commit -m "first change from test"

[Screenshot: 2 files changed?]

After the previous git squash merge you would have thought that something would have happened to the history, but nope. The squashed change only gets recorded as part of this commit, the one that was meant to only create test.txt.

Step 12: Git squash merge number two

git merge master --squash

[Screenshot: I did not request anything but ok…]

Once more, HEAD is not updated, and if you look at the logs, you do not see the last commit from master. That makes sense, as the changes were added as part of the commit where we created test.txt. This is all part of how squashing works. You can keep going by running git commit -am "merge from master".

Step 13: Have a break

Have a KitKat. Not sponsored… Yet. Nestlé representatives if you read this, feel free to slide in the DMs.

[Image: Fancy looking KitKat, you earned it!]

Step 14: Replace contents of nonsense.txt from test

Here we will replace the contents of nonsense.txt with “Hello nonsense”. As previously you can either use an editor of your choice or run a command:

echo "Hello nonsense" > nonsense.txt

Step 15: Commit nonsense.txt changes to test

git add nonsense.txt ; git commit -m "second change from test"

[Screenshot: One changed file as expected]

Step 16: Git squash merge the third

git merge master --squash

And you should be able to observe a first conflict.

[Screenshot: First conflict #ItsAlive]

Step 17: Contemplate failure

Not much more to do: there is a conflict, due to the nature of a git squash merge. You thought you had your last synchronisation in step 9, right? Well, surprise! Touching nonsense.txt broke whatever comparison git uses between branches. My gut feeling said it is based on the hashes of the commits and which files changed. To be more precise, git compares both branches against their most recent common ancestor, and a squash merge never records the master commit as a parent. As far as git can tell, the last common point between test and master is still the commit from step 4.

Indeed, in step 7, the contents of nonsense.txt are changed on master after test was branched out in step 5. However, during step 14, the contents of nonsense.txt are changed after merging the master changes into test in step 9. The problem is that the git squash merge in step 9 ends up as a whole new commit, different from the commit created in step 7 on master.

[Screenshot: More history for you]
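If you want to check where git thinks the two branches last met, git merge-base prints that commit. The sketch below replays the steps up to 16 (it skips the second squash merge and the KitKat) and compares the merge base with the very first commit. Same assumptions as before: throwaway directory, placeholder Demo identity, and git symbolic-ref standing in for step 2.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master    # stands in for step 2
git config user.name "Demo"                # placeholder identity
git config user.email "demo@example.com"   # placeholder identity

touch nonsense.txt
git add nonsense.txt ; git commit -q -m "added nonsense from master"
first="$(git rev-parse HEAD)"              # the commit from step 4
git branch test
echo "Hello World" > nonsense.txt
git add nonsense.txt ; git commit -q -m "first change from master"

git checkout -q test
git merge master --squash > /dev/null      # step 9
echo "this is a test" > test.txt
git add test.txt ; git commit -q -m "first change from test"
echo "Hello nonsense" > nonsense.txt
git add nonsense.txt ; git commit -q -m "second change from test"

# Where does git think test and master last met?
base="$(git merge-base test master)"
[ "$base" = "$first" ] && echo "merge base is still the commit from step 4"

# Step 16
if git merge master --squash > /dev/null 2>&1; then
  echo "merged cleanly"
else
  echo "conflict in: $(git diff --name-only --diff-filter=U)"
fi
```

The squash merge from step 9 left no trace in the commit graph, so step 16 compares “Hello nonsense” and “Hello World” against the empty file from step 4.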

Step 18: Fix conflict

You can fix the conflict by selecting to keep “Hello nonsense” as the content of nonsense.txt.

[Screenshot: I use VSCode here]

Then execute the following command to complete your merge:

git commit -am "Fix squash conflict 1"

[Screenshot: Nothing to commit? Are you sure?]

So yeah, git thinks that nothing needs to be committed. If you look at the history, it will look exactly the same as step 17. Moving on!

Step 19: Add a new line to test.txt

We covered text editing and echo commands a few times so do as you please as long as you finish with this content for test.txt.

this is a test
another one

Step 20: Commit changes from test.txt to test 

git add test.txt ; git commit -m "third change from test"

I want to prove a point, wait for the next step.

Step 21: Git squash merge (not May) the fourth

git merge master --squash

[Screenshot: Oh you thought you were done with this? Not even close.]

Step 22: More of the same conflict

You did not change nonsense.txt, yet you have a conflict on it. Again. The very conflict you thought fixed in step 18 is back to haunt you. I could copy paste the explanation from step 17 but you can also scroll back up in case you already forgot. Are you entertained yet?

Step 23: Fix the conflict like it’s step 18!

Select “Hello nonsense” again then run that:

git commit -am "Fix squash conflict 2"

It won’t change anything as you will still get a message telling you that your working tree is clean so there is nothing to commit. Just like in step 18!

Step 24: Git squash merge stack overflow

Now you can repeat the cycle of applying a git squash merge and then fixing it by selecting “Hello nonsense”. Why a cycle? Because even without making any change to test other than the git squash merges, you will get the same conflict showing up again and again.

Fun fact: when you run git commit -am "whatever message" after selecting “Hello nonsense”, it tells you it’s clean and there is nothing to commit. It is all fun and games for 1 or 2 files. And it will happen every single time you make a change. Indeed, it will happen even if you don’t make any changes. This is an annoyance but you may find the strength to live with it. But what about 662 files? Yes, 662 is a very specific number. And yes, a still fresh experience taught me how unpleasant this can become.
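If you would rather not click through the cycle by hand, here is a rough sketch that loops over it three times and counts the conflicts. It assumes a throwaway directory and a placeholder Demo identity, uses git symbolic-ref in place of step 2, and skips the second squash merge and steps 19 to 23. Rewriting the file with echo stands in for selecting “Hello nonsense” in an editor.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master    # stands in for step 2
git config user.name "Demo"                # placeholder identity
git config user.email "demo@example.com"   # placeholder identity

touch nonsense.txt
git add nonsense.txt ; git commit -q -m "added nonsense from master"
git branch test
echo "Hello World" > nonsense.txt
git add nonsense.txt ; git commit -q -m "first change from master"
git checkout -q test
git merge master --squash > /dev/null
echo "this is a test" > test.txt
git add test.txt ; git commit -q -m "first change from test"
echo "Hello nonsense" > nonsense.txt
git add nonsense.txt ; git commit -q -m "second change from test"

conflicts=0
for round in 1 2 3; do
  if ! git merge master --squash > /dev/null 2>&1; then
    conflicts=$((conflicts + 1))
    # Keep our side, exactly like picking "Hello nonsense" in the editor
    echo "Hello nonsense" > nonsense.txt
    git add nonsense.txt
    git commit -m "Fix squash conflict $round" > /dev/null 2>&1 \
      || echo "round $round: nothing to commit"
  fi
done
echo "conflicts seen: $conflicts"
```

Every round ends with nothing to commit, so nothing is recorded and the next round starts from exactly the same place.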

Step 25: Another squash merge for fun

I hope you had fun and spent some time on step 24 and realised that conflict will never really go away. It will not die. Because squash merges breed zombie conflicts. Let’s apply another git squash merge for the lols:

git merge master --squash

Step 26: Switcharoo

You may think you’re cleverer than everyone else and that you can get rid of the conflict by unmaking the change. I have two counterarguments to that. First, unless the change was accidental you probably don’t want to do that. Second, well, try and select “Hello World” to fix the conflict and see what happens. I’ll tell you what happens: step 27.

Notice how after that change my history finally updated:

[Screenshot: Don’t think about it…]

Step 27: The Bamboozling

Yes, after all these conflicts I dealt with only one post merge commit appears. The one where I revert the change that created a conflict that should never appear in the first place.

So one more time, let’s run another git squash merge command, shall we?

[Screenshot: Abandon all hope ye who enter here]

Hahaha! Come on, that’s funny. At least a little. Also only because you’re doing it as practice in a hopefully safe environment. Now, you have a conflict between “Hello World” and “Hello World”. This is the perfect abomination. A conflict between two identical files, lines even, which will come back again and again. Go ahead, fix that conflict and run another git merge master --squash. See what happens.

It doesn’t matter how many times you fix it. It will return and eventually will get the best of you. Well done, you played yourself.

Bonus step: Try without squash

You can start again from step 1 in a clean directory and replace each git merge master --squash with git merge master for a regular merge, or git rebase master for a rebase. See what happens. Note that following the same pattern of steps you will not encounter any conflict.
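For the lazy, here is a sketch of the regular merge variant up to step 16, with the same assumptions as the rest of the post (throwaway directory, placeholder Demo identity, git symbolic-ref in place of step 2). The script stops on the first failing command thanks to set -e, so reaching the end means no merge conflicted.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master    # stands in for step 2
git config user.name "Demo"                # placeholder identity
git config user.email "demo@example.com"   # placeholder identity

touch nonsense.txt
git add nonsense.txt ; git commit -q -m "added nonsense from master"
git branch test
echo "Hello World" > nonsense.txt
git add nonsense.txt ; git commit -q -m "first change from master"

git checkout -q test
git merge -q master            # step 9, regular merge this time
echo "this is a test" > test.txt
git add test.txt ; git commit -q -m "first change from test"
git merge -q master            # step 12
echo "Hello nonsense" > nonsense.txt
git add nonsense.txt ; git commit -q -m "second change from test"
git merge -q master            # step 16

# master's last commit is now part of test's history
[ "$(git merge-base test master)" = "$(git rev-parse master)" ] \
  && echo "git knows test already contains master"
echo "nonsense.txt: $(cat nonsense.txt)"
```

With regular merges the commit from step 7 becomes part of test's own history, so git has nothing left to compare when step 16 comes around.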

Further observations

If you have a long-lived feature branch, do not merge the main branch into it with a squash. A regular merge will be safer, and a rebase can be ok if you have not pushed your branch yet. Applying a rebase to a feature branch can also work if your team is ok with you force pushing to that branch. Merging to the main branch using squashes does not seem to be a major issue, as it should not require synchronisation from secondary branches that are just merged into it.
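If your branch is already stuck in the cycle, one regular merge seems to be enough to get out of it. This is only a sketch under the same assumptions as the steps above (throwaway directory, placeholder Demo identity, git symbolic-ref in place of step 2): it rebuilds the conflicting state, resolves the conflict one last time through a regular merge, then tries a squash merge again.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master    # stands in for step 2
git config user.name "Demo"                # placeholder identity
git config user.email "demo@example.com"   # placeholder identity

touch nonsense.txt
git add nonsense.txt ; git commit -q -m "added nonsense from master"
git branch test
echo "Hello World" > nonsense.txt
git add nonsense.txt ; git commit -q -m "first change from master"
git checkout -q test
git merge master --squash > /dev/null
echo "this is a test" > test.txt
git add test.txt ; git commit -q -m "first change from test"
echo "Hello nonsense" > nonsense.txt
git add nonsense.txt ; git commit -q -m "second change from test"

# One regular merge: it conflicts once more, gets resolved, and is
# committed as a merge commit with master as its second parent
git merge master > /dev/null 2>&1 || true
echo "Hello nonsense" > nonsense.txt
git add nonsense.txt
git commit -q -m "merge master into test"

# master is now an ancestor of test, so a squash merge has nothing to redo
if git merge master --squash > /dev/null 2>&1; then
  squash_status="clean"
else
  squash_status="conflict"
fi
echo "squash merge after the regular merge: $squash_status"
```

The merge commit is what records master in the history of test. That is the one thing a squash merge never does.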

However, personally, I will avoid squash merges until I get a really good reason to use them. That reason would need to go a little further than getting a “cleaner history”, which is true only when you use a GUI for git. If, as a man of science, you use the command line interface, squashed commits change nothing for a git log with some hosting services. For example, on Bitbucket you get all the commits as part of the message of the squashed one, which makes it as readable as usual.

Back to the conflict bit, one might argue that if I do not touch the `nonsense.txt` file to begin with, no error occurs. This is true. But what if you run a replace-all in your project that touches multiple files and commit afterwards? A few minutes later, you realise your error and undo your changes, maybe with a revert commit. No matter what you do, these files will always be in conflict from that point onwards. Conflict forever on files that technically are identical. Is that not a nightmare? To me, this is a dreadful thing to think about.

Closing on that topic

Long story short, using squashed merges on secondary branches as a means of synchronisation is like having sex without a condom. You may feel great. Hell, you may even feel smart for whatever reason. You potentially can have a child you may or may not want. You could catch a benign STD which is unpleasant but can be treated or AIDS which is closer to that zombie conflict from step 27 that will stick with your branch for its entire life until it dies potentially because of it. Think about it. Stay safe, don’t merge and squash.
