Today I will tell you why you should say no to git squash merges. Even better, I will show you. First, I will explain what led me to write this post. Then, I will give you steps to reproduce a problem that seems to be ignored, a problem that will eventually bite you if you keep using git squash merges. No spoilers until you get there. Hopefully, you will get my point. Since the aim of this post is to save you time, now or in the future, it is also the latest entry in my future-proof series.
Mystery around broken merges
Over the past few weeks, I have noticed an issue on a project I have been working on for some time now. Multiple teams develop multiple services in the same repository, and the team I am part of relies on a small part of each of those services. One big trunk gets merged into my team’s main development branch so that we have the latest version of the parts each team owns and changes as it sees fit. So far, the flow has been to merge the whole trunk even though we only need a few files to be updated or added on our main branch.
Since I started working on that project, I have observed two of these merges and done one myself. Funnily enough, if you are sadistic, we had approximately the same number of conflicts each time: between 600 and 700 files. Even funnier, we had only touched a limited set of common files, in a change that occurred only once, presumably months ago. Yet the conflicts kept coming back again and again. The most confusing part is that these conflicts affected files we never touched. At least that’s what we thought…
Encounter of the squash type
The first thing that caught my eye was that merges to a remote branch were done as squashes. I had never used squash merges before; I knew they made one big commit out of multiple ones, and that did not shock me. Thinking further, I realised a squash creates a whole new commit written on top of our branch. This is different from a regular merge, which adds the new commits on top of ours plus, potentially, a merge commit that may contain conflict fixes.
For the longest time, I believed merges were based on file diff comparison. Never had I thought that the commit hash could have any impact. At most, I saw it as a value that allows for cherry-picking or creating a branch from a specific commit without undoing work. (Potentially look into how commit hashes are generated.)
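On how commit hashes are generated: a commit's hash is computed from the commit object itself, which records the tree (the snapshot), the parent commit(s), the author, the committer and the message. Here is a minimal sketch of that, assuming a throwaway repository and a placeholder identity (both are mine, not from the project described above):

```shell
# Throwaway repository; the identity below is a placeholder.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

touch nonsense.txt
git add nonsense.txt
git commit -q -m "added nonsense from master"

# The raw commit object: tree, parent(s) if any, author, committer, message.
git cat-file commit HEAD

# Hashing that exact content as a commit object gives back the commit hash.
recomputed=$(git cat-file commit HEAD | git hash-object -t commit --stdin)
actual=$(git rev-parse HEAD)
echo "recomputed: $recomputed"
echo "actual:     $actual"
```

Since the parents are part of what gets hashed, a squash commit, which only has the current branch as its parent, can never be the same commit as anything on the branch it was squashed from.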
Game Merge Theory
From there, my hypothesis is that applying successive squashed merges makes git lose track of the history because the merge is run using commit hashes for primary comparison instead of file diffs. So I decided to run a simple experiment in 27 simple steps. For the sake of brevity, you will only see the 27 steps with a squash merge applied.
Twenty-seven steps
Step 1: Create a new repository
Here we are going to assume that you are in an empty directory. From there you can run the following command to initialise a local repository.
git init
Step 2: Create your master branch
Now we are going to move a little faster and each step shall contain only the command to run. Exceptions will be made if neither the subtitle nor the command is clear enough.
git checkout -b master
Here we create a master branch and make it the active local branch.
Step 3: Create an empty file named nonsense.txt
touch nonsense.txt
Step 4: Commit nonsense.txt to master
git add nonsense.txt ; git commit -m "added nonsense from master"
Here two commands are run because I really wanted this to fit 27 steps for no reason at all.
Step 5: Create test branch from master
git branch test
Careful, here we are only creating test: the active branch is still master, not test.
Step 6: Update nonsense.txt on master
You can write the line “Hello World” to your nonsense.txt file and save. If you are a command line warrior, though, you can run the following command:
echo "Hello World" > nonsense.txt
Step 7: Commit nonsense.txt changes to master
git add nonsense.txt ; git commit -m "first change from master"
If you run git log, your history should show two commits: “first change from master” on top of “added nonsense from master”.
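If you would rather check the state after step 7 from a script, here is a rough sketch. The temporary directory and the placeholder identity are assumptions of mine, and git symbolic-ref stands in for step 2 so that the sketch behaves the same whatever your default branch name is:

```shell
# Throwaway repository; the identity below is a placeholder.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # stand-in for step 2
git config user.name "Demo"
git config user.email "demo@example.com"

# Steps 3 to 7.
touch nonsense.txt
git add nonsense.txt ; git commit -q -m "added nonsense from master"
git branch test
echo "Hello World" > nonsense.txt
git add nonsense.txt ; git commit -q -m "first change from master"

# Commit subjects on master, newest first.
subjects=$(git log --format=%s)
echo "$subjects"
```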
Step 8: Switch to the test branch
git checkout test
Back to the history: you should only have the first commit, “added nonsense from master”.
Step 9: Git squash merge number one
Here we will merge master into test using a squash commit.
git merge master --squash
Fun fact: after I finished writing the post, while reproducing the steps and adding screenshots, I noticed that the history did not change after that first squash merge. I would have assumed that it would be visible. Then again, the output of the command does say “Squash commit -- not updating HEAD”, so it makes sense in a way.
If we had used a classic merge or even a rebase, we would see that commit in the history. Moving on to the next step!
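To see what the squash in step 9 actually leaves behind, here is a small sketch, assuming a throwaway repository and a placeholder identity (my assumptions, with git symbolic-ref standing in for step 2). It checks three things: the number of commits on test, what is staged, and whether git considers a merge to be in progress:

```shell
# Throwaway repository; the identity below is a placeholder.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # stand-in for step 2
git config user.name "Demo"
git config user.email "demo@example.com"

# Steps 3 to 8.
touch nonsense.txt
git add nonsense.txt ; git commit -q -m "added nonsense from master"
git branch test
echo "Hello World" > nonsense.txt
git add nonsense.txt ; git commit -q -m "first change from master"
git checkout -q test

# Step 9: the squash merge.
git merge master --squash

commits_on_test=$(git rev-list --count HEAD)
staged=$(git diff --cached --name-only)
echo "commits on test: $commits_on_test"
echo "staged files:    $staged"
if [ -e .git/MERGE_HEAD ]; then echo "merge in progress"; else echo "no merge in progress"; fi
```

The change from master sits in the index, waiting for whatever commit comes next, and nothing records that it came from master.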
Step 10: Let’s add a new file test.txt to test
Similarly to step 6, we will create a new test.txt file with the text “this is a test”.
echo "this is a test" > test.txt
Step 11: Commit test.txt changes to test
git add test.txt ; git commit -m "first change from test"
After the previous git squash merge, you would have thought that something would have happened to the history, but nope. The change from master only gets acknowledged as part of this commit, the one where test.txt is created.
Step 12: Git squash merge number two
git merge master --squash
Once more, HEAD is not updated, and if you look at the logs, you do not see the last commit from master. Which makes sense, as the changes were added as part of the commit where we created test.txt. This is all part of how squashing works. You can keep going by running git commit -am "merge from master".
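A quick way to confirm where the change from master ended up is to list the files recorded in the commit from step 11 and ask git whether master is part of the history of test. This is only a sketch, assuming a throwaway repository and a placeholder identity (my assumptions, with git symbolic-ref standing in for step 2):

```shell
# Throwaway repository; the identity below is a placeholder.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # stand-in for step 2
git config user.name "Demo"
git config user.email "demo@example.com"

# Steps 3 to 11.
touch nonsense.txt
git add nonsense.txt ; git commit -q -m "added nonsense from master"
git branch test
echo "Hello World" > nonsense.txt
git add nonsense.txt ; git commit -q -m "first change from master"
git checkout -q test
git merge master --squash
echo "this is a test" > test.txt
git add test.txt ; git commit -q -m "first change from test"

# Files recorded in the commit from step 11.
files=$(git diff-tree --no-commit-id --name-only -r HEAD)
echo "$files"

# Is the tip of master part of the history of test?
if git merge-base --is-ancestor master HEAD; then ancestor=yes; else ancestor=no; fi
echo "master is an ancestor of test: $ancestor"

# Step 12: the second squash merge has nothing new to stage.
git merge master --squash
if git diff --cached --quiet; then staged_after=none; else staged_after=some; fi
echo "staged after second squash: $staged_after"
```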
Step 13: Have a break
Have a KitKat. Not sponsored… Yet. Nestlé representatives if you read this, feel free to slide in the DMs.
Step 14: Replace contents of nonsense.txt from test
Here we will replace the contents of nonsense.txt
with “Hello nonsense”. As previously you can either use an editor of your choice or run a command:
echo "Hello nonsense" > nonsense.txt
Step 15: Commit nonsense.txt changes to test
git add nonsense.txt ; git commit -m "second change from test"
Step 16: Git squash merge the third
git merge master --squash
And you should be able to observe a first conflict.
Step 17: Contemplate failure
Not much more to do: there is a conflict due to the nature of a git squash merge. You thought you had your last synchronisation in step 9, right? Well, surprise! Indeed, touching nonsense.txt broke whatever comparison algorithm git uses to compare commits. Gut feeling says it is based on the hash of the commits and which files changed.
Indeed, in step 7, the contents of nonsense.txt changed on master. test only received that change as a squash, which records no link between the two branches, so once test changes nonsense.txt too, git still looks at master as if test had never been synchronised with master.
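One way to look at this is to ask git which commit it considers the common ancestor of the two branches. The sketch below replays steps 1 to 16 in condensed form (it skips the squash from step 12, which stages nothing) and then prints that ancestor. The temporary directory, the placeholder identity and the use of git symbolic-ref for step 2 are my assumptions:

```shell
# Throwaway repository; the identity below is a placeholder.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # stand-in for step 2
git config user.name "Demo"
git config user.email "demo@example.com"

# Steps 3 to 7: two commits on master, test branched off after the first.
touch nonsense.txt
git add nonsense.txt ; git commit -q -m "added nonsense from master"
git branch test
echo "Hello World" > nonsense.txt
git add nonsense.txt ; git commit -q -m "first change from master"

# Steps 8 to 15, condensed: squash, commit, then change nonsense.txt on test.
git checkout -q test
git merge master --squash
echo "this is a test" > test.txt
git add test.txt ; git commit -q -m "first change from test"
echo "Hello nonsense" > nonsense.txt
git add nonsense.txt ; git commit -q -m "second change from test"

# Step 16: the third squash merge stops on a conflict.
git merge master --squash || echo "merge stopped with conflicts"
conflicted=$(git diff --name-only --diff-filter=U)
echo "conflicted: $conflicted"

# The common ancestor of both branches is still the very first commit.
first_commit=$(git rev-list --max-parents=0 master)
merge_base=$(git merge-base test master)
echo "first commit: $first_commit"
echo "merge base:   $merge_base"
```

Because the squash never moves that common ancestor forward, every new merge is compared against the empty nonsense.txt from step 4, with “Hello World” on one side and “Hello nonsense” on the other.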
Step 18: Fix conflict
You can fix the conflict by choosing to keep “Hello nonsense” as the content of nonsense.txt.
Then execute the following command to complete your merge:
git commit -am "Fix squash conflict 1"
So yeah, git thinks that nothing needs to be committed. If you look at the history, it will look exactly the same as step 17. Moving on!
Step 19: Add a new line to test.txt
We covered text editing and echo commands a few times, so do as you please as long as you finish with this content for test.txt:
this is a test
another one
Step 20: Commit changes from test.txt to test
git add test.txt ; git commit -m "third change from test"
I want to prove a point, wait for the next step.
Step 21: Git squash merge (not May) the fourth
git merge master --squash
Step 22: More of the same conflict
You did not change nonsense.txt, yet you have a conflict on it. Again. The very conflict you thought you had fixed in step 18 is back to haunt you. I could copy and paste the explanation from step 17, but you can also scroll back up in case you already forgot. Are you entertained yet?
Step 23: Fix the conflict like it’s step 18!
Select “Hello nonsense” again, then run this:
git commit -am "Fix squash conflict 2"
It won’t change anything as you will still get a message telling you that your working tree is clean so there is nothing to commit. Just like in step 18!
Step 24: Git squash merge stack overflow
Now you can repeat the cycle of applying a git squash merge and then fixing it by selecting “Hello nonsense”. Why a cycle? Because even without making any further change to test, each new git squash merge brings the same conflict back again and again.
Fun fact: when you run git commit -am "whatever message" after selecting “Hello nonsense”, it tells you the working tree is clean and there is nothing to commit. It is all fun and games for 1 or 2 files. And it will happen every single time you make a change. Indeed, it will happen even if you don’t make any changes. This is an annoyance, but you may find the strength to live with it. But what about 662 files? Yes, 662 is a very specific number. And yes, a still fresh experience taught me how unpleasant this can become.
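If you would rather let a loop suffer for you, the cycle can be sketched as below. It rebuilds the state from step 15 in a throwaway repository, then runs three squash merges in a row, keeping “Hello nonsense” each time through git checkout --ours. The temporary directory, the placeholder identity, the git symbolic-ref stand-in for step 2 and the choice of three rounds are all my assumptions:

```shell
# Throwaway repository; the identity below is a placeholder.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # stand-in for step 2
git config user.name "Demo"
git config user.email "demo@example.com"

# Steps 3 to 15, condensed.
touch nonsense.txt
git add nonsense.txt ; git commit -q -m "added nonsense from master"
git branch test
echo "Hello World" > nonsense.txt
git add nonsense.txt ; git commit -q -m "first change from master"
git checkout -q test
git merge master --squash
echo "this is a test" > test.txt
git add test.txt ; git commit -q -m "first change from test"
echo "Hello nonsense" > nonsense.txt
git add nonsense.txt ; git commit -q -m "second change from test"
commits_before=$(git rev-list --count HEAD)

# Three rounds of "squash merge, keep our side, try to commit".
conflicts=0
for round in 1 2 3; do
  if ! git merge master --squash > /dev/null 2>&1; then
    conflicts=$((conflicts + 1))
    git checkout --ours nonsense.txt   # keep "Hello nonsense"
    git add nonsense.txt
    git commit -q -m "Fix squash conflict $round" > /dev/null 2>&1 \
      || echo "round $round: nothing to commit"
  fi
done

commits_after=$(git rev-list --count HEAD)
echo "conflicts: $conflicts"
echo "commits on test before: $commits_before, after: $commits_after"
```

Every round hits the conflict, and since the fix leaves the files exactly as they were, no commit is ever created to remember it.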
Step 25: Another squash merge for fun
I hope you had fun and spent some time on step 24 and realised that conflict will never really go away. It will not die. Because squash merges breed zombie conflicts. Let’s apply another git squash merge for the lols:
git merge master --squash
Step 26: Switcharoo
You may think you’re cleverer than everyone else and that you can get rid of the conflict by unmaking the change. I have two counterarguments to that. First, unless the change was accidental, you probably don’t want to do that. Second, well, try selecting “Hello World” to fix the conflict and see what happens. I’ll tell you what happens: step 27.
Notice how after that change my history finally updated:
Step 27: The Bamboozling
Yes, after all these conflicts I dealt with only one post merge commit appears. The one where I revert the change that created a conflict that should never appear in the first place.
So one more time, let’s run another git squash merge command, shall we?
Hahaha! Come on, that’s funny. At least a little. Also, only because you’re doing it as practice in a hopefully safe environment. Now, you have a conflict between “Hello World” and “Hello World”. This is the perfect abomination. A conflict between two identical files, lines even, which will come back again and again. Go ahead, fix that conflict and run git merge master --squash again. It doesn’t matter how many times you fix it. It will return and eventually will get the best of you. Well done, you played yourself.
Bonus step: Try without squash
You can start again from step one in a clean directory and replace each git merge master --squash with a git merge master for a regular merge or git rebase master for a rebase. See what happens. Note that, following the same pattern of steps, you will not encounter any conflict.
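For the regular merge variant, a condensed replay of steps 1 to 16 could look like the sketch below. As before, the throwaway repository, the placeholder identity and the git symbolic-ref stand-in for step 2 are my assumptions:

```shell
# Throwaway repository; the identity below is a placeholder.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # stand-in for step 2
git config user.name "Demo"
git config user.email "demo@example.com"

# Steps 3 to 8.
touch nonsense.txt
git add nonsense.txt ; git commit -q -m "added nonsense from master"
git branch test
echo "Hello World" > nonsense.txt
git add nonsense.txt ; git commit -q -m "first change from master"
git checkout -q test

# Step 9 with a regular merge (a fast-forward here).
git merge master

# Steps 10 to 15.
echo "this is a test" > test.txt
git add test.txt ; git commit -q -m "first change from test"
git merge master                          # step 12: nothing new to merge
echo "Hello nonsense" > nonsense.txt
git add nonsense.txt ; git commit -q -m "second change from test"

# Step 16: still nothing new on master, so no conflict.
if git merge master; then result="no conflict"; else result="conflict"; fi
echo "$result"

# This time the common ancestor is the tip of master.
merge_base=$(git merge-base test master)
master_tip=$(git rev-parse master)
echo "merge base: $merge_base"
echo "master tip: $master_tip"
```

The regular merge keeps master in the history of test, so git knows the change from step 7 has already been integrated and has nothing left to compare.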
Further observations
If you have a long-lived feature branch, do not merge the main branch into it with a squash. A regular merge will be safer, and a rebase can be ok if you have not pushed your branch yet. Applying a rebase to a feature branch can also work if your team is ok with you force pushing to that branch. Merging to the main branch using squashes does not seem to be a major issue, as it should not require synchronisation from secondary branches that have just been merged.
However, personally, I will avoid squash merges until I get a really good reason to use them. This would need to go a little further than getting a “cleaner history”, which is true only when you use a GUI for git. If, as a man of science, you use the command line interface, squashed commits change nothing for a git log. Besides, with some hosting services, for example Bitbucket, you get all the commits as part of the message of the squashed one, which makes it as readable as usual.
Back to the conflict bit, one might argue that if I do not touch the `nonsense.txt` file to begin with, no error occurs. This is true. But what if you run a replace all in your project that touches multiple files and commit afterwards? A few minutes later, you realise your error and undo your changes, maybe with a commit revert. No matter what you do, these files will always be in conflict from that point onwards. Conflict forever on files that technically are identical. Is that not a nightmare? To me, it is.
Closing on that topic
Long story short, using squashed merges on secondary branches as a means of synchronisation is like having sex without a condom. You may feel great. Hell, you may even feel smart for whatever reason. You potentially can have a child you may or may not want. You could catch a benign STD which is unpleasant but can be treated or AIDS which is closer to that zombie conflict from step 27 that will stick with your branch for its entire life until it dies potentially because of it. Think about it. Stay safe, don’t merge and squash.
If you want to look up more about git, you can have a look at my epic cheat sheet of epicness.