While re-reading The Architecture of Open Source Applications, Volume II, a line in the chapter on git caught my eye:
if two objects are different they will have different SHAs.
This was surprising, as there should always be some very small but nonzero chance of a hash collision. Either the book is simplifying for ease of explanation, or git is doing something behind the scenes that is more complicated than simply hashing the contents of the object.
It turns out that hash collisions are possible.
As expected, an accidental collision is very unlikely: even on a large and active repo, one would not be expected to occur within a timespan comparable to the age of the universe.
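To get a feel for the numbers, here is a rough back-of-the-envelope sketch using the standard birthday approximation. It assumes SHA-1 behaves like a uniform random 160-bit function for accidental collisions; the function names and the one-billion-object repo are my own choices for illustration, not figures from git or the book.

```python
import math

SHA1_BITS = 160  # SHA-1 produces a 160-bit digest


def collision_probability(num_objects, bits=SHA1_BITS):
    """Birthday approximation: p ~= 1 - exp(-n(n-1) / (2 * 2^bits))."""
    space = 2 ** bits
    exponent = num_objects * (num_objects - 1) / (2 * space)
    # expm1 keeps precision when the exponent is extremely small
    return -math.expm1(-exponent)


def objects_for_even_odds(bits=SHA1_BITS):
    """Approximate object count at which a collision becomes a coin flip."""
    return math.sqrt(2 * (2 ** bits) * math.log(2))


# Hypothetical repo with one billion objects (an assumption for illustration)
print(collision_probability(10 ** 9))
print(objects_for_even_odds())
```

Under these assumptions the probability for a billion objects is vanishingly small, and it takes on the order of 2^80 objects before an accidental collision becomes likely.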
Even if an attacker could break SHA-1 or brute-force a hash, git always prefers the older version of an object, so the attacker would be unable to replace an existing file with a tainted version.
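The "older object wins" behaviour can be illustrated with a toy model. This is only a sketch of the idea, not git's actual storage code: ToyObjectStore is a hypothetical class, and it truncates SHA-1 to a single byte purely so that a collision is easy to find for the demonstration.

```python
import hashlib
import itertools


class ToyObjectStore:
    """Hypothetical content-addressed store where the first write wins."""

    def __init__(self, digest_bytes=1):
        self._objects = {}
        # Truncated on purpose so collisions are cheap to find (toy only)
        self._digest_bytes = digest_bytes

    def key_for(self, data):
        return hashlib.sha1(data).digest()[: self._digest_bytes]

    def put(self, data):
        key = self.key_for(data)
        # setdefault leaves an existing entry alone: the older object stays
        self._objects.setdefault(key, data)
        return key

    def get(self, key):
        return self._objects[key]


store = ToyObjectStore()
original = b"trusted contents"
key = store.put(original)

# Brute-force a different payload whose truncated hash matches the original
for i in itertools.count():
    forged = b"tainted %d" % i
    if store.key_for(forged) == key:
        break

store.put(forged)  # the colliding write is silently ignored
print(store.get(key) == original)
```

In this model the attacker can find a colliding payload, but writing it has no effect because the key is already taken by the older object.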