Hash Collision
What is a Hash Collision?
Hash Collision: two distinct inputs that produce the same cryptographic hash value, breaking the integrity, uniqueness, and signature guarantees that depend on the hash function.
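A hash collision is a pair of different inputs that map to identical hash outputs. Because hash functions compress arbitrary data to a fixed size, collisions must exist mathematically (there are more possible inputs than outputs), but a secure hash makes them computationally infeasible to find: for an n-bit digest, a generic birthday search needs on the order of 2^(n/2) hash evaluations. When a function is broken, meaning collisions can be found far faster than that, attackers can forge certificates, manipulate signed documents, or substitute one binary for another. MD5 has been practically collidable since 2004 (Wang et al.), and in 2012 the Flame malware used an MD5 chosen-prefix collision to forge a code-signing certificate that chained to Microsoft's Terminal Server Licensing Service. SHA-1 was weakened by theoretical attacks from 2005 onward and was broken in practice by the SHAttered attack from Stevens et al. in 2017, which produced the first public SHA-1 collision. Modern systems should use SHA-256, SHA-3, or BLAKE2/BLAKE3 for collision-resistant hashing.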
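The idea that collisions always exist, and that only the digest size and the strength of the function make them hard to find, can be illustrated with a deliberately weakened hash. The sketch below is an illustration under stated assumptions, not an attack on any real function: `truncated_digest` and `find_collision` are hypothetical helper names, and the "toy hash" is simply SHA-256 cut down to 24 bits so that a birthday search finishes in a fraction of a second.

```python
import hashlib
from itertools import count


def truncated_digest(data: bytes, bits: int = 24) -> int:
    """Toy hash (assumption for illustration): the top `bits` bits of SHA-256."""
    full = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return full >> (256 - bits)


def find_collision(bits: int = 24):
    """Birthday search: hash distinct inputs until two share a toy digest.

    Returns (first_input, second_input, number_of_inputs_hashed).
    """
    seen = {}  # toy digest -> first input that produced it
    for i in count():
        msg = f"message-{i}".encode()
        digest = truncated_digest(msg, bits)
        if digest in seen:
            return seen[digest], msg, i + 1
        seen[digest] = msg


if __name__ == "__main__":
    a, b, tries = find_collision(24)
    print(f"{a!r} and {b!r} collide on 24 bits after {tries} inputs")
    # Only the truncated toy hash collides; the full SHA-256 digests are printed
    # so the difference between them can be seen.
    print(hashlib.sha256(a).hexdigest())
    print(hashlib.sha256(b).hexdigest())
```

With a 24-bit digest the search typically succeeds after a few thousand inputs, roughly 2^12, which matches the birthday estimate of about 2^(n/2) work for an n-bit digest. The same generic search against the full 256-bit output would need on the order of 2^128 evaluations, which is why an unbroken SHA-256 is considered collision-resistant.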
● Examples
- 01
The 2017 SHAttered PDF pair from Google and CWI showed two PDFs with the same SHA-1 hash.
- 02
Flame (2012) used an MD5 chosen-prefix collision to forge a code-signing certificate that chained to Microsoft's Terminal Server Licensing Service.
● Frequently asked questions
What is a Hash Collision?
Two distinct inputs that produce the same cryptographic hash value, breaking integrity, uniqueness, and signature guarantees that depend on the hash function. It belongs to the Cryptography category of cybersecurity.
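How do you defend against Hash Collision?
Defend against hash collisions by retiring MD5 and SHA-1 wherever collision resistance matters (certificates, digital signatures, file and binary integrity checks) and using SHA-256, SHA-3, or BLAKE2/BLAKE3 instead, as recommended in the full definition above.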
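As one possible illustration of that advice, the sketch below checks data against a trusted SHA-256 digest using only Python's standard library. The helper names `sha256_hex` and `verify_digest` are hypothetical, and the sketch assumes the expected digest was obtained over a channel the attacker cannot tamper with; it is a minimal example, not a complete integrity-verification scheme.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_digest(data: bytes, expected_hex: str) -> bool:
    """Check `data` against a trusted SHA-256 digest given as hex.

    Assumption: `expected_hex` comes from a trusted source, such as a
    signed release manifest.
    """
    actual = sha256_hex(data).encode()
    expected = expected_hex.strip().lower().encode()
    # compare_digest avoids leaking, via timing, where the strings differ.
    return hmac.compare_digest(actual, expected)


if __name__ == "__main__":
    payload = b"release-1.0"
    trusted = sha256_hex(payload)  # stands in for a published digest
    print(verify_digest(payload, trusted))
    print(verify_digest(payload + b" tampered", trusted))
```

Replacing `hashlib.sha256` with `hashlib.sha3_256` or `hashlib.blake2b` would follow the same pattern. Swapping in `hashlib.md5` or `hashlib.sha1` would reintroduce the collision risk described in the definition.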
● Related terms
- cryptography № 247
Cryptographic Hash Function
A deterministic one-way function that maps arbitrary-length input to a fixed-length digest, designed to be collision-, preimage-, and second-preimage-resistant.
- cryptography № 658
MD5
A 128-bit cryptographic hash function designed by Ron Rivest in 1991 and published as RFC 1321 in 1992. It is now broken: practical collisions are trivial to produce, and it must not be used for any security-sensitive purpose.
- cryptography № 1023
SHA-1
A cryptographic hash function producing a 160-bit digest, designed by the NSA in 1995 and now considered broken for collision resistance.
- cryptography № 1024
SHA-256
A 256-bit cryptographic hash function from the SHA-2 family, widely used for digital signatures, TLS, blockchains, and integrity verification.
- cryptography № 321
Digital Signature
A public-key cryptographic mechanism that proves the authenticity, integrity and non-repudiation of a message or document.
- cryptography № 101
BLAKE2
A fast, modern cryptographic hash function specified in RFC 7693, offering security comparable to SHA-3 with significantly higher performance in software.