Hash Lab

Trivia

Interesting Facts

Naming origins, weird constants, real-world incidents, and surprising uses. Curated, sourced, and where possible linked back to the algorithm catalog.

Names & origins

MD = Message Digest

All MD-family names mean “Message Digest” followed by the version number. MD2 (1989), MD4 (1990), MD5 (1991), MD6 (2008) were authored by Ronald Rivest. MD1 and MD3 were proprietary RSA Inc. internal designs that never reached publication.

SHA = Secure Hash Algorithm

All SHA-family names are by the NSA / NIST. SHA-0 (1993) was silently withdrawn within months because of a defect the NSA never publicly explained, then re-published as SHA-1 (1995) with an extra rotation in the message schedule.

BLAKE descends from ChaCha

Jean-Philippe Aumasson’s BLAKE family (BLAKE → BLAKE2 → BLAKE3) uses the ChaCha quarter-round as its compression mixer. ChaCha itself comes from Daniel J. Bernstein’s Salsa20 (2005); the lineage is one of the cleaner ARX inheritances in symmetric cryptography.

Keccak / SHA-3 was a Belgian win

Joan Daemen (co-designer of Rijndael / AES) co-authored Keccak with Bertoni, Peeters, and Van Assche , making Belgium the only country whose researchers designed two NIST-selected cryptographic primitives (AES and SHA-3).

Whirlpool is named after the galaxy

Barreto and Rijmen named Whirlpool after the Whirlpool Galaxy (M51), to fit the “heavenly bodies” theme NESSIE submissions used (others included Saturn, Crescent, etc.).

Constants & magic numbers

SHA-256 uses cube and square roots of primes

The eight 32-bit initial hash values come from the first 32 bits of the fractional parts of square roots of the first 8 primes. The 64 round constants come from the same for cube roots of the first 64 primes. This is a “nothing-up-my-sleeve” choice: any tampering with the constants would require choosing primes, which is publicly observable.

HMAC ipad = 0x36, opad = 0x5c

The HMAC inner and outer pad constants 0x36 and 0x5c are the byte pair with the maximum possible Hamming distance: every bit position differs. Originally proposed by Bellare, Canetti, and Krawczyk (1996); the choice has held up for thirty years.

FNV picks primes by hand-search

The FNV constants (0x01000193, 0x100000001b3) were chosen by exhaustive search for primes near 224 and 240respectively that produce good distribution under a byte-by-byte mixer. The “Fowler/Noll/Vo” in the name names the three people who did that search.

Keccak's pi step is matrix multiplication mod 5

The π step of Keccak-f permutes the 25 lanes via (x, y) → (y, 2x + 3y mod 5). The matrix [[0, 1], [2, 3]] has order 24 in GL(2, F5), which is exactly the number of Keccak rounds.

Real-world incidents

Flame malware used a chosen-prefix MD5 collision (2012)

The state-sponsored Flame cyber-espionage tool forged a Microsoft Windows code-signing certificate by constructing a chosen-prefix MD5 collision with a legitimate Microsoft Terminal Services certificate. Marc Stevens reverse-engineered the attack and published the cryptanalysis later that year.

SHAttered: two PDFs with the same SHA-1 (2017)

Stevens et al. at Google & CWI produced two PDF documents with identical SHA-1 hashes but completely different visible content. Total cost: about 6,500 CPU-years and 100 GPU-years of compute. Practical chosen-prefix SHA-1 collisions arrived three years later (Shambles, 2020) at a fraction of that cost.

Flickr 2009: signed URLs broken by length-extension

Flickr’s API used md5(secret || params) for API request signatures. Thai Duong and Juliano Rizzo showed that length-extension let anyone with one valid URL forge URLs with extra parameters , without knowing the secret. AWS S3 had a related vulnerability in their query-string authentication.

LinkedIn 2012: unsalted SHA-1 password leak

6.5 million LinkedIn passwords hashed with raw SHA-1, no salt, posted publicly. Within hours, ~90% had been cracked by rainbow tables. The incident is the canonical example used in every password-storage talk since.

Sony PlayStation 3: hardcoded random number

Not exactly a hash attack, but adjacent: Sony's ECDSA signature implementation reused the same nonce for every PS3 firmware signature. fail0verflow recovered Sony’s private signing key by spotting this in 2010, and the same construction made counterfeit signed firmwares trivial to produce.

Apple's NeuralHash was broken in days (2021)

Apple’s perceptual-hash design for client-side CSAM detection was reverse-engineered within hours of disclosure; collisions and preimages followed within days. Apple later shelved the client-side scanning plan, citing different concerns, but the episode is a textbook lesson that perceptual hashes are not cryptographic.

Strange uses

Bitcoin's proof-of-work is a partial preimage search

Mining is: find a nonce such that SHA-256(SHA-256(header)) starts with N leading zero bits. The expected work to satisfy this is 2N. Difficulty adjusts globally to keep the average block time at 10 minutes. The total work currently committed to Bitcoin per block is roughly 280 hash evaluations.

Git's object IDs are content-addressed by design

Two files with the same content always have the same Git object ID. This means cloning a 100-MB repository over and over only transfers the unique objects once; it is also why Git’s SHA-1 collision risk is not just theoretical (an attacker who can get their content into someone’s repository can collide with a specific commit ID).

DNSSEC's NSEC3 uses hashes to hide subdomain existence

Plain NSEC records leak the full list of zone names via signed denial-of-existence. NSEC3 hashes each name with a salted iterated SHA-1 so the zone’s name list is not directly readable from DNS responses. (An offline dictionary attack still works on most public zones.)

Tor onion v3 addresses are public-key hashes

A v3 .onion address is a base32-encoded SHA3-256 of the service’s Ed25519 public key, with a checksum and a version byte. The 56-character length is dictated by 256 hash bits + 2-byte checksum + 1-byte version, base32-encoded.

Genomic minimizers compress entire genomes via hashing

Tools like Mash and sourmash use MinHash over k-mers (short DNA substrings) to compress a 3-billion-base human genome into a few-kilobyte sketch, then estimate genetic similarity by comparing sketches. Two human genomes typically have ≥ 99.5% MinHash similarity at k = 21.

Hash-based proof of storage

Filecoin’s “proof of replication” uses chained hash trees over sectors of replicated client data, so a miner can prove they are holding a unique encoded copy of a file without revealing the file itself. Underneath, Poseidon and SHA-256 both appear.

Curiosities

The birthday paradox needs surprisingly few people

Among 23 random people, the probability that two share a birthday is over 50%. Translated to hashes: an n-bit hash has collisions appear after roughly 2n/2 random samples, not 2n. This is why 128-bit hashes are no longer enough for collision resistance.

Bcrypt has a 72-byte secret password limit

bcrypt is built on Blowfish, whose key schedule accepts at most 72 bytes. Passwords longer than 72 bytes are silently truncated. The fix is to pre-hash the password (e.g., HMAC-SHA-256) before feeding it to bcrypt , producing a fixed-length input that captures the full entropy.

Apple changed bcrypt's identifier

macOS internally writes bcrypt hashes with the identifier $2b$. PHP’s password_hash uses $2y$. OpenBSD originated $2a$. All three are bcrypt; the prefixes track tiny historical sign-extension bugs that were since fixed in slightly different ways.

Ethereum's keccak256 is not SHA3-256

Ethereum predates the NIST SHA-3 finalization (which changed the padding rule to add a domain separator byte). Ethereum’s “keccak256” uses the original Keccak padding and produces a different digest from SHA3-256 for any non-empty input.

Bitcoin’s genesis block has a unique quirk

Block 0 of Bitcoin contains, inside the coinbase transaction’s scriptSig, the text “The Times 03/Jan/2009 Chancellor on brink of second bailout for banks” , both a timestamp anchor and commentary by Satoshi. The block’s SHA-256d hash is 00000000...c26f with an unusual number of leading zeros for its low difficulty.

Cryptography's longest-running paper title

Antoine Joux’s 2004 paper “Multicollisions in Iterated Hash Functions: Application to Cascaded Constructions” forced an entire generation of hash designs to rethink why cascading two hashes is not as strong as their combined output suggests. Cited in nearly every modern hash function paper.