Non-cryptographic
CityHash & FarmHash
Two generations of Google’s fast string-hash families. CityHash (Geoff Pike and Jyrki Alakuijala, 2011) was designed as a faster, better-distributed alternative to MurmurHash, with platform-specific code paths. FarmHash (2014) refined the architecture, exposed cleaner APIs (Hash64, Hash128), and added CPU-feature-dispatch wrappers.
At a glance
| Property | Value |
|---|---|
| Output | 32, 64, 128, or 256 bits depending on call |
| Throughput | ~7–15 GiB/s on modern CPUs |
| Year | 2011 (CityHash), 2014 (FarmHash) |
| Standard | None; reference implementations on GitHub (google/cityhash, google/farmhash) |
| Status | Non-cryptographic; superseded for raw speed by xxHash3 |
Where they are used
- Google products internally: the original brief called for Murmur-class speed with measurably better distribution under SMHasher.
- Bazel: content addressing for build outputs (Bazel uses BLAKE3 today, but FarmHash was a step on the way).
- Apache Beam / Dataflow: keys for grouping operations.
- ClickHouse: `cityHash64` and `farmHash64` are standard SQL functions.
- HyperLogLog implementations in C++: popular short-input hashes.
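To make the grouping and HyperLogLog uses concrete, here is a minimal Python sketch of how a 64-bit hash value is typically consumed. The standard library has no CityHash or FarmHash binding, so `hash64` below is a stand-in built on truncated BLAKE2b. The names `shard_for` and `hll_index_and_rank` and the precision `p = 12` are hypothetical, chosen for illustration, and not taken from any of the systems listed.

```python
import hashlib
from typing import Tuple


def hash64(data: bytes) -> int:
    """Stand-in 64-bit hash (truncated BLAKE2b). NOT CityHash or FarmHash."""
    digest = hashlib.blake2b(data, digest_size=8).digest()
    return int.from_bytes(digest, "little")


def shard_for(key: bytes, num_shards: int) -> int:
    """Grouping: equal keys always land in the same shard."""
    return hash64(key) % num_shards


def hll_index_and_rank(key: bytes, p: int = 12) -> Tuple[int, int]:
    """HyperLogLog: split one 64-bit hash into a register index and a rank."""
    h = hash64(key)
    index = h >> (64 - p)                    # top p bits select the register
    rest = h & ((1 << (64 - p)) - 1)         # remaining 64 - p bits
    # Rank is the 1-based position of the leftmost 1-bit in `rest`;
    # an all-zero remainder yields the maximum rank, 64 - p + 1.
    rank = (64 - p) - rest.bit_length() + 1
    return index, rank


shard = shard_for(b"user:42", 16)
index, rank = hll_index_and_rank(b"user:42")
```

In a real pipeline the stand-in would be swapped for a CityHash or FarmHash binding; the modulo and bit-slicing steps stay the same.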
Differences vs MurmurHash
- Better distribution under SMHasher: Google fixed several mid-range Murmur failures.
- CPU dispatching: CityHash had code paths that use the SSE4.2 CRC32 instruction; FarmHash generalized this.
- Larger outputs: CityHash128 and FarmHash128 are first-class, not bolt-ons.
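The CPU-dispatch idea can be sketched as follows. This illustrates the pattern only, in Python for brevity. `hash_portable` is plain 64-bit FNV-1a and `hash_crc_path` combines two `zlib.crc32` passes; both are stand-ins, not CityHash or FarmHash code. The `has_crc32` flag is a hypothetical input standing in for real CPU feature detection. Note also that `zlib.crc32` uses the CRC-32 polynomial, whereas the SSE4.2 instruction computes CRC-32C.

```python
import zlib
from typing import Callable

MASK64 = (1 << 64) - 1
FNV_OFFSET = 0xCBF29CE484222325  # 64-bit FNV offset basis
FNV_PRIME = 0x100000001B3        # 64-bit FNV prime


def hash_portable(data: bytes) -> int:
    """Portable path: 64-bit FNV-1a (stand-in for a generic code path)."""
    h = FNV_OFFSET
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME) & MASK64
    return h


def hash_crc_path(data: bytes) -> int:
    """'Accelerated' path: two CRC-32 passes with different start values
    (stand-in for a code path built on a hardware CRC instruction)."""
    lo = zlib.crc32(data)
    hi = zlib.crc32(data, 0xFFFFFFFF)
    return (hi << 32) | lo


def select_hash(has_crc32: bool) -> Callable[[bytes], int]:
    """Pick an implementation once; callers then use it without branching."""
    return hash_crc_path if has_crc32 else hash_portable


# Hypothetical: assume feature detection reported no CRC support.
active_hash = select_hash(has_crc32=False)
value = active_hash(b"hello")
```

The two paths return different values for the same input, which is why dispatching hashes suit in-memory tables better than persisted data. FarmHash addresses this with its Fingerprint functions, which are documented as stable across platforms.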
And vs xxHash3?
xxHash3 (2019) is faster on most modern CPUs and has cleaner SIMD scalability. CityHash / FarmHash remain in production for legacy reasons and for inter-product reproducibility with Google’s ecosystem.
References
- google/cityhash: reference implementation
- google/farmhash: reference implementation
- SMHasher: the comparison bench every non-crypto hash gets measured against
- xxHash3 · MurmurHash3