What we don’t know

Open problems

A curated list of open research questions in hash function design and cryptanalysis. Each one is labelled by approximate difficulty: Approachable for a strong masters/PhD project, Research-grade for a multi-year line of work, Long-game for problems that may take a generation.

If you are looking for an open question to attack, this page is a finger pointing at where the open ground is.

ApproachableResearch-gradeLong game

Cryptanalysis of standardized hashes

These are the largest, most-studied targets. Any progress would reshape the field.

Find any SHA-256 collision
Long game
No collision attack better than the 2¹²⁸ generic birthday bound is known. Even reduced-round attacks plateau around half the rounds (~32 of 64) with no clear path to extending. A practical full-SHA-256 collision would be a once-in-a-generation result and would force a global migration roughly comparable to the SHA-1 retirement.
Improve preimage bounds on full SHA-3
Research-grade
Best published preimage attacks cover ~5 of 24 Keccak-f rounds. Closing the gap (even partially, say to 8-10 rounds) would be a top-tier result and would clarify how much margin SHA-3 actually has.
Find a BLAKE3 distinguisher
Research-grade
BLAKE3’s round count (7) is aggressive even after accounting for the tree structure. No distinguisher better than generic is known on full BLAKE3, but several papers note that future cryptanalysis could narrow this margin. A real distinguisher would invite a round-count bump.
Concrete cost of multicollisions in modern Merkle-Damgård hashes
Research-grade
Joux’s 2004 multicollision result shows that finding k-way collisions in an iterated hash costs roughly k · 2^n/2, not k·2^n/2(k!). Concrete time-memory bounds for SHA-256, SHA-512, BLAKE2b under real adversaries (with GPU/ASIC budgets) are surprisingly open.

SNARK-friendly hash design

Algebraic hashes are barely a decade old. The race between new designs and algebraic-attack cryptanalysis is the most active part of symmetric crypto today.

Smaller circuit cost for the same security level
Research-grade
Poseidon, Poseidon2, MiMC, Rescue, Griffin, Anemoi keep trimming constraints. The lower bound for “circuit cost per security bit” is unknown. A 2-3× reduction would be world-changing for ZK rollups.
Provable bounds against Gröbner-basis attacks
Long game
Algebraic attacks (Gröbner bases, FreeLunch, etc.) keep nibbling at SNARK-friendly hashes. Most security arguments are based on best-known attacks plus a margin. Tight upper bounds on polynomial-system solving cost as a function of hash structure are mostly conjectural.
Standardize at least one SNARK-friendly hash
Approachable
Despite production usage in billions of dollars of ZK rollups, no SNARK-friendly hash is in NIST / IETF / ISO. The standardization effort would require pinning a specific field, parameter set, and target proof system , all of which are still moving.

Memory-hard functions

Argon2id is the current champion but the theory under it is incomplete.

Provable memory-hardness against time-memory trade-offs
Long game
Argon2id’s security argument is partly empirical. A clean mathematical proof that no polynomial-time tradeoff exists in the data-dependent phase , under any reasonable model of computation , is an open problem.
Resistance against custom ASICs at fixed RAM cost
Research-grade
Modern GPUs and FPGAs have GB-scale memory. A 1-GiB Argon2id is inconvenient for them but not prohibitive. Designing a memory-hard function whose “dollar cost per password attempt” scales linearly with all reasonable adversary architectures is open.

Post-quantum hash-based signatures

FIPS 205 (SLH-DSA / SPHINCS+) is standardized but slow. Improvements without compromising the security floor are an active area.

Reduce SLH-DSA signature sizes
Research-grade
SLH-DSA signatures range from 8 KiB (SLH-DSA-128f) to ~50 KiB (256s) , orders of magnitude bigger than Dilithium or classical ECDSA. Compressing them without sacrificing the hash-only security floor is an open question.
Faster Winternitz parameter selection
Approachable
Hash-based signatures spend their time in Winternitz one-time signatures. Picking the best w for given verify / sign / size trade-offs is well-understood theoretically but the constant-time engineering is non-trivial.

Perceptual & similarity hashing

Adversarial robustness is the elephant in this room.

Adversarially-robust perceptual hash for CSAM detection
Long game
Every published perceptual hash (PhotoDNA, NeuralHash, pHash, blockhash) has been broken in the adversarial sense. Designing an image fingerprint that is robust to deliberate evasion (re-encoding, small visible perturbations) and to targeted preimage attacks (forcing a benign image to collide with a known bad hash) is open.
Provably-private LSH against membership inference
Research-grade
LSH indexes leak which documents are similar. For applications with sensitive corpora (medical records, legal documents), LSH indexes that satisfy a privacy notion (differential privacy or strong indistinguishability) without destroying retrieval quality are still rough.

Implementation & engineering

These are the unglamorous open problems whose solutions actually ship in shipped systems.

Formally-verified hash implementations beyond HACL*
Approachable
HACL* covers a handful of standard hashes (SHA-2, SHA-3, BLAKE2) with full F* proofs. Extending the coverage to BLAKE3, Poseidon, and modern KDFs while keeping the verification effort tractable is an open engineering goal.
Constant-time bcrypt without ABI breakage
Approachable
Reference bcrypt has data-dependent table lookups. Constant-time bcrypt implementations exist but they differ subtly in cost from the reference, complicating “same hash with cost N” benchmarks across libraries.
Side-channel resistance for sponge hashes on small devices
Research-grade
SHA-3 / SHAKE on 8-bit microcontrollers leak via cache timing and electromagnetic emissions in known ways. Constant-time and constant-EM implementations exist for AES; comparable engineering for Keccak-f[1600] on resource-constrained hardware is still developing.

Standardization & ecosystem

Less technical, more diplomatic, but still genuinely open.

An IETF / NIST track for SNARK-friendly hashes
Approachable
Production deployment is enormous; standardization is zero. The governance question of whose Poseidon parameters become the standard , and how they get chosen , is open.
Open data on long-term hash function failure modes
Approachable
We collectively have a generation’s worth of hash function cryptanalysis. A curated, open dataset of (algorithm, attack, date, cost, references) suitable for trend analysis does not exist. Compiling one would let the field empirically calibrate its expectations.

Cryptanalysis of standardized hashes

Find any SHA-256 collision

Improve preimage bounds on full SHA-3

Find a BLAKE3 distinguisher

Concrete cost of multicollisions in modern Merkle-Damgård hashes

SNARK-friendly hash design

Smaller circuit cost for the same security level

Provable bounds against Gröbner-basis attacks

Standardize at least one SNARK-friendly hash

Memory-hard functions

Provable memory-hardness against time-memory trade-offs

Resistance against custom ASICs at fixed RAM cost

Post-quantum hash-based signatures

Reduce SLH-DSA signature sizes

Faster Winternitz parameter selection

Perceptual & similarity hashing

Adversarially-robust perceptual hash for CSAM detection

Provably-private LSH against membership inference

Implementation & engineering

Formally-verified hash implementations beyond HACL*

Constant-time bcrypt without ABI breakage

Side-channel resistance for sponge hashes on small devices

Standardization & ecosystem

An IETF / NIST track for SNARK-friendly hashes

Open data on long-term hash function failure modes