Does any published research indicate that preimage attacks on MD5 are imminent?

In cryptography, recommendations are generally not made by predicting the future, since that is impossible. Rather, cryptographers evaluate what is already known and published. To allow for potential future attacks, cryptosystems are generally designed with some safety margin; e.g., cryptographic keys are generally chosen a little longer than absolutely necessary. For the same reason, algorithms are avoided once weaknesses are found, even if these weaknesses are merely certificational.

In particular, RSA Labs recommended abandoning MD5 for signatures as early as 1996, after Dobbertin found collisions in the compression function. Collisions in the compression function do not imply that collisions in the hash function exist, but we can't find collisions for MD5 unless we can find collisions for its compression function. Thus RSA Labs decided that they no longer had confidence in MD5's collision resistance.

Today, we are in a similar situation. If we are confident that a hash function is collision resistant, then we can also be confident that it is preimage resistant. But MD5 has significant weaknesses. Hence many cryptographers (including people like Arjen Lenstra) think that MD5 no longer has the necessary safety margin to be used even in applications that rely only on preimage resistance, and therefore recommend no longer using it. Cryptographers can't predict the future (so don't look for papers doing just that), but they can recommend reasonable precautions against potential attacks. Recommending not to use MD5 anymore is one such reasonable precaution.


We don't know.

This kind of advance tends to come 'all of a sudden' - someone makes a theoretical breakthrough, and finds a method that's 2^10 (or whatever) times better than the previous best.

It does seem that preimage attacks might still be a bit far off; a recent paper claims a complexity of 2^96 for a preimage on a reduced, 44-round version of MD5. However, this isn't a question of likelihood but rather of whether someone is clever enough to take that final step and bring the complexity for the real deal into a realistic range.

That said, since collision attacks are very real already (one minute on a typical laptop), and preimage attacks might (or might not) be just around the corner, it's generally considered prudent to switch to something stronger now, before it's too late.

If collisions aren't a problem for you, you might have time to wait for the NIST SHA-3 competition to come up with something new. But if you have the processing power and bits to spare, using SHA-256 or similar is probably a prudent precaution.
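As a minimal sketch of what such a switch can look like in practice, here is how the two digests compare using Python's standard hashlib module. The helper name digest_hex and the sample message are made up for illustration; they are not from any particular codebase.

```python
import hashlib

def digest_hex(data: bytes, algorithm: str = "sha256") -> str:
    """Return the hex digest of data under the named algorithm.

    Hypothetical helper: defaulting to SHA-256 means callers have to
    ask for MD5 explicitly if they still need it for legacy data.
    """
    h = hashlib.new(algorithm)
    h.update(data)
    return h.hexdigest()

message = b"example message"  # illustrative input only

# Each hex character encodes 4 bits, so this prints the digest size in bits.
print("md5 bits:   ", len(digest_hex(message, "md5")) * 4)
print("sha256 bits:", len(digest_hex(message, "sha256")) * 4)
```

The cost of the change is mostly the extra 16 bytes per stored digest and somewhat more CPU time per hash, which is the "processing power and bits to spare" trade-off mentioned above.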


Cryptographically speaking, MD5's pre-image resistance is already broken; see this paper from Eurocrypt 2009. In this formal context, "broken" means faster than brute force, i.e. attacks having a complexity of less than (2^128)/2 on average. Sasaki and Aoki presented an attack with a complexity of 2^123.4, which is so far purely theoretical, but every practical attack is built on less potent theoretical attacks, so even a theoretical break casts serious doubt on MD5's medium-term security. What is also interesting is that they reuse a lot of the research that has gone into collision attacks on MD5. That nicely illustrates Accipitridae's point that MD5's safety margin on pre-image resistance is gone with the collision attacks.
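To get a feel for why 2^123.4 is still only theoretical, the figures above can be put into numbers. This is a rough back-of-the-envelope sketch; the rate of 10^10 hash evaluations per second is an assumption picked purely for illustration, not a measured figure.

```python
# Exponents (log2 of the work) taken from the figures quoted above.
BRUTE_FORCE_LOG2 = 127.0   # (2^128)/2, the brute-force baseline used above
ATTACK_LOG2 = 123.4        # complexity of the Sasaki-Aoki attack

# Assumed throughput, for illustration only.
ASSUMED_HASHES_PER_SECOND = 1e10

SECONDS_PER_YEAR = 365.25 * 24 * 3600

def speedup_factor(baseline_log2: float, attack_log2: float) -> float:
    """How many times less work the attack needs than the baseline."""
    return 2.0 ** (baseline_log2 - attack_log2)

def years_of_work(work_log2: float, hashes_per_second: float) -> float:
    """Years needed to perform 2^work_log2 hash evaluations at a given rate."""
    return (2.0 ** work_log2) / hashes_per_second / SECONDS_PER_YEAR

gain = speedup_factor(BRUTE_FORCE_LOG2, ATTACK_LOG2)
years = years_of_work(ATTACK_LOG2, ASSUMED_HASHES_PER_SECOND)

print(f"speedup over brute force: about {gain:.1f}x")
print(f"years at the assumed rate: about {years:.2e}")
```

Under these assumptions the attack saves roughly an order of magnitude of work, yet the remaining effort is still astronomically out of reach. The concern is the direction of travel, not the current cost.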

Another reason why MD5 was strongly discouraged for any application as of 2009, and why SHA1 is now, is that most people do not understand exactly which property the security of their use case relies on. You unfortunately proved my point in your question by stating that the 2008 CA attack did not rely on a failure of collision resistance, as caf has pointed out.

To elaborate a bit, every time a (trusted) CA signs a certificate, it also signs possibly malicious data that comes from a customer in the form of a certificate signing request (CSR). In most cases, all the data that is going to be signed can be pre-calculated from the CSR and some external conditions. This has the fatal side effect that the state the hash function will be in when it starts hashing the untrusted data from the CSR is completely known to the attacker, which facilitates a collision attack. Thus an attacker can precompute a CSR that will force the CA to hash and sign data that collides with a shadow certificate known only to the attacker. The CA cannot check the preconditions of the shadow certificate that it would usually check before signing it (for example, that the new certificate does not claim to be a root certificate), as it only has access to the legitimate CSR the attacker provided. Generally speaking, once you have collision attacks and part of your data is controlled by an attacker, you no longer know what else you might be signing besides the data you see.
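The point about the hash state being known in advance can be illustrated with a small sketch. It does not perform any collision attack; it only shows that anyone who can predict the bytes hashed before the untrusted data can reproduce the exact intermediate state the signer will reach. The field contents below are invented placeholders, not a real certificate encoding.

```python
import hashlib

# Hypothetical stand-ins for the data a CA hashes before the
# customer-supplied part (serial number, validity period, ...).
predictable_prefix = b"serial=1001;valid=2009-01-01..2010-01-01;"

# Hypothetical stand-in for the bytes taken from the attacker's CSR.
attacker_data = b"public-key-bytes-chosen-by-the-attacker"

# The attacker hashes the predicted prefix ahead of time. The hash
# object now holds the same internal state the CA's will hold at
# this point in the input.
attacker_state = hashlib.md5(predictable_prefix)

# Resume from that state with the attacker-controlled bytes.
resumed = attacker_state.copy()
resumed.update(attacker_data)

# What the CA computes when it hashes the whole to-be-signed data.
ca_digest = hashlib.md5(predictable_prefix + attacker_data).hexdigest()

print(resumed.hexdigest() == ca_digest)
```

Because the state at the start of the attacker-controlled bytes is fully determined by the prefix, the attacker can search for colliding data offline, starting from exactly that state. Making the prefix unpredictable (for example, through a random serial number) removes that head start.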