Is my developer's home-brew password security right or wrong, and why?

/** Dave's Home-brew Hash^H^H^H^H^Hkinda stupid algorithm */

// user data
$user = '';
$password = '';

// timestamp, "random" #
$time = date('mdYHis'); // known to attackers - totally pointless
// ^ also, as jdm pointed out in the comments, this changes daily. looks broken!

// different hashes for different days? huh? or is this stored as a salt?
$rand = mt_rand().'\n'; // mt_rand is not secure as a random number generator
// ^ it's even less secure if you only ask for a single 31-bit number. and why the \n?

// crypt is good if configured/salted correctly
// ... except you've used crypt on the username? WTF.
$crypt = crypt($user.$time.$rand); 

// hash
function hash_it($string1, $string2) {
    $pass = md5($string1); // why are we MD5'ing the same pass when crypt is available?
    $nt = substr($pass,0,8); // <--- BAD BAD BAD - why shuffle an MD5 hash?!?!?
    $th = substr($pass,8,8);
    $ur = substr($pass,16,8);
    $ps = substr($pass,24,8); // seriously. I have no idea. why?
    // ^ shuffling brings _zero_ extra security. it makes _zero_ sense to do this.
    // also, what's up with the variable names?

    // and now we SHA1 it with other junk too? wtf?
    $hash = 'H'.sha1($string2.$ps.$ur.$nt.$th); 
    return $hash;
}

$hash = hash_it($password, $crypt); // ... please stop. it's hurting my head.

summon_cthulhu();

Dave, you are not a cryptographer. Stop it.

This home-brew method offers no real resistance against brute-force attacks, and gives a false impression of "complicated" security. In reality you're doing little more than sha1(md5(pass) + salt) with a possibly-broken and overly complicated hash. You seem to be under the illusion that complicated code gives better security, but it doesn't. A strong cryptosystem is strong regardless of whether the algorithm is known to an attacker - this fact is known as Kerckhoffs' principle. I realise that it's fun to re-invent the wheel and do it all your own way, but you're writing code that's going into a business-critical application, which is going to have to protect customer credentials. You have a responsibility to do it correctly.
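
To make that concrete, here's roughly what Dave's hash_it() collapses to once the substr() shuffling is written out - the reordering of the MD5 output is just a fixed permutation, so this two-liner (an illustration only, reusing the $password and $crypt variables from Dave's code, not a recommendation) produces the same result:

// Dave's whole "algorithm", de-obfuscated: md5 the password, swap its four
// 8-char chunks into a fixed order, prepend crypt() output, sha1 the lot.
$p = md5($password);
$hash = 'H'.sha1($crypt.substr($p,24,8).substr($p,16,8).substr($p,0,8).substr($p,8,8));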

Stick to tried and tested key derivation algorithms like PBKDF2 or bcrypt, which have undergone years of in-depth analysis and scrutiny from a wide range of professional and hobbyist cryptographers.
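
For example, PHP 5.5+ ships password_hash() and password_verify(), which give you bcrypt with a per-password random salt for free. A minimal sketch of the sort of thing Dave should be writing instead (the cost of 12 is an illustrative choice - tune it to your hardware):

// Hashing: PASSWORD_BCRYPT generates a random salt and embeds it, along
// with the cost factor, in the returned hash string.
$hash = password_hash($password, PASSWORD_BCRYPT, ['cost' => 12]);

// Verification: password_verify() re-derives the hash using the stored
// salt and cost, and compares the result safely.
if (password_verify($password, $hash)) {
    // password is correct
}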

If you'd like a good schooling on proper password storage, check out these links:

  • How to store salt?
  • Do any security experts recommend bcrypt for password storage?
  • How to securely store passwords?

Advantages of a public protocol:

  • Probably written by smarter people than you
  • Tested by a lot more people (probably some of them smarter than you)
  • Reviewed by a lot more people (probably some of them smarter than you), often has mathematical proof
  • Improved by a lot more people (probably some of them smarter than you)
  • The moment just one of those thousands of people finds a flaw, a lot of people start fixing it

Disadvantages of a public protocol:

  • If indeed a flaw is found,
  • AND made public
  • AND not fixed fast enough

then attackers will start searching for vulnerable sites/applications. BUT:

  • They'll probably go after more important targets first
  • When the flaw is exposed, you know and can lock-down
  • Not all flaws are "critical" (most are like "collisions are possible under certain conditions" or "the time to bruteforce is not 10^12 years, but 10^6 years")

Security by obscurity is deemed outright bad. Inventing your own falls under that category.


If Dave is really "your" developer, as in you have the authority to fire him, then you have the authority to direct him to use a better-documented scheme, and you should.

In cryptography, the fewer secrets that are required to be kept, the better. This applies especially to "hard-coded" secrets, such as the hash function itself, which are not secrets as soon as the code containing the secret leaves your building.

This is why open standards for cryptography are a good thing; if the scheme has been around for years or even decades, with the basic algorithm and various implementations publicly known, and still nobody's found a way to get in, it's a good scheme.

Case in point, Polynomial provided a very good breakdown of what's wrong with the scheme, but here are the major failings of the proposed hash algorithm:

  • MD5 is completely broken; while the primary ways in which MD5 is broken (it is extremely vulnerable to collision attacks - a colliding pair of messages can be produced in about 2^24 time) don't make a preimage attack much easier, the hashing algorithm is way too fast to be secure against modern distributed cracking hardware. It should never be used in applications requiring data security.

  • SHA-1 is considered vulnerable; again, there's not an elegant vector for a preimage attack, but there is a strong collision attack (2^61 time for a 160-bit hash), and the hashing algorithm is fast enough, and typical password keyspaces small enough, that current hardware can feasibly brute-force passwords hashed with it.

  • More hashes don't necessarily mean better hashing. Hashes are a deterministic process; they stand in for a "random oracle" in theoretical cryptography, but since they must always produce the same output given the same input, they cannot add any entropy to what is already inherent in the input.

    Stated simply, using a secret combination of multiple hashes as Dave is doing, even if that combination is unknown, doesn't add a significant amount of work to a brute-force (the relative order of three hashes has 6 possibilities, increasing the complexity of trying all possibilities by less than 3 powers of 2), and that extra complexity disappears as soon as an attacker can decompile your code and discover the relative order of the hash functions.

  • Passwords are inherently low-entropy. The best "user-consumable" (non-gibberish) passwords max out in complexity at about 50 bits of entropy, meaning they can be cracked, if you have some idea of what you're looking for, in 2^50 or better time. That makes passwords easier to crack than other things you'd normally hash (like certificate digests), requiring additional "proof of work" to increase their security.

  • This scheme is not adding any significant proof of work; three hashes instead of one, in the grand scheme of things, adds the equivalent of about 1.25 bits of entropy to the scheme in extra required work; you could do better by just requiring passwords to be one character longer.
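
To get a feel for how little work Dave's construction buys compared to a real KDF, here's a rough timing sketch (a hypothetical benchmark, numbers will vary by machine): chaining md5() and sha1() still completes in microseconds per guess, while a single bcrypt hash at a reasonable cost takes tens of milliseconds - several orders of magnitude more work per guess.

// Rough comparison of per-guess cost: fast hashes vs. bcrypt (illustrative only).
$password = 'correct horse battery staple';
$salt = 'some-salt';

$start = microtime(true);
for ($i = 0; $i < 1000; $i++) {
    $fast = sha1($salt.md5($password)); // roughly the work Dave's scheme does per guess
}
printf("Dave-style hash: ~%.6f sec per guess\n", (microtime(true) - $start) / 1000);

$start = microtime(true);
password_hash($password, PASSWORD_BCRYPT, ['cost' => 12]); // one bcrypt hash
printf("bcrypt (cost 12): ~%.3f sec per guess\n", microtime(true) - $start);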

BCrypt, in contrast to all of the above, has as its main advantage an extremely slow key derivation function (KDF). Blowfish, the encryption cipher that forms the basis of BCrypt, uses an "SBox": a large array of predetermined initial values, which is modified by the key and/or IV to produce the initial state of the cipher through which the plaintext is fed. In BCrypt, this SBox setup is made more complex by taking the smaller password and salt values and running them through the "unkeyed" cipher a predetermined number of times, "warming up" the cipher into a state that theoretically could only be produced by performing the same number of iterations of setup with the same password and salt. This "warmed-up" cipher is then fed a constant plaintext to produce the hash value. In CS terms, the hash forms a "proof of work": a computational task that is difficult (time- and/or resource-consuming) to perform, but easy to check for correctness (does the produced hash match the one we already have?).

That "predetermined number" of iterations of SBox setup is configurable; this is BCrypt's real power. A number of "rounds" of key setup are specified when hashing, and for n rounds, 2^n iterations of key setup are performed. That means that incrementing the number of rounds makes the hash perform twice as much work and take twice as long on the same hardware to produce the requisite proof of work. BCrypt therefore can easily keep up with Moore's Law, which has been the bane of many hash functions that came before.

BCrypt's main theoretical weakness is its low memory usage; while computation time can be exponentially increased, the amount of memory required remains effectively constant (the SBox doesn't get any bigger as the iterations increase and no "intermediate results" are required to be kept around). As such, while BCrypt stymies the pure pipelined computational power of a GPU, it remains vulnerable to cracking by more sophisticated "field-programmable gate arrays" (basically a massive, highly modular set of "soft-wired" logic units, that are the current state-of-the-art in highly distributed computing), because the low memory constraints mean you can just throw more relatively-inexpensive circuit boards at the problem.

A newer algorithm, SCrypt, combats FPGA crackers by exponentially increasing the memory demands as well as the computational expense of calculating the hash, so the limited memory available to each FPGA quickly makes them infeasible as well (distributed cracking would basically require large numbers of CPU/FSB/RAM combinations, essentially full computers, hooked together). The only problem there is that SCrypt is only 2 or 3 years old, so it doesn't have the cryptanalytic pedigree of BCrypt (13 years old). And really, if you have to worry about an FPGA-armed cracker getting a password in a feasible amount of time (like before you can change all of them anyway), you have pissed off some very powerful people, and have already exposed another serious vulnerability (allowing an attacker to obtain your stored password hashes in the first place, so they can crack one "offline").

Bottom line, use BCrypt for password hashing. It's 13 years old, based on a 20-year-old cipher algorithm, and in that time no known vulnerabilities have been found that would aid a password cracker. It's slow as molasses, and it can be kept that way as hardware improves by raising the cost setting over time; combine that with a requirement for users to change their password every 90 days or so, so that new passwords keep getting hashed with the more expensive configuration.
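
If you'd rather not rely solely on scheduled password changes, PHP also lets you transparently upgrade the cost whenever a user logs in. A sketch using password_needs_rehash() - here $storedHash, $plaintextPassword, $userId and save_hash_for_user() are placeholders for your own login flow, not part of PHP:

// At login, after password_verify() has succeeded:
$targetCost = 13; // raise this as hardware gets faster

if (password_needs_rehash($storedHash, PASSWORD_BCRYPT, ['cost' => $targetCost])) {
    // Re-hash with the new cost and persist it.
    $storedHash = password_hash($plaintextPassword, PASSWORD_BCRYPT, ['cost' => $targetCost]);
    save_hash_for_user($userId, $storedHash); // hypothetical storage helper
}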