How do poor-quality initialization vectors affect the security of CBC mode?

There are two distinct "dangers" with CBC. Remember that CBC works the following way: to encrypt a block, first XOR it with the previous encrypted block. The IV is just the "previous encrypted block" for the very first block to encrypt. The idea is that a block cipher is a deterministic permutation: with the same key and the same input block, you get the same output. The XOR with the previous encrypted block is meant as a "randomization". So the dangers are:

Block collisions.
Chosen-plaintext attacks.

Block collisions are when, through bad luck or lack of randomness, the XOR of a block with the previous block leads to a value which was already obtained beforehand.

For instance, if you use a fixed IV (all-zero or not, it does not matter), then two messages which begin with the same sequence of bytes will yield two encrypted streams which also begin with the same sequence of bytes. This allows outsiders ("attackers") to see that the two files were identical up to some point, which can be pinpointed with block granularity. This is considered a bad thing; encryption is supposed to prevent such kinds of leaks.
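To make the fixed-IV leak concrete, here is a minimal sketch. It assumes a toy 16-byte block cipher (`toy_encrypt_block`, a small Feistel construction over HMAC-SHA256) as a stand-in for a real cipher such as AES; the cipher, the key and the sample messages are all made up for illustration and must not be used for real encryption.

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    # Toy 4-round Feistel permutation standing in for AES. NOT secure.
    left, right = block[:8], block[8:]
    for rnd in range(4):
        f = hmac.new(key, bytes([rnd]) + right, hashlib.sha256).digest()[:8]
        left, right = right, xor(left, f)
    return left + right

def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    # C[i] = E(P[i] XOR C[i-1]), where the IV plays the role of C[-1].
    # No padding here: the plaintext must be a whole number of blocks.
    assert len(iv) == BLOCK and len(plaintext) % BLOCK == 0
    prev, out = iv, b""
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, xor(plaintext[i:i + BLOCK], prev))
        out += prev
    return out

def shared_prefix_blocks(a: bytes, b: bytes) -> int:
    # Number of leading ciphertext blocks that are identical in a and b.
    n = 0
    while (n + 1) * BLOCK <= min(len(a), len(b)) and \
            a[n * BLOCK:(n + 1) * BLOCK] == b[n * BLOCK:(n + 1) * BLOCK]:
        n += 1
    return n

key = b"demo key 16 byte"       # illustrative key
fixed_iv = bytes(BLOCK)         # all-zero IV, reused for every message

# Two messages that share their first two blocks and differ in the third.
msg_a = b"HEADER-BLOCK-001" + b"same second blk!" + b"alpha payload..."
msg_b = b"HEADER-BLOCK-001" + b"same second blk!" + b"bravo payload..."

ct_a = cbc_encrypt(key, fixed_iv, msg_a)
ct_b = cbc_encrypt(key, fixed_iv, msg_b)

leak = shared_prefix_blocks(ct_a, ct_b)
print("identical leading ciphertext blocks:", leak)
```

An observer who sees only `ct_a` and `ct_b` learns that the two plaintexts agree on their first two blocks and diverge in the third, without knowing the key.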

If you use a counter as IV, you may still have such collisions, because counters have structure, and "normal" data also has structure. As an extreme case, suppose that the message to encrypt also begins with a counter (e.g. it is part of a protocol in which messages have a header which begins with a sequence number): the counter-for-IV and that counter may cancel each other out in the XOR, leading you back to the fixed-IV situation. This is bad. We really prefer it when encryption systems provide confidentiality without imposing complex requirements on the plaintext format. A high-resolution clock used as "counter" could incur the same issue.
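The cancellation can be shown with a small sketch. It assumes a hypothetical protocol whose first plaintext block is the message sequence number, and an implementation that uses the same sequence number as IV; the toy block cipher is again only a stand-in for a real one.

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    # Toy 4-round Feistel permutation standing in for AES. NOT secure.
    left, right = block[:8], block[8:]
    for rnd in range(4):
        f = hmac.new(key, bytes([rnd]) + right, hashlib.sha256).digest()[:8]
        left, right = right, xor(left, f)
    return left + right

def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    # C[i] = E(P[i] XOR C[i-1]), where the IV plays the role of C[-1].
    assert len(iv) == BLOCK and len(plaintext) % BLOCK == 0
    prev, out = iv, b""
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, xor(plaintext[i:i + BLOCK], prev))
        out += prev
    return out

def encrypt_message(key: bytes, seq: int, body: bytes) -> bytes:
    # Hypothetical protocol: the header block is the sequence number,
    # and the same sequence number is (unwisely) used as the IV.
    iv = seq.to_bytes(BLOCK, "big")
    header = seq.to_bytes(BLOCK, "big")
    return cbc_encrypt(key, iv, header + body)

key = b"demo key 16 byte"      # illustrative key
body = b"same body block!"     # exactly one block

ct1 = encrypt_message(key, 1, body)
ct2 = encrypt_message(key, 2, body)

# header XOR iv is all zeros for every message, so the cipher always sees
# the same first input block, and everything after it lines up as well.
identical = ct1 == ct2
print("ciphertexts identical despite distinct IVs:", identical)
```

The IVs are all distinct and the headers are all distinct, yet the ciphertexts collide, which is exactly the fixed-IV behaviour.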

Chosen-plaintext attacks are when the attacker can choose part of the data that is to be encrypted. With CBC, if the attacker can predict the IV, then he can adjust his plaintext data to match it.

This is the basis of the BEAST attack. In the BEAST attack, the attacker tries to "see through" SSL. In SSL 3.0 and TLS 1.0, each record is encrypted with CBC, and the IV for each record is the last encrypted block of the previous record: an attacker who observes the wire and is in a position to inject some data into the stream can push just enough bytes to trigger emission of a record, observe it, and thus deduce the IV which will be used for the next record, whose contents will begin with the next byte the attacker pushes.
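The core trick can be sketched as follows. This is a simplified illustration, not the full BEAST attack (which aligns block boundaries so that only one byte is unknown at a time): the class `ChainedCbcStream`, the secret format and the toy block cipher are all assumptions made up for the example.

```python
import hashlib
import hmac
import secrets

BLOCK = 16  # block size in bytes

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    # Toy 4-round Feistel permutation standing in for AES. NOT secure.
    left, right = block[:8], block[8:]
    for rnd in range(4):
        f = hmac.new(key, bytes([rnd]) + right, hashlib.sha256).digest()[:8]
        left, right = right, xor(left, f)
    return left + right

def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    assert len(iv) == BLOCK and len(plaintext) % BLOCK == 0
    prev, out = iv, b""
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, xor(plaintext[i:i + BLOCK], prev))
        out += prev
    return out

class ChainedCbcStream:
    """Each record's IV is the last ciphertext block of the previous record."""
    def __init__(self, key: bytes, first_iv: bytes):
        self._key = key
        self.next_iv = first_iv  # visible to anyone watching the wire

    def encrypt_record(self, plaintext: bytes):
        iv = self.next_iv
        ct = cbc_encrypt(self._key, iv, plaintext)
        self.next_iv = ct[-BLOCK:]
        return iv, ct

stream = ChainedCbcStream(secrets.token_bytes(16), secrets.token_bytes(16))

# The victim sends a one-block secret; the attacker observes IV and ciphertext.
secret_block = b"PIN=4831;pad=xxx"
iv_victim, ct_victim = stream.encrypt_record(secret_block)
target = ct_victim[:BLOCK]

def guess_is_correct(guess: bytes) -> bool:
    # The attacker knows the upcoming IV, so he can craft a block whose
    # cipher input equals (guess XOR iv_victim), i.e. replay the old position.
    probe = xor(xor(guess, iv_victim), stream.next_iv)
    _, ct = stream.encrypt_record(probe)
    return ct[:BLOCK] == target

recovered = None
for pin in range(10_000):
    if guess_is_correct(f"PIN={pin:04d};pad=xxx".encode("ascii")):
        recovered = pin
        break
print("recovered PIN:", recovered)
```

The attacker never touches the key; he only needs to know the next IV before choosing his plaintext, which is why an unpredictable IV stops this.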

Of all the IV generation methods you show, only the first one (IV generated with a cryptographically strong PRNG) will protect you against chosen-plaintext attacks. This is what was added to TLS 1.1.

On a specific situation like your credit cards in a database, some of the possible attacks may or may not apply. However, don't try to "cut corners" too much. If you put user data in the database, then chosen-plaintext attacks may apply: an attacker who can look at your database (e.g. with some SQL injection technique) may also act as a "basic user" to feed you with phony credit card numbers, just to see what shows up in the database.

In particular, in that scenario, if you use deterministic encryption (and that's exactly what you get with a fixed IV, be it all-zeros or not), then the attacker can simply brute-force credit card numbers: a number is 16 digits, but one of them is a checksum, the first four or six digits identify the bank, and the remaining ones are not necessarily "random", so such attacks can be effective.
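Here is a rough sketch of that brute-force search. The function `store_card` is a hypothetical stand-in for the application code that encrypts with a fixed IV, the toy block cipher replaces a real one, and the search space is scaled down (the demo pretends the bank prefix and five more digits are known) so that it finishes instantly; the real search is larger but has the same shape.

```python
import hashlib
import hmac
import secrets

BLOCK = 16  # block size in bytes; a 16-digit card number fills one block

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    # Toy 4-round Feistel permutation standing in for AES. NOT secure.
    left, right = block[:8], block[8:]
    for rnd in range(4):
        f = hmac.new(key, bytes([rnd]) + right, hashlib.sha256).digest()[:8]
        left, right = right, xor(left, f)
    return left + right

def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    assert len(iv) == BLOCK and len(plaintext) % BLOCK == 0
    prev, out = iv, b""
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, xor(plaintext[i:i + BLOCK], prev))
        out += prev
    return out

def luhn_check_digit(partial: str) -> str:
    # Luhn checksum: double every second digit, starting from the rightmost
    # digit of the 15-digit partial number, then pick the digit that makes
    # the total a multiple of 10.
    total = 0
    for i, ch in enumerate(reversed(partial)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)

KEY = secrets.token_bytes(16)   # unknown to the attacker
FIXED_IV = bytes(BLOCK)         # the mistake: same IV for every row

def store_card(number: str) -> bytes:
    # Hypothetical application code; the attacker can call it as a normal
    # user and read the resulting ciphertext back from the database.
    return cbc_encrypt(KEY, FIXED_IV, number.encode("ascii"))

victim_partial = "411111" + "00000" + "4831"
victim_card = victim_partial + luhn_check_digit(victim_partial)
leaked_ciphertext = store_card(victim_card)

def brute_force(leaked: bytes, bank_prefix: str, known_middle: str):
    for n in range(10_000):
        partial = f"{bank_prefix}{known_middle}{n:04d}"
        candidate = partial + luhn_check_digit(partial)
        if store_card(candidate) == leaked:
            return candidate
    return None

recovered_card = brute_force(leaked_ciphertext, "411111", "00000")
print("recovered:", recovered_card)
```

Because the checksum digit is computed rather than guessed, each unknown digit costs a factor of ten and the check digit costs nothing.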

The bottom line is that if you use CBC, then you must use CBC properly, i.e. with a strongly random IV. If you prefer a monotonic counter (or clock), then don't use CBC; instead, use a mode which is known to be perfectly happy with a monotonic counter, e.g. GCM. It is already hard enough to achieve security when cryptographic algorithms are used by the book, so any "creativity" here is to be shunned.
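For the "strongly random IV" part, a minimal sketch could look like this. It assumes Python's `secrets` module as the cryptographically strong PRNG and stores the IV in front of the ciphertext (the IV is not secret); the function name `encrypt_with_fresh_iv` and the toy block cipher are illustrative only, and a real system would call a vetted library with a real cipher.

```python
import hashlib
import hmac
import secrets

BLOCK = 16  # block size in bytes

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    # Toy 4-round Feistel permutation standing in for AES. NOT secure.
    left, right = block[:8], block[8:]
    for rnd in range(4):
        f = hmac.new(key, bytes([rnd]) + right, hashlib.sha256).digest()[:8]
        left, right = right, xor(left, f)
    return left + right

def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    assert len(iv) == BLOCK and len(plaintext) % BLOCK == 0
    prev, out = iv, b""
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, xor(plaintext[i:i + BLOCK], prev))
        out += prev
    return out

def encrypt_with_fresh_iv(key: bytes, plaintext: bytes) -> bytes:
    # Draw a new, unpredictable IV for every single message and keep it
    # next to the ciphertext so that decryption can find it.
    iv = secrets.token_bytes(BLOCK)
    return iv + cbc_encrypt(key, iv, plaintext)

key = secrets.token_bytes(16)
card = b"4111111111111111"      # one block of sample plaintext

first = encrypt_with_fresh_iv(key, card)
second = encrypt_with_fresh_iv(key, card)

# Same key, same plaintext, but the stored values no longer match.
print(first[BLOCK:] != second[BLOCK:])
```

With a fresh IV per message, equal plaintexts stop producing equal ciphertexts, and an attacker cannot line up a chosen plaintext against an IV he has not seen yet.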

And, of course, content which has been encrypted with a given key is no more secret than the key itself. When an attacker has read access to your database, he might have read access to more than the database, in particular to the encryption key itself. It depends on where you store the key, and also on the extent of the attacker's access (SQL injection, stolen backup tape, complete hijack of the front-end system...).

The main reason you use an IV is to prevent the same plain text yielding the same encrypted text twice. With CBC you encrypt your text in blocks. Let's assume you have the following text and each line is a block:

AAAAAA
BBBBBB
CCCCCC
DDDDDD

and

AAAAAA
CCCCCC
EEEEEE
FFFFFF

Without using an IV, the encrypted block for AAAAAA would be the same for both texts. This means that someone who notices that the encrypted files begin with the same block, and who knows how one of the files begins, would also know how the other file begins.

The idea behind an IV is that you never use it twice. It must be unique: if there is a chance that you reuse one, you can run into the previously mentioned situation, where part of the plaintext can be recovered due to similarities with the encrypted version of a known plaintext.

Tags:

Encryption

Cryptography

Databases

Initialisation Vector