Principles of cache attacks

Apparently, this topic has aroused great interest among many of you. After doing some more research, I presented the CacheBleed attack to a group of scientists last month. Now I would like to share my results with you and answer my own question.

The three steps of the generic Prime+Probe attack listed above are correct. However, the decisive step is missing: how to deduce the secret key. The attacker cannot perform direct memory access (DMA), and he cannot read the data stored in the cache lines. This is important to understand, because otherwise the entire attack would be much easier.

By measuring time delays, the attacker only learns which cache line was accessed. Furthermore, he knows how RSA is implemented and which algorithms are in use. OpenSSL uses a fixed-window exponentiation algorithm to compute the message m:

m = c^d mod p

We need to understand how this exponentiation algorithm works. The Handbook of Applied Cryptography by Menezes, van Oorschot, and Vanstone suggests the following pseudo-code:

Fixed Exponent Exponentiation Algorithm

Please do not confuse the secret key d with the exponent e in the above algorithm. Since we are only interested in the decryption step, e here is not the public exponent but the secret one, so e = d in our case. The interesting point is the multiplication in step 3.2. It uses precomputed multipliers g_j, which speed up the exponentiation. However, which multiplier is selected depends on the secret key.

If the attacker knows the index of the current multiplier, i.e. which multiplier is used, he knows some bits of the secret key. The value of the multiplier itself is not of interest.
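To make the link between multiplier index and key bits concrete, here is a minimal Python sketch of a left-to-right fixed-window exponentiation. It is my own illustration, not OpenSSL's code: the function name fixed_window_pow, the window size k = 5, and the returned list of indices are assumptions chosen to match the 32 multipliers discussed here.

```python
def fixed_window_pow(g, e, n, k=5):
    """Compute g**e mod n with k-bit windows.

    Returns the result together with the list of table indices used,
    most significant window first (illustrative only).
    """
    # Precompute the multipliers g_j = g**j mod n for j = 0 .. 2**k - 1.
    table = [1 % n] * (1 << k)
    for j in range(1, 1 << k):
        table[j] = (table[j - 1] * g) % n

    # Split the exponent into k-bit windows, most significant first.
    windows = []
    while e > 0:
        windows.append(e & ((1 << k) - 1))
        e >>= k
    windows.reverse()

    a = 1 % n
    for w in windows:
        for _ in range(k):          # k squarings per window
            a = (a * a) % n
        a = (a * table[w]) % n      # the secret-dependent table lookup
    return a, windows
```

The list of indices returned alongside the result is exactly the exponent cut into 5-bit pieces, so anyone who learns which table entry is fetched in each round learns the exponent.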

OpenSSL uses the so-called scatter-gather technique to prevent cache attacks at cache-line granularity. It is predictable where each multiplier is stored. In total there are 32 multipliers, so each needs 5 bits to be identified uniquely. The two most significant bits select the cache line, while the three least significant bits identify the bin. Each cache line consists of eight bins.
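The split of the 5-bit index can be written down in a few lines. This is only a sketch of the simplified layout described above; the function name locate_multiplier is hypothetical and does not appear in OpenSSL.

```python
def locate_multiplier(index):
    """Map a 5-bit multiplier index to (cache_line, bin) in the simplified layout."""
    if not 0 <= index < 32:
        raise ValueError("index must fit in 5 bits")
    cache_line = index >> 3      # two most significant bits: 0..3
    bin_number = index & 0b111   # three least significant bits: 0..7
    return cache_line, bin_number
```

For example, index 23 (binary 10111) lands in cache line 2, bin 7.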

The attacker is able to deduce which bin was accessed during a decryption operation. This reveals three bits of the index of the multiplier used, and thus part of the private key. The missing two bits can be computed thanks to redundancies in RSA keys.
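What the attacker ends up with per window can be sketched as follows. The helper names observed_bin and candidate_indices are made up for this illustration, and the sketch stops before the actual key reconstruction from RSA redundancies, which it does not attempt.

```python
def observed_bin(index):
    """The part of a 5-bit multiplier index that leaks: its three low bits."""
    return index & 0b111


def candidate_indices(bin_number):
    """All 5-bit indices consistent with an observed bin (two high bits unknown)."""
    return [(high << 3) | bin_number for high in range(4)]


# Fraction of each 5-bit window that is revealed directly.
LEAKED_FRACTION = 3 / 5
```

So every observation narrows a window down from 32 possible multipliers to 4, which corresponds to 60% of the exponent bits being known before the redundancy in the key is exploited.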

To sum up, no DMA is performed and the attacker does not read data from the cache. The crucial factor is that the cache position partly reveals the secret key. This is due to secret-dependent memory accesses. Similar attacks, such as those on AES, make use of the cache position as well. The actual data is not of interest, but its position reveals sensitive information.