Searching encrypted storage

The "safe" versions of searching in encrypted data assume at least one of the following:

  • the pattern to search for is also encrypted with the same key (or some kind of related key) as the data itself;
  • the search result ("pattern was found there") is encrypted with the same key (or some kind of related key) as the data itself.

With either of these properties, the search engine does not leak information about the data to someone who would not be able to decrypt it in the first place.
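As a rough illustration of the first property (a pattern protected with a key related to the data key), here is a minimal Python sketch of keyword search over keyed tokens. It is not taken from any particular published scheme: the names `token` and `UntrustedIndex`, the fixed placeholder key, and the toy documents are all assumptions made for the example. The server only ever sees HMAC outputs, so someone without the key cannot form a valid query. Even so, this toy version still reveals which documents share a token and which queries repeat, so it should be read as an illustration of the idea rather than a secure design.

```python
import hashlib
import hmac
from collections import defaultdict


def token(key: bytes, word: str) -> bytes:
    """Keyed, deterministic search token for one keyword (hypothetical helper)."""
    return hmac.new(key, word.lower().encode("utf-8"), hashlib.sha256).digest()


class UntrustedIndex:
    """Server side (hypothetical): stores only opaque tokens and document IDs."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, doc_id, tokens):
        # Record that this document contains each (opaque) token.
        for t in tokens:
            self._postings[t].add(doc_id)

    def search(self, t):
        # Return the IDs of documents that were indexed under this token.
        return sorted(self._postings.get(t, set()))


# Client side. Placeholder key for the example only; a real key would be
# random, e.g. generated with secrets.token_bytes(32).
key = b"\x01" * 32
index = UntrustedIndex()

# Toy plaintexts; the encrypted document bodies themselves are not shown here.
docs = {1: "alice pays bob", 2: "bob pays carol"}
for doc_id, text in docs.items():
    index.add(doc_id, {token(key, w) for w in text.split()})

print(index.search(token(key, "bob")))            # [1, 2]
print(index.search(token(key, "alice")))          # [1]
print(index.search(token(b"\x02" * 32, "bob")))   # [] (wrong key, no match)
```

The last query shows the point of keying the pattern: a token computed under a different key matches nothing, so the index is of no use to a party who could not decrypt the data anyway.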

The ultimate goal is to offload the grunt work of searching to a big cloud system without having to trust that system. Fully homomorphic encryption is the generic, full-blown solution for offloading any kind of work, but the best currently known constructions are utterly unreasonable to apply because the overhead is tremendous (they can be implemented, but there is little point in offloading work to a cloud if the result is slower than a pocket calculator). Encrypted data searching is a specialization: by restricting ourselves to one specific kind of work to offload (i.e. searching), we hope to find algorithms that are lightweight enough to have a practical application.

To my knowledge, the field has not produced anything practical yet (but there is no intrinsic reason why it could not).


An important thing to remember about searching encrypted data is who the data owner is. In a number of the proposals and papers I've seen, the data is owned by the person doing the searching, who is simply using cloud resources to do the search. This is not true of all schemes, however; SADS is one example where it does not hold.

Another important point to check when reading these sorts of papers is whether the scheme requires a trusted third party (TTP). A TTP could mitigate many attacks that would otherwise be possible (such as brute-forcing documents as you describe).

For the state of the art, you'll need to be more specific about your requirements (who owns the data, who can search the data, whether a TTP is acceptable, etc.), as schemes can differ quite a bit depending on those requirements. In addition to SADS, I'd recommend looking at CryptDB.

Tags:

Encryption