Consider a "zero-knowledge" file host such as mega.co.nz. How can such a host prevent users from uploading unencrypted content?

It can't.

There are heuristics to tell whether a file is encrypted, but they're unreliable, and they're useless anyway.

An encrypted file is incompressible because the encryption hides all the patterns. Except when it's not: if you encode an encrypted file in hexadecimal (with two characters to encode each original byte), it takes twice as much room, so you can compress the hexadecimal version by about 50%, but the hexadecimal version isn't any less encrypted. Conversely, if a file is incompressible, it may be encrypted, or it may simply be already compressed with an algorithm that's at least as good as what you have. So in fact an encrypted file may or may not be incompressible, and a plaintext file may or may not be incompressible.
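To make the compressibility argument concrete, here is a minimal sketch using Python's standard zlib module. It assumes that random bytes from os.urandom are a fair stand-in for real ciphertext, and the word list and sizes are arbitrary choices made for illustration only.

```python
import os
import random
import zlib


def compressed_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (zlib, level 9)."""
    return len(zlib.compress(data, 9)) / len(data)


# Assumption: random bytes stand in for well-encrypted data.
ciphertext_like = os.urandom(100_000)

# The same bytes written as hexadecimal text: twice the size, no less secret.
hex_encoded = ciphertext_like.hex().encode("ascii")

# An ordinary plaintext built from a small, made-up vocabulary.
rng = random.Random(0)
vocab = ["giraffe", "upload", "secret", "file", "host", "key", "cipher", "plain"]
plaintext = " ".join(rng.choices(vocab, k=40_000)).encode("ascii")

# The same plaintext after one pass of compression (still not encrypted).
already_compressed = zlib.compress(plaintext, 9)

ratios = {
    "ciphertext-like": compressed_ratio(ciphertext_like),
    "hex of ciphertext": compressed_ratio(hex_encoded),
    "plaintext": compressed_ratio(plaintext),
    "compressed plaintext": compressed_ratio(already_compressed),
}
for name, ratio in ratios.items():
    print(f"{name:22s} {ratio:.2f}")
```

The hex-encoded "ciphertext" should compress roughly by half, while the compressed plaintext should barely compress at all, so a compressibility test misclassifies both.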

An encrypted file contains no recognizable patterns. Except that of course it can contain, for example, a header indicating the type of encryption. Or it can recognizably contain only hexadecimal digits. So in fact an encrypted file may or may not contain recognizable patterns, and a plaintext file may or may not contain recognizable patterns.

An encrypted file may very well be a JPG picture of a giraffe, with information encoded in (say) the low-order bits of each pixel. See steganography.
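As a rough illustration of the low-order-bit trick, the sketch below hides a payload in a buffer of raw pixel samples. The function names embed_lsb and extract_lsb are made up for this example, and it works on uncompressed samples only; a lossy format such as JPG would need a different embedding (for instance in its transform coefficients), which is not shown here.

```python
def embed_lsb(pixels: bytes, message: bytes) -> bytes:
    """Hide the message bits in the lowest bit of successive pixel bytes."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover image too small for message")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # overwrite only the low-order bit
    return bytes(out)


def extract_lsb(pixels: bytes, length: int) -> bytes:
    """Read back `length` bytes from the low-order bits, most significant first."""
    result = bytearray()
    for i in range(length):
        value = 0
        for bit_index in range(8):
            value = (value << 1) | (pixels[i * 8 + bit_index] & 1)
        result.append(value)
    return bytes(result)


cover = bytes(range(256)) * 4        # stand-in for raw pixel samples
secret = b"any payload, encrypted or not"
stego = embed_lsb(cover, secret)
recovered = extract_lsb(stego, len(secret))

# Each sample changes by at most 1, which is invisible to a casual viewer.
max_change = max(abs(a - b) for a, b in zip(cover, stego))
print(recovered, max_change)
```

The host sees what looks like an ordinary image, and nothing in the file format says whether the hidden payload is ciphertext or plaintext.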

Even if a file is encrypted, you can't tell, say, an AES-CBC-encrypted file from an AES-CBC-encrypted file with the key prepended. The only way is to try using the first 16 bytes as the key, and then you'd also need to try every other way the key might have been put in there.
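The key-prepended case can be sketched as follows. Python's standard library has no AES, so this uses a toy SHA-256 counter keystream as a stand-in; it is not AES-CBC and not meant for real use, and toy_stream_xor is a hypothetical helper written only for this illustration.

```python
import hashlib
import os


def toy_stream_xor(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream (toy stand-in, NOT AES-CBC)."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    # zip stops at len(data), so any surplus keystream is ignored.
    return bytes(a ^ b for a, b in zip(data, stream))


key = os.urandom(16)
plaintext = b"This upload is supposed to be unreadable to the host."
ciphertext = toy_stream_xor(key, plaintext)

upload_honest = ciphertext          # the key stays with the user
upload_leaky = key + ciphertext     # key prepended; both uploads look random

# The host only unmasks the second upload if it guesses this exact layout.
guess = toy_stream_xor(upload_leaky[:16], upload_leaky[16:])
print(guess == plaintext)
```

A key appended at the end, split in two, or XORed with a constant would defeat this particular guess, and the host would have to try each layout in turn.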

Let's say that the file really is encrypted and cannot be deciphered with only the uploaded material. The creator of the encrypted file could still publish the key elsewhere.


Idea: force structure, using integrity-checkable files or hybrid encryption.


It is hard to distinguish unencrypted files from files encrypted with symmetric mechanisms if no structure is forced on the files uploaded to the service.

One idea the service could use: force some structure on uploaded files.

For instance, require integrity checking on the ciphertext. This lets the file hosting service ensure that the files it hosts are not corrupted. In addition, integrity checking based on a cryptographic mechanism lets the service ensure that at least some cryptographic processing has been applied to the uploaded file.
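One way such a forced structure might look is sketched below. The container layout (a marker, the body, then a SHA-256 digest of the body) and the names MAGIC, wrap and host_accepts are assumptions made up for this example, not the format of any real service.

```python
import hashlib
import os

MAGIC = b"ENC1"   # hypothetical format marker
DIGEST_LEN = 32   # length of a SHA-256 digest in bytes


def wrap(body: bytes) -> bytes:
    """Client side: build MAGIC + body + SHA-256(body)."""
    return MAGIC + body + hashlib.sha256(body).digest()


def host_accepts(upload: bytes) -> bool:
    """Host side: check the marker and the digest without needing any key."""
    if len(upload) < len(MAGIC) + DIGEST_LEN or not upload.startswith(MAGIC):
        return False
    body = upload[len(MAGIC):-DIGEST_LEN]
    digest = upload[-DIGEST_LEN:]
    return hashlib.sha256(body).digest() == digest


good = wrap(os.urandom(64))                    # ciphertext-like body
damaged = good[:10] + bytes([good[10] ^ 1]) + good[11:]
cheat = wrap(b"plain text, not encrypted")     # well-formed but unencrypted

print(host_accepts(good), host_accepts(damaged), host_accepts(cheat))
```

The check catches corrupted uploads, but a client that follows the format can still wrap plaintext, so it shows that the expected processing was applied, not that the body is actually encrypted.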


If the service insists on being able to recognize whether encryption has been applied, it may wish to require hybrid encryption. In many hybrid cryptographic schemes, the asymmetric encryption part is distinguishable from random data.

What remains a problem: the actual symmetrically encrypted part remains unverifiable if the key is not known. This is kind of obvious: if the file host could decrypt the file, it would have access to the keys, which is exactly what was meant to be avoided.
