Designing a cryptographic file-sharing protocol

Making your own crypto is fine as long as you understand that it is for learning, not for using.

There are several "layers" in cryptography. There are algorithms, like RSA, AES, SHA-256... Then there are protocols, which assemble algorithms together. And then there are implementations, which turn protocols into executable code. For a first grasp of cryptography, I usually recommend going to implementation first: take an existing protocol and implement it. This has some nice benefits, including the possibility of testing your implementation against others, which is a great help in tracking down bugs. It will also give you some insight into how protocols are structured, e.g. to support streamed operation.

Since you are concentrating on the sending of a file as a single message, the model appears to be close to the problem that OpenPGP purports to solve; therefore, making a working implementation of OpenPGP in C# is what I recommend.

If you still choose to make your own protocol right away, then the following can be said about your choices:

  • Your key sizes are overkill. 256-bit AES keys are quite useless in practice, since 128-bit keys are already quite far beyond that which is breakable with existing technology (e.g. see this answer). Similarly, 4096-bit RSA keys are oversized; 2048 bits are already more than enough to ensure security. Since larger keys mean reduced performance, oversized keys imply an unrealistic slowness that will not be representative of what can be achieved, thus lowering the pedagogical value of the whole endeavour.
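
    As a rough C# illustration of those sizes (a sketch, not a complete design; RSA key generation is the slow part):

        using System.Security.Cryptography;

        static class Keys
        {
            public static void Generate()
            {
                // 2048-bit RSA: more than enough security, much faster than 4096.
                using (var rsa = new RSACryptoServiceProvider(2048))
                // 128-bit AES: already far beyond what existing technology can break.
                using (var aes = Aes.Create())
                {
                    aes.KeySize = 128;
                    aes.GenerateKey();
                }
            }
        }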

  • Conversely, 1000 rounds of PBKDF2 could be considered too small. For a password-to-key conversion, the iterations are there as a "muscle show", to cope with the inherent weakness of the password (that is, the inherent weakness of the meat bag who has to remember it), so you want the iteration count to be as high as is tolerable, which depends on the power of your computer and on your patience. A typical count, with today's computers, would be around 100,000 or even a few million.
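
    A minimal C# sketch, using the framework's PBKDF2 implementation (Rfc2898DeriveBytes) with a deliberately high iteration count:

        using System.Security.Cryptography;

        static class PasswordKdf
        {
            public static byte[] DeriveKey(string password, byte[] salt)
            {
                // 1000 was the PKCS#5 suggestion of circa 1999; on today's
                // hardware, aim for 100000 or more, as high as stays tolerable.
                using (var kdf = new Rfc2898DeriveBytes(password, salt, 100000))
                {
                    return kdf.GetBytes(32);  // 256 bits of key material
                }
            }
        }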

  • You use a MAC for the encryption of the file, but not for the encryption of your private key. This looks strange. There are situations where a MAC is not necessary, but most contexts where encryption is warranted also call for a MAC. It would be simpler to use the same format for both encryptions.

  • You apply the MAC on the encrypted file: that's encrypt-then-MAC, and that's good -- and you did think of making the encryption IV part of the MAC input, which is even better (that's a classic mistake which you neatly avoided). You might want, though, to make some allowance for future algorithm agility: maybe, at some point, you will want to use another symmetric encryption algorithm. In that case, some symbolic identifier for the encryption algorithm should also be part of the MAC input; this can be as simple as a first byte of value 0x00, leaving you room for 255 other future algorithms.
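
    A possible sketch of such a MAC input in C#, assuming a hypothetical convention where the first byte 0x00 designates the current AES-CBC scheme:

        using System.Security.Cryptography;

        static class Tagging
        {
            // HMAC over: algorithm identifier || IV || ciphertext.
            public static byte[] ComputeTag(byte[] macKey, byte algId,
                                            byte[] iv, byte[] ciphertext)
            {
                using (var hmac = new HMACSHA256(macKey))
                {
                    hmac.TransformBlock(new byte[] { algId }, 0, 1, null, 0);
                    hmac.TransformBlock(iv, 0, iv.Length, null, 0);
                    hmac.TransformFinalBlock(ciphertext, 0, ciphertext.Length);
                    return hmac.Hash;
                }
            }
        }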

  • You use HMAC with SHA-512, which is not bad, but SHA-256 is not bad either and will offer substantially better performance on 32-bit systems. There again, the race to the largest outputs and key sizes is artificial and sterile. SHA-256 is more than "secure enough".

  • CBC mode and a separate MAC are "old style". There are modern encryption modes which combine the encryption and the MAC within a single primitive, reducing the possible scope of implementation errors; in particular, EAX and GCM. I understand that .NET does not offer them (yet); CipherMode is limited to CBC, CFB, CTS, ECB and OFB. In that case, CBC is not the worst choice, with an extra MAC (i.e. HMAC, as you suggest). To use CBC properly, an unpredictable random IV is necessary, and that's what you suggest (good).
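
    A minimal C# sketch of the CBC side, with a fresh unpredictable IV for every message:

        using System.Security.Cryptography;

        static class CbcEncryption
        {
            public static byte[] Encrypt(byte[] key, byte[] plaintext, out byte[] iv)
            {
                using (var aes = Aes.Create())
                {
                    aes.Mode = CipherMode.CBC;
                    aes.Padding = PaddingMode.PKCS7;
                    aes.Key = key;
                    aes.GenerateIV();   // random IV, never reused with the same key
                    iv = aes.IV;
                    using (var enc = aes.CreateEncryptor())
                    {
                        return enc.TransformFinalBlock(plaintext, 0, plaintext.Length);
                    }
                }
            }
        }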

  • Be extra wary when implementing the decryption. You should first verify the MAC, and then, only if the MAC verification succeeded, may you proceed with the decryption. The tricky point is in the padding handling: an attacker could try to extract secrets from the recipient by sending altered files and seeing how the recipient reacts, e.g. whether the recipient found a syntactically valid padding or not. This is the basis of padding oracle attacks. You can process the incoming data in blocks, computing both the MAC and the putative decryption in parallel, but take care, for the last block, to compute and check the MAC first (this is the kind of hurdle that is avoided with GCM or EAX).
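
    A possible shape for the verification step in C#, with a constant-time comparison so that the rejection itself leaks nothing:

        using System.Security.Cryptography;

        static class MacCheck
        {
            // Verify the MAC before any decryption or padding handling.
            public static bool IsValid(byte[] macKey, byte[] macInput, byte[] receivedTag)
            {
                byte[] expected;
                using (var hmac = new HMACSHA256(macKey))
                {
                    expected = hmac.ComputeHash(macInput);
                }
                if (expected.Length != receivedTag.Length) return false;
                int diff = 0;
                for (int i = 0; i < expected.Length; i++)
                {
                    diff |= expected[i] ^ receivedTag[i];  // no early exit
                }
                return diff == 0;
            }
        }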

  • Encrypting the key for AES and the key for HMAC separately is weird; it would allow a creative attacker to swap them around, making the recipient use the HMAC key for decryption, and vice versa. Although the benefits for the attacker would not be obvious, it is very hard to tell whether a vulnerability exists that way; it depends on how "different" AES is from HMAC, a property which is not well defined and, in any case, has not been thoroughly investigated.

    It seems safer, and also more efficient, to use a single "master key" K, encrypted with the recipient's public key, and to derive the AES key and the HMAC key from it with some key derivation function. It can be as simple as making K a 256-bit key (generated randomly) and splitting it into two halves, one for AES and the other for HMAC. Or you could make K longer (or even shorter), hash it with SHA-256, and split the SHA-256 output into two halves. The point is that a single RSA encryption is needed, not two. Also, deriving the encryption and MAC keys from a single source will map better to a possible future use of GCM or EAX.
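
    A minimal C# sketch of the hash-and-split derivation, with the single RSA (OAEP) encryption of K:

        using System;
        using System.Security.Cryptography;

        static class MasterKey
        {
            public static void DeriveAndWrap(RSACryptoServiceProvider recipientPub,
                out byte[] aesKey, out byte[] hmacKey, out byte[] wrappedK)
            {
                byte[] k = new byte[32];                  // 256-bit master key K
                using (var rng = new RNGCryptoServiceProvider())
                {
                    rng.GetBytes(k);
                }
                byte[] h;
                using (var sha = SHA256.Create())
                {
                    h = sha.ComputeHash(k);
                }
                aesKey = new byte[16];
                hmacKey = new byte[16];
                Array.Copy(h, 0, aesKey, 0, 16);          // first half: encryption key
                Array.Copy(h, 16, hmacKey, 0, 16);        // second half: MAC key
                wrappedK = recipientPub.Encrypt(k, true); // one RSA-OAEP operation
            }
        }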

  • Your file format does not allow for streamed processing. In your format, the MAC is computed over the complete encrypted file, but its value comes first, before the encryption result. This means that the sender will have to process the whole file before beginning the sending. Thus, for a 1 GB file, 1 GB of temporary disk space (or RAM) will be necessary. This could be avoided if the MAC value were appended at the end, rather than stored in a prefix (sketched below).

    Note that this buffering issue necessarily exists for the recipient, who must verify the MAC before using the decrypted data in any way. To fully cope with that, you would have to make a more complex format, with moderate-sized chunks, each with its own encryption and MAC, and some glue for secure assembly. The model here would be SSL/TLS.
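
    For the sender's side, a one-pass C# sketch (assuming an already-configured encryptor and HMAC instance) where the tag is written last:

        using System.IO;
        using System.Security.Cryptography;

        static class StreamedSend
        {
            public static void EncryptAndMac(Stream plainIn, Stream cipherOut,
                                             ICryptoTransform encryptor, HMACSHA256 hmac)
            {
                using (var cs = new CryptoStream(plainIn, encryptor, CryptoStreamMode.Read))
                {
                    byte[] buf = new byte[8192];
                    int n;
                    while ((n = cs.Read(buf, 0, buf.Length)) > 0)
                    {
                        hmac.TransformBlock(buf, 0, n, null, 0);  // MAC the ciphertext
                        cipherOut.Write(buf, 0, n);               // ... and send it
                    }
                }
                hmac.TransformFinalBlock(new byte[0], 0, 0);
                cipherOut.Write(hmac.Hash, 0, hmac.Hash.Length);  // tag appended at the end
            }
        }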

  • There is no authentication! The recipient has no way of knowing whether the file he received indeed came from the alleged sender. To add authentication, in your setup, you will need digital signatures. Note that a digital signature may replace a MAC if applied adequately, but that is not an easily guaranteed property; it is safer to simply compute a signature (with the sender's private key) over the complete structure.
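
    A minimal C# sketch, with the sender's RSA key pair:

        using System.Security.Cryptography;

        static class Signing
        {
            // Sender: sign the complete encrypted structure with the private key.
            public static byte[] Sign(RSACryptoServiceProvider senderPriv, byte[] structure)
            {
                return senderPriv.SignData(structure, "SHA256");
            }

            // Recipient: verify with the sender's public key.
            public static bool Verify(RSACryptoServiceProvider senderPub,
                                      byte[] structure, byte[] signature)
            {
                return senderPub.VerifyData(structure, "SHA256", signature);
            }
        }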

  • There is no protection against replay attacks. An attacker, observing a file in transit, could send the same file again at a later date. It is then up to the recipient to take measures to detect such duplicates. This can be as simple as a date somewhere within the data file, under the protection of the MAC (and possibly the encryption, although this is not necessary), but it must be given some thought.
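
    For instance, an 8-byte timestamp placed under the MAC (a sketch; the freshness and duplicate-detection policy itself is up to the recipient):

        using System;

        static class ReplayHeader
        {
            public static byte[] Timestamp()
            {
                // Seconds since 1970-01-01 UTC, covered by the MAC; the recipient
                // rejects files that are too old or already seen.
                long unixSeconds = (long)(DateTime.UtcNow
                    - new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc)).TotalSeconds;
                return BitConverter.GetBytes(unixSeconds);
            }
        }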

  • Since you encrypt the private key yourself, I understand that you do not let the operating system take care of it. This could be limiting. Indeed, an RSACryptoServiceProvider instance could, potentially, use a key which physically resides in a smart card; and even for software keys, there can be some security benefit in letting the OS take care of the storage (the OS has privileged access to the hardware and can, for instance, avoid leaking secret keys to the disk through virtual memory). By using an explicit key file, you make your system incompatible with such possibilities, which is a shame.
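
    For illustration, a private key kept in an OS-managed key container ("FileShareApp" is a hypothetical container name):

        using System.Security.Cryptography;

        static class OsKeyStore
        {
            public static RSACryptoServiceProvider OpenPersistedKey()
            {
                // The OS stores and protects the key; the same code also works
                // when the container is backed by a smart card provider.
                var cp = new CspParameters
                {
                    KeyContainerName = "FileShareApp",
                    Flags = CspProviderFlags.UseMachineKeyStore
                };
                return new RSACryptoServiceProvider(2048, cp);  // created or reopened
            }
        }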

  • There is no room for metadata, in particular a content type. The same sequence of bytes can have several interpretations, and there can be amusing consequences if the recipient can be induced into opening an HTML file as PDF, or vice versa. Ideally, that which you actually encrypt should be a structure with a header containing a designation of the type of data (e.g. a media type), followed by the data itself.
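
    A possible sketch of such a header in C# (one-byte length prefix, assuming media type names shorter than 256 bytes):

        using System.IO;
        using System.Text;

        static class TypedPayload
        {
            // Prefix the plaintext with its media type, so the recipient knows
            // how to interpret the decrypted bytes.
            public static byte[] Wrap(string mediaType, byte[] data)
            {
                byte[] mt = Encoding.UTF8.GetBytes(mediaType);  // e.g. "application/pdf"
                using (var ms = new MemoryStream())
                {
                    ms.WriteByte((byte)mt.Length);
                    ms.Write(mt, 0, mt.Length);
                    ms.Write(data, 0, data.Length);
                    return ms.ToArray();
                }
            }
        }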

All of this is said in no particular order and without any claim of exhaustiveness. The overall conclusion is that protocol design is not simple; it helps to have some solid implementation experience on which to base the design.