Why is some metadata not encrypted in Proton Mail?

Proton Mail uses an encryption format called OpenPGP, which is only designed to encrypt the message. Unless the subject is put in the message and the subject field is left blank, the subject will be kept unencrypted. The e-mail sender and receiver fields, on the other hand, need to be unencrypted for proper routing to occur. This is all a limitation in the design of e-mail, which is unencrypted by default.


As already discussed, Proton Mail uses OpenPGP (RFC 4880).

OpenPGP traces its roots to the original Pretty Good Privacy, written by Philip Zimmermann in 1991. At that time, Internet (SMTP) e-mail was accessible primarily via research institutions. General Internet access only started to become a thing around 1994-1996, so in 1991 it was far from common among the general public.

Given that, it's not surprising that PGP's message structure and file formats are general-purpose in design. While encrypted e-mail is a major use case for PGP, GnuPG and other OpenPGP implementations, that's not the only way they can be used. (For example, I use GnuPG to encrypt large blobs of compressed data before uploading them to a cloud storage provider for backup purposes.)

Therefore, the OpenPGP standard cannot mandate a particular plaintext message structure. Doing so would go against its general-purpose nature.

However, Internet (SMTP) e-mail does have a structured format. There's envelope data (which is technically outside of the message), header data and body data (which itself can be structured, for example as described in the MIME standard).
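To make the header/body split concrete, here is a minimal sketch using Python's standard `email` package. The addresses and text are made up for illustration, and `split_message` is just a helper name I picked. Note that the envelope does not appear here at all, because it is not part of the message text.

```python
from email import message_from_string
from email.policy import default

# A made-up message: header lines, one blank line, then the body.
RAW = (
    "From: alice@example.org\n"
    "To: bob@example.net\n"
    "Subject: Lunch?\n"
    "\n"
    "Noon at the usual place.\n"
)

def split_message(raw):
    """Parse raw message text into (header dict, body text)."""
    msg = message_from_string(raw, policy=default)
    headers = {name: str(value) for name, value in msg.items()}
    body = msg.get_content()
    return headers, body

headers, body = split_message(RAW)
print(headers)
print(body)
```

Everything in `headers` is what stays readable in an OpenPGP-protected e-mail; only `body` would be replaced by ciphertext.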

The envelope data is used for message routing, and primarily consists of the sender and recipient e-mail addresses. These must be accessible to any mail server along the message's path, and although the sender e-mail address can be masked (VERP is one variation of that), it must be valid. One can't hide these from the mail servers yet claim to do SMTP e-mail. However, a large fraction of Internet e-mail these days is protected by SMTP STARTTLS encryption, which is opportunistic and typically unverified, but does protect against passive eavesdropping. Depending on your threat model, that may be good enough; best current practice in Internet protocol design is to consider pervasive monitoring to be an attack which should be mitigated where possible in new designs.

The header data contains trace headers (Received:), subject, sender, recipients (except the Bcc field), message threading information, and a variety of other diagnostic data, technical information and other message metadata. Some of this must be unencrypted; for example, Received: headers are added by mail servers, which may not have knowledge of who the ultimate recipient is. Since it's all lumped together, the common approach is to just keep it all unencrypted (within the opportunistically encrypted channel) in transit. Doing it otherwise would require revisiting the basic standards governing the Internet e-mail format, which go back at least to RFC 822 from back in 1982.

The distinction between envelope and header data is also why you can receive an e-mail that doesn't show your e-mail address in any of the recipient fields.
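As a rough illustration of that distinction, the sketch below derives the envelope recipient list from a message and then strips the Bcc header before the message goes on the wire. `prepare_for_sending` is a hypothetical helper and the addresses are invented; as far as I know, Python's `smtplib.SMTP.send_message` does something similar internally, but this is not a copy of its implementation.

```python
import copy
from email.message import EmailMessage
from email.utils import getaddresses

def prepare_for_sending(msg):
    """Return (envelope_recipients, wire_message).

    The envelope list is what would be given to the mail server as
    RCPT TO addresses; the wire message is what recipients actually see.
    """
    fields = []
    for name in ("To", "Cc", "Bcc"):
        fields.extend(str(value) for value in msg.get_all(name, []))
    envelope_recipients = [addr for _name, addr in getaddresses(fields) if addr]

    wire = copy.deepcopy(msg)
    del wire["Bcc"]  # Bcc recipients stay in the envelope only
    return envelope_recipients, wire

msg = EmailMessage()
msg["From"] = "alice@example.org"
msg["To"] = "bob@example.net"
msg["Bcc"] = "carol@example.com"
msg["Subject"] = "Lunch?"
msg.set_content("Noon at the usual place.")

envelope_recipients, wire = prepare_for_sending(msg)
print(envelope_recipients)
print(wire["Bcc"])
```

Carol gets the message because she is in the envelope, yet her address appears nowhere in the copy she receives.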

The body data is what you normally might think of as the e-mail, and typically contains whatever you write in the large text field in the mail client. MIME can support carrying a payload that is encrypted, signed or both; usually either S/MIME or PGP/MIME.
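For a feel of what a PGP/MIME payload looks like, here is a sketch that builds the two-part `multipart/encrypted` container described in RFC 3156. No encryption happens here: the ciphertext is a placeholder string, `wrap_pgp_mime` is a name I made up, and a real client would get the armored data from an OpenPGP implementation such as GnuPG.

```python
from email import encoders
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart

def wrap_pgp_mime(armored_ciphertext):
    """Wrap already-encrypted, ASCII-armored data in a PGP/MIME container."""
    outer = MIMEMultipart("encrypted", protocol="application/pgp-encrypted")
    # Part 1: fixed control information.
    control = MIMEApplication(
        "Version: 1\n", "pgp-encrypted", encoders.encode_7or8bit
    )
    # Part 2: the OpenPGP data itself.
    payload = MIMEApplication(
        armored_ciphertext, "octet-stream", encoders.encode_7or8bit
    )
    outer.attach(control)
    outer.attach(payload)
    return outer

# Placeholder only; this is not real OpenPGP output.
FAKE_CIPHERTEXT = (
    "-----BEGIN PGP MESSAGE-----\n\n"
    "(placeholder)\n"
    "-----END PGP MESSAGE-----\n"
)

pgp_msg = wrap_pgp_mime(FAKE_CIPHERTEXT)
pgp_msg["From"] = "alice@example.org"
pgp_msg["To"] = "bob@example.net"
pgp_msg["Subject"] = "Lunch?"  # still a plain header, outside the ciphertext
print(pgp_msg.as_string())
```

Printing the result shows the point of the whole question: the From, To and Subject lines sit in the clear above the encrypted part.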

It would be technically possible to retrofit e-mail such that certain metadata fields are moved to within the MIME data instead of the header section of the e-mail. Doing so might even be relatively easy, since the MIME payload can itself be an entire e-mail message (headers and body, but not envelope data). However, this would require specialized mail client support (it would no longer be Internet e-mail, but rather something that uses Internet e-mail as its transport layer and for inspiration for its structure), and it would also require the entire message to be decrypted before the protected header fields can be viewed or otherwise worked with, including for example searched for.
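A sketch of what such a retrofit could look like structurally: the real message, headers included, becomes a `message/rfc822` payload, and the outer message carries only what routing needs plus a placeholder subject. This is my own illustration and not any particular standard or client's behavior; `wrap_with_protected_headers` is a hypothetical name, and the encryption step that would normally be applied to the inner message is left out entirely.

```python
from email.message import EmailMessage

def wrap_with_protected_headers(inner, placeholder_subject="..."):
    """Embed a complete message inside an outer message.

    In a real scheme the inner message would be encrypted before being
    attached; here it is embedded as-is to show the structure only.
    """
    outer = EmailMessage()
    outer["From"] = str(inner["From"])
    outer["To"] = str(inner["To"])
    outer["Subject"] = placeholder_subject
    outer.set_content(inner)  # stored as message/rfc822
    return outer

inner = EmailMessage()
inner["From"] = "alice@example.org"
inner["To"] = "bob@example.net"
inner["Subject"] = "Quarterly numbers"
inner.set_content("Figures attached.")

outer = wrap_with_protected_headers(inner)
recovered = outer.get_content()  # the embedded message object
print(outer["Subject"])
print(recovered["Subject"])
```

The real subject is only reachable by opening the payload, which is exactly why searching or listing by subject would require decrypting every message first.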

My understanding is that this has just not been seen as a priority, given that it is relatively easy for the user (assuming, that is, that they are aware of the need) to restrict their use of the fields that do end up in the unencrypted header section. Recent, fairly widespread deployment of SMTP STARTTLS, as well as the generally low amount of end-to-end (as opposed to transport) encrypted e-mail traffic, has likely done little but reinforce this.

And that's almost certainly why Proton Mail doesn't encrypt such metadata when sending e-mail across the Internet.