What is the purpose of associated authenticated data in AEAD?

As a very general rule, the purpose of associated data (hereafter "AD") is to bind a ciphertext to the context where it's supposed to appear, so that attempts to "cut-and-paste" a valid ciphertext into a different context can be detected and rejected.

For example, suppose I'm encrypting the values that I insert into a key/value database, and I use the record key as AD. What does that do? Well, first of all, mechanically it means that whenever I decrypt a value, I must present the same record key as AD, or the decryption will fail with an authentication error. So my application has to perform that step in order to decrypt any of the data correctly.

Second, and more importantly, by using the record key as AD I've thwarted one sort of attack that an insider (say, the database administrator) could conceivably carry out against my application: take two records in the database and swap their values. Without the AD, the application would happily decrypt these and blindly assume that it had read the correct value for each key. With the record key as AD, the application would instead immediately notice that the values are not authentic, even though the attacker never actually modified them, because they're occurring in the wrong context. That's just one example out of many, but applications of associated data tend to have that flavor.
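
Here's a minimal sketch of the record-swap scenario. To stay self-contained it uses a toy encrypt-then-MAC construction built from Python's standard library (a SHAKE-256 keystream plus HMAC-SHA256) as a stand-in for a real AEAD; that construction, and the names `seal`, `open_`, `put` and `get`, are my own assumptions for illustration. In real code you'd use a vetted AEAD such as AES-GCM or ChaCha20-Poly1305 from an established library.

```python
import hashlib
import hmac
import os

NONCE_LEN = 16
TAG_LEN = 32


def _subkeys(key: bytes) -> tuple[bytes, bytes]:
    # Derive separate encryption and MAC keys from one master key.
    return (hmac.new(key, b"enc", hashlib.sha256).digest(),
            hmac.new(key, b"mac", hashlib.sha256).digest())


def _tag(mac_key: bytes, nonce: bytes, ad: bytes, ct: bytes) -> bytes:
    # Length-prefix the AD so (ad, nonce, ct) is encoded unambiguously.
    msg = len(ad).to_bytes(8, "big") + ad + nonce + ct
    return hmac.new(mac_key, msg, hashlib.sha256).digest()


def seal(key: bytes, nonce: bytes, plaintext: bytes, ad: bytes) -> bytes:
    """Toy AEAD encryption (illustration only): returns ciphertext || tag."""
    enc_key, mac_key = _subkeys(key)
    stream = hashlib.shake_256(enc_key + nonce).digest(len(plaintext))
    ct = bytes(p ^ s for p, s in zip(plaintext, stream))
    return ct + _tag(mac_key, nonce, ad, ct)


def open_(key: bytes, nonce: bytes, sealed: bytes, ad: bytes) -> bytes:
    """Toy AEAD decryption: raises ValueError if ciphertext or AD don't match."""
    enc_key, mac_key = _subkeys(key)
    ct, tag = sealed[:-TAG_LEN], sealed[-TAG_LEN:]
    if not hmac.compare_digest(tag, _tag(mac_key, nonce, ad, ct)):
        raise ValueError("authentication failed")
    stream = hashlib.shake_256(enc_key + nonce).digest(len(ct))
    return bytes(c ^ s for c, s in zip(ct, stream))


def put(db: dict, key: bytes, record_key: bytes, value: bytes) -> None:
    # The record key is the AD: it is authenticated but not encrypted.
    nonce = os.urandom(NONCE_LEN)
    db[record_key] = nonce + seal(key, nonce, value, ad=record_key)


def get(db: dict, key: bytes, record_key: bytes) -> bytes:
    blob = db[record_key]
    return open_(key, blob[:NONCE_LEN], blob[NONCE_LEN:], ad=record_key)


master_key = os.urandom(32)
database = {}
put(database, master_key, b"alice", b"balance=100")
put(database, master_key, b"bob", b"balance=5")

# The insider swaps two stored values without modifying either one.
database[b"alice"], database[b"bob"] = database[b"bob"], database[b"alice"]

try:
    get(database, master_key, b"alice")
    swap_detected = False
except ValueError:
    swap_detected = True  # the value was sealed under a different record key
```

If `put` and `get` passed an empty AD instead, both swapped records would decrypt without complaint, which is exactly the attack described above.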

One important detail that people often miss from examples is that the associated data doesn't necessarily have to be stored or transmitted with the ciphertext. Any context-dependent non-secret values that the honest parties are both able to correctly infer can be useful as associated data. For example, if the parties are executing a complex protocol that's been formulated in terms of a state machine, such that every correct party can always tell their own state and that which an honest counterparty should be in, then those states—even though they're implicit to the protocol—can be used as AD. This is the sort of thing that makes AEAD suites so attractive to designers of protocols like TLS—it's a tool that can be used to cut down on the complexity of a secure protocol.
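
As a rough sketch of implicit AD: below, each side of a one-way channel derives the AD locally from its own protocol state and message counter, and only the nonce and ciphertext go on the wire. The `Party` class, the state names, and the toy standard-library encrypt-then-MAC stand-in for a real AEAD are all assumptions of mine for illustration, not any particular protocol's design.

```python
import hashlib
import hmac
import os


def _subkeys(key: bytes) -> tuple[bytes, bytes]:
    return (hmac.new(key, b"enc", hashlib.sha256).digest(),
            hmac.new(key, b"mac", hashlib.sha256).digest())


def seal(key: bytes, nonce: bytes, plaintext: bytes, ad: bytes) -> bytes:
    """Toy AEAD stand-in (illustration only): returns ciphertext || tag."""
    enc_key, mac_key = _subkeys(key)
    stream = hashlib.shake_256(enc_key + nonce).digest(len(plaintext))
    ct = bytes(p ^ s for p, s in zip(plaintext, stream))
    msg = len(ad).to_bytes(8, "big") + ad + nonce + ct
    return ct + hmac.new(mac_key, msg, hashlib.sha256).digest()


def open_(key: bytes, nonce: bytes, sealed: bytes, ad: bytes) -> bytes:
    """Raises ValueError unless the ciphertext and the AD both match."""
    enc_key, mac_key = _subkeys(key)
    ct, tag = sealed[:-32], sealed[-32:]
    msg = len(ad).to_bytes(8, "big") + ad + nonce + ct
    expected = hmac.new(mac_key, msg, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    stream = hashlib.shake_256(enc_key + nonce).digest(len(ct))
    return bytes(c ^ s for c, s in zip(ct, stream))


class Party:
    """One end of a one-way channel; state and counter are never transmitted."""

    def __init__(self, key: bytes) -> None:
        self.key = key
        self.state = b"HANDSHAKE"  # hypothetical protocol state name
        self.seq = 0

    def _context(self) -> bytes:
        # Implicit AD: what this party believes the current context to be.
        return self.state + b"|" + self.seq.to_bytes(8, "big")

    def send(self, plaintext: bytes) -> bytes:
        nonce = os.urandom(16)
        wire = nonce + seal(self.key, nonce, plaintext, self._context())
        self.seq += 1
        return wire

    def receive(self, wire: bytes) -> bytes:
        plaintext = open_(self.key, wire[:16], wire[16:], self._context())
        self.seq += 1  # only advance after a successful decryption
        return plaintext


shared_key = os.urandom(32)
sender, receiver = Party(shared_key), Party(shared_key)

first = sender.send(b"hello")
received = receiver.receive(first)  # both sides agree: HANDSHAKE, message 0

try:
    receiver.receive(first)  # replay: receiver now expects message 1
    replay_detected = False
except ValueError:
    replay_detected = True
```

Replayed, reordered, or out-of-state messages fail authentication here without the counter or state ever appearing on the wire, because the receiver supplies its own view of the context as AD.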


From RFC 5116 - An Interface and Algorithms for Authenticated Encryption:

Authenticated encryption [BN00] is a form of encryption that, in addition to providing confidentiality for the plaintext that is encrypted, provides a way to check its integrity and authenticity.

Authenticated Encryption with Associated Data, or AEAD [R02], adds the ability to check the integrity and authenticity of some Associated Data (AD), also called "additional authenticated data", that is not encrypted.

And from 2.1 - Authenticated Encryption:

The associated data A is used to protect information that needs to be authenticated, but does not need to be kept confidential. When using an AEAD to secure a network protocol, for example, this input could include

  • addresses,
  • ports,
  • sequence numbers,
  • protocol version numbers,
  • and other fields that indicate how the plaintext or ciphertext should be handled, forwarded, or processed.

In many situations, it is desirable to authenticate these fields, though they must be left in the clear to allow the network or system to function properly. When this data is included in the input A, authentication is provided without copying the data into the plaintext.
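
To make that concrete, here is one way the cleartext header fields might be packed into the associated-data input. The field layout and the function name `encode_ad` are hypothetical, not taken from RFC 5116 or any real protocol; the point is only that the encoding should be fixed and unambiguous, so that both ends compute identical bytes for A.

```python
import ipaddress
import struct


def encode_ad(version: int, src: str, dst: str,
              src_port: int, dst_port: int, seq: int) -> bytes:
    """Pack cleartext header fields into bytes for use as associated data.

    Hypothetical layout: version (1 byte), then each address prefixed by
    its length (so IPv4 and IPv6 can't be confused), then both ports
    (2 bytes each) and the sequence number (8 bytes), all big-endian.
    """
    src_bytes = ipaddress.ip_address(src).packed
    dst_bytes = ipaddress.ip_address(dst).packed
    return b"".join([
        struct.pack("!B", version),
        struct.pack("!B", len(src_bytes)), src_bytes,
        struct.pack("!B", len(dst_bytes)), dst_bytes,
        struct.pack("!HHQ", src_port, dst_port, seq),
    ])


# These bytes would be passed as the AD argument of whichever AEAD is in use;
# the header itself still travels in the clear.
ad = encode_ad(1, "192.0.2.1", "198.51.100.7", 51000, 443, 7)
```

If anyone on the path rewrites a port, an address, or the sequence number, the receiver computes a different A and the AEAD decryption fails.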

In 2002, Phillip Rogaway noted:

Authenticated-Encryption with Associated-Data

Oʀɪɢɪɴ ᴏꜰ ᴛʜᴇ ᴘʀᴏʙʟᴇᴍ. The need to handle associated-data when using an integrated AE mode was first pointed out to the author by Burt Kaliski [16]. Several more individuals soon communicated the same sentiment. Those attuned to this problem were involved in standardization efforts that needed to bind to a ciphertext some cleartext data, such as an IP address. People wanted a cheap and secure way to do this when using an AE-mode such as OCB.

[16] B. Kaliski. Personal communication, May 2001.

In short, it's data that you want to travel with the encrypted data and be sure hasn't been altered, but that needs to (or can) remain visible.