What changed between TLS and DTLS

DTLS (currently version 1.2) is defined in RFC 6347 as a list of differences from TLS 1.2 (RFC 5246). Most of the TLS elements are reused with only minor changes.

The context is that the client and the server want to send each other a lot of data as "datagrams"; they really both want to send a long sequence of bytes, with a defined order, but do not enjoy the luxury of TCP. TCP provides a reliable bidirectional tunnel for bytes, where all bytes eventually reach the receiver in the same order as the sender emitted them; TCP achieves that through a complex assembly of acknowledgment messages, transmission timeouts, and retransmissions. This allows TLS to simply assume that the data will arrive unscathed under normal conditions; in other words, TLS deems it sufficient to detect alterations, since such alterations will occur only when under attack.

On the other hand, DTLS works over datagrams which can be lost, duplicated, or received in the wrong order. To cope with that, DTLS uses some extra mechanisms and some extra leniency.

The main differences are:

  1. Explicit records. With TLS, you have one long stream of bytes, which the TLS implementation decides to split into records as it sees fit; this split is transparent for applications. Not so with DTLS: each DTLS record must fit within a single datagram (several records may share a datagram, but a record is never spread over two). Data is received and sent on a record basis, and a record is either received completely or not at all. Also, applications must handle path MTU discovery themselves.

  2. Explicit sequence numbers. TLS records include a MAC which guarantees the record integrity, and the MAC input includes a record sequence number, so the MAC also verifies that no record has been lost, duplicated or reordered. In TLS, this sequence number (a 64-bit integer) is implicit (it is always one more than that of the previous record). In DTLS, the sequence number is explicit in each record (so that's an extra 8-byte overhead per record -- not a big deal). The sequence number is furthermore split into a 16-bit "epoch" and a 48-bit sub-sequence number, to better handle cipher suite renegotiations.

  3. Alterations are tolerated. Datagrams may be lost, duplicated, reordered, or even modified. This is a "fact of life" which TLS would abhor, but DTLS accepts. Thus, both client and server are supposed to tolerate a bit of abuse; they use a "window" mechanism to make sense of records which arrive out of order (if they receive records in order 1 2 5 3 4 6, the window remembers that record "5" has been seen while 3 and 4 have not, so that 3 and 4 are still accepted when they show up, and a second copy of 5 is recognized as a duplicate; at the handshake level, a message which arrives early may be kept in a buffer until the missing ones are received). Implementations MAY warn about duplicates, as well as about records for which the MAC does not match; but, in general, anomalous records (duplicated, too old to fit in the window, modified...) are simply dropped, and missing records are skipped.

    This means that DTLS implementations do not (and, really, cannot) distinguish between normal "noise" (random errors which can occur) and an active attack. They can use some threshold (if there are too many errors, warn the user).

  4. Stateless encryption. Since records may be lost, encryption must not use a state which is modified with each record. In practice, this means no RC4.

  5. No verified termination. DTLS has no reliable notion of a verified end-of-connection like what TLS achieves with the close_notify alert message (the alert still exists in DTLS, but alerts are not retransmitted, so it may simply be lost). This means that when a receiver ceases to receive datagrams from the peer, it cannot know whether the sender has voluntarily ceased to send, or whether the rest of the data was lost. Note that such a thing was considered one of the capital sins of SSL 2.0, but for DTLS, this appears to be OK. It is up to whatever data format is transmitted within DTLS to provide for explicit termination, if such a thing is needed.

  6. Fragmentation and retransmission. Handshake messages may exceed the natural datagram length, and thus may be split over several records. The syntax of handshake messages is extended to manage these fragments. Fragment handling requires buffers, so DTLS implementations are likely to require a bit more RAM than TLS implementations (that is, implementations which are optimized for embedded systems where RAM is scarce; TLS implementations for desktops and servers just allocate big enough buffers, and DTLS will be no worse for them). Retransmission is done through a state machine, which is a bit more complex to implement than the straightforward TLS handshake (but the RFC describes it well).

  7. Protection against DoS/spoof. Since a datagram can be sent "as is", it is subject to IP spoofing: an evildoer can send a datagram, in particular a ClientHello message, with a fake source address. If the DTLS server allocates resources when it receives a ClientHello, then there is ample room for DoS. In the case of TLS, a ClientHello occurs only after the three-way handshake of TCP is completed, which implies that the client uses a source IP address at which it can actually receive packets. Being able to DoS a server without showing your real IP is a powerful weapon; hence DTLS includes an optional defense.

    The defensive mechanism in DTLS is a "cookie": the client sends its ClientHello, to which the server responds with a HelloVerifyRequest message which contains an opaque cookie, which the client must send back in a second ClientHello. The server should arrange for a type of cookie which can be verified without storing state, e.g. a cookie with a time stamp and a MAC (strangely enough, the RFC alludes to such a mechanism but does not fully specify it -- chances are that some implementations will get it wrong).

    This cookie mechanism is really an emulation of the TCP three-way handshake. It implies one extra roundtrip, i.e. brings DTLS back to TLS-over-TCP performance for the initial handshake.
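
To make the "time stamp and a MAC" idea from point 7 concrete, here is a minimal sketch of a stateless cookie. It is an illustration under assumptions, not what any given DTLS library does: the names `SERVER_SECRET`, `COOKIE_LIFETIME`, `make_cookie` and `check_cookie` are made up for this example, the 60-second lifetime is arbitrary, and a real server would probably also bind the cookie to the ClientHello parameters and rotate the secret.

```python
import hashlib
import hmac
import os
import struct
import time
from typing import Optional

# Hypothetical server-wide secret; a real server would rotate it periodically.
SERVER_SECRET = os.urandom(32)
# Arbitrary validity period (seconds) chosen for this sketch.
COOKIE_LIFETIME = 60


def _mac(ts: bytes, client_ip: str, client_port: int) -> bytes:
    # The MAC covers the time stamp and the claimed source address.
    msg = ts + client_ip.encode() + struct.pack("!H", client_port)
    return hmac.new(SERVER_SECRET, msg, hashlib.sha256).digest()


def make_cookie(client_ip: str, client_port: int,
                now: Optional[float] = None) -> bytes:
    """Build cookie = time stamp (8 bytes) || HMAC-SHA256 tag (32 bytes)."""
    current = time.time() if now is None else now
    ts = struct.pack("!Q", int(current))
    return ts + _mac(ts, client_ip, client_port)


def check_cookie(cookie: bytes, client_ip: str, client_port: int,
                 now: Optional[float] = None) -> bool:
    """Verify a cookie using only the secret: no per-client state is kept."""
    if len(cookie) != 8 + 32:
        return False
    ts, tag = cookie[:8], cookie[8:]
    if not hmac.compare_digest(tag, _mac(ts, client_ip, client_port)):
        return False  # forged, or issued to another address
    current = time.time() if now is None else now
    age = current - struct.unpack("!Q", ts)[0]
    return 0 <= age <= COOKIE_LIFETIME
```

The server would put `make_cookie(...)` in its HelloVerifyRequest and call `check_cookie(...)` on the second ClientHello; only when the check succeeds does it start allocating handshake state. A spoofer who never sees the HelloVerifyRequest cannot produce a valid tag for the faked address.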

Apart from that, DTLS is similar to TLS. Non-RC4 cipher suites of TLS apply to DTLS. DTLS 1.2 is protected against BEAST-like attacks since, like TLS 1.2, it includes a per-record random IV when using CBC encryption.
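
Both the similarity to TLS and the explicit sequence number from point 2 are visible in the record header: it is the 5-byte TLS header (type, version, length) with a 16-bit epoch and a 48-bit sequence number slotted in before the length, for 13 bytes in total. The sketch below packs and parses that header; `RecordHeader`, `pack_header` and `unpack_header` are names invented for this example, not part of any library.

```python
import struct
from typing import NamedTuple, Tuple

# DTLS 1.2 is encoded on the wire as the version bytes {254, 253}.
DTLS_1_2 = (254, 253)
HEADER_LEN = 13


class RecordHeader(NamedTuple):
    content_type: int          # e.g. 22 = handshake, 23 = application_data
    version: Tuple[int, int]
    epoch: int                 # 16 bits, incremented on each cipher change
    sequence_number: int       # 48 bits, explicit in every record
    length: int                # length of the record payload that follows


def pack_header(h: RecordHeader) -> bytes:
    """Serialize the 13-byte DTLS record header (big-endian fields)."""
    head = struct.pack("!BBBH", h.content_type,
                       h.version[0], h.version[1], h.epoch)
    seq = h.sequence_number.to_bytes(6, "big")  # raises if it exceeds 48 bits
    return head + seq + struct.pack("!H", h.length)


def unpack_header(data: bytes) -> RecordHeader:
    """Parse the first 13 bytes of a record into a RecordHeader."""
    if len(data) < HEADER_LEN:
        raise ValueError("truncated DTLS record header")
    ctype, major, minor, epoch = struct.unpack("!BBBH", data[:5])
    seq = int.from_bytes(data[5:11], "big")
    (length,) = struct.unpack("!H", data[11:13])
    return RecordHeader(ctype, (major, minor), epoch, seq, length)
```

For instance, `pack_header(RecordHeader(23, DTLS_1_2, 1, 5, 100))` yields 13 bytes, of which the 8 bytes of epoch and sequence number are the per-record overhead mentioned above.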

To sum up, DTLS extra features are conceptually imports from TCP (receive window, reassembly with sequence numbers, retransmissions, connection cookie...) thrown over a normal TLS (the one important omission is the lack of acknowledgment messages). The protocol is more lenient with regard to alterations, and does not include a verified "end-of-transmission" (but DTLS is supposed to be employed in contexts where this would not really make sense anyway).
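
The receive window is the piece most easily shown in code. RFC 6347 describes it as a sliding window over record sequence numbers, used to recognize duplicates and records which are too old. The following is a minimal sketch of that check, assuming a 64-entry window kept as an integer bitmap; the class name `ReplayWindow` and its method are hypothetical, and a real implementation would update the window only after the MAC has been verified, and keep one window per epoch.

```python
class ReplayWindow:
    """Sliding bitmap of recently seen record sequence numbers."""

    def __init__(self, size: int = 64):
        self.size = size
        self.highest = -1  # highest sequence number accepted so far
        self.bitmap = 0    # bit i set means (highest - i) was already seen

    def check_and_update(self, seq: int) -> bool:
        """Return True if the record is new; False if it should be dropped."""
        if seq > self.highest:
            # Record is ahead of the window: slide the window forward.
            shift = seq - self.highest
            mask = (1 << self.size) - 1
            self.bitmap = ((self.bitmap << shift) | 1) & mask
            self.highest = seq
            return True
        offset = self.highest - seq
        if offset >= self.size:
            return False  # too old: fell off the left edge of the window
        if self.bitmap & (1 << offset):
            return False  # duplicate
        self.bitmap |= 1 << offset  # late but new: mark as seen
        return True


window = ReplayWindow()
arrivals = [1, 2, 5, 3, 4, 6, 5]
verdicts = [window.check_and_update(s) for s in arrivals]
# The first six records are accepted despite the reordering;
# the second copy of record 5 is rejected as a duplicate.
```

Note that nothing here asks for a retransmission of a missing record: a record which never arrives just leaves a hole in the bitmap until the window slides past it.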

The domain of application of DTLS is really distinct from that of TLS; it is meant to be applied to data streaming applications where losses are less important than latency, e.g. VoIP or live video feeds. For a given application, either TLS makes much more sense than DTLS, or the opposite; best practice is to choose the appropriate protocol.