Does using HTTPS, TLS, S/MIME, SSL, etc. protect you from Deep Packet Inspection and 'Big Data' analytics?

Deep packet inspection (DPI) is a term that commonly refers to standard network middle-men, such as the routers at an ISP, examining content at a protocol layer higher than the layer they need to in order to process the packet (thus inspecting "deeper" into the packet than necessary). For example, an IP router may need to only look at the IP layer (layer 3 in the OSI model) of the packet, but if it also inspects the application layer data (layer 5/7) then it is performing deep packet inspection.

HTTPS (standing in here for the whole suite you mentioned) is an application layer protocol. It will defend against deep packet inspection that reads the content of the application layer, but not against DPI at the lower layers that wrap it. For example, HTTPS will not prevent a DPI device from looking at the TCP header and using the destination port to guess which protocol the traffic belongs to. It will, however, prevent that device from learning the actual application data payload.
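To make that concrete, here is a minimal Python sketch of what an on-path box can still read from a packet carrying encrypted traffic. It assumes a plain IPv4 packet with a TCP segment and no link-layer framing; the function names, the sample addresses, and the small port table are made up for illustration and are not taken from any real DPI product.

```python
import socket
import struct

# Hypothetical lookup table: a DPI box can only *guess* the protocol from the port.
PORT_GUESSES = {25: "SMTP", 80: "HTTP", 443: "HTTPS"}


def visible_metadata(packet: bytes) -> dict:
    """Return the fields an observer can read without decrypting anything."""
    # IPv4: the low nibble of byte 0 is the header length in 32-bit words.
    ip_header_len = (packet[0] & 0x0F) * 4
    total_len = struct.unpack("!H", packet[2:4])[0]
    if packet[9] != 6:  # IP protocol number 6 is TCP
        raise ValueError("not a TCP packet")
    tcp = packet[ip_header_len:total_len]
    src_port, dst_port = struct.unpack("!HH", tcp[:4])
    # TCP: the high nibble of byte 12 is the header length in 32-bit words.
    tcp_header_len = (tcp[12] >> 4) * 4
    payload = tcp[tcp_header_len:]
    return {
        "src_ip": socket.inet_ntoa(packet[12:16]),
        "dst_ip": socket.inet_ntoa(packet[16:20]),
        "src_port": src_port,
        "dst_port": dst_port,
        "guess": PORT_GUESSES.get(dst_port, "unknown"),
        # With TLS the payload is ciphertext, so only its size is useful.
        "payload_len": len(payload),
    }


def build_sample_packet(payload: bytes) -> bytes:
    """Build a fake IPv4/TCP packet to port 443 (checksums left as zero)."""
    tcp_header = struct.pack(
        "!HHIIBBHHH", 49152, 443, 0, 0, 5 << 4, 0x18, 65535, 0, 0
    )
    ip_header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + len(tcp_header) + len(payload), 0, 0, 64, 6, 0,
        socket.inet_aton("192.0.2.10"),    # documentation-range source address
        socket.inet_aton("198.51.100.7"),  # documentation-range destination
    )
    return ip_header + tcp_header + payload


if __name__ == "__main__":
    # Stand-in for an encrypted TLS record; the contents are opaque to the observer.
    print(visible_metadata(build_sample_packet(b"\x17\x03\x03" + bytes(40))))
```

The observer ends up with addresses, ports, a protocol guess and a byte count, but nothing about the request itself.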

Big data analysis refers to analysis performed on very large databases of collected data, but how that data is collected has little to do with HTTPS. HTTPS is designed only to protect data in network transit. When the destination server reads the data, it is decrypted and the HTTPS protection no longer exists. What happens at that point is up to whatever the client and server are running on top of HTTPS; most likely, the server can do pretty much whatever it wants. (In other words, Big Data generally refers to Data At Rest, whereas HTTPS protects Data In Motion. Since they address different stages of the data, it makes sense that protection in one stage won't apply once the data moves to a different stage.)


Will transport layer encryption (SSL/TLS/HTTPS) protect you from Big Data Analytics?

It depends. Does the server at the other end use your data for big data analytics? If it doesn't, you are safe from big data analytics. If you use Google, Facebook, Amazon, etc., they are analyzing your data (and potentially sharing it with other entities, such as the US government/NSA); they are doing big data analytics on the information you gave them.

TLS (SSL is just an old version of it, and HTTPS is just HTTP + SSL/TLS) limits network eavesdroppers to knowing only when, with whom (IP address), and how much data you are sending and receiving. (This assumes they have not compromised a CA or your browser's trusted certificates for a sophisticated attack.) This limited information is sometimes quite useful in side-channel attacks that deduce what private information you are sending (especially if there are AJAX auto-complete features like Google Suggest).
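As a rough illustration of how "how much data" alone can leak something, the Python sketch below matches an observed encrypted response size against sizes an eavesdropper has measured in advance. The function name, the URLs, the byte counts and the tolerance are all invented for this example; real traffic analysis has to cope with padding, compression and noise.

```python
def guess_page(observed_bytes: int, known_sizes: dict[str, int],
               tolerance: int = 64) -> list[str]:
    """Return the candidate pages whose known size is close to the observed one."""
    return sorted(
        page
        for page, size in known_sizes.items()
        if abs(size - observed_bytes) <= tolerance
    )


if __name__ == "__main__":
    # Hypothetical sizes the eavesdropper collected by fetching the pages themselves.
    known = {"/login": 1200, "/search?q=a": 5400, "/search?q=ab": 3100}
    print(guess_page(3120, known))
```

If only one candidate falls within the tolerance, the eavesdropper has a good guess at what was requested without ever decrypting a byte.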

In general, however, transport encryption lets only your computer and the HTTPS server at the other end of the connection see your data, though each of them is then free to share that data with adversaries who analyze it.

To be clear: yes, transport layer encryption prevents your ISP and other network eavesdroppers from performing Deep Packet Inspection on the content, with the exception of seeing when, with whom, and how much encrypted data you are sending/receiving. If you are paranoid about even this limited information, you can proxy all your traffic through an encrypted tunnel/VPN/Tor to hide which web sites/machines you are visiting, so the ISP can only discover how much data you are sending to your proxy at any given time. (Granted, the ISP of the proxy server could eavesdrop on which sites its proxy servers visit, as well as which computers are connected to the proxy server.)
