Most secure data storage?

Your question is subject to some subtleties. Fasten your seat belt, I am going to be verbose.

Digital Medium

You want a "digital" medium. What is that exactly? In a hard disk, a bit is written by changing the orientation of some magnetic dipoles, created by the "movement" (inasmuch as it can be defined as per quantum mechanics) of electrons in some of the atoms of the disk platter. Roughly speaking, each atom has a natural orientation, and the bit is stored by changing this orientation, i.e. "rotating" the atom. So there is some amount of "physical movement" in it. That's also true with flash memory: data is stored by accumulating "charges" in a semiconductor substrate; the charge corresponds to the "physical" movement of electrons (they hop between "bands" -- the kind-of orbital trajectories of electrons around atomic nuclei).

Now consider a sheet of some material in which you engrave some data; there again, that's just "moving around" some of the atoms of the material, so it is not qualitatively distinct from magnetic storage. You can make a distinction based on the amount of matter which is moved for a single bit, but this involves an arbitrary threshold.

Engraving has a remarkable record of long-term reliability for information storage. Consider this marvelous extract from the Schøyen collection:

MS 3029 - Gift from the high and mighty of Adab to the high priestess, on the occasion of her election to the temple (Sumer, 26th century BC)

This specific data file has been stored for more than 4500 years, in an area which has been scourged by countless wars in between. Empires have waxed and waned, and the soil has been soaked by the blood of warriors. Yet the information remained and was not lost; and, even better, it was stored at no marginal cost at all. As can be seen in the picture above, the medium would be good for a few more millennia at least. Data integrity is more threatened by, let's say, "encoding problems" (although that specific example could be translated nonetheless).

From the wording of your question, I assume that you would declare such engraving as "not digital" and reject it. Now, have a look at this:

Microscopic view of the surface of a CD

This is a close-up view of the surface of a Compact Disc, of the "pressed" variety (i.e. not a CD-R). The bits are encoded by embossing, the dual operation of engraving; a big press is involved in the process. A CD-R does not use embossing; rather, the optical properties of a dye layer are modified with a laser, so a CD-R is more similar to printing on paper (in particular with thermal printers, as are common in payment terminals).

A CD is definitely "digital". However, it is not qualitatively distinct from the work of a Sumerian scribe on a stone slab. To declare the latter "not digital", you have to enforce a totally arbitrary threshold, which, as such, can be challenged.

You might want to make a distinction based on how the data can be read back into a computer, on the basis that a CD goes into a reader which outputs zeroes and ones on a wire. However, the picture of the slab which you saw above reached your computer over wires as zeroes and ones, so, there again, the distinction is subject to an arbitrary threshold. As a more striking example, consider QR codes like this one:

QR code for Wikipedia (mobile version of the main English page)

Now, a QR code printed on a sheet of paper, or engraved on a marble slab (to be read with tangential lighting, so as to shadow the pits), is that digital or not?

Magnetism and EMP

Magnetism is a property of some materials. Magnetism is convenient for computerized data storage, because of the achievable high density, low latency, and possibility of multiple rewrites. However, this last property is also the bane of long-term storage: stored data can be affected by external magnetic sources, and is subject to gradual leakage. Even reading implies "grabbing" a bit of energy from the medium, thus weakening the data storage. Magneto-optical drives fare a bit better:

  • The medium magnetism can be altered only at high temperatures; at room temperature, it is "fixed".
  • Reading can be done optically (because the medium optical qualities are modified by the magnetic orientation which was forced upon it when it was last heated), thus leaving the magnetic field "alone".

Although manufacturers of magneto-optical drives claim reliable storage for long durations (decades, up to a century or two), this heavily depends on environmental conditions, and has never been tested "in full size" since the technology itself is not that old.

In particular, magnetic storage, whether magneto-optical or not, can be disrupted by applying a huge amount of magnetism in one go, something known as an Electromagnetic Pulse. This is the electromagnetic equivalent of a full stadium of beer-powered sport fans bellowing simultaneously (this has been used by some movie directors to obtain sound effects which are hard to simulate in a lab, resulting for instance in this scene -- a cricket stadium was involved). The method of choice for generating a big EMP is a nuclear weapon: nuclear fission and fusion emit huge amounts of high-energy gamma rays, which, by colliding with electrons, create the EMP. An EMP can also be generated with non-nuclear devices, albeit with a much lower energy.

Hollywood, in its everlasting educational mission, has depicted a non-nuclear EMP in which the generating device looks like a jukebox. There has been some allowance for artistic license, though: a non-nuclear EMP works by rapid compression of a conductor in a magnetic field, where "rapid" means "high explosives". While the EMP effect has some military applications in specific situations (especially disabling onboard electronics in missiles, without needing to actually hit the missile with another missile), the common wisdom is that the explosives are more of a general threat than the EMP itself. Would-be terrorists would not care about electromagnetism; they would just blow up things with the explosives alone.

A Faraday cage is effective against EMP proper; however, it does not block gamma rays and neutrons from a nuclear explosion, so gamma rays may enter the cage and generate a local pulse by interacting with the magnetic storage medium itself. The best protection for magnetic storage devices against a nuclear event is a deep underground bunker (it is also effective for protecting human operators). That's what they do for NORAD: the headquarters are buried under a mountain.

Security: Threats and Goals

There are two sides to data security:

  • Protection against destruction: we want to be able to read the data back later on, possibly after many decades. An attacker may want to prevent us from doing that.
  • Protection against data theft: we do not want an attacker to be able to access the data.

These two goals are partially opposite to each other. The best protection against destruction is duplication; if you have a dozen copies of the data scattered over five continents, then the attacker will have a hard time obliterating them all (even Harry Potter could not do it in less than seven books). However, the more physical copies there are, the harder it is to protect them all against illicit copying. To some extent, theft is just another copy, so theft protects against destruction...

The situation is substantially different, depending on when and how you want to be able to access the stored data (data storage makes sense only if you plan to access the data at some point, even conditionally). If you just want long-term storage as a backup in case of a large-scale disaster, then you can use a non-networked storage facility, thereby removing all threats related to "hacking". Thus, protection against theft becomes a matter of physical security: store the data in a bunker, enlist guardians with (as @Lucas suggests) fierce dogs.

Guardians and dogs are a worry, though, because:

  • There are costs: you must feed them, entertain them, see to their general well-being and health, but at the same time subject them to enough stress and well-dosed misery so that they remain sharp and mean killing machines. And you must do that for the dogs, too.

  • Bribery is a problem. Each biological entity you allow around the premises is yet another target for subversion. Elaborate cross-spying schemes, with rewards for ratting out, can mitigate the issue, but involve even higher costs, and are considered to be "toxic to the workplace".

The storage medium can trigger the same kind of issue. If you use magnetic storage such as hard disks, then you must regularly (say, on a yearly basis) power the drives, and replace the units which fail to go live (invariably, some will fail to power up). This entails some redundancy (error correcting codes and variants, such as what is used in some RAID arrays) and, you get it, operators. Those pesky humans tend to have external lives, through which they can be controlled by adverse parties (give a man one million dollars, or kidnap his mother, and see if he keeps on refusing something as simple as pinching a hard disk).
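To make the "redundancy" part a bit more concrete, here is a minimal sketch of the simplest variant: a single XOR parity block, in the spirit of what some RAID levels do. The function names (`xor_blocks`, `make_parity`, `rebuild_missing`) are made up for this illustration, and real arrays add striping, checksums and stronger codes on top of this; take it as a toy, not as a description of any actual product.

```python
def xor_blocks(blocks):
    """XOR together a list of equal-length byte strings."""
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    out = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def make_parity(data_blocks):
    """Compute the parity block stored on the extra (hypothetical) disk."""
    return xor_blocks(data_blocks)


def rebuild_missing(surviving_blocks, parity):
    """Rebuild the single missing block from the survivors and the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Three "disks" holding one block each, plus one parity block.
disks = [b"AAAA", b"BBBB", b"CCCC"]
parity = make_parity(disks)

# Disk 1 failed to power up this year; rebuild it from the others.
rebuilt = rebuild_missing([disks[0], disks[2]], parity)
```

With one parity block you survive the loss of exactly one disk per group; losing two at once requires a stronger code, or simply more copies.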

If the facility does not need regular intervention, then you can, to some extent, use discretion as a substitute. For instance, consider what the tomb of Tutankhamun looks like from the outside (actually, a view from the Valley of the Kings, in Egypt, at a place where archeologists of the last two centuries have not yet defaced the landscape with their excavations):

View from the Valley of the Kings, Egypt

and compare that to the tomb of Khufu, of, let's say, questionable inconspicuousness:

The Great Pyramid of Giza

Now guess which one was not robbed in more than three thousand years?

That's "Security through Obscurity" and it is known to be unreliable (e.g. see the plot of For a Few Dollars More), but "unreliable" does not mean that it never works...

Encryption

It is time to flourish our secret weapon: encryption. Encryption does not create secrecy; however, it concentrates secrecy. You have a key; it is a sequence of bits which have been generated with as much randomness as is practical (e.g. with coin flips, or sufficiently unbiased dice). Using the key, you can transform the data, even huge amounts of data, into a big heap of meaningless junk of roughly the same size as the original data; but, knowing the key, you can reverse the operation and recover your precious data.
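As a small illustration of "a key is just a sequence of random bits", here is a sketch which packs coin flips into key bytes. The helper name `key_from_coin_flips` is invented for this example, and the flips are simulated with Python's `secrets` module as a stand-in for an actual coin; if you really flip a coin 128 times, type the results in instead.

```python
import secrets


def key_from_coin_flips(flips):
    """Pack a sequence of 0/1 coin flips into bytes, first flip first."""
    if len(flips) % 8 != 0:
        raise ValueError("the number of flips must be a multiple of 8")
    value = 0
    for bit in flips:
        if bit not in (0, 1):
            raise ValueError("each flip must be 0 (tails) or 1 (heads)")
        value = (value << 1) | bit
    return value.to_bytes(len(flips) // 8, "big")


# Simulated flips (assumption: the OS random source replaces the coin).
flips = [secrets.randbits(1) for _ in range(128)]
key = key_from_coin_flips(flips)  # 16 bytes, i.e. a 128-bit key
```

In practice, calling `secrets.token_bytes(16)` directly gives the same kind of key with less wrist strain.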

A cryptographic key for symmetric encryption ("symmetric" means that the same key is used for encrypting and for decrypting) need not be long; it just has to be long enough to thwart exhaustive search (i.e. it would be way too expensive, or even downright impossible with energetic resources available on Earth, to try out any substantial part of the set of possible key values). 128 bits are enough, but, in order to accommodate inflexible administrative regulations and quiet the qualms of managers who read too many science-fiction books, you might want to pump the key size up to 256 bits (don't believe Dan Brown: exhaustive search is not an ultimately workable attack strategy, even if the attack machine is expensive; the scientific basis for this novel is no better than the quality of writing or the plausibility of the characters). As for the encryption algorithm, there is no guarantee that any specific algorithm will keep up its security promises for the next century, but there are good-looking candidates which have withstood cryptanalytic advances for more than a decade, and that's not for lack of trying. I am of course speaking about the AES, which seems to be quite secure, and, furthermore, is Approved by governmental bodies throughout the World. AES might not ultimately protect your data, but it is the best that can be offered right now, and its security is plausible enough that you can get insurance against possible attacks at reasonable costs.
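To put a number on "long enough to thwart exhaustive search", here is a back-of-the-envelope sketch. The attack rate of 10^18 keys per second is a purely hypothetical figure chosen for the example (it is not a measurement of any real machine), and the function name `years_to_search` is mine.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def years_to_search(key_bits, keys_per_second, fraction=0.5):
    """Years needed to try `fraction` of all keys of the given size.

    On average, an exhaustive search finds the key after trying half
    of the key space, hence the default fraction of 0.5.
    """
    keys_to_try = fraction * 2 ** key_bits
    return keys_to_try / keys_per_second / SECONDS_PER_YEAR


# Hypothetical attacker: 10**18 key trials per second (an assumption).
rate = 10 ** 18
for bits in (56, 128, 256):
    print(f"{bits}-bit key: about {years_to_search(bits, rate):.3g} years")
```

Under that assumption, the 128-bit case comes out on the order of 10^12 years, which is a few hundred times the current age of the Universe; the 256-bit case is there for the managers.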

The secrecy concentration works like this: if the data you store is encrypted, then you no longer have to worry about theft of the data itself. You can store your gigabytes on any number of hard disks, copied and stored over dozens of remote places so that local disasters and "Acts of God" do not destroy it. Hundreds of underpaid henchmen may oversee the regular maintenance operations, with no bribery issue. You still need to take steps to guarantee the secrecy of the key, but the key is very small, so that's much easier.

The Solution

For long-term storage, I thus recommend the following scheme:

  • Encrypt the data, with AES and a 128-bit key (or 256-bit key if you need to woo investors).
  • Store the encrypted data on hard disks or pressed CDs (the former have more capacity, the latter require much less maintenance), and put these in guarded bunkers in geographical locations which are far away from each other (preferably, in distinct countries over several continents).
  • Make a few copies of the key by engraving it on stone slabs (marble is classical; avoid limestone, which might lack durability in humid conditions; the Distinguished Customer will want to engrave on diamonds). Put the slabs in the safes of a few banks (do not conceal them in jewelry worn by your mistress -- that's the way of James Bond villains and it does not work well for long-term secrecy).

For extra security, split the key into parts with Shamir's Secret Sharing, so that plundering one bank does not endanger the secrecy of the key.
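For the curious, the splitting step can be sketched as follows. This is a minimal illustration of Shamir's scheme over a prime field, not a vetted implementation: the names `split_secret` and `recover_secret` are mine, and the choice of the Mersenne prime 2^521 - 1 as the field modulus is simply a convenient assumption (any prime larger than the key would do). It needs Python 3.8 or later for the modular inverse via `pow`.

```python
import secrets

# Field modulus: a Mersenne prime, comfortably larger than a 256-bit key.
PRIME = 2 ** 521 - 1


def split_secret(secret, threshold, num_shares):
    """Split an integer secret into shares; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range")
    if not 1 <= threshold <= num_shares:
        raise ValueError("need 1 <= threshold <= num_shares")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for coeff in reversed(coeffs):  # Horner evaluation at x
            y = (y * x + coeff) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


# A 128-bit key, split over five banks; any three slabs suffice.
key = int.from_bytes(secrets.token_bytes(16), "big")
shares = split_secret(key, threshold=3, num_shares=5)
recovered = recover_secret(shares[:3])
```

Each share is about as long as the modulus, so it still fits on a slab. With a threshold of three, a robber who plunders one or two banks learns nothing about the key.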

The trick, of course, is to make attacks more expensive than what the data is worth. If destroying or stealing your data involves a wide-scale nuclear war or a worldwide meteorite strike, then chances are that your data will be quite irrelevant in such conditions, and nobody will bother resorting to such drastic measures just to get at it.


I'd go with an underground nuclear bunker with a Faraday cage on the inside. In there I'd put a vault. In the vault I would put a USB stick or hard drive with embedded hardware encryption.

Also add halon, both as a fire suppressant and for killing things that breathe.

You can add vicious dogs (or psychotic AIBOs) as well if you want to.


You say you don't trust these places, but trust is something that can be grey, rather than black and white.

What I do with my photos is keep one copy on my hard drive, one on a removable hard drive (replicated weekly), a set of key images burned to DVD monthly, and a full mirror on a server on a different continent.

For sensitive data, I encrypt it first, and do a very similar process.

This way I don't need to trust that any one method will work, just that I am unlikely to have all fail at the same time.
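Several copies only help if they still match each other, so it may be worth checking them from time to time. Here is a minimal sketch using SHA-256 digests; the function names (`sha256_of`, `copies_agree`) and the file names are invented for the illustration, and the three "copies" are just temporary files standing in for the local disk, the removable drive and the DVD.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_agree(paths):
    """True if every file in `paths` has the same SHA-256 digest."""
    return len({sha256_of(path) for path in paths}) == 1


# Stand-ins for the real copies (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp, "photo.raw")
    mirror = Path(tmp, "mirror.raw")
    damaged = Path(tmp, "dvd.raw")
    original.write_bytes(b"pixels")
    mirror.write_bytes(b"pixels")
    damaged.write_bytes(b"pixelz")  # one flipped byte

    all_good = copies_agree([original, mirror])
    with_damage = copies_agree([original, mirror, damaged])
```

A mismatch only tells you that some copy went bad, not which one; with three or more copies, the odd one out is the usual suspect.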

If you strongly encrypt your data, you can store it with whatever organisation - they won't be able to read it anyway, so all you have to do is choose one with the resilience options you like. The same goes for the cloud - there are private and public cloud variants; fixed location and global; pop your data to a few locations or methods and you should be as secure as you want to be.