Steganography to hide text within text

Yes, there exist algorithms that hide messages inside other messages that look quite innocent. Take spammimic, for instance. It lets you hide your message inside a typical-looking spam message.

A Google search for "Steganography hiding text in text" turns up more research and examples on this.


My personal (though perhaps biased) opinion is that spammimic's output isn't very "natural". A humble attempt of mine uses the number of words in a line to transmit one stego bit per line. It works for emails or similar text documents, e.g. HTML source files, where one normally doesn't care much about how ragged the line ends are. Python code that helps with the formatting is available under the name EMAILSTEGANO. Its bit rate is, unfortunately, very low. On the other hand, very short stego messages can occasionally be sufficient for one's purposes (e.g. when an appropriately built codebook is used to express the information to be transmitted in highly compressed form). Note that for hand-written texts, the problem of more or less unsatisfactory ragged line ends may even disappear completely if appropriate care is taken in writing.
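To make the word-count idea concrete, here is a minimal Python sketch. It is not the EMAILSTEGANO code. It assumes one particular encoding, namely that a line with an odd number of words carries a 1 and a line with an even number carries a 0, and it assumes the receiver already knows how many bits to read. The function names `embed_bits` and `extract_bits` and the `width` parameter are made up for this illustration.

```python
import textwrap


def embed_bits(cover_text, bits, width=60):
    """Re-break cover_text so that line k has a word count whose parity is bits[k].

    Assumed convention (illustrative only): odd word count = 1, even = 0.
    """
    words = cover_text.split()
    lines = []
    pos = 0
    for bit in bits:
        line = []
        # Greedily fill the line up to the target width (always take one word).
        while pos < len(words) and (
            not line or len(" ".join(line)) + 1 + len(words[pos]) <= width
        ):
            line.append(words[pos])
            pos += 1
        # Fix the parity by pushing one word back, or by pulling one more in.
        if len(line) % 2 != bit:
            if len(line) > 1:
                line.pop()
                pos -= 1
            elif pos < len(words):
                line.append(words[pos])
                pos += 1
            else:
                raise ValueError("cover text too short for the message")
        if not line:
            raise ValueError("cover text too short for the message")
        lines.append(" ".join(line))
    # Leftover cover words carry no data and are wrapped normally.
    rest = words[pos:]
    if rest:
        lines.append(textwrap.fill(" ".join(rest), width))
    return "\n".join(lines)


def extract_bits(stego_text, n_bits):
    """Read one bit per line from the first n_bits lines."""
    lines = stego_text.splitlines()[:n_bits]
    return [len(line.split()) % 2 for line in lines]
```

The words of the cover text are left untouched; only the positions of the line breaks change, which is why the line ends come out somewhat ragged and why the capacity is just one bit per line.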

[Addendum, edited] I now have a different scheme, WORDLISTTEXTSTEGANOGRAPHY (employing an extensive word list), which has a higher bit rate, albeit requiring the user to compose the cover texts under the guidance of the software. The most recent versions of both schemes are accessible from my home page, mok-kong-shen.de


I have a brilliant example for you! I've recently seen ONE application of steganography being used to hide a text message within a text document.

There is a National Geographic video on YouTube about the Aryan Brotherhood and how they used to communicate while in prison, across the nation. The gang was created inside a maximum security prison in California, and managed from other super max prisons. They are the most violent gang in prison and, while making up only 1/10 of 1% of the population, are responsible for more than 20% of the murders that take place within the prisons.

The steganographic technique they employed was a bi-literal cipher developed 400 years ago by Sir Francis Bacon. It was broken by a multi-jurisdictional federal team that included experts at the FBI, the NSA and other organizations. Naturally, you cannot rely on this technique since it has been broken, but some of the logic behind it is still solid.

You really need to see the video if you do not understand what I am stating here. As stated, the texts are meshed together. In this technology, one "alphabet" is written in plain block letters, and the other "alphabet" is written in cursive. The plain block letters become As, and the cursive letters become Bs. Then the letters are arranged in groups of five, and they must then be deciphered using a key.
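For readers who cannot watch the video, here is a rough Python sketch of the bi-literal idea. It rests on several assumptions that are mine, not the video's: lowercase letters stand in for plain block letters (A) and uppercase letters stand in for cursive (B); the key is the simple 26-letter table in which each letter's position in the alphabet is written as five binary digits (Bacon's original table used 24 letters); and the reader knows how many hidden letters to expect. The names `to_bits`, `hide` and `reveal` are invented for the example.

```python
def is_letter(ch):
    """True for ASCII letters only, so both sides count the same characters."""
    return ch.isascii() and ch.isalpha()


def to_bits(secret):
    """Turn each letter into five bits: A=00000, B=00001, ..., Z=11001."""
    bits = []
    for ch in secret.upper():
        if is_letter(ch):
            idx = ord(ch) - ord("A")
            bits.extend((idx >> shift) & 1 for shift in (4, 3, 2, 1, 0))
    return bits


def hide(secret, cover):
    """Write the cover text in two 'alphabets': lowercase = A (0), uppercase = B (1)."""
    bits = to_bits(secret)
    out = []
    used = 0
    for ch in cover:
        if is_letter(ch):
            if used < len(bits):
                ch = ch.upper() if bits[used] else ch.lower()
                used += 1
            else:
                ch = ch.lower()  # padding after the message ends
        out.append(ch)
    if used < len(bits):
        raise ValueError("cover text needs five letters per hidden letter")
    return "".join(out)


def reveal(stego, n_letters):
    """Read the letter styles in groups of five and look each group up in the key."""
    styles = [int(ch.isupper()) for ch in stego if is_letter(ch)]
    letters = []
    for k in range(n_letters):
        group = styles[5 * k:5 * k + 5]
        idx = int("".join(map(str, group)), 2)
        letters.append(chr(ord("A") + idx))
    return "".join(letters)


stego = hide("HELP", "the quick brown fox jumps over the lazy dog")
print(stego)
print(reveal(stego, 4))
```

Upper and lower case are, of course, far easier to spot than two similar handwriting styles or fonts; the case switch is only a stand-in that makes the two alphabets visible in plain text.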

I know this sounds easy to break if it is posted on the Internet, but some fonts are so similar that this could remain a workable technique unless every document is pored over, and any document may contain dozens of fonts. One way to hide the font changes would be to place the different fonts in a PDF document or an image; special tooling would then be needed to tell the fonts apart, something that most OCR software does not commonly do.