What makes Adobe formats so vulnerable?

There are many factors, but here are some:

  1. They're custom binary formats, not built on a common serialization such as XML, JSON or YAML. That means each Adobe product developer has to write a parser from scratch, which is one of the most difficult and tedious tasks for programmers.

  2. These formats are often designed for efficient processing, with little thought given to how easy it is to write a secure implementation. For example, PDF contains an xref table, which holds the byte offsets that a program has to seek() to in order to find sections. If that's not bad enough, the xref table is located at the end of the file, so a PDF parser has to start at the end of the file and work backwards. To complicate things further, a single file can contain more than one xref table, and some entries can be inactive or overridden by later entries.

  3. The binary format can contain conflicting information. For example, the extent of a PDF section can be specified in several ways at once: a stream object carries its own length as an ASCII number, it is closed by "endstream" and "endobj" markers, and its position is recorded in the xref table. When these disagree, different parts of Adobe's products and different third-party implementations may each rely on a different one, and so interpret the same section differently.

  4. These file formats can contain Turing-complete executable code. A PDF file can contain JavaScript, and a SWF file can contain ActionScript. As with any other document scripting language (e.g. VBA in MS Office files, JavaScript in HTML), these Turing-complete languages are a source of bugs of their own.

  5. PDF touts itself as a format that looks the same everywhere, so you can open an ancient PDF and expect it to render the same in a modern reader. Some ancient PDF generators produced faulty PDFs that happened to rely on bugs in old versions of Reader/Acrobat. Rather than fixing the program and rejecting these faulty files as corrupted, Adobe added hacks to keep them readable. Unfortunately, these hacks aren't documented, which makes it difficult for other implementations to match Adobe's handling of malformed documents.

  6. The formats have a massive number of features that are tangential to document formatting. Rather than using separate layers, Adobe likes to implement everything in its own format specification. For example, many formats such as ODF, DOCX and JAR are simply XML or class files plus metadata inside a regular zip file. PDF instead intermixes compressed streams with the regular data stream, so you can't just run a regular zlib decompressor over the file to get an uncompressed PDF. The same goes for digital signatures and version control.
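
To make points 2 and 3 concrete, here is a minimal Python sketch of the two checks a parser ends up hand-rolling: finding the xref offset from the tail of the file, and comparing a stream's declared length against where "endstream" actually sits. The function names find_startxref and stream_length_mismatch, and the 1024-byte tail window, are my own assumptions for illustration and not anything from Adobe's code. It also ignores indirect /Length references, xref streams and incremental updates, so treat it as a sketch and not a real parser.

```python
import re


def find_startxref(data: bytes, tail_size: int = 1024) -> int:
    """Return the byte offset announced by the last 'startxref' keyword."""
    # The keyword lives near the end of the file, so search the tail backwards.
    tail = data[-tail_size:]
    pos = tail.rfind(b"startxref")
    if pos == -1:
        raise ValueError("no startxref keyword near the end of the file")
    match = re.match(rb"startxref\s+(\d+)", tail[pos:])
    if match is None:
        raise ValueError("startxref is not followed by a decimal offset")
    offset = int(match.group(1))
    # Never trust the offset: a hostile file can point anywhere.
    if offset >= len(data):
        raise ValueError("startxref points past the end of the file")
    return offset


def stream_length_mismatch(obj: bytes) -> bool:
    """True if the declared /Length disagrees with the 'endstream' position."""
    # Simplification: assumes /Length is a direct number, not "N 0 R".
    declared = re.search(rb"/Length\s+(\d+)", obj)
    start = re.search(rb"stream\r?\n", obj)
    if declared is None or start is None:
        raise ValueError("not a stream object with a direct /Length")
    end = obj.find(b"endstream", start.end())
    if end == -1:
        raise ValueError("missing endstream keyword")
    body = obj[start.end():end]
    # One end-of-line before 'endstream' is not counted as stream data.
    for eol in (b"\r\n", b"\n", b"\r"):
        if body.endswith(eol):
            body = body[: -len(eol)]
            break
    return int(declared.group(1)) != len(body)
```

A parser that trusts /Length and one that scans for "endstream" will read different bytes from any object where stream_length_mismatch() returns True, which is exactly the kind of disagreement point 3 is about.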

Adobe is still making a lot of money despite these problems, which is probably the biggest reason they don't feel the need to change their developer culture.


Because Shockwave/Flash uses an intermediate bytecode that is generated from ActionScript, it is the code generator and the library in the flash_player plugin that have extremely poor secure-coding practices.

Also, the flash_player plugin (like the Firefox JS engine, the Chrome JS engine, and node) generates executable code dynamically when it reads this intermediate bytecode, which will continue to make it vulnerable for the foreseeable future.

I learned all of this while using IDA Pro on it.

And it still has plenty of "opportunities".