Why is this 435 × 652 pixel JPEG over 6 MB?

Short answer: It's an artifact of Nikon Picture Project

I had difficulty finding "Nikon Picture Project" but finally found a 1.5 version to try. The last version produced was 1.7.6 .

It turns out that "Nikon Picture Project" does indeed implement non-destructive editing with undo and versioning capabilities. Unlike every other photo editing software I've ever seen, it does this by directly altering the JPG file structure and embedding edit controls and versions directly in the JPG. There is an Export JPEG function in the software to flatten and remove history but it looks like the native munged JPGs were posted instead of using the export.

I loaded up your first reference image (resized here)

Reference.

Sure enough, "Nikon Picture Project" showed it as an edit and crop of a much larger picture (resized here)

Original.

Checking the before and after file structures verifies the weird artifacts.

Thanks for the puzzle!


This was less interesting than it seemed at first. The user might just have a broken camera, broken memory card, or malfunctioning photo editing software that fails to save the full resolution image, but is able to save various size of working thumbnails, including the 435 × 652 "original" picture.

The filesize of your example picture is explained by a 4032 × 3024 pixels and 5,47 MB JPEG stream that is broken and, scaled down, looks like this:

Broken image scaled down

It begins here with the FF D8 SOI (Start Of Image):

Start Of Image from HxD

And ends here with the FF D9 EOI (End Of Image):

End Of Image from HxD

There is also another differently broken 1920 × 1440 thumbnail of the same image and a thumbnail of this broken image, but if there's something interesting hidden in the gray, it's between 006A4F and 5812A2. However, I wouldn't bet on it.


As other commenters have mentioned, the file contains data from Nikon Picture Project. What if you couldn't run that software, but you still wanted to know what was hidden inside?

Nikon's Picture Project format seems to be entirely undocumented, which is no surprise given that it's a custom format for a particular app and was never designed for interchange. That said, the format seems to be extremely simple and can be discerned by examining the APP10 chunks (FF EA tags) embedded in the binary. I looked at the chunks using Hachoir (a general-purpose file parsing tool) using the following code:

from hachoir.parser.image.jpeg import JpegFile
from hachoir.stream import FileInputStream
import struct

p = JpegFile(FileInputStream('20200519221417!Goniobranchus_aureomarginatus_2.jpg'))
for i in p.array('chunk'):
    print(i['data'].value[:100].hex())

Just lining up all the chunks like this, one immediately sees patterns:

4e696b6f6e20496d61676520496e666f000200000001f00000618396ffd8ffdb0084000101010101010101010101010101010201010101010202020102020202020202020202030303020303030202030403030304040404020304040404040304040401
4e696b6f6e20496d61676520496e666f000200000002f000bdcc1b6d3b9c535cb2bf520b2bff00340964d84ab6dc03cb7bf3c8ce6bd5bc1fae18562188d5e194bb9597040e36820f5e99e4f7fad7979b41bfebe67a5867785cf6e1e30c5b6e92621d8ef6
4e696b6f6e20496d61676520496e666f000200000003f000e0753fe7debf986355e1d34cfea696b17639dfb088ae1434600070a0fe7c57456f6931450a62507e47431072c3af04e3079af2b1152cf9bab65538dd5999b77a32f9991103d4739ce49e7eb5
4e696b6f6e20496d61676520496e666f000200000004f0000948036296da18e4e78e2bd98d292a577bbfebf1382b452bdcd28ef448cd8904a91a95f2cae368ee73d4fad4134b0ac68e082cd2336d033839ea7fbd9cf35c9384bda5dbd422a37b1fffd3fc
4e696b6f6e20496d61676520496e666f000200000005f000aa47dbc746ce9c2569c612aab7b9ffd3fcc2d67c0bf1b3e2d7c42bff00106972b695e0fd1ef46a1e25d7a4dc16360f84b7deea57730380b9dd5f6b7876730e8b6d664bce2c20581e590e7715
4e696b6f6e20496d61676520496e666f000200000006f000a0aa5a99ffd1fc2e5bb3ba2937c46471fc07210e73f89ae82c6e0163299631b8e58e793827afeb5fcc74aa3a57b1fd758cf7a3a1e8fa19230102b051921864306e7b9f7af54b1558e18dc310
4e696b6f6e20496d61676520496e666f000200000007f0009ddd707f957e974e7f5887b56f563a10743961d57e274be1ed7fed266b4b53219236659f703b78273863c139eb8ef5da695aba5a6b3610ddc2f3594ab219f6b0c162328727d0f6ef5e0d6c23
4e696b6f6e20496d61676520496e666f000200000008f000375ab993f790cd188651874393939cee0dd7a6411f4ae7478836811db4eac9972c4e41f94fcc416e5b9e01afa4861a528b99e34a6a261ea1e2268edc012399d0923692d9dc4920fe679ae12f
4e696b6f6e20496d61676520496e666f000200000009f000cd6fc7e6ee6de32bf75492727be7f0e6bf8be10536dbef73fb25c6317643f50958d9b9190318720124d73ba5c71c97af15e42df67b88c46252721893cf07ebfcfd2b2745467ccf7b9e950925
4e696b6f6e20496d61676520496e666f00020000000af000b0f9659df1b0f5c903a9c73f98aee03c5344b0c368bf31cf981f25f3fdecd44b1156524e5b1e156a692777a9c77882e65b547b60db6220b9dbd171ea7b579aa91a8de189519b24072b260e72
4e696b6f6e20496d61676520496e666f00020000000bf000d4fceeb5d124f262789d622cfb08924cf24e7a1e6bad8b462234ce245fe251b8063cf39afe48c5d48ceab6fccfe9ba5074e96a6c59db3c0ca8f1b850a18b2938662581e7f0fd6ba9b5b958fe
...
4e696b6f6e20496d61676520496e666f000200000068f0001acec0e2a791b919b9d91fffd6c432c611ce79c71594cf1cb202d8241af9849ec7b37b97ed648e59d60de067a8cd67f8816350d120048ef4a707a32a9cd5ec729a4de8d1b53576190c7a1af4
4e696b6f6e20496d61676520496e666f00020000006903960f515ce93b9d6e57d0cfbb94953c74eb58372df31e7f0ae983b239ea22a32a95e4d4ba7a057c139ad5dec713dcffd7f8f6f13692cc7807818a8609c4732b7615e7ad51dcb73a55bd82e60f9c
4e696b6f6e20496d61676520496e666f00030000000100000bbb0bbb40a9867a1be9d211a90a00aa00b1c1b70200a90b00000032a476a217d411a90a00aa00b1c1b70100050000000161512be4df5dd211a90a00aa00b1c1b7020005000000000132a476

We can see that there's a fixed header (4e696b6f6e20496d61676520496e666f: Nikon Image Info in ASCII), followed by either 0002 or 0003, then what seems to be an incrementing number (starting at 00000001 and ending at 00000069), and finally some kind of length field (f000 for most chunks except the last two, which have 0396 and 0000). After that it looks like data.

So, I guessed the header was something like this:

uint16_t chunktype;
uint16_t unknown; /* always zero */
uint16_t serial;
uint16_t datasize;
uint8_t payload[];

and then dumped out all the payload bits to a file:

out = open('dump.bin', 'wb')
for i in p.array('chunk'):
    data = i['data'].value
    magic, ctype, unknown, serial, size = struct.unpack('>16sHHHH', data[:24])
    print(magic, ctype, serial, size, len(data[24:]))
    chunk = data[24:24+size]
    out.write(chunk)

The resulting file starts with four bytes 00 61 83 96 (0x618396) which matches the total length of the data (0x618396 = 6390678 bytes). Next is FF D8 FF DB, the start of a JPEG, so stripping the length field off reveals a 4032x3024 JPEG. This is presumably the original photo from the camera. Here's the photo, resized to fit within the upload limit:

first image - 4032x3024

A trip to Hachoir shows that the JPEG is quite normal in structure, but it's been stripped of all metadata. Curiously, Hachoir also shows that it ends after 5742120 bytes. Dumping out the data after the end reveals a second JPEG, 1920x1440 in size:

second image - 1920x1440

Sadly it's not some exciting spy stuff, it's just another version of the original picture but somewhat downscaled. It's still much, much larger than the actual cropped photo data, though! This time there's nothing at the end, so we've extracted out all the images from the file.

All that remains is the last chunk of data, which is 3008 bytes long. This chunk appears to contain the actual picture project info, presumably including a history of edits, detailed edit information, etc. The format is a lot more irregular, although I recognize quite a few double-precision floating point numbers and some things that look like magic numbers (65 D4 11 D1 91 94 44 45 53 54). With a little more work it should be possible to reverse engineer these chunks too - but there does not appear to be anything interesting hidden here steganographically :)