Why is this 435 × 652 pixel JPEG over 6 MB?

Short answer: It's an artifact of Nikon Picture Project

I had difficulty finding "Nikon Picture Project" but finally found version 1.5 to try. The last version produced was 1.7.6.

It turns out that "Nikon Picture Project" does indeed implement non-destructive editing with undo and versioning capabilities. Unlike any other photo editing software I've seen, it does this by directly altering the JPG file structure and embedding the edit controls and versions in the JPG itself. The software has an Export JPEG function that flattens the image and removes the history, but it looks like the native munged JPGs were posted instead of exported ones.

I loaded up your first reference image (resized here)


Sure enough, "Nikon Picture Project" showed it as an edit and crop of a much larger picture (resized here)


Checking the before and after file structures verifies the weird artifacts.

Thanks for the puzzle!

This was less interesting than it seemed at first. The user might just have a broken camera, a broken memory card, or malfunctioning photo editing software that fails to save the full-resolution image but is able to save working thumbnails of various sizes, including the 435 × 652 "original" picture.

The file size of your example picture is explained by an embedded 4032 × 3024 pixel, 5.47 MB JPEG stream that is broken and, scaled down, looks like this:

Broken image scaled down

It begins here with the FF D8 SOI (Start Of Image):

Start Of Image from HxD

And ends here with the FF D9 EOI (End Of Image):

End Of Image from HxD

There is also another differently broken 1920 × 1440 thumbnail of the same image and a thumbnail of this broken image, but if there's something interesting hidden in the gray, it's between 006A4F and 5812A2. However, I wouldn't bet on it.
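The FF D8 and FF D9 markers can also be located without a hex editor. This is only a minimal sketch, assuming the whole file fits in memory; `find_markers` is a hypothetical helper name, not part of any tool mentioned here. A raw byte search also hits markers that belong to embedded thumbnails or that occur by chance inside other data, so the offsets it returns are only candidates.

```python
SOI = b"\xff\xd8"  # Start Of Image
EOI = b"\xff\xd9"  # End Of Image


def find_markers(buf: bytes, marker: bytes) -> list:
    """Return every offset at which `marker` occurs in `buf`."""
    offsets = []
    pos = buf.find(marker)
    while pos != -1:
        offsets.append(pos)
        pos = buf.find(marker, pos + 1)
    return offsets


# Possible usage (the file name is a placeholder):
# with open("picture.jpg", "rb") as f:
#     buf = f.read()
# print([hex(o) for o in find_markers(buf, SOI)])
# print([hex(o) for o in find_markers(buf, EOI)])
```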

As other commenters have mentioned, the file contains data from Nikon Picture Project. What if you couldn't run that software, but you still wanted to know what was hidden inside?

Nikon's Picture Project format seems to be entirely undocumented, which is no surprise given that it's a custom format for a particular app and was never designed for interchange. That said, the format seems to be extremely simple and can be discerned by examining the APP10 chunks (FF EA tags) embedded in the binary. I looked at the chunks with Hachoir (a general-purpose file parsing tool), using the following code:

from hachoir.parser.image.jpeg import JpegFile
from hachoir.stream import FileInputStream
import struct

p = JpegFile(FileInputStream('20200519221417!Goniobranchus_aureomarginatus_2.jpg'))
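for i in p.array('chunk'):
    # Assumed loop body: print the start of each APP10 (0xEA) chunk in hex
    # so that the chunks can be lined up and compared.
    if i['type'].value == 0xEA:
        print(i['data'].value[:32].hex())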

Just lining up all the chunks like this, one immediately sees patterns:


We can see that there's a fixed header (4e696b6f6e20496d61676520496e666f: Nikon Image Info in ASCII), followed by either 0002 or 0003, then what seems to be an incrementing number (starting at 00000001 and ending at 00000069), and finally some kind of length field (f000 for most chunks except the last two, which have 0396 and 0000). After that it looks like data.
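For readers without Hachoir, the same APP10 payloads can probably be collected with standard-library Python alone. The following is a sketch rather than a tested extractor: `iter_app10` is a hypothetical helper, and it assumes every segment before the scan data carries a normal two-byte length and that all APP10 segments precede the first SOS marker.

```python
import struct


def iter_app10(buf: bytes):
    """Yield the payload of every APP10 (FF EA) segment before the scan data."""
    if buf[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(buf):
        if buf[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = buf[pos + 1]
        if marker == 0xDA:  # SOS: entropy-coded data follows, stop here
            break
        # The big-endian length counts its own two bytes but not the marker.
        (length,) = struct.unpack(">H", buf[pos + 2:pos + 4])
        if marker == 0xEA:
            yield buf[pos + 4:pos + 2 + length]
        pos += 2 + length


# Possible usage (the file name is a placeholder):
# with open("picture.jpg", "rb") as f:
#     for payload in iter_app10(f.read()):
#         print(payload[:24].hex())
```

Each payload printed this way should begin with the 16-byte `Nikon Image Info` string, followed by the fields described above.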

So, I guessed the header was something like this:

uint16_t chunktype;
uint16_t unknown; /* always zero */
uint16_t serial;
uint16_t datasize;
uint8_t payload[];

and then dumped out all the payload bits to a file:

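with open('dump.bin', 'wb') as out:
    for i in p.array('chunk'):
        if i['type'].value != 0xEA:  # only the APP10 chunks carry Nikon data
            continue
        data = i['data'].value
        # 16-byte magic string, then the four big-endian uint16 header fields
        magic, ctype, unknown, serial, size = struct.unpack('>16sHHHH', data[:24])
        print(magic, ctype, serial, size, len(data[24:]))
        chunk = data[24:24+size]
        out.write(chunk)  # append this chunk's payload to the dump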

The resulting file starts with four bytes 00 61 83 96 (0x618396) which matches the total length of the data (0x618396 = 6390678 bytes). Next is FF D8 FF DB, the start of a JPEG, so stripping the length field off reveals a 4032x3024 JPEG. This is presumably the original photo from the camera. Here's the photo, resized to fit within the upload limit:

first image - 4032x3024
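The declared length is consistent with the chunk headers: it equals a whole number of full 0xF000-byte payloads plus the single 0x0396-byte one. The sketch below checks that arithmetic and shows one way to strip the length prefix. `strip_length_prefix` is a hypothetical helper, and reading the first four bytes as a big-endian length is the guess described above, not a documented format.

```python
import struct

FULL_PAYLOAD = 0xF000  # datasize of most chunks
LAST_PAYLOAD = 0x0396  # datasize of the final image-data chunk
DECLARED = 0x618396    # value of the first four bytes of the dump

full_chunks, remainder = divmod(DECLARED - LAST_PAYLOAD, FULL_PAYLOAD)
print(DECLARED, full_chunks, remainder)  # 6390678 104 0


def strip_length_prefix(dump: bytes):
    """Split off a 4-byte big-endian length and return (length, rest)."""
    (declared,) = struct.unpack(">I", dump[:4])
    rest = dump[4:]
    if rest[:2] != b"\xff\xd8":
        raise ValueError("no JPEG SOI marker after the length field")
    return declared, rest
```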

A trip to Hachoir shows that the JPEG is quite normal in structure, but it's been stripped of all metadata. Curiously, Hachoir also shows that it ends after 5742120 bytes. Dumping out the data after the end reveals a second JPEG, 1920x1440 in size:

second image - 1920x1440

Sadly it's not some exciting spy stuff, it's just another version of the original picture but somewhat downscaled. It's still much, much larger than the actual cropped photo data, though! This time there's nothing at the end, so we've extracted out all the images from the file.
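Hachoir reported where the first JPEG ends, but the same boundary can be found by walking the JPEG markers by hand. This is a sketch with hypothetical helper names (`jpeg_end`, `split_jpegs`). It assumes well-formed streams and that the second image starts immediately after the first one ends; if other bytes sit between them, searching for the next FF D8 would be needed instead.

```python
import struct


def _skip_scan(buf: bytes, pos: int) -> int:
    """Return the offset of the first real marker after entropy-coded data."""
    while True:
        pos = buf.find(b"\xff", pos)
        if pos == -1 or pos + 1 >= len(buf):
            raise ValueError("ran out of data inside scan")
        nxt = buf[pos + 1]
        if nxt == 0x00 or 0xD0 <= nxt <= 0xD7:
            pos += 2  # stuffed FF 00 or restart marker: still scan data
        elif nxt == 0xFF:
            pos += 1  # fill byte before a marker
        else:
            return pos


def jpeg_end(buf: bytes, start: int = 0) -> int:
    """Return the offset just past the EOI of the JPEG beginning at `start`."""
    if buf[start:start + 2] != b"\xff\xd8":
        raise ValueError("no SOI marker at start")
    pos = start + 2
    while pos + 1 < len(buf):
        if buf[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = buf[pos + 1]
        if marker == 0xFF:  # fill byte
            pos += 1
        elif marker == 0xD9:  # EOI
            return pos + 2
        elif marker == 0x01 or 0xD0 <= marker <= 0xD7:  # no length field
            pos += 2
        else:
            (length,) = struct.unpack(">H", buf[pos + 2:pos + 4])
            pos += 2 + length
            if marker == 0xDA:  # SOS header is followed by scan data
                pos = _skip_scan(buf, pos)
    raise ValueError("ran out of data before EOI")


def split_jpegs(buf: bytes) -> list:
    """Split back-to-back JPEG streams into a list of byte strings."""
    images, pos = [], 0
    while buf[pos:pos + 2] == b"\xff\xd8":
        end = jpeg_end(buf, pos)
        images.append(buf[pos:end])
        pos = end
    return images
```

Applied to the dump with its four-byte length prefix removed, `split_jpegs` would be expected to return the two images described above.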

All that remains is the last chunk of data, which is 3008 bytes long. This chunk appears to contain the actual picture project info, presumably including a history of edits, detailed edit information, etc. The format is a lot more irregular, although I recognize quite a few double-precision floating point numbers and some things that look like magic numbers (65 D4 11 D1 91 94 44 45 53 54). With a little more work it should be possible to reverse engineer these chunks too - but there does not appear to be anything interesting hidden here steganographically :)
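As a starting point for that reverse engineering, candidate floating-point fields can be decoded with the `struct` module. This is only a sketch: `doubles_at` is a hypothetical helper, and because the byte order of this chunk is not established here, it returns both interpretations for comparison.

```python
import struct


def doubles_at(buf: bytes, offset: int):
    """Decode 8 bytes at `offset` as (big-endian, little-endian) doubles."""
    raw = buf[offset:offset + 8]
    if len(raw) != 8:
        raise ValueError("need 8 bytes to decode a double")
    return struct.unpack(">d", raw)[0], struct.unpack("<d", raw)[0]
```

A value that looks plausible in one byte order (a small number such as a crop ratio or an angle) and absurd in the other is a reasonable hint about the encoding.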