Make grown extracted tar file small again

[this answer is assuming GNU tar and GNU cp]

There is absolutely no diff and the checksum is the exact same value. Yet, one file is twice as big as the original one.

1.1M    /path/to/old/folder/subfolder/file.mcapm
2.4M    /path/to/extracted/folder/subfolder/file.mcapm

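That .mcapm file is probably sparse: its holes take up no disk blocks, but a plain tar archive stores them as real zeros, so the extracted copy is fully allocated. Use the -S (--sparse) tar option when creating the archive.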

Example:

$ dd if=/dev/null seek=100 of=dummy
...
$ mkdir extracted

$ tar -zcf dummy.tgz dummy
$ tar -C extracted -zxf dummy.tgz
$ du -sh dummy extracted/dummy
0       dummy
52K     extracted/dummy

$ tar -S -zcf dummy.tgz dummy
$ tar -C extracted -zxf dummy.tgz
$ du -sh dummy extracted/dummy
0       dummy
0       extracted/dummy
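
To confirm that a size difference like the one in the question comes from sparseness rather than from different contents, compare the apparent size (bytes in the file) with the allocated size (blocks on disk). A minimal sketch, assuming GNU stat; sparse_report is a hypothetical helper name, not a standard tool:

```shell
# Hypothetical helper: print apparent vs. allocated size for each file.
# Assumes GNU stat (coreutils): %s = file size in bytes,
# %b = number of allocated blocks, %B = size in bytes of each such block.
sparse_report() {
    for f in "$@"; do
        apparent=$(stat -c '%s' -- "$f")
        allocated=$(( $(stat -c '%b' -- "$f") * $(stat -c '%B' -- "$f") ))
        printf '%s\tapparent=%s\tallocated=%s\n' "$f" "$apparent" "$allocated"
    done
}
```

A file whose allocated size is well below its apparent size has holes. Running du --apparent-size -h next to plain du -h on the same file shows the same thing.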

You can also "re-sparse" a file afterwards with cp --sparse=always:

$ dd if=/dev/zero of=junk count=100
...
$ du -sh junk
52K     junk
$ cp --sparse=always junk junk.sparse && mv junk.sparse junk
$ du -sh junk
0       junk

@mosvy points out that your files were probably sparse. Re-doing the archive + extract with tar --sparse works, or you can make existing files in the filesystem sparse again using fallocate -d (from util-linux), which punches holes in them in place.

shopt -s globstar    # bash: needed for ** to recurse (zsh does this by default)
for f in **/*some*pattern*; do
    fallocate --dig-holes "$f"
done

The man page describes this option as:

"You can think of this option as doing a cp --sparse and then renaming the destination file to the original, without the need for extra disk space."
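
For shells without a recursive ** glob, the same operation can be sketched with find. This is only an illustration: dig_holes_tree is a hypothetical name, and the directory and pattern arguments are placeholders. fallocate takes a single file name, so the command is run once per file:

```shell
# Hypothetical helper: dig holes in every matching regular file under a directory.
# $1 = directory to search, $2 = file-name pattern as understood by find -name.
dig_holes_tree() {
    find "$1" -type f -name "$2" -exec fallocate --dig-holes {} \;
}
```

For example, dig_holes_tree . '*some*pattern*' covers the same files as the loop above.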


Linux supports the fallocate(2) system call, which allows cool stuff like this, including closing up or inserting block-aligned holes in the middle of a file to shorten or grow it, instead of just turning a range into a hole. Each of the various fallocate features depends separately on support in the underlying filesystem, as do sparse files / extents in general.

It also lets you preallocate unwritten extents (like a hole, but with space reserved on disk), e.g. before a torrent download to avoid fragmentation. That's where the "allocate" in the name comes from.
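
The util-linux fallocate command exposes these modes directly. A rough sketch of preallocating and then punching a hole, assuming the filesystem supports both; prealloc_demo is a hypothetical name and the sizes are arbitrary:

```shell
# Hypothetical demo: reserve space for a file, then release part of it again.
# $1 = path of a scratch file to create.
prealloc_demo() {
    f=$1
    # Preallocate 1 MiB; the file size becomes 1 MiB without writing any data.
    fallocate --length 1MiB -- "$f" || return 1
    du -h -- "$f"
    # Deallocate the first 512 KiB; --punch-hole keeps the file size unchanged.
    fallocate --punch-hole --offset 0 --length 512KiB -- "$f" || return 1
    du -h -- "$f"
}
```

Where both calls succeed, the second du should report less disk usage than the first while the file size stays at 1 MiB. --collapse-range and --insert-range take the same --offset and --length arguments, but only some filesystems accept them.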

Other kernels that util-linux can run under may support some or all of this functionality; I don't know. If it doesn't work, cp --sparse=always and rename should work; sparse files in general (seeking instead of writing zeros) are well-established and widespread in Unix, dating back much further than preallocated extents, punching holes, or especially expanding or collapsing holes between existing data.
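
That fallback order can be sketched as a small function. This assumes GNU cp for the second route; make_sparse and the temporary file name are hypothetical:

```shell
# Hypothetical helper: make one file sparse, preferring the in-place route.
make_sparse() {
    f=$1
    # In-place route: needs util-linux fallocate and filesystem support.
    fallocate --dig-holes -- "$f" 2>/dev/null && return 0
    # Fallback: sparse copy, then rename over the original (needs GNU cp
    # and enough free space for the temporary copy).
    tmp="$f.sparse.$$"
    if cp --sparse=always --preserve=all -- "$f" "$tmp"; then
        mv -- "$tmp" "$f"
    else
        rm -f -- "$tmp"
        return 1
    fi
}
```

The fallback replaces the file with a new inode, so any other hard links to the original keep pointing at the non-sparse data.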