Why `--modify-window=1` when using `rsync` command?

Just to avoid any confusion about how modify_window works: it is checked in either direction. (If you want to read this in the source code, check util.c :: cmp_time().)

That means,

  • if A is newer than B, it checks if A is still newer than B + modify_window.
  • if B is newer than A, it checks if B is still newer than A + modify_window.

So let's say the original A has the time 123, but your backup filesystem is lousy so copy B ends up with either time 122 (making A newer than B), or time 124 (making B newer than A).

What happens with modify_window = 1?

  • If A (123) is newer than B (122), it checks if A (123) is still newer than B (122+1 = 123).
  • If B (124) is newer than A (123), it checks if B (124) is still newer than A (123+1 = 124).

In both cases it turns out to be identical, so modify_window = 1 is sufficient for the time to deviate by one second in either direction.
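
To make the arithmetic concrete, here is a minimal shell sketch of the comparison as described above. It is not rsync's actual code; the function name is borrowed from util.c, and the argument order and return values are my own assumptions for illustration. Times are whole seconds.

```shell
# Sketch of the two-sided window check described above (not rsync's code).
# Prints 0 if the times count as equal, 1 if a is newer, -1 if b is newer.
cmp_time() {
    a=$1; b=$2; window=$3
    if [ "$b" -gt "$a" ]; then
        # b is newer: is it still newer once the window is allowed for?
        if [ $((b - a)) -le "$window" ]; then printf '%s\n' 0; else printf '%s\n' -1; fi
    elif [ $((a - b)) -le "$window" ]; then
        printf '%s\n' 0
    else
        printf '%s\n' 1
    fi
}

cmp_time 123 122 1   # prints 0: A newer by 1s, inside the window
cmp_time 123 124 1   # prints 0: B newer by 1s, inside the window
cmp_time 123 125 1   # prints -1: B newer by 2s, outside the window
cmp_time 123 122 0   # prints 1: no window, A is newer
```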

According to the rsync manpage, this is supposed to be good enough(tm) for FAT32.

According to the documentation you cited (turning 122 into 124, what the heck), it's not good enough.

So this is inconclusive.


By experimentation, using NTFS(-3g) and FAT32 in Linux, modify_window = 1 seems to work fine.

My test setup was thus:

truncate -s 100M ntfs.img fat32.img
mkfs.ntfs -F ntfs.img
mkfs.vfat -F 32 fat32.img
mkdir -p /tmp/ntfs /tmp/fat32
mount -o loop ntfs.img /tmp/ntfs/
mount -o loop fat32.img /tmp/fat32/

So, a 100M NTFS/FAT32 filesystem.

Create a thousand files with a variety of timestamps:

cd /tmp/ntfs

for f in {000..999}
do
    sleep 0.0$RANDOM # widens the timestamp range
    touch "$f"
done

For example:

# stat --format=%n:%y 111 222 333
111:2018-08-10 20:19:10.011984300 +0200
222:2018-08-10 20:19:13.553878700 +0200
333:2018-08-10 20:19:17.765753000 +0200

According to you, 20:19:10.011 should come out as 2018-08-10 20:19:12.000.

So let's see what happens. First, copy all of these files over to FAT32.

# rsync -a /tmp/ntfs/ /tmp/fat32/

Then I noticed the timestamps are actually accurate, until you umount and re-mount:

# umount /tmp/fat32
# mount -o loop fat32.img /tmp/fat32

Compare:

# stat --format=%n:%y /tmp/{ntfs,fat32}/{111,222,333}
/tmp/ntfs/  111:2018-08-10 20:19:10.011984300 +0200
/tmp/fat32/ 111:2018-08-10 20:19:10.000000000 +0200
/tmp/ntfs/  222:2018-08-10 20:19:13.553878700 +0200
/tmp/fat32/ 222:2018-08-10 20:19:12.000000000 +0200
/tmp/ntfs/  333:2018-08-10 20:19:17.765753000 +0200
/tmp/fat32/ 333:2018-08-10 20:19:16.000000000 +0200

So to me this pretty much looks like the times got floored to the previous even second. I don't know if Windows would do it the same way, but this is what happens using Linux and rsync.

What rsync would do when copying again:

# rsync -av --dry-run /tmp/ntfs/ /tmp/fat32
sending incremental file list
./
000
001
002
035
036
...
963
964
997
998
999

So there are some gaps in the list but in general, it would re-copy quite a lot of files.

With --modify-window=1, the list is empty:

# rsync -av --dry-run --modify-window=1 /tmp/ntfs/ /tmp/fat32/
sending incremental file list
./

So, at least for Linux, the man page is accurate. The offset seems to be never larger than 1. (Well, one plus fraction, but that is ignored as well.)
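
If you want to check the offset on your own setup, something along these lines could do it. This is only a sketch: max_offset is a made-up helper name, it assumes GNU stat, and it only looks at whole seconds (which is all that matters once a modify window is in effect).

```shell
# Hypothetical helper: largest whole-second mtime difference between
# same-named files in two directories. Assumes GNU stat.
max_offset() {
    src=$1; dst=$2; max=0
    for f in "$src"/*; do
        name=${f##*/}
        if [ -e "$dst/$name" ]; then
            s=$(stat --format=%Y "$f")
            d=$(stat --format=%Y "$dst/$name")
            diff=$((s - d))
            # take the absolute value, the direction does not matter here
            if [ "$diff" -lt 0 ]; then diff=$((0 - diff)); fi
            if [ "$diff" -gt "$max" ]; then max=$diff; fi
        fi
    done
    printf '%s\n' "$max"
}
```

With the test setup above you would call it as max_offset /tmp/ntfs /tmp/fat32 after the re-mount. A result larger than 1 would be the experimental evidence for needing a bigger window.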


So, should you be using --modify-window=2 anyway? Not until you can show experimentally that this is actually a possible condition. Even then, it's hard to tell. This is an awful hack in the first place, and the larger the time window, the more likely it is that genuine modifications will be missed.

Even --modify-window=1 already ignores changes that can't be related to the way FAT32 timestamps get rounded. The window goes in both directions, but FAT32 only ever floors: when copying to FAT32, target files can only be older, and when copying from FAT32, target files can only be newer. rsync does not take this into account.

An option to handle this better does not seem to exist.
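
Purely as an illustration of what a stricter check could look like, here is a one-sided variant for the "copying to FAT32" direction. To be clear, this is hypothetical: it is not an rsync option, and the function name is made up.

```shell
# Hypothetical one-sided check, NOT an rsync feature.
# Copying TO FAT32: flooring can only make the destination older than the
# source, so only that direction is forgiven, by at most `window` seconds.
same_when_copying_to_fat() {
    src=$1; dst=$2; window=$3
    if [ "$dst" -le "$src" ] && [ $((src - dst)) -le "$window" ]; then
        printf 'same\n'
    else
        printf 'differs\n'
    fi
}

same_when_copying_to_fat 123 122 1   # prints same: explained by flooring
same_when_copying_to_fat 123 124 1   # prints differs: flooring cannot make dst newer
```

The two-sided window would treat both cases as unchanged; the second one is the kind of genuine modification it can miss.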


I also tried to track this behavior down in the kernel sources. Unfortunately, the comments (in linux/fs/fat/misc.c) don't give much to go on.

/*
 * The epoch of FAT timestamp is 1980.
 *     :  bits :     value
 * date:  0 -  4: day   (1 -  31)
 * date:  5 -  8: month (1 -  12)
 * date:  9 - 15: year  (0 - 127) from 1980
 * time:  0 -  4: sec   (0 -  29) 2sec counts
 * time:  5 - 10: min   (0 -  59)
 * time: 11 - 15: hour  (0 -  23)
 */

So according to this, a FAT timestamp uses 5 bits for seconds, so you get only 32 possible states, of which 30 are used. The conversion is done with a simple bit shift.

in fs/fat/misc.c :: fat_time_unix2fat()

    /* 0~59 -> 0~29(2sec counts) */
    tm.tm_sec >>= 1;

So 0 is 0, 1 is 0, 2 is 1, 3 is 1, 4 is 2, and so on...

in fs/fat/misc.c :: fat_time_fat2unix()

    second =  (time & 0x1f) << 1;

Reverse of the above, and the 0x1f is the bitmask to only grab bits 0-4 of the FAT time which represents 0-29 seconds.
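
To see the round trip in one place, here is a small shell sketch that packs and unpacks the 16-bit time word using the bit layout from the kernel comment above. The function names fat_pack and fat_unpack are mine, not the kernel's, and the date word is left out.

```shell
# Pack hour, minute, second into a 16-bit FAT time word (layout as in the
# kernel comment: hour in bits 11-15, minute in bits 5-10, 2-second count in bits 0-4).
fat_pack() {
    printf '%d\n' $(( ($1 << 11) | ($2 << 5) | ($3 >> 1) ))
}

# Unpack a FAT time word back into HH:MM:SS.
fat_unpack() {
    printf '%02d:%02d:%02d\n' \
        $(( ($1 >> 11) & 0x1f )) $(( ($1 >> 5) & 0x3f )) $(( ($1 & 0x1f) << 1 ))
}

fat_unpack "$(fat_pack 20 19 13)"   # prints 20:19:12, the odd second is lost
fat_unpack "$(fat_pack 20 19 10)"   # prints 20:19:10, even seconds survive
```

This matches the stat output earlier: 20:19:13.55 came back from FAT32 as 20:19:12.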

If this is any different than it should be, there is nothing about it in the comments that I could see.


An interesting post by Raymond Chen about why Windows would go to the trouble of rounding the times up: https://blogs.msdn.microsoft.com/oldnewthing/20140903-00/?p=83

"Okay, but why does the timestamp always increase to the nearest two-second interval? Why not round to the nearest two-second interval? That way, the timestamp change is at most one second.

Because rounding to the nearest interval means that the file might go backward in time, and that creates its own problems. (Causality can be such a drag.)"

According to this, the Windows xcopy tool has a /D flag which says "only copy source files if newer than destination file". Basically what rsync --update or cp --update would do.

Rounding the time down, making files seem to be created 1 second in the past, as it happens in Linux, would cause files to be copied all over again every time you run the command. Rounding time up fixes that.

OTOH the Windows solution just gives you the same headache when copying those files back. It would copy files that are made out to be newer than they really are, and then you have to be careful the roundup doesn't happen twice.
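
A small sketch of both rounding strategies under a "copy only if the source is newer" rule. The helper names are made up, times are non-negative whole seconds, and the round-up behavior is modeled on the description in the linked post rather than on anything I tested on Windows.

```shell
floor2() { printf '%d\n' $(( $1 / 2 * 2 )); }         # round down to an even second
ceil2()  { printf '%d\n' $(( ($1 + 1) / 2 * 2 )); }   # round up to an even second

# "copy only if source is newer than destination"
needs_copy() {
    if [ "$1" -gt "$2" ]; then printf 'copy\n'; else printf 'skip\n'; fi
}

orig=123
needs_copy "$orig" "$(floor2 "$orig")"   # prints copy: backup (122) looks older on every run
needs_copy "$orig" "$(ceil2 "$orig")"    # prints skip: backup (124) is not older
needs_copy "$(ceil2 "$orig")" "$orig"    # prints copy: copying back, backup (124) looks newer
```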

No matter what you do, it's always wrong. A filesystem that can't store timestamps properly is just a bother.
