How to retrieve the last modification date of all files in a git repository

Solution 1:

A simple approach is to iterate over every tracked file and print the author date of the last commit that touched it:

git ls-tree -r --name-only HEAD | while read filename; do
  echo "$(git log -1 --format="%ad" -- $filename) $filename"
done

This will yield output like so:

Fri Dec 23 19:01:01 2011 +0000 Config
Fri Dec 23 19:01:01 2011 +0000 Makefile

Since this is just a bash script at this point, you can customize it however you like.

Solution 2:

This approach also works with filenames that contain spaces:

git ls-files -z | xargs -0 -n1 -I{} -- git log -1 --format="%ai {}" {}

Example output:

2015-11-03 10:51:16 -0500 .gitignore
2016-03-30 11:50:05 -0400 .htaccess
2015-02-18 12:20:26 -0500 .travis.yml
2016-04-29 09:19:24 +0800 2016-01-13-Atlanta.md
2016-04-29 09:29:10 +0800 2016-03-03-Elmherst.md
2016-04-29 09:41:20 +0800 2016-03-03-Milford.md
2016-04-29 08:15:19 +0800 2016-03-06-Clayton.md
2016-04-29 01:20:01 +0800 2016-03-14-Richmond.md
2016-04-29 09:49:06 +0800 3/8/2016-Clayton.md
2015-08-26 16:19:56 -0400 404.htm
2016-03-31 11:54:19 -0400 _algorithms/acls-bradycardia-algorithm.htm
2015-12-23 17:03:51 -0500 _algorithms/acls-pulseless-arrest-algorithm-asystole.htm
2016-04-11 15:00:42 -0400 _algorithms/acls-pulseless-arrest-algorithm-pea.htm
2016-03-31 11:54:19 -0400 _algorithms/acls-secondary-survey.htm
2016-03-31 11:54:19 -0400 _algorithms/acls-suspected-stroke-algorithm.htm
2016-03-31 11:54:19 -0400 _algorithms/acls-tachycardia-algorithm-stable.htm
...

The output can be sorted by modification timestamp by adding | sort to the end:

git ls-files -z | xargs -0 -n1 -I{} -- git log -1 --format="%ai {}" {} | sort

Solution 3:

This is a small tweak of Andrew M.'s answer. (I was unable to comment on his answer.)

Wrap the first $filename in double quotes, in order to support filenames with embedded spaces.

git ls-tree -r --name-only HEAD | while read filename; do
    echo "$(git log -1 --format="%ad" -- "$filename") $filename"
done

Sample output:

Tue Jun 21 11:38:43 2016 -0600 subdir/this is a filename with spaces.txt

I appreciate that Andrew's solution (based on ls-tree) works with bare repositories! (This isn't true of solutions using ls-files.)


Solution 4:

Here's yet another answer:

git ls-tree -r --name-only HEAD -z | TZ=UTC xargs -0n1 -I_ git --no-pager log -1 --date=iso-local --format="%ad _" -- _

Changes to previously given answers:

  • Correctly handles spaces in filenames.
  • Uses ls-tree instead of ls-files and as such can be used with bare repositories.
  • Prints all times with zero offset (UTC) in an ISO 8601-like format. This allows correct sorting, even for times near daylight saving changes or commits from different timezones, by appending | sort to the command.
  • Doesn't spawn a subshell per file, which avoids some overhead.

Note that this doesn't correctly handle filenames containing a % character, because xargs substitutes the filename into the --format string. See below for a more elaborate command that correctly handles all characters in filenames.
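One way to sidestep the % problem is to keep the filename out of the --format string entirely and print it separately. The following is only a sketch, assuming bash (for read -d ''); the function name list_commit_dates is made up for this example, and it gives up the no-subshell property in exchange for robustness.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Git): print "<UTC author date><TAB><path>"
# for every file in HEAD. The path never enters the --format string,
# so characters such as % or spaces need no special treatment.
list_commit_dates() {
  git ls-tree -r --name-only -z HEAD |
    while IFS= read -r -d '' path; do
      # One history lookup per file, the same cost as the xargs variant.
      when=$(TZ=UTC git --no-pager log -1 --date=iso-local --format=%ad -- "$path")
      printf '%s\t%s\n' "$when" "$path"
    done
}
```

Run list_commit_dates from the top of the repository and append | sort as before. A filename that contains a newline would still be split across two output lines.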

Note that this command is still really slow because Git doesn't store the information we're looking for. Technically, it goes through every file, filters the whole project history for commits that changed that file, takes the latest such commit and prints its author timestamp. As a result, the displayed times match the last commit that changed each file. If the file had a different timestamp on disk when the original commit was made, that timestamp was never stored in the Git repository and cannot be restored without an external data source.
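Because the cost comes from walking the history once per file, a common workaround is to walk it once in total and keep the first (newest) date seen for each path. The sketch below illustrates that idea; it is not part of the original answer, the function names are invented, and its behaviour differs in a few ways: it also lists files that have since been deleted, it ignores changes that appear only in merge commits, and it prints unusual filenames in Git's quoted form.

```bash
#!/usr/bin/env bash
# Reads `git log --name-only` output in which every commit header line
# starts with a TAB, and prints "<date><TAB><path>" the first time each
# path appears, i.e. for the newest commit that lists it.
newest_per_path() {
  awk '
    /^\t/ { when = substr($0, 2); next }   # header line: remember its date
    /^$/  { next }                         # blank separator lines
    !($0 in seen) { seen[$0] = 1; print when "\t" $0 }
  '
}

# Single history walk. %x09 emits the TAB marker; a path line cannot start
# with a raw TAB in this output because Git quotes such names.
last_commit_dates() {
  TZ=UTC git --no-pager log --date=iso-local --format='%x09%ad' --name-only HEAD |
    newest_per_path
}
```

Calling last_commit_dates | sort inside a repository gives output comparable to the per-file commands above, usually much faster on large histories.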

If you want to set filesystem modification times to the author time of the last commit that changed each file, you can do something like this, which also deals with special characters in filenames (add | bash to automatically execute all emitted commands):

git ls-tree -r --name-only HEAD -z | TZ=UTC xargs -0n1 git --no-pager log -1 --date=iso-local --name-only -z --format="format:%ad" | perl -npe "INIT {\$/ = \"\\0\"} s@^(.*? .*?) .*?\n(.*)\$@\$date=\$1; \$name=\$2; \$name =~ s/'/'\"'\"'/sg; \"TZ=UTC touch -m --date '\$date' '\$name';\n\"@se"

Even though this is much more complex than the command above, its performance should be about the same, because the running time is dominated by finding the last modification time of each file rather than by setting it. Note that this converts times to UTC, uses NUL-separated filenames, and sets the timestamp of each file on the filesystem using the UTC timezone.
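If generating shell commands through perl feels fragile, the same idea can be sketched as a plain bash loop that calls touch directly. This is an illustrative alternative rather than the command above: restore_mtimes is a made-up name, it assumes GNU touch (for the -d "@<epoch>" form), and it has to run from the top of a non-bare working tree.

```bash
#!/usr/bin/env bash
# Hypothetical helper: set each tracked file's mtime to the author time of
# the last commit that changed it. Assumes GNU touch.
restore_mtimes() {
  git ls-tree -r --name-only -z HEAD |
    while IFS= read -r -d '' path; do
      # %at is the author date in seconds since the epoch, so no timezone
      # conversion is needed.
      epoch=$(git --no-pager log -1 --format=%at -- "$path")
      # Skip paths without history or missing from the working tree.
      if [ -n "$epoch" ] && [ -e "$path" ]; then
        touch -m -d "@$epoch" -- "$path"
      fi
    done
}
```

Use %ct instead of %at to apply committer time. Because filenames are passed as quoted arguments rather than pasted into generated shell code, quotes and % characters in names need no escaping.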

If the order of the output is not important, you can improve performance by adding -P $(nproc) to the xargs flags so that Git runs on all CPUs, making the command look like ...TZ=UTC xargs -0n1 -P $(nproc) git....

If you prefer committer date over author date, use %cd instead of %ad in the commands above.


Solution 5:

For those of us using Windows and PowerShell, here is Andrew M.'s answer with the date format changed to the machine-readable %ai:

git ls-tree -r --name-only HEAD | ForEach-Object { "$(git log -1 --format="%ai" -- "$_")`t$_" }

Example output:

2019-05-07 12:00:37 -0500   .editorconfig
2016-07-13 14:03:49 -0500   .gitattributes
2019-05-07 12:00:37 -0500   .gitignore
2018-02-03 22:01:17 -0600   .mailmap
