Argument list too long when copying files

cp *.prj ../prjshp/ is the right command, but you've hit a rare case where it runs into a size limitation. The second command you tried doesn't make any sense.

One method is to run cp on the files in chunks. The find command knows how to do this:

find . -maxdepth 1 -name '*.prj' -exec cp -t ../prjshp {} +
  • find traverses the current directory and the directories below it recursively.
  • -maxdepth 1 means to stop at a depth of 1, i.e. don't recurse into subdirectories.
  • -name '*.prj' means to only act on the files whose name matches the specified pattern. Note the quotes around the pattern: it will be interpreted by the find command, not by the shell.
  • -exec … {} + means to execute the specified command for all the files. It invokes the command multiple times if necessary, taking care not to exceed the command line limit.
  • cp -t ../prjshp copies the specified files into ../prjshp. The -t option is needed here because of a limitation of find's -exec … + syntax: the found files (represented by {}) are passed as the last arguments of the command, so you can't add the destination after them.
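
If you prefer xargs, an equivalent approach (a sketch, assuming GNU find and xargs for the null-delimited -print0/-0 handoff) is to let xargs batch the file names below the limit:

find . -maxdepth 1 -name '*.prj' -print0 | xargs -0 cp -t ../prjshp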

Another method is to use rsync.

rsync -r --include='*.prj' --exclude='*' . ../prjshp
  • rsync -r … . ../prjshp copies the current directory into ../prjshp recursively.
  • --include='*.prj' --exclude='*' means to copy files matching *.prj and exclude everything else (including subdirectories, so .prj files in subdirectories won't be found).
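
If you're unsure what those filters will match, rsync can preview the transfer first (assuming a reasonably recent rsync with -n/--dry-run and -v):

rsync -rnv --include='*.prj' --exclude='*' . ../prjshp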

This command copies the files one by one and will work even if there are too many of them for * to expand into a single cp command:

for i in *; do cp "$i" ../prjshp/; done

There are 3 key points to keep in mind when facing the Argument list too long error:

  • The length of command-line arguments is limited by the ARG_MAX variable, which by POSIX definition is the "[m]aximum length of argument to the exec functions including environment data" (emphasis added). That is, when the shell executes a non-built-in command, it has to call one of the exec() family of functions to spawn that command's process, and that's where ARG_MAX comes into play. Additionally, the name or path of the command itself (for example, /bin/echo) plays a role.

  • Shell built-in commands are executed by the shell itself, which means the shell doesn't use the exec() family of functions, so built-ins aren't affected by the ARG_MAX variable.

  • Certain commands, such as xargs and find, are aware of the ARG_MAX variable and repeatedly perform actions under that limit.

From the points above, and as shown in Kusalananda's excellent answer on a related question, the Argument list too long error can also occur when the environment is big. So, taking into consideration that each user's environment may vary, and that the argument size in bytes is what matters, it's hard to come up with a single number of files/arguments.
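
To see the actual numbers on your system, including how much your environment already consumes, you can query the limit directly (xargs --show-limits is GNU-specific):

# limit in bytes
$ getconf ARG_MAX
# GNU xargs reports the effective limits, including the size of the environment
$ xargs --show-limits < /dev/null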

How to handle this error?

The key is to focus not on the number of files, but on whether the command you're going to use involves the exec() family of functions and, tangentially, the stack space.

Use shell built-ins

As discussed before, the shell built-ins are immune to the ARG_MAX limit; things such as a for loop, a while loop, the built-in echo, and the built-in printf will all perform well enough.

for i in /path/to/dir/*; do cp "$i" /path/to/other/dir/; done

On a related question about deleting files, there was a solution like this:

printf '%s\0' *.jpg | xargs -0 rm --

Note that this uses the shell's built-in printf. If we call the external printf, that involves exec() and hence fails with a large number of arguments:

$ /usr/bin/printf "%s\0" {1..7000000} > /dev/null
bash: /usr/bin/printf: Argument list too long
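
For contrast, the same expansion handled by bash's built-in printf involves no exec() call, so it doesn't hit ARG_MAX (it's still slow for 7 million words, but it completes):

$ printf "%s\0" {1..7000000} > /dev/null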

bash arrays

According to an answer by jlliagre, bash doesn't impose limits on arrays, so building an array of filenames and processing it in slices per loop iteration works as well, as shown in danjpreron's answer:

files=( /path/to/old_dir/*.prj )
for((I=0;I<${#files[*]};I+=1000)); do 
    cp -t /path/to/new_dir/ "${files[@]:I:1000}" 
done

This, however, has the limitation of being bash-specific and non-POSIX.

Increase stack space

Sometimes people suggest increasing the stack space with ulimit -s <NUM>; on Linux, the ARG_MAX value is 1/4 of the stack size limit for each program, which means increasing the stack space proportionally increases the space for arguments.

# getconf reports value in bytes, ulimit -s in kilobytes
$ getconf ARG_MAX
2097152
$ echo $((  $(getconf ARG_MAX)*4 ))
8388608
$ printf "%dK\n" $(ulimit -s) | numfmt --from=iec --to=none
8388608
# Increasing stack space results in increased ARG_MAX value
$ ulimit -s 16384
$ getconf ARG_MAX
4194304

According to an answer by Franck Dernoncourt, which cites Linux Journal, one can also recompile the Linux kernel with a larger value for the maximum number of memory pages for arguments. However, that's more work than necessary and opens up potential for exploits, as stated in the cited Linux Journal article.

Avoid shell

Another way is to use python or python3, which come with Ubuntu by default. The python + here-doc example below is something I personally used to copy a large directory of files, somewhere in the range of 40,000 items:

$ python <<EOF
> import shutil
> import os
> for f in os.listdir('.'):
>     if os.path.isfile(f):
>         shutil.copy(f, './newdir/')
> EOF

For recursive traversals, you can use os.walk.

See also:

  • What defines the maximum size for a command single argument?