How can I tell which user limit I am running into?

nproc was the problem:

[root@localhost ~]# ps -eLf | grep pascal | wc -l
4068
[root@localhost ~]# cat /etc/security/limits.d/20-nproc.conf
# Default limit for number of user's processes to prevent
# accidental fork bombs.
# See rhbz #432903 for reasoning.

*          soft    nproc     4096
root       soft    nproc     unlimited
[root@localhost ~]#

man limits.conf states:

   Also, please note that all limit settings are set per login. They are
   not global, nor are they permanent; existing only for the duration of
   the session. One exception is the maxlogin option, this one is system
   wide. But there is a race, concurrent logins at the same time will not
   always be detected as such but only counted as one.

It appears that nproc is set per login session but checked against a count that is global to the UID. A login with nproc 8192 and 5000 threads would have no problems, but a simultaneous login of the same UID with nproc 4096 and 50 threads could not create any more, because the global count (5050) is above its own nproc setting.

[root@localhost ~]# ps -eLf | grep pascal | grep google/chrome | wc -l
3792
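To see this per-session limit against the UID-wide count, something like the following sketch may help. It reads only /proc/<pid>/status and /proc/<pid>/limits; the function name nproc_by_process is my own, and it assumes a Linux /proc where other users' status files are readable (run it as root if in doubt).

```shell
#!/usr/bin/env bash
# For every process whose real UID is $1, print its PID, its thread count
# and its own soft "Max processes" limit, then the UID-wide thread total.
# nproc_by_process is a hypothetical helper name, not a standard tool.
nproc_by_process() {
    local uid=$1 total=0 dir key val rest a b owner threads soft
    for dir in /proc/[0-9]*; do
        owner='' threads=0 soft=unreadable
        # The first value on the "Uid:" line is the real UID.
        while read -r key val rest; do
            case $key in
                Uid:)     owner=$val ;;
                Threads:) threads=$val ;;
            esac
        done 2>/dev/null < "$dir/status" || continue   # process already gone
        [ "$owner" = "$uid" ] || continue
        # Third field of the "Max processes" line is the soft limit.
        while read -r a b val rest; do
            if [ "$a $b" = "Max processes" ]; then soft=$val; fi
        done 2>/dev/null < "$dir/limits" || true
        echo "pid=${dir#/proc/} threads=$threads soft_nproc=$soft"
        total=$((total + threads))
    done
    echo "total threads for uid $uid: $total"
}

nproc_by_process "$(id -ru)"
```

Processes started from different logins can show different soft_nproc values, while the total on the last line is the figure each of them is compared against.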

If you can't access the account at all, you'll have a hard time finding out what the problem is. Do check the system and application logs, though: with luck, some program will have left a clue there (especially for a failed login attempt).

If you can run programs to experiment, you can tell which limit has been reached by raising each limit in turn and seeing when the failing operation starts to work and when it still fails with EAGAIN. It's also possible to list the resources used for each value; I can't think of a utility that collects the data for all limits, but there may well be one.

Assuming that the problem is a kernel limit, those are listed in the setrlimit man page. The ones that apply per user ID are:

  • RLIMIT_MEMLOCK — size of unswappable memory. Shouldn't prevent logging in, very few programs request unswappable memory.
  • RLIMIT_MSGQUEUE — size of message queues. Shouldn't prevent logging in, very few programs use message queues.
  • RLIMIT_NPROC — maximum number of processes. This one absolutely will prevent logins if it's reached. Increasing the limit in /etc/security/limits.conf won't affect the existing sessions, but it will affect new processes, so if the system administrator increases the value there, the user will be able to log in.
  • RLIMIT_SIGPENDING — maximum number of pending signals. Shouldn't prevent logging in, very few programs use sigqueue to enqueue signals.
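The current values of these four limits for any process can be read from /proc/<pid>/limits, where they appear as "Max locked memory", "Max msgqueue size", "Max processes" and "Max pending signals". A minimal sketch, assuming a Linux /proc and grep; the helper name per_user_limits_of is made up for illustration:

```shell
#!/bin/sh
# Print the soft and hard values of the four per-UID limits for one process.
# Argument: a PID; defaults to the current shell.
# per_user_limits_of is a hypothetical helper name.
per_user_limits_of() {
    grep -E '^Max (locked memory|msgqueue size|processes|pending signals) ' \
        "/proc/${1:-$$}/limits"
}

per_user_limits_of
```

Pass the PID of one of the affected user's processes to see the limits that session is actually running with, which may differ from what limits.conf currently says.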

So the limit on processes is the most likely one. If you have access to a running shell, you can confirm by trying to run a program; the error should be pretty distinctive:

$ ls
bash: fork: retry: No child processes
bash: fork: retry: No child processes
bash: fork: retry: No child processes
bash: fork: retry: No child processes
bash: fork: Resource temporarily unavailable

You can print this limit with ulimit -u. If you have access to a shell running as the problematic user, and the user hasn't run any setuid program, you can count the threads that count against this limit with set /proc/*/task/*/cwd/.; echo $# (this counts the threads whose cwd link the user can read, which means that the user has full control over the process).
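Here is the same counting trick wrapped in a function, as a sketch. It uses only shell builtins, so it should keep working in a shell that can no longer fork. I restricted the glob to numeric directories because /proc/* appears to match /proc/self as well, which would count the shell's own threads twice; count_own_threads is a made-up name.

```shell
#!/bin/sh
# Count the threads whose cwd link the current user can follow.
# Only builtins (set, [, echo) are used, so no fork() is needed.
# count_own_threads is a hypothetical helper name.
count_own_threads() {
    set -- /proc/[0-9]*/task/[0-9]*/cwd/.
    # If nothing matched, the unexpanded pattern is the only argument.
    if [ -e "$1" ]; then echo "$#"; else echo 0; fi
}

count_own_threads
```

Run as root, this counts every thread on the system rather than one user's, so it is only meaningful from the affected user's own shell.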
