How can I tell which user limit I am running into?
nproc was the problem:
    [root@localhost ~]# ps -eLf | grep pascal | wc -l
    4068
    [root@localhost ~]# cat /etc/security/limits.d/20-nproc.conf
    # Default limit for number of user's processes to prevent
    # accidental fork bombs.
    # See rhbz #432903 for reasoning.

    *          soft    nproc     4096
    root       soft    nproc     unlimited
    [root@localhost ~]#
man limits.conf states:
Also, please note that all limit settings are set per login. They are not global, nor are they permanent; existing only for the duration of the session. One exception is the maxlogin option, this one is system wide. But there is a race, concurrent logins at the same time will not always be detected as such but only counted as one.
It appears to me that the nproc limit is set per login, but the count it is checked against is global to the UID. So a login with nproc 8192 and 5000 threads would have no problems, but a simultaneous login of the same UID with nproc 4096 and 50 threads would not be able to create more, because the global count (5050) is above its nproc setting.
    [root@localhost ~]# ps -eLf | grep pascal | grep google/chrome | wc -l
    3792
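To see which programs are using up a user's thread budget without guessing at a grep pattern, something like the following may help. It is only a sketch: it assumes a Linux /proc where /proc/PID/status is readable and carries the Name, Uid and Threads fields, and the function name threads_by_command is made up for this example.

```shell
# Sum the thread counts of all processes owned by one real UID,
# grouped by command name, largest first.
threads_by_command() {
    cat /proc/[0-9]*/status 2>/dev/null |
        awk -v uid="$1" '
            /^Name:/        { name = $0; sub(/^Name:[ \t]*/, "", name) }
            $1 == "Uid:"    { mine = ($2 == uid) }   # $2 is the real UID
            $1 == "Threads:" && mine { n[name] += $2 }
            END { for (c in n) print n[c], c }
        ' | sort -rn
}

# Show the five biggest consumers for the current user.
threads_by_command "$(id -ru)" | head -n 5
```

Run as root, you can pass any numeric UID, for example the output of id -u pascal. The total over all lines should be close to what ps -eLf reports for that user.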
If you can't access the account at all, you'll have a hard time finding out what the problem is. But do check system or application logs; hopefully some program will have left a clue there (especially for a failed login attempt).
If you can run programs to experiment, you can tell which limit has been reached by attempting to increase each limited value and seeing when it works and when the attempt fails with EAGAIN. It's also possible to list the resources used for each value; I can't think of a utility that collects the data for all limits but there may well be one.
Assuming that the problem is a kernel limit, those are listed in the setrlimit man page. The ones that apply per user ID are:
RLIMIT_MEMLOCK: size of unswappable memory. Shouldn't prevent logging in, very few programs request unswappable memory.
RLIMIT_MSGQUEUE: size of message queues. Shouldn't prevent logging in, very few programs use message queues.
RLIMIT_NPROC: maximum number of processes. This one absolutely will prevent logins if it's reached. Increasing the limit in /etc/security/limits.conf won't affect the existing sessions, but it will affect new processes, so if the system administrator increases the value there, the user will be able to log in.
RLIMIT_SIGPENDING: maximum number of pending signals. Shouldn't prevent logging in, very few programs use sigqueue to enqueue signals.
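To check what these four limits are currently set to for a given process, one option is to read /proc/PID/limits. This is a rough sketch that assumes a Linux kernel exposing that file; show_per_uid_limits is just a name picked for the example.

```shell
# Print the soft and hard values of the four per-UID limits for a process.
# Columns in /proc/PID/limits: limit name, soft limit, hard limit, units.
show_per_uid_limits() {
    grep -E '^Max (locked memory|msgqueue size|processes|pending signals) ' \
        "/proc/${1:-self}/limits"
}

# Limits of the current shell.
show_per_uid_limits "$$"
```

In bash the same soft values should also be available through the ulimit builtin, as ulimit -l, ulimit -q, ulimit -u and ulimit -i respectively.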
So the limit on processes is the most likely one. If you have access to a running shell, you can confirm by trying to run a program; the error should be pretty distinctive:
    $ ls
    bash: fork: retry: No child processes
    bash: fork: retry: No child processes
    bash: fork: retry: No child processes
    bash: fork: retry: No child processes
    bash: fork: Resource temporarily unavailable
You can print out this limit with ulimit -u. If you have access to a shell running as the problematic user, and the user hasn't run any setuid program, you can count the threads that count against this limit with set /proc/*/task/*/cwd/.; echo $# (this counts the kernel threads for which the user can read the cwd link, which means that the user has full control over the process).
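The two numbers can be put side by side as below. This is a sketch assuming bash; the function and variable names (count_own_threads, own_threads) are invented here. It sticks to shell builtins on purpose, so it should still work in a shell that can no longer fork.

```shell
# Count the threads whose cwd link the current user can traverse.
# Done inside a function so the caller's positional parameters survive;
# the result goes into a variable because $(...) would need a fork.
count_own_threads() {
    set -- /proc/*/task/*/cwd/.
    own_threads=$#
}

count_own_threads
echo "threads counted against nproc: $own_threads"
printf 'soft nproc limit of this shell: '
ulimit -u
```

If the first number is at or near the second, nproc is the limit being hit. As root the glob matches nearly every thread on the system, so the count is only meaningful when run as the affected user.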