Normal user is unable to log in to the system with the error "fork: Resource temporarily unavailable"
Environment
- Red Hat Enterprise Linux 6
Issue
- The su - <user> command fails with the error "Resource temporarily unavailable".
- A vmcore file was captured during this issue to determine the root cause.
# su - <user>
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: Resource temporarily unavailable
Resolution
- Increase the value of the "nproc" parameter for the affected user, or for all users, in /etc/security/limits.d/90-nproc.conf.
- Here is an example of the /etc/security/limits.d/90-nproc.conf file.
<user> - nproc 2048 <<<----[ Only for "<user>" user ]
* - nproc 2048 <<<----[ For all users ]
NOTE: The above is only an example snippet. Increase the nproc value from its current value, for example to double or triple, as requirements dictate.
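Before editing the file, it can help to check the limits currently in effect. The following is a minimal sketch and not part of the original solution: `ulimit -Su` and `ulimit -Hu` are the bash built-ins for the soft and hard process limits, while `show_nproc_limits` and `suggest_nproc_line` are hypothetical helper names introduced here for illustration.

```shell
#!/bin/bash
# Print the soft and hard nproc limits of the current shell (bash built-in).
show_nproc_limits() {
    echo "soft nproc: $(ulimit -Su)"
    echo "hard nproc: $(ulimit -Hu)"
}

# Hypothetical helper (not from the article): print a limits.d entry that
# multiplies the current nproc value by a factor (default 2).
# Arguments: <user or *> <current value> [factor]
suggest_nproc_line() {
    local user="$1" current="$2" factor="${3:-2}"
    printf '%s - nproc %d\n' "$user" "$((current * factor))"
}

# Example with the article's numbers: doubling 1024 gives 2048.
suggest_nproc_line '<user>' 1024 2
```

Limits from these files are applied by pam_limits when a session starts, so a changed value should take effect at the user's next login; processes that are already running keep their old limits.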
Root Cause
- The system was not able to create new process(es) because of the limit set for nproc in the /etc/security/limits.conf file.
- The process(es) started by user "test", with uid (702638), have reached the soft limit.
- The soft limit for the number of process(es) (nproc) is set to 1024 in the /etc/security/limits.conf file.
- A total of 1023 process(es) are running on this system with uid (702638).
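The arithmetic behind this root cause can be sketched as below. `nproc_headroom` is a hypothetical helper name, and the figures are the ones reported in this article. It only illustrates how little room is left and does not reproduce the kernel's own check.

```shell
#!/bin/bash
# Hypothetical helper: how many more tasks a user could create before
# reaching the soft nproc limit. Never prints a negative number.
# Arguments: <soft limit> <tasks currently running>
nproc_headroom() {
    local soft="$1" used="$2"
    local left=$((soft - used))
    if [ "$left" -lt 0 ]; then
        left=0
    fi
    echo "$left"
}

# Figures from this article: soft limit 1024, 1023 tasks already running.
nproc_headroom 1024 1023
```

With a single slot left, a plausible reading is that the new login shell takes that slot and every fork it then attempts fails with "Resource temporarily unavailable" (EAGAIN).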
Diagnostic Steps
- Determine the total number of process(es) on the system.
crash> ps | wc -l
1370 <<<-----[ Total number of process(es) on the system ]
- Determine the process name with the highest number of instances.
crash> ps | gawk '{count[$NF]++}END{for(j in count) print ""count[j]":",j}'|sort -rn|head -n20
1021: java <<<-----[ Total 1021 are "java" process(es) ]
64: console-kit-dae
57: klzagent
56: kloagent
7: [kdmflush]
7: [ext4-dio-unwrit]
7: avagent.bin
6: multipathd
6: mingetty
6: elxhbamgrd
5: .vasd
4: rscd
4: collect
4: automount
3: udevd
3: sshd
3: sh
3: rsyslogd
2: sleep
2: sendmail
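On a live system, where the crash utility is not involved, a similar per-command count could be sketched as follows. `count_by_name` is a hypothetical helper name. Feeding it from `ps -eL -o comm=` (one line per thread) is an assumption about what best matches the task list that crash prints.

```shell
#!/bin/bash
# Count occurrences of the last field (the command name) on each input line
# and print "<count>: <name>", highest count first.
count_by_name() {
    awk '{ count[$NF]++ } END { for (n in count) print count[n] ": " n }' |
        sort -rn
}

# Small demonstration with made-up input.
printf 'java\njava\njava\nsshd\n' | count_by_name | head -n 20

# On a live system the input could come from ps instead:
#   ps -eL -o comm= | count_by_name | head -n 20
```

Counting threads rather than only processes matters here because, on Linux, the nproc limit applies to the number of threads owned by the user.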
- Determine the "NPROC" limit of the process with the highest number of instances (i.e. "java").
crash> set 32724
PID: 32724
COMMAND: "java"
TASK: ffff8801298ce040 [THREAD_INFO: ffff8801030b2000]
CPU: 0
STATE: TASK_INTERRUPTIBLE
crash> ps -r 32724
PID: 32724 TASK: ffff8801298ce040 CPU: 0 COMMAND: "java"
RLIMIT CURRENT MAXIMUM
CPU (unlimited) (unlimited)
FSIZE (unlimited) (unlimited)
DATA (unlimited) (unlimited)
STACK 10485760 (unlimited)
CORE 0 (unlimited)
RSS (unlimited) (unlimited)
NPROC 1024 30527
NOFILE 8192 8192
MEMLOCK 65536 65536
AS (unlimited) (unlimited)
LOCKS (unlimited) (unlimited)
SIGPENDING 30527 30527
MSGQUEUE 819200 819200
NICE 0 0
RTPRIO 0 0
RTTIME (unlimited) (unlimited)
crash> ps -r | grep -e 'NPROC 1024' -B 8 | grep -e PID -e NPROC
PID: 334 TASK: ffff88012c8a4080 CPU: 0 COMMAND: "java"
NPROC 1024 30527
PID: 335 TASK: ffff880101e1c040 CPU: 0 COMMAND: "java"
NPROC 1024 30527
PID: 336 TASK: ffff88010a390aa0 CPU: 0 COMMAND: "java"
NPROC 1024 30527
PID: 453 TASK: ffff8801298e7540 CPU: 0 COMMAND: "java"
NPROC 1024 30527
[..]
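When the system is still running, the same per-process limit can be read from /proc/<pid>/limits, where nproc appears as "Max processes". A minimal sketch follows. `soft_nproc_from_limits` is a hypothetical helper name, and the sample line only mimics the file's layout using the values seen in the vmcore.

```shell
#!/bin/bash
# Print the soft "Max processes" (nproc) value from /proc/<pid>/limits-style
# text read on standard input. The third field on that line is the soft limit.
soft_nproc_from_limits() {
    awk '/^Max processes/ { print $3 }'
}

# Demonstration with a line shaped like the kernel's output
# (soft 1024, hard 30527, as in the vmcore above).
printf 'Max processes             1024                 30527                processes\n' |
    soft_nproc_from_limits

# On a live system (<pid> is a placeholder):
#   soft_nproc_from_limits < /proc/<pid>/limits
```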
- Determine the "UID" and "GID" of the process with the highest number of instances (i.e. "java").
crash> set 32724
PID: 32724
COMMAND: "java"
TASK: ffff8801298ce040 [THREAD_INFO: ffff8801030b2000]
CPU: 0
STATE: TASK_INTERRUPTIBLE
crash> task_struct.real_cred ffff8801298ce040
real_cred = 0xffff880139eeb300
crash> cred.uid 0xffff880139eeb300
uid = 702638 <<<----[ User ID ]
crash> cred.gid 0xffff880139eeb300
gid = 626431 <<<----[ Group ID ]
- The "java" process(es) are running with uid (702638) and gid (626431).
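Outside of a vmcore, the real UID and GID of a running process are exposed in /proc/<pid>/status. The sketch below assumes that file's usual layout, in which the first number on the "Uid:" and "Gid:" lines is the real ID. `real_ids_from_status` is a hypothetical helper name.

```shell
#!/bin/bash
# Print "<real uid> <real gid>" from /proc/<pid>/status-style text on stdin.
# The first number after "Uid:" / "Gid:" is the real ID.
real_ids_from_status() {
    awk '/^Uid:/ { uid = $2 } /^Gid:/ { gid = $2 } END { print uid, gid }'
}

# Demonstration using the IDs read from the cred structure above.
printf 'Uid:\t702638\t702638\t702638\t702638\nGid:\t626431\t626431\t626431\t626431\n' |
    real_ids_from_status

# On a live system (<pid> is a placeholder):
#   real_ids_from_status < /proc/<pid>/status
```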
- Determine the total number of process(es) running with uid (702638).
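On a live system, as opposed to a vmcore, this count could be sketched as follows. It is not the crash command used in the original analysis. `count_tasks_for_uid` is a hypothetical helper name, and it assumes its input holds one real UID per line, one line per task, such as the output of `ps -eL -o ruid=`.

```shell
#!/bin/bash
# Count how many input lines carry the given UID.
# stdin: one numeric real UID per line (one line per task).
# Argument: <uid>
count_tasks_for_uid() {
    awk -v uid="$1" '$1 == uid { n++ } END { print n + 0 }'
}

# Demonstration with made-up input: two of the three tasks belong to 702638.
printf '702638\n0\n702638\n' | count_tasks_for_uid 702638

# On a live system:
#   ps -eL -o ruid= | count_tasks_for_uid 702638
```

Comparing the result with the user's soft nproc limit shows how close the user is to being unable to fork.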