pam_tally2 process consuming high cpu on RHEL5

Solution In Progress - Updated -

Environment

Red Hat Enterprise Linux 5.9
pam-0.99.6.2-12.el5

Issue

The pam_tally2 process is using 80% cpu.

Resolution

The tallylog file format is used by a few utilities, and changing it will result in wide spread breakage. Bugs have been filed about this issue, but the consensus is that this file format is not going to change in RHEL6 or RHEL7 either.

Having said that, it is possible to workaround this issue as the pam_tally2 command is capable of seeking to any individual user's data without having to read the whole file, using pam_tally2's -u option. A simple shell script that avoids this problem, yet emulates the default pam_tally2 behaviour, using the pam_tally2's -u option is as follows:

#!/bin/bash

# root gets special treatment - just to print the header
pam_tally2 -u root

for i in  `getent passwd|cut -f 1 -d :|grep -v root &&          # local users
           ipa user-find|grep login | cut -f 5 -d " " `         # IPA users
do
  pam_tally2 -u $i | grep -v Login
done 

Root Cause

The tallylog file, used by pam_tally2 is a sparse file. It uses an offset of the uid of the user which makes it possible for the file to get very ‘large’:

root@nykpsr3752 PROD # ls -lh /var/log/tallylog
-rw------- 1 root root 5.4G Oct  4 12:29 /var/log/tallylog

root@host # du -sh /var/log/tallylog
40K /var/log/tallylog

A strace shows the amount of time spent in reading the tallylog:

root@host # strace -c pam_tally2
Login           Failures Latest failure     From
user1            1    07/21/12 22:54:36  unknown
user2            1    10/03/12 15:01:15  unknown
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00    0.013377           0   1406423           read
  0.00    0.000000           0         3           write
  0.00    0.000000           0        12           open
  0.00    0.000000           0        14           close
  0.00    0.000000           0         6           stat
  0.00    0.000000           0        14           fstat
  0.00    0.000000           0         1           lseek
  0.00    0.000000           0        23           mmap
  0.00    0.000000           0         9           mprotect
  0.00    0.000000           0         7           munmap
  0.00    0.000000           0         3           brk
  0.00    0.000000           0         1         1 access
  0.00    0.000000           0         2           socket
  0.00    0.000000           0         2         2 connect
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         6           fcntl
  0.00    0.000000           0         1           umask
  0.00    0.000000           0         1           arch_prctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.013377               1406529         3 total

This stems from the struct that faillog uses to update /var/log/faillog with failed user attempts:


The file is fixed length record, indexed by numerical UID.

The problem is that because IPA has large uidnumbers, a few failed logins for someone with uid 23983838 builds a large file which takes a significant amount of time to parse:

$ ls -lh /var/log/faillog
-rw------- 1 root root 15G Mar 12 17:28 /var/log/faillog
$ du -sh /var/log/faillog
36K     /var/log/faillog
  • Component
  • pam

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.

Close

Welcome! Check out the Getting Started with Red Hat page for quick tours and guides for common tasks.