Why is the `STIME` field broken when /proc/stat contains more than the hardcoded BUFFSIZE (64*1024) bytes?

Solution Verified - Updated -

Environment

  • Red Hat Enterprise Linux (RHEL) 6
  • procps-3.2.8-23.el6.x86_64

Issue

  • When /proc/stat is larger than BUFFSIZE bytes, ps prints the year 1970 in the STIME column.

  • For example, run the following (the -f option adds the STIME field):

# ps -ft pts/290
UID         PID   PPID  C STIME TTY          TIME CMD
root     548150 548147  0  1970 pts/290  00:00:00 -bash
root     548182 548150  0  1970 pts/290  00:00:01 ps -ft pts/290
#

# wc -c /proc/stat
80878 /proc/stat

(/proc/stat file is attached)

  • The expected result is that STIME contains the correct start time of each process.

Resolution

Update to procps-3.2.8-45.el6_9.1 shipped with Advisory RHBA-2017:1726 or newer.

Root Cause

Previously, when reading the /proc/stat file, a buffer overflow could occur. As a consequence, the starting time of a process (STIME) value was computed incorrectly.

Diagnostic Steps

  • The problem can also be worked around by increasing BUFFSIZE to at least 80K, but the size could theoretically grow to ~192647 bytes (these values are estimates and are not guaranteed to be exact). To see how large /proc/stat can really get, look at the kernel code:
fs/proc/stat.c:
static int show_stat(struct seq_file *p, void *v)
{
...
        seq_printf(p, "cpu  %llu %llu %llu %llu %llu %llu %llu %llu %llu\n"
...
        for_each_online_cpu(i) {
...
                        "cpu%d %llu %llu %llu %llu %llu %llu %llu %llu %llu\n",
...
        }
        seq_printf(p, "intr %llu", (unsigned long long)sum);
...
        for_each_irq_nr(j)
                seq_printf(p, " %u", kstat_irqs(j));
...
        seq_printf(p,
                "\nctxt %llu\n"
                "btime %lu\n"
                "processes %lu\n"
                "procs_running %lu\n"
                "procs_blocked %lu\n",
...
        seq_printf(p, "softirq %llu", (unsigned long long)sum_softirq);

        for (i = 0; i < NR_SOFTIRQS; i++)
                seq_printf(p, " %u", per_softirq_sums[i]);
        seq_printf(p, "\n");
  • With 256 cores, the contents should fit comfortably into the 64 KB buffer, as the output could theoretically contain at most ~55559 bytes:
strlen("cpu  ") + 9 * 21 (18,446,744,073,709,551,615) + 8 * " " + strlen("\n") = 5 + 189 + 9 = 203
256 * (strlen("cpu256") + 9 * 21 + 8 * " " + strlen("\n")) = 256 * (6 + 189 + 9) = 256 * 204 = 52224
strlen("intr ") + 21 + strlen("\n") = 27
255 * (strlen(" ") + 10) = 2805
strlen("\n") = 1
strlen("ctxt ") + 21 + strlen("\n") = 27
strlen("btime ") + 21 + strlen("\n") = 28
strlen("processes ") + 21 + strlen("\n") = 32
strlen("procs_running ") + 21 + strlen("\n") = 36
strlen("procs_blocked ") + 21 + strlen("\n") = 36
strlen("softirq ") + 21 = 29
10 * (strlen(" ") + 10) = 110
strlen("\n") = 1
  • With 928 cores, however, the output could theoretically be as long as ~192647 bytes.
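The two estimates can be reproduced with a short calculation. The following C sketch is illustrative only: the function name `proc_stat_worst_case` and the width constants are assumptions made for this example, and the code simply mirrors the per-line budget used in the text (21 characters per %llu/%lu value, 10 characters per %u value, a 6-character per-CPU label). It is not taken from procps or the kernel.

```c
#include <string.h>

#define FIELD_WIDTH 21 /* characters budgeted per %llu/%lu value, as in the text */
#define IRQ_WIDTH   10 /* characters budgeted per %u value, as in the text */

/*
 * Hypothetical helper: upper-bound estimate of the size of /proc/stat in bytes
 * for a given number of CPUs, IRQ counters and softirq counters.
 */
unsigned long proc_stat_worst_case(unsigned long ncpus,
                                   unsigned long nirqs,
                                   unsigned long nsoftirqs)
{
    unsigned long total = 0;

    /* Aggregate "cpu  " line: 9 counters, 8 separating spaces, newline. */
    total += strlen("cpu  ") + 9 * FIELD_WIDTH + 8 + 1;
    /* One line per CPU; the label is budgeted at 6 characters ("cpu256"). */
    total += ncpus * (strlen("cpu256") + 9 * FIELD_WIDTH + 8 + 1);
    /* "intr" total, then one " %u" entry per IRQ, then the newline. */
    total += strlen("intr ") + FIELD_WIDTH + 1;
    total += nirqs * (1 + IRQ_WIDTH);
    total += 1;
    /* Fixed single-value lines. */
    total += strlen("ctxt ") + FIELD_WIDTH + 1;
    total += strlen("btime ") + FIELD_WIDTH + 1;
    total += strlen("processes ") + FIELD_WIDTH + 1;
    total += strlen("procs_running ") + FIELD_WIDTH + 1;
    total += strlen("procs_blocked ") + FIELD_WIDTH + 1;
    /* "softirq" total, then one " %u" entry per softirq, then the newline. */
    total += strlen("softirq ") + FIELD_WIDTH;
    total += nsoftirqs * (1 + IRQ_WIDTH);
    total += 1;

    return total;
}
```

Under these assumptions, `proc_stat_worst_case(256, 255, 10)` gives 55559 and `proc_stat_worst_case(928, 255, 10)` gives 192647, matching the figures in the text. The first value is below 64*1024 = 65536 and the second is well above it.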
  • Notes from the investigation:
    The code that calculates the STIME column is the following (see line #933):
procps-3.2.8/ps/output.c:
 914 /* Unix98 specifies a STIME header for a column that shows the start
 915  * time of the process, but does not specify a format or format specifier.
 916  * From the general Unix98 rules, we know there must not be any spaces.
 917  * Most systems violate that rule, though the Solaris documentation
 918  * claims to print the column without spaces. (NOT!)
 919  *
 920  * So this isn't broken, but could be renamed to u98_std_stime,
 921  * as long as it still shows as STIME when using the -f option.
 922  */
 923 static int pr_stime(char *restrict const outbuf, const proc_t *restrict const pp){
 924   struct tm *proc_time;
 925   struct tm *our_time;
 926   time_t t;
 927   const char *fmt;
 928   int tm_year; 
 929   int tm_yday; 
 930   our_time = localtime(&seconds_since_1970);   /* not reentrant */
 931   tm_year = our_time->tm_year;
 932   tm_yday = our_time->tm_yday;
 933   t = time_of_boot + pp->start_time / Hertz;
 934   proc_time = localtime(&t); /* not reentrant, this corrupts our_time */
 935   fmt = "%H:%M";                                   /* 03:02 23:59 */
 936   if(tm_yday != proc_time->tm_yday) fmt = "%b%d";  /* Jun06 Aug27 */
 937   if(tm_year != proc_time->tm_year) fmt = "%Y";    /* 1991 2001 */
 938   return strftime(outbuf, 42, fmt, proc_time);
 939 }
  • The variables that contribute to the wrong value are 'time_of_boot' and 'pp->start_time' at line #933. Looking at how they are initialized reveals the failing code: because the buffer is too small, the "btime" line is never found, so time_of_boot is never set and keeps its initial value of -1, and the STIME column therefore shows the year 1970:
C symbol: time_of_boot

  File                     Function     Line
...
1 procps-3.2.8/ps/global.c <global>      73 unsigned long time_of_boot = -1;
2 procps-3.2.8/ps/global.c reset_global 376 sscanf(b, "btime %lu", &time_of_boot);
...

358 /************ Call this to reinitialize everything ***************/
359 void reset_global(void){
360   static proc_t p;
361   reset_selection_list();
362   look_up_our_self(&p);
363   set_screen_size();
364   set_personality();
365   int fd;
366   char *buf[BUFFSIZE];
367   const char *b;
368 
369   /* get boot time from /proc/stat */
370   fd = open("/proc/stat", O_RDONLY, 0);
371   if (fd != -1) {
372     buf[BUFFSIZE-1] = 0;
373     read(fd, buf, BUFFSIZE-1);
374     b = strstr(buf, "btime ");
375     if (b) {
376       sscanf(b, "btime %lu", &time_of_boot);

The value 'pp->start_time' is certainly correct, because it is obtained from the per-process /proc//stat file, which is a one-line file and does not suffer from this problem.
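
To illustrate why an unset time_of_boot ends up as the year 1970, here is a small C sketch of the arithmetic on line #933. The helper `stime_year` is hypothetical. It uses gmtime() instead of the localtime() call in ps so that the result does not depend on the local time zone, and it assumes that the sum wraps around in unsigned long arithmetic, as it would on x86_64.

```c
#include <time.h>

/*
 * Hypothetical helper mirroring "t = time_of_boot + pp->start_time / Hertz".
 * Returns the calendar year (UTC) of the computed start time, or -1 on error.
 * The caller must pass a non-zero hertz value.
 */
int stime_year(unsigned long time_of_boot,
               unsigned long long start_time, /* clock ticks since boot */
               unsigned long long hertz)
{
    /* Unsigned addition wraps: (unsigned long)-1 + n yields n - 1. */
    unsigned long sum = time_of_boot + (unsigned long)(start_time / hertz);
    time_t t = (time_t)sum;
    const struct tm *tm = gmtime(&t); /* not reentrant, like localtime() */

    return tm ? tm->tm_year + 1900 : -1;
}
```

With time_of_boot left at (unsigned long)-1 and a process started one hour after boot (360000 ticks at 100 Hz), the sum wraps to 3599 seconds after the epoch, so `stime_year` returns 1970. With a plausible boot time such as 1500000000, the same call returns 2017.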

Note: Plenty of other places in the code suffer from this problem, e.g. getstat().
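
One way to avoid depending on a fixed-size buffer is to scan the file in small chunks and look for "btime" only at the start of a line. The following C sketch shows that idea. `read_btime` is a hypothetical helper written for illustration; it is not the change shipped in the fixed procps package.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: find the "btime" line in a /proc/stat-style file of
 * any size. Returns 0 and stores the value in *btime_out on success,
 * -1 if the file cannot be opened or no "btime" line is found.
 */
int read_btime(const char *path, unsigned long *btime_out)
{
    char chunk[256];
    int at_line_start = 1;
    int result = -1;
    FILE *fp = fopen(path, "r");

    if (!fp)
        return -1;

    while (fgets(chunk, sizeof chunk, fp)) {
        /* Only parse chunks that begin a line; long lines such as "intr"
         * arrive as several chunks and the continuations are skipped. */
        if (at_line_start && sscanf(chunk, "btime %lu", btime_out) == 1) {
            result = 0;
            break;
        }
        /* A chunk ending in '\n' means the next chunk starts a new line. */
        at_line_start = strchr(chunk, '\n') != NULL;
    }

    fclose(fp);
    return result;
}
```

Calling `read_btime("/proc/stat", &value)` would then return the boot time regardless of how long the preceding "cpu" and "intr" lines are, because no single read has to hold the whole file.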

Attachments

