Profiling Tool Recommendations?

So, our security group is requiring that all of our RHEL-based servers be equipped with on-access A/V software. Our enterprise selected McAfee, so that's what's now being rolled out to our Linux servers. It's been an interesting experience. Network file transfers that take 10s for 100MB of data with NAI disabled take a hair under a minute-thirty to load onto a "protected" system. Installation of that 100MB compressed software package takes a little less than two minutes when NAI is disabled and a shade under fifteen minutes when it is enabled.

Unfortunately, simply providing anecdotal data such as this isn't sufficient for getting our security folks to see if there may be something wrong with McAfee or if McAfee may simply be the wrong solution for our systems. Does anyone have a recommendation for memory and/or CPU profiling tools to create good, concise reports to turn over to our decision makers? Granted, I could probably use the tools in the sysstat package to generate data to dump into Excel. I just wanted to know if there were any (free) tools that might either do more of the legwork for me or allow me to do more fine-grained resource profiling of just the NAI activity.

Responses

RHEL servers ship configured with sar, which provides CPU, memory and network performance data. Some people prefer to use other tools such as collectl.
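For getting sar data into a spreadsheet, the sysstat package also includes sadf, whose -d option prints records as semicolon-separated text. The sketch below wraps that in a small function; the name sar_cpu_csv is made up for illustration, and the /var/log/sa/saDD path mentioned in the comment is the usual RHEL location but is an assumption about your setup.

```shell
#!/bin/bash
# sar_cpu_csv (hypothetical helper name): dump the CPU records from one
# sysstat daily data file as comma-separated text.
# $1 = data file, e.g. /var/log/sa/sa15 (usual RHEL location; assumed here)
sar_cpu_csv() {
   # sadf -d prints semicolon-separated records; everything after "--"
   # is handed to sar, so "-u" selects CPU utilization.
   # LC_ALL=C keeps decimal points as "." so they cannot clash with the commas.
   LC_ALL=C sadf -d "$1" -- -u |
      sed -e 's/^# //' -e 's/;/,/g'
}
```

Something like `sar_cpu_csv /var/log/sa/sa15 > cpu.csv` should then produce a file Excel can open directly. Swapping -u for -r or -n DEV would pull memory or network records instead.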

In RHEL 6, to help isolate where the kernel is spending most of its time, it is useful to install perf, and run perf top.
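If an interactive perf top session is awkward to hand to someone else, a recorded profile may be easier to share. This is only a sketch: profile_system is a made-up name, it normally needs root, and the --stdio and --sort options are assumed to be available in the perf build you have installed.

```shell
#!/bin/bash
# profile_system (hypothetical helper name): sample every CPU for a fixed
# window, then print a text report grouped by command and binary.
# $1 = seconds to sample (default 30), $2 = output file (default perf.data)
profile_system() {
   local secs="${1:-30}" out="${2:-perf.data}"
   # -a: all CPUs, -g: record call graphs; "sleep" only bounds the window
   perf record -a -g -o "$out" -- sleep "$secs" &&
      perf report -i "$out" --sort comm,dso --stdio
}
```

Running `profile_system 60 > report.txt` while the slow transfer or install is in progress should show which commands, the A/V daemons included, account for the samples.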

I recommend nmon. It isn't a CPU/IO/MEM/kernel profiler; it's geared more toward overall performance monitoring, and you can export its data to spreadsheets with nice graphics =D.

NMON- http://nmon.sourceforge.net/pmwiki.php

NMON Analyzer (will generate .xls) - https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Power%20Systems/page/nmon_analyser

And as Marc Milgram mentioned, use "perf", which is native on Red Hat (I think it's native on systems with kernel 2.6+), for real profiling.

PERF - https://perf.wiki.kernel.org/index.php/Tutorial

The `sar` utility is great for getting overall metrics, but doesn't normally offer the level of granularity I was looking to capture.

I'll look at the other utilities, though.

I'll have to give nmon a looksee.

Thanks

Needed to get numbers before I could look at any of the other utilities. Ended up writing a basic `ps`-based script:

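#!/bin/bash
# Sample the combined CPU and memory use of the McAfee processes once per second.

while true
do
   STAMP=$(date "+%Y%m%d-%H:%M:%S")
   # Comma-separated PID list, the form "ps -p" expects
   PROCS=$(pgrep -d, 'scanner|cma|nailsd|nailslogd|logepo')

   # Skip the sample if none of the processes are running
   if [ -n "${PROCS}" ]
   then
      PCTCPU=$(ps --no-headers -o pcpu -p "${PROCS}" | awk '{ sum += $1 } END { print sum }')
      # rss is reported in KiB
      TOTMEM=$(ps --no-headers -o rss -p "${PROCS}" | awk '{ sum += $1 } END { print sum }')

      printf "%s - CPU Use: %5s%% ; Memory Used: %s KiB\n" "${STAMP}" "${PCTCPU}" "${TOTMEM}"
   fi

   sleep 1
done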

Got me enough to begin the fight with - hopefully one of the other tools will give me prettier ammo. As it is, the above script showed that, on a multi-core server with 32GB of RAM, McAfee was chewing up 42% of my CPU when I attempted to SFTP a 150MB archive-file (that contained a bunch of JARs nested inside a GZIP archive) to the system.
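For a report, the per-second lines from the script above can be boiled down to a few headline numbers. The following is a minimal sketch, assuming the script's output was redirected to a file; summarize_usage and usage.log are made-up names, and the field positions are taken from the script's printf format.

```shell
#!/bin/bash
# summarize_usage (hypothetical helper name): read the sampler's output on
# stdin and print sample count, mean CPU %, peak CPU % and peak RSS.
summarize_usage() {
   awk '
      # Fields: 1=timestamp 2="-" 3="CPU" 4="Use:" 5=<pct>% 6=";"
      #         7="Memory" 8="Used:" 9=<rss as printed by ps, in KiB>
      /CPU Use:/ {
         cpu = $5; sub(/%$/, "", cpu); cpu += 0
         mem = $9 + 0
         total += cpu; n++
         if (cpu > maxcpu) maxcpu = cpu
         if (mem > maxmem) maxmem = mem
      }
      END {
         if (n == 0) { print "no samples"; exit 1 }
         printf "samples=%d avg_cpu=%.1f%% peak_cpu=%.1f%% peak_rss_kib=%d\n",
                n, total / n, maxcpu, maxmem
      }'
}
```

Usage would be along the lines of `summarize_usage < usage.log`. One caveat worth noting for the decision makers: ps computes pcpu as CPU time over the lifetime of each process, so for long-running daemons these figures are smoothed averages, not instantaneous load.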

Brutal.

Let us know how you find it, Tom.