High device bandwidth utilization

Hi All,
Please have a look at the output below of the same command run on two different servers with the same configuration.

iostat -x sda
Linux 2.6.32-220.el6.x86_64 (hostname 1) 04/01/2014 x86_64 (8 CPU)

avg-cpu:   %user   %nice %system %iowait  %steal   %idle
            0.07    0.00    0.07    0.80    0.00   99.06

Device:   rrqm/s  wrqm/s     r/s     w/s  rsec/s  wsec/s avgrq-sz avgqu-sz   await   svctm   %util
sda         0.94    0.43    3.84    0.41  109.91    6.75    27.43     0.21   48.63   13.08    5.56

iostat -x sda
Linux 2.6.32-220.el6.x86_64 (hostname 2) 04/01/2014 x86_64 (8 CPU)

avg-cpu:   %user   %nice %system %iowait  %steal   %idle
            1.31    0.00    0.24    0.05    0.00   98.39

Device:   rrqm/s  wrqm/s     r/s     w/s  rsec/s  wsec/s avgrq-sz avgqu-sz   await   svctm   %util
sda         0.11    1.89    0.03    1.71    2.74   28.78    18.14     0.01    8.29    3.65    0.63

I need to know why the bandwidth utilization values differ between the two servers even though they have the same configuration.
Please suggest.
Regards,
Rahul

Responses

Rahul,

I am not sure what your specific concern is, but run this way iostat reports averages since the server was booted (its default for the first report). From the man page:

The first report generated by the iostat command provides statistics concerning the time since the system was booted.

It's likely one of the servers has had more load at one point in time which has increased the overall average.

Use the following command to get output every 2 seconds which will better represent the workload at the time of testing:

iostat -x sda 2

The trailing '2' is the interval, in seconds, at which iostat polls and displays I/O information.
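Because the first report still covers the time since boot even when an interval is given, it can help to discard it. A minimal sketch, assuming iostat prints one line per report that starts with the device name; the function name skip_first_report is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: drop the first (since-boot) report for one device
# and pass through only the interval reports that follow.
# Reads "iostat -x" output on stdin; the device name is the first argument.
skip_first_report() {
    awk -v dev="$1" '$1 == dev { if (++seen > 1) print }'
}

# Example usage (5 reports, 2 seconds apart; the first one is discarded):
#   iostat -x sda 2 5 | skip_first_report sda
```

The header lines are dropped as well, so the remaining lines are easy to compare between the two servers.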

Thanks for your reply!
While running the command you suggested, I sometimes see larger bandwidth utilization values on the affected server, like the one below.
Device:   rrqm/s  wrqm/s     r/s     w/s  rsec/s  wsec/s avgrq-sz avgqu-sz   await   svctm   %util
sda         0.00    0.50    0.00    1.50    0.00   16.00    10.67     0.15  102.00  102.00   15.30

Does this indicate load on the server or a hardware (disk) problem?
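A rough cross-check of the sample above, offered as a sketch rather than a diagnosis: in this iostat output format, %util is approximately (r/s + w/s) x svctm / 10, with svctm in milliseconds. Here that is 1.5 x 102 / 10 = 15.3, which matches the reported 15.30. So the 15% comes from very few requests that each take about 102 ms, not from a large volume of data (wsec/s is only 16 sectors per second). The function name util_from_line and the field positions are assumptions based on the header shown in this thread, and later versions of the iostat man page warn that svctm should not be trusted.

```shell
#!/bin/sh
# Hypothetical helper: recompute %util from one "iostat -x" device line.
# Column positions assume the header shown in this thread:
#   Device rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
# so r/s is field 4, w/s is field 5 and svctm (in ms) is field 11.
util_from_line() {
    awk '{ printf "%.2f\n", ($4 + $5) * $11 / 10 }'
}

# The sample line from the affected server:
echo "sda 0.00 0.50 0.00 1.50 0.00 16.00 10.67 0.15 102.00 102.00 15.30" | util_from_line
```

This prints 15.30. Whether 102 ms per request points to the disk itself or to something else on the host cannot be read from this one line.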

Hi Rahul - if I understand your question correctly, I believe you are wondering why 2 identical servers have different values for that particular query.

The OS runs a number of tools in the background:
updatedb (for the locate command)
rhn-profile-sync (if your host is registered it will inventory installed packages)
etc...

You have likely looked at the two hosts while different things were going on.

I recommend that you look into the iotop command.

Thanks, James, for your reply!
The same application is running on both servers, which have the same hardware configuration, but we are experiencing application slowness on the server where the bandwidth utilization value is high. Both servers were built from a single kickstart file, yet only one of them shows these larger values. What might be the exact cause?

Regards,
Rahul

I thought you were specifically wondering about only the disk statistics.
Performance Analysis (and tuning) is certainly challenging (the intro course is a week ;-)
I would try looking for information on the following:
iotop
sar -b
vmstat
iostat

I am also working through some performance issues myself right now. Unfortunately (for me) it involves 3 tiers of servers, multiple firewalls and load balancers.

It will likely take a bit of work to narrow it down, but first thing I would do is run 'sar' on both (no switches) and let us know what the '%iowait' figure shows for both servers (note: these are 10 minute averages). This will tell you if the server is spending time waiting on disk IO.
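To compare the two servers, the %iowait column can be pulled out of the sar output. A minimal sketch, assuming the sar CPU report has a header line containing %iowait; the function name iowait_column is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print the time stamp and %iowait from "sar" CPU output.
# The column is located from the header line, counted from the right, so the
# sketch does not depend on whether the time stamp has an AM/PM field.
iowait_column() {
    awk '
        /%iowait/ {                      # header line: find the column
            for (i = 1; i <= NF; i++)
                if ($i == "%iowait") from_end = NF - i
            have_header = 1
            next
        }
        have_header && NF > from_end && $NF ~ /^[0-9.]+$/ {
            print $1, $(NF - from_end)   # time stamp (or "Average:") and %iowait
        }
    '
}

# Example usage on each server:
#   sar | iowait_column
```

Counting from the right also keeps the "Average:" line aligned, since it has one field fewer than the timed lines when AM/PM is shown.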

As mentioned by James as well, the 'sar -b' will give some statistics on performance of the disks over time too which may provide some more insight.

Also, can you provide the output of 'free' to show if the server is utilising swap?
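The swap check can be reduced to one line per server. A minimal sketch, assuming the classic 'free' layout in which the Swap: line reads total, used, free (in kB by default); the function name swap_used is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: report swap use from "free" output read on stdin.
# Assumes the Swap: line has the form "Swap: total used free".
swap_used() {
    awk '$1 == "Swap:" {
        if ($3 + 0 > 0) print "swap in use: " $3 " kB"
        else            print "no swap in use"
    }'
}

# Example usage:
#   free | swap_used
```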

Are these servers using SAN storage or local disk?
If it's SAN, is it FC or iSCSI?
If it's iSCSI, are you using a dedicated network or a shared one?
What is the performance issue you are seeing? Slow network copy? Slow filesystem copy? Poor application performance? Or are you just comparing the numbers between servers and have concerns based on the differences?

These servers have only local disks. Here is the output of "sar -b" and "free" from each server:
12:40:01 PM tps rtps wtps bread/s bwrtn/s
12:50:01 PM 0.20 0.00 0.20 0.00 3.35
01:00:01 PM 0.20 0.00 0.20 0.00 3.35
01:10:01 PM 0.26 0.00 0.26 0.00 3.95
01:20:01 PM 0.22 0.00 0.22 0.00 3.35
01:30:01 PM 0.22 0.00 0.22 0.01 3.42
01:40:01 PM 0.23 0.00 0.23 0.00 3.49
01:50:01 PM 0.21 0.00 0.21 0.00 3.20
02:00:01 PM 0.22 0.00 0.22 0.00 3.39
02:10:01 PM 0.23 0.00 0.23 0.00 3.59
02:20:01 PM 0.21 0.00 0.21 0.00 3.30
02:30:01 PM 0.23 0.00 0.23 0.00 3.43
02:40:01 PM 0.29 0.00 0.29 0.00 4.09
02:50:01 PM 0.22 0.00 0.22 0.01 3.30
03:00:01 PM 0.26 0.01 0.25 0.24 3.68
03:10:01 PM 0.29 0.00 0.29 0.04 4.27
03:20:01 PM 0.21 0.00 0.21 0.00 3.32
Average: 0.31 0.09 0.22 1.06 3.54

free

             total       used       free     shared    buffers     cached
Mem:       3733216     417320    3315896          0      53208     126848
-/+ buffers/cache:     237264    3495952
Swap:      5963768          0    5963768

12:40:01 PM tps rtps wtps bread/s bwrtn/s
12:50:01 PM 2.37 0.00 2.37 0.00 34.85
01:00:01 PM 1.93 0.00 1.93 0.00 31.94
01:10:01 PM 2.33 0.00 2.33 0.00 35.49
01:20:01 PM 1.76 0.00 1.76 0.00 29.52
01:30:01 PM 2.52 0.00 2.52 0.00 37.84
01:40:01 PM 2.17 0.00 2.17 0.00 34.79
01:50:01 PM 2.09 0.00 2.09 0.00 31.13
02:00:01 PM 1.97 0.00 1.97 0.00 33.57
02:10:01 PM 2.32 0.00 2.32 0.00 35.67
02:20:01 PM 2.02 0.00 2.02 0.00 33.76
02:30:01 PM 2.54 0.00 2.54 0.00 38.04
02:40:01 PM 2.31 0.00 2.31 0.00 38.43
02:50:01 PM 2.42 0.00 2.42 0.00 34.83
03:00:01 PM 1.98 0.00 1.98 0.00 33.63
03:10:01 PM 2.23 0.00 2.23 0.00 34.78
03:20:01 PM 1.71 0.00 1.71 0.00 28.78
Average: 2.16 0.00 2.16 0.00 33.99

free

             total       used       free     shared    buffers     cached
Mem:       3733848    2824172     909676          0     402396    1411196
-/+ buffers/cache:    1010580    2723268
Swap:      5963768          0    5963768
