Chapter 17. Monitoring Red Hat Gluster Storage Workload

Monitoring storage volumes is helpful when performing capacity planning or performance tuning on a Red Hat Gluster Storage volume. You can monitor Red Hat Gluster Storage volumes with different parameters and use the output to identify and troubleshoot issues.
You can use the volume top and volume profile commands to view vital performance information and identify bottlenecks on each brick of a volume.
You can also perform a statedump of the brick processes and the NFS server process of a volume, and view the volume status and volume information.
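For quick reference, the monitoring commands covered in this chapter follow the patterns shown below (a sketch only; test-volume is the example volume name used throughout this chapter):
# gluster volume profile test-volume start      # enable per-brick I/O profiling
# gluster volume profile test-volume info       # display collected profile statistics
# gluster volume top test-volume read           # list the most active files by read calls
# gluster volume statedump test-volume          # dump the internal state of the brick processes
# gluster volume status test-volume             # display the status of the volume
# gluster volume info test-volume               # display the volume configuration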

Note

If you restart the server process, the existing profile and top information will be reset.

17.1. Profiling volumes

17.1.1. Server-side volume profiling using volume profile

The volume profile command displays per-brick or NFS server I/O information for each file operation on a volume. This information helps you identify bottlenecks in the storage system.
This section describes how to use the volume profile command.

17.1.1.1. Start Profiling

To view the file operation information of each brick, start profiling with the following command:
# gluster volume profile VOLNAME start
For example, to start profiling on test-volume:
# gluster volume profile test-volume start
Starting volume profile on test-volume has been successful

Important

Running the profile command can affect system performance while the profile information is being collected. Red Hat recommends that profiling be used only for debugging.
When profiling is started on a volume, the following additional options are displayed in the output of the volume info command:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
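After profiling is started, the relevant portion of the volume info output might look like the following (a sketch; the rest of the output is omitted and test-volume is the example volume):
# gluster volume info test-volume
...
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on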

17.1.1.2. Displaying the I/O Information

To view the I/O information of the bricks on a volume, use the following command:
# gluster volume profile VOLNAME info
For example, to view the I/O information of test-volume:
# gluster volume profile test-volume info
Brick: rhsqaci-vm33.lab.eng.blr.redhat.com:/bricks/brick0/1
-----------------------------------------------------------
Cumulative Stats:
 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
 ---------   -----------   -----------   -----------   ------------        ----
      0.00       0.00 us       0.00 us       0.00 us             11     RELEASE
      0.00       0.00 us       0.00 us       0.00 us            238  RELEASEDIR
      0.35     380.05 us     380.05 us     380.05 us              1    SETXATTR
      0.40     107.73 us       5.50 us     413.31 us              4     OPENDIR
      0.62     167.65 us      91.33 us     339.28 us              4      STATFS
      0.86     187.42 us      28.50 us     534.96 us              5    GETXATTR
      2.16     106.54 us      32.16 us     383.58 us             22     ENTRYLK
      2.17     106.97 us      39.01 us     251.65 us             22       FLUSH
      2.92     263.57 us     189.06 us     495.05 us             12     SETATTR
      3.22     124.60 us      43.08 us     311.69 us             28     INODELK
      3.41     616.76 us     319.27 us    1028.72 us              6     READDIR
     10.11     997.03 us     413.73 us    3507.02 us             11      CREATE
     73.79     256.58 us      50.02 us     924.61 us            312      LOOKUP

    Duration: 46537 seconds
   Data Read: 0 bytes
Data Written: 0 bytes

Interval 1 Stats:
 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
 ---------   -----------   -----------   -----------   ------------        ----
      0.00       0.00 us       0.00 us       0.00 us             11     RELEASE
      0.00       0.00 us       0.00 us       0.00 us              4  RELEASEDIR
      0.35     380.05 us     380.05 us     380.05 us              1    SETXATTR
      0.40     107.73 us       5.50 us     413.31 us              4     OPENDIR
      0.62     167.65 us      91.33 us     339.28 us              4      STATFS
      0.86     187.42 us      28.50 us     534.96 us              5    GETXATTR
      2.16     106.54 us      32.16 us     383.58 us             22     ENTRYLK
      2.17     106.97 us      39.01 us     251.65 us             22       FLUSH
      2.92     263.57 us     189.06 us     495.05 us             12     SETATTR
      3.22     124.60 us      43.08 us     311.69 us             28     INODELK
      3.41     616.76 us     319.27 us    1028.72 us              6     READDIR
     10.11     997.03 us     413.73 us    3507.02 us             11      CREATE
     73.79     256.58 us      50.02 us     924.61 us            312      LOOKUP

    Duration: 347 seconds
   Data Read: 0 bytes
Data Written: 0 bytes

Brick: rhsqaci-vm33.lab.eng.blr.redhat.com:/bricks/brick1/1
-----------------------------------------------------------
Cumulative Stats:
 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
 ---------   -----------   -----------   -----------   ------------        ----
      0.00       0.00 us       0.00 us       0.00 us             12     RELEASE
      0.00       0.00 us       0.00 us       0.00 us            238  RELEASEDIR
      0.14     146.24 us     146.24 us     146.24 us              1        OPEN
      0.26     266.64 us     266.64 us     266.64 us              1    SETXATTR
      0.26      67.88 us       2.50 us     243.52 us              4     OPENDIR
      0.42     108.83 us      81.87 us     139.11 us              4      STATFS
      0.98     201.26 us      82.36 us     306.38 us              5    GETXATTR
      2.49     116.34 us      23.53 us     304.10 us             22     ENTRYLK
      2.75     236.13 us     124.73 us     358.80 us             12     SETATTR
      3.12     114.82 us      44.34 us     550.01 us             28     INODELK
      3.17     142.00 us      23.16 us     388.56 us             23       FLUSH
      4.37     748.73 us     324.70 us    1115.96 us              6     READDIR
      6.57     614.44 us     364.94 us     807.17 us             11      CREATE
     75.46     248.88 us      66.43 us     599.31 us            312      LOOKUP

    Duration: 46537 seconds
   Data Read: 0 bytes
Data Written: 0 bytes
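In the output above, the %-latency column shows each file operation's share of the brick's total latency, so the operations at the bottom of each table (LOOKUP and CREATE in this example) are the usual starting points for investigation. To compare particular operations across bricks, you can filter the output with standard shell tools (a sketch; adjust the operation names as needed):
# gluster volume profile test-volume info | grep -E 'Brick:|CREATE|LOOKUP'
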
To view the I/O information of the NFS server on a specified volume, use the following command:
# gluster volume profile VOLNAME info nfs
For example, to view the I/O information of the NFS server on test-volume:
# gluster volume profile test-volume info nfs
NFS Server : localhost
----------------------
Cumulative Stats:
Block Size:              32768b+               65536b+
No. of Reads:                    0                     0
No. of Writes:                 1000                  1000
%-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
---------   -----------   -----------   -----------   ------------        ----
0.01     410.33 us     194.00 us     641.00 us              3      STATFS
0.60     465.44 us     346.00 us     867.00 us            147       FSTAT
1.63     187.21 us      67.00 us    6081.00 us           1000     SETATTR
1.94     221.40 us      58.00 us   55399.00 us           1002      ACCESS
2.55     301.39 us      52.00 us   75922.00 us            968        STAT
2.85     326.18 us      88.00 us   66184.00 us           1000    TRUNCATE
4.47     511.89 us      60.00 us  101282.00 us           1000       FLUSH
5.02    3907.40 us    1723.00 us   19508.00 us            147    READDIRP
25.42    2876.37 us     101.00 us  843209.00 us           1012      LOOKUP
55.52    3179.16 us     124.00 us  121158.00 us           2000       WRITE

Duration: 7074 seconds
Data Read: 0 bytes
Data Written: 102400000 bytes

Interval 1 Stats:
Block Size:              32768b+               65536b+
No. of Reads:                    0                     0
No. of Writes:                 1000                  1000
%-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
---------   -----------   -----------   -----------   ------------        ----
0.01     410.33 us     194.00 us     641.00 us              3      STATFS
0.60     465.44 us     346.00 us     867.00 us            147       FSTAT
1.63     187.21 us      67.00 us    6081.00 us           1000     SETATTR
1.94     221.40 us      58.00 us   55399.00 us           1002      ACCESS
2.55     301.39 us      52.00 us   75922.00 us            968        STAT
2.85     326.18 us      88.00 us   66184.00 us           1000    TRUNCATE
4.47     511.89 us      60.00 us  101282.00 us           1000       FLUSH
5.02    3907.40 us    1723.00 us   19508.00 us            147    READDIRP
25.41    2878.07 us     101.00 us  843209.00 us           1011      LOOKUP
55.53    3179.16 us     124.00 us  121158.00 us           2000       WRITE

Duration: 330 seconds
Data Read: 0 bytes
Data Written: 102400000 bytes
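The interval statistics cover the period since profiling started or since the information was last displayed, so collecting the output periodically produces per-interval numbers that can be compared over time. A minimal sketch for unattended collection (the file path, repeat count, and sleep interval are arbitrary examples):
# for i in $(seq 1 12); do gluster volume profile test-volume info >> /var/tmp/test-volume-profile.log; sleep 300; done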

17.1.1.3. Stop Profiling

To stop profiling on a volume, use the following command:
# gluster volume profile VOLNAME stop
For example, to stop profiling on test-volume:
# gluster volume profile test-volume stop
Stopping volume profile on test-volume has been successful
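To confirm that profiling is no longer active, you can check the diagnostics options again (a sketch; after stopping, the options should be reported as off or no longer listed as enabled):
# gluster volume info test-volume | grep diagnostics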

17.1.2. Client-side volume profiling (FUSE only)

Red Hat Gluster Storage lets you profile how your mount point is being accessed, so that you can investigate latency issues even when you cannot instrument the application accessing your storage.
The io-stats translator records statistics of all file system activity on a Red Hat Gluster Storage volume that travels through a FUSE mount point. It collects information on files opened from the FUSE mount path, the read and write throughput for these files, the number of blocks read and written, and the latency observed for different file operations.
Run the following command to write all statistics recorded for the specified mount point to output files identified by output_file_id.
# setfattr -n trusted.io-stats-dump -v output_file_id mount_point
This generates a number of files in the /var/run/gluster directory. The output_file_id is not the whole file name, but is used as part of the name of the generated files.
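For example, to capture a snapshot of the client-side statistics for a volume mounted at /mnt/glusterfs (a sketch; the mount point and the io-stats-snapshot identifier are arbitrary examples):
# setfattr -n trusted.io-stats-dump -v io-stats-snapshot /mnt/glusterfs
# ls /var/run/gluster/ | grep io-stats-snapshot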