OpenStack-managed Ceph cluster reports a high number of slow requests

Solution In Progress - Updated -

Issue

  • We are seeing a very high number of slow requests on a Ceph cluster managed by OpenStack 13, which is also causing the Ceph cluster health state to fluctuate.

  • This is causing problems when provisioning clusters in the OpenStack environment. Could anyone please help investigate this?

Every 2.0s: ceph -s    Sun May 10 10:11:01 2020

  cluster:
    id:     0508166a-302c-11e7-bf96-141877347430
    health: HEALTH_WARN
            noscrub,nodeep-scrub flag(s) set
            1 slow requests are blocked > 32 sec. Implicated osds 10

  services:
    mon: 3 daemons, quorum overcloud-controller-0,overcloud-controller-2,overcloud-controller-1
    mgr: overcloud-controller-1(active), standbys: overcloud-controller-0, overcloud-controller-2
    osd: 265 osds: 264 up, 264 in
         flags noscrub,nodeep-scrub

  data:
    pools:   4 pools, 13344 pgs
    objects: 2.92M objects, 11.0TiB
    usage:   32.7TiB used, 254TiB / 287TiB avail
    pgs:     13344 active+clean

  io:
    client:   517MiB/s rd, 106MiB/s wr, 3.20kop/s rd, 10.27kop/s wr
  • The following OSDs are logging the most slow requests (occurrence count, then OSD):
[root@overcloud-controller-0 ~]# grep -r 'slow' /var/log/messages|awk '/subop/ {split($NF,a,","); for(i=1;b=a[i];i++) { print "osd."b}; next; } { print $10}' | grep -v mon|sort -g | uniq -c | sort -k1 -n -r | head
   7150 0
    138 osd.205
     98 osd.253
     98 osd.216
     50 osd.51
     40 osd.76
     40 osd.171
     32 osd.49
     28 osd.28
     20 osd.83
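The first row of the output above (`7150 0`) is not an OSD name. It most likely comes from log lines whose tenth whitespace-separated field is not an OSD, such as the `journal:` and `docker:` lines, where that field is `0`. The following is a minimal sketch, not the article's method, that matches the first `osd.<id>` token in each slow request line by pattern instead of by field position. The function name `count_slow_osds` is hypothetical, and the sketch assumes the reporting OSD is written as `osd.<id>` somewhere in the line.

```shell
# Hypothetical helper: count slow-request log lines per reporting OSD.
# Reads log text on stdin. Assumes the OSD appears as "osd.<id>" in the line;
# lines without such a token (for example health-check summaries) are skipped.
count_slow_osds() {
    awk '/slow request/ && match($0, /osd\.[0-9]+/) {
             # Print only the first osd.<id> token found on the line.
             print substr($0, RSTART, RLENGTH)
         }' |
        sort |
        uniq -c |
        sort -k1,1nr -k2,2   # highest count first, then by OSD name
}
```

It could be run as `count_slow_osds < /var/log/messages | head`. Matching by pattern keeps non-OSD fields out of the tally, at the cost of ignoring lines that list bare OSD numbers only.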
  • Slow requests are being logged in the following states:
[root@overcloud-controller-0 ~]#  grep -r 'slow request' /var/log/messages|sed -e 's/^.*currently //' -e 's/from.*$//' | sort -g | uniq -c | sort -k1 -n -r | head
    638 sub_op_commit_rec 
     36 op_applied
      1 May 10 10:11:04 overcloud-controller-0 journal: debug 2020-05-10 10:11:04.089447 7faa0618d700  0 log_channel(cluster) log [INF] : Health check cleared: REQUEST_SLOW (was: 1 slow requests are blocked > 32 sec. Implicated osds 10)
      1 May 10 10:11:04 overcloud-controller-0 journal: cluster 2020-05-10 10:11:04.089456 mon.overcloud-controller-0 mon.0 10.10.10.10:6789/0 217610 : cluster [INF] Health check cleared: REQUEST_SLOW (was: 1 slow requests are blocked > 32 sec. Implicated osds 10)
      1 May 10 10:11:04 overcloud-controller-0 docker: debug 2020-05-10 10:11:04.089447 7faa0618d700  0 log_channel(cluster) log [INF] : Health check cleared: REQUEST_SLOW (was: 1 slow requests are blocked > 32 sec. Implicated osds 10)
      1 May 10 10:11:04 overcloud-controller-0 docker: cluster 2020-05-10 10:11:04.089456 mon.overcloud-controller-0 mon.0 10.10.10.10:6789/0 217610 : cluster [INF] Health check cleared: REQUEST_SLOW (was: 1 slow requests are blocked > 32 sec. Implicated osds 10)
      1 May 10 10:10:58 overcloud-controller-0 journal: debug 2020-05-10 10:10:58.030673 7faa0618d700  0 log_channel(cluster) log [WRN] : Health check failed: 1 slow requests are blocked > 32 sec. Implicated osds 10 (REQUEST_SLOW)
      1 May 10 10:10:58 overcloud-controller-0 journal: cluster 2020-05-10 10:10:58.030683 mon.overcloud-controller-0 mon.0 10.10.10.10:6789/0 217609 : cluster [WRN] Health check failed: 1 slow requests are blocked > 32 sec. Implicated osds 10 (REQUEST_SLOW)
      1 May 10 10:10:58 overcloud-controller-0 docker: debug 2020-05-10 10:10:58.030673 7faa0618d700  0 log_channel(cluster) log [WRN] : Health check failed: 1 slow requests are blocked > 32 sec. Implicated osds 10 (REQUEST_SLOW)
      1 May 10 10:10:58 overcloud-controller-0 docker: cluster 2020-05-10 10:10:58.030683 mon.overcloud-controller-0 mon.0 10.10.10.10:6789/0 217609 : cluster [WRN] Health check failed: 1 slow requests are blocked > 32 sec. Implicated osds 10 (REQUEST_SLOW)
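The single-count rows in the output above are health-check messages: they contain the text "slow request" but no "currently" clause, so the `sed` expressions leave them unchanged and each full line is counted once. A minimal sketch that tallies only lines carrying a "currently" state is shown below. The function name `count_slow_states` is hypothetical, and the sketch assumes the state follows the word "currently" and is optionally followed by "from ...", as in the command above.

```shell
# Hypothetical helper: count slow requests by the state they are "currently" in.
# Reads log text on stdin. Lines without a "currently" clause are ignored.
count_slow_states() {
    sed -n '/slow request.*currently /{
        s/^.*currently //
        s/ *from.*$//
        s/ *$//
        p
    }' |
        sort |
        uniq -c |
        sort -k1,1nr -k2,2   # highest count first, then by state name
}
```

It could be run as `count_slow_states < /var/log/messages | head`, which should list only states such as `sub_op_commit_rec` and `op_applied`.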

Environment

  • Red Hat OpenStack Platform (RHOSP)
  • Red Hat Ceph Storage 3.3 (RHCS)
