[RHEL 5.2] Cluster service failover happens to another host before multipath failover to another path

Solution Verified - Updated -

Issue

  • After recent configuration changes, testing loss of a path to the qdisk device in a multipath configuration results in cluster failover to another node, and the affected node is fenced.
  • Failover to another available device path did not happen as quickly as needed or expected.
  • Timing parameters were already tuned according to http://kbase.redhat.com/faq/docs/DOC-2882

Jan 10 18:04:51 node kernel: lpfc 0000:0e:00.0: 0:1305 Link Down Event x2 received Data: x2 x20 x80110    <<<--- ***lpfc detects link down***
Jan 10 18:05:10 node openais[12465]: [CMAN ] cman killed by node 1 because we were killed by cman_tool or other application  <<<--- ***qdisk has already acted and cluster failover begins (19s after link down)***
Jan 10 18:05:10 node dlm_controld[12487]: cluster is down, exiting
Jan 10 18:05:10 node kernel: dlm: closing connection to node 2
Jan 10 18:05:10 node kernel: dlm: closing connection to node 1
Jan 10 18:05:10 node gfs_controld[12493]: cluster is down, exiting
Jan 10 18:05:10 node fenced[12481]: cluster is down, exiting
Jan 10 18:05:21 node kernel:  rport-0:0-3: blocked FC remote port time out: saving binding  <<<--- ***SCSI timeout (30s after link down)***
Jan 10 18:05:21 node kernel: sd 0:0:1:1: SCSI error: return code = 0x00010000
Jan 10 18:05:21 node kernel: end_request: I/O error, dev sda, sector 0
Jan 10 18:05:21 node kernel: sd 0:0:1:3: SCSI error: return code = 0x00010000
Jan 10 18:05:21 node kernel: end_request: I/O error, dev sdb, sector 0
Jan 10 18:05:21 node kernel: sd 0:0:1:4: SCSI error: return code = 0x00010000
Jan 10 18:05:21 node kernel: end_request: I/O error, dev sdc, sector 0
Jan 10 18:05:21 node kernel: device-mapper: multipath: Failing path 8:32.
Jan 10 18:05:21 node multipathd: sdc: directio checker reports path is down
Jan 10 18:05:21 node multipathd: checker failed path 8:32 in map qdisk
Jan 10 18:05:21 node kernel: lpfc 0000:0e:00.0: 0:(0):0203 Devloss timeout on WWPN 50:0a:09:83:99:ea:cd:bf NPort x330300 Data: x0 x7 x0
Jan 10 18:05:21 node multipathd: qdisk: remaining active paths: 1
Jan 10 18:05:21 node multipathd: dm-8: add map (uevent)
Jan 10 18:05:21 node multipathd: dm-8: devmap already registered
Jan 10 18:05:21 node kernel: md: stopping all md devices.

  • Upon loss of a SAN link to the disk(s), cluster failover happens before local multipath path failover because of how the timeouts are configured.
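
The ordering problem can be read directly off the timestamps in the log excerpt above. The following is a minimal sketch, not part of the original solution, that computes the two delays. It also includes a hypothetical helper, `qdisk_outlasts_path_failover`, which assumes the quorum disk timeout is roughly `interval * tko` seconds (the `quorumd` attributes in cluster.conf). The year used for parsing is a placeholder, since syslog timestamps do not carry one.

```python
from datetime import datetime

# Timestamps copied from the log excerpt above.
EVENTS = {
    "link_down": "Jan 10 18:04:51",    # lpfc Link Down Event
    "cman_killed": "Jan 10 18:05:10",  # cman killed, cluster failover begins
    "path_failed": "Jan 10 18:05:21",  # multipath fails path 8:32
}

PLACEHOLDER_YEAR = "2000"  # assumption: syslog omits the year, any year works here


def seconds_between(start, end):
    """Return whole seconds elapsed between two syslog-style timestamps."""
    fmt = "%Y %b %d %H:%M:%S"
    t0 = datetime.strptime(f"{PLACEHOLDER_YEAR} {start}", fmt)
    t1 = datetime.strptime(f"{PLACEHOLDER_YEAR} {end}", fmt)
    return int((t1 - t0).total_seconds())


def qdisk_outlasts_path_failover(interval, tko, path_failover_secs):
    """Hypothetical check: True if the qdisk timeout (assumed to be
    interval * tko seconds) is longer than the path failover delay."""
    return interval * tko > path_failover_secs


cluster_failover_after = seconds_between(EVENTS["link_down"], EVENTS["cman_killed"])
path_failover_after = seconds_between(EVENTS["link_down"], EVENTS["path_failed"])

print(f"cluster failover after {cluster_failover_after}s")
print(f"path failover after {path_failover_after}s")
```

In this excerpt the cluster reacts 19 seconds after the link goes down, while multipath only fails the path after 30 seconds, so the node is removed from the cluster before the remaining path can take over.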

Environment

  • Red Hat Enterprise Cluster Suite 5.2
  • Device Mapper Multipath
  • SAN-based storage
