Is there a way to limit multipath failover times in order to avoid Oracle RAC cluster evictions?

Solution Verified - Updated -

Issue

  • Multipath takes too long to react during SAN failures, exceeding Oracle RAC cluster timeouts and triggering evictions.
  • The voting disk timeout is 200s.
  • The network heartbeat timeout (css_misscount) is 30s.
  • The short disk timeout (SDTO) is 27s.

The SDTO is not publicly known.

Network Ping                          Disk Ping                                                              Reboot
------------------------------------  ---------------------------------------------------------------------  ------
Completes within misscount seconds    Completes within misscount seconds                                     N
Completes within misscount seconds    Takes more than misscount seconds but less than disktimeout seconds    N
Completes within misscount seconds    Takes more than disktimeout seconds                                    Y
Takes more than misscount seconds     Completes within misscount seconds                                     Y
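
The decision table above can be read as a small predicate. The following is only an illustrative sketch, not Oracle's implementation: the function name `reboot_expected` and its parameters are hypothetical, and the default values are the 30s misscount and 200s voting disk timeout quoted in this article. The table does not list the case where both the network ping and the disk ping are late; the sketch assumes that case also leads to a reboot.

```python
# Hypothetical helper mirroring the reboot decision table.
# Defaults come from the values quoted in this article.
MISSCOUNT_S = 30      # network heartbeat timeout (css_misscount)
DISKTIMEOUT_S = 200   # voting disk timeout


def reboot_expected(network_ping_s, disk_ping_s,
                    misscount_s=MISSCOUNT_S, disktimeout_s=DISKTIMEOUT_S):
    """Return True if the table predicts a node reboot (eviction)."""
    # A network ping is healthy only if it completes within misscount.
    network_ok = network_ping_s <= misscount_s
    # A disk ping is tolerated until it exceeds the disk timeout,
    # even if it takes longer than misscount.
    disk_ok = disk_ping_s <= disktimeout_s
    # Assumption: if both are late, the node is rebooted as well
    # (this row is not present in the table).
    return not (network_ok and disk_ok)


if __name__ == "__main__":
    for net, disk in [(5, 5), (5, 100), (5, 250), (40, 5)]:
        print(net, disk, "Y" if reboot_expected(net, disk) else "N")
```

Under these assumptions, a multipath failover that stalls disk I/O for longer than the applicable disk timeout falls into the third row of the table.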

These CSSD log messages show the SDTO in milliseconds (27000 ms = 27s):
[ CSSD][xxxxxxx]clssnmPollingThread: local diskTimeout set to 27000 ms, remote disk timeout set to 27000, impending reconfig status(1)
More than disk timeout of 27000 after the last NHB (network heartbeat)
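
To check which disk timeout is in effect on a node, the value can be pulled out of the CSSD log line shown above. This is a minimal sketch that assumes the message format matches the sample exactly. The function name `parse_disk_timeouts` is hypothetical, and the remote value is assumed to be in milliseconds like the local one, although the log line does not state its unit.

```python
import re

# Pattern built from the sample message in this article; other
# Clusterware versions may word the message differently.
DISK_TIMEOUT_RE = re.compile(
    r"local diskTimeout set to (\d+) ms, remote disk timeout set to (\d+)"
)


def parse_disk_timeouts(line):
    """Return (local_s, remote_s) in seconds, or None if the line does not match."""
    match = DISK_TIMEOUT_RE.search(line)
    if match is None:
        return None
    # Assumption: the remote value is in milliseconds, like the local one.
    local_ms, remote_ms = int(match.group(1)), int(match.group(2))
    return local_ms / 1000.0, remote_ms / 1000.0


sample = (
    "[ CSSD][xxxxxxx]clssnmPollingThread: local diskTimeout set to 27000 ms, "
    "remote disk timeout set to 27000, impending reconfig status(1)"
)

if __name__ == "__main__":
    print(parse_disk_timeouts(sample))
```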

Environment

  • Red Hat Enterprise Linux 6
  • Red Hat Enterprise Linux 7
  • Red Hat Enterprise Linux 8
  • Oracle RAC
  • Fibre Channel SAN storage
