Oracle RAC under VMware started pinning all the CPU with storage I/O timeouts
Issue
Multiple Oracle processes are consuming all available CPU time, and dmesg shows
the following storage errors:
sd 5:0:1:0: [sdw] task abort on host 5, ffff8805445206c0
sd 5:0:1:0: [sdw] Failed to abort cmd ffff8805445206c0
sd 5:0:2:0: [sdx] task abort on host 5, ffff88051cb68ac0
sd 5:0:2:0: [sdx] Failed to abort cmd ffff88051cb68ac0
sd 5:0:3:0: [sdy] task abort on host 5, ffff880462cc1480
sd 5:0:3:0: [sdy] Failed to abort cmd ffff880462cc1480
sd 4:0:1:0: [sds] task abort on host 4, ffff8805ee414c80
sd 4:0:1:0: [sds] Failed to abort cmd ffff8805ee414c80
sd 4:0:0:0: [sdr] task abort on host 4, ffff8803f3df23c0
sd 4:0:0:0: [sdr] Failed to abort cmd ffff8803f3df23c0
sd 4:0:1:0: [sds] task abort on host 4, ffff8806243d5180
sd 4:0:1:0: [sds] Failed to abort cmd ffff8806243d5180
sd 5:0:3:0: [sdy] task abort on host 5, ffff88060c9a5780
sd 5:0:3:0: [sdy] Failed to abort cmd ffff88060c9a5780
sd 4:0:2:0: [sdt] task abort on host 4, ffff8801fa7c5480
sd 4:0:2:0: [sdt] Failed to abort cmd ffff8801fa7c5480
sd 4:0:0:0: [sdr] task abort on host 4, ffff8803ec0f04c0
sd 4:0:0:0: [sdr] Failed to abort cmd ffff8803ec0f04c0
sd 5:0:3:0: [sdy] task abort on host 5, ffff8801b58e0c80
sd 5:0:3:0: [sdy] Failed to abort cmd ffff8801b58e0c80
sd 4:0:1:0: [sds] task abort on host 4, ffff880739b357c0
sd 4:0:1:0: [sds] Failed to abort cmd ffff880739b357c0
No configuration changes have been made, and no disk failures have been observed on the backend storage.
The ESX/ESXi hypervisor reports "performance has deteriorated - I/O latency increased" in its event log, with I/O throughput exceeding 574 Mb/s.
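When many of these messages scroll past, it can help to tally the task aborts per SCSI host and per device to see whether they are spread across all paths or concentrated on a few. The following is a minimal sketch, not part of the original report: the function name tally_aborts and the regular expression are illustrative assumptions based only on the message format shown in the dmesg excerpt above.

```python
import re
import sys
from collections import Counter

# Assumed message format, taken from the excerpt above, e.g.:
#   sd 5:0:1:0: [sdw] task abort on host 5, ffff8805445206c0
ABORT_RE = re.compile(
    r"sd (\d+):(\d+):(\d+):(\d+): \[(\w+)\] task abort on host (\d+)"
)


def tally_aborts(lines):
    """Count 'task abort' messages per SCSI host and per block device.

    Hypothetical helper for illustration; lines that do not match the
    assumed format (including 'Failed to abort cmd' lines) are skipped.
    """
    per_host = Counter()
    per_device = Counter()
    for line in lines:
        match = ABORT_RE.search(line)
        if match is None:
            continue
        per_host[int(match.group(6))] += 1   # host number after "on host"
        per_device[match.group(5)] += 1      # device name inside [...]
    return per_host, per_device


if __name__ == "__main__":
    # Reads kernel log text from standard input.
    hosts, devices = tally_aborts(sys.stdin)
    for host, count in sorted(hosts.items()):
        print(f"host {host}: {count} task aborts")
    for device, count in sorted(devices.items()):
        print(f"{device}: {count} task aborts")
```

Saved under a hypothetical name such as tally_aborts.py, it could be run as `dmesg | python3 tally_aborts.py`. Because the pattern is searched anywhere in the line, a leading kernel timestamp does not prevent a match.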
Environment
- Red Hat Enterprise Linux 6.8
- Oracle RAC