High latency events in event log
Hello,
Could someone tell me about event log messages that have the form:
"Storage domain X experienced a high latency of X from host X..."
We are receiving them regularly now for all of our storage domains. We have noticed a slight drop-off in VM performance over the last year or so, though nothing significant. RHEV is version 3.0, NFS data center.
What sort of test is done to measure latency? Thanks!
DW
Responses
I don't know what test is used to trigger these warning event messages, but they sometimes appear in our RHEV-M Web Admin portal. When I query event type 524, the resulting messages all show a latency of at least 5 seconds.
Our RHEV-M service is at version 3.1 and has an iSCSI data center. The connectivity from the hosts to the iSCSI storage array has only one path.
Greetings.
We see the same messages on our console. We have checked our entire FC network topology and did not find any issues.
Does anybody know how to troubleshoot these messages?
We have noticed this message twice now too. It came up when dd'ing a VM from a KVM hypervisor host onto a RHEV-H host on which a rescue system was booted. The RHEV-H hosts have two paths via 10Gb Ethernet to an iSCSI storage appliance. I was surprised, because I assumed there should not be any problems as far as the speed to the storage is concerned. Besides that, there was almost no other traffic on the storage network at the time.
I would also be interested in how to troubleshoot this, if there is any trouble at all.
If you look at the Bugzilla entry https://bugzilla.redhat.com/show_bug.cgi?id=948232 , you can see there are some false positives here. Sometimes the latency report is not accurate and shows a 5 second latency even when that is not true. In a later version, based on the Bugzilla, we are going to change the way we measure the latency to make it accurate.
Until then, you can try running "time /bin/dd iflag=direct if=/dev/<sd>/metadata of=/dev/null bs=4k count=1" several times a day (if possible, close to the time the message was reported on the Events tab, or keep it running via a cron job and examine the results later). If the time taken does not match the latency in the message and you are not facing any kind of performance problems, you can assume your storage is working fine and ignore this error.
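If you want to keep a record rather than run the command by hand, the check above could be wrapped in a small script along these lines. This is only a rough sketch: the function names, the default log path, and the 5-second cutoff (taken from the latencies people report seeing in this thread) are my own assumptions and not anything RHEV provides. It also assumes GNU date for the %N format.

```shell
#!/bin/bash
# Sketch only: function names, the default log path, and the 5-second
# threshold are illustrative assumptions, not part of RHEV.

# Return success (0) if elapsed seconds ($1) is at or above the threshold ($2).
is_slow() {
    awk -v t="$1" -v limit="$2" 'BEGIN { exit !(t >= limit) }'
}

# Time one 4 KiB direct read of $1 and print the elapsed seconds.
probe_latency() {
    local target=$1 start end
    start=$(date +%s.%N)   # GNU date; %N gives nanoseconds
    /bin/dd iflag=direct if="$target" of=/dev/null bs=4k count=1 2>/dev/null || return 1
    end=$(date +%s.%N)
    awk -v s="$start" -v e="$end" 'BEGIN { printf "%.3f\n", e - s }'
}

# Append one timestamped result line for $1 to the log file $2.
log_probe() {
    local target=$1 logfile=$2 elapsed status=ok
    if ! elapsed=$(probe_latency "$target"); then
        echo "$(date '+%F %T') $target read-failed" >> "$logfile"
        return 1
    fi
    if is_slow "$elapsed" 5; then status=SLOW; fi
    echo "$(date '+%F %T') $target ${elapsed}s $status" >> "$logfile"
}

# Run only when a device path is given as the first argument, e.g. from cron.
if [ -n "${1:-}" ]; then
    log_probe "$1" "${2:-/tmp/storage-latency.log}"
fi
```

Saved under a name of your choosing and called from cron with the /dev/<sd>/metadata path as its first argument, this leaves one line per run in the log, which you can then compare against the timestamps of the events in the Events tab.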
