Change in poll() timeout behavior between RHEL 5.5 and RHEL 6.0

Hi,

We have a C program that calls poll(fds, 1, timeout_msec) to block for 10 ms, sends an event to another thread, and then blocks again for 10 ms. We have used this code for a while, and it has worked fine on all bare-metal systems.

With Red Hat 6.0, we are seeing a problem on one particular system, a Dell PowerEdge R410. Sometimes poll() blocks for much longer than the requested 10 ms for a period of time; we have seen it block for as long as 500 ms. During an 18-hour test, these long timeouts might occur 2 or 3 times and last for 5 minutes each time. If we install Red Hat 5.5 on this system, we do not see the problem, but if we install Red Hat 6.0 on it, the problem appears. We have another system, a Dell PowerEdge R310, where we do not see the problem with Red Hat 6.0 installed.

Is there some way to figure out why we are seeing such large timeouts? Is this a known problem? Is there a workaround or patch to fix this? Thank you.
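One way to narrow this down is to timestamp each poll() call and log the ones that overrun, so the slow periods can be lined up against system logs. The following is only a minimal sketch, not the original program: it assumes a Linux system with CLOCK_MONOTONIC, it polls with no descriptors (poll(NULL, 0, ...)) rather than the real fds, and the helper names elapsed_ms, timed_poll_ms and worst_poll_ms are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <stdio.h>
#include <time.h>

/* Milliseconds between two CLOCK_MONOTONIC readings. */
static double elapsed_ms(const struct timespec *start, const struct timespec *end)
{
    return (end->tv_sec - start->tv_sec) * 1000.0 +
           (end->tv_nsec - start->tv_nsec) / 1e6;
}

/* Block in poll() for timeout_ms with no descriptors (an assumption; the
 * real program passes one fd) and return how long the call actually took.
 * Returns -1.0 if poll() failed, for example with EINTR. */
static double timed_poll_ms(int timeout_ms)
{
    struct timespec start, end;

    assert(timeout_ms >= 0);
    clock_gettime(CLOCK_MONOTONIC, &start);
    int rc = poll(NULL, 0, timeout_ms);
    clock_gettime(CLOCK_MONOTONIC, &end);

    return rc < 0 ? -1.0 : elapsed_ms(&start, &end);
}

/* Repeat the timed poll, report every call slower than report_above_ms
 * on stderr, and return the worst duration seen. */
static double worst_poll_ms(int timeout_ms, int iterations, double report_above_ms)
{
    double worst = 0.0;

    for (int i = 0; i < iterations; i++) {
        double ms = timed_poll_ms(timeout_ms);
        if (ms < 0.0)
            continue; /* failed call, nothing to measure */
        if (ms > report_above_ms)
            fprintf(stderr, "iteration %d: poll(%d ms) blocked for %.3f ms\n",
                    i, timeout_ms, ms);
        if (ms > worst)
            worst = ms;
    }
    return worst;
}
```

Calling worst_poll_ms(10, 6480000, 20.0) would cover roughly 18 hours of 10 ms polls and log anything over 20 ms. On older glibc versions, such as those shipped with RHEL 5 and 6, clock_gettime() requires linking with -lrt.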

Responses

There's an open ticket for RHEL6 for the R410s (warning, this is long and may be a red herring, but looked close enough to mention):
https://bugzilla.redhat.com/show_bug.cgi?id=710265

If the latency is localized to just your poll() system calls, this is probably not related. If you are noticing slow performance on overall benchmarks or regular system use, this may be related.

Since this sounds related to your business/product, I'd open a ticket with Red Hat Support and start the ball rolling there.

It probably wouldn't hurt to update your system BIOS and your PERC and DRAC firmware to the latest versions if you haven't already. Please take the usual precautions if these are production systems.

System BIOS:
http://support.dell.com/support/downloads/format.aspx?c=us&cs=555&l=en&s=biz&deviceid=11809&libid=1&releaseid=R311421&vercnt=10&formatcnt=0&SystemID=PWE_R410&servicetag=&os=WNET&osl=en&catid=-1&dateid=-1&typeid=-1&formatid=-1&impid=-1&checkFormat=true

Thank you. It is possible that this might be related to our problem. I will read through the thread and try some of the suggestions for a workaround for now. I will also see if our BIOS needs to be updated.

I have seen similar misbehavior that was not caused by a change in poll(), but was instead related to the low-power modes of particular motherboard/CPU combinations.

Can you try entering the BIOS and changing the Power Management setting to "Active Power Control" instead of "Performance"?
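If rebooting into the BIOS is inconvenient, the low-power theory can also be probed from software. The sketch below assumes the kernel exposes the PM QoS device /dev/cpu_dma_latency and that the process is allowed to write to it (typically root); the helper name hold_cpu_dma_latency is made up for illustration. Writing a 32-bit latency limit in microseconds asks the kernel to avoid idle states with a longer wakeup latency, and the request stays in force only while the descriptor is open.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Open the PM QoS device at path (normally "/dev/cpu_dma_latency") and
 * request a maximum CPU wakeup latency of max_latency_us microseconds.
 * Returns the open descriptor, which the caller must keep open for as
 * long as the request should apply, or -1 on failure. */
static int hold_cpu_dma_latency(const char *path, int32_t max_latency_us)
{
    assert(path != NULL);

    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;

    if (write(fd, &max_latency_us, sizeof max_latency_us) !=
        (ssize_t)sizeof max_latency_us) {
        close(fd);
        return -1;
    }
    return fd; /* close(fd) later to drop the request */
}
```

A test program could call hold_cpu_dma_latency("/dev/cpu_dma_latency", 0) before entering its poll() loop. If the long timeouts disappear while the descriptor is held open, deep idle states are a likely suspect; if they remain, the cause is probably elsewhere.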

 

I think this bug is related to your problem: https://bugzilla.redhat.com/show_bug.cgi?id=710265 If you would like the test kernel, please open a ticket with Red Hat Support Services and ask for it.

My previous subject heading was incorrect; I was thinking of S3 states, not SMI interrupts.