Slow thread response

Posted on

I have a C program that uses 32 threads. The threads are pthreads with SCHED_FIFO scheduling, all set to the maximum FIFO priority, and each is bound to a separate core on a machine with 64 cores. 16 of the threads exchange a bit of data with each other sequentially: thread 0 sends data to thread 1, which sends some data to thread 2, and so on, until thread 15 sends data back to thread 0. I use memcpy to copy the data across, and I have tried named pipes, POSIX message queues, and shared memory flags to notify the downstream thread that it has data.

My problem is that one of the threads just doesn't respond to any of these mechanisms in a timely fashion. It won't read the named pipe or message queue for a long time (about 0.3 seconds, as best I can measure). Even the shared memory variable doesn't seem to update: I can check the variable from a different thread and it has been updated, but the suspect thread doesn't pick it up for the exact same interval, 0.3 seconds. I can put the thread into a loop and get the correct response from something like nanosleep, but it just won't respond to an IPC.

Changing the affinity makes no difference. The code for all threads is identical. Just one thread (it's always thread 1) seems to go into la-la land. Any ideas on how to chase this down would be appreciated. I've tried the RT kernel as well, with the same result.
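For reference, the shared-memory-flag variant of the hand-off described above can be sketched roughly as follows. This is a minimal sketch, not the actual program: the names (`run_ring`, `struct mailbox`, `PAYLOAD_BYTES`) are hypothetical, the flag is assumed to be a C11 atomic with release/acquire ordering, and the wait loop is assumed to call `sched_yield()`. The SCHED_FIFO and affinity setup is best effort and only attempted when `use_rt` is nonzero, since it normally needs elevated privileges.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <string.h>

#define MAX_THREADS   64
#define PAYLOAD_BYTES 64

/* One mailbox per thread: a "data ready" flag plus the payload. */
struct mailbox {
    _Alignas(64) atomic_int ready;      /* aligned to its own cache line */
    unsigned char data[PAYLOAD_BYTES];
};

struct ring {
    int nthreads, laps, use_rt;
    struct mailbox box[MAX_THREADS];
};

struct worker_arg { struct ring *ring; int id; };

static void *worker(void *p)
{
    struct worker_arg *a = p;
    struct ring *r = a->ring;
    struct mailbox *in  = &r->box[a->id];
    struct mailbox *out = &r->box[(a->id + 1) % r->nthreads];

    if (r->use_rt) {
        /* Best effort: errors (e.g. EPERM, EINVAL) are deliberately ignored. */
        struct sched_param sp;
        cpu_set_t set;
        sp.sched_priority = sched_get_priority_max(SCHED_FIFO);
        pthread_setschedparam(pthread_self(), SCHED_FIFO, &sp);
        CPU_ZERO(&set);
        CPU_SET(a->id, &set);
        pthread_setaffinity_np(pthread_self(), sizeof set, &set);
    }

    for (int lap = 0; lap < r->laps; lap++) {
        unsigned char buf[PAYLOAD_BYTES];
        int hops;

        /* Wait for the upstream thread; acquire pairs with its release. */
        while (!atomic_load_explicit(&in->ready, memory_order_acquire))
            sched_yield();
        atomic_store_explicit(&in->ready, 0, memory_order_relaxed);

        /* Take the payload and bump the hop counter stored at its start. */
        memcpy(buf, in->data, sizeof buf);
        memcpy(&hops, buf, sizeof hops);
        hops++;
        memcpy(buf, &hops, sizeof hops);

        /* Copy downstream, then publish with a release store. */
        memcpy(out->data, buf, sizeof buf);
        atomic_store_explicit(&out->ready, 1, memory_order_release);
    }
    return NULL;
}

/* Passes one token around the ring `laps` times; returns the hop count,
 * or -1 on invalid arguments. */
int run_ring(int nthreads, int laps, int use_rt)
{
    struct ring r;
    pthread_t tid[MAX_THREADS];
    struct worker_arg arg[MAX_THREADS];
    int hops;

    if (nthreads < 1 || nthreads > MAX_THREADS || laps < 1)
        return -1;

    r.nthreads = nthreads;
    r.laps = laps;
    r.use_rt = use_rt;
    for (int i = 0; i < nthreads; i++) {
        atomic_init(&r.box[i].ready, 0);
        memset(r.box[i].data, 0, sizeof r.box[i].data);
    }
    /* Inject the initial token (hop count 0) into thread 0's mailbox. */
    atomic_store_explicit(&r.box[0].ready, 1, memory_order_release);

    for (int i = 0; i < nthreads; i++) {
        int rc;
        arg[i].ring = &r;
        arg[i].id = i;
        rc = pthread_create(&tid[i], NULL, worker, &arg[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);

    /* The last hand-off of the last lap lands back in mailbox 0. */
    memcpy(&hops, r.box[0].data, sizeof hops);
    return hops;
}
```

Calling `run_ring(16, 1000, 1)` would approximate the 16-thread ring with the real-time settings requested. With `use_rt` set to 0 it runs unprivileged, and `run_ring(4, 3, 0)` returns 12, one hop per thread per lap. Only one token is ever in flight here, which is why a single flag per mailbox is enough in this sketch.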

Responses