DNS query latency.

Hi Everyone,

We are having issues with DNS queries. I have two app servers running Oracle in the back end, and they use our public DNS. Our DNS server is a Windows box. Customers call saying the application loads very slowly. When I go to the app server and run dig +trace, it hangs for about 30 seconds, up to 1 minute.

1- Checked system resources on the app servers and they are fine.
2- Checked resources on the DNS server box and they are fine too, with no network errors or any other indication of an issue.
3- Checked network connectivity from the client to the DNS server. No issue.

Once I changed the primary from our public DNS to 8.8.8.8, it fixed the issue, but the next day around the same time the application got slow again. I ran the same dig +trace command and it hung again. Changing back to our public DNS fixed the issue, but of course only until the next day. Again around the same time it hung, and after 30-40 minutes the issue resolved itself.

I have run out of ideas about what might be wrong and how to diagnose the issue. The Windows guys tell me that the DNS server is good and there is no problem there. The network team has the same answer. So please advise what to do.

Thanks

Responses

Can we get additional information?
What version of RHEL are you running?
What is your resolver order (grep ^hosts /etc/nsswitch.conf)?
Are your lookups primarily on-premise, or do they require external servers?

We had some issues in the past and there was an nscd update (I believe) that we applied. We also deployed dnsmasq in some other environments, but those hosts generally do not do any "public" lookups. Can you tweak your client settings to not cache?

If you are able to pick a time when the host is having an issue you could use tcpdump and then review the dump with Wireshark to determine what is going on with the packets.

sudo tcpdump -i eth0 'udp port 53' -w tcpdump_port53.pcap

Then open the file in Wireshark. Unfortunately, at this point I would need to do more research to give a definitive direction, but I believe you may be able to get an idea from the output (specifically the timing of events and which events actually take place).
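Since the slowdown seems to come back around the same time each day, it may help to log lookup times continuously so the capture can be lined up with the slow window. Below is a minimal sketch, not a tested tool: the function names, the placeholder hostname app-db.example.com and the log path /tmp/dns_latency.log are all made up for illustration. It uses getent hosts, which follows the order in /etc/nsswitch.conf the way most applications do (dig talks to the DNS server directly and skips that path).

```shell
#!/bin/bash
# Hypothetical helper: time a single name lookup in whole seconds.
# LOOKUP_CMD defaults to "getent hosts", which goes through nsswitch.conf.
LOOKUP_CMD=${LOOKUP_CMD:-getent hosts}

time_lookup() {
    local name start end
    name=$1
    start=$(date +%s)
    # Output is discarded; only the elapsed time matters here.
    $LOOKUP_CMD "$name" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

log_lookup() {
    # Prints one line: <date> <time> <name> <seconds>s
    echo "$(date '+%Y-%m-%d %H:%M:%S') $1 $(time_lookup "$1")s"
}

# Example loop, one sample per minute (hostname and path are placeholders):
# while true; do log_lookup app-db.example.com >> /tmp/dns_latency.log; sleep 60; done
```

Whole-second resolution is coarse, but the hangs described here are 30 seconds or more, so it should be enough to spot when the slow period starts and ends.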

+1 to this (I was about to write the same thing).

I would first collect the DNS queries and check which hostnames it is attempting to resolve, and how frequently it is sending requests during the periods of slowdown.
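As a rough way to do that without Wireshark, the capture can be read back with tcpdump and the queried names counted. This is only a sketch: count_queries is a made-up helper name, and it assumes tcpdump prints queries in its usual text form (for example "1234+ A? db.example.com. (32)"), which can differ between versions.

```shell
#!/bin/bash
# Hypothetical helper: count queried names from tcpdump text output on stdin.
count_queries() {
    # A query line carries the record type followed by "?" (A?, AAAA?, PTR?),
    # and the next field is the name being asked for.
    awk '{ for (i = 1; i < NF; i++) if ($i ~ /^[A-Z]+\?$/) print $(i + 1) }' |
        sort | uniq -c | sort -rn
}

# Usage against the capture file, looking at outgoing queries only:
# tcpdump -nn -r tcpdump_port53.pcap 'udp dst port 53' | count_queries
```

The most frequently queried names end up at the top. A name that is asked for over and over during the slow period would be a good candidate to look at first.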

I would also check the general interface statistics/usage during the period (perhaps use something like iptraf so you aren't relying on the network guys?).

Hi again,

Thanks for the responses. We are using RHEL 5.5. The hosts order in nsswitch.conf is "files dns", which I believe is the default. Lookups shouldn't go external, they are internal only.

So here is another question. I may be wrong, but as far as I know the client server does not cache. When I checked, the nscd daemon was turned off. Even if it were turned on, my understanding is that it would cache only the resolv.conf info, not the queries that were made against the DNS box. Is that right or wrong? I will try to capture the activity on port 53 if I see the issue again. For now this is all the info I have.

Thanks again.

Hi Farrukh,

nscd will locally cache any DNS queries made from the node it is running on. It will use the information in resolv.conf to find the name servers it needs, as any client would.

However, if you only need a DNS caching solution, I would propose that you seriously consider dnsmasq.

dnsmasq is in the RHEL 5 Desktop and server channels from here: https://rhn.redhat.com/rhn/software/packages/details/Overview.do?pid=492577.

It is a small, robust daemon that scales well (although that is not the use case here, I know) and is very fast.

Best regards,
Mark

I will second the recommendation for dnsmasq. They advertise dnsmasq as a "small networks" solution, which is a bit misleading.

In our situation, we have applications that are extremely latency-sensitive and we were having some odd issues with our core infrastructure DNS. We deployed dnsmasq to those systems and I have not heard anything about those systems since ;-)

Sorry for the late response,

Well, the issue has not happened since last week. So for now, as a small change, I have added the hostnames that are queried while the application is working to the local hosts file.
But I did not quite follow Mark's point. If my Linux boxes are clients of the DNS server, which is a Windows box, should I enable the caching daemon nscd on the clients, and is that good practice? I have not used dnsmasq yet, but it is my next step if the issue happens again.

Thanks for all.
