DNS query latency.
Hi Everyone,
We are having issues with DNS queries. I have two app servers with Oracle on the back end, and they use our public DNS, which runs on a Windows box. Customers call saying the application loads very slowly. When I go to the app server and run dig +trace, it hangs for anywhere from 30 seconds to 1 minute.
1- Checked system resources on the app servers; they are fine.
2- Checked resources on the DNS server box; they are fine too, with no network errors or any other indication of a problem.
3- Checked network connectivity from the client to the DNS server. No issue.
When I switched the primary resolver from our public DNS to 8.8.8.8, the issue went away, but the next day around the same time the application got slow again and dig +trace hung again. Changing back to our public DNS fixed it, of course only until the next day. Around the same time each day it hangs again, and after 30-40 minutes the issue resolves itself.
I have run out of ideas about what might be wrong and how to diagnose it. The Windows guys tell me the DNS server is fine, and the network team gives the same answer. Please advise what to do.
Thanks
Responses
Can we get additional information?
What version of RHEL are you running?
What is your resolver order (grep ^hosts /etc/nsswitch.conf)?
Are your lookups primarily on-premise, or do they require external servers?
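If it helps, the answers to the questions above can be gathered in one go. This is only a minimal sketch: the function name collect_dns_info is made up for illustration, and the default paths assume a standard RHEL layout.

```sh
#!/bin/sh
# Gather the details asked for above: release, resolver order, name servers.
# The three paths are parameters so the function can also be pointed at
# copies of the files; the defaults assume a standard RHEL layout.
collect_dns_info() {
    release_file=${1:-/etc/redhat-release}
    nsswitch_file=${2:-/etc/nsswitch.conf}
    resolv_file=${3:-/etc/resolv.conf}

    echo "== release =="
    cat "$release_file" 2>/dev/null || echo "(cannot read $release_file)"
    echo "== resolver order =="
    grep '^hosts' "$nsswitch_file" 2>/dev/null || echo "(no hosts line found)"
    echo "== name servers =="
    grep '^nameserver' "$resolv_file" 2>/dev/null || echo "(no nameserver lines found)"
}
```

Running collect_dns_info with no arguments on each app server and pasting the output here would cover the first two questions.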
We had some issues in the past and there was an nscd update (I believe) that we applied. We also deployed dnsmasq in some other environments, but those hosts generally do not do any "public" lookups. Could you tweak your client settings to not cache?
If you are able to pick a time when the host is having the issue, you could capture the DNS traffic with tcpdump and then review the capture in Wireshark to see what is going on with the packets.
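# -s 0 captures whole packets (older tcpdump versions truncate by default);
# 'port 53' also catches DNS over TCP, not just UDP
sudo tcpdump -i eth0 -s 0 -w tcpdump_port53.pcap 'port 53'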
Then open the file in Wireshark. Unfortunately, at this point I would need to do more research to point you in a definitive direction, but I believe you may be able to get an idea from the output (specifically the timing of events, and which events actually take place).
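Since the slowdown shows up around the same time each day, it may help to log lookup latency continuously so you know exactly when to run the capture. The following is a rough sketch, not a tested monitoring tool: the function names are made up, 192.0.2.53 is a placeholder for your in-house DNS server, and www.redhat.com is just an example name to look up.

```sh
#!/bin/sh
# Extract the "Query time" figure (in msec) from dig output read on stdin.
# Prints "timeout" if dig reported no query time (e.g. no server answered).
parse_query_time() {
    awk '/^;; Query time:/ { print $4; found = 1 }
         END { if (!found) print "timeout" }'
}

# Time one lookup against one server and print a timestamped log line.
# +tries=1 and +time=5 keep an unresponsive server from stalling the loop.
log_lookup() {
    server=$1
    name=$2
    result=$(dig "@$server" "$name" +noall +stats +tries=1 +time=5 2>/dev/null | parse_query_time)
    echo "$(date '+%Y-%m-%d %H:%M:%S') server=$server name=$name query_time_ms=$result"
}

# Example: sample both resolvers once a minute (Ctrl-C to stop).
# 192.0.2.53 is a placeholder for the in-house DNS server.
# while true; do
#     log_lookup 192.0.2.53 www.redhat.com
#     log_lookup 8.8.8.8 www.redhat.com
#     sleep 60
# done >> dns_latency.log
```

Comparing the two servers side by side in the log should show whether only one resolver slows down at that time of day, or both.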
+1 to this (I was about to write the same thing).
I would first collect the DNS queries and check which hostnames it is attempting to resolve, and how frequently it is sending requests during the periods of slowdown.
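A quick way to get that list of hostnames without opening Wireshark is to let tcpdump print the capture as text and count the queried names. This is a sketch that assumes tcpdump's usual one-line DNS output; count_queries is a made-up helper name, and the exact output format can vary between tcpdump versions.

```sh
#!/bin/sh
# Read tcpdump's text output on stdin and print "count name" for each
# queried hostname, most frequent first. A query line looks roughly like:
#   12:00:01.000000 IP 10.0.0.5.40000 > 8.8.8.8.53: 4660+ A? www.example.com. (33)
# The field ending in "?" is the record type; the next field is the name.
count_queries() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i ~ /^[A-Z0-9]+\?$/) { count[$(i + 1)]++; break }
        }
    }
    END { for (name in count) print count[name], name }' | sort -rn
}

# Usage against the capture file from the earlier post:
#   tcpdump -nn -r tcpdump_port53.pcap 'udp dst port 53' | count_queries
```

Running it once on a capture from a normal period and once on a capture from the slow period gives a simple before/after comparison of which names are being requested and how often.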
I would also check the general interface statistics/usage during the period (perhaps use something like iptraf so you aren't relying on the network guys?)
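If iptraf is not installed, the kernel's own counters are a reasonable fallback. A minimal sketch, assuming the standard /proc/net/dev column layout; iface_errors is a made-up helper name.

```sh
#!/bin/sh
# Print receive/transmit error and drop counters per interface.
# Reads /proc/net/dev by default; a different file can be passed instead.
iface_errors() {
    awk 'NR > 2 {
        sub(/:/, " ")   # some kernels print "eth0:1234" with no space
        print $1, "rx_errs=" $4, "rx_drop=" $5, "tx_errs=" $12, "tx_drop=" $13
    }' "${1:-/proc/net/dev}"
}
```

Running it a few minutes apart during the slow period and comparing the numbers shows whether errors or drops are climbing on the app server's side.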
Hi Farrukh,
nscd will locally cache any DNS queries made from the node it is running on. It uses the name servers listed in resolv.conf to resolve what it needs, as any client would.
However, if you only need a DNS caching solution, I would propose that you seriously consider dnsmasq.
dnsmasq is available in the RHEL 5 Desktop and Server channels: https://rhn.redhat.com/rhn/software/packages/details/Overview.do?pid=492577
It is a small, robust daemon that scales well (although that is not the use case here, I know) and is very fast.
Best regards,
Mark
I will second the recommendation for dnsmasq. They advertise dnsmasq as a "small networks" solution, which undersells it a bit.
In our situation, we have applications that are extremely latency-sensitive and we were having some odd issues with our core infrastructure DNS. We deployed dnsmasq to those systems and I have not heard anything about those systems since ;-)
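For anyone who wants to try it, a caching-only setup needs very little configuration. The sketch below is illustrative only, not our production config: gen_dnsmasq_conf is a made-up helper, 192.0.2.53 stands in for your in-house DNS server, and the cache size is an arbitrary value.

```sh
#!/bin/sh
# Print a minimal caching-only dnsmasq configuration to stdout.
# Arguments: one or more upstream DNS server addresses.
gen_dnsmasq_conf() {
    echo "# Answer only queries from this host"
    echo "listen-address=127.0.0.1"
    echo "bind-interfaces"
    echo "# Ignore /etc/resolv.conf; use only the upstreams listed below"
    echo "no-resolv"
    for upstream in "$@"; do
        echo "server=$upstream"
    done
    echo "# Number of names to cache (illustrative value)"
    echo "cache-size=1000"
}

# Usage (review the output before copying it to /etc/dnsmasq.conf):
#   gen_dnsmasq_conf 192.0.2.53 8.8.8.8 > dnsmasq.conf.example
```

With dnsmasq running this way, /etc/resolv.conf on the app server would point at it with a single "nameserver 127.0.0.1" line, so repeated lookups are answered from the local cache instead of going over the network.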
