BIND periodically fails to resolve external IPv4 recursive queries in RHEL
Issue
- Two of the nameservers in a particular data center are slow to resolve, or time out on, some recursive queries for external IPv4 (A) records. Eventually the named service must be restarted before they can successfully resolve A records in "problem" domains again.
- Occurs consistently with some "problem" domains, and sporadically with other domains. Specific examples of "problem" domains include www.example.com, www.example.com and others.
- Impact: Mail delivery fails when one of the failed queries is made in response to an MX lookup.
- The problem is more apparent on A records with very short TTL (time to live) values since they cannot be cached longer than the TTL allows.
- Using nscd flush has no effect.
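The symptom described above (slow or timed-out A record lookups against a specific nameserver) can be measured with a small probe. The following is a minimal sketch, not part of the original article: it hand-builds a single recursive A query, sends it over UDP to one nameserver, and reports the elapsed time and response code. The server address 192.0.2.53 is a hypothetical placeholder from the documentation address range, and the function names build_a_query and time_a_lookup are illustrative only.

```python
import socket
import struct
import time


def build_a_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet asking for an A record with recursion desired."""
    # Header: ID, flags (only the RD bit set = 0x0100), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def time_a_lookup(server: str, name: str, timeout: float = 5.0):
    """Send one A query to `server` over UDP.

    Returns (elapsed_seconds, rcode). rcode is None if the query timed out
    or the reply did not match the query ID.
    """
    query = build_a_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(query, (server, 53))
            response, _ = sock.recvfrom(4096)
        except socket.timeout:
            return time.monotonic() - start, None
        elapsed = time.monotonic() - start
    # Ignore replies that are too short or carry a different query ID.
    if len(response) < 12 or response[:2] != query[:2]:
        return elapsed, None
    flags = struct.unpack("!H", response[2:4])[0]
    return elapsed, flags & 0x000F  # low 4 bits of the flags field are the RCODE


if __name__ == "__main__":
    # Hypothetical nameserver address; replace with the server under test.
    elapsed, rcode = time_a_lookup("192.0.2.53", "www.example.com", timeout=5.0)
    if rcode is None:
        print(f"no usable reply after {elapsed:.2f}s")
    else:
        print(f"rcode={rcode} in {elapsed:.3f}s")
```

Running the probe repeatedly against an affected nameserver and an unaffected one, for the same name, gives a rough way to compare response times and to see whether failures show up as timeouts or as error response codes such as SERVFAIL (rcode 2).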
Environment
- Red Hat Enterprise Linux (RHEL)
- Note: Since this is a network issue, it can occur with any RHEL release.
- bind nameserver configured to perform recursion of external records for clients
- bind configuration (/etc/named.conf - zone declarations omitted)
options {
    directory "/";
    allow-transfer { trustedslaves; };
    allow-recursion { recursive; };
    auth-nxdomain no;
    version "cowbell++";
    max-ncache-ttl 10;
    statistics-file "/var/log/named.stats";
    memstatistics-file "/var/log/named.memstats";
    zone-statistics yes;
    dump-file "/var/log/named.dump";
    recursing-file "/var/log/named.recursing";
    querylog yes;
    notify no;
    listen-on port 53 {
        any;
    };
};
