dnsmasq service on OCP SNO fails to read /etc/resolv.conf file during system startup
Issue
- In a Single Node OpenShift (SNO) installation, the dnsmasq service fails to read the /etc/resolv.conf file during system startup. As a result, it cannot configure upstream DNS servers for query forwarding, and it answers all DNS queries from the SNO system's own IP address as if it were an authoritative DNS server.
- This problem was observed only on baremetal SNO cluster nodes with multiple NICs. When network interfaces are slow to come up during system startup, systemd does not wait for networking to be fully up before starting dnsmasq.service.
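One way to check the startup ordering described above is to inspect the After= list of dnsmasq.service on an affected node. The following is a minimal sketch, not part of the product: the function name has_network_online_ordering is hypothetical, and the check assumes that waiting for full network startup would be expressed as ordering after network-online.target (the standard systemd target for that purpose), which this article does not confirm for dnsmasq.service.

```shell
# Hypothetical helper: reads `systemctl show <unit> -p After` output on stdin
# and succeeds only if network-online.target is in the ordering list.
# Assumption: "wait for networking" means ordering after network-online.target.
has_network_online_ordering() {
    # Keep the After= line, split it into one word per line,
    # then look for an exact, literal match of the target name.
    grep -E '^After=' | tr ' =' '\n\n' | grep -qxF 'network-online.target'
}

# Usage on the node:
#   if systemctl show dnsmasq.service -p After | has_network_online_ordering; then
#       echo "dnsmasq.service is ordered after network-online.target"
#   else
#       echo "dnsmasq.service is NOT ordered after network-online.target"
#   fi
```

After= only orders units relative to each other. The target takes part in the boot transaction only if some unit also wants or requires it, so a positive result here is a hint rather than proof that dnsmasq waits for the network.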
Boot logs
Dec 07 19:49:51 snotest.example.local systemd[1]: Starting Run dnsmasq to provide local dns for Single Node OpenShift...
Dec 07 19:49:51 snotest.example.local systemd[1]: Started Run dnsmasq to provide local dns for Single Node OpenShift.
Dec 07 19:49:51 snotest.example.local dnsmasq[3138]: started, version 2.85 cachesize 150
Dec 07 19:49:51 snotest.example.local dnsmasq[3138]: failed to read /etc/resolv.conf: Permission denied
Dec 07 19:49:51 snotest.example.local dnsmasq[3138]: no servers found in /etc/resolv.conf, will retry
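To check whether a node hit the same failure, the journal can be filtered for the two dnsmasq messages shown in the boot logs above. This is a minimal sketch: the function name find_resolv_conf_failures is hypothetical, and the pattern is taken only from the log lines in this article, so other dnsmasq versions may word the messages differently.

```shell
# Hypothetical helper: reads journal text on stdin and prints only the
# dnsmasq lines that match the failure signature from the boot logs.
# Exits non-zero when no matching line is found.
find_resolv_conf_failures() {
    grep -E 'dnsmasq\[[0-9]+\]: (failed to read /etc/resolv\.conf|no servers found in /etc/resolv\.conf)'
}

# Usage on the node (current boot only):
#   journalctl -b -u dnsmasq.service --no-pager | find_resolv_conf_failures
```

No output, with a non-zero exit status, means that neither message was logged for the current boot.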
Packet Capture
# tcpdump -i any port 53 -nnn -vvv | grep test.example.local
192.x.x.101.38853 > 192.x.x.101.53: [bad udp cksum 0xb603 -> 0x927a!] 63832+ [1au] A? test.example.local.labs.example.local. ar: . OPT UDPsize=4096 [COOKIE d8bac0de5e86d3f1] (101)
192.x.x.101.53 > 192.x.x.101.38853: [bad udp cksum 0xb5f7 -> 0x7206!] 63832 NXDomain q: A? test.example.local.labs.example.local. 0/0/1 ar: . OPT UDPsize=1232 (89)
192.x.x.101.58508 > 192.x.x.101.53: [bad udp cksum 0xb5f3 -> 0xdda2!] 29607+ [1au] A? test.example.local.lab1.example.local. ar: . OPT UDPsize=4096 [COOKIE d8bac0de5e86d3f1] (85)
192.x.x.101.53 > 192.x.x.101.58508: [bad udp cksum 0xb5e7 -> 0xbd2e!] 29607 NXDomain q: A? test.example.local.lab1.example.local. 0/0/1 ar: . OPT UDPsize=1232 (73)
192.x.x.101.49840 > 192.x.x.101.53: [bad udp cksum 0xb5e0 -> 0x4a66!] 2663+ [1au] A? test.example.local. ar: . OPT UDPsize=4096 [COOKIE d8bac0de5e86d3f1] (66)
192.x.x.101.53 > 192.x.x.101.49840: [bad udp cksum 0xb5d4 -> 0xa17a!] 2663 NXDomain q: A? test.example.local. 0/0/1 ar: . OPT UDPsize=1232 (54)
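In the capture above, every NXDomain response comes from port 53 on the same address that sent the query, which indicates that the node answers locally instead of forwarding upstream. The sketch below filters saved tcpdump output for that pattern. The function name self_answered_nxdomain is hypothetical, and it assumes IPv4 lines in the same layout as shown here (source address.port, then ">", then destination address.port).

```shell
# Hypothetical helper: reads tcpdump text on stdin and prints NXDomain
# responses whose source and destination IP addresses are identical.
# Assumption: IPv4 lines laid out as "SRC.PORT > DST.PORT: ..." as in the capture.
self_answered_nxdomain() {
    awk '/NXDomain/ {
        src = $1; dst = $3
        sub(/\.[0-9]+$/, "", src)      # drop the source port
        sub(/\.[0-9]+:?$/, "", dst)    # drop the destination port and trailing colon
        if (src == dst) print
    }'
}

# Usage on the node (stop the capture with Ctrl+C):
#   tcpdump -i any port 53 -nnn -vvv | self_answered_nxdomain
```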
Environment
- Red Hat OpenShift Container Platform (RHOCP) 4.16 and later