Worker node name resolution fails after updating to RHEL 7.9


Issue

  • After an OpenShift Container Platform worker node starts, the network or the SDN appears to be only partially working, and the following error is logged:

    Oct 15 09:04:00 ip-10-140-10-132.eu-central-1.compute.internal dockerd-current[1909]: time="2020-10-15T07:04:16.309085682+02:00" level=error msg="Attempting next endpoint for pull after error: Get https://docker-registry.default.svc:5000/v2/: dial tcp: lookup docker-registry.default.svc on 10.140.10.2:53: no such host"
    
  • After 15 to 30 minutes the issue disappears and name resolution works as expected.

  • The issue appeared after updating to Red Hat Enterprise Linux 7.9.
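
When triaging this symptom it can help to pull the queried hostname and the DNS server out of the error message, so they can be compared with the nameserver the node is expected to use. The following is a minimal sketch in Python, not part of the original solution. The function names `parse_lookup_error` and `resolves` are hypothetical, and the pattern assumes the "lookup <host> on <server>:<port>: no such host" wording shown in the log line above.

```python
import re
import socket

# Matches the resolver error embedded in the dockerd message, e.g.
# "lookup docker-registry.default.svc on 10.140.10.2:53: no such host".
# Assumption: the wording follows the log line quoted in this article.
LOOKUP_ERROR = re.compile(
    r"lookup (?P<host>\S+) on (?P<server>\S+):(?P<port>\d+): no such host"
)


def parse_lookup_error(line):
    """Return (host, server, port) from a 'no such host' line, or None."""
    match = LOOKUP_ERROR.search(line)
    if match is None:
        return None
    return match.group("host"), match.group("server"), int(match.group("port"))


def resolves(host):
    """Return True if the system resolver can currently resolve host."""
    try:
        socket.getaddrinfo(host, None)
    except socket.gaierror:
        return False
    return True


if __name__ == "__main__":
    sample = (
        'level=error msg="Attempting next endpoint for pull after error: '
        "Get https://docker-registry.default.svc:5000/v2/: dial tcp: "
        "lookup docker-registry.default.svc on 10.140.10.2:53: "
        'no such host"'
    )
    parsed = parse_lookup_error(sample)
    print(parsed)
    if parsed is not None:
        # Run on the affected node to see whether the name resolves yet.
        print("resolves now:", resolves(parsed[0]))
```

Running `resolves("docker-registry.default.svc")` on the affected node at intervals after boot would show when name resolution recovers.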

Environment

  • Red Hat OpenShift Container Platform (OCP) 3.11
  • Red Hat Enterprise Linux 7.9
  • Package cloud-init-19.4-7.el7.x86_64
