TLS handshake fails due to large packets discarded for OpenShift 4 on Azure

Solution Verified - Updated -

Issue

  • TLS handshakes fail even though TCP connections can be established. A traffic capture shows large packets being discarded (see "Diagnostic Steps").
  • Unexpected ICMP "fragmentation needed" messages are received for direct traffic between OpenShift nodes, that is, traffic without VXLAN encapsulation. The MTU requested in these messages is lower than the MTU set on both ends and/or the MTU required by any intermediate element.
  • Routing cache shows bad entries as described in "Diagnostic Steps".
  • After some time, the OpenShift cluster becomes very slow and many operators become unhealthy (degraded state).
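
The routing cache symptom above can be illustrated with a small sketch. It is not the article's own diagnostic procedure. It assumes the output of `ip route get <destination>` has been captured on a node, and that a learned path-MTU exception appears on a `cache ... mtu N` line, as Linux iproute2 normally prints it. The function name `find_low_pmtu`, the sample addresses, and the MTU values are hypothetical and for illustration only.

```python
import re
from typing import List

# Matches the "cache ... mtu N" line that `ip route get` prints when the
# kernel holds a learned path-MTU exception for a destination.
CACHE_MTU_RE = re.compile(r"\bcache\b.*\bmtu\s+(\d+)")


def find_low_pmtu(route_output: str, expected_mtu: int) -> List[int]:
    """Return cached path-MTU values in route_output below expected_mtu."""
    found = []
    for line in route_output.splitlines():
        match = CACHE_MTU_RE.search(line)
        if match and int(match.group(1)) < expected_mtu:
            found.append(int(match.group(1)))
    return found


# Hypothetical sample; addresses, interface name, and values are illustrative.
SAMPLE = """10.0.0.5 dev eth0 src 10.0.0.4 uid 0
    cache expires 594sec mtu 1400
"""

if __name__ == "__main__":
    # expected_mtu is an assumption: use the MTU configured on the node interface.
    print(find_low_pmtu(SAMPLE, expected_mtu=1500))
```

A non-empty result would suggest that the node has cached a path MTU lower than the one configured on its interface, which is consistent with the unexpected ICMP messages described above.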

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • OpenShift SDN
      • 4.6
      • 4.7
    • OVN-Kubernetes
      • 4.8
      • 4.9
      • 4.10
  • Azure
