Why is keepalived not performing a failover upon network restart?

Environment

  • Red Hat Enterprise Linux (RHEL) 7.
  • Keepalived (All Versions).
  • NetworkManager-1.0.4-9.el7 and above.

Issue

  • When NetworkManager is running on the system and the network service is restarted:
  1. Keepalived loses the VIP.
  2. Keepalived does not perform a failover.

Resolution

  • When NetworkManager is running on the system, restarting the network service is NOT recommended.
  • To verify keepalived failover, use ip link to mark the interface link down instead:
 # ip link set down dev <interface-name>
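
The check above can be wrapped in a small script. The following is a minimal sketch, not a supported tool: the function names, the 5-second wait, and the example values ens3 and 10.0.0.2 are assumptions chosen for illustration, and it assumes keepalived releases the VIP from the local interface once it detects the link going down. Run it as root on the node that currently holds the VIP.

```shell
#!/bin/sh
# Sketch: simulate a link failure and check whether the VIP leaves this node.
# All names and values here are illustrative assumptions.

# Succeeds if the address $1 appears in `ip -o -4 addr show` output on stdin.
# The trailing "/" stops 10.0.0.2 from matching 10.0.0.20.
has_vip() {
    grep -qF -- "inet $1/"
}

# $1 = interface name, $2 = VIP without prefix length.
verify_failover() {
    ip link set down dev "$1"
    sleep 5   # assumed wait; tune to your keepalived timers
    if ip -o -4 addr show dev "$1" | has_vip "$2"; then
        echo "VIP $2 still present on $1: no failover detected"
    else
        echo "VIP $2 released from $1: failover appears to have triggered"
    fi
    ip link set up dev "$1"   # restore the link after the test
}

# Example invocation (hypothetical values):
#   verify_failover ens3 10.0.0.2
```

Running the same `ip -o -4 addr show` check on the peer node should show whether it has taken over the VIP.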

Root Cause

  • Keepalived tracks the interface link status and performs a failover if the link goes down.

  • When NetworkManager is running on the system and the network service stops, the interface link status is not marked down. Because the link status does not change, keepalived does not perform a failover.

  • However, when NetworkManager is not running and the network service stops, the interface link status is marked down. Keepalived detects the link going down and performs the failover.
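
To tell which of the two cases applies on a given node, the state of the NetworkManager service can be checked first. The helper below is a hypothetical sketch (the function name and messages are not part of keepalived or NetworkManager); it only maps the output of systemctl is-active to the behavior described above, and treats any state other than "active" as NetworkManager not running.

```shell
#!/bin/sh
# Sketch: map NetworkManager's service state to the expected keepalived
# behavior when the network service stops. Names are illustrative.

# $1 = output of `systemctl is-active NetworkManager`
expect_on_network_stop() {
    case "$1" in
        active) echo "no failover expected: link status stays up" ;;
        *)      echo "failover expected: link status goes down" ;;
    esac
}

# Usage on a live system:
#   expect_on_network_stop "$(systemctl is-active NetworkManager)"
```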

Diagnostic Steps

  • When NetworkManager is not running and the network service stops, the interface is brought down, its IP addresses are removed, and the interface link status is marked down.
Interface status while the network service is running:

# ethtool ens3 | grep -i "link det"
    Link detected: yes

# ip a s ens3
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:00:00:00:00:aa brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.1/24 brd 10.0.0.255 scope global dynamic ens3
       valid_lft 3583sec preferred_lft 3583sec
    inet 10.0.0.2/24 scope global secondary keepalived
       valid_lft forever preferred_lft forever


# systemctl stop network

# ip a s ens3
2: ens3: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN qlen 1000
    link/ether 00:00:00:00:00:aa brd ff:ff:ff:ff:ff:ff

# ethtool ens3 | grep -i "link det"
    Link detected: no

  • When NetworkManager is running and the network service stops, the interface's IP addresses are removed, but the interface link status is not marked down.
Interface status while the network service is running:

# ethtool ens3 | grep -i "link det"
    Link detected: yes

# ip a s ens3
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:00:00:00:00:aa brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.1/24 brd 10.0.0.255 scope global dynamic ens3
       valid_lft 3583sec preferred_lft 3583sec
    inet 10.0.0.2/24 scope global secondary abc
       valid_lft forever preferred_lft forever

# systemctl stop network

# ip a s ens3
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:00:00:00:00:aa brd ff:ff:ff:ff:ff:ff

# ethtool ens3 | grep -i "link det"
    Link detected: yes
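
The difference between the two cases is visible in the flag list on the first line of the ip output (UP present or absent). As a rough sketch, the flags can be extracted and tested with the hypothetical helpers below. They inspect only the flags printed by ip and are an approximation of the link state, not a reproduction of keepalived's own tracking logic.

```shell
#!/bin/sh
# Sketch: read the interface flags from `ip a s <interface>` output.
# Function names are illustrative assumptions.

# Print the comma-separated flag list from the first line of stdin.
link_flags() {
    sed -n '1s/^[^<]*<\([^>]*\)>.*$/\1/p'
}

# $1 = flag list, $2 = flag name. Succeeds only on an exact token match,
# so LOWER_UP alone does not count as UP.
has_flag() {
    case ",$1," in
        *",$2,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Read `ip a s <interface>` output on stdin and report the UP flag.
link_report() {
    flags=$(link_flags)
    if has_flag "$flags" UP; then
        echo "link up"
    else
        echo "link down"
    fi
}

# Usage on a live system:
#   ip a s ens3 | link_report
```

Run after stopping the network service, this prints "link down" for the first case above and "link up" for the NetworkManager case.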
