Chapter 3. Recovery for HA applications with RWX storage

Applications that are using topologyKey: kubernetes.io/hostname or no topology configuration, have no protection against all of the application replicas being in the same zone.

Note

This can happen even with podAntiAffinity and topologyKey: kubernetes.io/hostname in the Pod spec because this anti-affinity rule is host-based and not zone-based.

If this happens and all replicas are located in the zone that fails, the application using ReadWriteMany (RWX) storage takes 6-8 minutes to recover on the active zone. This pause is for the OpenShift Container Platform nodes in the failed zone to become NotReady (60 seconds) and then for the default pod eviction timeout to expire (300 seconds).