Behavior of glusterfs-storage daemonset

When applying a NoSchedule taint to a node, I expected OCP to reschedule the pods running on that node EXCEPT for those managed by daemonsets. This held true for the network, logging, and other daemonset services, but not for GlusterFS: it killed the pods on the nodes I tainted, which caused a number of glusterfs volumes to go offline. Once I cleaned that up, I added tolerations to the glusterfs daemonset definition and ran into another issue: there appears to be a limit on the number of tolerations you can add. I added three to the daemonset, but when the pods were spun up, one of my tolerations was missing from the pod.
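For reference, this is roughly the shape of the tolerations I am trying to apply. The keys and values here are placeholders, not my actual taints, and this is a minimal sketch of a DaemonSet pod template rather than the full glusterfs-storage definition. One thing I double-checked: the tolerations sit under spec.template.spec (the pod template), not at the top level of the DaemonSet spec, and each toleration is its own list entry.

```yaml
# Hypothetical excerpt of the glusterfs-storage DaemonSet pod template.
# Taint keys/values below are illustrative placeholders.
spec:
  template:
    spec:
      tolerations:
      - key: "storagenode"        # must match the taint key on the node
        operator: "Equal"
        value: "glusterfs"
        effect: "NoSchedule"
      - key: "maintenance"
        operator: "Exists"        # tolerate this taint regardless of its value
        effect: "NoSchedule"
      - key: "dedicated"
        operator: "Equal"
        value: "storage"
        effect: "NoSchedule"
```

All three tolerations appear in the daemonset definition as shown, yet only two of them show up on the spawned pods.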

Attached are two files: one with the daemonset definition, and the other with the output of `oc describe pod` on one of the glusterfs pods. What am I doing wrong here?