How to add RHEL worker nodes when there is a MachineConfig configuring partitions for worker nodes in OCP 4

Solution Unverified

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • 4.6

Issue

  • When a MachineConfig that configures disk partitions is applied to the worker role in OpenShift 4, adding RHEL worker nodes with the Ansible playbook fails:

    TASK [openshift_node : Fetch bootstrap ignition file locally] ****************************************************************************
    Friday 23 April 2021  12:20:03 +0800 (0:00:00.441)       0:10:49.397 ********** 
    FAILED - RETRYING: Fetch bootstrap ignition file locally (60 retries left).
    …
    FAILED - RETRYING: Fetch bootstrap ignition file locally (1 retries left).
    fatal: [worker-3.ocp4.example.com]: FAILED! => {"attempts": 60, "changed": false, "connection": "close", "content": "", "content_length": "0", "date": "Fri, 23 Apr 2021 04:30:37 GMT", "elapsed": 0, "msg": "Status code was 500 and not [200]: HTTP Error 500: Internal Server Error", "path": "/tmp/ansible.avZNjc/bootstrap.ign", "redirected": false, "status": 500, "url": "https://api-int.ocp4.example.com:22623/config/worker"}
    
  • The machine-config-server pod logs the following error while the RHEL worker nodes are being scaled up:

    $ oc -n openshift-machine-config-operator logs machine-config-server-xwpch | tail -n2
    I0423 04:30:26.733047       1 api.go:117] Pool worker requested by address:"10.0.0.1:60336" User-Agent:"Ignition/0.35.0" Accept-Header: ""
    E0423 04:30:26.845757       1 api.go:155] couldn't convert config for req: {worker 0xc00075d4c0}, error: failed to convert config from spec v3.1 to v2.2: unable to convert Ignition spec v3 config to v2: SizeMiB and StartMiB in Storage.Disks.Partitions is not supported on 2.2
    

Resolution

If the OpenShift 4 cluster uses both RHCOS and RHEL worker nodes and a MachineConfig configures partitions for the worker role, scaling up RHEL worker nodes fails because of the partition settings. See how to use the machine-config settings (Disk partitioning) to create the partitions for RHCOS.

Important: as shown in the above document, Kubernetes supports only two filesystem partitions. If you add more than one partition to the original configuration, Kubernetes cannot monitor all of them, and this can cause issues in the cluster. Refer to Understanding OpenShift File System Monitoring (eviction conditions) for additional information.

Use the following workaround so that the partition settings in the MachineConfig are not applied to the RHEL worker nodes.

Workaround

1. Create a new machineconfigpool named worker-rhel for the RHEL worker nodes so that the partition settings are excluded. New RHEL worker nodes should use this machineconfigpool.

# oc create -f mcp-worker-rhel.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: worker-rhel
spec:
  machineConfigSelector:
    matchExpressions:
      - {key: machineconfiguration.openshift.io/role, operator: In, values: [worker]}
      - {key: node.openshift.io/os_id, operator: NotIn, values: [rhcos]}
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/worker: ""
      node.openshift.io/os_id: rhel

2. (Sample) The following partition settings are already applied to the OpenShift 4 cluster. Make sure that the worker partition MachineConfig carries the two labels that the machineConfigSelector in step 1 matches on.

One is machineconfiguration.openshift.io/role: worker, and the other is node.openshift.io/os_id: rhcos:

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  creationTimestamp: null
  labels:
    machineconfiguration.openshift.io/role: worker
    node.openshift.io/os_id: rhcos
  name: 98-worker-partition
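
The label requirement above can be checked from the command line. The following is a minimal sketch, assuming a logged-in oc client with cluster-admin access; the helper name has_rhcos_only_labels is hypothetical and not part of any product tooling. It lists the MachineConfigs that carry both labels and tests whether the given name is among them.

```shell
# Hypothetical helper: succeeds if the named MachineConfig carries BOTH labels
#   machineconfiguration.openshift.io/role=worker
#   node.openshift.io/os_id=rhcos
# Assumes `oc` is already logged in to the cluster.
has_rhcos_only_labels() {
  oc get machineconfig \
    -l 'machineconfiguration.openshift.io/role=worker,node.openshift.io/os_id=rhcos' \
    -o name | grep -q "/$1\$"
}

# Example:
#   has_rhcos_only_labels 98-worker-partition && echo "labels OK"
```

If the check fails for 98-worker-partition, the os_id label is likely missing, and the worker-rhel pool would still pick up the partition settings.
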

The partition settings for the master nodes:

99_openshift-machineconfig_98-master-partition.yaml

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  creationTimestamp: null
  labels:
    machineconfiguration.openshift.io/role: master
  name: 98-master-partition
spec:
  config:
    ignition:
      version: 3.1.0
    storage:
      disks:
      - device: /dev/sda
        partitions:
        - label: var-log
          number: 5
          sizeMiB: 10240
          startMiB: 40960
        - label: var-lib-kubelet
          number: 6
          sizeMiB: 30720
          startMiB: 51200
        - label: var-lib-containers
          number: 7
          sizeMiB: 0
          startMiB: 81920
      filesystems:
      - device: /dev/disk/by-partlabel/var-log
        format: xfs
        path: /var/log
      - device: /dev/disk/by-partlabel/var-lib-kubelet
        format: xfs
        path: /var/lib/kubelet
      - device: /dev/disk/by-partlabel/var-lib-containers
        format: xfs
        path: /var/lib/containers
    systemd:
      units:
      - contents: |
          [Unit]
          Before=local-fs.target
          [Mount]
          Where=/var/log
          What=/dev/disk/by-partlabel/var-log
          Options=rw,relatime,seclabel,attr2,inode64,prjquota
          [Install]
          WantedBy=local-fs.target
        enabled: true
        name: var-log.mount
      - contents: |
          [Unit]
          Before=local-fs.target
          [Mount]
          Where=/var/lib/kubelet
          What=/dev/disk/by-partlabel/var-lib-kubelet
          Options=rw,relatime,seclabel,attr2,inode64,prjquota
          [Install]
          WantedBy=local-fs.target
        enabled: true
        name: var-lib-kubelet.mount
      - contents: |
          [Unit]
          Before=local-fs.target
          [Mount]
          Where=/var/lib/containers
          What=/dev/disk/by-partlabel/var-lib-containers
          Options=rw,relatime,seclabel,attr2,inode64,prjquota
          [Install]
          WantedBy=local-fs.target
        enabled: true
        name: var-lib-containers.mount

The partition settings for the worker nodes:

99_openshift-machineconfig_98-worker-partition.yaml

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  creationTimestamp: null
  labels:
    machineconfiguration.openshift.io/role: worker
    node.openshift.io/os_id: rhcos
  name: 98-worker-partition
spec:
  config:
    ignition:
      version: 3.1.0
    storage:
      disks:
      - device: /dev/sda
        partitions:
        - label: var-log
          number: 5
          sizeMiB: 10240
          startMiB: 40960
        - label: var-lib-kubelet
          number: 6
          sizeMiB: 30720
          startMiB: 51200
        - label: var-lib-containers
          number: 7
          sizeMiB: 0
          startMiB: 81920
      filesystems:
      - device: /dev/disk/by-partlabel/var-log
        format: xfs
        path: /var/log
      - device: /dev/disk/by-partlabel/var-lib-kubelet
        format: xfs
        path: /var/lib/kubelet
      - device: /dev/disk/by-partlabel/var-lib-containers
        format: xfs
        path: /var/lib/containers
    systemd:
      units:
      - contents: |
          [Unit]
          Before=local-fs.target
          [Mount]
          Where=/var/log
          What=/dev/disk/by-partlabel/var-log
          Options=rw,relatime,seclabel,attr2,inode64,prjquota
          [Install]
          WantedBy=local-fs.target
        enabled: true
        name: var-log.mount
      - contents: |
          [Unit]
          Before=local-fs.target
          [Mount]
          Where=/var/lib/kubelet
          What=/dev/disk/by-partlabel/var-lib-kubelet
          Options=rw,relatime,seclabel,attr2,inode64,prjquota
          [Install]
          WantedBy=local-fs.target
        enabled: true
        name: var-lib-kubelet.mount
      - contents: |
          [Unit]
          Before=local-fs.target
          [Mount]
          Where=/var/lib/containers
          What=/dev/disk/by-partlabel/var-lib-containers
          Options=rw,relatime,seclabel,attr2,inode64,prjquota
          [Install]
          WantedBy=local-fs.target
        enabled: true
        name: var-lib-containers.mount

The resulting RHCOS partition layout looks like this:

[core@worker-2 ~]$ df -hT
Filesystem                           Type      Size  Used Avail Use% Mounted on
devtmpfs                             devtmpfs  7.8G     0  7.8G   0% /dev
tmpfs                                tmpfs     7.9G  168K  7.9G   1% /dev/shm
tmpfs                                tmpfs     7.9G   44M  7.8G   1% /run
tmpfs                                tmpfs     7.9G     0  7.9G   0% /sys/fs/cgroup
/dev/mapper/coreos-luks-root-nocrypt xfs        40G  3.4G   37G   9% /sysroot
tmpfs                                tmpfs     7.9G  4.0K  7.9G   1% /tmp
/dev/sda7                            xfs        40G  2.4G   38G   6% /var/lib/containers
/dev/sda5                            xfs        10G  433M  9.6G   5% /var/log
/dev/sda1                            ext4      364M  176M  165M  52% /boot
/dev/sda2                            vfat      127M  6.9M  120M   6% /boot/efi
/dev/sda6                            xfs        30G 1007M   30G   4% /var/lib/kubelet
tmpfs                                tmpfs     1.6G  8.0K  1.6G   1% /run/user/1000

3. Make sure that the variable openshift_node_machineconfigpool=worker-rhel is set in the inventory file, so that the RHEL worker nodes are initialized from the worker-rhel machineconfigpool during the scale-up.

[all:vars]
ansible_user=root
#ansible_become=True 

openshift_kubeconfig_path="/var/www/html/ocp4/ign/auth/kubeconfig"
openshift_node_machineconfigpool=worker-rhel

[new_workers]
worker-3.ocp4.example.com
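
Before running the scale-up, the inventory can be checked for the variable described in this step. This is a minimal sketch: the helper name inventory_uses_pool and the inventory path are assumptions, and the playbook invocation in the comment follows the usual openshift-ansible layout rather than anything stated in this article.

```shell
# Hypothetical helper: succeeds if the inventory file ($1) sets
# openshift_node_machineconfigpool to the expected pool name ($2).
inventory_uses_pool() {
  grep -Eq "^openshift_node_machineconfigpool=$2[[:space:]]*\$" "$1"
}

# Example (paths are assumptions; adjust to your environment):
#   cd /usr/share/ansible/openshift-ansible
#   inventory_uses_pool inventory/hosts worker-rhel &&
#     ansible-playbook -i inventory/hosts playbooks/scaleup.yml
```
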

4. Confirm the result of the scale-up. Use the following commands to verify that the new node (worker-3) has been added to the OpenShift 4 cluster and that the machineconfigpool settings are correct.

# oc get node
NAME                        STATUS   ROLES    AGE    VERSION
master-0.ocp4.example.com   Ready    master   15h    v1.19.0+3b01205
master-1.ocp4.example.com   Ready    master   15h    v1.19.0+3b01205
master-2.ocp4.example.com   Ready    master   15h    v1.19.0+3b01205
worker-0.ocp4.example.com   Ready    worker   13h    v1.19.0+3b01205
worker-1.ocp4.example.com   Ready    worker   13h    v1.19.0+3b01205
worker-2.ocp4.example.com   Ready    worker   13h    v1.19.0+3b01205
worker-3.ocp4.example.com   Ready    worker   5h5m   v1.19.0+a5a0987

# oc get mcp
NAME          CONFIG                                                  UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master        rendered-master-b133956217168c1f9f4e6bfd6725d7a8        True      False      False      3              3                   3                     0                      15h
worker        rendered-worker-1ae4d90d25165da1d3629db6043037cf        True      False      False      3              3                   3                     0                      15h
worker-rhel   rendered-worker-rhel-54597db01f4ca39ec1d4692f13a5faaa   True      False      False      1              1                   1                     0                      13h
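
To see which nodes the worker-rhel pool's nodeSelector matches, the same two labels can be used as a node selector. A minimal sketch, assuming a logged-in oc client; the helper name rhel_workers is hypothetical.

```shell
# Hypothetical helper: list the nodes that match the nodeSelector of the
# worker-rhel machineconfigpool (worker role label present, os_id=rhel).
rhel_workers() {
  oc get nodes \
    -l 'node-role.kubernetes.io/worker,node.openshift.io/os_id=rhel' \
    -o name
}

# Example:
#   rhel_workers
```

The number of nodes listed should agree with the MACHINECOUNT column of the worker-rhel pool.
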

Root Cause

Private Bug 1908906 has been opened and is currently being tracked by Engineering.

The partitions on RHEL worker nodes are usually configured manually by the user and are not controlled by the machine-config. In addition, if a machine-config that partitions /dev/sda or /dev/vda exists in the cluster and is applied to a RHEL node, it breaks the RHEL partition settings and prevents CRI-O from starting.

The scale-up playbook task Fetch bootstrap ignition file locally connects to the machine-config-server with the User-Agent "Ignition/0.35.0", so the server tries to return the configuration in Ignition spec 2.2. The sizeMiB and startMiB partition fields used by the MachineConfig are not supported in spec 2.2, so the conversion fails and the request returns HTTP 500.

$ vim /usr/share/ansible/openshift-ansible/roles/openshift_node/tasks/config.yml +54
- name: Fetch bootstrap ignition file locally
  uri:
    url: "{{ openshift_node_bootstrap_endpoint }}"
    dest: "{{ temp_dir.path }}/bootstrap.ign"
    validate_certs: false
    http_agent: "Ignition/0.35.0"
  delay: 10
  retries: 60
  register: bootstrap_ignition
  until:
  - bootstrap_ignition.status is defined
  - bootstrap_ignition.status == 200
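
The request made by this task can be reproduced by hand to check what the machine-config-server returns for a given pool. The sketch below is an assumption-laden illustration: the helper name mcs_status is hypothetical, the endpoint must be reachable from where the command runs (port 22623 is normally only reachable inside the cluster network), and certificate validation is skipped in the same way as validate_certs: false in the task.

```shell
# Hypothetical helper: print the HTTP status code returned by the
# machine-config-server for URL $1 when queried with User-Agent $2.
mcs_status() {
  curl -sk -o /dev/null -w '%{http_code}\n' -A "$2" "$1"
}

# Example, using the URL and User-Agent shown in this article:
#   mcs_status https://api-int.ocp4.example.com:22623/config/worker "Ignition/0.35.0"
#   mcs_status https://api-int.ocp4.example.com:22623/config/worker-rhel "Ignition/0.35.0"
```

Comparing the status code for the worker and worker-rhel paths gives a quick indication of whether the partition MachineConfig is still part of the configuration being served.
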

Diagnostic Steps

Check the differences between the worker and worker-rhel machineconfigpools. The worker-rhel machineconfigpool excludes the 98-worker-partition partition settings.

$ oc get mcp worker -o jsonpath='{.spec.configuration}' | jq .

{
  "name": "rendered-worker-1ae4d90d25165da1d3629db6043037cf",
  "source": [
    {
      "apiVersion": "machineconfiguration.openshift.io/v1",
      "kind": "MachineConfig",
      "name": "00-worker"
    },
    {
      "apiVersion": "machineconfiguration.openshift.io/v1",
      "kind": "MachineConfig",
      "name": "01-worker-container-runtime"
    },
    {
      "apiVersion": "machineconfiguration.openshift.io/v1",
      "kind": "MachineConfig",
      "name": "01-worker-kubelet"
    },
    {
      "apiVersion": "machineconfiguration.openshift.io/v1",
      "kind": "MachineConfig",
      "name": "98-worker-partition"
    },
    {
      "apiVersion": "machineconfiguration.openshift.io/v1",
      "kind": "MachineConfig",
      "name": "99-worker-generated-registries"
    },
    {
      "apiVersion": "machineconfiguration.openshift.io/v1",
      "kind": "MachineConfig",
      "name": "99-worker-ssh"
    }
  ]
}

$ oc get mcp worker-rhel -o jsonpath='{.spec.configuration}' | jq .

{
  "name": "rendered-worker-rhel-54597db01f4ca39ec1d4692f13a5faaa",
  "source": [
    {
      "apiVersion": "machineconfiguration.openshift.io/v1",
      "kind": "MachineConfig",
      "name": "00-worker"
    },
    {
      "apiVersion": "machineconfiguration.openshift.io/v1",
      "kind": "MachineConfig",
      "name": "01-worker-container-runtime"
    },
    {
      "apiVersion": "machineconfiguration.openshift.io/v1",
      "kind": "MachineConfig",
      "name": "01-worker-kubelet"
    },
    {
      "apiVersion": "machineconfiguration.openshift.io/v1",
      "kind": "MachineConfig",
      "name": "99-worker-generated-registries"
    },
    {
      "apiVersion": "machineconfiguration.openshift.io/v1",
      "kind": "MachineConfig",
      "name": "99-worker-ssh"
    }
  ]
}
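
Instead of comparing the two JSON listings by eye, the source names can be diffed directly. A minimal sketch, assuming a logged-in oc client; the helper names pool_sources and only_in_first are hypothetical.

```shell
# Hypothetical helper: print the MachineConfig names that make up the
# rendered configuration of a machineconfigpool, one per line.
pool_sources() {
  oc get mcp "$1" -o jsonpath='{.spec.configuration.source[*].name}' | tr ' ' '\n'
}

# Hypothetical helper: print the names that are sources of pool $1
# but not of pool $2.
only_in_first() {
  second=$(pool_sources "$2")
  pool_sources "$1" | grep -vxF -e "$second"
}

# Example:
#   only_in_first worker worker-rhel
```

With the pools shown above, the only name expected to differ is 98-worker-partition.
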
