6.3. Moving resources to infrastructure MachineSets

By default, certain infrastructure resources are already deployed in your cluster. You can move them to the infrastructure MachineSets that you created.
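
The procedures in this section schedule pods onto nodes that carry the node-role.kubernetes.io/infra label. As a minimal sketch, assuming you want to add that role directly to an existing node rather than through the MachineSet definition, you could label the node as follows (the node name is a placeholder):

    $ oc label node <node_name> node-role.kubernetes.io/infra=""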

6.3.1. Moving the router

You can deploy the router pod to a different MachineSet. By default, the pod is deployed to a worker node.

Prerequisites

  • Configure additional MachineSets in your OpenShift Container Platform cluster.

Procedure

  1. View the IngressController custom resource for the router Operator:

    $ oc get ingresscontroller default -n openshift-ingress-operator -o yaml

    The command output resembles the following text:

    apiVersion: operator.openshift.io/v1
    kind: IngressController
    metadata:
      creationTimestamp: 2019-04-18T12:35:39Z
      finalizers:
      - ingresscontroller.operator.openshift.io/finalizer-ingresscontroller
      generation: 1
      name: default
      namespace: openshift-ingress-operator
      resourceVersion: "11341"
      selfLink: /apis/operator.openshift.io/v1/namespaces/openshift-ingress-operator/ingresscontrollers/default
      uid: 79509e05-61d6-11e9-bc55-02ce4781844a
    spec: {}
    status:
      availableReplicas: 2
      conditions:
      - lastTransitionTime: 2019-04-18T12:36:15Z
        status: "True"
        type: Available
      domain: apps.<cluster>.example.com
      endpointPublishingStrategy:
        type: LoadBalancerService
      selector: ingresscontroller.operator.openshift.io/deployment-ingresscontroller=default
  2. Edit the ingresscontroller resource and change the nodeSelector to use the infra label (a non-interactive oc patch alternative is sketched after this procedure):

    $ oc edit ingresscontroller default -n openshift-ingress-operator -o yaml

    Add a stanza to the spec section that uses a nodeSelector with the infra label, as shown:

      spec:
        nodePlacement:
          nodeSelector:
            matchLabels:
              node-role.kubernetes.io/infra: ""
  3. Confirm that the router pod is running on the infra node.

    1. View the list of router pods and note the node name of the running pod:

      $ oc get pod -n openshift-ingress -o wide
      
      NAME                              READY     STATUS        RESTARTS   AGE       IP           NODE                           NOMINATED NODE   READINESS GATES
      router-default-86798b4b5d-bdlvd   1/1      Running       0          28s       10.130.2.4   ip-10-0-217-226.ec2.internal   <none>           <none>
      router-default-955d875f4-255g8    0/1      Terminating   0          19h       10.129.2.4   ip-10-0-148-172.ec2.internal   <none>           <none>

      In this example, the running pod is on the ip-10-0-217-226.ec2.internal node.

    2. View the node status of the running pod:

      $ oc get node <node_name> 1
      
      NAME                           STATUS    ROLES          AGE       VERSION
      ip-10-0-217-226.ec2.internal   Ready     infra,worker   17h       v1.14.6+c4799753c
      1
      Specify the <node_name> that you obtained from the pod list.

      Because the role list includes infra, the pod is running on the correct node.
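
Instead of editing the IngressController resource interactively, you can apply the same node placement with a patch. The following is a sketch only, assuming the default IngressController shown in this procedure:

    $ oc patch ingresscontroller default -n openshift-ingress-operator --type merge \
        -p '{"spec":{"nodePlacement":{"nodeSelector":{"matchLabels":{"node-role.kubernetes.io/infra":""}}}}}'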

6.3.2. Moving the default registry

You configure the registry Operator to deploy its pods to different nodes.

Prerequisites

  • Configure additional MachineSets in your OpenShift Container Platform cluster.

Procedure

  1. View the config/cluster object:

    $ oc get config/cluster -o yaml

    The output resembles the following text:

    apiVersion: imageregistry.operator.openshift.io/v1
    kind: Config
    metadata:
      creationTimestamp: 2019-02-05T13:52:05Z
      finalizers:
      - imageregistry.operator.openshift.io/finalizer
      generation: 1
      name: cluster
      resourceVersion: "56174"
      selfLink: /apis/imageregistry.operator.openshift.io/v1/configs/cluster
      uid: 36fd3724-294d-11e9-a524-12ffeee2931b
    spec:
      httpSecret: d9a012ccd117b1e6616ceccb2c3bb66a5fed1b5e481623
      logging: 2
      managementState: Managed
      proxy: {}
      replicas: 1
      requests:
        read: {}
        write: {}
      storage:
        s3:
          bucket: image-registry-us-east-1-c92e88cad85b48ec8b312344dff03c82-392c
          region: us-east-1
    status:
    ...
  2. Edit the config/cluster object:

    $ oc edit config/cluster
  3. Add the following lines of text to the spec section of the object (an equivalent oc patch sketch follows this procedure):

      nodeSelector:
        node-role.kubernetes.io/infra: ""
  4. Verify that the registry pod has been moved to the infrastructure node.

    1. Run the following command to identify the node where the registry pod is located:

      $ oc get pods -o wide -n openshift-image-registry
    2. Confirm that the node has the label that you specified:

      $ oc describe node <node_name>

      Review the command output and confirm that node-role.kubernetes.io/infra is in the LABELS list.
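
As with the router, this change can also be applied non-interactively. The following sketch patches the config/cluster object with the same nodeSelector and then lists the nodes that carry the infra label; it assumes the object and label used in this procedure:

    $ oc patch config/cluster --type merge \
        -p '{"spec":{"nodeSelector":{"node-role.kubernetes.io/infra":""}}}'
    $ oc get nodes -l node-role.kubernetes.io/infra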

6.3.3. Moving the monitoring solution

By default, the Prometheus Cluster Monitoring stack, which contains Prometheus, Grafana, and AlertManager, is deployed to provide cluster monitoring. It is managed by the Cluster Monitoring Operator. To move its components to different machines, you create and apply a custom ConfigMap.

Procedure

  1. Save the following ConfigMap definition as the cluster-monitoring-configmap.yaml file:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: cluster-monitoring-config
      namespace: openshift-monitoring
    data:
      config.yaml: |+
        alertmanagerMain:
          nodeSelector:
            node-role.kubernetes.io/infra: ""
        prometheusK8s:
          nodeSelector:
            node-role.kubernetes.io/infra: ""
        prometheusOperator:
          nodeSelector:
            node-role.kubernetes.io/infra: ""
        grafana:
          nodeSelector:
            node-role.kubernetes.io/infra: ""
        k8sPrometheusAdapter:
          nodeSelector:
            node-role.kubernetes.io/infra: ""
        kubeStateMetrics:
          nodeSelector:
            node-role.kubernetes.io/infra: ""
        telemeterClient:
          nodeSelector:
            node-role.kubernetes.io/infra: ""
        openshiftStateMetrics:
          nodeSelector:
            node-role.kubernetes.io/infra: ""

    Applying this ConfigMap forces the components of the monitoring stack to be redeployed to infrastructure nodes.

  2. Apply the new ConfigMap (if the ConfigMap already exists, see the note after this procedure):

    $ oc create -f cluster-monitoring-configmap.yaml
  3. Watch the monitoring pods move to the new machines:

    $ watch 'oc get pod -n openshift-monitoring -o wide'
  4. If a component did not move to the infra node, delete the pod with that component:

    $ oc delete pod -n openshift-monitoring <pod>

    The component from the deleted pod is re-created on the infra node.
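
The oc create command fails if a ConfigMap named cluster-monitoring-config already exists in the openshift-monitoring namespace. In that case, a sketch of updating the existing ConfigMap from the same file is:

    $ oc apply -f cluster-monitoring-configmap.yaml

To list only the monitoring pods scheduled on a particular node, you can also filter the output by node name (the node name is a placeholder):

    $ oc get pod -n openshift-monitoring -o wide --field-selector spec.nodeName=<node_name>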

Additional resources

  • For information about moving OpenShift Container Platform components, see the monitoring documentation.

6.3.4. Moving the cluster logging resources

You can configure the Cluster Logging Operator to deploy the pods for any or all of the Cluster Logging components, Elasticsearch, Kibana, and Curator, to different nodes. You cannot move the Cluster Logging Operator pod from its installed location.

For example, you can move the Elasticsearch pods to a separate node because of high CPU, memory, and disk requirements.

Note

You should set your MachineSet to use at least 6 replicas.

Prerequisites

  • Cluster Logging and Elasticsearch must be installed. These features are not installed by default.

Procedure

  1. Edit the cluster logging custom resource in the openshift-logging project:

    $ oc edit ClusterLogging instance
    apiVersion: logging.openshift.io/v1
    kind: ClusterLogging
    
    ....
    
    spec:
      collection:
        logs:
          fluentd:
            resources: null
          type: fluentd
      curation:
        curator:
          nodeSelector: 1
              node-role.kubernetes.io/infra: ''
          resources: null
          schedule: 30 3 * * *
        type: curator
      logStore:
        elasticsearch:
          nodeCount: 3
          nodeSelector: 2
              node-role.kubernetes.io/infra: ''
          redundancyPolicy: SingleRedundancy
          resources:
            limits:
              cpu: 500m
              memory: 16Gi
            requests:
              cpu: 500m
              memory: 16Gi
          storage: {}
        type: elasticsearch
      managementState: Managed
      visualization:
        kibana:
          nodeSelector: 3
              node-role.kubernetes.io/infra: '' 4
          proxy:
            resources: null
          replicas: 1
          resources: null
        type: kibana
    
    ....
1 2 3 4
Add the nodeSelector parameter, set to the value appropriate for the component that you want to move. You can use a nodeSelector in the format shown or use <key>: <value> pairs, based on the value specified for the node.
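
For example, the following fragment is a sketch of the <key>: <value> form for the Elasticsearch component, assuming a hypothetical custom node label named node-type with the value logging instead of the infra role label:

    spec:
      logStore:
        elasticsearch:
          nodeSelector:
            node-type: logging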

Verification steps

To verify that a component has moved, you can use the oc get pod -o wide command.

For example:

  • You want to move the Kibana pod from the ip-10-0-147-79.us-east-2.compute.internal node:

    $ oc get pod kibana-5b8bdf44f9-ccpq9 -o wide
    NAME                      READY   STATUS    RESTARTS   AGE   IP            NODE                                        NOMINATED NODE   READINESS GATES
    kibana-5b8bdf44f9-ccpq9   2/2     Running   0          27s   10.129.2.18   ip-10-0-147-79.us-east-2.compute.internal   <none>           <none>
  • You want to move the Kibana pod to the ip-10-0-139-48.us-east-2.compute.internal node, a dedicated infrastructure node:

    $ oc get nodes
    NAME                                         STATUS   ROLES          AGE   VERSION
    ip-10-0-133-216.us-east-2.compute.internal   Ready    master         60m   v1.16.2
    ip-10-0-139-146.us-east-2.compute.internal   Ready    master         60m   v1.16.2
    ip-10-0-139-192.us-east-2.compute.internal   Ready    worker         51m   v1.16.2
    ip-10-0-139-241.us-east-2.compute.internal   Ready    worker         51m   v1.16.2
    ip-10-0-147-79.us-east-2.compute.internal    Ready    worker         51m   v1.16.2
    ip-10-0-152-241.us-east-2.compute.internal   Ready    master         60m   v1.16.2
    ip-10-0-139-48.us-east-2.compute.internal    Ready    infra          51m   v1.16.2

    Note that the node has the node-role.kubernetes.io/infra: '' label:

    $ oc get node ip-10-0-139-48.us-east-2.compute.internal -o yaml
    
    kind: Node
    apiVersion: v1
    metadata:
      name: ip-10-0-139-48.us-east-2.compute.internal
      selfLink: /api/v1/nodes/ip-10-0-139-48.us-east-2.compute.internal
      uid: 62038aa9-661f-41d7-ba93-b5f1b6ef8751
      resourceVersion: '39083'
      creationTimestamp: '2020-04-13T19:07:55Z'
      labels:
        node-role.kubernetes.io/infra: ''
    ....
  • To move the Kibana pod, edit the cluster logging CR to add a node selector:

    apiVersion: logging.openshift.io/v1
    kind: ClusterLogging
    
    ....
    
    spec:
    
    ....
    
      visualization:
        kibana:
          nodeSelector: 1
            node-role.kubernetes.io/infra: '' 2
          proxy:
            resources: null
          replicas: 1
          resources: null
        type: kibana
    1 2
    Add a node selector to match the label in the node specification.
  • After you save the CR, the current Kibana pod is terminated and a new pod is deployed:

    $ oc get pods
    NAME                                            READY   STATUS        RESTARTS   AGE
    cluster-logging-operator-84d98649c4-zb9g7       1/1     Running       0          29m
    elasticsearch-cdm-hwv01pf7-1-56588f554f-kpmlg   2/2     Running       0          28m
    elasticsearch-cdm-hwv01pf7-2-84c877d75d-75wqj   2/2     Running       0          28m
    elasticsearch-cdm-hwv01pf7-3-f5d95b87b-4nx78    2/2     Running       0          28m
    fluentd-42dzz                                   1/1     Running       0          28m
    fluentd-d74rq                                   1/1     Running       0          28m
    fluentd-m5vr9                                   1/1     Running       0          28m
    fluentd-nkxl7                                   1/1     Running       0          28m
    fluentd-pdvqb                                   1/1     Running       0          28m
    fluentd-tflh6                                   1/1     Running       0          28m
    kibana-5b8bdf44f9-ccpq9                         2/2     Terminating   0          4m11s
    kibana-7d85dcffc8-bfpfp                         2/2     Running       0          33s
  • The new pod is on the ip-10-0-139-48.us-east-2.compute.internal node:

    $ oc get pod kibana-7d85dcffc8-bfpfp -o wide
    NAME                      READY   STATUS        RESTARTS   AGE   IP            NODE                                        NOMINATED NODE   READINESS GATES
    kibana-7d85dcffc8-bfpfp   2/2     Running       0          43s   10.131.0.22   ip-10-0-139-48.us-east-2.compute.internal   <none>           <none>
  • After a few moments, the original Kibana pod is removed.

    $ oc get pods
    NAME                                            READY   STATUS    RESTARTS   AGE
    cluster-logging-operator-84d98649c4-zb9g7       1/1     Running   0          30m
    elasticsearch-cdm-hwv01pf7-1-56588f554f-kpmlg   2/2     Running   0          29m
    elasticsearch-cdm-hwv01pf7-2-84c877d75d-75wqj   2/2     Running   0          29m
    elasticsearch-cdm-hwv01pf7-3-f5d95b87b-4nx78    2/2     Running   0          29m
    fluentd-42dzz                                   1/1     Running   0          29m
    fluentd-d74rq                                   1/1     Running   0          29m
    fluentd-m5vr9                                   1/1     Running   0          29m
    fluentd-nkxl7                                   1/1     Running   0          29m
    fluentd-pdvqb                                   1/1     Running   0          29m
    fluentd-tflh6                                   1/1     Running   0          29m
    kibana-7d85dcffc8-bfpfp                         2/2     Running   0          62s