第 4 章 操作节点

4.1. 查看和列出 OpenShift Container Platform 集群中的节点

您可以列出集群中的所有节点,以获取节点的相关信息,如状态、年龄、内存用量和其他详情。

在执行节点管理操作时,CLI 与代表实际节点主机的节点对象交互。主控机(master)使用来自节点对象的信息执行健康检查,以此验证节点。

4.1.1. 关于列出集群中的所有节点

您可以获取集群中节点的详细信息。

  • 以下命令列出所有节点:

    $ oc get nodes
    
    NAME                   STATUS    ROLES     AGE       VERSION
    master.example.com     Ready     master    7h        v1.14.6+c4799753c
    node1.example.com      Ready     worker    7h        v1.14.6+c4799753c
    node2.example.com      Ready     worker    7h        v1.14.6+c4799753c
  • -wide 选项可提供所有节点的附加信息。

    $ oc get nodes -o wide
  • 以下命令列出一个节点的相关信息:

    $ oc get node <node>

    这些命令输出中的 STATUS 列可显示节点的以下状况:

    表 4.1. 节点状况

    状况描述

    Ready

    节点通过返回 True 来报告其自身对 apiserver 的就绪度。

    NotReady

    其中一个底层组件(如容器运行时或网络)遇到了问题或尚未配置。

    SchedulingDisabled

    无法通过调度将 Pod 放置到节点上。

  • 以下命令提供有关特定节点的更多详细信息,包括发生当前状况的原因:

    $ oc describe node <node>

    例如:

    $ oc describe node node1.example.com
    
    Name:               node1.example.com 1
    Roles:              worker 2
    Labels:             beta.kubernetes.io/arch=amd64   3
                        beta.kubernetes.io/instance-type=m4.large
                        beta.kubernetes.io/os=linux
                        failure-domain.beta.kubernetes.io/region=us-east-2
                        failure-domain.beta.kubernetes.io/zone=us-east-2a
                        kubernetes.io/hostname=ip-10-0-140-16
                        node-role.kubernetes.io/worker=
    Annotations:        cluster.k8s.io/machine: openshift-machine-api/ahardin-worker-us-east-2a-q5dzc  4
                        machineconfiguration.openshift.io/currentConfig: worker-309c228e8b3a92e2235edd544c62fea8
                        machineconfiguration.openshift.io/desiredConfig: worker-309c228e8b3a92e2235edd544c62fea8
                        machineconfiguration.openshift.io/state: Done
                        volumes.kubernetes.io/controller-managed-attach-detach: true
    CreationTimestamp:  Wed, 13 Feb 2019 11:05:57 -0500
    Taints:             <none>  5
    Unschedulable:      false
    Conditions:                 6
      Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
      ----             ------  -----------------                 ------------------                ------                       -------
      OutOfDisk        False   Wed, 13 Feb 2019 15:09:42 -0500   Wed, 13 Feb 2019 11:05:57 -0500   KubeletHasSufficientDisk     kubelet has sufficient disk space available
      MemoryPressure   False   Wed, 13 Feb 2019 15:09:42 -0500   Wed, 13 Feb 2019 11:05:57 -0500   KubeletHasSufficientMemory   kubelet has sufficient memory available
      DiskPressure     False   Wed, 13 Feb 2019 15:09:42 -0500   Wed, 13 Feb 2019 11:05:57 -0500   KubeletHasNoDiskPressure     kubelet has no disk pressure
      PIDPressure      False   Wed, 13 Feb 2019 15:09:42 -0500   Wed, 13 Feb 2019 11:05:57 -0500   KubeletHasSufficientPID      kubelet has sufficient PID available
      Ready            True    Wed, 13 Feb 2019 15:09:42 -0500   Wed, 13 Feb 2019 11:07:09 -0500   KubeletReady                 kubelet is posting ready status
    Addresses:   7
      InternalIP:   10.0.140.16
      InternalDNS:  ip-10-0-140-16.us-east-2.compute.internal
      Hostname:     ip-10-0-140-16.us-east-2.compute.internal
    Capacity:    8
     attachable-volumes-aws-ebs:  39
     cpu:                         2
     hugepages-1Gi:               0
     hugepages-2Mi:               0
     memory:                      8172516Ki
     pods:                        250
    Allocatable:
     attachable-volumes-aws-ebs:  39
     cpu:                         1500m
     hugepages-1Gi:               0
     hugepages-2Mi:               0
     memory:                      7558116Ki
     pods:                        250
    System Info:    9
     Machine ID:                              63787c9534c24fde9a0cde35c13f1f66
     System UUID:                             EC22BF97-A006-4A58-6AF8-0A38DEEA122A
     Boot ID:                                 f24ad37d-2594-46b4-8830-7f7555918325
     Kernel Version:                          3.10.0-957.5.1.el7.x86_64
     OS Image:                                Red Hat Enterprise Linux CoreOS 410.8.20190520.0 (Ootpa)
     Operating System:                        linux
     Architecture:                            amd64
     Container Runtime Version:               cri-o://1.13.9-1.rhaos4.1.gitd70609a.el8
     Kubelet Version:                         v1.14.6+c4799753c
     Kube-Proxy Version:                      v1.14.6+c4799753c
    PodCIDR:                                  10.128.4.0/24
    ProviderID:                               aws:///us-east-2a/i-04e87b31dc6b3e171
    Non-terminated Pods:                      (13 in total)  10
      Namespace                               Name                                   CPU Requests  CPU Limits  Memory Requests  Memory Limits
      ---------                               ----                                   ------------  ----------  ---------------  -------------
      openshift-cluster-node-tuning-operator  tuned-hdl5q                            0 (0%)        0 (0%)      0 (0%)           0 (0%)
      openshift-dns                           dns-default-l69zr                      0 (0%)        0 (0%)      0 (0%)           0 (0%)
      openshift-image-registry                node-ca-9hmcg                          0 (0%)        0 (0%)      0 (0%)           0 (0%)
      openshift-ingress                       router-default-76455c45c-c5ptv         0 (0%)        0 (0%)      0 (0%)           0 (0%)
      openshift-machine-config-operator       machine-config-daemon-cvqw9            20m (1%)      0 (0%)      50Mi (0%)        0 (0%)
      openshift-marketplace                   community-operators-f67fh              0 (0%)        0 (0%)      0 (0%)           0 (0%)
      openshift-monitoring                    alertmanager-main-0                    50m (3%)      50m (3%)    210Mi (2%)       10Mi (0%)
      openshift-monitoring                    grafana-78765ddcc7-hnjmm               100m (6%)     200m (13%)  100Mi (1%)       200Mi (2%)
      openshift-monitoring                    node-exporter-l7q8d                    10m (0%)      20m (1%)    20Mi (0%)        40Mi (0%)
      openshift-monitoring                    prometheus-adapter-75d769c874-hvb85    0 (0%)        0 (0%)      0 (0%)           0 (0%)
      openshift-multus                        multus-kw8w5                           0 (0%)        0 (0%)      0 (0%)           0 (0%)
      openshift-sdn                           ovs-t4dsn                              100m (6%)     0 (0%)      300Mi (4%)       0 (0%)
      openshift-sdn                           sdn-g79hg                              100m (6%)     0 (0%)      200Mi (2%)       0 (0%)
    Allocated resources:
      (Total limits may be over 100 percent, i.e., overcommitted.)
      Resource                    Requests     Limits
      --------                    --------     ------
      cpu                         380m (25%)   270m (18%)
      memory                      880Mi (11%)  250Mi (3%)
      attachable-volumes-aws-ebs  0            0
    Events:     11
      Type     Reason                   Age                From                      Message
      ----     ------                   ----               ----                      -------
      Normal   NodeHasSufficientPID     6d (x5 over 6d)    kubelet, m01.example.com  Node m01.example.com status is now: NodeHasSufficientPID
      Normal   NodeAllocatableEnforced  6d                 kubelet, m01.example.com  Updated Node Allocatable limit across pods
      Normal   NodeHasSufficientMemory  6d (x6 over 6d)    kubelet, m01.example.com  Node m01.example.com status is now: NodeHasSufficientMemory
      Normal   NodeHasNoDiskPressure    6d (x6 over 6d)    kubelet, m01.example.com  Node m01.example.com status is now: NodeHasNoDiskPressure
      Normal   NodeHasSufficientDisk    6d (x6 over 6d)    kubelet, m01.example.com  Node m01.example.com status is now: NodeHasSufficientDisk
      Normal   NodeHasSufficientPID     6d                 kubelet, m01.example.com  Node m01.example.com status is now: NodeHasSufficientPID
      Normal   Starting                 6d                 kubelet, m01.example.com  Starting kubelet.
     ...
    1
    节点的名称。
    2
    节点的角色,可以是 masterworker
    3
    应用到节点的标签。
    4
    应用到节点的注解。
    5
    应用到节点的污点。
    6
    节点状况。
    7
    节点的 IP 地址和主机名。
    8
    pod 资源和可分配的资源。
    9
    节点主机的相关信息。
    10
    节点上的 pod。
    11
    节点报告的事件。

4.1.2. 列出集群中某一节点上的 pod

您可以列出特定节点上的所有 pod。

流程

  • 列出一个或多个节点上的所有或选定 pod:

    $ oc describe node <node1> <node2>

    例如:

    $ oc describe node ip-10-0-128-218.ec2.internal
  • 列出选定节点上的所有或选定 pod:

    $ oc describe --selector=<node_selector>
    $ oc describe -l=<pod_selector>

    例如:

    $ oc describe node  --selector=beta.kubernetes.io/os
    $ oc describe node -l node-role.kubernetes.io/worker

4.1.3. 查看节点上的内存和 CPU 用量统计

您可以显示节点的用量统计,这些统计信息为容器提供了运行时环境。这些用量统计包括 CPU、内存和存储的消耗。

先决条件

  • 您必须有 cluster-reader 权限才能查看用量统计。
  • 必须安装 Metrics 才能查看用量统计。

流程

  • 查看用量统计:

    $ oc adm top nodes
    
    NAME                                   CPU(cores)   CPU%      MEMORY(bytes)   MEMORY%
    ip-10-0-12-143.ec2.compute.internal    1503m        100%      4533Mi          61%
    ip-10-0-132-16.ec2.compute.internal    76m          5%        1391Mi          18%
    ip-10-0-140-137.ec2.compute.internal   398m         26%       2473Mi          33%
    ip-10-0-142-44.ec2.compute.internal    656m         43%       6119Mi          82%
    ip-10-0-146-165.ec2.compute.internal   188m         12%       3367Mi          45%
    ip-10-0-19-62.ec2.compute.internal     896m         59%       5754Mi          77%
    ip-10-0-44-193.ec2.compute.internal    632m         42%       5349Mi          72%
  • 查看具有标签的节点的用量统计信息:

    $ oc adm top node --selector=''

    您必须选择过滤所基于的选择器(标签查询)。支持 ===!=