3.4. 查看 Elasticsearch 日志存储的状态

您可以查看 OpenShift Elasticsearch Operator 的状态以及多个 Elasticsearch 组件的状态。

3.4.1. 查看 Elasticsearch 日志存储的状态

您可以查看 Elasticsearch 日志存储的状态。

先决条件

  • 安装了 Red Hat OpenShift Logging Operator 和 OpenShift Elasticsearch Operator。

流程

  1. 运行以下命令,切换到 openshift-logging 项目:

    $ oc project openshift-logging
  2. 查看状态:

    1. 运行以下命令,获取 Elasticsearch 日志存储实例的名称:

      $ oc get Elasticsearch

      输出示例

      NAME            AGE
      elasticsearch   5h9m

    2. 运行以下命令来获取 Elasticsearch 日志存储状态:

      $ oc get Elasticsearch <Elasticsearch-instance> -o yaml

      例如:

      $ oc get Elasticsearch elasticsearch -n openshift-logging -o yaml

      输出中包含类似于如下的信息:

      输出示例

      status: 1
        cluster: 2
          activePrimaryShards: 30
          activeShards: 60
          initializingShards: 0
          numDataNodes: 3
          numNodes: 3
          pendingTasks: 0
          relocatingShards: 0
          status: green
          unassignedShards: 0
        clusterHealth: ""
        conditions: [] 3
        nodes: 4
        - deploymentName: elasticsearch-cdm-zjf34ved-1
          upgradeStatus: {}
        - deploymentName: elasticsearch-cdm-zjf34ved-2
          upgradeStatus: {}
        - deploymentName: elasticsearch-cdm-zjf34ved-3
          upgradeStatus: {}
        pods: 5
          client:
            failed: []
            notReady: []
            ready:
            - elasticsearch-cdm-zjf34ved-1-6d7fbf844f-sn422
            - elasticsearch-cdm-zjf34ved-2-dfbd988bc-qkzjz
            - elasticsearch-cdm-zjf34ved-3-c8f566f7c-t7zkt
          data:
            failed: []
            notReady: []
            ready:
            - elasticsearch-cdm-zjf34ved-1-6d7fbf844f-sn422
            - elasticsearch-cdm-zjf34ved-2-dfbd988bc-qkzjz
            - elasticsearch-cdm-zjf34ved-3-c8f566f7c-t7zkt
          master:
            failed: []
            notReady: []
            ready:
            - elasticsearch-cdm-zjf34ved-1-6d7fbf844f-sn422
            - elasticsearch-cdm-zjf34ved-2-dfbd988bc-qkzjz
            - elasticsearch-cdm-zjf34ved-3-c8f566f7c-t7zkt
        shardAllocationEnabled: all

      1
      在输出中,集群状态字段显示在 status 小节中。
      2
      Elasticsearch 日志存储的状态:
      • 活跃的主分片的数量。
      • 活跃分片的数量。
      • 正在初始化的分片的数量。
      • Elasticsearch 日志存储数据节点的数量。
      • Elasticsearch 日志存储节点的总数。
      • 待处理的任务数量。
      • Elasticsearch 日志存储状态: 绿色红色黄色
      • 未分配分片的数量。
      3
      任何状态条件(若存在)。Elasticsearch 日志存储状态指示在无法放置 pod 时来自于调度程序的原因。显示与以下情况有关的所有事件:
      • 容器正在等待 Elasticsearch 日志存储和代理容器。
      • Elasticsearch 日志存储和代理容器的容器终止。
      • Pod 不可调度。此外还显示适用于多个问题的情况,具体请参阅情况消息示例
      4
      集群中的 Elasticsearch 日志存储节点,带有 upgradeStatus
      5
      Elasticsearch 日志存储在集群中的客户端、数据和 master pod,列在 failednotReadyready 状态下。

3.4.1.1. 情况消息示例

以下是来自 Elasticsearch 实例的 Status 部分的一些情况消息的示例。

以下状态消息表示节点已超过配置的低水位线,并且没有分片将分配给此节点。

status:
  nodes:
  - conditions:
    - lastTransitionTime: 2019-03-15T15:57:22Z
      message: Disk storage usage for node is 27.5gb (36.74%). Shards will be not
        be allocated on this node.
      reason: Disk Watermark Low
      status: "True"
      type: NodeStorage
    deploymentName: example-elasticsearch-cdm-0-1
    upgradeStatus: {}

以下状态消息表示节点已超过配置的高水位线,并且分片将重新定位到其他节点。

status:
  nodes:
  - conditions:
    - lastTransitionTime: 2019-03-15T16:04:45Z
      message: Disk storage usage for node is 27.5gb (36.74%). Shards will be relocated
        from this node.
      reason: Disk Watermark High
      status: "True"
      type: NodeStorage
    deploymentName: example-elasticsearch-cdm-0-1
    upgradeStatus: {}

以下状态消息表示自定义资源(CR)中的 Elasticsearch 日志存储节点选择器与集群中的任何节点都不匹配:

status:
    nodes:
    - conditions:
      - lastTransitionTime: 2019-04-10T02:26:24Z
        message: '0/8 nodes are available: 8 node(s) didn''t match node selector.'
        reason: Unschedulable
        status: "True"
        type: Unschedulable

以下状态消息表示 Elasticsearch 日志存储 CR 使用不存在的持久性卷声明(PVC)。

status:
   nodes:
   - conditions:
     - last Transition Time:  2019-04-10T05:55:51Z
       message:               pod has unbound immediate PersistentVolumeClaims (repeated 5 times)
       reason:                Unschedulable
       status:                True
       type:                  Unschedulable

以下状态消息表示 Elasticsearch 日志存储集群没有足够的节点来支持冗余策略。

status:
  clusterHealth: ""
  conditions:
  - lastTransitionTime: 2019-04-17T20:01:31Z
    message: Wrong RedundancyPolicy selected. Choose different RedundancyPolicy or
      add more nodes with data roles
    reason: Invalid Settings
    status: "True"
    type: InvalidRedundancy

此状态消息表示集群有太多 control plane 节点:

status:
  clusterHealth: green
  conditions:
    - lastTransitionTime: '2019-04-17T20:12:34Z'
      message: >-
        Invalid master nodes count. Please ensure there are no more than 3 total
        nodes with master roles
      reason: Invalid Settings
      status: 'True'
      type: InvalidMasters

以下状态消息表示 Elasticsearch 存储不支持您尝试进行的更改。

例如:

status:
  clusterHealth: green
  conditions:
    - lastTransitionTime: "2021-05-07T01:05:13Z"
      message: Changing the storage structure for a custom resource is not supported
      reason: StorageStructureChangeIgnored
      status: 'True'
      type: StorageStructureChangeIgnored

reasontype 类型字段指定不受支持的更改类型:

StorageClassNameChangeIgnored
不支持更改存储类名称。
StorageSizeChangeIgnored
不支持更改存储大小。
StorageStructureChangeIgnored

不支持在临时存储结构和持久性存储结构间更改。

重要

如果您尝试将 ClusterLogging CR 配置为从临时切换到持久性存储,OpenShift Elasticsearch Operator 会创建一个持久性卷声明(PVC),但不创建持久性卷(PV)。要清除 StorageStructureChangeIgnored 状态,您必须恢复对 ClusterLogging CR 的更改并删除 PVC。

3.4.2. 查看日志存储组件的状态

您可以查看多个日志存储组件的状态。

Elasticsearch 索引

您可以查看 Elasticsearch 索引的状态。

  1. 获取 Elasticsearch Pod 的名称:

    $ oc get pods --selector component=elasticsearch -o name

    输出示例

    pod/elasticsearch-cdm-1godmszn-1-6f8495-vp4lw
    pod/elasticsearch-cdm-1godmszn-2-5769cf-9ms2n
    pod/elasticsearch-cdm-1godmszn-3-f66f7d-zqkz7

  2. 获取索引的状态:

    $ oc exec elasticsearch-cdm-4vjor49p-2-6d4d7db474-q2w7z -- indices

    输出示例

    Defaulting container name to elasticsearch.
    Use 'oc describe pod/elasticsearch-cdm-4vjor49p-2-6d4d7db474-q2w7z -n openshift-logging' to see all of the containers in this pod.
    
    green  open   infra-000002                                                     S4QANnf1QP6NgCegfnrnbQ   3   1     119926            0        157             78
    green  open   audit-000001                                                     8_EQx77iQCSTzFOXtxRqFw   3   1          0            0          0              0
    green  open   .security                                                        iDjscH7aSUGhIdq0LheLBQ   1   1          5            0          0              0
    green  open   .kibana_-377444158_kubeadmin                                     yBywZ9GfSrKebz5gWBZbjw   3   1          1            0          0              0
    green  open   infra-000001                                                     z6Dpe__ORgiopEpW6Yl44A   3   1     871000            0        874            436
    green  open   app-000001                                                       hIrazQCeSISewG3c2VIvsQ   3   1       2453            0          3              1
    green  open   .kibana_1                                                        JCitcBMSQxKOvIq6iQW6wg   1   1          0            0          0              0
    green  open   .kibana_-1595131456_user1                                        gIYFIEGRRe-ka0W3okS-mQ   3   1          1            0          0              0

日志存储 pod

您可以查看托管日志存储的 pod 的状态。

  1. 获取 Pod 的名称:

    $ oc get pods --selector component=elasticsearch -o name

    输出示例

    pod/elasticsearch-cdm-1godmszn-1-6f8495-vp4lw
    pod/elasticsearch-cdm-1godmszn-2-5769cf-9ms2n
    pod/elasticsearch-cdm-1godmszn-3-f66f7d-zqkz7

  2. 获取 Pod 的状态:

    $ oc describe pod elasticsearch-cdm-1godmszn-1-6f8495-vp4lw

    输出中包括以下状态信息:

    输出示例

    ....
    Status:             Running
    
    ....
    
    Containers:
      elasticsearch:
        Container ID:   cri-o://b7d44e0a9ea486e27f47763f5bb4c39dfd2
        State:          Running
          Started:      Mon, 08 Jun 2020 10:17:56 -0400
        Ready:          True
        Restart Count:  0
        Readiness:  exec [/usr/share/elasticsearch/probe/readiness.sh] delay=10s timeout=30s period=5s #success=1 #failure=3
    
    ....
    
      proxy:
        Container ID:  cri-o://3f77032abaddbb1652c116278652908dc01860320b8a4e741d06894b2f8f9aa1
        State:          Running
          Started:      Mon, 08 Jun 2020 10:18:38 -0400
        Ready:          True
        Restart Count:  0
    
    ....
    
    Conditions:
      Type              Status
      Initialized       True
      Ready             True
      ContainersReady   True
      PodScheduled      True
    
    ....
    
    Events:          <none>

日志存储 pod 部署配置

您可以查看日志存储部署配置的状态。

  1. 获取部署配置的名称:

    $ oc get deployment --selector component=elasticsearch -o name

    输出示例

    deployment.extensions/elasticsearch-cdm-1gon-1
    deployment.extensions/elasticsearch-cdm-1gon-2
    deployment.extensions/elasticsearch-cdm-1gon-3

  2. 获取部署配置状态:

    $ oc describe deployment elasticsearch-cdm-1gon-1

    输出中包括以下状态信息:

    输出示例

    ....
      Containers:
       elasticsearch:
        Image:      registry.redhat.io/openshift-logging/elasticsearch6-rhel8
        Readiness:  exec [/usr/share/elasticsearch/probe/readiness.sh] delay=10s timeout=30s period=5s #success=1 #failure=3
    
    ....
    
    Conditions:
      Type           Status   Reason
      ----           ------   ------
      Progressing    Unknown  DeploymentPaused
      Available      True     MinimumReplicasAvailable
    
    ....
    
    Events:          <none>

日志存储副本集

您可以查看日志存储副本集的状态。

  1. 获取副本集的名称:

    $ oc get replicaSet --selector component=elasticsearch -o name
    
    replicaset.extensions/elasticsearch-cdm-1gon-1-6f8495
    replicaset.extensions/elasticsearch-cdm-1gon-2-5769cf
    replicaset.extensions/elasticsearch-cdm-1gon-3-f66f7d
  2. 获取副本集的状态:

    $ oc describe replicaSet elasticsearch-cdm-1gon-1-6f8495

    输出中包括以下状态信息:

    输出示例

    ....
      Containers:
       elasticsearch:
        Image:      registry.redhat.io/openshift-logging/elasticsearch6-rhel8@sha256:4265742c7cdd85359140e2d7d703e4311b6497eec7676957f455d6908e7b1c25
        Readiness:  exec [/usr/share/elasticsearch/probe/readiness.sh] delay=10s timeout=30s period=5s #success=1 #failure=3
    
    ....
    
    Events:          <none>

3.4.3. Elasticsearch 集群状态

OpenShift Container Platform Web 控制台的 Observe 部分中的仪表板显示 Elasticsearch 集群的状态。

要获取 OpenShift Elasticsearch 集群的状态,请访问位于 <cluster_url>/monitoring/dashboards/grafana-dashboard-cluster-logging 的 OpenShift Container Platform Web 控制台的 Observe 部分中的 Grafana 仪表板。

Elasticsearch 状态字段

eo_elasticsearch_cr_cluster_management_state

显示 Elasticsearch 集群是否处于受管状态或非受管状态。例如:

eo_elasticsearch_cr_cluster_management_state{state="managed"} 1
eo_elasticsearch_cr_cluster_management_state{state="unmanaged"} 0
eo_elasticsearch_cr_restart_total

显示 Elasticsearch 节点重启证书、滚动重启或调度重启的次数。例如:

eo_elasticsearch_cr_restart_total{reason="cert_restart"} 1
eo_elasticsearch_cr_restart_total{reason="rolling_restart"} 1
eo_elasticsearch_cr_restart_total{reason="scheduled_restart"} 3
es_index_namespaces_total

显示 Elasticsearch 索引命名空间的总数。例如:

Total number of Namespaces.
es_index_namespaces_total 5
es_index_document_count

显示每个命名空间的记录数。例如:

es_index_document_count{namespace="namespace_1"} 25
es_index_document_count{namespace="namespace_2"} 10
es_index_document_count{namespace="namespace_3"} 5

"Secret Elasticsearch fields are either missing or empty" 信息

如果 Elasticsearch 缺少 admin-certadmin-keylogging-es.crtlogging-es.key 文件,仪表板会显示类似以下示例的状态消息:

message": "Secret \"elasticsearch\" fields are either missing or empty: [admin-cert, admin-key, logging-es.crt, logging-es.key]",
"reason": "Missing Required Secrets",