Chapter 5. Management of managers using the Ceph Orchestrator

As a storage administrator, you can use the Ceph Orchestrator to deploy additional manager daemons. Cephadm automatically installs a manager daemon on the bootstrap node during the bootstrapping process.

In general, you should set up a Ceph Manager on each of the hosts running the Ceph Monitor daemon to achieve the same level of availability.

By default, whichever ceph-mgr instance comes up first is made active by the Ceph Monitors, and the others are standby managers. There is no requirement for a quorum among the ceph-mgr daemons.

If the active daemon fails to send a beacon to the monitors for longer than the mon mgr beacon grace interval, it is replaced by a standby.

If you want to preempt failover, you can explicitly mark a ceph-mgr daemon as failed with the ceph mgr fail MANAGER_NAME command.
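
You can check which ceph-mgr daemon is currently active, and how many standbys are available, before forcing a failover. The following is a minimal sketch; the daemon name host01.abcdef and the output values are only illustrative:

Example

# Check the active manager and the number of standbys (output is illustrative):
[ceph: root@host01 /]# ceph mgr stat
{
    "epoch": 25,
    "available": true,
    "active_name": "host01.abcdef",
    "num_standby": 2
}
# Mark the active daemon as failed so that a standby takes over:
[ceph: root@host01 /]# ceph mgr fail host01.abcdef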

5.1. Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Root-level access to all the nodes.
  • Hosts are added to the cluster.

5.2. Deploying the manager daemons using the Ceph Orchestrator

The Ceph Orchestrator deploys two Manager daemons by default. You can deploy additional Manager daemons by using a placement specification on the command line interface. To deploy a different number of Manager daemons, specify that number in the command. If you do not specify the hosts where the Manager daemons should be deployed, the Ceph Orchestrator randomly selects hosts and deploys the Manager daemons to them.

Note

Ensure that your storage cluster has at least three Ceph Manager daemons deployed.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Hosts are added to the cluster.

Procedure

  1. Log into the Cephadm shell:

    Example

    [root@host01 ~]# cephadm shell

  2. You can deploy manager daemons in two different ways:

Method 1

  • Deploy manager daemons using a placement specification on a specific set of hosts:

    Note

    Red Hat recommends that you use the --placement option to deploy on specific hosts.

    Syntax

    ceph orch apply mgr --placement="HOST_NAME_1 HOST_NAME_2 HOST_NAME_3"

    Example

    [ceph: root@host01 /]# ceph orch apply mgr --placement="host01 host02 host03"

Method 2

  • Deploy manager daemons randomly on the hosts in the storage cluster:

    Syntax

    ceph orch apply mgr NUMBER_OF_DAEMONS

    Example

    [ceph: root@host01 /]# ceph orch apply mgr 3
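
  Alternatively, you can keep the Manager placement in a service specification file and apply it with the ceph orch apply -i command. The following is a minimal sketch; the file name mgr.yaml and the host names are placeholders:

    Example

    [ceph: root@host01 /]# cat mgr.yaml
    service_type: mgr          # deploy the Ceph Manager service
    placement:
      hosts:                   # hosts that receive one Manager daemon each
        - host01
        - host02
        - host03
    [ceph: root@host01 /]# ceph orch apply -i mgr.yaml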

Verification

  • List the service:

    Example

    [ceph: root@host01 /]# ceph orch ls

  • List the hosts, daemons, and processes:

    Syntax

    ceph orch ps --daemon_type=DAEMON_NAME

    Example

    [ceph: root@host01 /]# ceph orch ps --daemon_type=mgr

5.3. Removing the manager daemons using the Ceph Orchestrator

To remove the manager daemons from a host, redeploy the daemons on the other hosts only. When the new placement no longer includes a host, Cephadm removes the manager daemon from that host.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Root-level access to all the nodes.
  • Hosts are added to the cluster.
  • At least one manager daemon deployed on the hosts.

Procedure

  1. Log into the Cephadm shell:

    Example

    [root@host01 ~]# cephadm shell

  2. Run the ceph orch apply command to redeploy the required manager daemons:

    Syntax

    ceph orch apply mgr "NUMBER_OF_DAEMONS HOST_NAME_1 HOST_NAME_3"

    For example, if you want to remove the manager daemons from host02, redeploy the manager daemons only on host01 and host03.

    Example

    [ceph: root@host01 /]# ceph orch apply mgr "2 host01 host03"
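
    The same placement can also be expressed with the --placement option, giving the daemon count followed by the host names, for example:

    Example

    [ceph: root@host01 /]# ceph orch apply mgr --placement="2 host01 host03"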

Verification

  • List the hosts, daemons, and processes:

    Syntax

    ceph orch ps --daemon_type=DAEMON_NAME

    Example

    [ceph: root@host01 /]# ceph orch ps --daemon_type=mgr


5.4. Using the Ceph Manager modules

Use the ceph mgr module ls command to see the available modules and the modules that are presently enabled.

Enable or disable modules with the ceph mgr module enable MODULE and ceph mgr module disable MODULE commands, respectively.

If a module is enabled, then the active ceph-mgr daemon loads and executes it. In the case of modules that provide a service, such as an HTTP server, the module might publish its address when it is loaded. To see the addresses of such modules, run the ceph mgr services command.

Some modules might also implement a special standby mode which runs on the standby ceph-mgr daemons as well as on the active daemon. This enables a module that provides a service to redirect its clients to the active daemon if they try to connect to a standby.

The following is an example of enabling the dashboard module:

[ceph: root@host01 /]# ceph mgr module enable dashboard

[ceph: root@host01 /]# ceph mgr module ls

MODULE
balancer              on (always on)
crash                 on (always on)
devicehealth          on (always on)
orchestrator          on (always on)
pg_autoscaler         on (always on)
progress              on (always on)
rbd_support           on (always on)
status                on (always on)
telemetry             on (always on)
volumes               on (always on)
cephadm               on
dashboard             on
iostat                on
nfs                   on
prometheus            on
restful               on
alerts                -
diskprediction_local  -
influx                -
insights              -
k8sevents             -
localpool             -
mds_autoscaler        -
mirroring             -
osd_perf_query        -
osd_support           -
rgw                   -
rook                  -
selftest              -
snap_schedule         -
stats                 -
telegraf              -
test_orchestrator     -
zabbix                -

[ceph: root@host01 /]# ceph mgr services
{
        "dashboard": "http://myserver.com:7789/",
        "restful": "https://myserver.com:8789/"
}

The first time the cluster starts, it uses the mgr_initial_modules setting to override which modules to enable. However, this setting is ignored through the rest of the lifetime of the cluster: only use it for bootstrapping. For example, before starting your monitor daemons for the first time, you might add a section like this to your ceph.conf file:

[mon]
    mgr initial modules = dashboard balancer

Where a module implements command line hooks, the commands are accessible as ordinary Ceph commands. Ceph automatically incorporates module commands into the standard CLI interface and routes them appropriately to the module:

[ceph: root@host01 /]# ceph <command | help>

You can use the following configuration parameters with the Ceph Manager:

Table 5.1. Configuration parameters

mgr module path
    Description: Path to load modules from.
    Type: String
    Default: "<library dir>/mgr"

mgr data
    Description: Path from which to load daemon data, such as the keyring.
    Type: String
    Default: "/var/lib/ceph/mgr/$cluster-$id"

mgr tick period
    Description: How many seconds between manager beacons to monitors, and other periodic checks.
    Type: Integer
    Default: 5

mon mgr beacon grace
    Description: How long, in seconds, after the last beacon a manager is considered failed.
    Type: Integer
    Default: 30
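
You can inspect or change these options at run time with the ceph config command. The following is a minimal sketch; the values shown are only examples:

Example

# Read the current beacon interval of the Ceph Manager daemons:
[ceph: root@host01 /]# ceph config get mgr mgr_tick_period
# Change the beacon interval to 10 seconds:
[ceph: root@host01 /]# ceph config set mgr mgr_tick_period 10
# Read the grace period after which the monitors consider a manager failed:
[ceph: root@host01 /]# ceph config get mon mon_mgr_beacon_grace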

5.5. Using the Ceph Manager balancer module

The balancer is a module for Ceph Manager (ceph-mgr) that optimizes the placement of placement groups (PGs) across OSDs in order to achieve a balanced distribution, either automatically or in a supervised fashion.

The balancer module is an always-on module and cannot be disabled; it can only be turned off, for example while you customize the configuration.

Modes

There are currently two supported balancer modes:

  • crush-compat: The CRUSH compat mode uses the compat weight-set feature, introduced in Ceph Luminous, to manage an alternative set of weights for devices in the CRUSH hierarchy. The normal weights should remain set to the size of the device to reflect the target amount of data that you want to store on the device. The balancer then optimizes the weight-set values, adjusting them up or down in small increments in order to achieve a distribution that matches the target distribution as closely as possible. Because PG placement is a pseudorandom process, there is a natural amount of variation in the placement; by optimizing the weights, the balancer counteracts that natural variation.

    This mode is fully backwards compatible with older clients. When an OSDMap and CRUSH map are shared with older clients, the balancer presents the optimized weights as the real weights.

    The primary restriction of this mode is that the balancer cannot handle multiple CRUSH hierarchies with different placement rules if the subtrees of the hierarchy share any OSDs. Because this configuration makes managing space utilization on the shared OSDs difficult, it is generally not recommended. As such, this restriction is normally not an issue.

  • upmap: Starting with Luminous, the OSDMap can store explicit mappings for individual OSDs as exceptions to the normal CRUSH placement calculation. These upmap entries provide fine-grained control over the PG mapping. This mode optimizes the placement of individual PGs in order to achieve a balanced distribution. In most cases, this distribution is "perfect", with an equal number of PGs on each OSD, plus or minus one PG, because they might not divide evenly. An example of inspecting these entries is shown after this list.

    Important

    To allow use of this feature, you must tell the cluster that it only needs to support luminous or later clients with the following command:

    [ceph: root@host01 /]# ceph osd set-require-min-compat-client luminous

    This command fails if any pre-luminous clients or daemons are connected to the monitors.

    Due to a known issue, kernel CephFS clients report themselves as jewel clients. To work around this issue, use the --yes-i-really-mean-it flag:

    [ceph: root@host01 /]# ceph osd set-require-min-compat-client luminous --yes-i-really-mean-it

    You can check what client versions are in use with:

    [ceph: root@host01 /]# ceph features
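
The upmap exception entries are stored in the OSDMap and can be listed by filtering the output of the ceph osd dump command. The following is a minimal sketch; the pg_upmap_items lines are only illustrative:

Example

[ceph: root@host01 /]# ceph osd dump | grep pg_upmap
pg_upmap_items 2.7 [0,3]
pg_upmap_items 2.a [1,4]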

Prerequisites

  • A running Red Hat Ceph Storage cluster.

Procedure

  1. Ensure the balancer module is enabled:

    Example

    [ceph: root@host01 /]# ceph mgr module enable balancer

  2. Turn on the balancer module:

    Example

    [ceph: root@host01 /]# ceph balancer on

  3. The default mode is upmap. The mode can be changed with:

    Example

    [ceph: root@host01 /]# ceph balancer mode crush-compat

    or

    Example

    [ceph: root@host01 /]# ceph balancer mode upmap

Status

The current status of the balancer can be checked at any time with:

Example

[ceph: root@host01 /]# ceph balancer status

Automatic balancing

By default, when turning on the balancer module, automatic balancing is used:

Example

[ceph: root@host01 /]# ceph balancer on

The balancer can be turned back off again with:

Example

[ceph: root@host01 /]# ceph balancer off

When automatic balancing is on, the balancer uses the currently configured mode and makes small changes to the data distribution over time to ensure that OSDs remain equally utilized.

Throttling

No adjustments will be made to the PG distribution if the cluster is degraded, for example, if an OSD has failed and the system has not yet healed itself.

When the cluster is healthy, the balancer throttles its changes such that the percentage of PGs that are misplaced, or need to be moved, is below a threshold of 5% by default. This percentage can be adjusted using the target_max_misplaced_ratio setting. For example, to increase the threshold to 7%:

Example

[ceph: root@host01 /]# ceph config set mgr target_max_misplaced_ratio .07

For automatic balancing:

  • Set the number of seconds to sleep in between runs of the automatic balancer:

Example

[ceph: root@host01 /]# ceph config set mgr mgr/balancer/sleep_interval 60

  • Set the time of day to begin automatic balancing in HHMM format:

Example

[ceph: root@host01 /]# ceph config set mgr mgr/balancer/begin_time 0000

  • Set the time of day to finish automatic balancing in HHMM format:

Example

[ceph: root@host01 /]# ceph config set mgr mgr/balancer/end_time 2359

  • Restrict automatic balancing to this day of the week or later. Uses the same conventions as crontab, 0 is Sunday, 1 is Monday, and so on:

Example

[ceph: root@host01 /]# ceph config set mgr mgr/balancer/begin_weekday 0

  • Restrict automatic balancing to this day of the week or earlier. This uses the same conventions as crontab, 0 is Sunday, 1 is Monday, and so on:

Example

[ceph: root@host01 /]# ceph config set mgr mgr/balancer/end_weekday 6

  • Define the pool IDs to which the automatic balancing is limited. The default for this is an empty string, meaning all pools are balanced. You can obtain the numeric pool IDs with the ceph osd pool ls detail command, as shown in the sketch that follows this list:

Example

[ceph: root@host01 /]# ceph config set mgr mgr/balancer/pool_ids 1,2,3
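
The numeric pool ID directly follows the word pool on each line of the ceph osd pool ls detail output. The following is an abbreviated, illustrative sketch of that output; the pool names and settings vary by cluster:

Example

[ceph: root@host01 /]# ceph osd pool ls detail
pool 1 'rbd' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on ... application rbd
pool 2 'cephfs_data' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on ... application cephfs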

Supervised optimization

The balancer operation is broken into a few distinct phases; a combined end-to-end example follows the list below:

  1. Building a plan.
  2. Evaluating the quality of the data distribution, either for the current PG distribution, or the PG distribution that would result after executing a plan.
  3. Executing the plan.

    • To evaluate and score the current distribution:

      Example

      [ceph: root@host01 /]# ceph balancer eval

    • To evaluate the distribution for a single pool:

      Syntax

      ceph balancer eval POOL_NAME

      Example

      [ceph: root@host01 /]# ceph balancer eval rbd

    • To see greater detail for the evaluation:

      Example

      [ceph: root@host01 /]# ceph balancer eval-verbose ...

    • To generate a plan using the currently configured mode:

      Syntax

      ceph balancer optimize PLAN_NAME

      Replace PLAN_NAME with a custom plan name.

      Example

      [ceph: root@host01 /]# ceph balancer optimize rbd_123

    • To see the contents of a plan:

      Syntax

      ceph balancer show PLAN_NAME

      Example

      [ceph: root@host01 /]# ceph balancer show rbd_123

    • To discard old plans:

      Syntax

      ceph balancer rm PLAN_NAME

      Example

      [ceph: root@host01 /]# ceph balancer rm rbd_123

    • To see currently recorded plans, use the status command:

      Example

      [ceph: root@host01 /]# ceph balancer status

    • To calculate the quality of the distribution that would result after executing a plan:

      Syntax

      ceph balancer eval PLAN_NAME

      Example

      [ceph: root@host01 /]# ceph balancer eval rbd_123

    • To execute the plan:

      Syntax

      ceph balancer execute PLAN_NAME

      Example

      [ceph: root@host01 /]# ceph balancer execute rbd_123

      Note

      Only execute the plan if it is expected to improve the distribution. After execution, the plan will be discarded.
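
Putting these phases together, the following is an end-to-end sketch of a supervised optimization run; the plan name my_plan is only an example:

Example

[ceph: root@host01 /]# ceph balancer off                  # pause automatic balancing while working with plans
[ceph: root@host01 /]# ceph balancer eval                 # score the current distribution
[ceph: root@host01 /]# ceph balancer optimize my_plan     # build a plan with the currently configured mode
[ceph: root@host01 /]# ceph balancer show my_plan         # review the proposed changes
[ceph: root@host01 /]# ceph balancer eval my_plan         # score the distribution the plan would produce
[ceph: root@host01 /]# ceph balancer execute my_plan      # execute the plan only if it improves the distribution
[ceph: root@host01 /]# ceph balancer on                   # optionally turn automatic balancing back on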

5.6. Using the Ceph Manager alerts module

You can use the Ceph Manager alerts module to send simple alert messages about the Red Hat Ceph Storage cluster’s health by email.

Note

This module is not intended to be a robust monitoring solution. The fact that it runs as part of the Ceph cluster itself is fundamentally limiting, in that a failure of the ceph-mgr daemon prevents alerts from being sent. This module can, however, be useful for standalone clusters that run in environments where no monitoring infrastructure exists.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Root-level access to the Ceph Monitor node.

Procedure

  1. Log into the Cephadm shell:

    Example

    [root@host01 ~]# cephadm shell

  2. Enable the alerts module:

    Example

    [ceph: root@host01 /]# ceph mgr module enable alerts

  3. Ensure the alerts module is enabled:

    Example

    [ceph: root@host01 /]# ceph mgr module ls | more
    {
        "always_on_modules": [
            "balancer",
            "crash",
            "devicehealth",
            "orchestrator",
            "pg_autoscaler",
            "progress",
            "rbd_support",
            "status",
            "telemetry",
            "volumes"
        ],
        "enabled_modules": [
            "alerts",
            "cephadm",
            "dashboard",
            "iostat",
            "nfs",
            "prometheus",
            "restful"
        ]

  4. Configure the Simple Mail Transfer Protocol (SMTP):

    Syntax

    ceph config set mgr mgr/alerts/smtp_host SMTP_SERVER
    ceph config set mgr mgr/alerts/smtp_destination RECEIVER_EMAIL_ADDRESS
    ceph config set mgr mgr/alerts/smtp_sender SENDER_EMAIL_ADDRESS

    Example

    [ceph: root@host01 /]# ceph config set mgr mgr/alerts/smtp_host smtp.example.com
    [ceph: root@host01 /]# ceph config set mgr mgr/alerts/smtp_destination example@example.com
    [ceph: root@host01 /]# ceph config set mgr mgr/alerts/smtp_sender example2@example.com

  5. Optional: By default, the SMTP port is set to 465. To change it, set the smtp_port parameter:

    Syntax

    ceph config set mgr mgr/alerts/smtp_port PORT_NUMBER

    Example

    [ceph: root@host01 /]# ceph config set mgr mgr/alerts/smtp_port 587

    Important

    SSL is not supported in Red Hat Ceph Storage 5 clusters. Do not set the smtp_ssl parameter while configuring alerts.

  6. Authenticate to the SMTP server:

    Syntax

    ceph config set mgr mgr/alerts/smtp_user USERNAME
    ceph config set mgr mgr/alerts/smtp_password PASSWORD

    Example

    [ceph: root@host01 /]# ceph config set mgr mgr/alerts/smtp_user admin1234
    [ceph: root@host01 /]# ceph config set mgr mgr/alerts/smtp_password admin1234

  7. Optional: By default, the SMTP From name is Ceph. To change it, set the smtp_from_name parameter:

    Syntax

    ceph config set mgr mgr/alerts/smtp_from_name CLUSTER_NAME

    Example

    [ceph: root@host01 /]# ceph config set mgr mgr/alerts/smtp_from_name 'Ceph Cluster Test'

  8. Optional: By default, the alerts module checks the storage cluster’s health every minute, and sends a message when there is a change in the cluster health status. To change the frequency, set the interval parameter:

    Syntax

    ceph config set mgr mgr/alerts/interval INTERVAL

    Example

    [ceph: root@host01 /]# ceph config set mgr mgr/alerts/interval "5m"

    In this example, the interval is set to 5 minutes.

  9. Optional: Send an alert immediately:

    Example

    [ceph: root@host01 /]# ceph alerts send
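
To review the alert settings that are currently in effect, you can filter the cluster configuration. The following is a minimal sketch; the grep pattern is only a convenience:

Example

[ceph: root@host01 /]# ceph config dump | grep mgr/alerts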


5.7. Using the Ceph manager crash module

Using the Ceph manager crash module, you can collect information about daemon crashdumps and store it in the Red Hat Ceph Storage cluster for further analysis.

By default, daemon crashdumps are dumped in /var/lib/ceph/crash. You can configure a different location with the crash dir option. Crash directories are named by time, date, and a randomly-generated UUID, and contain a metadata file named meta and a recent log file, with a crash_id that is the same.

You can use ceph-crash.service to submit these crashes automatically and persist them in the Ceph Monitors. The ceph-crash.service watches the crashdump directory and uploads the crashes with ceph crash post.
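
In a Cephadm deployment, the crash agent runs as its own service on each host. You can confirm that the crash daemons are running by using the orchestrator; the following is a minimal sketch with the output omitted:

Example

[ceph: root@host01 /]# ceph orch ps --daemon_type=crash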

The RECENT_CRASH health message is one of the most common health messages in a Ceph cluster. This health message means that one or more Ceph daemons have crashed recently, and the crash has not yet been archived or acknowledged by the administrator. This might indicate a software bug, a hardware problem such as a failing disk, or some other problem. The mgr/crash/warn_recent_interval option controls the time period that recent means, which is two weeks by default. You can disable the warnings by running the following command:

Example

[ceph: root@host01 /]# ceph config set mgr mgr/crash/warn_recent_interval 0

The option mgr/crash/retain_interval controls the period for which you want to retain the crash reports before they are automatically purged. The default for this option is one year.
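
For example, to shorten the retention period to 30 days, set the option to the corresponding number of seconds. The following is a minimal sketch; the value 2592000 equals 30 days:

Example

[ceph: root@host01 /]# ceph config set mgr mgr/crash/retain_interval 2592000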

Prerequisites

  • A running Red Hat Ceph Storage cluster.

Procedure

  1. Ensure the crash module is enabled:

    Example

    [ceph: root@host01 /]# ceph mgr module ls | more
    {
        "always_on_modules": [
            "balancer",
            "crash",
            "devicehealth",
            "orchestrator_cli",
            "progress",
            "rbd_support",
            "status",
            "volumes"
        ],
        "enabled_modules": [
            "dashboard",
            "pg_autoscaler",
            "prometheus"
        ]

  2. Save a crash dump. The metadata file is a JSON blob stored in the crash directory as meta. You can invoke the ceph command with the -i - option, which reads from stdin:

    Example

    [ceph: root@host01 /]# ceph crash post -i meta

  3. List the timestamp or the UUID crash IDs for all the new and archived crash info:

    Example

    [ceph: root@host01 /]# ceph crash ls

  4. List the timestamp or the UUID crash IDs for all the new crash information:

    Example

    [ceph: root@host01 /]# ceph crash ls-new

  5. List the summary of saved crash information grouped by age:

    Example

    [ceph: root@host01 /]# ceph crash stat
    8 crashes recorded
    8 older than 1 days old:
    2022-05-20T08:30:14.533316Z_4ea88673-8db6-4959-a8c6-0eea22d305c2
    2022-05-20T08:30:14.590789Z_30a8bb92-2147-4e0f-a58b-a12c2c73d4f5
    2022-05-20T08:34:42.278648Z_6a91a778-bce6-4ef3-a3fb-84c4276c8297
    2022-05-20T08:34:42.801268Z_e5f25c74-c381-46b1-bee3-63d891f9fc2d
    2022-05-20T08:34:42.803141Z_96adfc59-be3a-4a38-9981-e71ad3d55e47
    2022-05-20T08:34:42.830416Z_e45ed474-550c-44b3-b9bb-283e3f4cc1fe
    2022-05-24T19:58:42.549073Z_b2382865-ea89-4be2-b46f-9a59af7b7a2d
    2022-05-24T19:58:44.315282Z_1847afbc-f8a9-45da-94e8-5aef0738954e

  6. View the details of the saved crash:

    Syntax

    ceph crash info CRASH_ID

    Example

    [ceph: root@host01 /]# ceph crash info 2022-05-24T19:58:42.549073Z_b2382865-ea89-4be2-b46f-9a59af7b7a2d
    {
        "assert_condition": "session_map.sessions.empty()",
        "assert_file": "/builddir/build/BUILD/ceph-16.1.0-486-g324d7073/src/mon/Monitor.cc",
        "assert_func": "virtual Monitor::~Monitor()",
        "assert_line": 287,
        "assert_msg": "/builddir/build/BUILD/ceph-16.1.0-486-g324d7073/src/mon/Monitor.cc: In function 'virtual Monitor::~Monitor()' thread 7f67a1aeb700 time 2022-05-24T19:58:42.545485+0000\n/builddir/build/BUILD/ceph-16.1.0-486-g324d7073/src/mon/Monitor.cc: 287: FAILED ceph_assert(session_map.sessions.empty())\n",
        "assert_thread_name": "ceph-mon",
        "backtrace": [
            "/lib64/libpthread.so.0(+0x12b30) [0x7f679678bb30]",
            "gsignal()",
            "abort()",
            "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x7f6798c8d37b]",
            "/usr/lib64/ceph/libceph-common.so.2(+0x276544) [0x7f6798c8d544]",
            "(Monitor::~Monitor()+0xe30) [0x561152ed3c80]",
            "(Monitor::~Monitor()+0xd) [0x561152ed3cdd]",
            "main()",
            "__libc_start_main()",
            "_start()"
        ],
        "ceph_version": "16.2.8-65.el8cp",
        "crash_id": "2022-07-06T19:58:42.549073Z_b2382865-ea89-4be2-b46f-9a59af7b7a2d",
        "entity_name": "mon.ceph-adm4",
        "os_id": "rhel",
        "os_name": "Red Hat Enterprise Linux",
        "os_version": "8.5 (Ootpa)",
        "os_version_id": "8.5",
        "process_name": "ceph-mon",
        "stack_sig": "957c21d558d0cba4cee9e8aaf9227b3b1b09738b8a4d2c9f4dc26d9233b0d511",
        "timestamp": "2022-07-06T19:58:42.549073Z",
        "utsname_hostname": "host02",
        "utsname_machine": "x86_64",
        "utsname_release": "4.18.0-240.15.1.el8_3.x86_64",
        "utsname_sysname": "Linux",
        "utsname_version": "#1 SMP Wed Jul 06 03:12:15 EDT 2022"
    }

  7. Remove saved crashes older than KEEP days, where KEEP must be an integer:

    Syntax

    ceph crash prune KEEP

    Example

    [ceph: root@host01 /]# ceph crash prune 60

  8. Archive a crash report so that it is no longer considered for the RECENT_CRASH health check and does not appear in the ceph crash ls-new output, although it still appears in the ceph crash ls output:

    Syntax

    ceph crash archive CRASH_ID

    Example

    [ceph: root@host01 /]# ceph crash archive 2022-05-24T19:58:42.549073Z_b2382865-ea89-4be2-b46f-9a59af7b7a2d

  9. Archive all crash reports:

    Example

    [ceph: root@host01 /]# ceph crash archive-all

  10. Remove the crash dump:

    Syntax

    ceph crash rm CRASH_ID

    Example

    [ceph: root@host01 /]# ceph crash rm 2022-05-24T19:58:42.549073Z_b2382865-ea89-4be2-b46f-9a59af7b7a2d
