Chapter 2. Configuring logging

This chapter describes how to configure logging for various Ceph subsystems.

Important

Logging is resource intensive. Also, verbose logging can generate a huge amount of data in a relatively short time. If you are encountering problems in a specific subsystem of the cluster, enable logging only of that subsystem. See Section 2.2, “Ceph subsystems” for more information.

In addition, consider setting up a rotation of log files. See Section 2.5, “Accelerating log rotation” for details.

Once you fix any problems you encounter, change the subsystems log and memory levels to their default values. See Appendix A, Ceph subsystems default logging level values for a list of all Ceph subsystems and their default values.

You can configure Ceph logging by:

2.1. Prerequisites

  • A running Red Hat Ceph Storage cluster.

2.2. Ceph subsystems

This section contains information about Ceph subsystems and their logging levels.

Understanding Ceph Subsystems and Their Logging Levels

Ceph consists of several subsystems.

Each subsystem has a logging level of its:

  • Output logs that are stored by default in /var/log/ceph/CLUSTER_FSID/ directory (log level). To enable logging Ceph Monitors, Ceph Manager, Ceph Object Gateway, and any other daemons, set log_to_file to true as detailed in the section Diagnosing the health of a storage cluster
  • Logs that are stored in a memory cache (memory level)

In general, Ceph does not send logs stored in memory to the output logs unless:

  • A fatal signal is raised
  • An assert in source code is triggered
  • You request it

You can set different values for each of these subsystems. Ceph logging levels operate on a scale of 1 to 20, where 1 is terse and 20 is verbose.

Use a single value for the log level and memory level to set them both to the same value. For example, debug_osd = 5 sets the debug level for the ceph-osd daemon to 5.

To use different values for the output log level and the memory level, separate the values with a forward slash (/). For example, debug_mon = 1/5 sets the debug log level for the ceph-mon daemon to 1 and its memory log level to 5.

Table 2.1. Ceph Subsystems and the Logging Default Values

SubsystemLog LevelMemory LevelDescription

asok

1

5

The administration socket

auth

1

5

Authentication

client

0

5

Any application or library that uses librados to connect to the cluster

bluestore

1

5

The BlueStore OSD backend

journal

1

5

The OSD journal

mds

1

5

The Metadata Servers

monc

0

5

The Monitor client handles communication between most Ceph daemons and Monitors

mon

1

5

Monitors

ms

0

5

The messaging system between Ceph components

osd

0

5

The OSD Daemons

paxos

0

5

The algorithm that Monitors use to establish a consensus

rados

0

5

Reliable Autonomic Distributed Object Store, a core component of Ceph

rbd

0

5

The Ceph Block Devices

rgw

1

5

The Ceph Object Gateway

Example Log Outputs

The following examples show the type of messages in the logs when you increase the verbosity for the Monitors and OSDs.

Monitor Debug Settings

debug_ms = 5
debug_mon = 20
debug_paxos = 20
debug_auth = 20

Example Log Output of Monitor Debug Settings

2022-05-12 12:37:04.278761 7f45a9afc700 10 mon.cephn2@0(leader).osd e322 e322: 2 osds: 2 up, 2 in
2022-05-12 12:37:04.278792 7f45a9afc700 10 mon.cephn2@0(leader).osd e322  min_last_epoch_clean 322
2022-05-12 12:37:04.278795 7f45a9afc700 10 mon.cephn2@0(leader).log v1010106 log
2022-05-12 12:37:04.278799 7f45a9afc700 10 mon.cephn2@0(leader).auth v2877 auth
2022-05-12 12:37:04.278811 7f45a9afc700 20 mon.cephn2@0(leader) e1 sync_trim_providers
2022-05-12 12:37:09.278914 7f45a9afc700 11 mon.cephn2@0(leader) e1 tick
2022-05-12 12:37:09.278949 7f45a9afc700 10 mon.cephn2@0(leader).pg v8126 v8126: 64 pgs: 64 active+clean; 60168 kB data, 172 MB used, 20285 MB / 20457 MB avail
2022-05-12 12:37:09.278975 7f45a9afc700 10 mon.cephn2@0(leader).paxosservice(pgmap 7511..8126) maybe_trim trim_to 7626 would only trim 115 < paxos_service_trim_min 250
2022-05-12 12:37:09.278982 7f45a9afc700 10 mon.cephn2@0(leader).osd e322 e322: 2 osds: 2 up, 2 in
2022-05-12 12:37:09.278989 7f45a9afc700  5 mon.cephn2@0(leader).paxos(paxos active c 1028850..1029466) is_readable = 1 - now=2021-08-12 12:37:09.278990 lease_expire=0.000000 has v0 lc 1029466
....
2022-05-12 12:59:18.769963 7f45a92fb700  1 -- 192.168.0.112:6789/0 <== osd.1 192.168.0.114:6800/2801 5724 ==== pg_stats(0 pgs tid 3045 v 0) v1 ==== 124+0+0 (2380105412 0 0) 0x5d96300 con 0x4d5bf40
2022-05-12 12:59:18.770053 7f45a92fb700  1 -- 192.168.0.112:6789/0 --> 192.168.0.114:6800/2801 -- pg_stats_ack(0 pgs tid 3045) v1 -- ?+0 0x550ae00 con 0x4d5bf40
2022-05-12 12:59:32.916397 7f45a9afc700  0 mon.cephn2@0(leader).data_health(1) update_stats avail 53% total 1951 MB, used 780 MB, avail 1053 MB
....
2022-05-12 13:01:05.256263 7f45a92fb700  1 -- 192.168.0.112:6789/0 --> 192.168.0.113:6800/2410 -- mon_subscribe_ack(300s) v1 -- ?+0 0x4f283c0 con 0x4d5b440

OSD Debug Settings

debug_ms = 5
debug_osd = 20

Example Log Output of OSD Debug Settings

2022-05-12 11:27:53.869151 7f5d55d84700  1 -- 192.168.17.3:0/2410 --> 192.168.17.4:6801/2801 -- osd_ping(ping e322 stamp 2021-08-12 11:27:53.869147) v2 -- ?+0 0x63baa00 con 0x578dee0
2022-05-12 11:27:53.869214 7f5d55d84700  1 -- 192.168.17.3:0/2410 --> 192.168.0.114:6801/2801 -- osd_ping(ping e322 stamp 2021-08-12 11:27:53.869147) v2 -- ?+0 0x638f200 con 0x578e040
2022-05-12 11:27:53.870215 7f5d6359f700  1 -- 192.168.17.3:0/2410 <== osd.1 192.168.0.114:6801/2801 109210 ==== osd_ping(ping_reply e322 stamp 2021-08-12 11:27:53.869147) v2 ==== 47+0+0 (261193640 0 0) 0x63c1a00 con 0x578e040
2022-05-12 11:27:53.870698 7f5d6359f700  1 -- 192.168.17.3:0/2410 <== osd.1 192.168.17.4:6801/2801 109210 ==== osd_ping(ping_reply e322 stamp 2021-08-12 11:27:53.869147) v2 ==== 47+0+0 (261193640 0 0) 0x6313200 con 0x578dee0
....
2022-05-12 11:28:10.432313 7f5d6e71f700  5 osd.0 322 tick
2022-05-12 11:28:10.432375 7f5d6e71f700 20 osd.0 322 scrub_random_backoff lost coin flip, randomly backing off
2022-05-12 11:28:10.432381 7f5d6e71f700 10 osd.0 322 do_waiters -- start
2022-05-12 11:28:10.432383 7f5d6e71f700 10 osd.0 322 do_waiters -- finish

2.3. Configuring logging at runtime

You can configure the logging of Ceph subsystems at system runtime to help troubleshoot any issues that might occur.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Access to Ceph debugger.

Procedure

  1. To activate the Ceph debugging output, dout(), at runtime:

    ceph tell TYPE.ID injectargs --debug-SUBSYSTEM VALUE [--NAME VALUE]
  2. Replace:

    • TYPE with the type of Ceph daemons (osd, mon, or mds)
    • ID with a specific ID of the Ceph daemon. Alternatively, use * to apply the runtime setting to all daemons of a particular type.
    • SUBSYSTEM with a specific subsystem.
    • VALUE with a number from 1 to 20, where 1 is terse and 20 is verbose.

      For example, to set the log level for the OSD subsystem on the OSD named osd.0 to 0 and the memory level to 5:

      # ceph tell osd.0 injectargs --debug-osd 0/5

To see the configuration settings at runtime:

  1. Log in to the host with a running Ceph daemon, for example, ceph-osd or ceph-mon.
  2. Display the configuration:

    Syntax

    ceph daemon NAME config show | less

    Example

    [ceph: root@host01 /]# ceph daemon osd.0 config show | less

Additional Resources

2.4. Configuring logging in configuration file

Configure Ceph subsystems to log informational, warning, and error messages to the log file. You can specify the debugging level in the Ceph configuration file, by default /etc/ceph/ceph.conf.

Prerequisites

  • A running Red Hat Ceph Storage cluster.

Procedure

  1. To activate Ceph debugging output, dout() at boot time, add the debugging settings to the Ceph configuration file.

    1. For subsystems common to each daemon, add the settings under the [global] section.
    2. For subsystems for particular daemons, add the settings under a daemon section, such as [mon], [osd], or [mds].

      Example

      [global]
              debug_ms = 1/5
      
      [mon]
              debug_mon = 20
              debug_paxos = 1/5
              debug_auth = 2
      
      [osd]
              debug_osd = 1/5
              debug_monc = 5/20
      
      [mds]
              debug_mds = 1

Additional Resources

2.5. Accelerating log rotation

Increasing debugging level for Ceph components might generate a huge amount of data. If you have almost full disks, you can accelerate log rotation by modifying the Ceph log rotation file at /etc/logrotate.d/ceph. The Cron job scheduler uses this file to schedule log rotation.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Root-level access to the node.

Procedure

  1. Add the size setting after the rotation frequency to the log rotation file:

    rotate 7
    weekly
    size SIZE
    compress
    sharedscripts

    For example, to rotate a log file when it reaches 500 MB:

    rotate 7
    weekly
    size 500 MB
    compress
    sharedscripts
    size 500M
  2. Open the crontab editor:

    [root@mon ~]# crontab -e
  3. Add an entry to check the /etc/logrotate.d/ceph file. For example, to instruct Cron to check /etc/logrotate.d/ceph every 30 minutes:

    30 * * * * /usr/sbin/logrotate /etc/logrotate.d/ceph >/dev/null 2>&1

2.6. Creating and collecting operation logs for Ceph Object Gateway

User identity information is added to the operation log output. This is used to enable customers to access this information for auditing of S3 access. Track user identities reliably by S3 request in all versions of the Ceph Object Gateway operation log.

Procedure

  1. Find where the logs are located:

    Syntax

    logrotate -f

    Example

    [root@host01 ~]# logrotate -f
    /etc/logrotate.d/ceph-12ab345c-1a2b-11ed-b736-fa163e4f6220

  2. List the logs within the specified location:

    Syntax

    ll LOG_LOCATION

    Example

    [root@host01 ~]# ll /var/log/ceph/12ab345c-1a2b-11ed-b736-fa163e4f6220
     -rw-r--r--. 1 ceph ceph    412 Sep 28 09:26 opslog.log.1.gz

  3. List the current buckets:

    Example

    [root@host01 ~]# /usr/local/bin/s3cmd ls

  4. Create a bucket:

    Syntax

    /usr/local/bin/s3cmd mb s3://NEW_BUCKET_NAME

    Example

    [root@host01 ~]# /usr/local/bin/s3cmd mb s3://bucket1
    Bucket `s3://bucket1` created

  5. List the current logs:

    Syntax

    ll LOG_LOCATION

    Example

    [root@host01 ~]# ll /var/log/ceph/12ab345c-1a2b-11ed-b736-fa163e4f6220
     total 852
     ...
     -rw-r--r--. 1 ceph ceph    920 Jun 29 02:17 opslog.log
     -rw-r--r--. 1 ceph ceph    412 Jun 28 09:26 opslog.log.1.gz

  6. Collect the logs:

    Syntax

    tail -f LOG_LOCATION/opslog.log

    Example

    [root@host01 ~]# tail -f /var/log/ceph/12ab345c-1a2b-11ed-b736-fa163e4f6220/opslog.log
    
    {"bucket":"","time":"2022-09-29T06:17:03.133488Z","time_local":"2022-09-
    29T06:17:03.133488+0000","remote_addr":"10.0.211.66","user":"test1",
    "operation":"list_buckets","uri":"GET /
    HTTP/1.1","http_status":"200","error_code":"","bytes_sent":232,
    "bytes_received":0,"object_size":0,"total_time":9,"user_agent":"","referrer":
    "","trans_id":"tx00000c80881a9acd2952a-006335385f-175e5-primary",
    "authentication_type":"Local","access_key_id":"1234","temp_url":false}
    
    {"bucket":"cn1","time":"2022-09-29T06:17:10.521156Z","time_local":"2022-09-
    29T06:17:10.521156+0000","remote_addr":"10.0.211.66","user":"test1",
    "operation":"create_bucket","uri":"PUT /cn1/
    HTTP/1.1","http_status":"200","error_code":"","bytes_sent":0,
    "bytes_received":0,"object_size":0,"total_time":106,"user_agent":"",
    "referrer":"","trans_id":"tx0000058d60c593632c017-0063353866-175e5-primary",
    "authentication_type":"Local","access_key_id":"1234","temp_url":false}