Chapter 8. Ceph performance counters

As a storage administrator, you can gather performance metrics of the Red Hat Ceph Storage cluster. The Ceph performance counters are a collection of internal infrastructure metrics. The collection, aggregation, and graphing of this metric data can be done by an assortment of tools and can be useful for performance analytics.

8.1. Prerequisites

  • A running Red Hat Ceph Storage cluster.

8.2. Access to Ceph performance counters

The performance counters are available through a socket interface for the Ceph Monitors and the OSDs. By default, the socket file for each daemon is located under /var/run/ceph. The performance counters are grouped into collection names. Each collection name represents a subsystem or an instance of a subsystem.
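
To see which daemons expose a socket on a node, you can list the socket files. The following is a minimal Python sketch, not part of the product. It assumes the default socket naming scheme CLUSTER-TYPE.ID.asok (for example, ceph-osd.0.asok), which this chapter does not state. The function name list_admin_sockets is hypothetical.

```python
from pathlib import Path


def list_admin_sockets(run_dir="/var/run/ceph"):
    """Map daemon names (for example "osd.0") to their socket file paths.

    Assumption: socket files follow the default CLUSTER-TYPE.ID.asok naming,
    for example ceph-osd.0.asok. A customized socket path will not match.
    """
    sockets = {}
    for path in sorted(Path(run_dir).glob("*.asok")):
        # "ceph-osd.0" -> cluster "ceph", daemon "osd.0"
        _cluster, _, daemon = path.stem.partition("-")
        sockets[daemon or path.stem] = path
    return sockets


if __name__ == "__main__":
    for daemon, path in list_admin_sockets().items():
        print(f"{daemon}: {path}")
```

The daemon names printed by this sketch are in the form expected as DAEMON_NAME by the ceph daemon command.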

Here is the full list of the Monitor, OSD, and RADOS Gateway collection name categories, with a brief description of each:

Monitor Collection Name Categories

  • Cluster Metrics - Displays information about the storage cluster: Monitors, OSDs, Pools, and PGs
  • Level Database Metrics - Displays information about the back-end KeyValueStore database
  • Monitor Metrics - Displays general monitor information
  • Paxos Metrics - Displays information on cluster quorum management
  • Throttle Metrics - Displays the statistics on how the monitor is throttling

OSD Collection Name Categories

  • Write Back Throttle Metrics - Displays the statistics on how the write back throttle is tracking unflushed IO
  • Level Database Metrics - Displays information about the back-end KeyValueStore database
  • Objecter Metrics - Displays information on various object-based operations
  • Read and Write Operations Metrics - Displays information on various read and write operations
  • Recovery State Metrics - Displays latencies on various recovery states
  • OSD Throttle Metrics - Displays the statistics on how the OSD is throttling

RADOS Gateway Collection Name Categories

  • Object Gateway Client Metrics - Displays statistics on GET and PUT requests
  • Objecter Metrics - Displays information on various object-based operations
  • Object Gateway Throttle Metrics - Displays the statistics on how the Object Gateway is throttling

8.3. Display the Ceph performance counters

The ceph daemon .. perf schema command outputs the available metrics. Each metric has an associated bit field value type.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Root-level access to the node.

Procedure

  1. To view the metric’s schema:

    ceph daemon DAEMON_NAME perf schema

    Note: You must run the ceph daemon command from the node running the daemon.

  2. Executing the ceph daemon .. perf schema command from the Monitor node:

    [root@mon ~]# ceph daemon mon.`hostname -s` perf schema

    Example

    {
        "cluster": {
            "num_mon": {
                "type": 2
            },
            "num_mon_quorum": {
                "type": 2
            },
            "num_osd": {
                "type": 2
            },
            "num_osd_up": {
                "type": 2
            },
            "num_osd_in": {
                "type": 2
            },
    ...

  3. Executing the ceph daemon .. perf schema command from the OSD node:

    [root@osd ~]# ceph daemon osd.0 perf schema

    Example

    ...
    "filestore": {
            "journal_queue_max_ops": {
                "type": 2
            },
            "journal_queue_ops": {
                "type": 2
            },
            "journal_ops": {
                "type": 10
            },
            "journal_queue_max_bytes": {
                "type": 2
            },
            "journal_queue_bytes": {
                "type": 2
            },
            "journal_bytes": {
                "type": 10
            },
            "journal_latency": {
                "type": 5
            },
    ...

Table 8.1. The bit field value definitions

Bit | Meaning
1 | Floating point value
2 | Unsigned 64-bit integer value
4 | Average (Sum + Count)
8 | Counter

Each value has bit 1 or bit 2 set to indicate its type, either a floating point or an integer value. When bit 4 is set, there are two values to read, a sum and a count. When bit 8 is set, the average for the previous interval is the sum delta since the previous read, divided by the count delta. Alternatively, dividing the sum by the count outright gives the lifetime average. Typically these pairs are used to measure latencies: the count is the number of requests and the sum is the total of the request latencies.

Some bit values are combined, for example 5, 6, and 10:

  • 5 is a combination of bit 1 and bit 4, so the average is a floating point value.
  • 6 is a combination of bit 2 and bit 4, so the average is an integer value.
  • 10 is a combination of bit 2 and bit 8, so the counter is an integer value.
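
The bit field can be decoded with bitwise tests. The following Python sketch illustrates this using the bit values from Table 8.1. The constant and function names are hypothetical and chosen for illustration only.

```python
# Bit values from Table 8.1.
TYPE_FLOAT = 1
TYPE_UINT64 = 2
TYPE_AVERAGE = 4   # two values to read: a sum and a count
TYPE_COUNTER = 8


def describe_type(bits):
    """Return a readable description of a perf schema "type" value."""
    parts = []
    if bits & TYPE_FLOAT:
        parts.append("floating point")
    if bits & TYPE_UINT64:
        parts.append("unsigned 64-bit integer")
    if bits & TYPE_AVERAGE:
        parts.append("average (sum and count)")
    if bits & TYPE_COUNTER:
        parts.append("counter")
    return ", ".join(parts)


# The type values that appear in the schema examples in this chapter.
for value in (2, 5, 6, 10):
    print(value, "->", describe_type(value))
```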

8.4. Dump the Ceph performance counters

The ceph daemon .. perf dump command outputs the current values and groups the metrics under the collection name for each subsystem.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Root-level access to the node.

Procedure

  1. To view the current metric data:

    # ceph daemon DAEMON_NAME perf dump

    Note: You must run the ceph daemon command from the node running the daemon.

  2. Executing the ceph daemon .. perf dump command from the Monitor node:

    # ceph daemon mon.`hostname -s` perf dump

    Example

    {
        "cluster": {
            "num_mon": 1,
            "num_mon_quorum": 1,
            "num_osd": 2,
            "num_osd_up": 2,
            "num_osd_in": 2,
    ...

  3. Executing the ceph daemon .. perf dump command from the OSD node:

    # ceph daemon osd.0 perf dump

    Example

    ...
    "filestore": {
            "journal_queue_max_ops": 300,
            "journal_queue_ops": 0,
            "journal_ops": 992,
            "journal_queue_max_bytes": 33554432,
            "journal_queue_bytes": 0,
            "journal_bytes": 934537,
            "journal_latency": {
                "avgcount": 992,
                "sum": 254.975925772
            },
    ...
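
Because the output is JSON, it can be consumed by a script. The following Python sketch wraps the perf dump command shown in this procedure and flattens the result into one line per metric. The helper names perf_dump and flatten are hypothetical, and the sketch assumes the two-level layout (collection, then metric) shown in the examples above.

```python
import json
import subprocess


def perf_dump(daemon_name):
    """Run `ceph daemon DAEMON_NAME perf dump` and return the parsed JSON.

    Must be run as root on the node that hosts the daemon.
    """
    result = subprocess.run(
        ["ceph", "daemon", daemon_name, "perf", "dump"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def flatten(dump):
    """Yield (collection, metric, value) for every metric in a perf dump."""
    for collection, metrics in dump.items():
        for metric, value in metrics.items():
            yield collection, metric, value


if __name__ == "__main__":
    # Sample values copied from the Monitor example, so no cluster is needed.
    sample = json.loads('{"cluster": {"num_mon": 1, "num_osd": 2}}')
    for collection, metric, value in flatten(sample):
        print(f"{collection}.{metric} = {value}")
```

On a node that runs a daemon, replacing the sample with perf_dump("osd.0") would read live values instead.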

8.5. Average count and sum

All latency metrics have a bit field value of 5. Each one reports two values: avgcount, which is the number of operations counted, and sum, which is the total latency of those operations in seconds, as a floating point value. Dividing the sum by the avgcount gives the average latency per operation.
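
As a worked illustration, the following Python sketch computes the lifetime average from the journal_latency values in the perf dump example, and the interval average between two reads as described for the bit field values. The function names are hypothetical.

```python
def lifetime_average(sample):
    """Average latency per operation, in seconds, over all counted operations."""
    if sample["avgcount"] == 0:
        return 0.0
    return sample["sum"] / sample["avgcount"]


def interval_average(previous, current):
    """Average latency per operation between two reads of the same metric."""
    count_delta = current["avgcount"] - previous["avgcount"]
    if count_delta <= 0:
        # No new operations were counted between the two reads.
        return 0.0
    return (current["sum"] - previous["sum"]) / count_delta


# Values from the OSD perf dump example in this chapter.
journal_latency = {"avgcount": 992, "sum": 254.975925772}
print(f"{lifetime_average(journal_latency):.3f} seconds per operation")
```

For the example values, the lifetime average is about 0.257 seconds per operation.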

Additional Resources

  • To view a short description of each OSD metric available, see the tables in the Ceph OSD metrics section.

8.6. Ceph Monitor metrics

Table 8.2. Cluster Metrics Table

Collection Name | Metric Name | Bit Field Value | Short Description
cluster | num_mon | 2 | Number of monitors
cluster | num_mon_quorum | 2 | Number of monitors in quorum
cluster | num_osd | 2 | Total number of OSDs
cluster | num_osd_up | 2 | Number of OSDs that are up
cluster | num_osd_in | 2 | Number of OSDs that are in the cluster
cluster | osd_epoch | 2 | Current epoch of OSD map
cluster | osd_bytes | 2 | Total capacity of cluster in bytes
cluster | osd_bytes_used | 2 | Number of used bytes on cluster
cluster | osd_bytes_avail | 2 | Number of available bytes on cluster
cluster | num_pool | 2 | Number of pools
cluster | num_pg | 2 | Total number of placement groups
cluster | num_pg_active_clean | 2 | Number of placement groups in active+clean state
cluster | num_pg_active | 2 | Number of placement groups in active state
cluster | num_pg_peering | 2 | Number of placement groups in peering state
cluster | num_object | 2 | Total number of objects on cluster
cluster | num_object_degraded | 2 | Number of degraded (missing replicas) objects
cluster | num_object_misplaced | 2 | Number of misplaced (wrong location in the cluster) objects
cluster | num_object_unfound | 2 | Number of unfound objects
cluster | num_bytes | 2 | Total number of bytes of all objects
cluster | num_mds_up | 2 | Number of MDSs that are up
cluster | num_mds_in | 2 | Number of MDSs that are in the cluster
cluster | num_mds_failed | 2 | Number of failed MDSs
cluster | mds_epoch | 2 | Current epoch of MDS map
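
The cluster metrics are plain integer values, so simple ratios can be derived from them. The following Python sketch shows one possible way to do this. The function name cluster_ratios and the sample values are hypothetical and are not taken from a real cluster.

```python
def cluster_ratios(cluster):
    """Derive a few ratios from the "cluster" collection of a Monitor perf dump."""

    def ratio(part, whole):
        # Guard against division by zero, for example when no OSDs exist yet.
        return part / whole if whole else 0.0

    return {
        "capacity_used": ratio(cluster["osd_bytes_used"], cluster["osd_bytes"]),
        "osds_up": ratio(cluster["num_osd_up"], cluster["num_osd"]),
        "pgs_active_clean": ratio(cluster["num_pg_active_clean"], cluster["num_pg"]),
    }


# Hypothetical values for illustration only.
sample = {
    "osd_bytes": 1000, "osd_bytes_used": 250,
    "num_osd": 2, "num_osd_up": 2,
    "num_pg": 64, "num_pg_active_clean": 48,
}
for name, value in cluster_ratios(sample).items():
    print(f"{name}: {value:.0%}")
```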

Table 8.3. Level Database Metrics Table

Collection Name | Metric Name | Bit Field Value | Short Description
leveldb | leveldb_get | 10 | Gets
leveldb | leveldb_transaction | 10 | Transactions
leveldb | leveldb_compact | 10 | Compactions
leveldb | leveldb_compact_range | 10 | Compactions by range
leveldb | leveldb_compact_queue_merge | 10 | Mergings of ranges in compaction queue
leveldb | leveldb_compact_queue_len | 2 | Length of compaction queue

Table 8.4. General Monitor Metrics Table

Collection Name | Metric Name | Bit Field Value | Short Description
mon | num_sessions | 2 | Current number of opened monitor sessions
mon | session_add | 10 | Number of created monitor sessions
mon | session_rm | 10 | Number of remove_session calls in monitor
mon | session_trim | 10 | Number of trimmed monitor sessions
mon | num_elections | 10 | Number of elections monitor took part in
mon | election_call | 10 | Number of elections started by monitor
mon | election_win | 10 | Number of elections won by monitor
mon | election_lose | 10 | Number of elections lost by monitor

Table 8.5. Paxos Metrics Table

Collection Name | Metric Name | Bit Field Value | Short Description
paxos | start_leader | 10 | Starts in leader role
paxos | start_peon | 10 | Starts in peon role
paxos | restart | 10 | Restarts
paxos | refresh | 10 | Refreshes
paxos | refresh_latency | 5 | Refresh latency
paxos | begin | 10 | Started and handled begins
paxos | begin_keys | 6 | Keys in transaction on begin
paxos | begin_bytes | 6 | Data in transaction on begin
paxos | begin_latency | 5 | Latency of begin operation
paxos | commit | 10 | Commits
paxos | commit_keys | 6 | Keys in transaction on commit
paxos | commit_bytes | 6 | Data in transaction on commit
paxos | commit_latency | 5 | Commit latency
paxos | collect | 10 | Peon collects
paxos | collect_keys | 6 | Keys in transaction on peon collect
paxos | collect_bytes | 6 | Data in transaction on peon collect
paxos | collect_latency | 5 | Peon collect latency
paxos | collect_uncommitted | 10 | Uncommitted values in started and handled collects
paxos | collect_timeout | 10 | Collect timeouts
paxos | accept_timeout | 10 | Accept timeouts
paxos | lease_ack_timeout | 10 | Lease acknowledgement timeouts
paxos | lease_timeout | 10 | Lease timeouts
paxos | store_state | 10 | Store a shared state on disk
paxos | store_state_keys | 6 | Keys in transaction in stored state
paxos | store_state_bytes | 6 | Data in transaction in stored state
paxos | store_state_latency | 5 | Storing state latency
paxos | share_state | 10 | Sharings of state
paxos | share_state_keys | 6 | Keys in shared state
paxos | share_state_bytes | 6 | Data in shared state
paxos | new_pn | 10 | New proposal number queries
paxos | new_pn_latency | 5 | New proposal number getting latency

Table 8.6. Throttle Metrics Table

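Collection Name | Metric Name | Bit Field Value | Short Description
throttle-* | val | 10 | Currently available throttle
throttle-* | max | 10 | Max value for throttle
throttle-* | get | 10 | Gets
throttle-* | get_sum | 10 | Got data
throttle-* | get_or_fail_fail | 10 | Get blocked during get_or_fail
throttle-* | get_or_fail_success | 10 | Successful get during get_or_fail
throttle-* | take | 10 | Takes
throttle-* | take_sum | 10 | Taken data
throttle-* | put | 10 | Puts
throttle-* | put_sum | 10 | Put data
throttle-* | wait | 5 | Waiting latency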

8.7. Ceph OSD metrics

Table 8.7. Write Back Throttle Metrics Table

Collection Name | Metric Name | Bit Field Value | Short Description
WBThrottle | bytes_dirtied | 2 | Dirty data
WBThrottle | bytes_wb | 2 | Written data
WBThrottle | ios_dirtied | 2 | Dirty operations
WBThrottle | ios_wb | 2 | Written operations
WBThrottle | inodes_dirtied | 2 | Entries waiting for write
WBThrottle | inodes_wb | 2 | Written entries

Table 8.8. Level Database Metrics Table

Collection Name | Metric Name | Bit Field Value | Short Description
leveldb | leveldb_get | 10 | Gets
leveldb | leveldb_transaction | 10 | Transactions
leveldb | leveldb_compact | 10 | Compactions
leveldb | leveldb_compact_range | 10 | Compactions by range
leveldb | leveldb_compact_queue_merge | 10 | Mergings of ranges in compaction queue
leveldb | leveldb_compact_queue_len | 2 | Length of compaction queue

Table 8.9. Objecter Metrics Table

Collection Name | Metric Name | Bit Field Value | Short Description
objecter | op_active | 2 | Active operations
objecter | op_laggy | 2 | Laggy operations
objecter | op_send | 10 | Sent operations
objecter | op_send_bytes | 10 | Sent data
objecter | op_resend | 10 | Resent operations
objecter | op_ack | 10 | Commit callbacks
objecter | op_commit | 10 | Operation commits
objecter | op | 10 | Operation
objecter | op_r | 10 | Read operations
objecter | op_w | 10 | Write operations
objecter | op_rmw | 10 | Read-modify-write operations
objecter | op_pg | 10 | PG operation
objecter | osdop_stat | 10 | Stat operations
objecter | osdop_create | 10 | Create object operations
objecter | osdop_read | 10 | Read operations
objecter | osdop_write | 10 | Write operations
objecter | osdop_writefull | 10 | Write full object operations
objecter | osdop_append | 10 | Append operation
objecter | osdop_zero | 10 | Set object to zero operations
objecter | osdop_truncate | 10 | Truncate object operations
objecter | osdop_delete | 10 | Delete object operations
objecter | osdop_mapext | 10 | Map extent operations
objecter | osdop_sparse_read | 10 | Sparse read operations
objecter | osdop_clonerange | 10 | Clone range operations
objecter | osdop_getxattr | 10 | Get xattr operations
objecter | osdop_setxattr | 10 | Set xattr operations
objecter | osdop_cmpxattr | 10 | Xattr comparison operations
objecter | osdop_rmxattr | 10 | Remove xattr operations
objecter | osdop_resetxattrs | 10 | Reset xattr operations
objecter | osdop_tmap_up | 10 | TMAP update operations
objecter | osdop_tmap_put | 10 | TMAP put operations
objecter | osdop_tmap_get | 10 | TMAP get operations
objecter | osdop_call | 10 | Call (execute) operations
objecter | osdop_watch | 10 | Watch by object operations
objecter | osdop_notify | 10 | Notify about object operations
objecter | osdop_src_cmpxattr | 10 | Extended attribute comparison in multi operations
objecter | osdop_other | 10 | Other operations
objecter | linger_active | 2 | Active lingering operations
objecter | linger_send | 10 | Sent lingering operations
objecter | linger_resend | 10 | Resent lingering operations
objecter | linger_ping | 10 | Sent pings to lingering operations
objecter | poolop_active | 2 | Active pool operations
objecter | poolop_send | 10 | Sent pool operations
objecter | poolop_resend | 10 | Resent pool operations
objecter | poolstat_active | 2 | Active get pool stat operations
objecter | poolstat_send | 10 | Pool stat operations sent
objecter | poolstat_resend | 10 | Resent pool stats
objecter | statfs_active | 2 | Statfs operations
objecter | statfs_send | 10 | Sent FS stats
objecter | statfs_resend | 10 | Resent FS stats
objecter | command_active | 2 | Active commands
objecter | command_send | 10 | Sent commands
objecter | command_resend | 10 | Resent commands
objecter | map_epoch | 2 | OSD map epoch
objecter | map_full | 10 | Full OSD maps received
objecter | map_inc | 10 | Incremental OSD maps received
objecter | osd_sessions | 2 | Open sessions
objecter | osd_session_open | 10 | Sessions opened
objecter | osd_session_close | 10 | Sessions closed
objecter | osd_laggy | 2 | Laggy OSD sessions

Table 8.10. Read and Write Operations Metrics Table

Collection Name | Metric Name | Bit Field Value | Short Description
osd | op_wip | 2 | Replication operations currently being processed (primary)
osd | op_in_bytes | 10 | Client operations total write size
osd | op_out_bytes | 10 | Client operations total read size
osd | op_latency | 5 | Latency of client operations (including queue time)
osd | op_process_latency | 5 | Latency of client operations (excluding queue time)
osd | op_r | 10 | Client read operations
osd | op_r_out_bytes | 10 | Client data read
osd | op_r_latency | 5 | Latency of read operation (including queue time)
osd | op_r_process_latency | 5 | Latency of read operation (excluding queue time)
osd | op_w | 10 | Client write operations
osd | op_w_in_bytes | 10 | Client data written
osd | op_w_rlat | 5 | Client write operation readable/applied latency
osd | op_w_latency | 5 | Latency of write operation (including queue time)
osd | op_w_process_latency | 5 | Latency of write operation (excluding queue time)
osd | op_rw | 10 | Client read-modify-write operations
osd | op_rw_in_bytes | 10 | Client read-modify-write operations write in
osd | op_rw_out_bytes | 10 | Client read-modify-write operations read out
osd | op_rw_rlat | 5 | Client read-modify-write operation readable/applied latency
osd | op_rw_latency | 5 | Latency of read-modify-write operation (including queue time)
osd | op_rw_process_latency | 5 | Latency of read-modify-write operation (excluding queue time)
osd | subop | 10 | Suboperations
osd | subop_in_bytes | 10 | Suboperations total size
osd | subop_latency | 5 | Suboperations latency
osd | subop_w | 10 | Replicated writes
osd | subop_w_in_bytes | 10 | Replicated written data size
osd | subop_w_latency | 5 | Replicated writes latency
osd | subop_pull | 10 | Suboperations pull requests
osd | subop_pull_latency | 5 | Suboperations pull latency
osd | subop_push | 10 | Suboperations push messages
osd | subop_push_in_bytes | 10 | Suboperations pushed size
osd | subop_push_latency | 5 | Suboperations push latency
osd | pull | 10 | Pull requests sent
osd | push | 10 | Push messages sent
osd | push_out_bytes | 10 | Pushed size
osd | push_in | 10 | Inbound push messages
osd | push_in_bytes | 10 | Inbound pushed size
osd | recovery_ops | 10 | Started recovery operations
osd | loadavg | 2 | CPU load
osd | buffer_bytes | 2 | Total allocated buffer size
osd | numpg | 2 | Placement groups
osd | numpg_primary | 2 | Placement groups for which this osd is primary
osd | numpg_replica | 2 | Placement groups for which this osd is replica
osd | numpg_stray | 2 | Placement groups ready to be deleted from this osd
osd | heartbeat_to_peers | 2 | Heartbeat (ping) peers we send to
osd | heartbeat_from_peers | 2 | Heartbeat (ping) peers we recv from
osd | map_messages | 10 | OSD map messages
osd | map_message_epochs | 10 | OSD map epochs
osd | map_message_epoch_dups | 10 | OSD map duplicates
osd | stat_bytes | 2 | OSD size
osd | stat_bytes_used | 2 | Used space
osd | stat_bytes_avail | 2 | Available space
osd | copyfrom | 10 | Rados 'copy-from' operations
osd | tier_promote | 10 | Tier promotions
osd | tier_flush | 10 | Tier flushes
osd | tier_flush_fail | 10 | Failed tier flushes
osd | tier_try_flush | 10 | Tier flush attempts
osd | tier_try_flush_fail | 10 | Failed tier flush attempts
osd | tier_evict | 10 | Tier evictions
osd | tier_whiteout | 10 | Tier whiteouts
osd | tier_dirty | 10 | Dirty tier flag set
osd | tier_clean | 10 | Dirty tier flag cleaned
osd | tier_delay | 10 | Tier delays (agent waiting)
osd | tier_proxy_read | 10 | Tier proxy reads
osd | agent_wake | 10 | Tiering agent wake up
osd | agent_skip | 10 | Objects skipped by agent
osd | agent_flush | 10 | Tiering agent flushes
osd | agent_evict | 10 | Tiering agent evictions
osd | object_ctx_cache_hit | 10 | Object context cache hits
osd | object_ctx_cache_total | 10 | Object context cache lookups
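
Metrics with a bit field value of 10 are integer counters, so a rate can be estimated from two reads taken some time apart. The following Python sketch shows the idea. The function name counter_rate and the sample numbers are hypothetical, and the sketch assumes that a counter starts again from zero when the daemon restarts.

```python
def counter_rate(previous, current, seconds):
    """Estimate the per-second rate of a counter metric between two reads."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    delta = current - previous
    if delta < 0:
        # Assumption: a smaller value means the daemon restarted and the
        # counter started again from zero.
        delta = current
    return delta / seconds


# Hypothetical op_r values read 60 seconds apart.
print(f"{counter_rate(1000, 1600, 60):.1f} client reads per second")
```

The same calculation applies to byte counters such as op_in_bytes, where the result is a throughput in bytes per second.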

Table 8.11. Recovery State Metrics Table

Collection Name | Metric Name | Bit Field Value | Short Description
recoverystate_perf | initial_latency | 5 | Initial recovery state latency
recoverystate_perf | started_latency | 5 | Started recovery state latency
recoverystate_perf | reset_latency | 5 | Reset recovery state latency
recoverystate_perf | start_latency | 5 | Start recovery state latency
recoverystate_perf | primary_latency | 5 | Primary recovery state latency
recoverystate_perf | peering_latency | 5 | Peering recovery state latency
recoverystate_perf | backfilling_latency | 5 | Backfilling recovery state latency
recoverystate_perf | waitremotebackfillreserved_latency | 5 | Wait remote backfill reserved recovery state latency
recoverystate_perf | waitlocalbackfillreserved_latency | 5 | Wait local backfill reserved recovery state latency
recoverystate_perf | notbackfilling_latency | 5 | Notbackfilling recovery state latency
recoverystate_perf | repnotrecovering_latency | 5 | Repnotrecovering recovery state latency
recoverystate_perf | repwaitrecoveryreserved_latency | 5 | Rep wait recovery reserved recovery state latency
recoverystate_perf | repwaitbackfillreserved_latency | 5 | Rep wait backfill reserved recovery state latency
recoverystate_perf | RepRecovering_latency | 5 | RepRecovering recovery state latency
recoverystate_perf | activating_latency | 5 | Activating recovery state latency
recoverystate_perf | waitlocalrecoveryreserved_latency | 5 | Wait local recovery reserved recovery state latency
recoverystate_perf | waitremoterecoveryreserved_latency | 5 | Wait remote recovery reserved recovery state latency
recoverystate_perf | recovering_latency | 5 | Recovering recovery state latency
recoverystate_perf | recovered_latency | 5 | Recovered recovery state latency
recoverystate_perf | clean_latency | 5 | Clean recovery state latency
recoverystate_perf | active_latency | 5 | Active recovery state latency
recoverystate_perf | replicaactive_latency | 5 | Replicaactive recovery state latency
recoverystate_perf | stray_latency | 5 | Stray recovery state latency
recoverystate_perf | getinfo_latency | 5 | Getinfo recovery state latency
recoverystate_perf | getlog_latency | 5 | Getlog recovery state latency
recoverystate_perf | waitactingchange_latency | 5 | Waitactingchange recovery state latency
recoverystate_perf | incomplete_latency | 5 | Incomplete recovery state latency
recoverystate_perf | getmissing_latency | 5 | Getmissing recovery state latency
recoverystate_perf | waitupthru_latency | 5 | Waitupthru recovery state latency

Table 8.12. OSD Throttle Metrics Table

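Collection Name | Metric Name | Bit Field Value | Short Description
throttle-* | val | 10 | Currently available throttle
throttle-* | max | 10 | Max value for throttle
throttle-* | get | 10 | Gets
throttle-* | get_sum | 10 | Got data
throttle-* | get_or_fail_fail | 10 | Get blocked during get_or_fail
throttle-* | get_or_fail_success | 10 | Successful get during get_or_fail
throttle-* | take | 10 | Takes
throttle-* | take_sum | 10 | Taken data
throttle-* | put | 10 | Puts
throttle-* | put_sum | 10 | Put data
throttle-* | wait | 5 | Waiting latency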

8.8. Ceph Object Gateway metrics

Table 8.13. RADOS Client Metrics Table

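Collection Name | Metric Name | Bit Field Value | Short Description
client.rgw.<rgw_node_name> | req | 10 | Requests
client.rgw.<rgw_node_name> | failed_req | 10 | Aborted requests
client.rgw.<rgw_node_name> | get | 10 | Gets
client.rgw.<rgw_node_name> | get_b | 10 | Size of gets
client.rgw.<rgw_node_name> | get_initial_lat | 5 | Get latency
client.rgw.<rgw_node_name> | put | 10 | Puts
client.rgw.<rgw_node_name> | put_b | 10 | Size of puts
client.rgw.<rgw_node_name> | put_initial_lat | 5 | Put latency
client.rgw.<rgw_node_name> | qlen | 2 | Queue length
client.rgw.<rgw_node_name> | qactive | 2 | Active requests queue
client.rgw.<rgw_node_name> | cache_hit | 10 | Cache hits
client.rgw.<rgw_node_name> | cache_miss | 10 | Cache miss
client.rgw.<rgw_node_name> | keystone_token_cache_hit | 10 | Keystone token cache hits
client.rgw.<rgw_node_name> | keystone_token_cache_miss | 10 | Keystone token cache miss
client.rgw.<rgw_node_name> | gc_retire_object | 10 | Count of objects retired since the last restart of the Ceph Object Gateway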

Table 8.14. Objecter Metrics Table

Collection Name | Metric Name | Bit Field Value | Short Description
objecter | op_active | 2 | Active operations
objecter | op_laggy | 2 | Laggy operations
objecter | op_send | 10 | Sent operations
objecter | op_send_bytes | 10 | Sent data
objecter | op_resend | 10 | Resent operations
objecter | op_ack | 10 | Commit callbacks
objecter | op_commit | 10 | Operation commits
objecter | op | 10 | Operation
objecter | op_r | 10 | Read operations
objecter | op_w | 10 | Write operations
objecter | op_rmw | 10 | Read-modify-write operations
objecter | op_pg | 10 | PG operation
objecter | osdop_stat | 10 | Stat operations
objecter | osdop_create | 10 | Create object operations
objecter | osdop_read | 10 | Read operations
objecter | osdop_write | 10 | Write operations
objecter | osdop_writefull | 10 | Write full object operations
objecter | osdop_append | 10 | Append operation
objecter | osdop_zero | 10 | Set object to zero operations
objecter | osdop_truncate | 10 | Truncate object operations
objecter | osdop_delete | 10 | Delete object operations
objecter | osdop_mapext | 10 | Map extent operations
objecter | osdop_sparse_read | 10 | Sparse read operations
objecter | osdop_clonerange | 10 | Clone range operations
objecter | osdop_getxattr | 10 | Get xattr operations
objecter | osdop_setxattr | 10 | Set xattr operations
objecter | osdop_cmpxattr | 10 | Xattr comparison operations
objecter | osdop_rmxattr | 10 | Remove xattr operations
objecter | osdop_resetxattrs | 10 | Reset xattr operations
objecter | osdop_tmap_up | 10 | TMAP update operations
objecter | osdop_tmap_put | 10 | TMAP put operations
objecter | osdop_tmap_get | 10 | TMAP get operations
objecter | osdop_call | 10 | Call (execute) operations
objecter | osdop_watch | 10 | Watch by object operations
objecter | osdop_notify | 10 | Notify about object operations
objecter | osdop_src_cmpxattr | 10 | Extended attribute comparison in multi operations
objecter | osdop_other | 10 | Other operations
objecter | linger_active | 2 | Active lingering operations
objecter | linger_send | 10 | Sent lingering operations
objecter | linger_resend | 10 | Resent lingering operations
objecter | linger_ping | 10 | Sent pings to lingering operations
objecter | poolop_active | 2 | Active pool operations
objecter | poolop_send | 10 | Sent pool operations
objecter | poolop_resend | 10 | Resent pool operations
objecter | poolstat_active | 2 | Active get pool stat operations
objecter | poolstat_send | 10 | Pool stat operations sent
objecter | poolstat_resend | 10 | Resent pool stats
objecter | statfs_active | 2 | Statfs operations
objecter | statfs_send | 10 | Sent FS stats
objecter | statfs_resend | 10 | Resent FS stats
objecter | command_active | 2 | Active commands
objecter | command_send | 10 | Sent commands
objecter | command_resend | 10 | Resent commands
objecter | map_epoch | 2 | OSD map epoch
objecter | map_full | 10 | Full OSD maps received
objecter | map_inc | 10 | Incremental OSD maps received
objecter | osd_sessions | 2 | Open sessions
objecter | osd_session_open | 10 | Sessions opened
objecter | osd_session_close | 10 | Sessions closed
objecter | osd_laggy | 2 | Laggy OSD sessions

Table 8.15. RADOS Gateway Throttle Metrics Table

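Collection Name | Metric Name | Bit Field Value | Short Description
throttle-* | val | 10 | Currently available throttle
throttle-* | max | 10 | Max value for throttle
throttle-* | get | 10 | Gets
throttle-* | get_sum | 10 | Got data
throttle-* | get_or_fail_fail | 10 | Get blocked during get_or_fail
throttle-* | get_or_fail_success | 10 | Successful get during get_or_fail
throttle-* | take | 10 | Takes
throttle-* | take_sum | 10 | Taken data
throttle-* | put | 10 | Puts
throttle-* | put_sum | 10 | Put data
throttle-* | wait | 5 | Waiting latency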