Chapter 5. Ceph metrics for Datadog

The Datadog agent collects the following metrics from Ceph. These metrics may be included in custom dashboards and in alerts.

Metric Name

Description

ceph.commit_latency_ms

The time taken to commit an operation to the journal.

ceph.apply_latency_ms

The time taken to flush an update to disk.

ceph.op_per_sec

The number of I/O operations per second for a given pool.

ceph.read_bytes_sec

The bytes per second being read.

ceph.write_bytes_sec

The bytes per second being written.

ceph.num_osds

The number of known storage daemons.

ceph.num_in_osds

The number of participating storage daemons.

ceph.num_up_osds

The number of online storage daemons.

ceph.num_pgs

The number of placement groups available.

ceph.num_mons

The number of monitor daemons.

ceph.aggregate_pct_used

The percentage of overall cluster capacity in use.

ceph.total_objects

The object count from the underlying object store.

ceph.num_objects

The object count for a given pool.

ceph.read_bytes

The per-pool read bytes.

ceph.write_bytes

The per-pool write bytes.

ceph.num_pools

The number of pools.

ceph.pgstate.active_clean

The number of active+clean placement groups.

ceph.read_op_per_sec

The per-pool read operations per second.

ceph.write_op_per_sec

The per-pool write operations per second.

ceph.num_near_full_osds

The number of nearly full OSDs.

ceph.num_full_osds

The number of full OSDs.

ceph.osd.pct_used

The percentage of capacity used on full or near-full OSDs.
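
Several of these metrics are most useful in combination, for example comparing ceph.num_up_osds with ceph.num_osds, or ceph.pgstate.active_clean with ceph.num_pgs. The following Python sketch illustrates one way such comparisons might be expressed. It is not part of the Datadog agent or of Ceph: the function name summarize_ceph_health, the pct_used_limit parameter, and its default of 80.0 are hypothetical, and the sketch assumes the metric values have already been fetched into a dictionary and that ceph.aggregate_pct_used is reported on a 0 to 100 scale.

```python
from typing import Dict, List


def summarize_ceph_health(
    metrics: Dict[str, float],
    pct_used_limit: float = 80.0,  # hypothetical threshold, not a Ceph or Datadog default
) -> List[str]:
    """Derive simple warnings from a snapshot of Ceph metric values.

    `metrics` maps metric names from the table above to their latest values.
    Missing metrics are treated as healthy so that a partial snapshot does
    not raise false warnings.
    """
    warnings: List[str] = []

    # OSDs that are known to the cluster but not online or not participating.
    total_osds = metrics.get("ceph.num_osds", 0)
    down_osds = total_osds - metrics.get("ceph.num_up_osds", total_osds)
    out_osds = total_osds - metrics.get("ceph.num_in_osds", total_osds)
    if down_osds > 0:
        warnings.append(f"{down_osds:g} OSD(s) not up")
    if out_osds > 0:
        warnings.append(f"{out_osds:g} OSD(s) not in")

    # Placement groups that are not in the active+clean state.
    total_pgs = metrics.get("ceph.num_pgs", 0)
    unclean_pgs = total_pgs - metrics.get("ceph.pgstate.active_clean", total_pgs)
    if unclean_pgs > 0:
        warnings.append(f"{unclean_pgs:g} placement group(s) not active+clean")

    # Capacity warnings reported per OSD and for the cluster as a whole.
    full_osds = metrics.get("ceph.num_full_osds", 0)
    near_full_osds = metrics.get("ceph.num_near_full_osds", 0)
    if full_osds > 0:
        warnings.append(f"{full_osds:g} OSD(s) full")
    if near_full_osds > 0:
        warnings.append(f"{near_full_osds:g} OSD(s) nearly full")
    if metrics.get("ceph.aggregate_pct_used", 0.0) >= pct_used_limit:
        warnings.append("cluster capacity usage at or above limit")

    return warnings


# Example snapshot with made-up values, for illustration only.
sample = {
    "ceph.num_osds": 12,
    "ceph.num_up_osds": 11,
    "ceph.num_in_osds": 12,
    "ceph.num_pgs": 256,
    "ceph.pgstate.active_clean": 250,
    "ceph.num_near_full_osds": 1,
    "ceph.num_full_osds": 0,
    "ceph.aggregate_pct_used": 62.5,
}
for line in summarize_ceph_health(sample):
    print(line)
```

With the sample values above, the sketch reports one OSD not up, six placement groups not active+clean, and one nearly full OSD. In practice, the same comparisons would more likely be configured directly as alert conditions in Datadog rather than computed in a separate script.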