[Ceph/ODF]: An SSD OSD has "slow ops" and high client IO latency when deleting RBD snapshots in batches
Issue
An SSD OSD has "slow ops" and high client IO latency when deleting RBD snapshots in batches.
Example: In the output below, nearly all of the active snaptrim tasks (19 of the 21 PGs listed) involve osd.20, and that same OSD is reporting slow ops.
$ ceph status
  cluster:
    id:     Redacted
    health: HEALTH_WARN
            1 MDSs report slow metadata IOs
            1 MDSs report slow requests
            8 slow ops, oldest one blocked for 755 sec, osd.20 has slow ops

  services:
    mon: 3 daemons, quorum be,bf,bg (age 24h)
    mgr: b(active, since 24h), standbys: a
    mds: 1/1 daemons up, 1 hot standby
    osd: 16 osds: 16 up (since 24h), 16 in (since 3d)

  data:
    volumes: 1/1 healthy
    pools:   4 pools, 657 pgs
    objects: 4.40M objects, 8.0 TiB
    usage:   24 TiB used, 69 TiB / 93 TiB avail
    pgs:     371 active+clean
             260 active+clean+snaptrim_wait
             25  active+clean+snaptrim
             1   active+clean+scrubbing+deep

  io:
    client:   104 KiB/s rd, 830 KiB/s wr, 4 op/s rd, 72 op/s wr
$ ceph pg dump | grep "snaptrim " | cut -c1-250 | sed -e 's/ *$//'
2.fb 4453 0 0 0 0 16860394496 182 5 2494 3000 2494 active+clean+snaptrim 2025-05-30T07:33:58.801208+0000 184878'209456269 184878:240898235 [6,13,20]
2.d9 4126 0 0 0 0 15874811486 165 5 2310 3000 2310 active+clean+snaptrim 2025-05-30T07:22:05.863316+0000 184878'186284725 184878:202419495 [9,12,20]
2.d1 4204 0 0 0 0 16331157154 413 16 2361 3000 2361 active+clean+snaptrim 2025-05-30T07:34:38.117882+0000 184878'200127472 184878:228118610 [7,3,20]
2.c9 4627 0 0 0 0 17536827922 0 0 2334 3000 2334 active+clean+snaptrim 2025-05-30T07:22:05.862362+0000 184878'183062380 184878:210773020 [20,4,9]
2.b1 4336 0 0 0 0 16377914530 639 40 2328 3000 2328 active+clean+snaptrim 2025-05-30T07:33:41.896198+0000 184878'224808727 184878:290845308 [20,10,0]
2.84 4047 0 0 0 0 15485811386 168 5 2552 3000 2552 active+clean+snaptrim 2025-05-30T07:32:31.309154+0000 184878'222252133 184878:236729486 [12,20,13]
2.83 4292 0 0 0 0 16547249350 363 15 2486 3000 2486 active+clean+snaptrim 2025-05-30T07:33:27.684351+0000 184878'268035684 184878:296568854 [6,9,20]
2.29 4232 0 0 0 0 16335212544 707 25 2512 3000 2512 active+clean+snaptrim 2025-05-30T07:35:34.898580+0000 184878'293565182 184878:322682305 [11,20,6]
2.15 4399 0 0 0 0 16766092072 239 8 2718 3000 2718 active+clean+snaptrim 2025-05-30T07:34:32.404531+0000 184878'243693131 184878:270964014 [1,0,20]
2.42 4218 0 0 0 0 16185806884 0 0 2379 3000 2379 active+clean+snaptrim 2025-05-30T07:35:57.717792+0000 184878'254681113 184878:300503812 [3,7,20]
2.6b 4087 0 0 0 0 15647022503 413 16 2549 3000 2549 active+clean+snaptrim 2025-05-30T07:30:24.466126+0000 184878'190120868 184878:225720216 [10,20,7]
2.70 4100 0 0 0 0 15665591362 435 16 2639 3000 2639 active+clean+snaptrim 2025-05-30T07:35:48.609720+0000 184878'221198954 184878:249457354 [15,4,20]
2.74 4218 0 0 0 0 16273674406 76 8 2444 3000 2444 active+clean+snaptrim 2025-05-30T07:32:53.794684+0000 184878'189143746 184878:217428080 [0,20,1]
2.103 4042 0 0 0 0 15554657280 718 30 2419 3000 2419 active+clean+snaptrim 2025-05-30T07:31:00.115382+0000 184878'266151592 184878:294483261 [0,13,20]
2.109 3890 0 0 0 0 14803629346 10172 49 2587 3000 2587 active+clean+snaptrim 2025-05-30T07:35:29.707839+0000 184878'264042549 184878:283312923 [13,0,20]
2.10a 4427 0 0 0 0 16706313216 478 19 2381 3000 2381 active+clean+snaptrim 2025-05-30T07:35:24.598778+0000 184878'202604431 184878:222881307 [9,1,20]
2.11a 4228 0 0 0 0 15971760470 3163 48 2619 3000 2619 active+clean+snaptrim 2025-05-30T07:35:57.717739+0000 184878'217259743 184878:268351123 [15,13,20]
2.185 4452 0 0 0 0 16917721142 0 0 2547 3000 2547 active+clean+snaptrim 2025-05-30T07:35:25.115743+0000 184878'264031313 184878:300341474 [4,20,12]
2.1b2 4206 0 0 0 0 16425524386 380 15 2516 3000 2516 active+clean+snaptrim 2025-05-30T07:33:42.669719+0000 184878'206472954 184878:235053712 [4,19,6]
2.1c1 4091 0 0 0 0 15438998726 575 21 2355 3000 2355 active+clean+snaptrim 2025-05-30T07:36:11.617496+0000 184878'194786402 184878:218405612 [7,11,0]
2.1fc 4044 0 0 0 0 15450713038 294 9 2425 3000 2425 active+clean+snaptrim 2025-05-30T07:32:53.794799+0000 184878'194622529 184878:220838653 [3,10,20]
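Picking out the common OSD by eye from a listing like the one above is error-prone. The following is a minimal sketch, not part of any Ceph tooling, that tallies how often each OSD ID appears in the first bracketed OSD set of every PG line in the snaptrim state. The function name count_snaptrim_osds is hypothetical, and the sketch assumes the line layout shown above, where the state column is followed by a space and the OSD set is printed as [a,b,c].

```shell
# Hypothetical helper: reads `ceph pg dump` style lines on stdin and prints
# "<count> osd.<id>" for each OSD found in PGs that are actively trimming.
count_snaptrim_osds() {
    awk '
        # "snaptrim " (with trailing space) skips PGs in snaptrim_wait.
        /snaptrim / && match($0, /\[[0-9,]+\]/) {
            # Take only the first bracketed set on the line and split it on commas.
            n = split(substr($0, RSTART + 1, RLENGTH - 2), osds, ",")
            for (i = 1; i <= n; i++) count[osds[i]]++
        }
        END { for (id in count) printf "%d osd.%s\n", count[id], id }
    ' | sort -k1,1nr -k2,2   # busiest OSD first
}
```

It could be used as: ceph pg dump 2>/dev/null | count_snaptrim_osds | head. For the listing above, osd.20 would be expected at the top of the tally, which is the same OSD that reports slow ops.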
Environment
- Red Hat OpenShift Container Platform (OCP) 4.x
- Red Hat OpenShift Data Foundation (ODF) 4.x
- Red Hat Ceph Storage 4
- Red Hat Ceph Storage 5
- Red Hat Ceph Storage 6
- Red Hat Ceph Storage 7
- Red Hat Ceph Storage 8
- Ceph (RADOS) Block Devices (RBD)