Ceph - VM hangs when transferring large amounts of data to RBD disk
Issue
- The virtual machine boots without issue, a storage disk from the Ceph cluster (RBD) can be attached to the VM, and a filesystem can be created on it. Small files (< 1 GB) transfer without problems, but when moderately sized files (> 1 GB) are transferred, the VM slows to a halt and eventually locks up.
- If Ceph client-side logging is enabled in the [client] section of ceph.conf (a sample configuration is shown after this list), the client-side log contains errors such as:
7f0b769e7700 -1 -- 192.168.128.30:0/2021513 >> 192.168.128.35:6800/24374 pipe(0x7f0bcabc0000 sd=-1 :0 s=1 pgs=0 cs=0 l=1 c=0x7f0bc55e1ce0).connect couldn't created socket (24) Too many open files
- The following kernel messages are generated:
Sep 1 16:04:15 xxxx-rds kernel: INFO: task jbd2/vdf-8:2362 blocked for more than 120 seconds.
Sep 1 16:04:15 xxxx-rds kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 1 16:04:15 xxxx-rds kernel: jbd2/vdf-8 D ffff88023fd13680 0 2362 2 0x00000080
Sep 1 16:04:15 xxxx-rds kernel: ffff8800b40c7bb8 0000000000000046 ffff8800bab116c0 ffff8800b40c7fd8
Sep 1 16:04:15 xxxx-rds kernel: ffff8800b40c7fd8 ffff8800b40c7fd8 ffff8800bab116c0 ffff88023fd13f48
Sep 1 16:04:15 xxxx-rds kernel: ffff88023ff5e958 0000000000000002 ffffffff811f8310 ffff8800b40c7c30
Sep 1 16:04:15 xxxx-rds kernel: Call Trace:
Sep 1 16:04:15 xxxx-rds kernel: [<ffffffff811f8310>] ? generic_block_bmap+0x70/0x70
Sep 1 16:04:15 xxxx-rds kernel: [<ffffffff8160955d>] io_schedule+0x9d/0x130
Sep 1 16:04:15 xxxx-rds kernel: [<ffffffff811f831e>] sleep_on_buffer+0xe/0x20
Sep 1 16:04:15 xxxx-rds kernel: [<ffffffff81607330>] __wait_on_bit+0x60/0x90
Sep 1 16:04:15 xxxx-rds kernel: [<ffffffff811f8310>] ? generic_block_bmap+0x70/0x70
Sep 1 16:04:15 xxxx-rds kernel: [<ffffffff816073e7>] out_of_line_wait_on_bit+0x87/0xb0
Sep 1 16:04:15 xxxx-rds kernel: [<ffffffff81098270>] ? autoremove_wake_function+0x40/0x40
Sep 1 16:04:15 xxxx-rds kernel: [<ffffffff811f98da>] __wait_on_buffer+0x2a/0x30
Sep 1 16:04:15 xxxx-rds kernel: [<ffffffffa01b776b>] jbd2_journal_commit_transaction+0x175b/0x19a0 [jbd2]
Sep 1 16:04:15 xxxx-rds kernel: [<ffffffff810125c6>] ? __switch_to+0x136/0x4a0
Sep 1 16:04:15 xxxx-rds kernel: [<ffffffffa01bbda9>] kjournald2+0xc9/0x260 [jbd2]
Sep 1 16:04:15 xxxx-rds kernel: [<ffffffff81098230>] ? wake_up_bit+0x30/0x30
Sep 1 16:04:15 xxxx-rds kernel: [<ffffffffa01bbce0>] ? commit_timeout+0x10/0x10 [jbd2]
Sep 1 16:04:15 xxxx-rds kernel: [<ffffffff8109726f>] kthread+0xcf/0xe0
Sep 1 16:04:15 xxxx-rds kernel: [<ffffffff810971a0>] ? kthread_create_on_node+0x140/0x140
Sep 1 16:04:15 xxxx-rds kernel: [<ffffffff81613cfc>] ret_from_fork+0x7c/0xb0
Sep 1 16:04:15 xxxx-rds kernel: [<ffffffff810971a0>] ? kthread_create_on_node+0x140/0x140
- This issue has also been seen when formatting a large number of RBD devices directly attached to instances with mkfs.xfs. mkfs.xfs blocks and the following hung_task_timeout message is printed to /var/log/messages:
Jun 07 11:25:02 volformattest-rhel7.localdomain kernel: INFO: task mkfs.xfs:2393 blocked for more than 120 seconds.
Jun 07 11:25:02 volformattest-rhel7.localdomain kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 07 11:25:02 volformattest-rhel7.localdomain kernel: mkfs.xfs D 0000000000000000 0 2393 2387 0x00000083
Jun 07 11:25:02 volformattest-rhel7.localdomain kernel: ffff880fe416f930 0000000000000082 ffff880fe4c7a280 ffff880fe416ffd8
Jun 07 11:25:02 volformattest-rhel7.localdomain kernel: ffff880fe416ffd8 ffff880fe416ffd8 ffff880fe4c7a280 ffff88103fc54780
Jun 07 11:25:02 volformattest-rhel7.localdomain kernel: 0000000000000000 7fffffffffffffff 0000000000000000 0000000000000000
Jun 07 11:25:02 volformattest-rhel7.localdomain kernel: Call Trace:
Jun 07 11:25:02 volformattest-rhel7.localdomain kernel: [<ffffffff8163a909>] schedule+0x29/0x70
Jun 07 11:25:02 volformattest-rhel7.localdomain kernel: [<ffffffff816385f9>] schedule_timeout+0x209/0x2d0
Jun 07 11:25:02 volformattest-rhel7.localdomain kernel: [<ffffffff81058aaf>] ? kvm_clock_get_cycles+0x1f/0x30
Jun 07 11:25:02 volformattest-rhel7.localdomain kernel: [<ffffffff810d814c>] ? ktime_get_ts64+0x4c/0xf0
Jun 07 11:25:02 volformattest-rhel7.localdomain kernel: [<ffffffff81639f3e>] io_schedule_timeout+0xae/0x130
Jun 07 11:25:02 volformattest-rhel7.localdomain kernel: [<ffffffff81639fd8>] io_schedule+0x18/0x20
Jun 07 11:25:02 volformattest-rhel7.localdomain kernel: [<ffffffff8121cf93>] do_blockdev_direct_IO+0xc03/0x2620
Jun 07 11:25:02 volformattest-rhel7.localdomain kernel: [<ffffffff81218bd0>] ? I_BDEV+0x10/0x10
Jun 07 11:25:02 volformattest-rhel7.localdomain kernel: [<ffffffff8121ea05>] __blockdev_direct_IO+0x55/0x60
Jun 07 11:25:02 volformattest-rhel7.localdomain kernel: [<ffffffff81218bd0>] ? I_BDEV+0x10/0x10
Jun 07 11:25:02 volformattest-rhel7.localdomain kernel: [<ffffffff81219227>] blkdev_direct_IO+0x57/0x60
Jun 07 11:25:02 volformattest-rhel7.localdomain kernel: [<ffffffff81218bd0>] ? I_BDEV+0x10/0x10
Jun 07 11:25:02 volformattest-rhel7.localdomain kernel: [<ffffffff8116affd>] generic_file_direct_write+0xcd/0x190
Jun 07 11:25:02 volformattest-rhel7.localdomain kernel: [<ffffffff8116b3dc>] __generic_file_aio_write+0x31c/0x3e0
Jun 07 11:25:02 volformattest-rhel7.localdomain kernel: [<ffffffff81219b2a>] blkdev_aio_write+0x5a/0xd0
Jun 07 11:25:02 volformattest-rhel7.localdomain kernel: [<ffffffff811dddad>] do_sync_write+0x8d/0xd0
Jun 07 11:25:02 volformattest-rhel7.localdomain kernel: [<ffffffff811de5cd>] vfs_write+0xbd/0x1e0
Jun 07 11:25:02 volformattest-rhel7.localdomain kernel: [<ffffffff8110b554>] ? __audit_syscall_entry+0xb4/0x110
Jun 07 11:25:02 volformattest-rhel7.localdomain kernel: [<ffffffff811df06f>] SyS_write+0x7f/0xe0
Jun 07 11:25:02 volformattest-rhel7.localdomain kernel: [<ffffffff81645b12>] tracesys+0xdd/0xe2
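To capture the client-side errors shown above, enable Ceph client logging in the [client] section of ceph.conf on the hypervisor. The log path and debug levels below are illustrative values rather than required settings, and the log directory must be writable by the QEMU process:
[client]
    # Illustrative settings; adjust the path and debug levels as needed
    log file = /var/log/ceph/qemu-client.$pid.log
    debug ms = 1
    debug rbd = 20
librbd reads ceph.conf when the RBD image is opened, so restart the guest (or detach and re-attach the RBD disk) for the new settings to take effect.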
Environment
- Red Hat Ceph Storage 1.3
- Red Hat Ceph Storage 1.2.3
- Red Hat Enterprise Linux 7
- Red Hat Enterprise Linux 6
- Virtual Machine using Qemu-KVM
- Libvirt
- Ceph Block Device (RBD)
- Large Number of OSDs
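An RBD client maintains connections to the OSDs it communicates with, so on clusters with a large number of OSDs the "(24) Too many open files" error above typically means the QEMU process has reached its open file descriptor limit. As a quick check (assuming a single guest whose process is named qemu-kvm; substitute the PID of the affected guest otherwise), compare the limit with the number of descriptors in use:
# Per-process open file limit of the running guest
grep "open files" /proc/$(pidof qemu-kvm)/limits
# File descriptors currently in use by that process
ls /proc/$(pidof qemu-kvm)/fd | wc -l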