After live migration of a Windows instance, two qemu processes writing to the same volume caused corruption

Solution Verified - Updated 2024-06-14T19:26:43+00:00

Issue

During the upgrade procedure, live migration was performed to free compute nodes for reboot.

As a result, a Windows instance ended up running on two compute nodes at once, both accessing the same backing storage on the Ceph backend:

[root@compute-1 ~]# virsh list --all | grep instance-000054f5
 77    instance-000054f5              running

[root@compute-2 ~]# virsh list --all | grep instance-000054f5
 12    instance-000054f5              running
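
The duplicate shown above can be detected automatically by comparing the `virsh list --all` output collected from each compute node. The following is a minimal sketch, assuming the output has already been gathered per host (for example over SSH); the function names `parse_virsh_list` and `running_on_multiple_hosts` are illustrative and not part of any Red Hat tooling.

```python
import re

# Matches rows of `virsh list --all`, e.g. " 77    instance-000054f5    running".
# The Id column is "-" for domains that are defined but not running.
ROW = re.compile(r"^\s*(\d+|-)\s+(\S+)\s+(.+?)\s*$")


def parse_virsh_list(output):
    """Return {domain_name: state} parsed from `virsh list --all` text."""
    domains = {}
    for line in output.splitlines():
        match = ROW.match(line)
        if match:
            domains[match.group(2)] = match.group(3)
    return domains


def running_on_multiple_hosts(outputs_by_host):
    """Return {domain_name: [hosts]} for domains running on more than one host."""
    hosts_by_domain = {}
    for host, output in outputs_by_host.items():
        for name, state in parse_virsh_list(output).items():
            if state == "running":
                hosts_by_domain.setdefault(name, []).append(host)
    return {
        name: sorted(hosts)
        for name, hosts in hosts_by_domain.items()
        if len(hosts) > 1
    }
```

Passing the two outputs above, keyed by `compute-1` and `compute-2`, reports `instance-000054f5` as running on both hosts. Any domain reported this way should be treated as at risk of concurrent writes to its volume.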

The initial live migration failed, and the libvirt log shows the following errors:

2016-11-22T17:07:22.131885Z qemu-kvm: VQ 2 size 0x80 < last_avail_idx 0x1 - used_idx 0x2
2016-11-22T17:07:22.132163Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:05.0/virtio-balloon'
2016-11-22T17:07:22.133659Z qemu-kvm: load of migration failed: Operation not permitted
2016-11-22 17:07:22.137+0000: shutting down
2016-11-22 18:48:55.934+0000: starting up libvirt version: 2.0.0, package: 10.el7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2016-09-21-10:15:26, x86-038.build.eng.bos.redhat.com), qemu version: 2.6.0 (qemu-kvm-rhev-2.6.0-27.el7),
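
A failed incoming migration such as the one logged above can be found by scanning the instance log for the `load of migration failed` message. The following is a rough sketch, assuming the log text has already been read into a string; the function name `failed_incoming_migrations` is hypothetical, and the pattern is based only on the line format visible in this excerpt.

```python
import re

# qemu-kvm lines in the excerpt start with an ISO-8601 timestamp ending in "Z",
# followed by "qemu-kvm: " and the message.
QEMU_LINE = re.compile(r"^(\S+Z) qemu-kvm: (.*)$")


def failed_incoming_migrations(log_text):
    """Return (timestamp, message) pairs for 'load of migration failed' lines."""
    failures = []
    for line in log_text.splitlines():
        match = QEMU_LINE.match(line.strip())
        if match and "load of migration failed" in match.group(2):
            failures.append((match.group(1), match.group(2)))
    return failures
```

A non-empty result for an instance is a signal to confirm where the instance is actually running before starting it on any node.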

This happened to a Windows 2012 instance. The incident corrupted the Ceph volume because two qemu processes wrote to the same volume.

Timeline of the events that led to the issue:

The instance is live migrated.
The migration finishes with errors: the instance keeps running on the old compute node (compute-1) and is shut off on the destination compute node (compute-2).
The user notices that the instance appears shut down and cannot be reached, and starts it. The instance is now running on both compute nodes.

The instance had to be restored from backup because of the data corruption caused by both active instances writing to the same backend.

Environment

Red Hat OpenStack Platform 8.0
Non-updated compute node running qemu-kvm-rhev-2.3.0-31.el7_2.24.x86_64
Updated compute node running qemu-kvm-rhev-2.6.0-27.el7.x86_64
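
During a rolling upgrade it can help to know which compute nodes run different qemu-kvm-rhev versions before planning live migrations between them. The following is a minimal sketch that compares package strings like the two listed above. It assumes the `<name>-<version>-<release>.<arch>` layout seen in those strings and a purely numeric version field, the helper names are hypothetical, and a reported mismatch does not by itself mean that the mismatch caused the failure described in this article.

```python
import re

# Assumed layout: <name>-<version>-<release>.<arch>,
# e.g. qemu-kvm-rhev-2.6.0-27.el7.x86_64
NVR = re.compile(
    r"^(?P<name>.+)-(?P<version>[^-]+)-(?P<release>[^-]+)\.(?P<arch>[^.]+)$"
)


def package_version(nvr):
    """Return the version field as a tuple of ints, e.g. (2, 6, 0)."""
    match = NVR.match(nvr)
    if match is None:
        raise ValueError("unrecognised package string: %r" % nvr)
    return tuple(int(part) for part in match.group("version").split("."))


def mixed_versions(packages_by_host):
    """Return True if the hosts do not all report the same package version."""
    versions = {package_version(nvr) for nvr in packages_by_host.values()}
    return len(versions) > 1
```

The package strings themselves could be collected per host with `rpm -q qemu-kvm-rhev`.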
