Live migration fails due to a QEMU error in an in-place upgrade environment
Issue
- When this issue occurs, the live migration fails with the following error on the destination host (overcloud-compute-1).
- QEMU log on the destination host overcloud-compute-1 (see the note after the log excerpt for how to locate this file):
2022-10-18 04:56:01.615+0000: starting up libvirt version: 7.0.0, package: 14.3.module+el8.4.0+11878+84e54169 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2021-07-20-02:46:13, ), qemu version: 5.2.0qemu-kvm-5.2.0-16.module+el8.4.0+12393+838d9165.8, kernel: 4.18.0-305.19.1.el8_4.x86_64, hostname: overcloud-compute-1.localdomain
LC_ALL=C \
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
HOME=/var/lib/libvirt/qemu/domain-1-instance-0000001d \
XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-1-instance-0000001d/.local/share \
XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-1-instance-0000001d/.cache \
XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-1-instance-0000001d/.config \
QEMU_AUDIO_DRV=none \
/usr/libexec/qemu-kvm \
-name guest=instance-0000001d,debug-threads=on \
-S \
-object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1-instance-0000001d/master-key.aes \
-machine pc-i440fx-rhel7.6.0,accel=kvm,usb=off,dump-guest-core=off,memory-backend=pc.ram \
-cpu SandyBridge,vme=on,hypervisor=on,arat=on,xsaveopt=on \
-m 10240 \
-object memory-backend-ram,id=pc.ram,size=10737418240 \
-overcommit mem-lock=off \
-smp 10,sockets=10,dies=1,cores=1,threads=1 \
-uuid f9b9ddb2-7ed3-4cf5-ba29-0da527dce8fb \
-smbios 'type=1,manufacturer=Red Hat,product=OpenStack Compute,version=20.6.2-2.20210607104828.el8ost.4,serial=f9b9ddb2-7ed3-4cf5-ba29-0da527dce8fb,uuid=f9b9ddb2-7ed3-4cf5-ba29-0da527dce8fb,family=Virtual Machine' \
-no-user-config \
-nodefaults \
-chardev socket,id=charmonitor,fd=33,server,nowait \
-mon chardev=charmonitor,id=monitor,mode=control \
-rtc base=utc,driftfix=slew \
-global kvm-pit.lost_tick_policy=delay \
-no-hpet \
-no-shutdown \
-boot strict=on \
...<snip>...
-netdev tap,fds=35:36:37:38:39:40:41:42:43:44,id=hostnet0,vhost=on,vhostfds=45:46:47:48:49:50:51:52:53:54 \
-device virtio-net-pci,mq=on,vectors=22,rx_queue_size=512,host_mtu=1500,netdev=hostnet0,id=net0,mac=AA:AA:AA:AA:AA:AA,bus=pci.0,addr=0x3 \
-add-fd set=21,fd=56 \
-chardev pty,id=charserial0,logfile=/dev/fdset/21,logappend=on \
-device isa-serial,chardev=charserial0,id=serial0 \
-device usb-tablet,id=input0,bus=usb.0,port=1 \
-vnc BB.BB.BB.BB:0 \
-k ja \
-device cirrus-vga,id=video0,bus=pci.0,addr=0x2 \
-incoming defer \
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 \
-object rng-random,id=objrng0,filename=/dev/urandom \
-device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x6 \
-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
-msg timestamp=on
char device redirected to /dev/pts/0 (label charserial0)
2022-10-18T04:56:01.902671Z qemu-kvm: -device cirrus-vga,id=video0,bus=pci.0,addr=0x2: warning: 'cirrus-vga' is deprecated, please use a different VGA card instead
2022-10-18T04:56:03.999678Z qemu-kvm: get_pci_config_device: Bad config data: i=0x9a read: 11 device: 15 cmask: ff wmask: 0 w1cmask:0
2022-10-18T04:56:03.999711Z qemu-kvm: Failed to load PCIDevice:config
2022-10-18T04:56:03.999719Z qemu-kvm: Failed to load virtio-net:virtio
2022-10-18T04:56:03.999725Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:03.0/virtio-net'
2022-10-18T04:56:04.000902Z qemu-kvm: load of migration failed: Invalid argument
2022-10-18 04:56:04.213+0000: shutting down, reason=crashed
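- For reference, the QEMU log quoted above is the per-instance log written by libvirt on the destination compute node. A minimal sketch for viewing it, assuming the containerized libvirt layout used in RHOSP 16 (the exact path can vary between deployments):
# On overcloud-compute-1, as root
tail -n 50 /var/log/containers/libvirt/qemu/instance-0000001d.log
# Or from inside the nova_libvirt container
podman exec -it nova_libvirt tail -n 50 /var/log/libvirt/qemu/instance-0000001d.log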
- Here is the history of the instance:
1. Boot the instance on RHEL 7.9 on overcloud-compute-1
2. In-place upgrade (Hybrid mode) of both overcloud-compute-0 and overcloud-compute-1
3. Major upgrade of overcloud-compute-0 (RHEL 7.9 -> RHEL 8.4)
4. Live migration from overcloud-compute-1 to overcloud-compute-0 (RHEL 7.9 -> RHEL 8.4)
5. Major upgrade of overcloud-compute-1 (RHEL 7.9 -> RHEL 8.4)
6. Live migration from overcloud-compute-0 to overcloud-compute-1 ... failed (RHEL 8.4 -> RHEL 8.4)
Environment
- Red Hat OpenStack Platform 16.2 (RHOSP)
- nova
- The instance meets all of the following conditions (a minimal sketch for checking them follows this list):
- Multi-queue is enabled
- The number of allocated vCPUs is 9 or more
- The instance has been running since OSP 13
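- The following is a minimal sketch for checking these conditions; the server UUID f9b9ddb2-7ed3-4cf5-ba29-0da527dce8fb and domain name instance-0000001d are taken from the log above, the <flavor> and <image> placeholders are hypothetical, and the exact CLI output fields can differ between releases:
# Number of allocated vCPUs (condition: 9 or more)
openstack server show f9b9ddb2-7ed3-4cf5-ba29-0da527dce8fb -c flavor
openstack flavor show <flavor> -c vcpus
# Virtio multi-queue is normally requested through the image property hw_vif_multiqueue_enabled=true
openstack image show <image> -c properties | grep -i multiqueue
# On the compute host, the resulting queue count is visible in the libvirt domain XML
# (in RHOSP 16, virsh runs inside the nova_libvirt container)
podman exec -it nova_libvirt virsh dumpxml instance-0000001d | grep queues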