Live migration fails due to a QEMU error in an in-place upgrade environment
Issue
- When this issue occurs, the live migration fails with the following error on the destination host (overcloud-compute-1).
- QEMU log on the destination host overcloud-compute-1 (see the note after the log excerpt for how to locate this file):
2022-10-18 04:56:01.615+0000: starting up libvirt version: 7.0.0, package: 14.3.module+el8.4.0+11878+84e54169 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2021-07-20-02:46:13, ), qemu version: 5.2.0qemu-kvm-5.2.0-16.module+el8.4.0+12393+838d9165.8, kernel: 4.18.0-305.19.1.el8_4.x86_64, hostname: overcloud-compute-1.localdomain
LC_ALL=C \
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
HOME=/var/lib/libvirt/qemu/domain-1-instance-0000001d \
XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-1-instance-0000001d/.local/share \
XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-1-instance-0000001d/.cache \
XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-1-instance-0000001d/.config \
QEMU_AUDIO_DRV=none \
/usr/libexec/qemu-kvm \
-name guest=instance-0000001d,debug-threads=on \
-S \
-object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1-instance-0000001d/master-key.aes \
-machine pc-i440fx-rhel7.6.0,accel=kvm,usb=off,dump-guest-core=off,memory-backend=pc.ram \
-cpu SandyBridge,vme=on,hypervisor=on,arat=on,xsaveopt=on \
-m 10240 \
-object memory-backend-ram,id=pc.ram,size=10737418240 \
-overcommit mem-lock=off \
-smp 10,sockets=10,dies=1,cores=1,threads=1 \
-uuid f9b9ddb2-7ed3-4cf5-ba29-0da527dce8fb \
-smbios 'type=1,manufacturer=Red Hat,product=OpenStack Compute,version=20.6.2-2.20210607104828.el8ost.4,serial=f9b9ddb2-7ed3-4cf5-ba29-0da527dce8fb,uuid=f9b9ddb2-7ed3-4cf5-ba29-0da527dce8fb,family=Virtual Machine' \
-no-user-config \
-nodefaults \
-chardev socket,id=charmonitor,fd=33,server,nowait \
-mon chardev=charmonitor,id=monitor,mode=control \
-rtc base=utc,driftfix=slew \
-global kvm-pit.lost_tick_policy=delay \
-no-hpet \
-no-shutdown \
-boot strict=on \
...<snip>...
-netdev tap,fds=35:36:37:38:39:40:41:42:43:44,id=hostnet0,vhost=on,vhostfds=45:46:47:48:49:50:51:52:53:54 \
-device virtio-net-pci,mq=on,vectors=22,rx_queue_size=512,host_mtu=1500,netdev=hostnet0,id=net0,mac=AA:AA:AA:AA:AA:AA,bus=pci.0,addr=0x3 \
-add-fd set=21,fd=56 \
-chardev pty,id=charserial0,logfile=/dev/fdset/21,logappend=on \
-device isa-serial,chardev=charserial0,id=serial0 \
-device usb-tablet,id=input0,bus=usb.0,port=1 \
-vnc BB.BB.BB.BB:0 \
-k ja \
-device cirrus-vga,id=video0,bus=pci.0,addr=0x2 \
-incoming defer \
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 \
-object rng-random,id=objrng0,filename=/dev/urandom \
-device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x6 \
-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
-msg timestamp=on
char device redirected to /dev/pts/0 (label charserial0)
2022-10-18T04:56:01.902671Z qemu-kvm: -device cirrus-vga,id=video0,bus=pci.0,addr=0x2: warning: 'cirrus-vga' is deprecated, please use a different VGA card instead
2022-10-18T04:56:03.999678Z qemu-kvm: get_pci_config_device: Bad config data: i=0x9a read: 11 device: 15 cmask: ff wmask: 0 w1cmask:0
2022-10-18T04:56:03.999711Z qemu-kvm: Failed to load PCIDevice:config
2022-10-18T04:56:03.999719Z qemu-kvm: Failed to load virtio-net:virtio
2022-10-18T04:56:03.999725Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:03.0/virtio-net'
2022-10-18T04:56:04.000902Z qemu-kvm: load of migration failed: Invalid argument
2022-10-18 04:56:04.213+0000: shutting down, reason=crashed
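- For reference, the QEMU log quoted above is the per-instance log written by libvirt on the destination compute node. A minimal sketch for viewing it, assuming the containerized libvirt layout used in RHOSP 16 (the exact path can vary between deployments):
# On overcloud-compute-1, as root
tail -n 50 /var/log/containers/libvirt/qemu/instance-0000001d.log
# Or from inside the nova_libvirt container
podman exec -it nova_libvirt tail -n 50 /var/log/libvirt/qemu/instance-0000001d.log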
- Here is the history of the instance:
1. Boot the instance on RHEL 7.9 on overcloud-compute-1
2. In-place upgrade (Hybrid mode) of both overcloud-compute-0 and overcloud-compute-1
3. Major upgrade of overcloud-compute-0 (RHEL 7.9 -> RHEL 8.4)
4. Live migration from overcloud-compute-1 to overcloud-compute-0 (RHEL 7.9 -> RHEL 8.4)
5. Major upgrade of overcloud-compute-1 (RHEL 7.9 -> RHEL 8.4)
6. Live migration from overcloud-compute-0 to overcloud-compute-1 ... failed (RHEL 8.4 -> RHEL 8.4)
Environment
- Red Hat OpenStack Platform 16.2 (RHOSP)
- nova
- The instance meets all of the following conditions (a minimal sketch for checking them follows this list):
- Multi-queue is enabled
- The number of allocated vCPUs is 9 or more
- The instance has been running since OSP 13
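- The following is a minimal sketch for checking these conditions; the server UUID f9b9ddb2-7ed3-4cf5-ba29-0da527dce8fb and domain name instance-0000001d are taken from the log above, the <flavor> and <image> placeholders are hypothetical, and the exact CLI output fields can differ between releases:
# Number of allocated vCPUs (condition: 9 or more)
openstack server show f9b9ddb2-7ed3-4cf5-ba29-0da527dce8fb -c flavor
openstack flavor show <flavor> -c vcpus
# Virtio multi-queue is normally requested through the image property hw_vif_multiqueue_enabled=true
openstack image show <image> -c properties | grep -i multiqueue
# On the compute host, the resulting queue count is visible in the libvirt domain XML
# (in RHOSP 16, virsh runs inside the nova_libvirt container)
podman exec -it nova_libvirt virsh dumpxml instance-0000001d | grep queues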