rhos7: live-migration of an instance on iSCSI works one time, but migrating back leaves instance unusable
Issue
- Live migration of an instance on iSCSI works one time, but migrating the instance back leaves it unusable:
Steps to reproduce the problem:
1. Create two boot volumes from an image
[stack@osp7dr1 ~(OC-admin)]$ cinder list
+--------------------------------------+--------+--------------+------+---------------+----------+--------------------------------------+
| ID | Status | Display Name | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+--------+--------------+------+---------------+----------+--------------------------------------+
| 6db84a20-cf8d-4fc4-9d7f-1d6455badbd6 | in-use | root_vmA1_12 | 10 | DX2-ISCSI-RG | true | 70fd494a-3065-4f42-b25c-3f320ce1393b |
| ae0fe26d-570f-4cbf-9b6c-464435a03516 | in-use | root_vmA1_2 | 10 | DX2-ISCSI-TPP | true | 2841a824-e693-40de-80b0-94cf72a03c97 |
+--------------------------------------+--------+--------------+------+---------------+----------+--------------------------------------+
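Step 1 can be sketched with the cinder v1 CLI shipped with OSP 7 (a sketch, not taken from the original report: the image ID is a placeholder, and the volume type names come from the listing above):

```shell
# Create two 10 GB bootable volumes from a Glance image.
# <image-id> is a placeholder for the image used in this environment.
cinder create --image-id <image-id> --volume-type DX2-ISCSI-RG  --display-name root_vmA1_12 10
cinder create --image-id <image-id> --volume-type DX2-ISCSI-TPP --display-name root_vmA1_2  10
```

Once both volumes reach the `available` state they can be used as boot volumes in the next step.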
2. Create two instances from the boot volumes
[stack@osp7dr1 ~(OC-admin)]$ nova list --name vmA1 --fields OS-EXT-SRV-ATTR:host,status,name
+--------------------------------------+------------------------------+--------+---------+
| ID | OS-EXT-SRV-ATTR: Host | Status | Name |
+--------------------------------------+------------------------------+--------+---------+
| 70fd494a-3065-4f42-b25c-3f320ce1393b | osp7r1-compute-1.localdomain | ACTIVE | vmA1_12 |
| 2841a824-e693-40de-80b0-94cf72a03c97 | osp7r1-compute-2.localdomain | ACTIVE | vmA1_2 |
+--------------------------------------+------------------------------+--------+---------+
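Step 2 can be sketched as follows (again a sketch: the flavor and network IDs are placeholders, while the volume IDs are taken from the `cinder list` output above):

```shell
# Boot one instance from each volume. Nova's scheduler placed them on
# different compute nodes in this reproduction.
nova boot --flavor <flavor> --nic net-id=<net-id> \
    --boot-volume 6db84a20-cf8d-4fc4-9d7f-1d6455badbd6 vmA1_12
nova boot --flavor <flavor> --nic net-id=<net-id> \
    --boot-volume ae0fe26d-570f-4cbf-9b6c-464435a03516 vmA1_2
```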
3. Configure Live Migration
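A typical live-migration setup for RHEL-OSP 7 compute nodes looks like the following. This is a generic sketch for a lab environment, not the exact configuration of the reporter's director-deployed overcloud; consult the product documentation for production values (TLS instead of unauthenticated TCP, in particular):

```shell
# On each compute node: let nova-compute perform true live migrations.
openstack-config --set /etc/nova/nova.conf DEFAULT vncserver_listen 0.0.0.0
openstack-config --set /etc/nova/nova.conf libvirt live_migration_flag \
    "VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE"

# Allow the libvirt daemons to talk to each other over TCP
# (lab setup only; production deployments should use TLS).
sed -i 's/^#listen_tls.*/listen_tls = 0/;
        s/^#listen_tcp.*/listen_tcp = 1/;
        s/^#auth_tcp.*/auth_tcp = "none"/' /etc/libvirt/libvirtd.conf
sed -i 's/^#LIBVIRTD_ARGS.*/LIBVIRTD_ARGS="--listen"/' /etc/sysconfig/libvirtd

systemctl restart libvirtd openstack-nova-compute
```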
4. Live Migrate instance "vmA1_12" from osp7r1-compute-1.localdomain to osp7r1-compute-2.localdomain
[stack@osp7dr1 ~(OC-admin)]$ nova live-migration vmA1_12 osp7r1-compute-2.localdomain
nova list --name vmA1 --fields OS-EXT-SRV-ATTR:host,status,name
+--------------------------------------+------------------------------+-----------+---------+
| ID | OS-EXT-SRV-ATTR: Host | Status | Name |
+--------------------------------------+------------------------------+-----------+---------+
| 70fd494a-3065-4f42-b25c-3f320ce1393b | osp7r1-compute-1.localdomain | MIGRATING | vmA1_12 |
| 2841a824-e693-40de-80b0-94cf72a03c97 | osp7r1-compute-2.localdomain | ACTIVE | vmA1_2 |
+--------------------------------------+------------------------------+-----------+---------+
5. After the migration, check the status of the instances; both should now be active on osp7r1-compute-2.localdomain:
nova list --name vmA1 --fields OS-EXT-SRV-ATTR:host,status,name
+--------------------------------------+------------------------------+--------+---------+
| ID | OS-EXT-SRV-ATTR: Host | Status | Name |
+--------------------------------------+------------------------------+--------+---------+
| 2841a824-e693-40de-80b0-94cf72a03c97 | osp7r1-compute-2.localdomain | ACTIVE | vmA1_2 |
| 70fd494a-3065-4f42-b25c-3f320ce1393b | osp7r1-compute-2.localdomain | ACTIVE | vmA1_12 |
+--------------------------------------+------------------------------+--------+---------+
[root@osp7r1-compute-2 ~]# virsh list
Id Name State
----------------------------------------------------
2 instance-00000005 running
3 instance-00000002 running
[root@osp7r1-compute-1 ~]# virsh list
Id Name State
----------------------------------------------------
[root@osp7r1-compute-1 ~]#
Check the multipath devices on both compute nodes:
[root@osp7r1-compute-1 ~]# multipath -ll
[root@osp7r1-compute-1 ~]#
[root@osp7r1-compute-2 ~]# multipath -ll
3600000e00d28000000280d7e00080000 dm-2 FUJITSU ,ETERNUS_DXL
size=10G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| |- 11:0:0:1 sdf 8:80 active ready running
| `- 12:0:0:1 sdg 8:96 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
|- 13:0:0:1 sdh 8:112 active ready running
`- 14:0:0:1 sdi 8:128 active ready running
3600000e00d28000000280d7e00060000 dm-0 FUJITSU ,ETERNUS_DXL
size=10G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| |- 13:0:0:0 sdd 8:48 active ready running
| `- 14:0:0:0 sde 8:64 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
|- 11:0:0:0 sdb 8:16 active ready running
`- 12:0:0:0 sdc 8:32 active ready running
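To confirm which multipath map each iSCSI block device belongs to, the WWID reported by every device can be printed with the udev `scsi_id` helper (a diagnostic sketch; it assumes the RHEL 7 location of `scsi_id` and the device names from the listing above):

```shell
# Print the WWID of each iSCSI-attached SCSI disk; devices with the same
# WWID are paths of the same multipath map.
for dev in /dev/sd{b..i}; do
    printf '%s ' "$dev"
    /usr/lib/udev/scsi_id --whitelisted --device="$dev"
done
```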
Make sure both instances are running and can write data to their disks:
Instance vmA1_12:
[root@ros2client ~]# ssh -i key-admin.pem -p 6840 cloud-user@172.0.0.112
Last login: Thu Dec 3 07:30:21 2015 from 172.0.0.33
[cloud-user@vma1-12 ~]$
[cloud-user@vma1-12 ~]$
[cloud-user@vma1-12 ~]$ sudo -i
[root@vma1-12 ~]# dd if=/dev/zero of=/home/file count=1000000 conv=sync bs=1024
1000000+0 records in
1000000+0 records out
1024000000 bytes (1.0 GB) copied, 1.62285 s, 631 MB/s
Instance vmA1_2:
[root@ros2client ~]# ssh -i key-admin.pem -p 6860 cloud-user@172.0.0.112
Last login: Thu Dec 3 07:30:36 2015 from 172.0.0.33
[cloud-user@vma1-2 ~]$ sudo -i
[root@vma1-2 ~]# dd if=/dev/zero of=/home/file count=1000000 conv=sync bs=1024
1000000+0 records in
1000000+0 records out
1024000000 bytes (1.0 GB) copied, 1.63544 s, 626 MB/s
[root@vma1-2 ~]#
6. Now live migrate instance vmA1_12 back to osp7r1-compute-1:
[stack@osp7dr1 ~(OC-admin)]$ nova live-migration vmA1_12 osp7r1-compute-1.localdomain
nova list --name vmA1 --fields OS-EXT-SRV-ATTR:host,status,name
+--------------------------------------+------------------------------+-----------+---------+
| ID | OS-EXT-SRV-ATTR: Host | Status | Name |
+--------------------------------------+------------------------------+-----------+---------+
| 2841a824-e693-40de-80b0-94cf72a03c97 | osp7r1-compute-2.localdomain | ACTIVE | vmA1_2 |
| 70fd494a-3065-4f42-b25c-3f320ce1393b | osp7r1-compute-2.localdomain | MIGRATING | vmA1_12 |
+--------------------------------------+------------------------------+-----------+---------+
nova list --name vmA1 --fields OS-EXT-SRV-ATTR:host,status,name
+--------------------------------------+------------------------------+--------+---------+
| ID | OS-EXT-SRV-ATTR: Host | Status | Name |
+--------------------------------------+------------------------------+--------+---------+
| 70fd494a-3065-4f42-b25c-3f320ce1393b | osp7r1-compute-1.localdomain | ACTIVE | vmA1_12 |
| 2841a824-e693-40de-80b0-94cf72a03c97 | osp7r1-compute-2.localdomain | ACTIVE | vmA1_2 |
+--------------------------------------+------------------------------+--------+---------+
7. Check the status of the instances after the live migration has finished.
Instance vmA1_12:
[root@ros2client ~]# ssh -i key-admin.pem -p 6840 cloud-user@172.0.0.112
Last login: Thu Dec 3 07:46:53 2015 from 172.0.0.33
[cloud-user@vma1-12 ~]$ sudo -i
[root@vma1-12 ~]# dd if=/dev/zero of=/home/file count=1000000 conv=sync bs=1024
1000000+0 records in
1000000+0 records out
1024000000 bytes (1.0 GB) copied, 1.52992 s, 669 MB/s
Instance vmA1_2:
[root@vma1-2 ~]# exit
logout
-bash: /root/.bash_logout: Input/output error
Bus error
[cloud-user@vma1-2 ~]$ exit
[root@ros2client ~]# ssh -i key-admin.pem -p 6860 cloud-user@172.0.0.112
ssh_exchange_identification: Connection closed by remote host
*** ERROR ***
The root disk of instance vmA1_2 is broken: the guest reports I/O errors, and new SSH connections are refused.
8. Check the multipath status on osp7r1-compute-2 where instance vmA1_2 lives:
[root@osp7r1-compute-2 ~]# multipath -ll
3600000e00d28000000280d7e00070000 dm-0
size=10G features='0' hwhandler='0' wp=rw
[root@osp7r1-compute-2 ~]# lsblk --scsi
NAME HCTL TYPE VENDOR MODEL REV TRAN
sda 0:2:0:0 disk FTS PRAID EP400i 4.25
sdf 11:0:0:1 disk FUJITSU ETERNUS_DXL 1033 iscsi
sdg 12:0:0:1 disk FUJITSU ETERNUS_DXL 1033 iscsi
sdh 13:0:0:1 disk FUJITSU ETERNUS_DXL 1033 iscsi
sdi 14:0:0:1 disk FUJITSU ETERNUS_DXL 1033 iscsi
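The `multipath -ll` output above shows that the map for the instance's volume has lost all of its path members, while the remaining sdf-sdi devices belong to the other volume. A small helper makes such orphaned maps easy to spot (a diagnostic sketch, not part of the original report; it assumes the standard `multipath -ll` layout where map header lines start at column 1 with the WWID and path lines are indented with an H:C:T:L tuple):

```shell
# print_orphan_maps: read `multipath -ll` output on stdin and print the
# WWID of every map that has no path members underneath it.
print_orphan_maps() {
    awk '
        # A map header line starts at column 1 and names the dm device.
        /^[^ |`]/ && / dm-/ {
            if (wwid != "" && paths == 0) print wwid
            wwid = $1; paths = 0
            next
        }
        # Indented lines containing an H:C:T:L tuple are path members.
        /[0-9]+:[0-9]+:[0-9]+:[0-9]+/ { paths++ }
        END { if (wwid != "" && paths == 0) print wwid }
    '
}

# Usage: multipath -ll | print_orphan_maps
```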
Check the configured disk device in the libvirt XML file:
[root@osp7r1-compute-2 qemu]# virsh list
Id Name State
----------------------------------------------------
2 instance-00000005 running
<devices>
<emulator>/usr/libexec/qemu-kvm</emulator>
<disk type='block' device='disk'>
<driver name='qemu' type='raw' cache='none'/>
<source dev='/dev/mapper/3600000e00d28000000280d7e00070000'/>
<target dev='vda' bus='virtio'/>
<serial>ae0fe26d-570f-4cbf-9b6c-464435a03516</serial>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</disk>
[root@osp7r1-compute-2 qemu]# ls -al /dev/mapper/3600000e00d28000000280d7e00070000
lrwxrwxrwx. 1 root root 7 Dec 3 13:49 /dev/mapper/3600000e00d28000000280d7e00070000 -> ../dm-0
Environment
- Red Hat Openstack Platform 7
- OSP-d 7.1.0, OSDP 7.0.2
- iSCSI hosting the instances