rhos7: live-migration of an instance on iSCSI works one time, but migrating back leaves instance unusable

Issue

  • Live migration of an instance on iSCSI works once, but migrating back leaves the instance unusable.
Steps to reproduce the problem:

1. Create two boot volumes from an image

[stack@osp7dr1 ~(OC-admin)]$ cinder list
+--------------------------------------+--------+--------------+------+---------------+----------+--------------------------------------+
|                  ID                  | Status | Display Name | Size |  Volume Type  | Bootable |             Attached to              |
+--------------------------------------+--------+--------------+------+---------------+----------+--------------------------------------+
| 6db84a20-cf8d-4fc4-9d7f-1d6455badbd6 | in-use | root_vmA1_12 |  10  |  DX2-ISCSI-RG |   true   | 70fd494a-3065-4f42-b25c-3f320ce1393b |
| ae0fe26d-570f-4cbf-9b6c-464435a03516 | in-use | root_vmA1_2  |  10  | DX2-ISCSI-TPP |   true   | 2841a824-e693-40de-80b0-94cf72a03c97 |
+--------------------------------------+--------+--------------+------+---------------+----------+--------------------------------------+

2. Create two instances from the boot volumes

[stack@osp7dr1 ~(OC-admin)]$ nova list --name vmA1 --fields OS-EXT-SRV-ATTR:host,status,name
+--------------------------------------+------------------------------+--------+---------+
| ID                                   | OS-EXT-SRV-ATTR: Host        | Status | Name    |
+--------------------------------------+------------------------------+--------+---------+
| 70fd494a-3065-4f42-b25c-3f320ce1393b | osp7r1-compute-1.localdomain | ACTIVE | vmA1_12 |
| 2841a824-e693-40de-80b0-94cf72a03c97 | osp7r1-compute-2.localdomain | ACTIVE | vmA1_2  |
+--------------------------------------+------------------------------+--------+---------+

3. Configure Live Migration
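
The report does not show how live migration was configured in step 3. As a reference, a minimal sketch of a typical non-TLS TCP setup on the compute nodes is shown here; the exact values are an illustration and are not taken from this case:

# /etc/libvirt/libvirtd.conf (both compute nodes) - allow plain TCP connections
listen_tls = 0
listen_tcp = 1
auth_tcp = "none"

# /etc/sysconfig/libvirtd - make libvirtd listen on the network
LIBVIRTD_ARGS="--listen"

# /etc/nova/nova.conf, [libvirt] section - migrate over plain TCP
live_migration_uri = qemu+tcp://%s/system

# apply the changes on each compute node
systemctl restart libvirtd openstack-nova-compute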

4. Live Migrate instance "vmA1_12" from osp7r1-compute-1.localdomain to osp7r1-compute-2.localdomain

[stack@osp7dr1 ~(OC-admin)]$ nova live-migration vmA1_12 osp7r1-compute-2.localdomain

nova list --name vmA1 --fields OS-EXT-SRV-ATTR:host,status,name            
+--------------------------------------+------------------------------+-----------+---------+
| ID                                   | OS-EXT-SRV-ATTR: Host        | Status    | Name    |
+--------------------------------------+------------------------------+-----------+---------+
| 70fd494a-3065-4f42-b25c-3f320ce1393b | osp7r1-compute-1.localdomain | MIGRATING | vmA1_12 |
| 2841a824-e693-40de-80b0-94cf72a03c97 | osp7r1-compute-2.localdomain | ACTIVE    | vmA1_2  |
+--------------------------------------+------------------------------+-----------+---------+
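
While vmA1_12 is in MIGRATING state, the data transfer can also be followed with libvirt on the source compute node; a brief sketch (the <domain> placeholder stands for the instance's libvirt name, e.g. as reported by nova show under OS-EXT-SRV-ATTR:instance_name; output is not part of this report):

[root@osp7r1-compute-1 ~]# virsh domjobinfo <domain>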

5. After the migration, check the status of the instances - both should now be active on osp7r1-compute-2.localdomain

nova list --name vmA1 --fields OS-EXT-SRV-ATTR:host,status,name

+--------------------------------------+------------------------------+--------+---------+
| ID                                   | OS-EXT-SRV-ATTR: Host        | Status | Name    |
+--------------------------------------+------------------------------+--------+---------+
| 2841a824-e693-40de-80b0-94cf72a03c97 | osp7r1-compute-2.localdomain | ACTIVE | vmA1_2  |
| 70fd494a-3065-4f42-b25c-3f320ce1393b | osp7r1-compute-2.localdomain | ACTIVE | vmA1_12 |
+--------------------------------------+------------------------------+--------+---------+
[root@osp7r1-compute-2 ~]# virsh list
 Id    Name                           State
----------------------------------------------------
 2     instance-00000005              running
 3     instance-00000002              running

[root@osp7r1-compute-1 ~]# virsh list
 Id    Name                           State
----------------------------------------------------

[root@osp7r1-compute-1 ~]#
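
To correlate the libvirt domain names shown by virsh with the nova instance names, the hypervisor host and libvirt name can be read from nova show; a brief example (the fields exist in the Kilo nova CLI, output not reproduced here):

[stack@osp7dr1 ~(OC-admin)]$ nova show vmA1_12 | grep -E "OS-EXT-SRV-ATTR:hypervisor_hostname|OS-EXT-SRV-ATTR:instance_name"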

Check the multipath devices on both compute nodes:

[root@osp7r1-compute-1 ~]# multipath -ll
[root@osp7r1-compute-1 ~]#

[root@osp7r1-compute-2 ~]# multipath -ll
3600000e00d28000000280d7e00080000 dm-2 FUJITSU ,ETERNUS_DXL
size=10G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| |- 11:0:0:1 sdf 8:80  active ready running
| `- 12:0:0:1 sdg 8:96  active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  |- 13:0:0:1 sdh 8:112 active ready running
  `- 14:0:0:1 sdi 8:128 active ready running
3600000e00d28000000280d7e00060000 dm-0 FUJITSU ,ETERNUS_DXL
size=10G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| |- 13:0:0:0 sdd 8:48  active ready running
| `- 14:0:0:0 sde 8:64  active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  |- 11:0:0:0 sdb 8:16  active ready running
  `- 12:0:0:0 sdc 8:32  active ready running
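
To confirm which multipath device each running domain is attached to, the block devices of the domains can also be listed on the compute node; a small sketch (standard libvirt command, output omitted here):

[root@osp7r1-compute-2 ~]# virsh domblklist instance-00000005
[root@osp7r1-compute-2 ~]# virsh domblklist instance-00000002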

Make sure both instances are running and can write data to their disks:

Instance vmA1_12:

[root@ros2client ~]# ssh -i key-admin.pem -p 6840 cloud-user@172.0.0.112
Last login: Thu Dec  3 07:30:21 2015 from 172.0.0.33
[cloud-user@vma1-12 ~]$
[cloud-user@vma1-12 ~]$
[cloud-user@vma1-12 ~]$ sudo -i
[root@vma1-12 ~]# dd if=/dev/zero of=/home/file count=1000000 conv=sync bs=1024
1000000+0 records in
1000000+0 records out
1024000000 bytes (1.0 GB) copied, 1.62285 s, 631 MB/s

Instance vmA1_2:

[root@ros2client ~]# ssh -i key-admin.pem -p 6860 cloud-user@172.0.0.112
Last login: Thu Dec  3 07:30:36 2015 from 172.0.0.33
[cloud-user@vma1-2 ~]$ sudo -i
[root@vma1-2 ~]#  dd if=/dev/zero of=/home/file count=1000000 conv=sync bs=1024
1000000+0 records in
1000000+0 records out
1024000000 bytes (1.0 GB) copied, 1.63544 s, 626 MB/s
[root@vma1-2 ~]#

6. Now live migrate instance vmA1_12 back to osp7r1-compute-1

[stack@osp7dr1 ~(OC-admin)]$ nova live-migration vmA1_12 osp7r1-compute-1.localdomain

nova list --name vmA1 --fields OS-EXT-SRV-ATTR:host,status,name
+--------------------------------------+------------------------------+-----------+---------+
| ID                                   | OS-EXT-SRV-ATTR: Host        | Status    | Name    |
+--------------------------------------+------------------------------+-----------+---------+
| 2841a824-e693-40de-80b0-94cf72a03c97 | osp7r1-compute-2.localdomain | ACTIVE    | vmA1_2  |
| 70fd494a-3065-4f42-b25c-3f320ce1393b | osp7r1-compute-2.localdomain | MIGRATING | vmA1_12 |
+--------------------------------------+------------------------------+-----------+---------+

nova list --name vmA1 --fields OS-EXT-SRV-ATTR:host,status,name
+--------------------------------------+------------------------------+--------+---------+
| ID                                   | OS-EXT-SRV-ATTR: Host        | Status | Name    |
+--------------------------------------+------------------------------+--------+---------+
| 70fd494a-3065-4f42-b25c-3f320ce1393b | osp7r1-compute-1.localdomain | ACTIVE | vmA1_12 |
| 2841a824-e693-40de-80b0-94cf72a03c97 | osp7r1-compute-2.localdomain | ACTIVE | vmA1_2  |
+--------------------------------------+------------------------------+--------+---------+

7. Check the status of the instances after the live migration has finished

Instance vmA1_12:

[root@ros2client ~]# ssh -i key-admin.pem -p 6840 cloud-user@172.0.0.112
Last login: Thu Dec  3 07:46:53 2015 from 172.0.0.33
[cloud-user@vma1-12 ~]$ sudo -i
[root@vma1-12 ~]# dd if=/dev/zero of=/home/file count=1000000 conv=sync bs=1024
1000000+0 records in
1000000+0 records out
1024000000 bytes (1.0 GB) copied, 1.52992 s, 669 MB/s

Instance vmA1_2:

[root@vma1-2 ~]# exit
logout
-bash: /root/.bash_logout: Input/output error
Bus error
[cloud-user@vma1-2 ~]$ exit

[root@ros2client ~]# ssh -i key-admin.pem -p 6860 cloud-user@172.0.0.112
ssh_exchange_identification: Connection closed by remote host

*** ERROR ***
The root disk of instance vmA1_2 is no longer accessible.
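
Because SSH to vmA1_2 no longer succeeds, the guest console log can still be pulled from the controller to confirm the I/O errors seen inside the guest; a hedged example (the command exists in the nova CLI, its output is not captured in this report):

[stack@osp7dr1 ~(OC-admin)]$ nova console-log vmA1_2 | tail -n 50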

8. Check the multipath status on osp7r1-compute-2 where instance vmA1_2 lives:

[root@osp7r1-compute-2 ~]# multipath -ll
3600000e00d28000000280d7e00070000 dm-0
size=10G features='0' hwhandler='0' wp=rw

[root@osp7r1-compute-2 ~]# lsblk --scsi
NAME HCTL       TYPE VENDOR   MODEL             REV TRAN
sda  0:2:0:0    disk FTS      PRAID EP400i     4.25
sdf  11:0:0:1   disk FUJITSU  ETERNUS_DXL      1033 iscsi
sdg  12:0:0:1   disk FUJITSU  ETERNUS_DXL      1033 iscsi
sdh  13:0:0:1   disk FUJITSU  ETERNUS_DXL      1033 iscsi
sdi  14:0:0:1   disk FUJITSU  ETERNUS_DXL      1033 iscsi
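
At this point the only remaining multipath map (3600000e00d28000000280d7e00070000) has lost all of its paths, and lsblk shows only the ETERNUS devices for LUN 1. Whether the iSCSI sessions are still logged in, and which SCSI disks they currently expose, can be verified with iscsiadm; a hedged example (standard open-iscsi invocation, output not recorded here):

[root@osp7r1-compute-2 ~]# iscsiadm -m session
[root@osp7r1-compute-2 ~]# iscsiadm -m session -P 3 | grep -E 'Target:|Attached scsi disk'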

Check the configured disk device in the libvirt XML file:

[root@osp7r1-compute-2 qemu]# virsh list
 Id    Name                           State
----------------------------------------------------
 2     instance-00000005              running

  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source dev='/dev/mapper/3600000e00d28000000280d7e00070000'/>
      <target dev='vda' bus='virtio'/>
      <serial>ae0fe26d-570f-4cbf-9b6c-464435a03516</serial>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>

[root@osp7r1-compute-2 qemu]# ls -al /dev/mapper/3600000e00d28000000280d7e00070000
lrwxrwxrwx. 1 root root 7 Dec  3 13:49 /dev/mapper/3600000e00d28000000280d7e00070000 -> ../dm-0
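
The <serial> element in the domain XML is the Cinder volume UUID; it matches root_vmA1_2 from the cinder list in step 1, so the dead multipath map belongs to the root volume of vmA1_2. The volume can be inspected from the controller; a brief example (output not reproduced here):

[stack@osp7dr1 ~(OC-admin)]$ cinder show ae0fe26d-570f-4cbf-9b6c-464435a03516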

Environment

  • Red Hat OpenStack Platform 7
  • OSP-d 7.1.0, OSDP 7.0.2
  • iSCSI-backed Cinder volumes hosting the instances
