Direct Attached LUNs Greyed Out after Migration to New SAN
Our team is getting close to the end of a migration of all of our storage from a pair of XIVs and a very old Clariion to a pair of Storwize V7000s. Leaving the most difficult case for last, I needed to move two directly attached LUNs on a RHEV VM running Windows Server 2008 R2. This morning, my plan was to migrate the VM's LUNs from the XIV to the V7000. What made this a challenge was that it involved a cluster of RHEV nodes rather than a single server. The Storwize migration wizard specifically warns against attempting to move LUNs associated with clustered servers. Sure enough, I hit a snag that I feared I would not be able to pull back from. In the interest of reducing stress in all of our lives, I'm sharing my story here:
I used a feature common to SANs that allows you to mount copies of your LUNs from your new SAN while your data is copied in the background from the old SAN to the new one. Ordinarily, you can start your server as soon as you have attached the LUN copies from the new SAN, but this was not to be the case this morning. When I got to the point where I added the LUNs (now hosted on the new V7000) to my VM, I found that RHEV thought that my 150GB LUN was only 80GB. (I have learned in testing that attempting to mount such a LUN will end in tears.) The other LUN looked correct, with a size of about 300GB.
At this point, the migration was proceeding, in that my data was being copied from the XIV to the V7000. However, with a likely corrupted LUN, the prospects for mounting and booting my server started to fade. So I backed off a little by removing the mappings to the RHEV nodes. I then waited until the migration was well over 80GB, and I remounted the LUNs on the RHEV cluster nodes*. This time, RHEV reported that my LUNs were the correct size. However, they were greyed out. I attempted to select one of the LUNs anyway, and was rewarded with an error message indicating that this particular LUN (the UID matched) was already in use by a disk. The alias for that disk matched the one my VM's disk had before I detached it from RHEV.
I took a guess and went to the "Disks" tab in RHEVM. There, I found two items that matched the disks. Both had the old disk alias names on them. One showed the incorrect size of 80GB. I deleted both, went back to the VM and found my migrated disks, no longer greyed out. I selected both disks and was able to bring the server back up before the end of the maintenance period.
The full procedure, step by step:
- RHEV: Power-off VM.
- RHEV: Deactivate both disks.
- RHEV: Remove LUN from VM.
- XIV: Unmap each RHEV node from LUN.
- XIV: Map LUN to host "V7000".
- Storwize: Run Migration Wizard.
- Storwize: Skip "Configure Hosts."
- Before you map the LUN, make sure the migration is over 50% complete, or over 80GB. (Those metrics are guesstimates.)
- Storwize: Map the LUN to the RHEV nodes. (Hosts - Modify LUN Mappings)
- RHEV: Add volume to the VM.**
- VM: Power up your VM and check your LUNs.
- Storwize: When done, finalize the migration.
- Storwize: Rename the migrated LUN.
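The "over 50% complete, or over 80GB" check in the list above is simple arithmetic, and it can be sketched as a small helper. This is only an illustration: the function names and parameters are made up for this post, the thresholds are the same guesstimates as above, and the progress percentage is assumed to be read by hand from the Storwize migration view rather than fetched through any API.

```python
def copied_gb(lun_size_gb: float, progress_percent: float) -> float:
    """Estimate how many GB have been copied to the new SAN so far."""
    return lun_size_gb * progress_percent / 100.0


def ready_to_map(
    lun_size_gb: float,
    progress_percent: float,
    min_percent: float = 50.0,    # guesstimate from the steps above
    min_copied_gb: float = 80.0,  # guesstimate from the steps above
) -> bool:
    """Return True once either guesstimate threshold has been passed."""
    over_percent = progress_percent > min_percent
    over_size = copied_gb(lun_size_gb, progress_percent) > min_copied_gb
    return over_percent or over_size


# The two LUN sizes from this migration, with hypothetical progress values.
for size_gb, percent in [(150, 40), (150, 55), (300, 30)]:
    print(size_gb, percent, copied_gb(size_gb, percent), ready_to_map(size_gb, percent))
```

For the 150GB LUN, 40% is only 60GB copied, so the helper says to keep waiting. The 300GB LUN passes the 80GB mark at 30%, well before it is half done.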
*RHEV clusters don't recognize new LUNs unless you select each one under the storage tab, in the "New Domain" dialog. It's boring. It's tedious. Hopefully it will be fixed at some point. You need to do this after adding any new LUNs to your RHEV nodes.
**If you find that your volume is present, but greyed out, pay attention to the error messages as you attempt to add it to your VM. The error may indicate that RHEV remembers the LUN ID from the XIV copy of the LUN. At this point, you might find your old disks in the "Disks" tab. You can remove them from RHEV there.
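If the "Disks" tab is long, the matching I did by eye can be sketched in a few lines. This is a rough illustration only: the `DiskRecord` type, the aliases, and the LUN IDs below are all hypothetical, and the records are assumed to be typed in (or exported) by hand. It does not call RHEV at all; it just shows the idea of flagging any leftover disk entry whose LUN ID matches a LUN you are trying to re-add.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class DiskRecord:
    """One row from the "Disks" tab (hypothetical, hand-entered)."""
    alias: str
    lun_id: str
    size_gb: int


def find_stale_disks(
    disks: Iterable[DiskRecord], migrated_lun_ids: Set[str]
) -> List[DiskRecord]:
    """Return disk entries that still claim one of the migrated LUN IDs."""
    return [disk for disk in disks if disk.lun_id in migrated_lun_ids]


# Made-up rows: two leftovers from the old mapping and one unrelated disk.
disks_tab = [
    DiskRecord("winvm_Disk1", "lun-uid-0001", 80),
    DiskRecord("winvm_Disk2", "lun-uid-0002", 300),
    DiskRecord("othervm_Disk1", "lun-uid-0099", 50),
]

stale = find_stale_disks(disks_tab, {"lun-uid-0001", "lun-uid-0002"})
for disk in stale:
    print(f"{disk.alias}: {disk.size_gb}GB, LUN {disk.lun_id}")
```

Anything the helper flags is a candidate for removal in the "Disks" tab, but check the alias and size against what you expect before deleting.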