Disk not properly removed from Node causing data corruption on persistent volume with OpenShift Container Platform 4 on Azure

Solution Verified - Updated -

Issue

  • An application reported data loss on its volume after a redeployment. On inspection, the following event had been reported:

    MountVolume.MountDevice failed for volume "pvc-aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa": azureDisk - mountDevice:FormatAndMount failed with format of disk "/dev/disk/azure/scsi1/lun0" failed: type:("ext4") target:("/var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/mAAAAAAAAAA") options:("defaults") errcode:(exit status 1) output:(mke2fs 1.45.6 (20-Mar-2020)
    Discarding device blocks:    4096/6553600               failed - Remote I/O error
    Creating filesystem with 6553600 4k blocks and 1638400 inodes
    Filesystem UUID: bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb
    Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, 4096000
    Allocating group tables:   0/200 done
    Writing inode tables:   0/200 done
    Creating journal (32768 blocks): done
    Writing superblocks and filesystem accounting information:   0/200 mkfs.ext4: Input/output error while writing out and closing file system
    
  • During a redeployment of the application, its persistent volume was corrupted because a previous volume on the OpenShift Node was not correctly detached when running on Azure.

  • When a disk is detached from an OpenShift Node on Azure, storvsc does not remove the disk; instead, the message "Invalid packet len" appears in the Node's journal.
  • Intermittently, when a disk is detached at the (Azure) hypervisor, storvsc fails to process the vmbus event sent by the hypervisor. It only prints the message "Invalid packet len" and does not proceed with the SCSI bus re-scan and the removal of the disk within the Node.
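
To check whether a Node has hit this condition, its journal can be searched for the message described above. The following is a minimal sketch in Python, not an official tool: the function name find_invalid_packet_len and the sample journal lines are hypothetical, and only the message text "Invalid packet len" is taken from this article. The exact prefix and layout of real journal lines may differ.

```python
import re
from typing import Iterable, List

# Message text quoted in this article. The surrounding line format
# (timestamp, host name, driver prefix) is an assumption and is not matched.
MESSAGE = re.compile(r"Invalid packet len")


def find_invalid_packet_len(lines: Iterable[str]) -> List[str]:
    """Return every journal line that contains the storvsc error message."""
    return [line.rstrip("\n") for line in lines if MESSAGE.search(line)]


# Hypothetical sample lines, for illustration only.
sample = [
    "Jan 01 00:00:00 node-0 kernel: hv_storvsc: Invalid packet len\n",
    "Jan 01 00:00:01 node-0 kernel: sd 1:0:0:0: [sdc] Attached SCSI disk\n",
]

for hit in find_invalid_packet_len(sample):
    print(hit)
```

To scan real data, the function can be given any iterable of lines, for example an open file containing the Node's journal output. A match only shows that the message was logged; it does not by itself prove that a volume was corrupted.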

Environment

  • Red Hat OpenShift Container Platform (RHOCP) before 4.13
  • Microsoft Azure
