Tape drives are not visible on Red Hat Enterprise Linux 6.5

Hi Experts,

The server is running Red Hat Enterprise Linux Server release 6.5 on an HP ProLiant DL380p. The tape drives are connected from Sun StorageTek, and when we list them with lsscsi -g | grep -i tape we can see all the tape drives connected to this server. But when we pull the tape drive information with the Data Protector command /opt/omni/lbin/devbra -dev, we see a different number of tape drives. The difference in the drive count also varies most of the time.
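To make the comparison repeatable, the lsscsi side of the count can be scripted. This is only a minimal sketch: the helper name count_tapes is hypothetical, and it simply counts lines whose device type column reads "tape". The devbra output still has to be counted separately, since its format is not shown in this thread.

```shell
# Hypothetical helper: count tape devices in lsscsi output read from stdin.
# The second whitespace-separated column of lsscsi output is the device type.
count_tapes() {
    awk '$2 == "tape" { n++ } END { print n + 0 }'
}

# Example use on the affected host:
#   lsscsi -g | count_tapes
```

Running it a few times in a row should show whether the lsscsi count itself is stable or whether only the devbra count moves.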

Responses

Hello Lijin,

Does the lsscsi command return the correct number of tape devices?

Also, could you please try the intensely laborious and annoying test of ensuring all the tape drives are loaded with media, and then run the devbra -dev command again?

Many thanks,
Mark

Hello Mark - Yes, the drives are visible when we execute the lsscsi command, but the same number of tape drives is not visible when we execute the devbra command. Only a few tape drives are visible, even after loading media in all the tape drives.

Thank you Lijin,

I'm assuming lsscsi does show the correct number.

I surmise that the tools are using different methods to interrogate the devices.

I would get an strace of each tool (probably a few from the omniback cmd) to see firstly how lsscsi is getting the device list, and then to see both how devbra is getting its list and how it is interrogating the devices.
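As a rough sketch of how those traces could be captured and compared, assuming strace is installed and the commands are run as root: the output paths under /tmp are arbitrary, and both function names are hypothetical helpers, not part of either tool.

```shell
# Capture a trace of each tool, following child processes (-f) and
# writing the trace to a file (-o). Output paths are arbitrary.
capture_traces() {
    strace -f -o /tmp/lsscsi.trace lsscsi -g > /dev/null
    strace -f -o /tmp/devbra.trace /opt/omni/lbin/devbra -dev > /dev/null
}

# Print the open()/openat() calls in a trace file that returned an error.
failed_opens() {
    grep -E '(^|[[:space:]])(open|openat)\(' "$1" | grep -E '= -1 E[A-Z]+'
}
```

After running capture_traces, something like failed_opens /tmp/devbra.trace should list any device nodes that devbra tried to open and could not, which is a reasonable first place to look for the unreported drives.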

I would expect to see failures or errors on unreported devices in the devbra output.

Depending on how devbra gets its information, I would consider if the problem is in RHEL or with OmniBack. Once that is determined, it would be time for a new bugzilla.

Looking around the net, loading media seemed to solve the problem in a lot of cases. As you have tried it without success, we need a new entry point for troubleshooting.

strace can be a little overkill, but it is also the quickest route to determining where the problem is.

Do we see any errors in /var/log/messages? Specifically, anything similar to:

kernel: st 1:0:1:0: reservation conflict

This would suggest that SCSI Reservations are in use and have not been freed on the device.
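A quick way to pull such lines out of the log is sketched below. It assumes the messages have the same "kernel: st H:C:T:L: text" shape as the sample above; the function name st_messages is hypothetical.

```shell
# Hypothetical helper: summarise messages logged by the st (SCSI tape) driver.
# Prints one line per distinct "address: message" pair, prefixed by how
# many times it occurred in the log.
st_messages() {
    local logfile=${1:-/var/log/messages}
    grep -oE 'kernel: st [0-9]+:[0-9]+:[0-9]+:[0-9]+: .*' "$logfile" \
        | sed 's/^kernel: st //' \
        | sort | uniq -c
}
```

Called with no argument it reads /var/log/messages; a rotated log can be passed as the first argument instead.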

Any other SCSI issues should be looked at too. You may also find an error about a deprecated system call made by devbra. This error is harmless: the method is deprecated but still usable, not yet obsoleted.

There are not many historical cases implicating devbra, and where I found them, the investigation was inconclusive or the problem was solved by reseating the HBA cards.

Let me know if you find anything useful in the strace and I can push for a bugzilla if the fault is with RHEL. If not, you might want to consider opening a case with us and we will co-ordinate with HP via TSANet.

Hello Lijin, Hello Mark,

I'm currently facing what seems to be the same issue, as the symptoms are the same. When scanning with devbra -dev, I can see 28 devices and only 25 with the lsscsi command.

Following Mark's suggestion, I checked my /var/log/messages file and found a similar-looking error. It repeats in a loop for exactly 3 different devices, which matches the number of "missing" drives:

kernel: st 1:0:20:0: rejecting I/O to offline device
kernel: st 1:0:26:0: rejecting I/O to offline device
kernel: st 1:0:33:0: rejecting I/O to offline device
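For anyone following along, the kernel's own view of each of these addresses can be read from sysfs, where every SCSI device exposes a state file (typically "running" or "offline"). This is a minimal sketch: the function name scsi_state is hypothetical, and the optional second argument only exists so the function can be pointed at a copy of the tree instead of the live /sys.

```shell
# Hypothetical helper: print the kernel state of one SCSI device.
# $1 = SCSI address as host:channel:target:lun, $2 = sysfs root (default /sys).
scsi_state() {
    local addr=$1 sysroot=${2:-/sys}
    local f="$sysroot/class/scsi_device/$addr/device/state"
    if [ -r "$f" ]; then
        printf '%s %s\n' "$addr" "$(cat "$f")"
    else
        # No state file: the kernel does not know this address.
        printf '%s unknown\n' "$addr"
    fi
}

# Example use for the three addresses from the log:
#   for addr in 1:0:20:0 1:0:26:0 1:0:33:0; do scsi_state "$addr"; done
```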

I'm new to strace, so I haven't had the chance to try it yet, but maybe this info can help. A case is already open with HP, so I'll let you know what the investigation finds.

Thanks

I'm updating this topic as I had some exchanges with HP support.

At their request, I installed the sg3_utils package and performed the following test, which confirmed there was an issue with my device (here st16):

# sg_inq /dev/st16 
sg_inq: error opening file: /dev/st16: No such device or address

But it is correctly reported by lsscsi:

# lsscsi | grep st16
[1:0:20:0]   tape    HP       Ultrium 5-SCSI   ED51  /dev/st16

They then confirmed that the problem was not related to Data Protector, so they cannot assist further on this matter.

Have you had any lead since your last post?

Thanks

Hi Tony,

Where you are seeing "rejecting I/O to offline device", we have found that it is always a hardware-related problem.

We publish guidance only in the form of the article "Red Hat Enterprise Linux hangs or files are not accessible, error in logs is 'rejecting I/O to offline device'". I'll cut to the chase, though: you probably need to speak to the tape vendor again.

However, I'm inclined to ask the question: "lsscsi appears to report what it expects (or what is in cache) and sg_inq appears to report the result from the drive, so which is correct?"

The simplest way to check is to run ls -l /dev/st16. If that device isn't there, we may need to try to force it out of the kernel and rescan the bus.

The article "How to rescan the SCSI bus to add or remove a SCSI device without rebooting the computer" covers how to do that, but read it all first please!

If the file is there, then the HW Vendor needs to check the device.
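For reference, the remove-and-rescan step mentioned above usually comes down to two writes into sysfs. The sketch below is not a substitute for the article: it assumes the standard delete and scan attributes, must be run as root, and should not be used on a drive that is in the middle of a backup. The function name rescan_device is hypothetical, and the optional second argument is there only so the function can be dry-run against a copy of the tree.

```shell
# Hypothetical helper: drop one SCSI device from the kernel, then rescan its host.
# $1 = SCSI address as host:channel:target:lun, $2 = sysfs root (default /sys).
rescan_device() {
    local addr=$1 sysroot=${2:-/sys}
    local host=${addr%%:*}    # host number is the first field of the address
    local del="$sysroot/class/scsi_device/$addr/device/delete"
    local scan="$sysroot/class/scsi_host/host$host/scan"

    # Remove the stale device entry if the kernel still has one.
    [ -w "$del" ] && echo 1 > "$del"

    # "- - -" is a wildcard for channel, target and LUN: rescan everything on this host.
    echo "- - -" > "$scan"
}

# Example use for the st16 device at 1:0:20:0:
#   rescan_device 1:0:20:0
```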

Let us know,
Mark

Hi Mark,

Thank you for your feedback.

I've performed the test you recommended:

# ls -l /dev/st16
crw-rw---- 1 root disk 9, 16 Feb 28 14:44 /dev/st16

Seeing this result, I went back to HP with your comment so they can now have a look at the hardware side.
The tricky part is that it's not actually a physical drive but an emulated one, so even if our D2D is physically plugged in correctly, there might be some configuration error on this virtual drive that I couldn't find.

I'll keep you informed of our progress on this if it can help in the future.

Thanks again,
Tony
