RHEL 5.8 Occasional panic during tape write (backups)

Solution Verified - Updated -

Issue

  • A server periodically crashes while running a backup to FC-attached tape.
  • Something called from st_do_scsi overwrites the return instruction pointer (IP) on the kernel stack with a bad value (000000000000005f).
    • The bad value is the same in all dumps.
  • Because the crash occurs on return from a function whose return IP is incorrect, the dump does not show where execution was when the corruption happened.
  • The problem happens around once a month.
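
The return IP value quoted above can be recognised as bogus without any knowledge of the driver. The following is a minimal sketch, assuming the standard x86_64 layout with 48-bit virtual addresses, in which kernel code lives in the upper canonical half of the address space. The function name and constant are illustrative and do not come from the crash dump or from any Red Hat tool.

```python
# Assumption: x86_64 with 48-bit virtual addresses, where kernel
# addresses lie in the upper canonical half of the address space.
UPPER_HALF_START = 0xFFFF800000000000
ADDRESS_MAX = 0xFFFFFFFFFFFFFFFF


def looks_like_kernel_address(addr: int) -> bool:
    """Return True if addr falls in the upper canonical half (kernel space)."""
    return UPPER_HALF_START <= addr <= ADDRESS_MAX


# The value found in place of the return IP, as reported in the dumps.
bad_return_ip = int("000000000000005f", 16)
print(hex(bad_return_ip), looks_like_kernel_address(bad_return_ip))
```

A value such as 0x5f fails this check, so it cannot be a legitimate return address into kernel text. This is consistent with the stack slot having been overwritten with data rather than with a wrong but valid code pointer.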

Environment

  • Red Hat Enterprise Linux 5 Update 8
    • x86_64 RHEL 5.8 kernel version 2.6.18-308.8.2.el5
    • The customer has a SystemTap script loaded (iostat_60s)
  • Dell Inc./PowerEdge R715
  • The tape drives are:
  Vendor: HP        Model: Ultrium 5-SCSI    Rev: I57Z
  Type:   Sequential-Access                  ANSI SCSI revision: 06
  Vendor: HP        Model: Ultrium 5-SCSI    Rev: I57Z
  Type:   Sequential-Access                  ANSI SCSI revision: 06
  Vendor: HP        Model: Ultrium 5-SCSI    Rev: I57Z
  Type:   Sequential-Access                  ANSI SCSI revision: 06
  Vendor: HP        Model: Ultrium 5-SCSI    Rev: I57Z
  Type:   Sequential-Access                  ANSI SCSI revision: 06
  Vendor: HP        Model: Ultrium 5-SCSI    Rev: I57Z
  Type:   Sequential-Access                  ANSI SCSI revision: 06
  Vendor: HP        Model: Ultrium 5-SCSI    Rev: I57Z
  Type:   Sequential-Access                  ANSI SCSI revision: 06
  Vendor: ADIC      Model: Scalar i6000      Rev: 615Q
  Type:   Medium Changer                     ANSI SCSI revision: 03
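
A device listing in this format can be summarised with a short script when comparing environments. The following is a sketch only: it assumes the two-line Vendor/Type layout shown above, uses a shortened sample of that listing as input, and the function name is hypothetical.

```python
import re
from collections import Counter

# Shortened sample in the same layout as the listing above.
SAMPLE = """\
  Vendor: HP        Model: Ultrium 5-SCSI    Rev: I57Z
  Type:   Sequential-Access                  ANSI SCSI revision: 06
  Vendor: HP        Model: Ultrium 5-SCSI    Rev: I57Z
  Type:   Sequential-Access                  ANSI SCSI revision: 06
  Vendor: ADIC      Model: Scalar i6000      Rev: 615Q
  Type:   Medium Changer                     ANSI SCSI revision: 03
"""

VENDOR_RE = re.compile(
    r"Vendor:\s*(?P<vendor>.+?)\s+Model:\s*(?P<model>.+?)\s+Rev:\s*(?P<rev>\S+)"
)
TYPE_RE = re.compile(r"Type:\s*(?P<type>.+?)\s+ANSI")


def summarize_devices(text: str) -> Counter:
    """Count devices by (vendor, model, firmware revision, device type)."""
    counts = Counter()
    pending = None  # Vendor line seen, waiting for its Type line
    for line in text.splitlines():
        vendor_match = VENDOR_RE.search(line)
        if vendor_match:
            pending = (
                vendor_match["vendor"],
                vendor_match["model"],
                vendor_match["rev"],
            )
            continue
        type_match = TYPE_RE.search(line)
        if type_match and pending is not None:
            counts[pending + (type_match["type"],)] += 1
            pending = None
    return counts


for device, count in summarize_devices(SAMPLE).items():
    print(count, *device)
```

Grouping by firmware revision as well as model makes it easy to spot a drive whose firmware differs from the others.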
