RHEL 5.8 Occasional panic during tape write (backups)
Issue
- Periodically a server crashes when running a backup to FC-attached tape.
- Something that st_do_scsi is calling is overwriting the return instruction pointer on the kernel stack with a bad value (000000000000005f).
- The bad value is the same in all dumps.
- Because the panic occurs on return from a function whose return IP is incorrect, the dumps give no indication of where execution actually was when the stack was corrupted.
- The problem happens around once a month.
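As a rough illustration of why a return IP of 000000000000005f leads straight to a panic, the following minimal sketch classifies an address under the usual 48-bit canonical addressing on x86_64, where kernel code lives in the upper half of the address space. The helper name classify_address and the 48-bit assumption are illustrative only and are not taken from the dumps or from any tool used in this case.

```python
# Sketch only: assumes standard 48-bit canonical addressing on x86_64.
LOWER_HALF_MAX = 0x0000_7FFF_FFFF_FFFF  # top of the lower (user-space) half
UPPER_HALF_MIN = 0xFFFF_8000_0000_0000  # bottom of the upper (kernel) half


def classify_address(addr: int) -> str:
    """Hypothetical helper: report which region a 64-bit address falls in."""
    addr &= 0xFFFF_FFFF_FFFF_FFFF  # keep only the low 64 bits
    if addr <= LOWER_HALF_MAX:
        return "lower half (user space)"
    if addr >= UPPER_HALF_MIN:
        return "upper half (kernel space)"
    return "non-canonical"


# The corrupted return IP reported in the dumps.
bad_return_ip = 0x000000000000005F
print(hex(bad_return_ip), "->", classify_address(bad_return_ip))
```

A valid return address for code called from st_do_scsi would be expected to fall in the upper half, so a value this small cannot be a legitimate kernel return address.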
Environment
- Red Hat Enterprise Linux 5 Update 6
- x86_64 kernel version 2.6.18-308.8.2.el5 (a RHEL 5.8 series kernel)
- The customer has a SystemTap script loaded (iostat_60s)
- Dell Inc./PowerEdge R715
- The tape drives are:
Vendor: HP Model: Ultrium 5-SCSI Rev: I57Z
Type: Sequential-Access ANSI SCSI revision: 06
Vendor: HP Model: Ultrium 5-SCSI Rev: I57Z
Type: Sequential-Access ANSI SCSI revision: 06
Vendor: HP Model: Ultrium 5-SCSI Rev: I57Z
Type: Sequential-Access ANSI SCSI revision: 06
Vendor: HP Model: Ultrium 5-SCSI Rev: I57Z
Type: Sequential-Access ANSI SCSI revision: 06
Vendor: HP Model: Ultrium 5-SCSI Rev: I57Z
Type: Sequential-Access ANSI SCSI revision: 06
Vendor: HP Model: Ultrium 5-SCSI Rev: I57Z
Type: Sequential-Access ANSI SCSI revision: 06
Vendor: ADIC Model: Scalar i6000 Rev: 615Q
Type: Medium Changer ANSI SCSI revision: 03